Tuesday, December 4, 2012

hashCode() and equals() methods

The methods hashCode() and equals() play a distinct role in the objects you insert into Java collections. The specific contract rules of these two methods are best described in the JavaDoc. Here I will just explain what role they play and what they are used for, so you know why their implementations are important.

equals()

equals() is used in most collections to determine if a collection contains a given element. For instance:
List<String> list = new ArrayList<>();
list.add("123");

boolean contains123 = list.contains("123");
The ArrayList iterates over all its elements and executes "123".equals(element) to determine if the element is equal to the parameter object "123". It is the String.equals() implementation that determines if two strings are equal.
The equals() method is also used when removing elements. For instance:
List<String> list = new ArrayList<>();
list.add("123");

boolean removed = list.remove("123");
The ArrayList again iterates over all its elements and executes "123".equals(element) to determine if the element is equal to the parameter object "123". The first element it finds that is equal to the given parameter "123" is removed.
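To see what this means for your own classes, here is a minimal sketch. The class names PlainPoint and ValuePoint are hypothetical and exist only for this illustration. PlainPoint inherits Object.equals(), which compares references, so the list cannot find a second instance with the same field value. ValuePoint overrides equals(), so it can.

```java
import java.util.ArrayList;
import java.util.List;

class EqualsInListDemo {

    // Hypothetical class that does NOT override equals().
    // It inherits Object.equals(), which only compares references.
    static class PlainPoint {
        final int x;
        PlainPoint(int x) { this.x = x; }
    }

    // Hypothetical class that overrides equals() to compare field values.
    static class ValuePoint {
        final int x;
        ValuePoint(int x) { this.x = x; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ValuePoint)) return false; // also false for null
            return this.x == ((ValuePoint) o).x;
        }

        // Kept consistent with equals().
        @Override
        public int hashCode() { return x; }
    }

    static boolean plainFound() {
        List<PlainPoint> list = new ArrayList<>();
        list.add(new PlainPoint(1));
        return list.contains(new PlainPoint(1)); // different instance, same value
    }

    static boolean valueFound() {
        List<ValuePoint> list = new ArrayList<>();
        list.add(new ValuePoint(1));
        return list.contains(new ValuePoint(1)); // different instance, same value
    }

    public static void main(String[] args) {
        System.out.println(plainFound()); // false
        System.out.println(valueFound()); // true
    }
}
```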
As you can see, a proper implementation of .equals() is essential for your own classes to work well with the Java Collection classes. So how do you implement equals() "properly"?
When are two objects equal? That depends on your application, the classes, and what you are trying to do. For instance, let's say you are loading and processing Employee objects stored in a database. Here is a simple example of such an Employee class:
public class Employee {
    protected long   employeeId;
    protected String firstName;
    protected String lastName;
}
You could decide that two Employee objects are equal to each other if just their employeeIds are equal. Or, you could decide that all fields must be equal: employeeId, firstName and lastName. Here are two example implementations of equals() matching these criteria:
public class Employee {
  ...
  public boolean equals(Object o){
    if(o == null)                return false;
    if(!(o instanceof Employee)) return false;

    Employee other = (Employee) o;
    return this.employeeId == other.employeeId;
  }
}
public class Employee {
  ...
  public boolean equals(Object o){
    if(o == null)                return false;
    if(!(o instanceof Employee)) return false;

    Employee other = (Employee) o;
    if(this.employeeId != other.employeeId)      return false;
    if(! this.firstName.equals(other.firstName)) return false;
    if(! this.lastName.equals(other.lastName))   return false;

    return true;
  }
}
Which of these two implementations is "proper" depends on what you need to do. Sometimes you need to look up an Employee object in a cache. In that case perhaps all you need is for the employeeId to be equal. In other cases you may need more than that, for instance to determine if a copy of an Employee object has changed from the original.
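One detail worth knowing about the all-fields variant: calling this.firstName.equals(...) throws a NullPointerException if firstName is null. Here is a rough sketch of a null-safe version using java.util.Objects. The constructor and the sample names are my own additions for the sake of a runnable example, and the hashCode() is included only to keep the class consistent with equals().

```java
import java.util.Objects;

class NullSafeEqualsDemo {

    static class Employee {
        protected long   employeeId;
        protected String firstName;
        protected String lastName;

        // Constructor added for this example only.
        Employee(long employeeId, String firstName, String lastName) {
            this.employeeId = employeeId;
            this.firstName  = firstName;
            this.lastName   = lastName;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Employee)) return false; // also false for null

            Employee other = (Employee) o;
            // Objects.equals() treats two nulls as equal and never throws.
            return this.employeeId == other.employeeId
                && Objects.equals(this.firstName, other.firstName)
                && Objects.equals(this.lastName, other.lastName);
        }

        @Override
        public int hashCode() {
            // Uses the same fields as equals().
            return Objects.hash(employeeId, firstName, lastName);
        }
    }

    public static void main(String[] args) {
        Employee original = new Employee(1L, "Ada", null);
        Employee copy     = new Employee(1L, "Ada", null);
        Employee changed  = new Employee(1L, "Ada", "Lovelace");

        System.out.println(original.equals(copy));    // true
        System.out.println(original.equals(changed)); // false
    }
}
```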

hashCode()

The hashCode() method of objects is used when you insert them into a Hashtable, HashMap or HashSet. If you do not know the theory of how a hash table works internally, you can read about hash tables on Wikipedia.
When inserting an object into a hash table you use a key. The hash code of this key is calculated, and used to determine where to store the object internally. When you need to look up an object in a hash table you also use a key. The hash code of this key is calculated and used to determine where to search for the object.
The hash code only points to a certain "area" (or list, bucket, etc.) internally. Since different key objects could potentially have the same hash code, the hash code itself is no guarantee that the right key is found. The hash table then iterates over this area (all keys with the same hash code) and uses the key's equals() method to find the right key. Once the right key is found, the object stored for that key is returned.
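The lookup described above can be sketched roughly like this. TinyHashTable is a hypothetical teaching class, not how java.util.HashMap is actually implemented. It assumes non-null keys and a bucket count greater than zero, and it never resizes.

```java
import java.util.ArrayList;
import java.util.List;

class TinyHashTable<K, V> {

    // One key/value pair stored in a bucket.
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<>();

    TinyHashTable(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // hashCode() only decides WHICH bucket to look in.
    private List<Entry<K, V>> bucketFor(K key) {
        int index = Math.floorMod(key.hashCode(), buckets.size());
        return buckets.get(index);
    }

    void put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> e : bucket) {
            if (e.key.equals(key)) { // equals() finds the exact key
                e.value = value;
                return;
            }
        }
        bucket.add(new Entry<>(key, value));
    }

    V get(K key) {
        for (Entry<K, V> e : bucketFor(key)) {
            if (e.key.equals(key)) { // equals() finds the exact key
                return e.value;
            }
        }
        return null; // no such key
    }

    public static void main(String[] args) {
        TinyHashTable<String, Integer> table = new TinyHashTable<>(4);
        table.put("Aa", 1);
        table.put("BB", 2); // "Aa" and "BB" have the same hash code (2112)

        System.out.println(table.get("Aa")); // 1
        System.out.println(table.get("BB")); // 2
    }
}
```

Both keys land in the same bucket, and it is equals() that tells them apart.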
So, as you can see, a combination of the hashCode() and equals() methods is used when storing and when looking up objects in a hash table.
Here are two rules that are good to know about implementing the hashCode() method in your own classes, if the hashtables in the Java Collections API are to work correctly:
  1. If object1 and object2 are equal according to their equals() method, they must also have the same hash code.
  2. If object1 and object2 have the same hash code, they do NOT have to be equal too.
In shorter words:
  1. If equal, then same hash codes too.
  2. Same hash codes no guarantee of being equal.
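Here is a small sketch of what can go wrong when rule 1 is broken. BrokenKey is a hypothetical class written to violate the rule on purpose: equals() looks only at id, while hashCode() returns an unrelated number. The "false" result shown is what I would expect from OpenJDK's HashSet, which compares hash codes before it ever calls equals().

```java
import java.util.HashSet;
import java.util.Set;

class BrokenHashCodeDemo {

    // Hypothetical key class that deliberately BREAKS rule 1.
    static class BrokenKey {
        final long id;
        final int  unrelated; // not part of equals(), but used as the hash code

        BrokenKey(long id, int unrelated) {
            this.id = id;
            this.unrelated = unrelated;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof BrokenKey)) return false;
            return this.id == ((BrokenKey) o).id;
        }

        @Override
        public int hashCode() {
            return unrelated; // equal objects can get different hash codes
        }
    }

    static boolean lookupSucceeds() {
        Set<BrokenKey> set = new HashSet<>();
        set.add(new BrokenKey(7L, 1));
        // Equal according to equals(), but hashed differently.
        return set.contains(new BrokenKey(7L, 2));
    }

    public static void main(String[] args) {
        BrokenKey a = new BrokenKey(7L, 1);
        BrokenKey b = new BrokenKey(7L, 2);

        System.out.println(a.equals(b));      // true
        System.out.println(lookupSucceeds()); // false on OpenJDK
    }
}
```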
Here are two example implementations of the hashCode() method matching the equals() methods shown earlier:
public class Employee {
  protected long   employeeId;
  protected String firstName;
  protected String lastName;

  public int hashCode(){
    return (int) this.employeeId;
  }
}
public class Employee {
    protected long   employeeId;
    protected String firstName;
    protected String lastName;

  public int hashCode(){
    return (int)this.employeeId *
                firstName.hashCode() *
                lastName.hashCode();
  }
}
Notice that if two Employee objects are equal, they will also have the same hash code. But, as is especially easy to see in the first example, two Employee objects can be unequal and still have the same hash code.
In both examples the employeeId is cast to an int, which keeps only the lower 32 bits of the long value. That means that many different employee ids could result in the same hash code, but these Employee objects would still not be equal, since they don't have the same employee id.
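A quick way to convince yourself of this is to pick two ids that differ only in the upper 32 bits. The helper castHash() below is a hypothetical stand-in for the first hashCode() example. For comparison the sketch also prints Long.hashCode(long) (available since Java 8), which folds the upper and lower halves together. It still maps 64 bits onto 32, so it has collisions too, just different ones.

```java
class HashCollisionDemo {

    // Same computation as the first hashCode() example: a plain cast.
    static int castHash(long employeeId) {
        return (int) employeeId;
    }

    public static void main(String[] args) {
        long idA = 1L;
        long idB = 1L + (1L << 32); // 4294967297, differs only in the upper bits

        System.out.println(idA == idB);         // false
        System.out.println(castHash(idA));      // 1
        System.out.println(castHash(idB));      // 1

        // Long.hashCode() mixes the upper half in, so these two ids differ.
        System.out.println(Long.hashCode(idA)); // 1
        System.out.println(Long.hashCode(idB)); // 0
    }
}
```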

More Detail in the JavaDoc

For a 100% precise description of how to implement equals() and hashCode() you should check out the official JavaDoc. The purpose of this text was mostly to explain how they are used by the Java Collection classes. Understanding this makes it easier to implement them to suit your purposes.
