Wednesday, January 22, 2014

How HashSet is Implemented or works Internally in Java

Not many programmers know that HashSet is internally implemented using HashMap in Java, so if you know how HashMap works internally, you can most likely figure out how HashSet works as well. A curious Java developer may ask how HashSet can use HashMap, because a Map needs a key-value pair while a HashSet stores only one object per element. Good question, isn't it? Remember that HashMap does not allow duplicate keys but does allow duplicate values, and this property is exploited in the implementation of HashSet. Since HashSet implements the Set interface, it must guarantee uniqueness, and it achieves this by storing each element as a key in the map, always with the same value.

Things get clearer when you check the JDK source code. All you need to look at is how elements are stored in HashSet and how they are retrieved. HashSet doesn't provide any direct method for retrieving an object, like get(Object key) on HashMap or get(int index) on List, so the only way to get objects out of a HashSet is via an Iterator (or a for-each loop, which uses the Iterator behind the scenes). When you create a HashSet with the no-argument constructor, it internally creates the backing HashMap with default initial capacity 16 and default load factor 0.75, as shown below:

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * default initial capacity (16) and load factor (0.75).
 */
public HashSet() {
    map = new HashMap<>();
}

Now let's see the code for the add() and iterator() methods from java.util.HashSet to understand how HashSet works internally.

How Object is stored in HashSet
As you can see below, a call to add(E) is delegated to put(key, value) on the backing map, where the key is the object you passed and the value is a dummy object called PRESENT, which is a constant in java.util.HashSet:

private transient HashMap<E,Object> map;

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();

public boolean add(E e) {
    return map.put(e, PRESENT) == null;
}

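Since PRESENT is a constant, every key in the backing HashMap, called map, has the same value. HashMap.put() returns the previous value for the key, or null if the key was absent, so add() returns true only when the element was not already in the set.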
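To make the return value of add() more concrete, here is a minimal sketch that replays the same trick directly on a HashMap. The class name PutReturnValueDemo, the helper addLikeHashSet() and the currency string are made up for illustration; only HashMap.put() and its return-value contract come from the JDK.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class PutReturnValueDemo {

    // Plays the same role as the PRESENT constant in java.util.HashSet:
    // one shared dummy value for every key.
    private static final Object PRESENT = new Object();

    // Mirrors the body of HashSet.add(): put() returns the previous value
    // for the key, or null if the key was not mapped before.
    static <E> boolean addLikeHashSet(Map<E, Object> map, E element) {
        return map.put(element, PRESENT) == null;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();

        System.out.println(addLikeHashSet(map, "USD")); // true, first insertion
        System.out.println(addLikeHashSet(map, "USD")); // false, key was already there
        System.out.println(map.size());                 // 1, the second put only replaced the value
    }
}
```

The second call does not create a new entry; it overwrites PRESENT with PRESENT and hands back the old value, which is why the comparison with null is enough to detect a duplicate.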

How Object is retrieved from HashSet
Now let's see the code for getting an iterator to traverse a HashSet in Java. The iterator() method of java.util.HashSet simply returns the iterator of the backing map's key set, obtained via map.keySet().iterator().

    /**
     * Returns an iterator over the elements in this set.  The elements
     * are returned in no particular order.
     * @return an Iterator over the elements in this set
     * @see ConcurrentModificationException
     */
    public Iterator<E> iterator() {
        return map.keySet().iterator();
    }
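As a rough illustration of that delegation, the sketch below fills a HashMap and a HashSet with the same strings and checks that walking the map's key set yields the same elements as walking the set. The class KeySetIterationDemo and the helper drain() are hypothetical names used only for this example; since neither iterator promises any order, the results are compared as sets rather than element by element.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class KeySetIterationDemo {

    // Collects everything an iterator returns, so two traversals can be
    // compared without depending on iteration order.
    static <E> Set<E> drain(Iterator<E> iterator) {
        Set<E> seen = new HashSet<>();
        while (iterator.hasNext()) {
            seen.add(iterator.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        Object present = new Object(); // stand-in for HashSet's PRESENT constant

        Map<String, Object> backing = new HashMap<>();
        Set<String> set = new HashSet<>();
        for (String code : new String[] {"USD", "EUR", "JPY"}) {
            backing.put(code, present);
            set.add(code);
        }

        Set<String> fromKeySet = drain(backing.keySet().iterator());
        Set<String> fromSet = drain(set.iterator());
        System.out.println(fromKeySet.equals(fromSet)); // true
    }
}
```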

How to use HashSet in Java

Using HashSet in Java is very simple: don't think of it as a Map, think of it as a Collection. Add elements with the add() method and check its return value to see whether the object already existed in the HashSet. Similarly, use an iterator to retrieve elements from the HashSet. You can also use the contains() method to check whether an object already exists in the HashSet; it locates the candidate using hashCode() and then compares it using equals(). The remove() method removes an object from the HashSet. Since each element of a HashSet is used as a key in the backing HashMap, elements must implement equals() and hashCode() consistently. Immutability is not a requirement, but if an element is immutable you can be sure it will not change during its stay in the set. The following example demonstrates basic usage of HashSet in Java.

import java.util.HashSet;
import java.util.Iterator;

/**
 * Java Program to demonstrate How HashSet works internally in Java.
 * @author
 */
public class HashSetDemo {
    public static void main(String[] args) {

        HashSet<String> supportedCurrencies = new HashSet<String>();

        // adding object into HashSet, this will be translated to put() calls
        supportedCurrencies.add("USD"); // illustrative value
        supportedCurrencies.add("EUR"); // illustrative value

        // retrieving object from HashSet
        Iterator<String> itr = supportedCurrencies.iterator();
        while (itr.hasNext()) {
            System.out.println(itr.next());
        }
    }
}
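The point about equals() and hashCode() is easier to see with a custom class. The following is only a sketch: EqualsHashCodeDemo, PlainCurrency and Currency are hypothetical classes invented for this example. PlainCurrency inherits identity-based equals() and hashCode() from Object, while Currency overrides both based on its code field.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class EqualsHashCodeDemo {

    // Does NOT override equals()/hashCode(): two instances are never equal.
    static final class PlainCurrency {
        final String code;
        PlainCurrency(String code) { this.code = code; }
    }

    // Overrides both methods, so equality is based on the currency code.
    static final class Currency {
        final String code;
        Currency(String code) { this.code = code; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Currency)) return false;
            return code.equals(((Currency) o).code);
        }

        @Override
        public int hashCode() {
            return Objects.hash(code);
        }
    }

    public static void main(String[] args) {
        Set<PlainCurrency> plain = new HashSet<>();
        plain.add(new PlainCurrency("USD"));
        plain.add(new PlainCurrency("USD"));
        System.out.println(plain.size()); // 2, the set cannot tell they are duplicates

        Set<Currency> proper = new HashSet<>();
        System.out.println(proper.add(new Currency("USD")));      // true
        System.out.println(proper.add(new Currency("USD")));      // false, duplicate rejected
        System.out.println(proper.contains(new Currency("USD"))); // true
        System.out.println(proper.remove(new Currency("USD")));   // true
        System.out.println(proper.isEmpty());                     // true
    }
}
```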




That's all about how HashSet is implemented in Java and how it works internally. As I said, if you know how HashMap works internally in Java, you can explain the working of HashSet, provided you know it uses the same value for all keys. Remember to override equals() and hashCode() for any object you are going to store in a HashSet; since your object is used as a key in the backing Map, it must override those methods. Make your object immutable or effectively immutable if possible.
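Why immutability helps can be shown with a small experiment. This is a sketch under stated assumptions: MutableElementDemo and Account are hypothetical classes written for this example, and Account is deliberately mutable with a hashCode() that depends on the mutable field.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class MutableElementDemo {

    // Deliberately mutable: equals() and hashCode() both depend on id.
    static final class Account {
        int id;
        Account(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Account && ((Account) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    // Adds an element, changes its hash-relevant field, then looks it up again.
    static boolean stillFoundAfterMutation() {
        Set<Account> accounts = new HashSet<>();
        Account account = new Account(1);
        accounts.add(account); // stored under the hash computed for id = 1
        account.id = 2;        // hash code changes while the element sits in the set
        return accounts.contains(account);
    }

    public static void main(String[] args) {
        System.out.println(stillFoundAfterMutation()); // false
    }
}
```

The element is still physically in the set and will show up during iteration, but contains() and remove() look it up under the new hash code and miss it. Keeping the fields used by equals() and hashCode() final avoids this situation.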


  1. Little confusion.
    At the starting you said elements are stored as keys with same value always but later you said the value for all keys is constant which is PRESENT.
    Which of these is correct?

    1. @Hi, He is right. The object we add to the set is internally added as a key to the HashMap, and the value in the HashMap inside HashSet is always the same "PRESENT". Hence we are only responsible for the object that we add to the set.
      Kumar Shorav

  2. As per my understanding I guess values are stored as null for all keys.

  3. why would anyone like to store null values ?.
    And in fact values are never stored ,only the references pointing to those values(ie objects) are stored in any type of collection.

