The Java collections framework for newbies
I don't consider myself a Java expert by any measure, but there's a disturbing thing I've noticed. There are a LOT of people who claim to be "Java developers" but have zero clue what the "Java collections framework" is. This post is designed for folks who keep getting stumped on interview questions or are mystified when someone starts talking about the difference between a Set and a List (for example).
If you google "java collections framework for dummies" you'll find this link, which has a more complete, if fairly dense, explanation. I'm going to do you one better and give you a rule of thumb that you can use without thinking about it.
At the root of things, a collection is something you can store other things inside. Just like in real life, a collection of marbles is just a "bunch" of marbles. The big difference in the collections framework is that the different implementations DO different things with the marbles, and you need to understand what those things are.
For example, let's consider the ArrayList... everybody and their brother should know this... if not, you are not a Java developer, go read a book. Some special things about an ArrayList: it stores the entries in the order they were added, you can select an element by its index, and it can contain duplicates of the same element. From a performance perspective, it is VERY fast to look things up by index and to add things to the end of an ArrayList. On the other hand, it is slow, on average, to see if a particular object is present, because you must iterate over the elements of the list to see if it's there.
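To make those ArrayList traits concrete, here is a minimal sketch. The class name ArrayListBasics, the marbles() helper and the marble colors are made up for illustration; only the java.util calls are the real API.

```java
import java.util.ArrayList;
import java.util.List;

class ArrayListBasics {
    // Hypothetical helper: builds a small list of made-up marble colors.
    static List<String> marbles() {
        List<String> marbles = new ArrayList<String>();
        marbles.add("red");
        marbles.add("blue");
        marbles.add("red"); // duplicates are allowed
        return marbles;
    }

    public static void main(String[] args) {
        List<String> marbles = marbles();

        // Insertion order is preserved.
        System.out.println(marbles);                   // [red, blue, red]

        // Lookup by index jumps straight to the element.
        System.out.println(marbles.get(1));            // blue

        // contains() has to walk the list until it finds a match (or runs out).
        System.out.println(marbles.contains("green")); // false

        // indexOf() reports the first match only.
        System.out.println(marbles.indexOf("red"));    // 0
    }
}
```

The get(1) call is the cheap operation; contains("green") is the one that gets slower as the list grows, because every element has to be compared before the answer can be "no".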
Next, let's talk about HashSet... I realize that this might sound vaguely drug related to the uninitiated, but a HashSet has some interestingly different characteristics from a list. First off, a HashSet has no concept of order or index: you can add things to it and you can iterate over it, but you cannot look things up by index, nor are there any guarantees about what order things will be presented in when you loop over its members. Another interesting characteristic is that it cannot contain duplicates. If you try to add the same object twice, it will NOT fail; add will just return false and you can happily move on.
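The "returns false and moves on" behavior is easy to check for yourself. A small sketch follows; the class HashSetBasics and the addTwice helper are hypothetical names invented for this example.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetBasics {
    // Hypothetical helper: adds the same value twice and reports
    // what each add() call returned.
    static boolean[] addTwice(Set<String> set, String value) {
        boolean first = set.add(value);
        boolean second = set.add(value);
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        Set<String> marbles = new HashSet<String>();
        boolean[] results = addTwice(marbles, "red");

        System.out.println(results[0]);     // true: the set changed
        System.out.println(results[1]);     // false: already present, no exception
        System.out.println(marbles.size()); // 1
    }
}
```

Note that there is no marbles.get(0) to call here: the Set interface simply doesn't offer index-based access.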
Last but not least, there is the Hashtable (or its slightly more dangerous cousin, the HashMap). This is used to store key/value pairs. Instead of keying things by an index (like an ArrayList), you can key them by just about anything you want. You can do things like myMap.put("foo","bar") and then myMap.get("foo") will return "bar".
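Here is a rough sketch of the put/get idea, plus one visible difference between the two cousins. Hashtable's methods are synchronized and HashMap's are not, which is presumably the "danger" hinted at above; another difference you can see directly is that Hashtable rejects null keys while HashMap accepts one. The class MapBasics and the acceptsNullKey helper are hypothetical names for this example.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class MapBasics {
    // Hypothetical helper: returns true if the map accepts a null key,
    // false if put() throws a NullPointerException instead.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> myMap = new HashMap<String, String>();
        myMap.put("foo", "bar");
        System.out.println(myMap.get("foo"));     // bar
        System.out.println(myMap.get("missing")); // null: no such key

        // Putting the same key again replaces the old value.
        myMap.put("foo", "baz");
        System.out.println(myMap.get("foo"));     // baz

        System.out.println(acceptsNullKey(new HashMap<String, String>()));   // true
        System.out.println(acceptsNullKey(new Hashtable<String, String>())); // false
    }
}
```

Declaring the variable as Map rather than HashMap or Hashtable means you can swap the implementation later without touching the rest of the code.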
There is a LOT more to this, but with this quick reference you can at least begin to do useful things in Java.
Examples of using a List
    // requires: import java.util.ArrayList;
    ArrayList<String> myList = new ArrayList<String>();
    myList.add("Second Thing");
    myList.add("Second Thing");
    myList.add("First Thing");
    System.out.println(myList.get(0));

will output

    Second Thing

An interesting thing to note is that the size of this is 3:

    System.out.println(myList.size());

will output

    3

The following:

    for (String thing : myList) {
        System.out.println(thing);
    }

will always output:

    Second Thing
    Second Thing
    First Thing

Next, let's look at a set:
    // requires: import java.util.HashSet;
    HashSet<String> mySet = new HashSet<String>();
    mySet.add("Second Thing");
    mySet.add("Second Thing");
    mySet.add("First Thing");

The first difference we can see is that

    System.out.println(mySet.size());

returns

    2

Which makes complete sense if you understand that sets cannot contain duplicates (and you understand how the equals method of String works... ;) Another interesting thing is that the following:

    for (String thing : mySet) {
        System.out.println(thing);
    }

might output:

    Second Thing
    First Thing

or it might output:

    First Thing
    Second Thing

It so happens that it returns the second version on my machine, but it's really JVM/runtime specific (it depends on how the HashSet is implemented and how hashCode is implemented and a bunch of other variables I don't even fully understand).

More importantly, the following will likely be much faster for LARGE collections than the same check on a list:

    System.out.println(mySet.contains("Third Thing"));

Finally, the granddaddy of the entire framework, Hashtable:
    // requires: import java.util.Hashtable; import java.util.Map;
    Hashtable<String, String> myMap = new Hashtable<String, String>();
    myMap.put("a", "Second Thing");
    myMap.put("b", "Second Thing");
    myMap.put("c", "First Thing");
    System.out.println(myMap.get("a"));

will output

    Second Thing

and the following:

    for (Map.Entry<String, String> entry : myMap.entrySet()) {
        System.out.println(entry.getKey() + "=" + entry.getValue());
    }

will output something like this (as with HashSet, the order is not guaranteed):

    b=Second Thing
    a=Second Thing
    c=First Thing

Hopefully with these examples, you can get an idea of the capabilities of the collections framework. There is much, much more to it and I encourage ANYONE doing Java development to spend time playing around and learning the different characteristics of the various components, as I've only lightly skimmed the surface.
Comments
I agree with Emily, concurrency is a huge issue with HashMap, particularly when the map is re-sizing.
What does that sentence mean? Lookup is fast but seeing if particular element exists is slow? Seems like you are mixing up ArrayList with LinkedList. You should take your own advice and study on the collections a bit more...
Stackoverflow has a good summary on the differences: http://stackoverflow.com/questions/322715/when-to-use-linkedlist-over-arraylist
I do appreciate folks taking the time to clarify and comment, every little bit helps...
It doesn't go into depth on using each Collection class, but it includes a PDF file that compares the popular classes in the Collections Framework.