Sunday, February 17, 2013

Java static fields

A great many people starting out with Java development have only a vague understanding of the difference between a "public static String" and a "public String", or of the difference between a class and an object. As this was confusing to me at first, I thought I would give a quick overview.

A class defines a template for what data and operations are available when you tell the JVM to create an object.

So, for example:

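class BlogPost {
    // Instance field: every BlogPost object gets its own copy.
    public String text = "";
    // Static field: one copy, owned by the class itself.
    public static BlogPost latest;

    public BlogPost(String inString) {
        text = inString;
        BlogPost.latest = this;
    }
}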

When you do the following:

BlogPost myPost = new BlogPost("Hello");

You're telling the JVM to allocate some memory on the heap for a new BlogPost object and to store a reference to that memory location in myPost; from now on, when I refer to myPost, it means that object. BlogPost is a class; myPost is a variable holding a reference to an object that is an instance of the BlogPost class. The compiler has already checked that BlogPost has a constructor that takes a single String argument. At run time, the first time the class is needed, the JVM's classloader searches its classpath for the compiled class definition of BlogPost.
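To make the "variable versus object" distinction a bit more concrete, here is a minimal sketch. The wrapper class ReferenceDemo, the method name, and the second variable samePost are made up for illustration; BlogPost is nested inside the wrapper only so the example fits in one file.

```java
class ReferenceDemo {
    // The BlogPost class from the post, nested here only to keep the example in one file.
    static class BlogPost {
        public String text = "";
        public static BlogPost latest;

        public BlogPost(String inString) {
            text = inString;
            BlogPost.latest = this;
        }
    }

    // Returns what myPost.text holds after the object is changed through a second variable.
    static String textSeenThroughFirstVariable() {
        BlogPost myPost = new BlogPost("Hello"); // one object is allocated on the heap
        BlogPost samePost = myPost;              // copies the reference, not the object
        samePost.text = "Changed";               // modifies the single shared object
        return myPost.text;
    }

    public static void main(String[] args) {
        System.out.println(textSeenThroughFirstVariable()); // prints "Changed"
    }
}
```

Both variables point at the same object, so a change made through one is visible through the other.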

The important detail some people miss at first is that "static" fields and methods belong to the class definition, not to any object instance. This means that if you create five instances of BlogPost and read the "latest" field, you will not get five different values; you will only ever get the reference to the last one created. This has implications for thread safety and other situations where multiple objects may read or write the same static field.
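A quick sketch of that behaviour, assuming a single classloader and a single thread. The wrapper class StaticFieldDemo and the helper createFivePosts are names I made up for the example.

```java
class StaticFieldDemo {
    // The BlogPost class from the post, nested here only to keep the example in one file.
    static class BlogPost {
        public String text = "";
        public static BlogPost latest;

        public BlogPost(String inString) {
            text = inString;
            BlogPost.latest = this;
        }
    }

    // Creates five posts; each constructor call overwrites the one shared static field.
    static BlogPost[] createFivePosts() {
        BlogPost[] posts = new BlogPost[5];
        for (int i = 0; i < posts.length; i++) {
            posts[i] = new BlogPost("Hello" + (i + 1));
        }
        return posts;
    }

    public static void main(String[] args) {
        BlogPost[] posts = createFivePosts();
        System.out.println(posts[0].text);               // Hello1 (instance field, one per object)
        System.out.println(BlogPost.latest.text);        // Hello5 (static field, one per class)
        System.out.println(BlogPost.latest == posts[4]); // true
    }
}
```

Five objects, five separate text values, but only one latest.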

Further complicating the issue, Java can have multiple classloaders, which means that if "BlogPost" is loaded by more than one of them (because it is found in different places, or potentially because different threads use different context classloaders), there might actually be multiple copies of the class definition, each with its own static fields. This can make for interesting debugging situations where a static field is updated twice, but only one of the values is visible from a particular perspective at a given time. As an illustration, suppose the following two snippets of code appear in different parts of a system:

BlogPost myPost1 = new BlogPost("Hello1");
BlogPost myPost2 = new BlogPost("Hello2");

If myPost1 is created first and myPost2 second, most people would expect BlogPost.latest to always refer to myPost2. This is not always the case, though. In a web container, if myPost1 was created in a different classloader than myPost2, it's quite possible that certain parts of the system will ALWAYS see BlogPost.latest as myPost1 and other parts will ALWAYS see myPost2, no matter what order these calls were made in. Worse yet, if the class was garbage collected along with its classloader and later loaded again, you might see neither, even though most people would never expect that to happen.

Examples of how this might happen are when you deploy a jar file to a web container in its parent classpath (as is common with XML processing libraries) and also deploy it inside the web application (or in two different web applications in the same container). Depending on how the container handles the situation, you may very well get different results (even from what I described).
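The effect can be reproduced outside a web container. The following is only a sketch under some assumptions: it needs a full JDK (it uses javax.tools to compile a throwaway class at run time), and the names TwoLoadersDemo, Holder, compileHolder and run are invented for the example. Holder stands in for BlogPost, with a plain String as its static field to keep the reflection short.

```java
import java.lang.reflect.Field;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class TwoLoadersDemo {

    // Compiles a tiny class with one static field into a temp directory and returns that directory.
    static Path compileHolder() throws Exception {
        Path dir = Files.createTempDirectory("loader-demo");
        Path src = dir.resolve("Holder.java");
        String code = "public class Holder { public static String latest; }";
        Files.write(src, code.getBytes(StandardCharsets.UTF_8));
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null when no compiler is available
        if (compiler == null) {
            throw new IllegalStateException("this sketch needs a JDK, not just a JRE");
        }
        int rc = compiler.run(null, null, null, "-d", dir.toString(), src.toString());
        if (rc != 0) {
            throw new IllegalStateException("compilation failed");
        }
        return dir;
    }

    // Loads Holder through two unrelated classloaders, writes a different value
    // through each, and returns what each loader's copy of the field holds.
    static String[] run() throws Exception {
        URL[] classpath = { compileHolder().toUri().toURL() };
        try (URLClassLoader loaderA = new URLClassLoader(classpath, null);
             URLClassLoader loaderB = new URLClassLoader(classpath, null)) {
            Class<?> holderA = loaderA.loadClass("Holder");
            Class<?> holderB = loaderB.loadClass("Holder");
            Field latestA = holderA.getField("latest");
            Field latestB = holderB.getField("latest");
            latestA.set(null, "Hello1"); // static field, so the target object is null
            latestB.set(null, "Hello2"); // does not touch loaderA's copy
            return new String[] { (String) latestA.get(null), (String) latestB.get(null) };
        }
    }

    public static void main(String[] args) throws Exception {
        String[] seen = run();
        System.out.println(seen[0] + " / " + seen[1]); // prints "Hello1 / Hello2"
    }
}
```

Same class name, same class file, but two Class objects and therefore two independent static fields. A real container's classloader hierarchy is more involved than this, so treat it as a demonstration of the principle rather than a model of any particular server.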

For a more complete set of examples and a better explanation, particularly in regard to implementing Singletons in Java, see this

Wednesday, February 13, 2013

Database pagination on MySQL and Oracle

Having studiously avoided Oracle for over 20 years, I'm now working in a shop that uses it almost exclusively. Aside from the general overall expense of the product, I'm routinely amazed at how many features of other DBMSs I've used (DB2, MSSQL, MySQL, PostgreSQL) are either missing or syntactically difficult to understand in Oracle.

The most recent example is server-side pagination… or more specifically, having the DBMS limit the results returned to a specific subset of rows. To do this in Oracle, one must run a query something like this:

select * from (select name, rownum rn from 
        (select name
          from users order by name)
      where rownum <= 10) where rn > 5;

I realize that this is a legacy syntax, but I personally find the new way just as obtuse. The new way (I guess) is supposed to be:

select * from (select name,
        row_number() over
        (order by name) rn
  from users) where rn between 6 and 10 order by rn;

Compare this with the syntax for MySQL (which, I guess, is now technically also part of the Oracle Corporation):

select * from users order by name limit 5,5;
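To spell out how the numbers in the two forms relate, here is a small Java sketch that builds both queries from an offset and a page size. The class PaginationSql and its two methods are hypothetical helpers, not part of any library, and in a real application you would want bind parameters rather than string concatenation.

```java
class PaginationSql {

    // Nested rownum form: the inner bound is the last row wanted (offset + pageSize),
    // the outer bound skips the first "offset" rows.
    static String oracle(int offset, int pageSize) {
        int lastRow = offset + pageSize;
        return "select * from (select name, rownum rn from "
             + "(select name from users order by name) "
             + "where rownum <= " + lastRow + ") where rn > " + offset;
    }

    // MySQL form: "limit offset,count" takes the same two numbers directly.
    static String mysql(int offset, int pageSize) {
        return "select * from users order by name limit " + offset + "," + pageSize;
    }

    public static void main(String[] args) {
        // Skip 5 rows, return the next 5 (rows 6 through 10 in name order).
        System.out.println(oracle(5, 5));
        System.out.println(mysql(5, 5));
    }
}
```

With an offset of 5 and a page size of 5, the Oracle string ends up with "rownum <= 10" and "rn > 5", while the MySQL string ends with "limit 5,5".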

I find MySQL's syntax to be more concise and don't really understand why Oracle's syntax is so convoluted, other than perhaps some dogmatic insistence on following some sort of standard, or an internal engineering group that was all hopped up on set theory drugs of some sort ;)