Running on Linux, Building with Java

Time measurement accuracy in Java

An interesting article about time measurement under Java

http://www.simongbrown.com/blog/2007/08/20/millisecond_accuracy_in_java.html

Beware if you are doing any performance measurements on Windows.
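
As a rough illustration of the concern, here is a minimal sketch that measures elapsed time with System.nanoTime() instead of System.currentTimeMillis(). The class and method names (ElapsedTimer, timeNanos) are made up for this example, and the assumption is that the coarse granularity of the millisecond clock on some platforms is what hurts short measurements.

```java
final class ElapsedTimer {
    /** Runs the task once and returns the elapsed time in nanoseconds. */
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();   // monotonic, meant for measuring intervals
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long elapsed = timeNanos(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            // Use the result so the loop is not trivially dead code
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        });
        System.out.println("elapsed: " + (elapsed / 1_000_000.0) + " ms");
    }
}
```

System.nanoTime() only promises nanosecond precision, not nanosecond accuracy, so very short measurements are still best repeated and averaged.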

java.library.path and LD_LIBRARY_PATH

As most of you know, you have to set the java.library.path system property to load JNI libraries in your Java application. But often java.library.path alone won't be enough. A library that references other shared libraries relies on the standard OS mechanism to resolve them. So if you load foo.so in your Java application, and foo.so is dynamically linked to bar.so, the dynamic linker will look for bar.so using LD_LIBRARY_PATH.

java.library.path is only used to resolve the immediate native library that you load from your code. Loading the dependent libraries is left to the OS: the JNI library that you load relies on the OS-dependent way of resolving its references.

The catch is that in such a case you have to provide both java.library.path and LD_LIBRARY_PATH for the libraries to load successfully. So it is more reliable to depend on the OS-dependent way (LD_LIBRARY_PATH) of locating library files than on the java.library.path system property alone.
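
To see which of the two settings is in effect, a small diagnostic along these lines may help. It is only a sketch: the class name NativeLoadCheck is made up, and "foo" stands in for the library from the example above (on Linux the JVM maps the name "foo" to the file libfoo.so).

```java
final class NativeLoadCheck {
    /** Reports the two search paths discussed above. */
    static String describeSearchPaths() {
        return "java.library.path=" + System.getProperty("java.library.path")
                + System.lineSeparator()
                + "LD_LIBRARY_PATH=" + System.getenv("LD_LIBRARY_PATH");
    }

    public static void main(String[] args) {
        System.out.println(describeSearchPaths());
        try {
            // The JVM finds the library itself via java.library.path;
            // anything it links against is resolved by the OS dynamic linker.
            System.loadLibrary("foo");
            System.out.println("loaded foo");
        } catch (UnsatisfiedLinkError e) {
            System.out.println("load failed: " + e.getMessage());
        }
    }
}
```

If the load fails with a message that names a dependent library rather than the one you asked for, that points at LD_LIBRARY_PATH rather than java.library.path.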

How to get the byte array Class instance?

Have you ever wondered how you can get a handle to the Class instance for a byte array (byte[])? It's easier to get the Class instance of other types like Integer by just doing Integer.class. I don't know if there is a better, standard way to do it, but I just did new byte[0].getClass().

Does anyone know a better way?
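
One possible answer: the class literal syntax also works for array types, so no throwaway instance is needed. The sketch below compares the two approaches; the class name ByteArrayClassDemo and its method are made up for the example.

```java
final class ByteArrayClassDemo {
    /** Returns the Class object for byte[] using a class literal. */
    static Class<?> byteArrayClass() {
        return byte[].class;
    }

    public static void main(String[] args) {
        Class<?> literal = byteArrayClass();
        Class<?> fromInstance = new byte[0].getClass(); // runtime class of an instance

        System.out.println(literal == fromInstance);    // both refer to the same Class object
        System.out.println(literal.getName());          // JVM-internal name of the array type
        System.out.println(literal.getComponentType()); // the element type, byte
    }
}
```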

Ant / Junit and Class Loaders

Have you ever come across situations where you need to load external resources in your JUnit tests? I generally prefer loading external resource files from the classpath. So I add the resources to the classpath, and in the test case I look up the resource using ClassLoader.getResource(). If you run the JUnit tests using Ant, you can run into all sorts of problems when loading resources from the classpath.

Ant runs each JUnit test in its own class loader. This makes sure that libraries on Ant's classpath won't interfere with the test's environment.

If you are using a class loader to load resources, always use the immediate class loader. Generally you can get a handle to a class loader through ClassLoader.getSystemClassLoader(). But if you load a resource through it in a test case, the JVM will look the resource up using the system class loader, which in this case is the class loader for Ant itself. Unfortunately, since Ant created a special class loader for the test, the system class loader does not have the classpath that you configured in Ant's build.xml for running tests (ideally you would have added there the directories containing the resources that your test cases look up dynamically). You end up with the resource not being found.

To overcome this, always use the immediate class loader. How do you do it?

Use MyTest.class.getClassLoader()

This ensures that we get the class loader that loaded the test class, which is in fact the immediate class loader.

This also applies to any framework / API you write. Using the immediate class loader makes sure that your framework stays testable in multi-classloader environments.
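
A minimal sketch of the lookup, assuming a hypothetical helper class ResourceLookup and a made-up resource name testdata/sample.xml. In a real test you would pass the test class itself as the owner, for example MyTest.class.

```java
import java.net.URL;

final class ResourceLookup {
    /** Looks up a classpath resource through the loader that loaded the given class. */
    static URL find(Class<?> owner, String name) {
        ClassLoader loader = owner.getClassLoader();
        if (loader == null) {
            // Classes loaded by the bootstrap loader report null; fall back to the system loader
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(name); // null if this loader cannot see the resource
    }

    public static void main(String[] args) {
        // "testdata/sample.xml" is a placeholder resource name
        URL url = find(ResourceLookup.class, "testdata/sample.xml");
        System.out.println(url != null ? "found: " + url : "not visible to this class loader");
    }
}
```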

Continuous Integration Servers

Just came across this

I have tried TeamCity already and it looks convincing. I would like to evaluate the Apache and Sun products too, as those are completely free.

My realization with synchronization

Recently I had to do a major refactoring of a component to make it thread safe and to improve its performance. The component is a persistence layer capable of persisting and querying data. It was already thread safe, but the synchronization was rather coarse: a query would block if the component was already executing another query. So despite being multithreaded, queries issued from multiple threads would execute only one after the other. What we wanted now was to execute queries simultaneously, regardless of how many run in parallel. Two queries executing in parallel should block each other only at a much finer level (such as the record level).

This component makes extensive use of Java Collections, and in most places (while querying) you need to copy from these Collections. A Collection can sometimes contain more than a million objects, so copying naturally takes a lot of time. The bad part is that you have to synchronize on the Collection for as long as you are copying it, so you stop any inserts and updates into the database (add and remove on the Collection). When you need to synchronize access to an object, you have to synchronize every access to it, and likewise every action on the Collections.

To make synchronization simpler, one can use the synchronized versions of Collections. You can obtain a synchronized wrapper around the underlying Collection instance with the Collections.synchronizedXXXX() methods. These wrappers spare you the hassle of explicitly synchronizing on every access to the Collection. But there is one thing to remember: wherever you obtain any sort of Iterator, you need to synchronize on the Collection externally. If you iterate over the Map.keySet() of a synchronized Map (locking on the Map itself), or over a synchronized Collection, you must synchronize externally (more on why later).

So in our case, when I copy from the Collection I am actually iterating over it and adding items to another Collection. This access is synchronized, which prevents any other access to the underlying Collection. That means only one thread can iterate over a Collection at a time. (There can be multiple iterators on a Collection at the same time, but only as long as no one modifies the Collection.) To avoid all this blocking iteration, I came up with a rather stupid idea: use a synchronized List instead. The idea behind the synchronized List was to avoid iteration altogether (remember, you can access a List by index). So,

Collection<Object> src = Collections.synchronizedCollection(new ArrayList<Object>());
Collection<Object> dest = new ArrayList<Object>();
// The lock on src is held for the whole iteration
synchronized (src) {
    Iterator<Object> itr = src.iterator();
    while (itr.hasNext()) {
        dest.add(itr.next());
    }
}

will be replaced with

List<Object> src = Collections.synchronizedList(new ArrayList<Object>());
List<Object> dest = new ArrayList<Object>();
// size() and get(i) each take the lock separately, not together
for (int i = 0; i < src.size(); i++) {
    dest.add(src.get(i));
}

With this approach the synchronization is even finer. We don't lock the Collection for the whole time we spend iterating over it; we lock it only for the duration of each src.get(i) call. This means individual List.get() operations block each other, instead of one whole iteration blocking another. For a while I considered this a fabulous idea. But as you may have noticed, we are breaking synchronization here, and we can get into all sorts of problems with this approach.

For example:

Thread 1: starts iterating over the List and reads the List size as 10
Thread 2: removes an item from the List
Thread 1: reaches List.get(i) with i=9, which results in an IndexOutOfBoundsException.

Since Thread 2 has removed one object from the List by the time Thread 1 reaches the last iteration of its for loop (i=9), we have broken the synchronized access to the List.

This led me to an obvious conclusion: iteration is one logical operation on the Collection, and it should block all add, remove and get operations on that Collection.
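
A minimal sketch of that conclusion, with made-up names (SnapshotCopy, snapshot): the whole copy runs while holding the lock of the synchronized List, so a concurrent writer cannot change the List halfway through.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SnapshotCopy {
    /** Copies a synchronized list as one logical operation by holding its lock throughout. */
    static <T> List<T> snapshot(List<T> syncList) {
        synchronized (syncList) {
            return new ArrayList<>(syncList);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> src = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < 10; i++) {
            src.add(i);
        }

        // Writer keeps adding and removing while the main thread takes snapshots
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                src.add(i);
                src.remove(0); // remove by index
            }
        });
        writer.start();

        for (int i = 0; i < 1_000; i++) {
            List<Integer> copy = snapshot(src); // never fails with an index error
            if (copy.isEmpty()) {
                throw new IllegalStateException("snapshot should never be empty here");
            }
        }
        writer.join();
        System.out.println("final size: " + src.size());
    }
}
```

The price is the one described earlier: while a snapshot is being taken, every add and remove on the List has to wait.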

Well, that's not all! This rule can be generalized to all objects. If you make an object completely synchronized internally, the same rule applies: any logical operation on such an object should block other logical operations (maybe not always, but most of the time). I encountered the same problem with a custom object that is supposed to be a database index. I tried to synchronize at the level of individual entries in the index. The entries are stored in a Map, where each IndexEntry is mapped to its key. This did not help, because between the time I get an IndexEntry out (Map.get(key)) and the time I operate on it, some other thread can remove the entry from the Map completely (Map.remove(key)). All further operations by the original thread on this IndexEntry are then invalid. The same principle of the logical operation applies here: each logical operation on the index, such as read, add or update, should be synchronized.
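
A sketch of what "one lock per logical operation" could look like for such an index. SimpleIndex and its methods are hypothetical, and an Integer value stands in for the real IndexEntry. The point is only that the lookup and the update happen under the same lock.

```java
import java.util.HashMap;
import java.util.Map;

final class SimpleIndex {
    private final Map<String, Integer> entries = new HashMap<>();

    synchronized void add(String key, int value) {
        entries.put(key, value);
    }

    /** Lookup and update form one logical operation, so no remove can slip in between. */
    synchronized boolean increment(String key) {
        Integer current = entries.get(key);
        if (current == null) {
            return false; // entry was never added or has been removed
        }
        entries.put(key, current + 1);
        return true;
    }

    synchronized Integer read(String key) {
        return entries.get(key);
    }

    synchronized void remove(String key) {
        entries.remove(key);
    }
}
```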

The other catch with synchronization is that anything leaving the scope of an internally synchronized object needs to be synchronized externally, as we saw with the iterators of synchronized Collections. Even if the object is internally synchronized, if we return part of it by reference (an internal structure or object), that part escapes the synchronization boundary of the object and needs explicit external synchronization by whoever uses it. Most of the time we can avoid this by making defensive copies of the internal structures before returning them, but sometimes this can hurt performance.
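
The defensive copy idea can be sketched like this (NameRegistry is a made-up class for illustration): the getter returns a copy made under the lock, so callers never hold a reference to the internal List.

```java
import java.util.ArrayList;
import java.util.List;

final class NameRegistry {
    private final List<String> names = new ArrayList<>();

    synchronized void add(String name) {
        names.add(name);
    }

    /** Returns a copy made under the lock; the internal list never escapes. */
    synchronized List<String> names() {
        return new ArrayList<>(names);
    }
}
```

Every call to names() copies the whole List, which is exactly the performance cost mentioned above when the List is large.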

Quite a long one! What do you guys say?

kdesu, su and sudo

Recently I upgraded to SUSE 10.2. The first thing I did after the install was configure sudo. I find it handy for operations requiring root privileges. But I always faced problems running GUI applications that need to connect back to the X server. For example, sudo /sbin/yast2 would always fall back to the text-based YaST on SUSE, and other applications would give up with an error. After the installation I tried running YaST again, this time by clicking its link in the launch bar. It showed the usual pop-up asking for the root password. To my surprise, the root password didn't work. So I typed in my own password and bingo, it worked! Now I was puzzled: why would it reject the root password (despite asking for it) and succeed with mine?

So I looked at the link that launches YaST. I found that the application asking for the root password in GUI mode was kdesu, and the command looked like kdesu /sbin/yast2. After reading the brief documentation of kdesu, I learned that kdesu is the KDE equivalent of su. But then why would it reject the root password and accept mine? Then I realized that I had configured sudo with the targetpw option disabled, which makes it ask for my password instead of the root password to grant root privileges. Out of curiosity I ran kdesu /sbin/yast2 again and ran pstree. The tree showed that kdesu had spawned sudo (and not su). This explained why my password was accepted and the root password was not.

kdesu is very handy for running KDE / GUI applications. You can also run applications as some other user (as with sudo) using the -u switch. Say you want to run konqueror as user bozzo: run kdesu -u bozzo konqueror from the Run Command dialog (press Alt + F2).

Operating System in the browser.

In my previous post I mentioned the operating system being provided as a service in the future. In fact, some steps towards it have already been taken. YouOS has started a project that runs a whole operating system inside your browser. This is the most innovative use of Ajax I have seen to date. It has certainly made me rethink my opinions about JavaScript.

Go "Find Bugs" in your software.

Life was going smoothly for me the last couple of weeks. Hardly anything to
do. So I started reading articles and blogs, and I ended up on a post on
TheServerSide that mentioned a tool called "Find Bugs". I found the name
very interesting and downloaded the tool. The very first thing I did was
run it on the code base of the software I was working on. I found it funny
that Find Bugs reported around 700 bugs. Obviously I had at least some
faith in my code, which made me believe that the report was just crap!

Next I started looking at each issue individually and read through the
explanation of why it was reported as a bug. Suddenly it started making a
lot of sense to me. Believe me, 80% of them were bugs! It efficiently
detected blunders like empty catch blocks with catch (Exception e),
assignments to static fields in non-static methods, dead local stores,
class casting problems, possible null pointer dereferences etc. And at
last I found some real work to do! A lot of work! It taught me a lot about
good coding.
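
To make those categories concrete, here is a small made-up class (FlaggedPatterns) containing the kinds of code described above. It is only an illustration of the patterns; whether a particular Find Bugs version reports each one, and under which name, is an assumption.

```java
final class FlaggedPatterns {
    static int counter;

    // Assignment to a static field from a non-static method
    void touch() {
        counter++;
    }

    // Empty catch block that silently swallows every Exception
    static int parseOrZero(String s) {
        int result = 0;
        try {
            result = Integer.parseInt(s);
        } catch (Exception e) {
        }
        return result;
    }

    // Dead local store: the value is computed and never read
    static int deadStore(int x) {
        int unused = x * 2;
        return x;
    }

    // Possible null dereference: the null check does not stop the call below
    static int length(String s) {
        if (s == null) {
            System.out.println("null input");
        }
        return s.length();
    }
}
```

All of this compiles without complaint, which is why a separate tool is needed to point it out.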

Find Bugs is a must-have tool in your Swiss army knife for code review.
It takes a lot of the hassle out of manual code reviews. It comes with an
Eclipse plugin too, which works well, but the Swing front-end provided by
Find Bugs is really good. It lets you address problems by category,
whereas in Eclipse you have to go to each individual source file to find
the bugs in your code. Eclipse also has a summary view of the problems, but
it looks really cluttered. It's a must-use tool for all Java programmers.

My "dream" development environment

My boss keeps telling me to leave my laptop and move to a desktop
machine. I agree, desktops are more powerful and sturdier than laptops.
But my laptop gives me the freedom to work from anywhere. I can work from
home if I want (well.. maybe at least sometimes! :D). I have all the
resources that I need wherever I go. But it's really frustrating when I
try running heavy applications on my laptop. I feel I really need to get
onto a powerful machine. But then I lose all my freedom of working from
anywhere.

Recently I gave this whole idea some thought! Well, what I really need is
powerful hardware! What if I could carry my whole operating system with me
wherever I go, so that I use only the computing power of the hardware
while the operating system (and the whole environment) stays the same?
Then I came across a USB disk, one sent to me by my Dad. The specialty of
this USB disk is that it actually uses a memory card, so you can plug any
memory card into it and use it as a pen drive. For a long time I have
known of a Linux distribution aimed at being really small. It's called
Damn Small Linux (DSL). It comes with a lightweight graphical desktop and
is really damn small in size. A major feature of DSL is that it can start
from within Windows or even within Linux. So what has all this to do with
my dream development environment?

Well, if I can put all these pieces together, I can build my "dream"
development environment! Say I have a 5 GB memory card. I use my funky
USB disk to connect it to my computer. I can install DSL, or in fact any
other distribution, on this USB disk. I can set up my whole development
environment on it. Since I have 5 GB, which I suppose is plenty of space
(even if you don't agree), I can set up all the stuff I want, like Java,
Eclipse, a browser etc. I just have to make sure that I have drivers for
most of the hardware. I take this little USB disk to any machine that can
boot from USB (not necessary if I install DSL), and I boot into my very
own portable development environment!

Sounds amusing, doesn't it? Reading or writing to this memory card may be
very slow, and loading and saving applications can also be considerably
slow. But I wouldn't mind that compromise if I have such a portable
environment with me. Surely comfortable and easy to carry. Well, the only
thing that can bother me now is my USB disk crashing. I think we are not
very far from an era where computing power will be provided as a service,
and your software (my USB disk in this case) will be something you store
on the Internet. The computer will be a very light front-end on top of all
these services.