Monday, 26 December 2011

Extend Thread vs implement Runnable

There are two ways to create your own thread type: subclass java.lang.Thread, or implement java.lang.Runnable and pass the instance to a Thread constructor or a java.util.concurrent.ThreadFactory. What is the difference, and which one is better?
1, The practical reason: a Java class can have only one superclass. If your thread class extends java.lang.Thread, it cannot inherit from any other class, which limits how you can reuse your application logic.
2, From a design point of view, there should be a clean separation between how a task is defined and how it is executed. The former is the responsibility of a Runnable implementation; the latter is the job of the Thread class.
3, A Runnable instance can be passed to other libraries that accept task submission, e.g., a java.util.concurrent.ExecutorService obtained from java.util.concurrent.Executors. A Thread subclass inherits all the overhead of thread management and is hard to reuse.
4, Their instances also have different lifecycles. A Thread can be started only once; after it has completed its work it cannot be restarted and is subject to garbage collection. A Runnable task instance can be resubmitted or retried multiple times, though usually a new task is instantiated for each submission to ease state management.
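
To make points 2 to 4 concrete, here is a minimal sketch. CountingTask and RunnableDemo are made-up names for illustration only; the idea is just that one Runnable instance can be handed to a plain Thread and then resubmitted to an executor.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical task: the work is defined here, independent of how it is run.
class CountingTask implements Runnable {
    private final AtomicInteger counter;

    CountingTask(AtomicInteger counter) {
        this.counter = counter;
    }

    @Override
    public void run() {
        counter.incrementAndGet();
    }
}

class RunnableDemo {
    // Runs the same task instance once on a plain Thread and twice on a pool.
    static int runThreeTimes() throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        Runnable task = new CountingTask(counter);

        Thread thread = new Thread(task); // execution is the Thread's job
        thread.start();
        thread.join();

        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(task); // the same instance can be resubmitted
        pool.submit(task);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);

        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runThreeTimes()); // prints 3
    }
}
```

Calling start() a second time on the same Thread object would throw IllegalThreadStateException, which is the lifecycle difference from point 4.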

Thursday, 22 December 2011

O(n) for operations on Java collections


Collection Interfaces - Java Collection Diagrams


And when we include maps:

One Java - 2 compilers (javac and JIT)

Students are often confused about how Java and the JVM work because there are TWO compilers involved. When someone mentions a compiler or the Just-In-Time compiler, some of them imagine it’s the same one, the Java compiler.

So how does it really work?

It’s simple..

1) You write Java code (file.java), which is compiled to “bytecode”. This is done by javac, the first compiler.

It’s a well-known fact that Java code can be written once, compiled, and run anywhere (on any platform). This means a JVM built for any given platform can read the same good old bytecode.

2) Upon execution of a Java program (a class file, or a JAR file that contains classes and other resources), the JVM must somehow execute the program, which means translating the bytecode into machine code for the specific platform.

In the first versions of Java, the JVM was a “stupid” interpreter that executed bytecode one instruction at a time. That was extremely slow, people got mad, there were a lot of “lame Java, awesome C” talks, and the JVM guys got irritated and reinvented the JVM.

The “new” JVM was initially available as an add-on for Java 1.2; later it became the default Sun JVM (1.3+).

So what did they do? They added a second compiler: the Just-In-Time compiler (aka JIT).

Instead of interpreting instruction by instruction, the JIT compiler compiles bytecode to machine code at run time, while the program is executing.

Moreover, the JVM gets smarter with every release: it “knows” which code it should keep interpreting and which parts are worth compiling to machine code (still at run time).

It does that by collecting real-usage statistics and applying a long list of super-awesome heuristics.

The user can configure the JVM to enable or disable some of those heuristics.

To summarize: in order to execute Java code you use two different compilers. The first one (javac) is generic and compiles Java to bytecode; the second (the JIT) is platform-dependent and compiles some portions of the bytecode to machine code at run time!
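
If you want to see the second compiler from inside a running program, the standard java.lang.management API exposes it. This is only a small sketch (JitInfo is a made-up class name); the name it prints depends on your JVM, so no particular output is promised here.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitInfo {
    // Returns the JIT compiler's name, or "none" if this JVM has no compilation system.
    static String jitName() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        return jit == null ? "none" : jit.getName();
    }

    public static void main(String[] args) {
        System.out.println("JIT compiler: " + jitName());

        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        // Total compilation time is optional, so check before asking for it.
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("Time spent compiling (ms): " + jit.getTotalCompilationTime());
        }
    }
}
```

On HotSpot, starting the JVM with the -Xint option forces interpreter-only execution, which is one easy way to feel how much work the JIT normally does.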

Optimizing and Speeding up the code in Java

Finalizers: an object whose class overrides the finalize() method (“Called by the garbage collector on an object when garbage collection determines that there are no more references to the object.”) is slower, for both allocation and collection! If it’s not a must, do cleanup in other ways (e.g. in JDBC, close the connection in the finally clause of a try-catch-finally block instead).
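
Here is a sketch of explicit cleanup without finalize(). Note that it uses try-with-resources (Java 7+) rather than the hand-written try-catch-finally mentioned above; the effect is the same, close() runs when the block exits. TrackedResource is a hypothetical stand-in for something like a JDBC connection.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource standing in for something like a JDBC connection.
class TrackedResource implements AutoCloseable {
    private final List<String> log;

    TrackedResource(List<String> log) {
        this.log = log;
        log.add("open");
    }

    void use() {
        log.add("use");
    }

    @Override
    public void close() {
        log.add("close"); // deterministic cleanup, no finalize() needed
    }
}

class CleanupDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (TrackedResource r = new TrackedResource(log)) {
            r.use();
        } // close() is called here, even if use() throws
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [open, use, close]
    }
}
```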

New is expensive: creating a new heavy object on the heap is expensive! It’s recommended to recycle old objects (by changing their fields) or to use the flyweight design pattern.
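
A minimal flyweight sketch, assuming an immutable value type that is safe to share. Color and its of() factory are invented for this example: instead of calling new every time, callers ask the factory, which hands back a cached instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable value type shared through a cache (flyweight).
final class Color {
    private static final Map<String, Color> CACHE = new HashMap<>();
    private final String name;

    private Color(String name) {
        this.name = name;
    }

    // Returns the shared instance for a name, creating it only on first request.
    static synchronized Color of(String name) {
        return CACHE.computeIfAbsent(name, Color::new);
    }

    String name() {
        return name;
    }
}

class FlyweightDemo {
    public static void main(String[] args) {
        System.out.println(Color.of("red") == Color.of("red"));  // prints true
        System.out.println(Color.of("red") == Color.of("blue")); // prints false
    }
}
```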

Strings: (1) Strings are immutable, which means that using the + operator between two strings makes the JVM create a new String (s1+s2) on the heap (expensive, as I just mentioned). To avoid that, especially when concatenating in a loop, it’s recommended to use StringBuffer. (Update) Since JDK 1.5 introduced StringBuilder, it is a better option than StringBuffer in a single-threaded environment.

(2) Don’t convert your strings to lower case in order to compare them; use String.equalsIgnoreCase() instead.

(3) String.startsWith() is more expensive than String.charAt(0) for checking the first character.
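
The three string tips can be sketched like this. StringTips and its method names are made up for the example; the String and StringBuilder calls are the standard ones.

```java
class StringTips {
    // (1) Build a string in a loop with one StringBuilder instead of repeated '+'.
    static String join(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    // (2) Compare ignoring case without allocating lower-cased copies.
    static boolean sameIgnoringCase(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    // (3) Check the first character directly. The empty string needs a guard,
    // because charAt(0) throws on it while startsWith() does not.
    static boolean startsWithChar(String s, char c) {
        return !s.isEmpty() && s.charAt(0) == c;
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] {"a", "b", "c"})); // prints abc
        System.out.println(sameIgnoringCase("Java", "JAVA"));    // prints true
        System.out.println(startsWithChar("hello", 'h'));        // prints true
        System.out.println(startsWithChar("", 'h'));             // prints false
    }
}
```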

Inline methods: inlining is a compiler feature. When you call a method from anywhere in your code, the compiler copies the body of the method and replaces the call with it.

Obviously, this saves time at run time: (1) there is no method call overhead, and (2) there is no dynamic dispatch.

In some languages you can annotate a method as inline, but in Java that’s impossible; it’s the compiler’s decision.

The compiler will consider inlining a method if the method is private, because a private method cannot be overridden and the call target is always known.

My recommendation is to search your code for heavily used methods (mostly those called in loops) and declare them private if possible.

Don’t reinvent the wheel: the Java API is smart and sophisticated, and in some cases uses native implementations, code that you probably can’t compete with. Unless you know what you are doing (performance-wise), don’t rewrite methods that already exist in the Java API. E.g. benchmarks showed that copying an array using a for loop is at least n/4 times slower than using System.arraycopy().
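
For reference, here are the two ways of copying side by side. This sketch (ArrayCopyDemo is a made-up name) only shows that they produce the same result; it makes no claim about the speed difference on your machine, so measure that yourself.

```java
import java.util.Arrays;

class ArrayCopyDemo {
    // Hand-written copy: one element per loop iteration.
    static int[] copyWithLoop(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Library copy: arguments are source, source offset, destination,
    // destination offset, and number of elements.
    static int[] copyWithArraycopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        System.out.println(Arrays.equals(copyWithLoop(data), copyWithArraycopy(data))); // prints true
    }
}
```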

Reflection: reflection has become much faster on the most recent JVMs, yet using reflection is almost certainly slower than not using it, which means you’d better avoid reflection when there is no need for it.

Synchronization: some data structures are synchronized for concurrent use out of the box. In a single-threaded application, don’t use those, to avoid the overhead; e.g. use ArrayList instead of Vector.

Multithreading: on a multiprocessor machine, use threads; it will definitely speed up your code when the work can be split into independent parts. If you are not a thread expert, some compilers know how to restructure your code to use threads automatically for you. You can always read a Java threads tutorial as well.
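
As a rough illustration of splitting work across processors, the sketch below sums an array in chunks on a thread pool. ParallelSum is a hypothetical name and the chunking scheme is just one simple choice; whether it actually beats a plain loop depends on the data size and the machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSum {
    // Splits the array into one chunk per worker and adds up the partial sums.
    static long sum(int[] data, int workers) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (data.length + workers - 1) / workers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(w * chunk, data.length);
                final int to = Math.min(from + chunk, data.length);
                Callable<Long> job = () -> {
                    long partial = 0;
                    for (int i = from; i < to; i++) {
                        partial += data[i];
                    }
                    return partial;
                };
                parts.add(pool.submit(job));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // waits for that chunk to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[100];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        int workers = Runtime.getRuntime().availableProcessors();
        System.out.println(sum(data, workers)); // prints 5050
    }
}
```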

Get familiar with the standard data structures: e.g. if you need a data structure for putting and retrieving objects by key, use HashMap and not ArrayList (O(1) average lookup instead of O(n)).
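
The difference is easy to see in code. In this sketch (LookupDemo and the sample data are invented), the list version has to scan entries one by one, while the map version hashes the key straight to its bucket.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LookupDemo {
    // O(n): scans the {key, value} pairs until the key matches.
    static String findInList(List<String[]> pairs, String key) {
        for (String[] pair : pairs) {
            if (pair[0].equals(key)) {
                return pair[1];
            }
        }
        return null;
    }

    // O(1) on average: a single hash lookup.
    static String findInMap(Map<String, String> map, String key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        List<String[]> pairs = new ArrayList<>();
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            pairs.add(new String[] {"key" + i, "value" + i});
            map.put("key" + i, "value" + i);
        }
        System.out.println(findInList(pairs, "key999")); // prints value999
        System.out.println(findInMap(map, "key999"));    // prints value999
    }
}
```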

Add an id field for faster equals(): objects that contain many fields are expensive to compare (equals()-wise). To avoid that, add a unique id field to your object and override the equals() method (and hashCode(), to stay consistent with it) to compare ids only.
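
A minimal sketch of id-based equality. Customer is a hypothetical class, and the whole approach assumes the ids really are unique and assigned somewhere else (e.g. by a database); if two different objects can share an id, this equals() will lie to you.

```java
// Hypothetical class with several fields but equality defined by id only.
final class Customer {
    private final long id; // assumed unique, assigned elsewhere
    private final String name;
    private final String address;

    Customer(long id, String name, String address) {
        this.id = id;
        this.name = name;
        this.address = address;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Customer)) {
            return false;
        }
        return id == ((Customer) other).id; // one comparison instead of one per field
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id); // kept consistent with equals()
    }
}

class EqualsDemo {
    public static void main(String[] args) {
        Customer a = new Customer(1L, "Ann", "1 Main St");
        Customer b = new Customer(1L, "Ann", "2 Side St");
        Customer c = new Customer(2L, "Ann", "1 Main St");
        System.out.println(a.equals(b)); // prints true
        System.out.println(a.equals(c)); // prints false
    }
}
```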

Be careful: if your code already works, optimizing it is a sure way to create new bugs and make your code less maintainable!

It’s highly recommended to time your method before and after an optimization.
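
A rough way to do that timing with System.nanoTime() is sketched below. Stopwatch is a made-up helper, and the warm-up loop is there so the JIT gets a chance to compile the code before the clock starts; treat the numbers as a ballpark, not a proper benchmark.

```java
class Stopwatch {
    // Runs the task 'runs' times after a warm-up and returns average nanoseconds per run.
    static long averageNanos(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < runs; i++) {
            task.run(); // warm-up, not measured
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / runs;
    }

    public static void main(String[] args) {
        long nanos = averageNanos(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++) {
                sb.append(i);
            }
        }, 10_000);
        System.out.println("Average ns per run: " + nanos);
    }
}
```

Run it once against the old version of your method and once against the optimized one, with the same inputs, and only keep the optimization if the difference is clear.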