Java | A blog about Programming and other random stuff

There was one time when I had finished coding and verifying my project module on my local computer. I passed it to UAT, and the UAT team verified the functionality as well. Everything seemed to work pretty well. “So, it is now ready to be deployed to Production, right?” was what I thought, until they mentioned that there was still another test called Load / Stress Testing. What the heck is that?

I wasn’t prepared for such terms while I did my coding, since I was still fairly new as a programmer (about a year) at that time. To make the story short, it failed. It was supposed to be strong enough to handle 140 users at the same time. My project module could handle 5. Only FIVE users at the same time! Oh dear…

I learned then that it’s not just about making sure that all the functionality works properly, with no critical bugs and no error pages. There is another dimension that should be considered to make sure that the project is ready to face the world. You’re doing no good if you show your Masterpiece Project to the WORLD only to find out that just five people can actually use it at the same time.

I learned that Performance is an important factor.

There are many tips & tricks for boosting J2EE performance scattered all over the web. There are too many to compile into one blog post, and some of them are no longer relevant considering all the tweaks / fixes the marvelous Java Team has made to Java over the years. Nevertheless, I am still eager to share those that I have learned so far.

I’d like to start with my favourite principle from Kevlin Henney. I personally consider it a very basic principle of performance optimisation:

There is no code faster than no code
Kevlin Henney

That is, don’t put unnecessary lines into your program. Review your code regularly, or have someone else review it, to make sure that every line has a purpose.

Besides the basic principle from Kevlin Henney above, here is a list of some of the rules that I have learned so far:

1. String operations.

What is the best way to concatenate Strings together? Is it by using ‘+’ operator or by using StringBuffer, or even StringBuilder?

The answer is: All of them. We just need to know how and when to use them properly, based on these rules below:

Use the ‘+’ operator to combine two String constants. The compiler joins them at compile time, so there is no cost at run time, e.g.:

"SELECT * FROM SHOP_STOCK " 
+ "WHERE QTY<QTY_THRESHOLD;" // good example.

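Don’t use the ‘+’ operator to repeatedly combine a constant with a variable, or a variable with a variable (inside a loop, for instance), since each pass can create a new intermediate String. A one-off expression like the one below costs much less, but the same habit becomes expensive once it is repeated, e.g.: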
```
"SELECT * FROM SHOP_STOCK " 
+ "WHERE QTY<" + var_qty_threshold; // bad example
```
Use either StringBuffer or StringBuilder instead.
Use StringBuffer when at least one of the Strings to be combined is a variable and the same buffer is shared between threads (its methods are synchronized), e.g.:
```
StringBuffer sb = new StringBuffer("SELECT * FROM SHOP_STOCK " 
+ "WHERE QTY<").append(var_qty_threshold);
```
Use StringBuilder when at least one of the Strings to be combined is a variable and there is no thread-safety issue involved. It’s suitable for most cases, e.g.:
```
StringBuilder sb = new StringBuilder ("SELECT * FROM SHOP_STOCK " 
+ "WHERE QTY<").append(var_qty_threshold);
```
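Here is a minimal sketch of the case where the choice matters most: building one String inside a loop. The class `CsvJoiner` and its `join` method are names I made up for illustration; they are not part of any library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class CsvJoiner {

    // Appends every item to a single StringBuilder, so one buffer grows
    // instead of a new String being created on each iteration.
    static String join(List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(items.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("A1", "B2", "C3")));
    }
}
```

The builder here is a local variable that never leaves the method, so no other thread can touch it, and StringBuilder is enough.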

2. Combine several EJB method calls into one method call.

A remote EJB method invocation is a heavy process, because every call has to travel across the network. Try not to call EJB methods more often than necessary.

So, instead of doing this:

// Don't do this!!!
for (Document doc: documents) {
  getRemoteEJB().saveDocument(doc); // the EJB method is invoked for every document.
}

do this instead:

// the EJB method is invoked only once for all documents.
getRemoteEJB().saveDocuments(documents);
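To see the difference in numbers, here is a small sketch that counts the calls. `CountingDocumentService` is a hypothetical in-memory stand-in for the remote EJB, and its counter simply simulates the number of round trips; a real remote interface would look different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a remote EJB; each method call
// represents one network round trip.
final class CountingDocumentService {
    private int remoteCalls = 0;
    private final List<String> saved = new ArrayList<>();

    void saveDocument(String doc) {
        remoteCalls++;
        saved.add(doc);
    }

    void saveDocuments(List<String> docs) {
        remoteCalls++;
        saved.addAll(docs);
    }

    int getRemoteCalls() {
        return remoteCalls;
    }
}

final class BatchCallDemo {

    // One invocation per document.
    static int callsOneByOne(List<String> docs) {
        CountingDocumentService service = new CountingDocumentService();
        for (String doc : docs) {
            service.saveDocument(doc);
        }
        return service.getRemoteCalls();
    }

    // One invocation for the whole list.
    static int callsBatched(List<String> docs) {
        CountingDocumentService service = new CountingDocumentService();
        service.saveDocuments(docs);
        return service.getRemoteCalls();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("doc-1", "doc-2", "doc-3");
        System.out.println("one by one: " + callsOneByOne(docs));
        System.out.println("batched:    " + callsBatched(docs));
    }
}
```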

3. Avoid subqueries. Use “JOIN” instead.

A subquery in the select list that refers to the outer row may be executed once for each row. This is not a good idea, especially if there is a large amount of data to be processed in the database. With a join, the database runs a single query that joins the two tables.

This is a bad example:

select DT.DESCRIPTION, DT.QTY,
  (select MASTER.DESCRIPTION from MASTER_INFO MASTER where CODE = DT.EQ_CODE and MASTER.CATEGORY='EQUIPMENT') EQUIPMENT,
  (select MASTER.DESCRIPTION from MASTER_INFO MASTER where CODE = DT.BRAND_CODE and MASTER.CATEGORY='BRAND') BRAND
from DETAIL DT;

and this is the better example:

select DT.DESCRIPTION, DT.QTY, MASTER_EQ.DESCRIPTION EQUIPMENT, MASTER_BRAND.DESCRIPTION BRAND
from DETAIL DT
  left join MASTER_INFO MASTER_EQ on DT.EQ_CODE = MASTER_EQ.CODE and MASTER_EQ.CATEGORY='EQUIPMENT'
  left join MASTER_INFO MASTER_BRAND on DT.BRAND_CODE = MASTER_BRAND.CODE and MASTER_BRAND.CATEGORY='BRAND';

4. Put a value into a variable if it’s going to be used several times.

Avoid putting any computation in the conditional part of the for-loop if you know the value is always fixed throughout the iterations.

Bad example:

// Let's say there are 100 items in the database.
for (int i=0; i<getAllItemsFromDatabase().size(); i++) { // condition checked 101 times
  Item item = getAllItemsFromDatabase().get(i); // 100 times
  // Do whatever it is..
}

Note that the conditional section in the for-loop, “i<getAllItemsFromDatabase().size()”, will be evaluated before every iteration.

If the getAllItemsFromDatabase() method returns, for example, 100 items, then the code above will end up executing the getAllItemsFromDatabase() method 201 times: 101 times for the condition (including the final check that ends the loop) and 100 times inside the loop body. And that is before counting whatever EJB or database work the method itself involves.

It’s like going back and forth between heaven and hell some 200 times. Not my favourite exercise. I mean, seriously…

and for the better example:

List<Item> items = getAllItemsFromDatabase(); // called only once
for (int i=0; i<items.size(); i++) {
  Item item = items.get(i);
  item.setTempIdx(i+1);  // Do whatever it is..
}

In this sample, the method getAllItemsFromDatabase() will be called just once, and the result will be reused for all the iterations.
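If you want to check the numbers yourself, here is a small sketch that counts the calls. `LoopCallCounter` and `getAllItems` are hypothetical stand-ins for the real data access code; the counter only records how often the method is entered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that counts how often the "database" is asked.
final class LoopCallCounter {
    private int calls = 0;

    // Plays the role of getAllItemsFromDatabase() and returns 100 items.
    List<Integer> getAllItems() {
        calls++;
        List<Integer> items = new ArrayList<>();
        for (int n = 0; n < 100; n++) {
            items.add(n);
        }
        return items;
    }

    // The method is called in the loop condition and in the loop body.
    int callsWhenInlined() {
        calls = 0;
        for (int i = 0; i < getAllItems().size(); i++) {
            Integer item = getAllItems().get(i);
        }
        return calls;
    }

    // The method is called once and the result is reused.
    int callsWhenHoisted() {
        calls = 0;
        List<Integer> items = getAllItems();
        for (int i = 0; i < items.size(); i++) {
            Integer item = items.get(i);
        }
        return calls;
    }

    public static void main(String[] args) {
        LoopCallCounter counter = new LoopCallCounter();
        System.out.println("inlined: " + counter.callsWhenInlined());
        System.out.println("hoisted: " + counter.callsWhenHoisted());
    }
}
```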

5. Use “cheap” objects for EJB method calls.

All the parameters and return objects used in a remote EJB method invocation will be serialized. The heavier the object is, the longer it takes for the system to serialize the object.

When a client invokes an EJB method, this is what happens in sequence:

Client invokes the method.
The parameters are serialized and then sent as a stream to the server.
On the server, the stream is de-serialized into the parameters again.
The EJB runs its logic and returns the result object.
The result object is serialized into a stream and then sent back to the client.
The client de-serializes the stream into the result object again.

As seen above, every object used in the method invocation goes through a round of serialization and de-serialization. And it takes time! That’s why it’s good to review the objects again (both the parameters and the return objects) to see whether they are really necessary, or whether they can be simplified.

Take a look at the sample code below:

// Don't do this!!!
public List<SuperComplexDocument> updateDocumentSiblings(SuperComplexDocument doc) {
  // I'm oversimplifying here for the sake of clarity.
  return getDocumentDAO().updateSiblings(doc.getDocumentCode());
}

In the example above, we only need the documentCode attribute of the SuperComplexDocument object. If that’s the case, the method parameter should be simplified as follows:

public List<SuperComplexDocument> updateDocumentSiblings(String documentCode) {
  // I'm oversimplifying here for the sake of clarity.
  return getDocumentDAO().updateSiblings(documentCode);
}

After the modification, the serialization & de-serialization processes will be lighter since the SuperComplexDocument parameter object doesn’t have to be serialized anymore. Only the String object is serialized.
The same thing also applies to the return objects (List<SuperComplexDocument> in this example). If you notice that the return objects can be simplified, then simplify them! You should even remove them if they are unnecessary.
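To get a feeling for the difference, here is a rough sketch that measures how many bytes standard Java serialization produces for a whole object compared with just its code. `HeavyDocument` is a hypothetical class I invented to play the role of SuperComplexDocument, and the real overhead of an EJB call depends on the container and the protocol, so treat the numbers as an indication only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class SerializedSizeDemo {

    // Hypothetical "heavy" parameter object with a lot of content.
    static final class HeavyDocument implements Serializable {
        private static final long serialVersionUID = 1L;
        final String documentCode;
        final List<String> pages = new ArrayList<>();

        HeavyDocument(String documentCode, int pageCount) {
            this.documentCode = documentCode;
            for (int i = 0; i < pageCount; i++) {
                pages.add("Contents of page " + i);
            }
        }
    }

    // Serializes the value in memory and returns the number of bytes written.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        HeavyDocument doc = new HeavyDocument("DOC-001", 1000);
        System.out.println("whole document: " + serializedSize(doc) + " bytes");
        System.out.println("code only:      " + serializedSize(doc.documentCode) + " bytes");
    }
}
```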

6. System.out.println < Logger < no-log-at-all.

Printing out variables to the Production console for the sake of debugging is not a good idea. It’s time-consuming, especially when the application is used by many users at once.

If you deploy a program into a Production environment and it uses the System.out.println command, you will likely run into a problem. If, for example, there are 100 users accessing the same module at the same time, then this very simple command will be executed 100 times. This is very inefficient because the server is spending its resources on something that should not be done in a Production environment at all.

If you do need to print out a value to debug something in your local environment but not in the Production environment, use a Logger instead. A Logger checks the configured log level first and decides whether the message should be written at all. So, not everything will be printed to the console for all users. The popular one for now is Log4J, and it has been sufficient for me so far.

The best scenario is definitely not logging at all (take a look at the quote I put at the beginning of this post). Whether or not the message ends up being filtered out by the Logger, the filtering process itself takes time, although not as much as the System.out.println method. That’s why excessive logging will also affect the performance of the application.
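Here is a minimal sketch of that filtering, and of how to avoid even building a message that will be thrown away. I am using java.util.logging from the JDK here instead of Log4J so that the example runs without extra libraries; the idea is the same, but the API names differ. `GuardedLoggingDemo` and `expensiveMessage` are made-up names.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

final class GuardedLoggingDemo {
    private static final Logger LOG =
            Logger.getLogger(GuardedLoggingDemo.class.getName());

    // Counts how many times the costly message was actually built.
    static final AtomicInteger messagesBuilt = new AtomicInteger();

    // Simulates a debug message that is costly to build.
    static String expensiveMessage() {
        messagesBuilt.incrementAndGet();
        return "state of order 42";
    }

    // Logs two debug messages and returns how many were really built.
    static int buildsAtLevel(Level level) {
        LOG.setLevel(level);
        LOG.setUseParentHandlers(false); // keep the demo quiet
        messagesBuilt.set(0);

        // Option 1: check the level before building the message.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(expensiveMessage());
        }

        // Option 2: pass a supplier; it is only called when FINE is enabled.
        LOG.fine(() -> expensiveMessage());

        return messagesBuilt.get();
    }

    public static void main(String[] args) {
        System.out.println("built at INFO: " + buildsAtLevel(Level.INFO));
        System.out.println("built at FINE: " + buildsAtLevel(Level.FINE));
    }
}
```

With the level set to INFO, as it typically would be in Production, neither message is built; with FINE, both are.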

7. Favor Stateless Bean over Stateful Bean.

A Stateful Bean is heavier than a Stateless Bean since the server has to keep the conversational state of every user in memory. If there are 100 users working with a Stateful Bean, the server holds 100 separate Bean instances, one per user.

A Stateless Bean keeps no state for any particular user, so its instances are interchangeable. The server can serve all users from a small pool of instances and reuse an instance as soon as a call finishes. So, if there are 100 users calling an EJB method, the server does not need 100 Beans to handle them.

8. Adjust the Transaction attributes as necessary.

One aspect of EJB that makes it heavier is the use of transactions. If an EJB method is meant only to retrieve a result without modifying anything, then a transaction is not necessary. Adjust the transaction attribute accordingly for all the EJB methods based on this behaviour.

You can set the transaction attributes in the ejb-jar.xml deployment descriptor file. If you don’t need a transaction for a certain method call, you can set the value to be either ‘NotSupported’ or ‘Never’. You can check the manual for more details.

9. Use ‘transient’ modifier to reduce serialization overheads.

As mentioned above, serialization and de-serialization are heavy tasks. One way to lighten the burden is to mark the attributes that do not need to be written to a file, sent over the network, etc. with the ‘transient’ modifier. Basically, we’re telling the system which attributes can skip the serialization process.
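A minimal sketch of the effect, using plain Java serialization in memory. The `Report` class and its fields are hypothetical; the point is only that the transient field is left out of the stream and comes back as null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientDemo {

    // Hypothetical class: cachedHtml can be rebuilt from the title,
    // so there is no need to serialize it.
    static final class Report implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;
        transient String cachedHtml;

        Report(String title, String cachedHtml) {
            this.title = title;
            this.cachedHtml = cachedHtml;
        }
    }

    // Serializes the report to a byte array and reads it back.
    static Report roundTrip(Report report) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(report);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Report) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Report copy = roundTrip(new Report("Q1", "<h1>Q1</h1>"));
        System.out.println("title:      " + copy.title);
        System.out.println("cachedHtml: " + copy.cachedHtml);
    }
}
```

Keep in mind that a transient attribute has to be rebuilt or reloaded on the receiving side if it is still needed there.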

10. Don’t reinvent the wheel.

If you’re using a HashMap object that is frequently accessed by several threads at once, you might as well replace it with ConcurrentHashMap. That way you don’t need to decide manually how to lock / unlock the key, where to synchronize, etc. The Classes / APIs provided with Java have been tweaked, tested and optimized again and again to work as well as they possibly can.

Another example is when you try to check whether a String starts with certain characters. Instead of writing your own substring, iteration, etc., you can simply use the method startsWith(String prefix).

Make sure that you’re not reinventing the wheel. Check first whether such functionality has been natively invented or not before deciding to make one.
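As a small sketch of the ConcurrentHashMap suggestion above: several threads increase the same counter without any hand-written locking. `HitCounterDemo`, the key "home" and the thread counts are made up for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class HitCounterDemo {

    // Lets several threads count hits for the same key and returns the total.
    static int countHits(int threads, int hitsPerThread) {
        ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    // merge() updates the value for the key atomically.
                    hits.merge("home", 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hits.getOrDefault("home", 0);
    }

    public static void main(String[] args) {
        System.out.println("total hits: " + countHits(4, 1000));
        System.out.println("invoice".startsWith("inv"));
    }
}
```

With a plain HashMap and no synchronization, the same loop could lose updates, which is exactly the kind of problem the built-in class already solves.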

Source:

1. http://www.javaperformancetuning.com/

2. http://javaboutique.internet.com/tutorials/tuning/

3. http://www.javaworld.com/javaworld/jw-05-2004/jw-0517-optimization.html

Tips for J2EE optimisation