Improving Hibernate performance using second level cache

There is a trick that pays off when there are ResultSets or entities in a web application that are not updated so often: second-level caching. Getting second-level cache to work means that entities or
ResultSets can be stored in the cache and not retrieved from the database in future calls.

There are many cache providers, among them is EhCache which we will be using in this example. In order to get second-level cache to work, first we have to tell hibernate about it. Add this in the hibernate configuration file (hibernate.cfg.xml):

<property name="cache.provider_class">org.hibernate.cache.EhCacheProvider</property>
<property name="cache.use_query_cache">true</property>
<property name="cache.use_second_level_cache">true</property>
These properties tell hibernate that we will be using EhCache for entity and query caching. Then, create a file named ehcache.xml (the default name of the EhCache configuration file) and put it in the same location with hibernate.cfg.xml, typically in the classpath root. A sample configuration for ehcache.xml is the following:

<ehcache>
<diskStore path="java.io.tmpdir"/>

<defaultCache
maxElementsInMemory="20000"
eternal="false"
timeToIdleSeconds="300"
timeToLiveSeconds="300"
overflowToDisk="true"
diskPersistent="false"
diskExpiryThreadIntervalSeconds="300"
memoryStoreEvictionPolicy="LRU" />

<cache name="com.app.entities.City"
maxElementsInMemory="15000"
eternal="true"
overflowToDisk="false" />
...
</ehcache>

In short, this tells EhCache to use the java temp folder and to store a maximum of 20000 entities in the cache (cities for instance). The configuration for each entity overrides the global configuration in the ehcache.xml file. In this example, the global setting defines a TimeToLiveSeconds value which is overriden for the City entities, causing them to be stored eternally in the cache. The same holds for the cache location: global configuration allows the usage of the hard disk but for the entities cached, only the memory will be used (overflowToDisk=false).

But, what happens when an entity changes? There is no problem as long as the entity is configured through hibernate:

@Entity
@Cache(usage=CacheConcurrencyStrategy.READ_WRITE)
public class City {
private Long id;
private String name;
....
}

The @Cache annotation in this example uses a READ_WRITE strategy, which tells hibernate to update the cache when the entity is updated. If the entity is not so probable to be changed, a read-only strategy will suffice (CacheConcurrencyStrategy.READ_ONLY).

The good part is that ResultSets can be stored as well. Suppose you have a query that returns the most popular cities by country. Define it as cacheable and it will be evaluated against the database only once:

public List<City> findPopularCitiesByCountryId(Long countryId) {
List<City> cities = session.createQuery("FROM City AS c WHERE
c.country.id=? ORDER BY c.popularity DESC")
.setCacheable(true)
.setLong(0, countryId)
.setMaxResults(20)
.list();
return cities;
}

The trick is the setCacheable property of the query. Running the application in debug mode, you will see in the output for the first time

[net.sf.ehcache.Cache]-[DEBUG] org.hibernate.cache.StandardQueryCache cache - Miss
[org.hibernate.cache.StandardQueryCache]-[DEBUG] query results were not found in cache

[org.hibernate.cache.ReadWriteCache]-[DEBUG] Cache miss: com.app.entities.City#1412
[org.hibernate.event.def.DefaultLoadEventListener]-[DEBUG] object not resolved in any cache

…but in subsequent calls you will be happy to see, that even if the query cache fails:

[net.sf.ehcache.Cache]-[DEBUG] org.hibernate.cache.UpdateTimestampsCache cache - Miss
[org.hibernate.cache.EhCache]-[DEBUG] Element for cities is null

the entities will be retrieved from the cache:

[org.hibernate.cache.ReadWriteCache]-[DEBUG] Cache lookup: com.app.entities.City#1412
[org.hibernate.cache.ReadWriteCache]-[DEBUG] Cache hit: com.app.entities.City#1412

There you go, you just boosted your application’s performance. Take care though, common sense orders that you should only cache ResultSets or entities that are frequently accessed and rarely modified. You can find more info at the official hibernate manual:
http://www.hibernate.org/hib_docs/v3/reference/en/html/performance.html

Batch inserts in hibernate

We recently faced a performance issue with the insertion of a few thousand records in the database. Instead of the expected few seconds, the insertions took more than a minute! Checking the logs we realised that the problem was caused by the number of statements sent to the server than to any other bottleneck – server load, network bandwidth or other lack of resources.

The solution was to enable batch inserts and we first tried doing it by just setting the attribute hibernate.jdbc.batch_size to a relatively high number, 100 in our case. Looking at the logs though that did not have any result, the number of insert statements sent to the database was still the same as before.

Then, we noticed that all our statements with the database had the autocommit set to true; that was disabling by default the creation of batch statements, therefore we had to change it to false; instead of the simple statement
try {
  for (MyObject object : objectList){
     getSession().saveOrUpdate(object);
  }
}
catch (.. ..){
...
}
we had to code the following
  try {
  autoCommit = getSession().connection().getAutoCommit();
  getSession().connection().setAutoCommit(false);
  Transaction tx = getSession().beginTransaction();
  for (MyObject object : objectList){
     getSession().saveOrUpdate(object);
  }
  tx.commit();
  }
  catch (.. ..){
  ...
  }

Still, the behaviour persisted! The number of insert statements sent to the database server was related to the number of objects to saveorupdate!

Checking again the documentation – and occasionally the source code of hibernate – we realised that batch inserts were disabled if the id column of the table was of type Identity. At first we thought this was not a problem as we are using MySql and the generator type is native! The reality is that native maps to Identity for MySql. Overall, I think the lesson learned is that if hinernate needs to get the value of the introduced record from the server, then batch inserts are disabled – even if you do not need those ids anymore!

Changing the generator to hilo with a corresponding table to hold the generated ids, we seemd to have solved the problem; the number of insert statements was divided by the batch_size parameter, specified above. Be careful: looking at the log4j logs of hibernate the number of Insert statements remains the same, but if you look at the mysql logs, then you see the truth!

Then, we faced the final problem: the batch inserts worked perfectly while running the jUnit test, failed completely when the same code was run in the web application.

The problem we faced was java.sql.SQLException: Can't call commit when autocommit=true

We thought of a problem with jBoss and the wrapping datasource! Changed it to DBCP and the problem remained! Then, by googling we found quotes mentioning a problem with autocommit and Tomcat. Setting the attribute relaxAutoCommit=true in the jdbc url we solved our problem.

jdbc:mysql://dbserver/database?rewriteBatchedStatements=true&relaxAutoCommit=true

The above setting works well with the DBCP pool, but raises the same exception as above when used with a jboss hibernate configuration file, e.g. mysql-ds.xml.