Improve Performance By Using Batch Gets on Google App Engine


We are porting a JPA application to Google App Engine. One of the challenges we faced during this exercise was performance. There are simple and effective ways to improve an application's performance, and the datastore batch get is one such optimization we used during the port.

If you have used the low-level datastore API, you have probably used a batch get. But first of all, what is it?

A batch get is a super-efficient way to load multiple entities when you already have the keys of the entities you want to load. Here is an example using the low-level datastore API:

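import com.google.appengine.api.datastore.*;
import java.util.*;

// Loads every entity whose key is in the list with a single datastore call.
public Map<Key, Entity> getById(List<Key> keys) {
      DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
      return ds.get(keys);
}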
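To make the saving concrete, here is a minimal sketch that counts simulated round trips for one get per key versus a single batch get. FakeDatastore, getAll() and the key names are hypothetical stand-ins invented for illustration; they are not part of the App Engine API, and the counter only models the idea that each call costs one trip to the datastore.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a remote datastore (not the App Engine API).
class FakeDatastore {
    private final Map<String, String> rows = new HashMap<>();
    private int roundTrips = 0;

    void put(String key, String value) {
        rows.put(key, value);
    }

    // Single get: one simulated round trip per key.
    String get(String key) {
        roundTrips++;
        return rows.get(key);
    }

    // Batch get: one simulated round trip for the whole list of keys.
    Map<String, String> getAll(List<String> keys) {
        roundTrips++;
        Map<String, String> found = new LinkedHashMap<>();
        for (String key : keys) {
            String value = rows.get(key);
            if (value != null) {
                found.put(key, value); // keys with no stored value are left out
            }
        }
        return found;
    }

    int roundTrips() {
        return roundTrips;
    }
}

class BatchGetSketch {
    public static void main(String[] args) {
        FakeDatastore ds = new FakeDatastore();
        ds.put("k1", "entry one");
        ds.put("k2", "entry two");
        ds.put("k3", "entry three");
        List<String> keys = Arrays.asList("k1", "k2", "k3");

        // Loading the entries one at a time costs one round trip per key.
        for (String key : keys) {
            ds.get(key);
        }
        System.out.println("one get per key: " + ds.roundTrips() + " round trips");

        // Loading them as a batch costs a single round trip.
        int before = ds.roundTrips();
        Map<String, String> result = ds.getAll(keys);
        System.out.println("batch get: " + (ds.roundTrips() - before)
                + " round trip(s), " + result.size() + " entities");
    }
}
```

In this toy model the cost of the loop grows with the number of keys while the batch call stays constant, which is the property that makes batch gets attractive when the keys are already known.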

Can we issue a batch get using JPA? Yes, we can.

Suppose you have an entity TimesheetEntry and a data access object JpaTimesheetEntryDao with methods that fetch data from the datastore.

@Entity
public class TimesheetEntry {
	@Id
	@GeneratedValue(strategy = GenerationType.IDENTITY)
	private Key timesheetEntryKey;

	private Key projectAssignmentKey;

	private Float hours;

	private Date entryDate;
	. . .
}


If you already have a list of TimesheetEntry keys, you can fetch the entities with a batch get on the datastore. Here is the code listing where we do that:

public class JpaTimesheetEntryDao {
   . . .
   @SuppressWarnings("unchecked")
   public List<TimesheetEntry> fetchTimesheetEntries(List<String> timesheetEntryKeys) {
      // Equality on the primary key, with the list of keys bound as the parameter.
      Query query = getEntityManager().createQuery("Select from TimesheetEntry where timesheetEntryKey = :timesheetEntryKeys");
      query.setParameter("timesheetEntryKeys", timesheetEntryKeys);
      // Execute the query once and return the loaded entities.
      return query.getResultList();
   }
   . . .
}

The primary key attribute of the entity does not have to be of type Key for a batch get to be issued; other primary key types such as Long or String work as well. The only requirements are that the keys in the query belong to the same entity group and that a list of primary key values is passed to the query.

One way we might have implemented fetchTimesheetEntries() is with a JPA IN query. That would have been inefficient and would also have suffered from limitations. You can read a bit more about the JPA IN query here.

A datastore batch get is a super-efficient way of loading multiple entities. We used JPA batch gets in our application wherever we had a list of entity keys. It is definitely a simple and effective performance optimization.


2 Responses to Improve Performance By Using Batch Gets on Google App Engine

  1. >The only requirement is that the Keys in query has to belong to the same entity group and list of primary attribute is being passed in the Query.

    So does this mean that it would not work for root entities or that it would just work for the root entities and not the child entities under it?
