We are porting a JPA application to Google App Engine, and one of the challenges we faced was performance. There are simple and effective ways to improve an application's performance, and the datastore batch get is one such optimization we used during the port.
If you have worked with the low-level datastore API, you have probably used a batch get. First of all, what is it?
A batch get is a very efficient way to load multiple entities when you already have the keys of the entities you want to load. Here is an example using the low-level datastore API:
[sourcecode language=”java”]
// Loads all entities for the given keys in a single datastore call.
public Map<Key, Entity> getById(List<Key> keys) {
    DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
    return ds.get(keys);
}
[/sourcecode]
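To make the efficiency argument concrete, here is a minimal sketch that counts round trips for one-by-one gets versus a single batch get. `FakeDatastore` is a hypothetical in-memory stand-in written for this illustration, not the App Engine API; it only assumes that each call to the store costs one round trip.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a remote datastore (NOT the App Engine API).
// It only counts how many calls ("round trips") are made against it.
class FakeDatastore {
    private final Map<String, String> rows = new HashMap<>();
    private int roundTrips = 0;

    void put(String key, String value) {
        rows.put(key, value);
    }

    // Single get: one round trip per key.
    String get(String key) {
        roundTrips++;
        return rows.get(key);
    }

    // Batch get: one round trip regardless of how many keys are requested.
    // Keys without a stored value are left out of the result.
    Map<String, String> get(List<String> keys) {
        roundTrips++;
        Map<String, String> found = new HashMap<>();
        for (String key : keys) {
            String value = rows.get(key);
            if (value != null) {
                found.put(key, value);
            }
        }
        return found;
    }

    int roundTrips() {
        return roundTrips;
    }
}

class BatchGetSketch {
    // Loads every key with its own call and returns the round trips used.
    static int loadOneByOne(FakeDatastore ds, List<String> keys) {
        int before = ds.roundTrips();
        for (String key : keys) {
            ds.get(key);
        }
        return ds.roundTrips() - before;
    }

    // Loads all keys with one call and returns the round trips used.
    static int loadAsBatch(FakeDatastore ds, List<String> keys) {
        int before = ds.roundTrips();
        ds.get(keys);
        return ds.roundTrips() - before;
    }

    public static void main(String[] args) {
        FakeDatastore ds = new FakeDatastore();
        ds.put("a", "1");
        ds.put("b", "2");
        ds.put("c", "3");
        List<String> keys = Arrays.asList("a", "b", "c");
        System.out.println("one-by-one round trips: " + loadOneByOne(ds, keys));
        System.out.println("batch round trips: " + loadAsBatch(ds, keys));
    }
}
```

With a real datastore each round trip is a remote call, so the gap between the two approaches grows with the number of keys.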
Can we issue a batch get using JPA? Yes, we can.
Suppose you have an entity TimesheetEntry and a data access object JpaTimesheetEntryDao with methods to fetch data from the datastore.
[sourcecode language=”java”]
@Entity
public class TimesheetEntry {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Key timesheetEntryKey;
    private Key projectAssignmentKey;
    private Float hours;
    private Date entryDate;
    . . .
}
[/sourcecode]
If you already have a list of keys of TimesheetEntry entities, you can load them with a batch get on the datastore. Here is the code listing where we do that:
[sourcecode language=”java”]
public class JpaTimesheetEntryDao {
    . . .
    @SuppressWarnings("unchecked")
    public List<TimesheetEntry> fetchTimesheetEntries(List<String> timesheetEntryKeys) {
        // The list of keys is bound to an equality filter on the primary key.
        Query query = getEntityManager().createQuery(
                "Select from TimesheetEntry where timesheetEntryKey = :timesheetEntryKeys");
        query.setParameter("timesheetEntryKeys", timesheetEntryKeys);
        // Execute the query exactly once.
        return query.getResultList();
    }
    . . .
}
[/sourcecode]
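A caller of fetchTimesheetEntries() may want to check that every requested key actually came back. The following is a minimal sketch of such a check, under the assumption (not verified here for the JPA query) that keys with no matching entity are simply absent from the result list. `FakeEntry` and `MissingKeys` are hypothetical names used only for this illustration; `FakeEntry` stands in for TimesheetEntry with a String key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for TimesheetEntry, reduced to its key.
class FakeEntry {
    private final String key;

    FakeEntry(String key) {
        this.key = key;
    }

    String getKey() {
        return key;
    }
}

class MissingKeys {
    // Returns the requested keys that have no entity in the loaded list,
    // preserving the order in which they were requested.
    static List<String> find(List<String> requested, List<FakeEntry> loaded) {
        Set<String> loadedKeys = new HashSet<>();
        for (FakeEntry entry : loaded) {
            loadedKeys.add(entry.getKey());
        }
        List<String> missing = new ArrayList<>();
        for (String key : requested) {
            if (!loadedKeys.contains(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("k1", "k2", "k3");
        List<FakeEntry> loaded = Arrays.asList(new FakeEntry("k1"), new FakeEntry("k3"));
        System.out.println("missing keys: " + find(requested, loaded));
    }
}
```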
The entity's primary key does not have to be of type Key for a batch get to work; a primary key of type Long or String works just as well. The only requirement is that the keys in the query have to belong to the same entity group and that a list of primary key values is passed to the query.
One way we could have implemented fetchTimesheetEntries() is with a JPA IN query. That would have been inefficient and would also have suffered from limitations. You can read a bit more about JPA IN queries here.
A datastore batch get is a very efficient way of loading multiple entities. We used the JPA batch get in our application wherever we had a list of entity keys. It is a simple and effective performance optimization.
> The only requirement is that the keys in the query have to belong to the same entity group and that a list of primary key values is passed to the query.
So does this mean that it would not work for root entities or that it would just work for the root entities and not the child entities under it?
Each root entity will belong to a different entity group. So, batch get will not work there.