Custom Versioning for Google Docs: Working with Google Docs on Google App Engine

Inphina, as an expert on Google App Engine and Google Apps, has helped many medium to large organizations leverage the cloud by building, migrating, or re-engineering complex line-of-business applications for the cloud, significantly reducing their capital expenditure. Contact us at cloud@inphina.com

This is the last post in our series on building custom versioning for Google Docs. Here we look at interacting with Google Docs from Google App Engine. This piece is particularly interesting because it is cloud to cloud: Google Docs is hosted in the cloud, and so is App Engine.

To work with Google Docs, a set of libraries must be packaged with your application. The core libraries come from gdata-java-client, and of these we are particularly interested in the Google Documents List Data API.

First things first: you need to download the client library. We only need a select list of JARs for Google Docs, declared as the following dependencies:

[sourcecode language="xml"]
<!-- Adding GData dependencies -->
<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>google-collect</artifactId>
<version>1.0-rc1</version>
</dependency>
<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>google-jsr</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>google-media</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>google-client</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>google-client-meta</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>google-core</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>google-docs</artifactId>
<version>3.0</version>
</dependency>
<dependency>
<groupId>com.google.gdata</groupId>
<artifactId>google-docs-meta</artifactId>
<version>3.0</version>
</dependency>
[/sourcecode]

You will also need the mail, activation and servlet APIs.

[sourcecode language="xml"]
<!-- Mail, Activation and Servlet -->

<dependency>
<groupId>javax.mail</groupId>
<artifactId>mail</artifactId>
<version>1.4.1</version>
</dependency>
<dependency>
<groupId>javax.activation</groupId>
<artifactId>activation</artifactId>
<version>1.1.1</version>
</dependency>

<dependency>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
<version>2.5</version>
</dependency>
[/sourcecode]

For the complete listing, see my pom.xml.

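Some of these libraries are not available in the public Maven repository. You can use the following script to install them into your local repository or into Nexus.

[sourcecode language="bash"]
#!/bin/bash
# Installs the GData jars that are not in a public Maven repository
# into the local repository. Takes the unpacked gdata directory as $1.
if [ -z "$1" ]
then
echo "usage: $0 /path/to/gdata"
exit 1
fi

GDATA_PATH="$1"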

mvn install:install-file -Dfile=$GDATA_PATH/java/deps/google-collect-1.0-rc1.jar -DgroupId=com.google.gdata -DartifactId=google-collect -Dversion=1.0-rc1 -DgeneratePom=true -Dpackaging=jar
mvn install:install-file -Dfile=$GDATA_PATH/java/deps/jsr305.jar -DgroupId=com.google.gdata -DartifactId=google-jsr -Dversion=1.0 -DgeneratePom=true -Dpackaging=jar
mvn install:install-file -Dfile=$GDATA_PATH/java/lib/gdata-media-1.0.jar -DgroupId=com.google.gdata -DartifactId=google-media -Dversion=1.0 -DgeneratePom=true -Dpackaging=jar
mvn install:install-file -Dfile=$GDATA_PATH/java/lib/gdata-client-1.0.jar -DgroupId=com.google.gdata -DartifactId=google-client -Dversion=1.0 -DgeneratePom=true -Dpackaging=jar
mvn install:install-file -Dfile=$GDATA_PATH/java/lib/gdata-client-meta-1.0.jar -DgroupId=com.google.gdata -DartifactId=google-client-meta -Dversion=1.0 -DgeneratePom=true -Dpackaging=jar
mvn install:install-file -Dfile=$GDATA_PATH/java/lib/gdata-core-1.0.jar -DgroupId=com.google.gdata -DartifactId=google-core -Dversion=1.0 -DgeneratePom=true -Dpackaging=jar
mvn install:install-file -Dfile=$GDATA_PATH/java/lib/gdata-docs-3.0.jar -DgroupId=com.google.gdata -DartifactId=google-docs -Dversion=3.0 -DgeneratePom=true -Dpackaging=jar
mvn install:install-file -Dfile=$GDATA_PATH/java/lib/gdata-docs-meta-3.0.jar -DgroupId=com.google.gdata -DartifactId=google-docs-meta -Dversion=3.0 -DgeneratePom=true -Dpackaging=jar
[/sourcecode]

OK, now we are all set to start coding.

The logic for the versioning component is simple. When a new file is uploaded through the custom versioning component, it will:

1. Check whether a file with the same name already exists in Google Docs for the user whose credentials we are using. You could use OAuth for authentication.
2. If there is no file with that name, upload the new document.
3. Otherwise, rename the existing document with an incremented version counter and assign it to an archival folder, then
4. Upload the new file.
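Before looking at the Google Docs calls, the four steps above can be sketched against a plain in-memory map. This is only an illustration of the control flow: the class `VersioningFlowSketch`, its two maps and the `upload` method are hypothetical stand-ins, not part of gdata-java-client, and the `_v` separator mirrors the one used later in this post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** In-memory stand-in for the document store; NOT the Google Docs API. */
final class VersioningFlowSketch {
    static final String VERSION_SEPARATOR = "_v";

    // title -> content, for the live documents and for the archival folder
    final Map<String, String> documents = new LinkedHashMap<>();
    final Map<String, String> archive = new LinkedHashMap<>();

    /** Runs steps 1 to 4 for a single upload. */
    void upload(String name, String content) {
        // Step 1: look for a live document with exactly this name.
        String existing = documents.remove(name);
        if (existing != null) {
            // Step 3: archive the old copy under the next version number.
            int next = highestArchivedVersion(name) + 1;
            archive.put(name + VERSION_SEPARATOR + next, existing);
        }
        // Steps 2 and 4: store the new file under the original name.
        documents.put(name, content);
    }

    /** Highest N among archived titles of the form name_vN, or 0 if there are none. */
    int highestArchivedVersion(String name) {
        String prefix = name + VERSION_SEPARATOR;
        int highest = 0;
        for (String title : archive.keySet()) {
            if (!title.startsWith(prefix)) {
                continue;
            }
            try {
                int version = Integer.parseInt(title.substring(prefix.length()));
                highest = Math.max(highest, version);
            } catch (NumberFormatException ignored) {
                // Suffix is not a number, so this is not one of our archived versions.
            }
        }
        return highest;
    }
}
```

Uploading the same name three times in this sketch leaves the latest content under the original name and the two earlier copies in the archive as `_v1` and `_v2`.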

The method below implements this logic against Google Docs:

[sourcecode language="java"]
private static final String FOLDER_URL = "https://docs.google.com/feeds/default/private/full/-/folder";
private static final String MY_ARCHIVAL_FOLDER = "Doc-Archive";
private static final String DOCUMENT_URL = "https://docs.google.com/feeds/default/private/full/";
private static final String USERNAME = "<your username>";
private static final String PASSWORD = "<your password>";
private static final String VERSION_SEPERATOR = "_v";

public boolean uploadFileToGoogleDocs(FileItem file, String description) throws MalformedURLException, IOException, ServiceException {
boolean uploadStatus = false;

DocsService docsService = getDocumentService();
URL documentUri = new URL(DOCUMENT_URL);
String documentName = file.getName();

DocumentListEntry documentEntryFound = fetchEntryWithNameMatch(documentName, documentUri, docsService, true);

if (documentEntryFound == null) {
uploadNewFile(docsService, file, description);
} else {
versionTheExistingDocument(docsService, documentUri, documentName, documentEntryFound);
uploadNewFile(docsService, file, description);
}

uploadStatus = true;
return uploadStatus;

}

[/sourcecode]

Let us look at the individual methods now. The getDocumentService method is responsible for creating the DocsService that is used to access the docs.

[sourcecode language="java"]
private DocsService getDocumentService() throws AuthenticationException {
// TODO Replace this with credentials from logged in user
DocsService docsService = new DocsService("My-Document-Service");
docsService.setUserCredentials(USERNAME, PASSWORD);
return docsService;
}
[/sourcecode]

The fetchEntryWithNameMatch method finds matching entries in Google Docs (note that an entry can be a document or a folder).

[sourcecode language="java"]
private DocumentListEntry fetchEntryWithNameMatch(String searchTitle, URL url, DocsService docsService,
boolean exactMatch) throws IOException, ServiceException {
DocumentListEntry documentListEntry = null;

System.out.println("Printing a list of matching files ...\n");
DocumentQuery query = new DocumentQuery(url);
query.setTitleQuery(searchTitle);
query.setTitleExact(exactMatch);
// TODO revisit, this would allow versioning only 10 docs, we should be
// able to sort the list somehow
query.setMaxResults(10);
DocumentListFeed feed = docsService.getFeed(query, DocumentListFeed.class);
int numberOfEntriesRetrieved = feed.getEntries().size();
System.out.println("Number of entries retrieved " + numberOfEntriesRetrieved);
if (numberOfEntriesRetrieved > 0) {
printDocuments(feed);
List<DocumentListEntry> list = feed.getEntries();
Collections.sort(list, new EntryComparator());
documentListEntry = list.get(0);
}
return documentListEntry;
}
[/sourcecode]

As you will notice, we pass an exactMatch boolean to the method. If it is true, the method matches the entry title exactly; otherwise it does a close match.
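To make the difference concrete, here is a small sketch that imitates the two modes on a plain list of titles. It assumes that a close match behaves like a case-insensitive substring search and that an exact match is also case-insensitive; that is an assumption about the service, and `TitleMatchSketch` and `matchTitles` are hypothetical names, not library calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Imitates exact and close title matching on local data; NOT the Google Docs API. */
final class TitleMatchSketch {
    static List<String> matchTitles(List<String> titles, String searchTitle, boolean exactMatch) {
        String needle = searchTitle.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String title : titles) {
            String candidate = title.toLowerCase(Locale.ROOT);
            // Assumed semantics: exact = whole title equals, close = title contains.
            boolean hit = exactMatch ? candidate.equals(needle) : candidate.contains(needle);
            if (hit) {
                matches.add(title);
            }
        }
        return matches;
    }
}
```

Under these assumptions, an exact search for `plan.doc` ignores the archived copies `plan.doc_v1` and `plan.doc_v2`, while a close search for `plan.doc_v` finds only the archived copies. That is why the existence check uses an exact match and the version lookup uses a close one.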

So we first do an exact match to see whether the document already exists. If it does not, we upload it, which is something we saw in the last post as well:

[sourcecode language="java"]
private void uploadNewFile(DocsService docsService, FileItem file, String description) throws AuthenticationException, IOException,
ServiceException, MalformedURLException {
String mimeType = DocumentListEntry.MediaType.fromFileName(file.getName()).getMimeType();

DocumentListEntry newDocument = new DocumentListEntry();
newDocument.setMediaSource(new MediaByteArraySource(file.get(), mimeType));
newDocument.setTitle(new PlainTextConstruct(file.getName()));
newDocument.setDescription(description);
System.out.println("Uploaded document with description " + description);
docsService.insert(new URL(DOCUMENT_URL), newDocument);
}
[/sourcecode]

If a document with the same name exists, we move on to the versioning logic. As you can see from the versionTheExistingDocument method, we do three things here: we work out the next version number for the existing document, rename the existing document with that version number, and assign it to the archival folder. Once all of this is done, we upload the new document using the normal upload routine.

[sourcecode language="java"]
private void versionTheExistingDocument(DocsService docsService, URL documentUri, String documentName,
DocumentListEntry documentEntryFound) throws IOException, ServiceException, MalformedURLException {
System.out.println("Entering the versioning logic for document: " + documentEntryFound.getTitle());
int newVersionNumber = getNewVersionNumberForTheDocument(docsService, documentUri, documentName);
String newDocumentName = documentName + VERSION_SEPERATOR + newVersionNumber;
changeNameOfExistingDocument(newDocumentName, documentEntryFound);

assignOriginalDocumentToArchivalFolder(docsService,
fetchEntryWithNameMatch(newDocumentName, documentUri, docsService, true));
}

private void assignOriginalDocumentToArchivalFolder(DocsService docsService, DocumentListEntry documentEntryFound)
throws IOException, ServiceException, MalformedURLException {
URL folderfeedUri = new URL(FOLDER_URL);
DocumentListEntry archivalFolderEntry = fetchEntryWithNameMatch(MY_ARCHIVAL_FOLDER, folderfeedUri, docsService,
true);
String archivalFolderUri = ((MediaContent) archivalFolderEntry.getContent()).getUri();
System.out.println("Archival folder URI is " + archivalFolderUri);

docsService.insert(new URL(archivalFolderUri), documentEntryFound);
}

private void changeNameOfExistingDocument(String newName, DocumentListEntry documentEntryFound) throws IOException,
ServiceException {
documentEntryFound.setTitle(new PlainTextConstruct(newName));
documentEntryFound.update();
}

private int getNewVersionNumberForTheDocument(DocsService docsService, URL feedUri, String documentName)
throws IOException, ServiceException {
String lastVersionNumber = fetchCounterForArchivedDocument(documentName,
fetchEntryWithNameMatch(documentName + VERSION_SEPERATOR, feedUri, docsService, false));
System.out.println("Got the counter as " + lastVersionNumber);
int newVersionNumber = Integer.parseInt(lastVersionNumber) + 1;
return newVersionNumber;
}

private String fetchCounterForArchivedDocument(String documentName, DocumentListEntry documentFound) {
String counter = "0";
if (documentFound != null) {
String title = documentFound.getTitle().getPlainText();
System.out.println("Title is " + title);
counter = title.substring(documentName.length() + VERSION_SEPERATOR.length());
}
return counter;
}
[/sourcecode]
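One thing to watch in the code above: the counter is taken straight from whatever follows the separator in the matched title, so a close match that returns a title with a non-numeric suffix would fail with a NumberFormatException. A more defensive parse might look like the sketch below. The class and method names are hypothetical, it works on plain strings rather than on DocumentListEntry, and treating an unrecognized title as version 0 is a design assumption rather than the behavior of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Defensive parsing of titles of the form documentName + "_v" + N. */
final class VersionCounterSketch {
    static final String VERSION_SEPARATOR = "_v";

    /** Returns N for a well-formed archived title, or 0 for null or anything else. */
    static int archivedVersion(String documentName, String title) {
        if (title == null) {
            return 0;
        }
        // Quote the name so characters such as '.' are matched literally;
        // cap the digits at 9 so Integer.parseInt cannot overflow.
        Pattern pattern = Pattern.compile(
                Pattern.quote(documentName + VERSION_SEPARATOR) + "(\\d{1,9})");
        Matcher matcher = pattern.matcher(title);
        return matcher.matches() ? Integer.parseInt(matcher.group(1)) : 0;
    }

    /** Name to give the current document when it is moved to the archive. */
    static String nextArchiveName(String documentName, String latestArchivedTitle) {
        int next = archivedVersion(documentName, latestArchivedTitle) + 1;
        return documentName + VERSION_SEPARATOR + next;
    }
}
```

With this sketch, a first archive of `plan.doc` is named `plan.doc_v1`, and a stray title such as `plan.doc_vfinal` is ignored instead of aborting the upload.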

One part worth a closer look is the fetchEntryWithNameMatch method, which builds a DocumentQuery:

[sourcecode language="java"]
DocumentQuery query = new DocumentQuery(url);
query.setTitleQuery(searchTitle);
query.setTitleExact(exactMatch);
// TODO revisit, this would allow versioning only 10 docs, we should be
// able to sort the list somehow
query.setMaxResults(10);
DocumentListFeed feed = docsService.getFeed(query, DocumentListFeed.class);
int numberOfEntriesRetrieved = feed.getEntries().size();
[/sourcecode]

As you can see, we run a document query against a URL, which in our case is either the document URL or the folder URL. We then set the title to match against, and we can restrict the number of results returned by setting the maximum results. The return type is a DocumentListFeed, whose entries can be iterated over.
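The entries returned by the query are sorted with an EntryComparator before the first one is picked, but that class is not listed in this post. The sketch below shows one way such an ordering could work, assuming the goal is to put the highest version first. It compares plain title strings rather than DocumentListEntry objects, and `VersionOrderSketch`, `versionOf` and `NEWEST_FIRST` are hypothetical names. It needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Orders archived titles so that the highest version number comes first. */
final class VersionOrderSketch {
    /** Digits after the last "_v", or 0 when the title has no numeric version suffix. */
    static int versionOf(String title) {
        int index = title.lastIndexOf("_v");
        if (index < 0) {
            return 0;
        }
        String suffix = title.substring(index + 2);
        return suffix.matches("\\d{1,9}") ? Integer.parseInt(suffix) : 0;
    }

    /** Numeric comparison, so that _v10 ranks above _v9. */
    static final Comparator<String> NEWEST_FIRST =
            Comparator.<String>comparingInt(VersionOrderSketch::versionOf).reversed();

    /** Returns a sorted copy; element 0 is the most recent archived version. */
    static List<String> newestFirst(List<String> titles) {
        List<String> sorted = new ArrayList<>(titles);
        sorted.sort(NEWEST_FIRST);
        return sorted;
    }
}
```

The comparison is numeric on purpose: a plain string sort would rank `_v9` above `_v10`. Sorting only helps with the entries the query actually returned, so with the maximum set to 10 the latest version could still be missed once more than 10 archived copies exist, which is the concern behind the TODO in the code.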

As you can see, it is easy to interact with Google Docs from App Engine using gdata-java-client. The business logic for your component can be as complex as you need once the communication infrastructure is in place.

Once your application is deployed on App Engine, embedding it within your website on Google Sites is a matter of including an iframe. For us, the iframe entry was:

[sourcecode language="html"]
<iframe src="http://mypactpoc.appspot.com/genieUpload.html"></iframe>
[/sourcecode]