In a previous blog post I wrote about integration testing with H2, an in-memory database. I mentioned that H2 does not perfectly emulate other databases such as Oracle. This means that H2 cannot execute all queries meant to be executed by an Oracle database. This has been a problem for me as I have been writing integration tests for code that makes many calls to an Oracle database. I outline here some problems I have run into and their solutions. Most of the problems are solved by modifying the queries. This is non-ideal, unless the change in query is an actual improvement to the query.
H2 reports a syntax error when it encounters a query of the form
"SELECT * FROM ((SELECT * FROM favorite_food) T )"
H2 reports the following error
org.h2.jdbc.JdbcSQLSyntaxErrorException: Syntax error in SQL statement "SELECT * FROM ((SELECT * FROM FAVORITE_FOOD) T[*] ) "; expected "UNION, EXCEPT, MINUS, INTERSECT, ORDER, OFFSET, FETCH, LIMIT, SAMPLE_SIZE, FOR, )"; SQL statement: [info] SELECT * FROM ((SELECT * FROM favorite_food) T ) [42001-199]
In this simple case the query can be easily fixed by getting rid of the unnecessary parenthesis.
"SELECT * FROM (SELECT * FROM favorite_food) T"
However, there are cases when it is not so simple. Consider a query of the form
SELECT * FROM ( (<subquery1>) table1 INNER JOIN (<subquery2>) table2 ON <join>)
H2 will report a syntax error for queries of this form, but Oracle can execute such queries. On the other hand if any of the parenthesis are removed Oracle will report a syntax error. One solution is to switch to Sqlite, another in-memory database, but Sqlite typically does a worse job of emulating Oracle than H2, so Sqlite might fail as well. I have seen multiple instances were both H2 and Sqlite fail to execute a query that Oracle can execute. Some other in-memory databases such as Hsql also fail on queries of this form. However, there is a solution to this problem. The query can be rewritten as
WITH table1 AS (<subquery1>) SELECT * FROM (table1 INNER JOIN (<subquery2>) table2 ON <join>)
WITH table1 AS (<subquery1>), WITH table2 AS (<subquery2>) SELECT * FROM (table1 INNER JOIN table2 ON <join>)
Now both Oracle and H2 can execute this query. In this case I argue that the query is now more readable, especially if the subqueries themselves are large and complicated. Hence, I think it is a good idea to make this change, even if you didn’t need to do it for testing purposes. You need to be careful when changing queries. In my case in addition to writing integration tests that showed that I got the expected results when querying the H2 database, I wrote several Postman tests and ran requests to the API which executed the query of interest, and ran the Postman tests on the unmodified and modified query.
Problem 2: Subtracting dates
In Oracle when two dates are subtracted the result is an integer representing the number of days between the two dates. For example
would give 5. Unfortunately H2 does not have the same behavior. Instead H2 would give an interval instead of an integer. This can cause problems, for example when the query compares the result of subtracting one day from another to an integer. Here is a concrete example of that. Say we have the following table and query
CREATE TABLE SALES_HISTORY ( PRODUCT_NAME varchar2(255) NOT NULL, DATE_IN_INVENTORY date NOT NULL, DATE_SOLD date NOT NULL, CONSTRAINT SALES_HISTORY_PK PRIMARY KEY (PRODUCT_NAME) ); SELECT product_name FROM sales_history WHERE date_in_inventory-date_sold <=7;
Here we are selecting all products where an item was in inventory for a week or less before being sold. This query works in Oracle, but fails in H2. Fortunately this query can be changed to work in both H2 and Oracle, by taking advantage of the fact that in both Oracle and H2 subtracting an integer, n, from a date results in a date corresponding to the date that came n days before. For example
TO_DATE(‘2019-06-30’,‘YYYY-MM-DD’) - 5
Returns TO_DATE(‘2019-06-25’, ‘YYYY-MM-DD’) in both Oracle and H2. Thus the previous query can be rewritten as
SELECT product_name FROM sales_history WHERE date_in_inventory – 7 <= date_sold
In this case I believe that the first query is clearer than the second, so I leave it up to you to decide if you want to make this kind of change if you come across this kind of situation, to get your integration tests to work with H2.
I have also seen queries similar to
SELECT product_name FROM sales_history WHERE date_in_inventory – date_sold = 0
This query selects products that were sold the day they were in inventory. This query works in Oracle and fails in H2. It can be rewritten as
SELECT product_name FROM sales_history WHERE date_in_inventory = date_sold
This query works in both Oracle and H2 and is better than the previous query. Again, changes to queries have to be tested carefully and I tested changes that I made using integration and Postman tests.
There are some Oracle built in functions that H2 does not have. One of those is regexp_count, which counts the number of matches of a regular expression in a string. I have encountered a query similar to
SELECT * FROM favorite_food WHERE REGEXP_COUNT(food, ‘,’) = 0
In this case we are using regexp_count, but we are counting the number of commas in food, not the number of times a regular expression is matched in food. We can write this query as
SELECT * FROM favorite_food WHERE INSTR(food, ‘,’) = 0
In this case INSTR returns the index of the first appearance of the substring ‘,’ in food and 0 if it does not appear. This is an improvement to query and this change should be made even if it were not needed for the integration test with H2 to work. Of course integration and Postman tests are needed to insure that the change was not harmful.