How to emulate a Java map using FlatBuffers

TL;DR if you just want to see the code for the Java map in a FlatBuffer, see this Stack Overflow answer I posted 

 Recently, I have been working on an application where starting latency is very important. At the same time, there is a lot of embedded data that needs to be accessed at runtime, and access latency is also important. One other aspect is that the runtime data had a central source located on a remote database, however, this application is a standalone embedded application and therefore needs the runtime data packaged alongside it. Communicating with a remote database is out of the question. Another constraint on the application is that the size of the jar on disk could not be enormous. 

So gathering all the constraints we have:
  1. startup latency needs to be low (sub 500ms)
  2. no communication with remote database or services
  3. size on disk has to be as small as possible

Naturally, my first attempt at a storage solution was an in-memory database with the same schema that was used on the remote database. That ended up with slow startup times, a large footprint on disk, and slow queries due to lots of joins. My second attempt was quite archaic, using files to store the data and loaded those into concurrent Hashmaps at runtime. Start-up times improved (20 seconds down to 1 second), size on disk was also smaller (a couple hundred megabytes to 20 or so megabytes) and data access times were much faster. 

Still, a 1 second start-up time was too slow, enter FlatBuffers. Part of the reason Java maps were too slow was due to the front-loaded work of reading the files and creating all the data-structure objects necessary to hold the data. Once the data was loaded, however, everything worked like a charm. Flatbuffers allow access to serialized data without having to parse or unpack the data. Without the need to unpack the data, this would certainly lower the startup time of the application but we still need to be able to "query" the data like a map. Out of the box, FlatBuffers does not support Java maps but you can _emulate_ a java map using their constructs. What follows is an example of how to use FlatBuffers and emulate a java map.
The first thing you need for FlatBuffer is a schema file. This gives "shape" to the data and allows the FlatBuffer compiler to generate Java files necessary for accessing data within the buffer and for constructing buffer to save to a binary file.

Schema.fbs file


With this file, we can use the FlatBuffers compiler to generate the Java files necessary to access data in a buffer that was created using this schema. I will omit those Java files as they can easily be created with the compiler and are quite dense/verbose. The compiler would generate two files, one called DictionaryRoot.java and another called DataItem.java.

In your Java application

Using those two generated Java files you will need to construct the buffer. 

 A couple of notes:
  • you need to use the FlatBufferBuilder to create strings, vectors and other non-scalar data items. The return values from a call to the FlatBufferBuilder create methods, is the offset to the data it created. This needs to be saved to later add to an object.
  • start from the innermost data and work your way out. So in our case you would...
    1.  create the data (and their offsets) for the fields in DataItem
    2. create the DataItem adding the offsets to it
    3. use the FlatBufferBuilder to create a vector
    4. add the DataTiem offset to the vector
    5. create the DictoryRoot
    6. add the vector to the DictionaryRoot.
  • For the map to work, you need to use the FlatBufferBuilder and create a sorted vector of Tables. If you do not, then you won't be able to find values by the key. 
The following code uses the classes generated by the FlatBuffer compiler to create a buffer and write it to a file. There are inline comments to clarify what is going on. NOTE: I know this method violates single responsibility but it's an example only to get people started =).

I hope this helps someone out, when I was trying to figure this out, I could not find a complete example and it was a frustrating, slow journey to figure out but a fun one nonetheless!

Comments

Popular Posts