How to emulate a Java map using FlatBuffers
TL;DR if you just want to see the code for the Java map in a FlatBuffer, see
this
Stack Overflow answer I posted
With this file, we can use the FlatBuffers compiler to generate the Java files
necessary to access data in a buffer that was created using this schema. I will
omit those Java files as they can easily be created with the compiler and are
quite dense/verbose. The compiler would generate two files, one called
DictionaryRoot.java and another called DataItem.java.
Recently, I have been working on an application where starting latency
is very important. At the same time, there is a lot of embedded data that
needs to be accessed at runtime, and access latency is also important. One
other aspect is that the runtime data had a central source located on a remote
database, however, this application is a standalone embedded application and
therefore needs the runtime data packaged alongside it. Communicating with a
remote database is out of the question. Another constraint on the application
is that the size of the jar on disk could not be enormous.
So gathering all the constraints we have:
- startup latency needs to be low (sub 500ms)
- no communication with remote database or services
- size on disk has to be as small as possible
Naturally, my first attempt at a storage solution was an in-memory database
with the same schema that was used on the remote database. That ended up with
slow startup times, a large footprint on disk, and slow queries due to lots of
joins. My second attempt was quite archaic, using files to store the data and
loaded those into concurrent Hashmaps at runtime. Start-up times improved (20
seconds down to 1 second), size on disk was also smaller (a couple hundred
megabytes to 20 or so megabytes) and data access times were much faster.
Still, a 1 second start-up time was too slow, enter
FlatBuffers. Part of the reason Java maps were too slow was due to the front-loaded work
of reading the files and creating all the data-structure objects necessary to
hold the data. Once the data was loaded, however, everything worked like a
charm. Flatbuffers allow access to serialized data without having to parse or
unpack the data. Without the need to unpack the data, this would certainly
lower the startup time of the application but we still need to be able to
"query" the data like a map. Out of the box, FlatBuffers does not support Java
maps but you can _emulate_ a java map using their constructs. What follows is
an example of how to use FlatBuffers and emulate a java map.
The first thing you need for FlatBuffer is a schema file. This gives "shape"
to the data and allows the FlatBuffer compiler to generate Java files
necessary for accessing data within the buffer and for constructing buffer to
save to a binary file.
Schema.fbs file
In your Java application
Using those two generated Java files you will need to construct the buffer. A couple of notes:
I hope this helps someone out, when I was trying to figure this out, I could not find a complete example and it was a frustrating, slow journey to figure out but a fun one nonetheless!
- you need to use the FlatBufferBuilder to create strings, vectors and other non-scalar data items. The return values from a call to the FlatBufferBuilder create methods, is the offset to the data it created. This needs to be saved to later add to an object.
- start from the innermost data and work your way out. So in our case you would...
- create the data (and their offsets) for the fields in DataItem
- create the DataItem adding the offsets to it
- use the FlatBufferBuilder to create a vector
- add the DataTiem offset to the vector
- create the DictoryRoot
- add the vector to the DictionaryRoot.
- For the map to work, you need to use the FlatBufferBuilder and create a sorted vector of Tables. If you do not, then you won't be able to find values by the key.
The following code uses the classes generated by the FlatBuffer compiler to create a buffer and write it to a file. There are inline comments to clarify what is going on. NOTE: I know this method violates single responsibility but it's an example only to get people started =).
I hope this helps someone out, when I was trying to figure this out, I could not find a complete example and it was a frustrating, slow journey to figure out but a fun one nonetheless!
Comments
Post a Comment