Hi everybody.
Again, I have been involved in several projects lately, so I neglected the blog. Shame on me! 😛
But today I can talk about something new for me. I wanted to create a REST(ful) API quickly, with few libraries, possibly with good performance, reusable and… ok, you get the point.
I have already written about Netty vs gRPC as two libraries for doing RPC (gRPC is, indeed, based on Netty). Today I will talk about Finagle, a Twitter library that, among its other uses, lets you start a full REST API with a server: it is easy to use, lightweight and apparently very fast. Moreover, it is protocol-agnostic: you can plug protobuf in front of it if you want (by using this library, for example).
The project I did is very simple: given a data model (a JSON object describing a Listing), we want to create a REST API that performs some basic requests: GET a listing by id (or GET all of them), POST a listing with its data but no id (the server will generate one), DELETE a listing by id (or DELETE all of them, an unsafe endpoint available only for tests) or PUT a new version of the listing.
To make it more interesting, I added a time constraint on the development: between 2 and 3 hours, which is a good challenge for a component with tests and libraries you have never used before.
The first step was to create the data model and the data store. For the data model I decomposed the JSON into different case classes (some of them optional, also to make tests easier to write) that you may find here. I needed a fast way to serialize and deserialize objects (built from case classes) to and from JSON strings. For that, the Jackson library is always a good idea, and I found a utility here, very useful if you don’t want to waste time copying code from somewhere. It only needed a test suite, to avoid spending time on errors once everything else was ready, and to test the behaviour with null attributes.
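As a minimal sketch of what such a model can look like (the field names `title` and `address`, the `Address` class and the `withGeneratedId` helper are hypothetical, and the use of a UUID as id is my assumption, not necessarily what the project does), optional attributes map naturally to `Option` fields with defaults. The Jackson wiring is left out so the snippet needs no dependency:

```scala
// Hypothetical, trimmed-down model: the real Listing has more fields.
final case class Address(city: String, country: Option[String] = None)

final case class Listing(
  id: Option[String] = None,      // absent on POST, filled in by the server
  title: Option[String] = None,   // optional fields keep test fixtures short
  address: Option[Address] = None
)

object Listing {
  // Returns a copy carrying a freshly generated id; the input is left untouched.
  // Using a random UUID is an assumption made for this sketch.
  def withGeneratedId(listing: Listing): Listing =
    listing.copy(id = Some(java.util.UUID.randomUUID().toString))
}
```

With defaults on every field, a test can build `Listing(title = Some("Flat"))` and ignore the rest.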
The data store, in my mind, should be the only place in which state is stored. I thought about using Redis, but this would have wasted a lot of time in configuring, integrating, testing… Time was passing by and I had to implement a fast solution. Let’s say that an in-memory store based on a synchronised map (yes, we want scalability and multithreading) could work for this. I took an old test suite I had for some collections, adapted it to my DataStore interface and checked the red phase of my TDD approach. Then I created a factory object (to improve reusability) and the implementation with the synchronised map.
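A rough sketch of this layout, under some assumptions: the method names `get`, `getAll`, `put` and `del` are my guesses at the interface, and instead of a Java synchronised map I guard an immutable `Map` with `synchronized`, which gives the same one-thread-at-a-time behaviour without any converters. `delAll` is left out here on purpose.

```scala
// The abstraction the rest of the application depends on.
trait DataStore[V] {
  def get(id: String): Option[V]
  def getAll: List[V]
  def put(id: String, value: V): Unit
  def del(id: String): Option[V]
}

// In-memory implementation: every access takes the instance lock.
final class SimpleMapStore[V] extends DataStore[V] {
  private var data = Map.empty[String, V]

  def get(id: String): Option[V] = synchronized { data.get(id) }

  def getAll: List[V] = synchronized { data.values.toList }

  def put(id: String, value: V): Unit = synchronized {
    data = data.updated(id, value)
  }

  // Removes the entry and returns what was stored, if anything.
  def del(id: String): Option[V] = synchronized {
    val removed = data.get(id)
    data = data - id
    removed
  }
}

// Factory object: callers only ever see the DataStore trait.
object DataStore {
  def apply[V](): DataStore[V] = new SimpleMapStore[V]
}
```

Swapping the store then means changing the single line inside `DataStore.apply`.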
Sometimes I receive nice observations on details via Twitter: I know, the delete all is made in two steps, and because of concurrency what we return could differ from what the clear actually deleted. But it is not a real problem: the delete all is not a safe REST API, and if you have modified something between the two calls you should expect that change to be in the system. Anyway, I think it would have been a waste of time to take care of such a particular case, considering that you are using a delete all (so, you are removing everything).
Now we have our data layer and a way to create JSON strings for our model. We are only missing the server side, and here Finagle comes in. In the main of our entry class there are the 3 lines of code that set up a server. It needs a Service object, so I created a class that extends the Service class and takes one RequestHandler for each HTTP request method (GET, POST, PUT and DELETE). My RequestHandlers are generic and provided by a factory method (of the RequestHandler object), so you may reuse all of this to implement other business logic: you only have to change the object returned by the factory method to have the new behaviour propagated through your service.
In the same way, the RequestHandlers receive a DataStore from the factory object as a constructor parameter, so it is easy to change the data store in the project by implementing a new one and returning it from the factory method.
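To make the wiring concrete without pulling in the library, here is a sketch that uses hypothetical stand-ins for Finagle's HTTP `Request` and `Response` (a real Finagle `Service` is asynchronous and returns a `Future`; this one is synchronous for brevity). `InMemoryStore`, `forStore`, the `/listings/<id>` path and the status codes are assumptions for illustration, and only GET and PUT are shown; POST and DELETE would follow the same shape.

```scala
// Stand-ins for Finagle's HTTP request/response types (hypothetical, simplified).
final case class Request(method: String, path: String, body: String = "")
final case class Response(status: Int, body: String = "")

// Minimal store of JSON strings; the real DataStore has more operations.
trait DataStore {
  def get(id: String): Option[String]
  def put(id: String, json: String): Unit
}

final class InMemoryStore extends DataStore {
  private var data = Map.empty[String, String]
  def get(id: String): Option[String] = synchronized { data.get(id) }
  def put(id: String, json: String): Unit = synchronized { data = data.updated(id, json) }
}

trait RequestHandler {
  def handle(request: Request): Response
}

object RequestHandler {
  // Last path segment, e.g. "42" for "/listings/42".
  private def idOf(request: Request): String =
    request.path.split('/').lastOption.getOrElse("")

  private final class GetHandler(store: DataStore) extends RequestHandler {
    def handle(request: Request): Response =
      store.get(idOf(request)) match {
        case Some(json) => Response(200, json)
        case None       => Response(404)
      }
  }

  private final class PutHandler(store: DataStore) extends RequestHandler {
    def handle(request: Request): Response = {
      store.put(idOf(request), request.body)
      Response(200, request.body)
    }
  }

  // Factory method: the service never names a concrete handler or store.
  def forStore(store: DataStore): Map[String, RequestHandler] = Map(
    "GET" -> new GetHandler(store),
    "PUT" -> new PutHandler(store)
  )
}

// Dispatches on the HTTP method; unknown methods get 405.
final class RestService(handlers: Map[String, RequestHandler]) {
  def apply(request: Request): Response =
    handlers.get(request.method) match {
      case Some(handler) => handler.handle(request)
      case None          => Response(405, "method not allowed")
    }
}
```

Because the handlers only know the `DataStore` trait, a test can pass a stub store and a production build can pass something backed by Redis, with no change to `RestService`.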
Just enough time to create tests for the different handlers and voilà, the project is ready. Documentation, a bit of bash script as an attempt at an integration test (not finished yet), and everything is on GitHub.
I copy here a piece of the documentation I wrote that I feel is really nice to read:
In this chapter I want to give my point of view on the topics that were important for the test.
Concurrency
I try to use the best functional principles to approach the concurrency problem. Everything is purely functional and uses immutable objects, except the DataStore, which is unique per instance and synchronized. In the implementation I gave, only the delAll method is made of two calls: the first because I wanted to give back the list of elements we cleaned, the second is the clear itself. This means that if we receive a request in the middle, the data deleted could be different from the data returned by the REST service. I could have synchronized the whole method, but I don’t see an issue here, also because delAll was not required by the test and I would not leave it in a real scenario: too dangerous (I coded it only to help me with tests).
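The difference between the two approaches can be sketched as follows. This is an illustration, not the project's code: `ClearableStore`, `delAllTwoStep` and `delAllAtomic` are hypothetical names, and the store is a lock-guarded immutable `Map` rather than the actual SimpleMapStore.

```scala
final class ClearableStore[V] {
  private var data = Map.empty[String, V]

  def put(id: String, value: V): Unit = synchronized { data = data.updated(id, value) }
  def getAll: List[V] = synchronized { data.values.toList }
  def clear(): Unit = synchronized { data = Map.empty }

  // Two separately locked calls: a put landing between them
  // is wiped by clear() but missing from the returned list.
  def delAllTwoStep(): List[V] = {
    val snapshot = getAll
    clear()
    snapshot
  }

  // One locked section: the returned list is exactly what was removed.
  def delAllAtomic(): List[V] = synchronized {
    val snapshot = data.values.toList
    data = Map.empty
    snapshot
  }
}
```

In a single thread both variants return the same list; they only diverge when another thread writes between the snapshot and the clear.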
Reuse
All the relationships among objects are kept abstract, meaning that RequestHandlers use DataStores, and not the SimpleMapStore implementation. RestService likewise uses generic RequestHandlers for handling the requests. This means that the code is less coupled, more testable (easier to mock) and more reusable. The use of factories (the DataStore object or the RequestHandler object) makes it easy to change the implementation of a behaviour without having to change the objects that use it (for example, I could implement a DataStore based on Redis or a database, and it is enough to change the factory method to have the whole application use the new object). At an architectural level, the REST API could easily be used to store different types of data because some of its functions, like the ones related to creating or parsing JSON, have generic implementations. It is enough to create a different set of case classes to change the data stored by the component.
Performance
The REST API relies on Finagle for its performance. The bottleneck is the synchronisation of the map. It is easy to replace the SimpleMapStore with a store based on Redis or on a database, if needed. Using factories (like the DataStore object), it is easy to adapt the full application to different stores, with more complex behaviour and different capabilities.
Scalability
The REST API is generic and reusable. You can replicate it in several instances and you will only have the problem of scaling the DataStore. If you use Redis or a database, this is easily done. If you want to keep the in-memory store, you have to provide a partitioning policy for the data: for example, you can run several instances and keep a subset of the data on each of them. In this case the delAll function can become useful for scaling again: the data-based router can change its policy, simply ask one instance to clean itself (this gives back the list of Listings it contained before the clean), start more instances and then resubmit the Listings to the new instances with the new policy.
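One possible routing policy, sketched under assumptions: `ShardRouter`, `instanceFor` and `repartition` are hypothetical names, and hashing the id modulo the number of instances is just one simple choice among many.

```scala
object ShardRouter {
  // Picks the instance index for an id; floorMod keeps the result
  // in [0, instances) even when hashCode is negative.
  def instanceFor(id: String, instances: Int): Int =
    java.lang.Math.floorMod(id.hashCode, instances)

  // Groups ids drained from an instance (e.g. via delAll) by their
  // target instance under a new instance count.
  def repartition(ids: List[String], instances: Int): Map[Int, List[String]] =
    ids.groupBy(id => instanceFor(id, instances))
}
```

With plain modulo hashing most ids move when the instance count changes, which is acceptable for the drain-and-resubmit flow described above but would be costly for a large live data set.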
When I have time I will try to test Finagle against gRPC and Netty. That is all for today. Stay tuned!