Experimenting with Pulsar in Clojure

August 22, 2013

I’ve now started using Pulsar instead of trying out core.async, as I needed a low barrier to entry.

The problem: my database of choice (Neo4j) accepts batch insertions from a single thread only, while Clojure by its very nature encourages multi-threaded/concurrent/parallel code (I am no longer certain which word is exactly right!). I process many files when building the database and use an agent to manage access to the BatchInserter. However, to create a relationship I must pass both node IDs to the inserter, and because the agent handles node creation, the IDs are not available right away. So the create-node fn returns a promise, which the function passed to the agent delivers when the time comes. Thus was born another agent, called relationship-queue, which waited until the promises for both nodes were fulfilled and then sent the create-relationship fn to the db agent.
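The promise hand-off described above can be sketched with nothing but clojure.core. This is a minimal sketch rather than my actual code: the atom fake-db is a hypothetical stand-in for the Neo4j BatchInserter, and the bodies of -create-node, create-node and -create-rel are assumptions written for illustration.

```clojure
;; Hypothetical stand-in for the BatchInserter: an atom holding nodes
;; and relationships. The real code would call Neo4j here instead.
(def fake-db (atom {:next-id 0 :nodes {} :rels []}))

;; The agent's state is unused; the agent only serializes access.
(def db (agent nil))

(defn- -create-node
  "Agent action: inserts a node, then delivers its ID to the promise."
  [state id-promise props]
  ;; Safe to read then write: only this agent touches fake-db,
  ;; and an agent runs its actions one at a time.
  (let [id (:next-id @fake-db)]
    (swap! fake-db #(-> %
                        (assoc-in [:nodes id] props)
                        (update-in [:next-id] inc)))
    (deliver id-promise id))
  state)

(defn create-node
  "Queues a node insertion and immediately returns a promise for the node ID."
  [props]
  (let [id-promise (promise)]
    (send-off db -create-node id-promise props)
    id-promise))

(defn- -create-rel
  "Agent action: records a relationship between two node IDs."
  [state start-id end-id rel-type data]
  (swap! fake-db update-in [:rels] conj [start-id rel-type end-id data])
  state)

;; Usage: both derefs block until the agent has created the nodes.
(def a (create-node {:name "a"}))
(def b (create-node {:name "b"}))
(send-off db -create-rel @a @b :knows {})
(await db)
```

The caller gets the promise back straight away and only blocks at the point where it actually needs the ID, which is exactly where the relationship code has to wait.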

However, this meant only one relationship was waiting at a time, even if the promises for the next several were already fulfilled and they could theoretically go. Enter Pulsar, which has “fibers”: lightweight threads that (I believe) are scheduled on a ForkJoin pool. Because they are lightweight, I can have a pool of fibers waiting for promises to be delivered, which speeds up the relationship handling. I believe core.async could handle this as well, but Quasar (the library underlying Pulsar) had a lower barrier to entry for me (it could be the examples). I’m not saying I won’t investigate further and perhaps replace Quasar, but at this point it works, and works well.

I plan to investigate Pulsar’s actor system as well, to see if it could replace my agent system. The database agent is just an easy way to provide single-threaded access to the BatchInserter; agents are designed to hold state, but in my system the agent holds none.


(send-off rel-queue -wait-for-rel start end rel-type data)

becomes


(p/spawn-fiber -wait-for-rel start end rel-type data :fj-pool "RelationshipQueue")

This is very easy to follow and understand. But where the agent runs its actions one at a time, many fibers can wait at the same time.
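The difference between queueing on an agent and giving each relationship its own waiter can be shown without Pulsar at all. The sketch below swaps in plain Clojure futures for fibers, so it only illustrates the ordering behaviour: futures occupy real threads and are far heavier than fibers. The names log, wait-then-log, pending and ready are made up for the example.

```clojure
(def log (atom []))

(defn wait-then-log
  "Blocks until both promises are delivered, then records the label."
  [label start end]
  @start
  @end
  (swap! log conj label))

;; Relationship 1 needs a node that does not exist yet;
;; relationship 2 has everything it needs already.
(def pending (promise))
(def ready (doto (promise) (deliver 1)))

;; Agent as queue: relationship 2 is stuck behind relationship 1.
(def rel-queue (agent nil))
(send-off rel-queue (fn [o] (wait-then-log :agent-rel-1 pending ready) o))
(send-off rel-queue (fn [o] (wait-then-log :agent-rel-2 ready ready) o))

;; One waiter per relationship (futures standing in for fibers):
;; relationship 2 completes without waiting for relationship 1.
(def f1 (future (wait-then-log :future-rel-1 pending ready)))
(def f2 (future (wait-then-log :future-rel-2 ready ready)))

@f2                    ; returns once relationship 2 has been logged
(def before @log)      ; only :future-rel-2 has made it through so far

;; Create the missing node; everything else can now drain.
(deliver pending 0)
@f1
(await rel-queue)
```

Until pending is delivered, the agent has logged nothing, even though its second relationship was ready from the start.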

The -wait-for-rel code goes from:


(defn- -wait-for-rel
  "Waits for both start and end promises to be fulfilled, then submits the -create-rel to the db agent"
  [o start end rel-type data]
  (send-off db -create-rel @start @end rel-type data)
  o)

to become:


(defn- -wait-for-rel
  "Waits for both start and end promises to be fulfilled, then submits the -create-rel to the db agent"
  [start end rel-type data]
  (send-off db -create-rel @start @end rel-type data))

The “o” parameter is just a placeholder for the agent’s state, which an agent passes as the first argument to every function sent to it and expects back as the return value. This is unnecessary in fibers and can be removed. This also shows that fibers continue to work with Clojure’s agent system, which is excellent.

Pulsar has one additional “feature”, or syntactic sugar, that improves promises. You can see the examples here:
Pulsar’s Promises.