Basic Event Sourcing in Clojure

2017-07-17

Today I am going to talk about basic event sourcing without all the buzz and fuzz. You don’t need Kafka, opaque containers, external providers, or a fancy distributed, fault-tolerant cluster setup to do it.

In Clojure and ClojureScript, it’s very common to store app state in a single atom. What do you do when you want persistence? The default is to integrate with some SQL database, but that requires additional software, you have to set up schemas, and the data model is different from the native Clojure one.

What else can you do? For things that run on a single server, we can use files. Files are great and generally underestimated; Hacker News, for example, runs on files. One approach is to persist the entire atom to a file whenever we change it. It might look something like this:

Basic atom db persistence

;; initial db
(def db (atom {"foo@bar.com" {:cookies 1}}))

;; update db
(swap! db #(assoc-in % ["foo@bar.com" :cookies] 5))

;; persist db to file
(spit "app.db" (prn-str @db))

We would probably want to wrap the swap! and the spit in some form of transaction function, so the two always happen together.
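A minimal sketch of such a function (the transact! name is my own):

;; a hypothetical transaction function: update the atom and
;; persist the new value in one place
(defn transact! [f]
  (let [new-db (swap! db f)]
    (spit "app.db" (prn-str new-db))
    new-db))

;; usage
(transact! #(assoc-in % ["foo@bar.com" :cookies] 5))

Note that concurrent callers could still interleave their writes; a real version would serialize them, e.g. with an agent or locking.

We can verify that the db has been persisted correctly: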

$ cat app.db
{"foo@bar.com" {:cookies 5}}

On startup we simply run something like:

;; load db into memory
(reset! db (read-string (slurp "app.db")))

One thing that is still missing is atomicity: ensuring that the file doesn’t get corrupted if we crash mid-write. We can solve this by writing to a temporary file and then renaming it once we know the data is on disk. See Brandon Bloom’s post Slurp and Spit.
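A minimal sketch of that idea (the atomic-spit name is my own; a full version would also flush and fsync before renaming):

(import '(java.nio.file Files Paths StandardCopyOption CopyOption))

;; write to a temp file first, then atomically rename it over the
;; target, so readers never see a partially written app.db
(defn atomic-spit [path data]
  (let [tmp (str path ".tmp")]
    (spit tmp data)
    (Files/move (Paths/get tmp (into-array String []))
                (Paths/get path (into-array String []))
                (into-array CopyOption [StandardCopyOption/ATOMIC_MOVE]))))

;; usage
(atomic-spit "app.db" (prn-str @db))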

Event sourcing

Event sourcing simply means that the source of truth consists of a series of events. We apply these events using some function that creates an aggregate view of what we are interested in. In Clojure code:

;; two events
(def e0 [:add-cookie {:email "foo@bar.com" :cookies 1}])
(def e1 [:add-cookie {:email "foo@bar.com" :cookies 5}])

;; state transition for adding cookies
(defn add-cookie [state {:keys [email cookies]}]
  (assoc-in state [email :cookies] cookies))

;; how we do state transitions, f: s0 => s1
(defn next-state [state [type data]]
  (cond
    (= type :add-cookie) (add-cookie state data)
    :else                 state))

;; update in-memory db as events happen
(swap! db #(next-state % e0))
(swap! db #(next-state % e1))

;; also persist events to disk
(spit "events.db" (prn-str e0) :append true)
(spit "events.db" (prn-str e1) :append true)

;; on startup, load and aggregate events into in-memory db
(reset! db (reduce next-state {} [e0 e1]))
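
In a real startup path, the events would come from events.db rather than from vars. A minimal sketch, assuming one event per line (which is what prn-str with :append gives us):

(require '[clojure.java.io :as io])

;; read the event log back, one event per line
(defn load-events [path]
  (with-open [rdr (io/reader path)]
    ;; mapv realizes the whole seq before the reader closes;
    ;; for untrusted input, prefer clojure.edn/read-string
    (mapv read-string (line-seq rdr))))

(reset! db (reduce next-state {} (load-events "events.db")))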

We can now easily see the history of a user’s cookie count using grep:

$ cat events.db | grep foo@bar.com
[:add-cookie {:email "foo@bar.com", :cookies 1}]
[:add-cookie {:email "foo@bar.com", :cookies 5}]

This information would’ve been lost if we had just kept mutating the database. Another feature this enables is that we can change our schema easily. Let’s say we want to put all the user data under a :users key. All we have to do is change add-cookie:

(defn add-cookie [state {:keys [email cookies]}]
  (assoc-in state [:users email :cookies] cookies))

and rebuild our aggregate state. We don’t risk losing any data because we never touch our real source of truth, the events. In fact, you can have multiple aggregate states for multiple purposes, if you so wish.
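For instance, here is a hypothetical second aggregate over the same event log, counting how many events we have seen per user (the events-per-user name is my own):

;; a separate view over the same events: number of events per user
(def events-per-user
  (reduce (fn [counts [_type {:keys [email]}]]
            (update counts email (fnil inc 0)))
          {}
          [e0 e1]))
;; => {"foo@bar.com" 2}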

Trade-offs

On the one hand, this approach is limited to a single machine, the event log only ever grows, and rebuilding state means replaying every event. On the other hand, you get something quick to work with, you keep the entire history, you don’t need additional integrations, you can easily grep your data, and it’s very flexible.