How I Use Component

May 01, 2016

The past is a foreign country: they do things differently there

I have enjoyed reading a number of posts this year on how people use (or don’t use) Component.

At first blush Component was a welcome chord of familiarity in an unfamiliar language. I adopted it as a Java developer might: more than managing lifecycle and dependency injection, my system map resembled a graph of objects, with records and protocols blurring the line between state and behaviour. This was fine.

In time I have come to use protocols sparingly, and then only as a construct for polymorphism. The Prismatic guidelines on namespace organisation (particularly the public pragma block) and namespace aliasing allow an abstraction to be defined without reflexively reaching for a protocol.

Here is an example of how I use Component (and Schema) to manage the state of a Cassandra connection in a library that is shared between a number of projects. A few points:

Drop the Record

Stuart explains that any type of object can be a component, not just records and maps.

Each Cassandra connection requires a cluster, a simple Java object created with some configuration requiring minimal lifecycle management. Rather than wrapping this in a record it’s perfectly reasonable to extend the Cluster type with the Lifecycle protocol.

(extend-type Cluster
  cp/Lifecycle
  (start [this]
    (log/info "initializing cluster")
    (.init this))
  (stop [this]
    (log/info "stopping cluster")
    (.close this)
    (log/info "cluster stopped")
    ;; return the component itself rather than the result of the log call
    this))
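
To try the idea without a Cassandra cluster to hand, here is a minimal sketch that applies the same extend-type approach to a JDK class. It assumes a local Lifecycle protocol with the same two methods as Component’s, so that it runs with nothing but Clojure on the classpath; ExecutorService merely stands in for Cluster.

```clojure
(import '(java.util.concurrent ExecutorService Executors))

;; Local stand-in for com.stuartsierra.component/Lifecycle (assumption:
;; same two methods), so the sketch has no dependencies beyond Clojure.
(defprotocol Lifecycle
  (start [this])
  (stop [this]))

;; Extend an existing Java type directly: no wrapping record required.
(extend-type ExecutorService
  Lifecycle
  (start [this]
    ;; nothing to initialise for an executor; return the component itself
    this)
  (stop [this]
    (.shutdown this)
    ;; return the component so callers can keep hold of it
    this))

(def executor (start (Executors/newSingleThreadExecutor)))
(def stopped (stop executor))
```

The started and stopped values are the same Java object throughout; the protocol only adds lifecycle behaviour to a type that already exists.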

Drop the Protocol

My Cassandra connection implements no protocol beyond cp/Lifecycle; there is no behaviour here, only state:

(s/defrecord Connection [keyspace :- String
                         queries :- (s/maybe Queries)
                         prepared :- (s/maybe Prepared)
                         session :- (s/maybe Session)
                         cluster :- (s/maybe Cluster)
                         default-fetch-size :- (s/maybe Number)]
  cp/Lifecycle
  (start [this]
    (assert (and cluster keyspace) "cluster and/or keyspace cannot be nil")
    ;; assumed binding: open a session on the keyspace via the cluster
    (let [session (.connect cluster keyspace)]
      (assoc this :session session
                  :prepared (->prepared session queries)
                  :default-fetch-size default-fetch-size)))
  (stop [this]
    ;; implementation omitted
    this))

I intend that connection state to be used in a particular way, and I define two functions (execute and execute-batch) in the public pragma block of that namespace:

;;; Public

(s/defn execute
  ([connection executable]
    (execute connection executable nil))
  ([connection :- Connection
    executable :- Executable
    opts]
    ;; implementation omitted
    ))

(s/defn execute-batch
  ([connection commands]
    (execute-batch connection commands :logged))
  ([connection commands batch-type]
    (execute-batch connection commands batch-type nil))
  ([connection :- Connection
    commands :- [Command]
    batch-type :- Keyword
    opts]
    ;; implementation omitted
    ))

In general I alias that namespace as ‘cassandra’, which gives further context at the call site:

(cassandra/execute connection :statement {:values {:field "value"}})

That was not my initial approach, however. There was a time I was so pleased with the protocol that I provided it as an example on the mailing list:

(defprotocol Connection
  (bind [this stmt vals] "bind values to a prepared statement")
  (prepare [this query] "provide a prepared statement from a query")
  (execute [this stmt] [this stmt opts] "execute a query or statement")
  (execute-batch [this stmts] [this stmts opts] "execute a batch of statements"))

So why did I kill it? Malcolm Sparks strikes on the main reason in the conversation linked above:

“It can be a lot of work maintaining a stub/mock implementation”

Protocols come at a cost. Perhaps you settle on the wrong abstraction, or, in more practical terms:

  1. They don’t always play nicely with the REPL
  2. You can’t use with-redefs in your tests to change the behaviour of a protocol function.

My protocol came with a stub implementation that grew in functionality as tests and edge cases abounded. My tests felt increasingly divorced from the simple lexical scope they defined. Eventually I threw the stub out in favour of a surgical with-redefs where required, and I haven’t looked back.

(is (= 1 (with-redefs [cassandra/execute (fn [connection executable opts] (async/to-chan [1 2 3]))]
           (async/<!! (time.series/read test-connection :read-event-by-date {})))))
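
A self-contained sketch of the same technique, with no database or core.async involved. The functions execute and read-events are hypothetical stand-ins for cassandra/execute and time.series/read; only with-redefs and clojure.test are real.

```clojure
(require '[clojure.test :refer [deftest is]])

;; Hypothetical stand-in for cassandra/execute: the real thing needs a live
;; session, so here it simply refuses to run.
(defn execute [connection executable opts]
  (throw (ex-info "no Cassandra available" {:executable executable})))

;; Hypothetical caller, standing in for time.series/read.
(defn read-events [connection]
  (execute connection :read-event-by-date nil))

(deftest read-events-uses-execute
  ;; The stub lives inside the test form; outside it, execute is untouched.
  (is (= [1 2 3]
         (with-redefs [execute (fn [_connection _executable _opts] [1 2 3])]
           (read-events :test-connection)))))
```

Everything the test depends on is visible in one form. The trade-off is that with-redefs changes the root binding of the var for every thread while its body runs.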

I prefer to keep as much within lexical scope as possible, even at the cost of a little repetition. I want to be able to grok the full context of a test from an s-expr without having to navigate to a namespace that provides a mock implementation of a protocol whose only polymorphic use is between live and test code.

I have no need to run tests in parallel, but in that case with-redefs would be problematic. At that point I could choose to reify a protocol that wraps my functions, but if I’ve started with a protocol I preclude using with-redefs in my tests entirely.
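
Should parallel tests ever become a requirement, the retrofit might look something like this sketch. All names (execute*, Executor, run, live-executor, stub-executor, fetch-events) are hypothetical; the point is only that the plain function comes first and the protocol is layered on top when it is actually needed.

```clojure
;; Hypothetical plain function that does the real work.
(defn execute* [connection statement]
  {:connection connection :statement statement})

;; Protocol introduced later, only once polymorphism is required.
(defprotocol Executor
  (run [this statement] "execute a statement"))

;; Live implementation: a thin wrapper over the existing function.
(defn live-executor [connection]
  (reify Executor
    (run [_ statement] (execute* connection statement))))

;; Test implementation: returns a canned result and touches no global state,
;; so tests that use it do not interfere with one another.
(defn stub-executor [result]
  (reify Executor
    (run [_ _statement] result)))

;; Callers depend on the protocol, not on a var that must be redefined.
(defn fetch-events [executor]
  (run executor :read-event-by-date))
```

Going in this direction keeps both options open: the underlying function can still be redefined in a test, and the protocol exists only for callers that need to swap implementations.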

As an old Java developer it felt natural, perhaps even slightly more official, to describe an abstraction via a protocol. Remember, though, that protocols in Clojure describe an interface but don’t provide any enforcement:

(defprotocol TwoFunctions
  (one [this] "one")
  (two [this] "two"))
=> TwoFunctions
(defrecord NaughtyConcreteImpl []
  TwoFunctions
  (one [_] (prn "hello")))
=> user.NaughtyConcreteImpl
(one (->NaughtyConcreteImpl))
"hello"
=> nil
(two (->NaughtyConcreteImpl))
AbstractMethodError Method user/NaughtyConcreteImpl.two()Ljava/lang/Object; is abstract  user.NaughtyConcreteImpl (form-init9019193871841315968.clj:-1)
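
A short sketch to underline the point, reusing the names from the transcript above: satisfies? also reports the half-implemented record as satisfying the protocol, so the gap is only discovered when the missing method is called. The vars impl, claims-to-satisfy? and missing-method-error are introduced here purely for illustration.

```clojure
(defprotocol TwoFunctions
  (one [this] "one")
  (two [this] "two"))

;; Implements only half of the protocol; this compiles without complaint.
(defrecord NaughtyConcreteImpl []
  TwoFunctions
  (one [_] :one))

(def impl (->NaughtyConcreteImpl))

;; satisfies? only asks whether the type takes part in the protocol,
;; not whether every method has an implementation.
(def claims-to-satisfy? (satisfies? TwoFunctions impl))

;; The gap surfaces only when the missing method is invoked.
(def missing-method-error
  (try
    (two impl)
    nil
    (catch AbstractMethodError e e)))
```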

The Bigger Picture

I build web applications in Clojure that are served via a framework based on Netty, Component, and core.async. It isn’t open-source, and would likely fill any prudent Clojurist with horror.

The server is a component; it routes requests to its dependencies, which are marked with a simple Handler protocol, and manages responses. Each Handler is also a component, so it has strictly the dependencies required to complete its task. Each does as little work as possible, co-ordinating with and providing state to a mixture of pure and side-effecting functions. This handler-as-component idea is one that Stuart Sierra mentions obliquely in one of his talks.
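
The framework is not public, so the following is only a sketch of the shape described, with every name (Handler, handle, greeting, GreetingHandler, Server, dispatch) invented for illustration. Component wiring is left out to keep it dependency-free; the record fields stand where Component would inject dependencies.

```clojure
;; Hypothetical marker protocol for anything the server can route to.
(defprotocol Handler
  (handle [this request] "produce a response for a request"))

;; Pure function: the real work lives outside the component.
(defn greeting [salutation user]
  (str salutation ", " user "!"))

;; A handler holds only the state it needs and delegates to functions.
(defrecord GreetingHandler [salutation]
  Handler
  (handle [_ request]
    {:status 200 :body (greeting salutation (:user request))}))

;; The server's dependencies are its handlers, keyed by route.
(defrecord Server [handlers])

(defn dispatch [server request]
  (if-let [handler (get (:handlers server) (:route request))]
    (handle handler request)
    {:status 404 :body "not found"}))

;; Each server is an ordinary value, so two can coexist in one process.
(def server-a (->Server {:greet (->GreetingHandler "Hello")}))
(def server-b (->Server {:greet (->GreetingHandler "G'day")}))
```

Calling (dispatch server-a {:route :greet :user "Ada"}) goes through the first server’s handler only; nothing is shared between the two servers beyond the code.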

Occasionally I spin up multiple servers in a single REPL in development, or in a single JVM in production. For this reason I can’t use Mount, which binds each piece of state to a single namespace-level var, but I do appreciate the intent behind it.