Clojure and Decomplecting data analysis for business users

TL;DR Data analysis has a lot of incidental complexity. Composable abstractions can help reduce this. Clojure solves this for developers, but can this be extended to business users who are non-coders? Developers want to continue writing programs in code, but business teams need ways to discover, configure and run them directly and independently. Could a Yahoo Pipes-like approach to data analysis bridge this gap?

Analyzing data with software tools is rife with incidental complexity [1], by which I mean the myriad data formats and shapes, algorithms, APIs and programming languages that one has to wrangle.

Clojure's functional abstractions and composability help dramatically reduce many of these incidental complexities - at least for developers. We are able to quickly build robust and reusable pipelines of data processing code and put them together in different combinations to tackle each, seemingly one-off, problem.
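As a minimal sketch of what such composition can look like (the sample data and function names below are hypothetical, made up for illustration, and not taken from any real project), each step is a small function over a sequence of maps, and a question is answered by combining steps:

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample data: rows as plain maps, e.g. parsed from a CSV export.
(def orders
  [{:region "west" :amount 120.0 :status "shipped"}
   {:region "east" :amount 80.0  :status "cancelled"}
   {:region "west" :amount 45.5  :status "shipped"}
   {:region "east" :amount 200.0 :status "shipped"}])

;; Each step takes a seq of rows and returns a seq of rows (or a summary).
(defn shipped-only [rows]
  (filter #(= "shipped" (:status %)) rows))

(defn normalize-region [rows]
  (map #(update-in % [:region] str/upper-case) rows))

(defn total-by
  "Sums :amount per distinct value of key k."
  [k rows]
  (reduce (fn [acc row]
            (update-in acc [(get row k)] (fnil + 0.0) (:amount row)))
          {}
          rows))

;; comp applies right to left: filter, then normalize, then summarize.
(def revenue-by-region
  (comp (partial total-by :region) normalize-region shipped-only))

(revenue-by-region orders)
```

A different one-off question, say totals by status across all orders, reuses the same pieces in another combination: (total-by :status orders).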

That's great for us. But, in most companies, Data Analysis cuts across many functional boundaries. We need to extend the flexibility in abstraction and composition beyond developers to business teams.

The average off-the-shelf data analytics tool for business users is either severely feature constrained in the name of ease-of-use, or profusely leaks its thinly wrapped programming language abstractions in the name of power. A savvy business user who needs something more expressive than Excel, without having to become a software engineer, doesn't have many options. [2]

I care deeply about this problem because I have experienced this pain point personally. As a dev lead supporting a business team, I've seen simple report requests turn into multi-man-week software projects. The business users were frustrated at waiting weeks for minor report requests, and the devs were unhappy at having to work on pointless reports all the time.

Essentially, developers want to continue creating their core value as software abstractions in code, and business teams need to be able to discover, configure and run them directly and independently. The UNIX shell is a classic example of this from a bygone era when business users actually wrote shell scripts. But we can do better than the stream-of-bytes abstraction of UNIX pipes, and fix some of the shell's other problems along the way. [3]

My co-founder Panch and I are working on this exact problem, which had dogged us in our past jobs. Our approach is to give business users a powerful and extensible platform, into which developers can directly contribute their code abstractions as content. Using a visual and interactive UX, business users can drag and drop functional components onto a design canvas, wire them together, and compose higher-order functionality. Our interactive data analysis workbench builds upon ideas from previous systems like Yahoo Pipes, Apple Quartz Composer and MIT Scratch: Analytics, ML and Web API components can be composed together, using Clojure as the extension language along with Clojure's rich data structures.

That's the really high-level picture. I'll dive into more of the details about the implementation stack, and some of the challenges and lessons learned, in upcoming posts.

Thanks to Panch Chandrasekaran and Sujatha Jagannathan for their feedback on early drafts of this post.



[1] Incidental (or accidental) complexity, as opposed to the essential complexity of any given problem. See Fred Brooks' No Silver Bullet (wikipedia).


[2] Yes, Excel has a Turing-complete macro language, but it's a huge leap from the worksheet's visual UX. There's also Excel-REPL, which is a nice idea but doesn't address the Excel side of the problem.

clojure-mode and slime

As a long-time user of SLIME, I was a bit disappointed to see clojure-mode 2.0 drop support for it in favor of nrepl. I looked into nrepl but found it to be not as feature complete as SLIME, at present. Also, I still work on some sizeable Common Lisp code, which relies entirely on SLIME, and I want to be able to leverage any tooling work I do across all my projects – so SLIME wins.

As it turns out, it wasn't at all difficult to resurrect the SLIME integration code from clojure-mode 1.x and load it alongside the newer clojure-mode.

I've committed the clojure-mode-slime.el Emacs Lisp code into the following repo, along with some other Clojure/Emacs hacks:

I hope this is useful to other SLIME die-hards in the Clojure community as well. Feedback and bug reports are most welcome.

clojure, emacs, and docs redux

A while back I'd written about looking up Javadocs from Clojure mode buffers. I got some good feedback on that post, so I thought I'd try and expand on that and see if I could integrate other Clojure documentation sources into a similar workflow.

Figure: viewing Javadocs in emacs/w3m

Looking up Clojure doc strings within Emacs is really easy. In any Clojure code buffer, you can place your cursor at a symbol and use C-c C-d d or M-x slime-describe-symbol to bring up the function or var doc string.
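Doc strings are stored as metadata on the var, so they can also be reached from plain Clojure code at the REPL. The snippet below is only an illustration (add-tax is a made-up function, and this is not a description of how SLIME implements the lookup):

```clojure
(require '[clojure.repl :refer [doc]])

(defn add-tax
  "Returns amount plus tax at the given rate (hypothetical example function)."
  [amount rate]
  (* amount (+ 1 rate)))

;; Prints the var name, argument list and doc string to the REPL.
(doc add-tax)

;; Returns just the doc string, read from the var's metadata.
(:doc (meta #'add-tax))
```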

Previously, I'd made a slime-javadoc command that could be configured to search external Javadoc sources only. However, that mechanism could be applied more generally to more sources.

The new command, M-x clojuredocs, shows Javadocs for Java classes, goes to the excellent site for documentation specific to clojure.core and a few other namespaces (such as ring), or, failing both, falls back to a simple Google search.

So, after connecting to a Clojure instance (via clojure-jack-in or slime-connect), I can invoke M-x clojuredocs on any symbol and get back some relevant documentation, or at least some helpful pointers from Google.

I've committed an initial version of the Emacs Lisp code into the following github repo:

Feedback and bug reports are most welcome.

New Site

This is my shiny new site. I used to blog elsewhere, but I think github may be a more natural fit for my edit and publish workflow. After all, most of my professional and hobbyist programming work lives here, so why not my blog?

Incidentally, I'm using org-mode in Emacs to author the posts and a modified version of a static blog generation tool to generate the HTML and RSS.

Emacs for Clojure - Part 2

This is the second in a two-part post about a Clojure programmer workflow entirely within Emacs.

Editing Clojure

Some useful navigation key bindings in Clojure-mode, actually any Lisp code editing mode in Emacs, are as follows:

Keybinding   Command
C-M-f        forward-sexp
C-M-b        backward-sexp
C-M-a        beginning-of-defun
C-M-e        end-of-defun
C-M-x        slime-eval-defun
C-x C-e      slime-eval-last-expression

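Some of these key bindings get redefined to SLIME-enhanced equivalents when a buffer is in slime-mode, but mostly they behave the same.

And don't forget the exponential effect of the C-u prefix key: each C-u multiplies the repeat count by four, so C-u C-M-f moves forward four s-expressions and C-u C-u C-M-f moves forward sixteen.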

Some other key bindings that are also useful are:

Keybinding   Command       Doc
C-M-q        indent-sexp   A lot of times when copying and pasting or otherwise modifying large blocks of s-expressions, the indentation of the code can get out of whack. indent-sexp can help restore the balance.
C-M-h        mark-defun
C-M-k        kill-sexp

Also worth knowing is the magic of dynamic abbrevs, bound to M-/. Dynamic abbrevs are a quick way to complete a long function or var name from a minimal prefix. It's very brute force (i.e., it just searches for a match in all the open buffers), but since it's very fast, it comes in handy when you're working with partially evaluated Clojure code.

Clojure REPL

Everything begins with a Clojure instance which has SWANK loaded. Again, there are lots of ways of starting one of these, and the most common use case is with a Leiningen project setup. Setting up Leiningen is beyond the scope of this post, but the docs on Leiningen's github page are quite helpful in getting you started.

Once Leiningen is set up and you have a project.clj file for your project, you can invoke M-x clojure-jack-in.

Once SLIME is connected, it's helpful to know the following commands:

Keybinding   Command                      Doc
-            slime-repl                   A quick way to jump to the *slime-repl clojure* buffer.
-            slime-reset                  For when your SLIME connection goes out of whack.
C-M-i        slime-complete-symbol
C-x C-e      slime-eval-last-expression   Makes every Clojure buffer into a REPL. Plus, it is very handy when iterating on tests.
C-c C-c      slime-compile-defun          Convenient for compiling a defn or other top-level form without having to put the cursor at the end of the expression. C-M-x (slime-eval-defun) evaluates the same form instead of compiling it.
-            slime-list-connections       If you find yourself having to connect to multiple SWANK servers, this command is helpful in switching between them.
-            slime-list-threads           Shows the list of JVM threads and provides an interactive way to kill running threads. Use with caution.
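To make the "iterating on tests" workflow concrete, here is a small sketch using clojure.test, which ships with Clojure. The mean function and its test are hypothetical examples. With a buffer like this connected to SLIME, one could recompile a form after each change with C-c C-c and then evaluate the final (run-tests) form with C-x C-e, without leaving the buffer:

```clojure
(require '[clojure.test :refer [deftest is run-tests]])

;; Hypothetical function under test.
(defn mean [xs]
  (/ (reduce + xs) (count xs)))

(deftest mean-test
  (is (= 2 (mean [1 2 3])))
  (is (= 5/2 (mean [2 3]))))   ; integer division yields a ratio

;; Place the cursor after the closing paren and evaluate it.
;; The summary map it returns contains :pass, :fail and :error counts.
(run-tests)
```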