My talk at Clojure Conj 2015

I presented my "Clojure for Business Teams - Decomplecting Data Analysis" talk at Clojure Conj 2015 on Monday. I had some technical issues with my laptop in the beginning, but managed to get past them. The audience was incredibly patient and supportive. I really appreciate that.

The video of the talk has already been published on YouTube, and I've included a PDF of the slide deck I used below 1.

Thanks to the awesome folks who organize Clojure Conj for inviting me to present. Do check out the other great talks they've posted as well.

Clojure and Decomplecting data analysis for business users

TL;DR Data analysis has a lot of incidental complexity. Composable abstractions can help reduce this. Clojure solves this for developers, but can this be extended to business users who are non-coders? Developers want to continue writing programs in code, but business teams need ways to discover, configure and run them directly and independently. Could a Yahoo Pipes-like approach to data analysis bridge this gap?

Analyzing data with software tools is rife with incidental complexity 1 - by which I mean the myriad data formats and shapes, algorithms, APIs and programming languages that one has to wrangle with.

Clojure's functional abstractions and composability help dramatically reduce many of these incidental complexities - at least for developers. We are able to quickly build robust and reusable pipelines of data processing code and put them together in different combinations to tackle each, seemingly one-off, problem.
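To make the composability point concrete, here is a minimal sketch of such a pipeline built from transducers. The `orders` data, the field names and the helper `total-by-region` are all made up for illustration; they are not taken from the talk or from juxt.io.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample data: one map per order record.
(def orders
  [{:region "east" :amount 120.0 :status "paid"}
   {:region "west" :amount 80.0  :status "refunded"}
   {:region "east" :amount 45.5  :status "paid"}
   {:region "west" :amount 200.0 :status "paid"}])

;; Small, independent processing steps, each expressed as a transducer.
(def paid-only        (filter #(= "paid" (:status %))))
(def normalize-region (map #(update % :region str/upper-case)))

;; comp wires the steps into one reusable pipeline
;; (records flow through paid-only first, then normalize-region).
(def clean (comp paid-only normalize-region))

(defn total-by-region
  "Sums :amount per :region over the records that pass through xform."
  [xform records]
  (transduce xform
             (completing
              (fn [acc {:keys [region amount]}]
                (update acc region (fnil + 0.0) amount)))
             {}
             records))

(total-by-region clean orders)
;; => {"EAST" 165.5, "WEST" 200.0}

;; The same steps recombine for a different one-off question,
;; e.g. totals that include refunded orders:
(total-by-region normalize-region orders)
;; => {"EAST" 165.5, "WEST" 280.0}
```

Because each step is an ordinary value, the same pieces can be recombined for the next report without touching the aggregation code.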

That's great for us. But, in most companies, Data Analysis cuts across many functional boundaries. We need to extend the flexibility in abstraction and composition beyond developers to business teams.

The average off-the-shelf data analytics tool for business users is either severely feature-constrained in the name of ease of use, or profusely leaks its thinly wrapped programming language abstractions in the name of power. A savvy business user who needs something more expressive than Excel, without having to become a software engineer, doesn't have very many options. 2

I care deeply about this problem, because I have personally experienced this pain point. As a dev lead supporting a business team, I've seen simple report requests turn into multi-man-week software projects. The business users were frustrated at waiting weeks for minor report requests, and the devs were unhappy having to work on pointless reports all the time.

Essentially, developers want to continue creating their core value as software abstractions in code, and business teams need to be able to discover, configure and run them directly and independently. The UNIX shell is a classic example of this from a bygone era when business users actually wrote shell scripts. But we can do better than the stream-of-bytes abstraction of UNIX pipes, and fix some of the other shell problems along the way. 3

At juxt.io, my co-founder Panch and I are working on this exact problem, which had dogged us in our past jobs. Our approach is to give business users a powerful and extensible platform, into which developers can directly contribute their code abstractions as content. Using a visual and interactive UX, business users can drag and drop functional components onto a design canvas and wire them together to compose higher-order functionality.

juxt.io Interactive Data Analysis Workbench

Juxt.io builds upon ideas from previous systems like Yahoo Pipes, Apple Quartz Composer and MIT Scratch to create an interactive data analysis workbench in which Analytics, ML and Web API components can be composed together, using Clojure as the extension language and Clojure's rich data structures.

That's the really high-level picture. I'll dive into more of the details about the implementation stack, and some of the challenges and lessons learned, in upcoming posts.

Thanks to Panch Chandrasekaran and Sujatha Jagannathan for their feedback on early drafts of this post.

Footnotes:

1

Incidental (or accidental) complexity, as opposed to the essential complexity of any given problem. See Fred Brooks' No Silver Bullet (Wikipedia).

2

Yes, Excel has a Turing-complete macro language, but it's a huge leap from the worksheet's visual UX. There's also Excel-REPL, which is a nice idea but doesn't address the Excel side of the problem.

clojure-mode and slime

As a long-time user of SLIME I was a bit disappointed to see clojure-mode 2.0 drop support for it in favor of nrepl. I looked into nrepl but found it to be not as feature complete as SLIME, at present. Also, I still work on some sizeable Common Lisp code, which relies entirely on SLIME, and I want to be able to leverage any tooling work I do across all my projects – so SLIME wins.

As it turns out, it wasn't at all difficult to resurrect the SLIME integration code from clojure-mode 1.x and load it alongside the newer clojure-mode.

I've committed the clojure-mode-slime.el Emacs Lisp code into the following repo, along with some other Clojure/Emacs hacks:

http://github.com/kriyative/clojure-emacs-hacks

I hope this is useful to other SLIME die-hards in the Clojure community as well. Feedback and bug reports are most welcome.

clojure, emacs, and docs redux

A while back I'd written about looking up Javadocs from Clojure mode buffers. I got some good feedback on that post, so I thought I'd try and expand on that and see if I could integrate other Clojure documentation sources into a similar workflow.

view javadocs in emacs/w3m

Looking up Clojure doc strings within Emacs is really easy. In any Clojure code buffer, you can place your cursor at a symbol and use C-c C-d d or M-x slime-describe-symbol to bring up the function or var doc string.

Previously, I'd made a slime-javadoc command that could be configured to search external Javadoc sources only. However, that mechanism could be applied more generally to more sources.

The new command, M-x clojuredocs, shows Javadocs for Java classes, goes to the excellent clojuredocs.org site for documentation specific to clojure.core and a few other namespaces (such as ring), or, failing both, falls back to a simple Google search.

So, after connecting to a Clojure instance (via clojure-jack-in or slime-connect), I can invoke M-x clojuredocs on any symbol and get back some relevant documentation, or at least some helpful pointers from Google.
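The lookup order described above can be sketched in Clojure as follows. This is only an illustration of the fallback logic, not the Emacs Lisp in the repo: the function names `safe-resolve` and `doc-url`, the `javadoc-base` URL and the set of namespaces assumed to be covered by clojuredocs.org are all assumptions made for the example.

```clojure
(require '[clojure.string :as str])

;; Assumption: the namespaces we treat as documented on clojuredocs.org.
(def clojuredocs-namespaces #{"clojure.core" "clojure.string" "clojure.set"})

;; Assumption: where the Javadocs live; point this at a local copy if you have one.
(def javadoc-base "https://docs.oracle.com/javase/8/docs/api/")

(defn safe-resolve
  "Like resolve, but returns nil instead of throwing for unknown class names."
  [sym]
  (try (resolve sym) (catch Exception _ nil)))

(defn doc-url
  "Picks a documentation URL for sym: Javadoc for Java classes,
  clojuredocs.org for vars in known namespaces, otherwise a web search."
  [sym]
  (let [target (safe-resolve sym)]
    (cond
      ;; 1. Java class: map the package path onto the Javadoc tree.
      (class? target)
      (str javadoc-base
           (str/replace (.getName ^Class target) "." "/")
           ".html")

      ;; 2. Var in a namespace that clojuredocs.org covers.
      (and (var? target)
           (clojuredocs-namespaces (str (ns-name (:ns (meta target))))))
      (str "https://clojuredocs.org/"
           (ns-name (:ns (meta target))) "/" (:name (meta target)))

      ;; 3. Anything else: fall back to a plain search query.
      :else
      (str "https://www.google.com/search?q="
           (java.net.URLEncoder/encode (str "clojure " sym) "UTF-8")))))

(doc-url 'java.util.HashMap)
(doc-url 'map)
(doc-url 'no-such-symbol)
```

The sketch ignores details a real implementation would need to handle, such as nested classes and symbols containing characters like `?` that clojuredocs.org encodes specially in its URLs.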

I've committed an initial version of the Emacs Lisp code into the following github repo:

http://github.com/kriyative/clojure-emacs-hacks

Feedback and bug reports are most welcome.

New Site

This is my shiny new site hosted on github.com. I previously blogged at funcall.posterous.com and cynojure.posterous.com, but I think github may be a more natural fit for my edit and publish workflow. After all, most of my professional and hobbyist programming work lives here, so why not my blog?

Incidentally, I'm using org-mode in Emacs to author the posts and a modified version of the static blog generation tool to generate the HTML and RSS.