
Last time, I blogged about Clojure's new reducers library. This time I'd like to look at the details of what constitutes a reducer, as well as some background about the library.

What's a Reducing Function?

The reducers library is built around transforming reducing functions. A reducing function is simply a binary function, akin to the one you might pass to reduce. While the two arguments might be treated symmetrically by the function, there is an implied semantic that distinguishes the arguments: the first argument is a result or accumulator that is being built up by the reduction, while the second is some new input value from the source being reduced. While reduce works from the 'left', that is neither a property nor promise of the reducing function, but one of reduce itself. So we'll say simply that a reducing fn has the shape:

(f result input) -> new-result

In addition, a reducing fn may be called with no args, and should then return an identity value for its operation.
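As a small illustration of both shapes, Clojure's own + and * already return their identity when called with no args. The max-rf function below is a hypothetical name used only for this sketch, and it assumes the inputs are longs:

```clojure
;; Called with no args, a reducing fn returns the identity value for its operation.
(+)   ;=> 0
(*)   ;=> 1

;; A hand-written reducing fn with the same two shapes.
;; `max-rf` is a made-up name for this sketch, not part of any library.
(defn max-rf
  ([] Long/MIN_VALUE)                    ; identity for max over longs
  ([result input] (max result input)))   ; (f result input) -> new-result

(reduce max-rf (max-rf) [3 9 4])
;=> 9
```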

Transforming Reducing Functions

A function that transforms a reducing fn simply takes one, and returns another one:

(xf reducing-fn) -> reducing-fn

Many of the core collection operations can be expressed in terms of such a transformation. Imagine if we were to define the cores of map, filter and mapcat in this way:

(defn mapping [f]
  (fn [f1]
    (fn [result input]
      (f1 result (f input)))))

(defn filtering [pred]
  (fn [f1]
    (fn [result input]
      (if (pred input)
        (f1 result input)
        result))))

(defn mapcatting [f]
  (fn [f1]
    (fn [result input]
      (reduce f1 result (f input)))))

There are a few things to note:

  • The functions consist only of the core logic of their operations
  • That logic does not include any notion of collection, nor order
  • filtering and kin can 'skip' inputs by simply returning the incoming result
  • mapcatting and kin can produce more than one result per input by simply operating on result more than once

Using these directly is somewhat odd, because we are operating on the reducing operation rather than the collection:

(reduce + 0 (map inc [1 2 3 4]))
;;becomes
(reduce ((mapping inc) +) 0 [1 2 3 4])
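
Because the transformers are ordinary functions of one argument, they should also compose with comp. The following is a minimal sketch that repeats the definitions above so it can be run on its own:

```clojure
;; Definitions repeated from above so the sketch is self-contained.
(defn mapping [f]
  (fn [f1]
    (fn [result input]
      (f1 result (f input)))))

(defn filtering [pred]
  (fn [f1]
    (fn [result input]
      (if (pred input)
        (f1 result input)
        result))))

(defn mapcatting [f]
  (fn [f1]
    (fn [result input]
      (reduce f1 result (f input)))))

;; The outermost transformer sees each input first:
;; keep the even inputs, then increment them, then add.
(reduce ((comp (filtering even?) (mapping inc)) +) 0 [1 2 3 4])
;=> 8

;; Swapping the order increments first, then keeps the even results.
(reduce ((comp (mapping inc) (filtering even?)) +) 0 [1 2 3 4])
;=> 6

;; The same transformer works with any reducing fn, e.g. conj.
(reduce ((mapcatting range) conj) [] [1 2 3])
;=> [0 0 1 0 1 2]
```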

Reducers

We expect map/filter etc to take and return logical collections. The premise of the reducers library is that the minimum definition of collection is something that is reducible. reduce ends up using a protocol (CollReduce) to ask the collection to reduce itself, so we can make reducible things by extending that protocol. Thus, given a collection and a reducing function transformer like those above, we can make a reducible with a function like this:

(defn reducer
  ([coll xf]
   (reify
    clojure.core.protocols/CollReduce
    (coll-reduce [_ f1 init]
      (clojure.core.protocols/coll-reduce coll (xf f1) init)))))

Now:

(reduce + 0 (map inc [1 2 3 4]))
;;becomes
(reduce + 0 (reducer [1 2 3 4] (mapping inc)))

That's better. It feels as if we have transformed the collection itself. Note:

  • reducer ultimately asks the source collection to reduce itself
  • reducer will work with any reducing function transformer
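
To make those two notes concrete, here is a small sketch (definitions repeated so it stands alone). It assumes that into is built on reduce, and it shows that a reducer can itself be the source for another reducer:

```clojure
;; Definitions repeated from above so the sketch is self-contained.
(defn mapping [f]
  (fn [f1]
    (fn [result input]
      (f1 result (f input)))))

(defn filtering [pred]
  (fn [f1]
    (fn [result input]
      (if (pred input)
        (f1 result input)
        result))))

(defn reducer [coll xf]
  (reify
    clojure.core.protocols/CollReduce
    (coll-reduce [_ f1 init]
      (clojure.core.protocols/coll-reduce coll (xf f1) init))))

;; Anything that bottoms out on reduce can consume a reducer.
(into [] (reducer [1 2 3 4] (filtering even?)))
;=> [2 4]

;; Reducers nest: each layer transforms the reducing fn once more,
;; and only the innermost source collection is ever traversed.
(reduce + 0 (reducer (reducer [1 2 3 4] (mapping inc))
                     (filtering even?)))
;=> 6
```

Note that this simplified reducer only implements the arity that takes an init value, so reduce must be called with one.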

Another objective of the library is to support reducer-based code with the same shape as our current seq-based code. Getting there is easy:

(defn rmap [f coll]
  (reducer coll (mapping f)))

(defn rfilter [pred coll]
  (reducer coll (filtering pred)))

(defn rmapcat [f coll]
  (reducer coll (mapcatting f)))

(reduce + 0 (rmap inc [1 2 3 4]))
;=> 14

(reduce + 0 (rfilter even? [1 2 3 4]))
;=> 6

(reduce + 0 (rmapcat range [1 2 3 4 5]))
;=> 20

From Reducible to (Parallel) Foldable

While it is an interesting exercise to find another fundamental way to define the core collection operations, the end result is not much different, just faster, certainly something a state-of-the-art compilation and type system (had we one) might do for us given sequence code. To stop here would be to completely miss the point of the library. These operations have different, fundamentally simpler semantics than their sequence-based counterparts.

How does one define parallel mapping/filtering/mapcatting etc? We already did! As long as the transformation itself doesn't care about order (e.g. as take does), then a reducer is as foldable as its source. As with reduce, fold bottoms out on a protocol (CollFold), and our reducer can extend that:

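(require 'clojure.core.reducers)

(defn folder
  ([coll xf]
     (reify
      ;;extend CollReduce as before
      clojure.core.protocols/CollReduce
      (coll-reduce [_ f1 init]
        (clojure.core.protocols/coll-reduce coll (xf f1) init))

      clojure.core.reducers/CollFold
      (coll-fold [_ n combinef reducef]
        (clojure.core.reducers/coll-fold coll n combinef (xf reducef))))))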

Note that:

  • folder has the same requirements as reducer - collection + reducing function transformer
  • when fold is applied to something that can't fold, it devolves to reduce
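
A minimal sketch of a folder in use, assuming the clojure.core.reducers namespace is available under the alias r. The simplified mapping and folder are repeated here so the example stands alone; they are not the library's actual definitions:

```clojure
(require '[clojure.core.reducers :as r])

(defn mapping [f]
  (fn [f1]
    (fn [result input]
      (f1 result (f input)))))

;; Simplified folder: reducible and foldable, both by transforming
;; the reducing fn and delegating to the source collection.
(defn folder [coll xf]
  (reify
    clojure.core.protocols/CollReduce
    (coll-reduce [_ f1 init]
      (clojure.core.protocols/coll-reduce coll (xf f1) init))

    r/CollFold
    (coll-fold [_ n combinef reducef]
      (r/coll-fold coll n combinef (xf reducef)))))

(def v (vec (range 1000)))

;; fold asks the vector to fold itself, potentially in parallel.
(r/fold + (folder v (mapping inc)))
;=> 500500

;; reduce on the same folder gives the same answer sequentially.
(reduce + 0 (folder v (mapping inc)))
;=> 500500
```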

Thus the real definitions of reducers/map et al use folder (while take uses reducer):

(defn rmap [f coll]
  (folder coll (mapping f)))

(defn rfilter [pred coll]
  (folder coll (filtering pred)))

(defn rmapcat [f coll]
  (folder coll (mapcatting f)))

Thus a wide variety of collection transformations can instead be expressed as reducing function transformations, and applied in both sequential and parallel contexts, across a wide variety of data structures.

The library deals with several other details, such as:

  • the transformers all need a nullary arity that just delegates to the transformed reducing function
  • the transformers support a ternary arity where 2 inputs are supplied per step, as occurs with reduce-kv and map sources
  • all of the reducers are curried

These additions are all mechanical, and are handled by macros. It is my hope that the above will help illuminate the core logic underlying the library.
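
For the first two of those details, a hand-written version might look roughly like the following. This is only a sketch of the idea; the library generates the equivalent arities with macros rather than spelling them out:

```clojure
;; Hand-written `mapping` with all three arities (a sketch, not the
;; library's macro-generated code).
(defn mapping [f]
  (fn [f1]
    (fn
      ([] (f1))                                ; nullary: delegate to f1 for its identity
      ([result input] (f1 result (f input)))   ; binary: one input per step
      ([result k v] (f1 result (f k v))))))    ; ternary: key and value per step

;; The nullary arity passes through the identity of the wrapped fn.
(((mapping inc) +))
;=> 0

(reduce ((mapping inc) +) 0 [1 2 3 4])
;=> 14

;; reduce-kv supplies the key and value as separate arguments.
(reduce-kv ((mapping (fn [k v] (* v v))) +) 0 {:a 1 :b 2 :c 3})
;=> 14
```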

Background

Much prior work highlights the value of fold as a primary mechanism for collection manipulation, superior to iteration, although most of that work was done in the context of recursively defined functions on lists or sequences - i.e. fold implies foldl/foldr, and the results remain inherently sequential.

The two primary motivators for this library were the Haskell Iteratee library and Guy Steele's ICFP '09 talk.

Haskell Iteratees

The Haskell Enumerator/Iteratee library and its antecedents are an inspiring effort to disentangle the source of data and the operations that might apply to it, and one of the first I think to reify the role of the 'iteratee'. An enumerator makes successive calls to the iteratee to supply it items, decoupling the iteratee from the data source. But the iteratee is still driving in some sense, as it is in charge of signaling Done, and, it returns on each step the next iteratee to use, effectively dictating a single thread of control. One benefit is that even operations like take can be defined functionally, as they can encode their internal state in the 'next' iteratee returned. OTOH, and unlike reducers, the design wraps the result being built up in a new iteratee each step, with potential allocation overhead.

Being an automaton in a state, an iteratee is like a reified left fold, and thus inherently serial. So, while they form quite a nice substrate for the design of, e.g. parsers, iteratees are unsuitable for defining things like map/filter etc if one intends to be able to parallelize them.

Guy Steele's ICFP '09 talk

Organizing Functional Code for Parallel Execution or, foldl and foldr Considered Slightly Harmful

This talk boils down to - stop programming with streams, lists, generators etc if you intend to exploit parallelism, as does the reducers library.

Where reducers diverges from that talk is in the structure of the fork/join parallel computation. Rather than map+reduce, reducers uses reduce+combine. This reflects 2 considerations:

  • It is accepted fork/join practice that at some point you stop splitting in half and handle the leaves 'sequentially'
    • if the best way to do that at the top is reduce, why not at the bottom as well?
  • map forces a result per input

You can see the awkwardness of the latter in the map/reduce-oriented definition of parallel filter in the talk, which must 'listify' items or return empty lists, creating a bunch of concatenation busy-work for the reducing step. Many other collection algorithms suffer similarly in their map/reduce-oriented implementations, having greater internal complexity and wrapping the results in collection representations, with corresponding creation of more garbage and reduction busy-work etc vs the reducing function transformer versions of same.
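
The difference in shape can be sketched sequentially (this is not the code from the talk, and the two helper names are hypothetical). The map/reduce-oriented version has to wrap every item, while the transformer version just skips the rejects:

```clojure
;; map+reduce style: map must yield a result for every input, so each
;; item is wrapped in a vector (or an empty one) and the reducing step
;; has to concatenate them all.
(defn filter-via-map-reduce [pred coll]
  (reduce into [] (map (fn [x] (if (pred x) [x] [])) coll)))

;; Reducing function transformer style: rejected inputs are skipped by
;; returning the incoming result, so nothing is wrapped or concatenated.
(defn filtering [pred]
  (fn [f1]
    (fn [result input]
      (if (pred input)
        (f1 result input)
        result))))

(defn filter-via-transformer [pred coll]
  (reduce ((filtering pred) conj) [] coll))

(filter-via-map-reduce even? [1 2 3 4 5 6])
;=> [2 4 6]

(filter-via-transformer even? [1 2 3 4 5 6])
;=> [2 4 6]
```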

It is interesting that the accumulator style is not completely absent from the reducers design, in fact it is important to the characteristics just described. What has been abandoned are the single initial value and serial execution promises of foldl/r.

Summary

I hope this makes reducers easier to understand, use and define.

Rich


I'm happy to have pushed today the beginnings of a new Clojure library for higher-order manipulation of collections, based upon reduce and fold. Of course, Clojure already has Lisp's reduce, which corresponds to the traditional foldl of functional programming. reduce is based upon sequences, as are many of the core functions of Clojure, like map, filter etc. So, what could be better? It's a long story, so I'll give you the ending first:

  • There is a new namespace: clojure.core.reducers
  • It contains new versions of map, filter etc based upon transforming reducing functions - reducers
  • It contains a new function, fold, which is a parallel reduce+combine
  • fold uses fork/join when working with (the existing!) Clojure vectors and maps
  • Your new parallel code has exactly the same shape as your existing seq-based code
  • The reducers are composable
  • Reducer implementations are primarily functional - no iterators
  • The model uses regular data structures, not 'parallel collections' or other OO malarkey
  • It's fast, and can become faster still
  • This is work-in-progress

Basics

The story starts best at the bottom.

Clojure and other functional languages have a function called map that takes a function and a collection/list.

  • What does it mean to map a function on a collection?
  • What are the common signatures?
  • Do they complect what to do with how to do it?

The classic recursive functional definition of map is to apply f to the first thing in the collection, then cons the result onto the result of mapping f on the rest of the collection. This definition includes plenty of 'how':

  • How: mechanism - recursion
  • How: order - sequentially
  • How: laziness - (often) lazily
  • How: representation - making a list/seq, or other concrete collection

Newer OO frameworks will often remove some of these problems by having map be a function of fn * Coll -> Coll for any type of Coll, removing the sequentiality but also losing the laziness, and they still specify a concrete collection result.

Semantically, and minimally, map means "apply-to-all" e.g. (map inc coll) means give me a (logical) collection where every item is one greater than it was in coll. But, map doesn't know how to navigate around every collection - the use of seqs/lists/iterators/streams etc forces a shared known representation. Nor does inc (or any function) know how to apply itself to every collection representation, else we could just say (inc coll).

The only thing that knows how to apply a function to a collection is the collection itself.

What is the generic gateway to a collection applying things to itself? In Clojure, it is (internal) reduce.

We now have a new super-generalized and minimal abstraction for collections - a collection is some set of things that, when given a function to apply to its contents, can do so and give you the result, i.e. a collection is (at minimum) reducible. In other words, you can call reduce on it.
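
To illustrate how little that asks of a collection, here is a sketch of a hypothetical reducible called countdown. It is not a seq and holds no elements; it just knows how to reduce itself. For brevity it ignores early termination via reduced and only supports reduce with an explicit init value:

```clojure
(require '[clojure.core.reducers :as r])

;; `countdown` is a made-up example: it feeds n, n-1, ... 1 to the
;; reducing fn without ever building a concrete collection.
(defn countdown [n]
  (reify
    clojure.core.protocols/CollReduce
    (coll-reduce [_ f init]
      (loop [acc init, i n]
        (if (pos? i)
          (recur (f acc i) (dec i))
          acc)))))

(reduce + 0 (countdown 4))
;=> 10

(into [] (countdown 3))
;=> [3 2 1]

;; Being reducible is enough for the reducers fns to work with it.
(reduce + 0 (r/map inc (countdown 4)))
;=> 14
```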

Thus, core.reducers/map is a function of fn * reducible -> reducible. (Whereas core/map is a fn of fn * seqable -> seqable)

Now, how? If someone is going to ask the result of (map inc coll) to reduce itself with some function f, map must ultimately ask coll to do the job. Rather than pass coll f, map passes coll a new, transformed, reducing function that takes what coll supplies, calls inc on it, and then calls f on that.

(reduce + (r/map inc [1 2 3])) === (reduce (fn [ret x] (+ ret (inc x))) (+) [1 2 3])

i.e. the core work of map f looks like this:

(fn [f1]
  (fn [ret v]
    (f1 ret (f v))))

It takes a reducing function f1, and returns a new reducing function that calls f1 after applying f to its input.

Thus you can define map as a function of fn * reducible -> reducible by merely transforming the reducing function. Mapping is semantically a function of the function of one step of a reduction. This transformation is decomplected from both representation and order. We call functions such as this map, that take a reducible, and in turn return something reducible via transformation of the reducing function, reducers.

Now let's revisit the hows above...

  • How: mechanism - functional transformation of reducing function
  • How: order - doesn't know
  • How: laziness - doesn't know
  • How: representation - doesn't build anything

It is important to note that now, when (map f coll) is called nothing happens except the creation of a recipe for a new collection, a recipe that is itself reducible. No work is done yet to the contained elements and no concrete collection is produced.
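
One way to observe this is to count how often the mapped function actually runs. The calls atom and counting-inc below are hypothetical helpers for this sketch only:

```clojure
(require '[clojure.core.reducers :as r])

(def calls (atom 0))

;; Like inc, but records every invocation.
(defn counting-inc [x]
  (swap! calls inc)
  (inc x))

;; Creating the recipe does no work on the elements.
(def recipe (r/map counting-inc [1 2 3]))
@calls
;=> 0

;; The work happens when the recipe is reduced.
(reduce + recipe)
;=> 9
@calls
;=> 3

;; Nothing is cached: reducing again runs the recipe again.
(reduce + recipe)
;=> 9
@calls
;=> 6
```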

The beautiful thing is that this 'transformation of reducing function' mechanism also works for many of the traditional seq functions, like filter, take, flatten etc. Note the fact that filter is (potentially) contractive, and flatten is (potentially) expansive per step - the mechanism is general and not limited to 1:1 transformations. And other reducer definitions are as pretty as map's - none of the imperativeness of iterators, or generators with yield.

Ok, So Where's My Cake?

If map doesn't do the work of mapping, but merely creates a recipe, when does the work get done? When you reduce its result:

(require '[clojure.core.reducers :as r])
(reduce + (r/filter even? (r/map inc [1 1 1 2])))
;=> 6

That should look familiar - it's the same named functions, applied in the same order, with the same arguments, producing the same result as Clojure's seq-based fns. The difference is that, reduce being eager, and these reducers fns being out of the seq game, there's no per-step allocation overhead, so it's faster. Laziness is great when you need it, but when you don't you shouldn't have to pay for it.

The reducer fns are curried, and they can be easily composed:

;;red is a reducer awaiting a collection
(def red (comp (r/filter even?) (r/map inc))) 
(reduce + (red [1 1 1 2]))
;=> 6

Thus reduction 'recipes' (reducers) are first class.

What if we want a collection result? It's good to know that into uses reduce:

(into [] (r/filter even? (r/map inc [1 1 1 2])))
;=> [2 2 2]

Note there are no intermediate collections produced.

And, of course, you don't always want a result of the same collection type:

(into #{} (r/filter even? (r/map inc [1 1 1 2])))
;=> #{2}

Simplicity is Opportunity

Decomplecting the core operations from representation and laziness has given us some speed, but what about the elimination of order? It should open the door to parallelism, but we are stuck with the semantics of reduce being foldl, i.e. it uses an accumulator and is fundamentally serial. We can parallelize reduction by using independent sub-reductions and combining their results, and the library defines a function that does just that: fold.

The primary signature of fold takes a combining function, a reducing function, and a collection and returns the result of combining the results of reducing subsegments of the collection, potentially in parallel. Obviously if the work is to occur in parallel, the functions must be associative, but they need not be commutative - fold preserves order. Note that there is no initial 'seed' or 'accumulator' value, as there may be with reduce and foldl. But, since the subsegments are themselves reduced (with reduce), it raises the question as to what supplies the seed values for those reductions?

The combining function (an associative binary fn) must have some 'identity' value, a value that, when combined with some X, yields X. 0 is an identity value for +, as is 1 for *. The combining fn must supply an identity value when called with no arguments (as do + and *). It will be called with no arguments to supply a seed for each leaf reduction. There is a fn (called monoid, shh!) to help you build such combining functions.
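
As a sketch of monoid in use, here is a fold that counts occurrences. The combining fn merges two count maps and yields an empty map when called with no arguments; merge-counts and count-item are names made up for this example:

```clojure
(require '[clojure.core.reducers :as r])

;; Combining fn: merges two count maps; with no args it returns {}.
(def merge-counts
  (r/monoid (partial merge-with +) hash-map))

;; Reducing fn: adds one item to a count map.
(defn count-item [counts x]
  (update-in counts [x] (fnil inc 0)))

(merge-counts)
;=> {}

(merge-counts {:a 1} {:a 2 :b 1})
;=> {:a 3, :b 1}

;; Each leaf reduction is seeded with (merge-counts), i.e. {}.
(r/fold merge-counts count-item [:a :b :a :c :b :a])
;=> {:a 3, :b 2, :c 1}
```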

If no combining fn is supplied, the reducing fn is used. Simple folds look like reduces:

(r/fold + [1 2 3 4])
;=> 10

But by promising less (i.e. not promising stepwise reduction from left or right) fold can do more - run in parallel. It does this when the collection is amenable to parallel subdivision. Ideal candidates are data structures built from trees. Clojure vectors and maps are trees, and have parallel implementations of fold based upon the ForkJoin framework.

What if the underlying collection is not amenable (e.g. is a sequence)? fold just devolves into reduce, producing the same semantic, if not physical, result.
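
A small sketch of that equivalence, using the arity of fold that also takes a partition size (16 here, chosen only so that a 100-element vector gets subdivided). The add-square helper is hypothetical:

```clojure
(require '[clojure.core.reducers :as r])

(def v (vec (range 1 101)))

;; Reducing fn: accumulate the square of each input.
(defn add-square [acc x]
  (+ acc (* x x)))

;; A vector can be subdivided, so the leaves may be reduced in parallel
;; and their partial sums combined with +.
(r/fold 16 + add-square v)
;=> 338350

;; A seq cannot be subdivided, so fold falls back to a plain reduce.
(r/fold 16 + add-square (seq v))
;=> 338350

;; Both agree with an ordinary reduce seeded with the identity of +.
(reduce add-square 0 v)
;=> 338350
```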

There's a tremendous amount you can accomplish with this reduce+combine strategy, especially when you consider that the map, filter etc reducers will not constitute independent layers of parallel jobs - they just transform the reducing fn working on the leaves.

You can have a look at the cat function included in the library for an interesting example of a combining fn. cat quickly gathers up the fold results, forming a binary tree with the reductions as leaves. It returns a highly abstract, yet now quite useful 'collection' that is just counted, reducible, foldable and seqable.
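
A minimal sketch of cat as a combining fn, paired with the library's append! as the reducing fn (the library also offers foldcat as a shorthand for this pairing). The name evens is just for this example:

```clojure
(require '[clojure.core.reducers :as r])

;; Gather the even numbers from a vector with a parallel fold.
(def evens
  (r/fold r/cat r/append! (r/filter even? (vec (range 10000)))))

;; The result is counted...
(count evens)
;=> 5000

;; ...reducible...
(reduce + evens)
;=> 24995000

;; ...and seqable, with the source order preserved.
(take 3 (seq evens))
;=> (0 2 4)

(= (into [] evens) (filterv even? (range 10000)))
;=> true
```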

Oh yeah, perf. Don't be surprised to see things become 2-3X faster, or more with more cores.

More Opportunity (i.e. Work)

As much fun as this is, there's still more fun to be had by those so inclined:

  • There are more seq fns that could become reducer fns
  • Given multiple iterable sources, we should be able to build a multi-reducible, recovering the multi-input capabilities of map.
  • Arrays, arraylists, strings etc are all amenable to parallel fold.
    • fork/join-based vector fold is 14 lines, so these are not difficult.
  • Those IFn.LLL, DDD etc primitive-taking function interfaces can now spring to life.
    • We should be able to build primitive-transmitting reducer function pipelines.
    • We'd then need to look for and use them in the reductions of arrays and vectors of primitives
  • Internal reduce solves the lazily dangling open resource problem, a problem solved similarly by Haskell's enumerators and iteratees. (Note that unlike iteratees, reducers do not allocate wrappers per step)
    • We need reducible I/O sources.

Summary

By adopting an alternative view of collections as reducible, rather than seqable things, we can get a complementary set of fundamental operations that trade off laziness for parallelism, while retaining the same high-level, functional programming model. Because the two models retain the same shape, we can easily choose whichever is appropriate for the task at hand.

Follow Up

See the follow-up blog post for more details about what constitutes a reducer, as well as some background about the library.

Rich


We're hitting the road! Clojure/core members have been delivering Clojure training since the language was in its infancy. Historically we've delivered this as an open registration option in partnership with The Pragmatic Studio in Reston, VA. Occasionally we'd venture out to other locations when there was a major event like Clojure/West or Clojure/conj. Now we'd like to find the cities most excited to host this training and bring it right to you!

We are continually updating our training offering to keep up with the latest changes in the Clojure world. ClojureScript is a great example of this, which we added to the training shortly after its release to the world. Our passion, both personally and professionally, is to share this language with others. The hands-on and interactive nature of the class gives you a chance to dive into the topics that interest you most.

Launch the Clojure/core roadshow website and vote for your city!

We've launched the Clojure/core Roadshow registration site for you to tell us where we should deliver our three day training program this year. If you're looking to level up in your Clojure game, this is the perfect opportunity to do so with us. Head on over to the site and register your interest today. As an added bonus, we are going to be delivering an extra day of training specifically on Datomic. This day of training will be free and open to the public, but by attending the paid training, you will have a seat reserved for you at the Datomic training as well.

If you are an organization interested in partnering with us to host an open registration training in your city, we would love to talk with you. Contact us and let's make that happen.

If you are interested in arranging a private training event, we do that as well! We can tailor the course to address the needs of your organization. Head over to Relevance to see the details around private training.


Daniel Spiewak is a force of nature. As a highly respected member of the Scala programming language community and an overall thoughtful polyglot, he seemed a natural fit as an interesting speaker for the 2011 Clojure/Conj. Daniel's talk, entitled Extreme Cleverness: Functional Data Structures in Scala, was highly energetic and astoundingly informative. Daniel's open source contributions are not to be forgotten, however. In addition to the important anti-xml Scala library, Daniel is attempting a bit of text-based collaboration magic with his Common Colaborative Coding Protocol project.

In this interview we talk about Clojure and its community, the conj function and Java.next languages learning from each other and their mutual struggle for mindshare.

What are your thoughts on the Clojure community?

Absolutely awesome! I've had the supreme pleasure of spending a fair amount of time amongst Clojurists (Clojurers??), and it's always a blast. Thanks in no small part to Rich Hickey, the Clojure community is smart, creative and absolutely intolerant of intolerance. The very fact that I was asked, as a Scala speaker, to come and present at Clojure/conj is a testament to how welcoming this community really is. Communities of this sort tend to be incredibly fertile soil for new ideas and profound advancement of the state of the art. I eagerly anticipate pillaging future innovations!

Are there any lessons that Clojure could learn from Scala?

I think there are a few lessons, the most important being that rich and uniform collections are extremely valuable. One of the things that drives me up the wall in Clojure is the following:

(conj [1 2 3] 4)     ; => [1 2 3 4]
(conj '(1 2 3) 4)    ; => (4 1 2 3)

Thus, the behavior of conj is a little bit unclear, since it depends on the input type. The object-oriented analogue to this would be if we defined two implementations of an interface, each defining the same method in opposite ways. I can't even begin to imagine what Liskov would say to that.

It should be noted that although Daniel is absolutely correct that conj depends on the input type, its idea is that it will do the most efficient action given its input type. For lists the most efficient action is to place an item at the front; for vectors, elements are put onto the back.

Even more annoying is the following:

(drop 1 [1 2 3 4])    ; => (2 3 4)

So, I drop the first element of a vector and get a list?! That seems, well, weird. Combining this with the conj issue, we can get something really bizarre:

(conj (drop 1 [1 2 3 4]) 1)        ; => (1 2 3 4)
(conj [2 3 4] 1)                   ; => [2 3 4 1]

In other words, Clojure's sequence functions complect behavior with input type.

(my understanding is that a lot of these issues are resolved in ClojureScript, but I haven't had a chance to really look yet)

Clojure's sequence abstraction is a neat idea in theory, but the practice leaves something to be desired. The root of the problem here, incidentally, is that Clojure's sequence abstraction is a little bit of subtyping embedded within an otherwise functional language. This problem could be resolved by playing the same trick that Scala's collections do, putting factory accessors on each collection to allow functions to build a collection of a dynamically-determined type.

For anyone who might not know, Scala's collections solve this in a really nice way, so everything falls out essentially the way you would expect:

List(1, 2, 3) drop 1      // => List(2, 3)
Vector(1, 2, 3) drop 1    // => Vector(2, 3)

0 +: List(1, 2, 3)        // => List(0, 1, 2, 3)
0 +: Vector(1, 2, 3)      // => Vector(0, 1, 2, 3)

We can even do fancier things, like working with data structures that can only contain certain data types:

BitSet(1, 2, 3) map { _ * 2 }       // => BitSet(2, 4, 6)
BitSet(1, 2, 3) map { _.toString }  // => Set("1", "2", "3")

It's all very nice and extremely simple (though certainly not simple at all if you want to actually write a new collection, but I digress...). Granted, it has taken Scala four full collections rewrites to get to this point, but it's nice now that we're here! Some of this is using the magic of Scala's type system (like the BitSet thing), but almost all of it could be applied to Clojure.

Aside from collections, I would say that there are a few things that Scala does better than Clojure, but I'm not entirely sure I want Clojure to go down those roads. Polymorphic modules are super-useful, but they bring with them a giant raft of complexity. Virtual dispatch is a gateway to a lot of syntactic power that Clojure really cannot achieve without dodgy macros (for example, see Scala's parser combinator DSL), but again, lots of complexity and not really a good fit for the rest of the Clojure language.

Is there room for both Scala and Clojure?

Absolutely! So, here's the thing: neither Scala nor Clojure are going to displace Java. It's just not going to happen. Both Scala and Clojure are substantially simpler than Java, but vastly harder. Java has achieved what I like to call the "boat anchor" phase of a language lifecycle, where it is so ubiquitous that it has become impossible to unseat without a massive paradigm shift. An example of such a shift would be if quantum computing came around and no one ported the JVM to run on that architecture (or perhaps if it were to run substantially worse than other VMs).

In the meantime, we have a whole bunch of alternative JVM languages whose adoption is likely to remain a rounding error for quite some time. Paradoxically, this is very good for the relationship between Scala and Clojure. Our biggest hurdle is overcoming Java's inertia, and that hurdle is so large that petty little power struggles between alternative languages are rendered insignificant.

In other words, it's not a zero sum game between Scala and Clojure. When a new developer picks up Scala or Clojure, they are choosing to bridge a gigantic chasm from the "mainstream" languages to something a little more on the fringe. Yes, this is a win for whatever specific language they choose, but it's an even bigger win for alternative languages in general. People need to realize that they have options, and in particular that they have options on the JVM. Every developer learning or being exposed to an alternative language is a win for everyone, and right now that is the lion's share of the battle.

In the long term, as Clojure and Scala grow in adoption, we may find that the languages are starting to be in competition for the same problem spaces. However, if and when this happens, I suspect the languages will be sufficiently divergent as to have generally disjoint niches. Even today, if you know both Scala and Clojure, there is rarely any ambiguity as to when you should apply one rather than the other. This is likely to continue even as alternative languages steal more and more of Java's developer share.

An idea that has a cult following in Clojure is that of optional (and pluggable) type systems.

What are your thoughts on this perspective?

That's an interesting question. Right off the bat, I think it's important to point out that a "pluggable" type system immediately implies an optional type system. There is also a weaker implication that such a type system could be split out from the normal compilation process and run separately. This is certainly something that I've heard Rich talk about on several occasions, and I think it's an interesting idea. However, there are a couple immediate things to point out.

For starters, I'm not entirely sure that a separate, pluggable static analysis phase is in fact a type system at all, particularly if you aren't using it to definitively reject programs. In a formal context, type systems are an intrinsic and (generally) inseparable part of the language. If you have a type system that is separable from the language, that type system is not a type system at all but merely static analysis. Now, there's absolutely nothing wrong with that! I think static analysis is a very useful tool and one which can answer much deeper and (often) more revealing questions than a baked-in static type checker. However, I think it's a definitional hair that's worth splitting because it has some fairly profound implications (such as the strength of your guarantees and how much you have to worry about composability).

Addressing the idea in general: I think it's a good one. The more questions we can answer about our code before it sits in front of customers, the better we are. We live in the information age, where an incredible amount of research and effort is being put into developing tools that allow us to make sense of truly astronomical amounts of data. Why shouldn't we be applying those tools and techniques to programming? I've seen some code bases which could qualify as astronomical in size, particularly once you consider the information density of most programming languages. Why shouldn't we be using rich, statistical tools to ask semantically deep questions about our code base? This seems like a natural evolution of the modern development process, at least to me. Pluggable static analyses make it possible to apply these sorts of techniques without being tied to the compilation cycle.

There are some very deep traps to beware though. For example, a "pluggable" type system is really not very useful unless you can compose multiple type systems into a coherent whole, and this is where problems arise. Anyone who has studied type theory in a formal context and run a few soundness proofs will understand that seemingly innocuous and self-contained type rules will almost always interact in surprising ways. A good example of this is extending the simply-typed lambda calculus with reference values. The moment you do this, your type system explodes in complexity, despite the conceptually tiny nature of the change.

The fact is that you can't just tease type systems apart into composable atoms. Their features are intertwined; they are the very soul of complexity. (note: I'm not saying that they are complex to use or even to understand, but they are certainly complex to design and build) So, while I think that we're going to see an up-tick in the richness and proliferation of static analysis tools, I do not think you're going to see them really replacing type systems. I see these two concepts as orthogonal and complementary.

Is there anything in Clojure that you wish that Scala provided?

Oh, wow... I think the biggest thing is that Clojure has a much stronger focus on functional programming in that it carrots developers (nearly forcibly) to control their state. Anyone reading this who hasn't already watched Rich's talk on concurrency needs to go out and do that right now (then watch it again). The notion of epochal time is quite central to Clojure, and its benefits are manifest. Scala on the other hand is a little bit more laissez-faire with respect to state. Well-written Scala code is going to keep state on a tight leash and will end up looking a great deal like well-written Clojure code, at least in terms of where state is and how it is respected. However, Scala provides comparatively few incentives for developers to do the Right Thing. Unlike Java, it doesn't provide disincentives, but it also doesn't bias the coin in the right direction as Clojure does.

On another note, Clojure is a much simpler language than Scala, and that makes it very nice for a lot of things. Clojure code has an aesthetic which really appeals to me. That's not to say that Scala is overly complex or ugly, I just happen to really like a lot of what I've seen from Clojure. I certainly think there are syntactic corners of Scala that could be substantially smoothed. Maybe not really taking inspiration from Clojure, but certainly improving things in an area where Clojure is quite strong.

By and large though, I think that most of the awesome bullet-points that Clojure hits have already been stolen wholesale by Scala. :-) A few examples: vector, map/set, agents (in Akka), STM (in Scala STM), SLIME (see ENSIME), etc. The Scala community is very actively watching the Clojure community. Y'all are a very fertile source of inspiration!


To celebrate the first day of the Clojure West conference, this entry of the (take...) series focuses on the inimitable Baishampayan Ghose. As Clojure grows and gains mindshare, more and more companies are betting their success on the language, and Baishampayan's company Infinitely Beta is quite unique. In this installment we'll focus on finding Clojure, Clojure in India, and Clojure's past conferences.

How did you discover Clojure?

I have been a Lisp aficionado for as long as I can remember. I studied Scheme when I was in college and my first job out of college was a travel startup where I had the privilege of building an air search & ticketing platform in Common Lisp.

I first read about Clojure on comp.lang.lisp in early 2008 but I was quick to dismiss it in my mind because I was not too sure about its utility and honestly, I abhorred the `J' word :-)

One day, we incurred a loss of a million Rupees at my company because of a bug that got triggered by my (apparently pure) code due to our usage of mutable datastructures in Common Lisp. That made me grow a distaste for any language that had mutable datastructures by default and I started looking at Clojure in earnest.

I watched Rich's videos, read anything that talked about Clojure and started writing little bits of code in Clojure to get a feel of the language. It took me some time to understand Clojure's state management semantics and its relationship with the JVM but when I did, I was truly enlightened.

How are you using Clojure in your business?

We are a two-year-old startup and we've been a Clojure company from day #1.

We've built two products so far, the first one being an equities research and portfolio management application for the Indian stock market trader. We used Clojure to build the backend that would do all the data-processing and expose a REST API for the frontend to consume.

The new product that we are building is a social CRM platform for enterprises. This product is built completely in Clojure from front to back. We expect to launch before the end of this year and I will make an announcement when we go live.

The only thing left for us is to move to ClojureScript and then our circle will be complete.

What motivated you to start Planet Clojure?

I was just scratching my own itch. That was mid-2009 -- we were busy building our first product and I wanted to keep a tab on all the Clojure related blogs in one place so that I could keep myself abreast with the latest happenings in the Clojure world. We have come a long way since then as we now track more than 300 Clojure blogs! A big shout-out goes to Alex Ott who has been instrumental in growing and managing Planet Clojure.

You've been to both Clojure conferences (and now the third!); what were the differences between the latest and the first?

My first reaction when I heard about Clojure/conj 2010 was "Wow! Clojure is going mainstream". When I got to the conference I was amazed to meet and interact with the stars of the Clojure community and people whom I had known only from their blogs/tweets. I was fascinated to peek into the minds of people using Clojure to solve real-world problems, and I felt that my decision to bet my startup on Clojure was not all wrong -- I felt vindicated.

To me, the first conference was about Clojure asserting itself as a first-class programming language that could be used to solve all kinds of problems elegantly; it was about firmly establishing the very ethos of the language.

But the last one took the cake. While the first one was about Clojure being able to do certain things, the last one was about showing the world how it's done.

Since the first Clojure/conj the community has been able to come up with some absolutely mind-blowing things that address some really tough problems, problems that people otherwise wouldn't dare to attempt solving.

Here we have David Nolen going into the hard-core land of Logic Programming, Pattern Matching and Predicate Dispatch, giving us tools to tackle really hard problems with equal amounts of ease and elegance. With the blessings of none other than Professor Friedman himself, I am sure David (and Ambrose) will take us to even greater heights.

We have Sam Aaron, with his amazing creation Overtone, showing us how we can transplant Clojure's mind-blowing abstractions into a completely different domain, thereby augmenting the whole creative process of making music.

And of course, there was ClojureScript, yet another gift to the world from Rich Hickey, proving once again that simplicity, power and focus, when combined, can help us solve some really tough challenges.

In a way, while Clojure/conj 2010 was about signalling the arrival of Clojure, Clojure/conj 2011 was about taking Clojure to completely uncharted territories, where no other programming language has gone before.

What is the state of Clojure in India?

As far as I know, we are probably the only Indian product company that does Clojure full-time. There are a tonne of people who are using or learning Clojure at a personal level but I guess there is still time before it's adopted by more companies. Most software companies in India do services exclusively so many times they can't bid for Clojure projects because they don't have any Clojure expertise and since they don't get those projects they don't find it worthwhile to invest in developing any expertise in Clojure. So it's kind of a chicken-and-egg problem. Having said that, India has a growing startup ecosystem and I think if Indian startup founders adopt Clojure in some way it will not only be a great competitive advantage for them, it will also help them in attracting talent -- it has certainly been that way for us.

I do a lot of Clojure evangelism in India by speaking at events, user-group meetings, etc. In the past I conducted a 2-day, completely gratis Clojure course and I would like to conduct one again, given sufficient interest.
