Jan 21, 2017

BuckleScript: a significant new OCaml to JavaScript compiler

RECENTLY I've been paying attention to the BuckleScript compiler from OCaml to JavaScript (ES5, to be exact). I'm seeing some significant advances that it brings to the state of the art, and thought I would share.

A little background: I use Scala.js at work, and it is rock-solid. Once you're set up with a project, you are pretty much good to go. Bindings to significant JS libraries are high-quality; I use some of the biggest names in JS frontend dev libraries today, and all in a nice Scala-style object-functional layer.

That aside, I still periodically look around the ecosystem and try to evaluate the latest developments. I've been hearing about BuckleScript for a while, but only recently decided to try it out for a side project after trying and failing to understand how this works in JavaScript.

So, without further ceremony, let me present my findings.

BuckleScript's compiler is insanely fast, because Bob Zhang (the project lead) has taken to heart the OCaml obsession with performance and routinely tries to optimise away millisecond-level delays. Once you get a taste of that speed (entire project compiled faster than you can blink), you'll find it difficult to go back to something slower. It's like getting to use git after you've used svn all your life.

It compiles to idiomatic, readable ES5, with nice indentation, (almost) no name mangling, and a one-to-one mapping from OCaml modules to ES modules (whichever kind you prefer: Require, AMD, Google).

It targets and integrates with the existing npm ecosystem; it doesn't try to introduce yet another package manager. It makes writing bindings (types) for existing JS libraries reasonably easy, and the documentation (the manual especially) is fantastic at guiding you through that.

OCaml is a bit of an odd duck syntax-wise, even among the functional programming languages. There are nuances to get used to. But once you get used to them, it is a pleasure to program in. And if you just can't get used to them, you can always try out Facebook's Reason, which is an alternative, JavaScript-lookalike syntax for OCaml.

This focus on integration and ease of reuse of the JavaScript ecosystem means it's feasible to leverage the npm package collection in your pure OCaml project. You can deploy a backend server which performs core functions as a statically compiled, native binary (i.e. not nodejs); deploy ES5 nodejs services which take advantage of specialised npm packages for MSSQL querying, or SOAP clients, or what have you; and you can deploy ES5 in your frontend webapp scripts, all written in pure OCaml.

So, why OCaml specifically? After all, there are plenty of nice languages out there.

As it turns out, that OCaml obsession with speed and type-safety together serve it well here. It's a pragmatic, simple, and matter-of-fact language, and its runtime model maps very well to the JavaScript runtime model, while also preserving important compile-time safety guarantees.

I should emphasise that it's pragmatic: you're not forced to deal with the added mental load of monads and other type system rabbit holes--but they're all available if you need them! Personally, I feel that laziness, purity, and monads have driven away more people than they've attracted. I think that OCaml gets the balance right. Others obviously feel differently. But in concrete terms, BuckleScript is a significant contribution that shouldn't be missed.

If you've developed in a compiled language for any length of time and like type-safety guarantees, after trying BuckleScript you'll be asking yourself how much time you've wasted over the years waiting for your compiler to finish so you can continue your edit-compile cycle. Maybe it's best not to think too much about that.

Nov 4, 2016


NANA is what I called him, but to the world he is the late Lutful Quadir Chowdhury, a well-loved husband, father, grandfather, prominent in the respect he earned from his large extended family, and a highly-regarded elder statesman of the South Asian banking community. For a time he settled down and raised his family in then-West Pakistan, but when the time came he chose to migrate to the newly-formed Bangladesh.

My Nana was a foundational rock of my childhood in Dhaka. To me his spacious home compound, with its large bungalow, lush green back lawn, and giant driveway with two garages at opposite ends, were my personal domain to explore and play in. I have many happy hours of running, jumping, cycling, cricket, soccer, badminton, carrom-board, Monopoly ... and every other childhood activity there. I remember monsoon seasons when we would bring the storm shutters down and the family would gather for tea and watch the rain pour down outside through the wall-to-wall balcony windows. Hours and hours of poring through his rooms and shelves filled to bursting with books. To this day I have dreams that are set in that house.

Nana was a collector of knick-knacks that any hipster today would give his right arm for. Hand-powered ice-cream makers, spotless Swiss food processor sets, a video cassette recorder that he would bring out to keep me entertained, a slide rule he gave me when I showed signs of interest in math ... but books got the pride of place in his home. They were proudly displayed on his shelves, kept in storage in room after room, kept near his side of the bed for easy access. I would spend hours entranced by those books. Every subject you can think of, from Greek mythology to Somerset Maugham, O. Henry (The Gift of the Magi) to Khushwant Singh. I, a small child, absorbed as much of them as he would lend me, like a sponge. They awakened my interest in chess, algebra, calculus, mythology, history, essays, poetry, ... the world.

He would take me for fun-filled trips to the video cassette rental store, treat us out to fast food, order in food whenever we visited (we accidentally got cuttlefish once instead of his order of cutlets), show me how he recorded his expenses in his ledger, and all the while carry out his responsibilities as a busy man even after retirement.

He would receive many visitors, whether friends,  family, or others; to some he would give the help or advice they sought; with others he would enjoy a few rounds of chess; and from others he would procure various services: palm tree tappers, snake charmers, gardeners, hairdressers....

He would make the rounds of his home every evening, one of his well-loved flashlights in hand, checking that all windows were securely fastened, all doors were locked and bolted shut, and that all of his household were accounted for. He would discourage anyone from leaving the house after nightfall, and be up again in the early morning, making sure everyone was awake. I've never been a morning person, and he always had to spend quite a lot of time convincing me to wake up.

I think Nana, coming from a zamindari (landowning) family that lost much of its wealth to the turbulent times, had an idea in his mind of what his life should be as a patriarch of his family, and he made that a reality through sheer hard work and persistence. He created an evergreen world, a perfect world for a childhood.

In the years after that, I was away from Bangladesh most of the time. I didn't get the chance or take the opportunity to see Nana as much as I should have. For years I kept telling myself I would meet him again and we would have another game of chess. Earlier this year I finally saw him and we had our game. I told him I would come back and we would have a rematch, and he agreed. I told him my younger brother would visit him soon, and he was glad to hear it. It gave him something to look forward to. I'm so grateful I was able to keep at least one promise.

After his passing, my overwhelming feeling is gratitude that I knew him.

Oct 16, 2016

The Birkana hexadecimal number symbols

AMONG number systems, the hexadecimal system of counting (or 'radix') has a special place in the hearts of programmers, being closely related to binary, the fundamental number system used by all modern computers. Unlike decimal, which counts ten numbers (0 through 9) before having to add a 'place' (an order of magnitude), hexadecimal allows us to count sixteen numbers (0 through 9, then A through F) before having to add a place.

While the hexadecimal notation of using the first six letters of the alphabet is practical in a rough-and-ready sort of way, no one ever accused it of being elegant. There does, however, exist a fairly elegant (and I think clever) notation for writing hexadecimal--it's called Birkana. The only problem with Birkana is that most of the internet seems to have forgotten about it.

Well, almost all. I obviously came across Birkana somewhere on the internet at some point, years ago. And there may well be many posts written about it in the back reaches of the Google search engine. But to this day, the only result I've managed to find in my searches has been a mailing list post in the discussions of The International Slide Rule Group--probably not a very prominent corner of the online world.

That's too bad, because Birkana is pretty cool and it does deserve a proper introduction. So, what is it exactly?

Birkana is a Runic symbol set for expressing the hexadecimal number radix, but it's designed in such a way that the exact shape of each number rune is built by combining a basic shape for the number zero and accents which correspond to the numbers 1, 2, 4, and 8. In the right combinations, they can express any number from 0 to F. Here are the shapes for the 'building block' numbers:






As you can see, every combination of the zero line and the accents will give us a hexadecimal number (between 0 and F). I find it easiest to sum up starting at the highest possible number, then adding on the next-highest number, and so on. Some examples:

0x3 (= 0x2 + 0x1)

0x6 (= 0x4 + 0x2)

0x9 (= 0x8 + 0x1)

0xD (= 0x8 + 0x4 + 0x1)

0xF (= 0x8 + 0x4 + 0x2 + 0x1)

Notice also the exact positions of the accents--they intersect with the zero line at either the top, middle, or bottom, and they end up parallel to exactly a quarter of the way up or down the zero line. The design may have been inspired by ancient runes, but it is very carefully thought-out.

Unfortunately, at the moment it's not practical to type in Birkana in any digital format. Theoretically, the Unicode Consortium could decide to add the Birkana symbols to the Unicode specification and some enterprising font designer could come up with a set for general use. Until then, you're unlikely to come across Birkana symbols on the internet. So for now, enjoy them here, and feel free to copy the SVG shapes off this page's source.

Feb 8, 2016

The Essence of Phantom Types in Scala

The phantom of the type opera

HEIKO Seeberger over at the Codecentric blog published an interesting post about using Scala's typelevel programming to encode phantom types in a strict way so that you could tightly control the types that are allowed to be phantasmal.

For example, if you have a hacker: Hacker[Decaffeinated], and you call hacker.hackOn, you want a compile-time error saying essentially that a decaffeinated hacker can't code on.

Heiko's techniques make some tradeoffs:

  • Having to encode the methods' phantom type requirements as type bound sandwiches or implicit evidence of types. This is required if we want to keep using object-oriented method call syntax.
  • Having to bound the Hacker's phantom type parameter to an explicit hierarchy of allowed phantom types, i.e. either State or its subtypes Caffeinated, Decaffeinated. I believe this is unnecessary, as the Hacker constructor is private and the companion object provides smart constructors that allow you to get only a Hacker[State.Caffeinated] or a Hacker[State.Decaffeinated].

If we trade away the object-oriented syntax, and give up the unnecessary phantom type hierarchy, we get:

class Hacker[S] private () /* 1 */

object Hacker {
  trait Decaffeinated /* 4 */
  trait Caffeinated /* 5 */

  val caffeinated: Hacker[Caffeinated] = new Hacker /* 7 */
  val decaffeinated: Hacker[Decaffeinated] =
    new Hacker /* 8 */

  def hackOn( /* 10 */
    hacker: Hacker[Caffeinated]): Hacker[Decaffeinated] = {
    println("Hacking, hacking, hacking!")
    decaffeinated /* 12 */

  def drinkCoffee( /* 15 */
    hacker: Hacker[Decaffeinated]): Hacker[Caffeinated] = {
    caffeinated /* 18 */

In this version, a few things are going on. By line number:

We make the phantom type parameter unbound and invariant, because we made the constructor private. Client code can't create any Hackers, it can only use the ones we provide. Also, we move all the logic out of the class body and into the companion object.
4, 5
We don't need a phantom type hierarchy, or even to seal the traits, because again client code can only use the Hackers that we provide, and the phantom type parameter is, as mentioned, invariant. Also, I don't put the states inside their own companion object because they're merely incidental to the main logic; they don't have any logic dedicated to them.
7, 8
We provide the smart constructors here. Note that the constructors don't need to be methods; they can just be values because these values are immutable. So operations on them can just keep reusing the same values.
10, 15
We make the hackOn and drinkCoffee methods both take and return a Hacker with the correct phantom type to explicitly show the transitions Hacker[Caffeinated] => Hacker[Decaffeinated] and Hacker[Decaffeinated] => Hacker[Caffeinated].
12, 18
We use our own smart constructors internally to separate interface from implementation as much as possible.

With the above code, we can get the highly desirable 'type mismatch' error that immediately tells us what's wrong:

scala> Hacker drinkCoffee Hacker.caffeinated
<console>:8: error: type mismatch;
 found   : Hacker[Hacker.Caffeinated]
 required: Hacker[Hacker.Decaffeinated]
              Hacker drinkCoffee Hacker.caffeinated

scala> Hacker hackOn (Hacker hackOn (Hacker drinkCoffee Hacker.decaffeinated))
<console>:8: error: type mismatch;
 found   : Hacker[Hacker.Decaffeinated]
 required: Hacker[Hacker.Caffeinated]
              Hacker hackOn (Hacker hackOn (Hacker drinkCoffee Hacker.decaffeinated))

Admittedly, we've given up object-oriented syntax to get here. But I personally think the tradeoff is worth it.

Dec 30, 2015

How does the State monad work?

HANDLING state in a monadic way is one of the techniques Haskellers come to learn about. But how does it work, roughly?

The following is a simplified, intuitional explanation of the state monad. It's basically a trick of function currying. Suppose you have some program state, and some functions that do something to your program state:

type MyState = Int

increaseState :: MyState -> Int -> MyState
increaseState myState n = myState + n

decreaseState :: MyState -> Int -> MyState
decreaseState myState n = myState - n

That's workable, but it's not composable: it's inconvenient to keep wrapping function calls like:

let myState = 0
in increaseState (decreaseState myState 1) 2

Because the earlier states are wrapped inside the first argument, and you have to also provide a second argument, you don't get any syntactic wins here.

But if you express the functions a little differently:

increaseState :: Int -> MyState -> MyState
increaseState n myState = myState + n

decreaseState :: Int -> MyState -> MyState
decreaseState n myState = myState - n

You can take advantage of the associativity of the function arrow and think of the function types as Int -> (MyState -> MyState). Now, this return type is a transformation from an initial state to a final state. If we alias the type: type StateWrapper = MyState -> MyState, we can write our function types as Int -> StateWrapper.

So, the new function types are:

increaseState, decreaseState :: Int -> StateWrapper

If we say increaseState 2, we get a result of type StateWrapper. If we say decreaseState 1, we get a result also of type StateWrapper. If we compose these two values (which are actually functions, remember), we get a final value of type StateWrapper, that is MyState -> MyState. In fact, no matter how many stateful operations we do, if we compose them all in sequence, we get a final function of MyState -> MyState.

At the end of the sequencing (the composition), all we have to do is feed in the initial state, and we get back the final state after carrying out all the 'stateful' operations.

So, you ask, why the need for a state monad at all? Why not just do function composition in the first place? The answer is that the state wrapper type is slightly more complicated than what we've seen above. In reality, it's closer to type StateWrapper s a = s -> (a, s), that is, given an initial state s, it returns a result value a and a final state s. A series of functions of this type can't be composed together using normal function composition; and hence we use monadic binding to do the job.

So monadic binding is a more powerful form of function composition and hence action sequencing that can handle unruly types (but which still follow a certain pattern, hence the famous (>>=) :: m a -> (a -> m b) -> m b).

To get a more accurate idea of how the state monad works, see Graham Hutton's excellent tutorial.

Feb 3, 2015

Show the Structure of Your GUI in Your Code

I LIKE creating my GUIs programmatically. I suppose I'm a traditionalist (some would say a masochist). But I feel that it gives me more control over the design and keeps together things that should live together.

The downside to this is, lots of objects all declared in the same scope, with no structure to give you a hint as to how they all fit together:

    class View {
      Form frm = new Form();
      FlowLayoutPanel pnl = new FlowLayoutPanel();
      TabControl tbc = new TabControl();
      TabPage tbp1 = new TabPage();
      Label lbl = new Label();
      TextBox txt = new TextBox();
      TabPage tbp2 = new TabPage();
      Button btn = new Button();
If you think about it though, the controls will all have fixed parent-child relationships with each other. They will effectively form a tree structure during runtime, that will give rise to the final graphical display. In fact, that's what all the XAMLs and JavaFXs and QMLs of the world model. Why not actually show these relationships in the code itself with a little bit of indentation trickery?

    class View {
      Form frm              = new Form();
        FlowLayoutPanel pnl = new FlowLayoutPanel();
          TabControl tbc    = new TabControl();
            TabPage tbp1    = new TabPage();
              Label lbl     = new Label();
              TextBox txt   = new TextBox();
            TabPage tbp2    = new TabPage();
          Button btn        = new Button();

A-ha! So the Button won't be inside the TabPage; it'll be a sibling of the TabControl itself!

Of course, this is all manual, and won't work with whitespace-sensitive languages, and can get out of date when you change your layout, and all those things. But--when it does work, it's surprising how well it works at organising my thoughts about the layout into a nice 'graphical' overview.

Jan 8, 2015

Expressive Functional Programming with Continuations in Python

In Python, statements and expressions are separate and unequal citizens. For example, statements can contain expressions but expressions can't contain statements. This prevents things like full-powered anonymous functions which can do everything that named functions can, or expression calculations which contain statements like 'import' etc.

I wanted to fix this, to give Python an expressiveness which I thought it was missing. I first thought that we could introduce a new syntax token which could 'wrap' up any number of statements and return the value of the final expression in the block. This turned out to not be feasible, for a couple of reasons, as discussed in the mailing list.

Then I had the idea that we can turn all statements into expressions. In Python, function calls are expressions. So statements can be wrapped up inside functions to turn them into expressions. Since there aren't that many statements in Python, it's feasible to just do this for all of them. Each of these functions can take a continuation to represent 'the rest of the computation'. If all statements become expressions, and so everything is an expression, this lets us define full anonymous functions because now everything inside the anonymous function can be a single expression!


To manage the continuations, we first define a data structure: a pair of (continuation, expression) which is passed around among the functions we'll define later, named whatever_. expression in the pair is meant to represent the input to the continuation, if it takes one.

A 'dead' continuation doesn't take an input value.

    def __dead(k): return (k, None)

A 'live' continuation takes an input value.

    def __live(k, x): return (k, x)

Continuations are run using trampolining to prevent stack overflow, allowing us to overcome the restriction of Python not optimising tail calls. The trampoline takes care of running each continuation with input or not depending on whether it's live or dead.

    def run_k_(prog):
      (k, x) = prog

      while k is not None:
        if x is None: (k, x) = k()
        else: (k, x) = k(x)

In short, a trampoline function like the above converts recursion into iteration. If you want to learn more about trampolines, here's a beautifully simple description of the idea and some examples in Python.

Some 'Syntax' Definitions

Below I define some 'syntax' in the form of functions which use continuation-passing style (CPS). So as discussed above, these functions are wrappers for the real syntax, and turn statements into expressions. Note that some of them are fairly simple versions of the real functionality.

A print function which just prints something and then doesn't return a value, it just sets up the rest of the computation to go ahead when it returns. This pattern can be used for all the Python statements.

    def print_(x, k):
      print x
      return __dead(k)

A primitive version of a 'let' binding. This 'returns' a value, i.e. the expr, bound to the parameter name in the continuation lambda. This pattern can be used for all expression-oriented syntax.

    def let_(expr, k): return __live(k, expr)

An assertion 'statement'.

    def assert_(expr, k):
      assert expr
      return __dead(k)

An if_ 'expression' can 'return' one of two values--so in other words it can pass on either of its input expressions to its continuation.

    def if_(test_expr, then_expr, else_expr, k):
      if test_expr: return __live(k, then_expr())
      else: return __live(k, else_expr())

A cond_ is basically a switch statement, but in the form of an expression that again 'returns' a value. Can be used as a replacement for Python's if ... elif ... else ... syntax.

    def cond_(test_pairs, default_expr, k):
      for (test_expr, then_expr) in test_pairs:
        if test_expr: return __live(k, then_expr())
      else: return __live(k, default_expr())

We can import a module and pass it along to a continuation, which will then use it and pass it along implicitly to its children continuations through the magic of closures. This is almost like a 'let' binding but it binds the name to the imported module instead of to an expression.

    def import_(mod_name, k):
      m = __import__(mod_name)
      return __live(k, m)

The try_ function is different from the others because it actually runs the 'try' continuation and (if needed) the 'except' continuation. This is because it can't just set up two different continuations to be run. It has to run one first to actually find out if there's an exception or not. It returns the 'finally' continuation because the 'finally' block should always be run whether or not an exception occurred, so it's a natural fit for being a continuation.

    def try_(expr_k, except_k, finally_k):
      try: run_k_(__dead(expr_k))
      except: run_k_(__dead(except_k))
      finally: return __dead(finally_k)

The for_ function also needs to run its act continuation on all the items in its seq (sequence), because there may be a lot of items and just queuing them up for running might build up a huge data structure. It's more efficient to flatten the structure at this point and then just set up the ending continuation after all that.

    def for_(seq, act, k):
      for x in seq: run_k_(__live(act, x))
      return __dead(k)


The end result is that we build and run a large expression that looks kind of like normal code on the left, but ends with a bunch of lambdas on the right of each line. Because Python doesn't have macros or autoclosures, we can't hide the fact that we're passing around lots of functions.

Note that we stop the 'program' at any point by passing in None as the next continuation. You can see this in the functions stored inside the dict below. Also note how we're storing code that looks pretty imperative inside the functions. If you squint a little bit you can kind of ignore the extra clutter and think of each line as a separate 'imperative' statement. Indentation becomes arguably more important here than in normal Python for readability.

Finally, remember that run_k_ ('run continuation') runs the whole thing. As long as we compose our program using only functions specially designed to work with the trampolining mechanism, like the ones above, it should all run fine.

    if __name__ == "__main__":
          { "foo":
                print_("I pity the foo!", λ:
                for_(range(5), λ x:
                  print_(x, None), None)),
                import_("math", λ m:
                let_(5, λ r:
                print_("The area of the circle is:", λ:
                print_(m.pi * r * r, None)))),
                print_("None of your bazness!", None) },
          λ dispatch_dict:

        # Call to the function inside the dispatch dict is also handled by
        # the trampolining mechanism.

For the sake of readability, I've replaced all occurrences of the keyword 'lambda' above with the Greek symbol 'λ'. Of course in real Python code we'll use the keyword (search and replace should do it).

Obviously, you won't want to program like this in normal Python. It would drive people up the wall with crazy. But for specialised use cases like storing a lot of functions inside other data structures, callback-based event handling when you want to do something more complex than just call a function, or building DSLs in Python (hey, worse things have happened), this expressive method could come in very handy.

Update: this article describes a simpler, more primitive version of what I currently have. You can follow the latest developments at the GitHub repository.