Wednesday, February 14, 2007

Ruby Blocks, Closures, and Continuations

Recently, while blogging about the Ruby.NET project, it was pointed out to me that I was incorrect about Ruby.NET not supporting closures. Brendan noted that in fact continuations are what is not yet supported. I quickly realized that I was not entirely clear about the differences between a closure and a continuation, except that they both used code blocks. I decided that it must be time that I figured it out.

When in doubt about any aspect of Ruby, it is best to go back to Matz. I found an interview with Matz conducted back in 2003 by Bill Venners. First, let us get Matz's definition of a block:


Yukihiro Matsumoto: Blocks are basically nameless functions...Basically, you can pass a nameless function to another function, and then that function can invoke the passed-in nameless function. For example, a function could perform iteration by passing one item at a time to the nameless function. This is a common style, called higher order function style, among languages that can handle functions as first class objects...In Ruby, the difference is mainly a different kind of syntax for higher order functions. In other languages, you have to specify explicitly that a function can accept another function as an argument. But in Ruby, any method can be called with a block as an implicit argument. Inside the method, you can call the block using the yield keyword with a value.


OK, so far so good. Any beginning Rubyist has seen the fun with blocks:

class Array
def find
for i in 0...size
value = self[i]
return value if yield(value)
end
return nil
end
end

puts [1, 3, 5, 7, 9].find {|v| v*v > 30 }

The block in the above example is used to pass a function used to tell the the "find" method how to determine when the search is complete.

SO what is a closure? Matz, enlighten us:


Yukihiro Matsumoto: ...we can create a closure out of a block. You can pass around a nameless function object, the closure, to another method to customize the behavior of the method. As another example, if you have a sort method to sort an array or list, you can pass a block to define how to compare the elements. This is not iteration. This is not a loop. But it is using blocks...

Bill Venners: What makes a block a closure?

Yukihiro Matsumoto: A closure object has code to run, the executable, and state around the code, the scope. So you capture the environment, namely the local variables, in the closure. As a result, you can refer to the local variables inside a closure. Even after the function has returned, and its local scope has been destroyed, the local variables remain in existence as part of the closure object.


Alright, so that sounds like a pretty good definition of a closure. The Wikipedia page for Ruby has a nice, simple closure example:

# In an object instance variable (denoted with '@'), remember a block.
def remember(&a_block)
@block = a_block
end

# Invoke the above method, giving it a block that takes a name.
remember {|name| puts "Hello, #{name}!"}

# When the time is right (for the object) -- call the closure!
@block.call("John")
# => "Hello, John!"

So storing a block of code into a variable makes it a closure. It can then be called later on using the "call" method.

So what is a continuation, then? A short perusal of the comp.lang.ruby group led me to this video from RubyConf 2005 where Chad Fowler and Jim Weirich explain continuations.

According to them, "A continuation is whatever happens after a block is done executing". In other words, a continuation is an object that can restore the entire context of a program back to where it was when the continuation was saved. Sounds simple enough, but the ramifications can be a little brain boggling.

To summarize a little from Jim Weirich’s blog on Ruby:


Ruby provides continuation functionality in the form of the callcc method. The continuation is passed to a block, and within the block we can do anything we want to with the continuation. Calling it will immediately cause the callcc call that created the continuation to return the value given to the continuation.


Jim provides a nice simple code example, too:


def level3(cont)
cont.call("RETURN THIS")
end

def level2(cont)
level3(cont)
return "NEVER RETURNED"
end

def top_level_function
callcc { |cc|
level2(cc)
}
end

answer = top_level_function
puts answer # => "RETURN THIS"


The block passed into the "callcc" method has a parameter which is also a block. This parameter block is a bit of code that returns control back to the original calling code, immediately after the "callcc" method was called, along with restoring the entire context from when the original call was made.

For example, in the code example above, when the level3 method hits the line with cont.call("RETURN THIS"), the code returns to the top_level_function to immediately after where the callcc method was called in the first place. The parameter passed into the call method within the level3 method is returned by the callcc function, then returned by top_level_function. This is why "RETURN THIS" is displayed by the above code snippet.

Continuation are very cool and powerful, but I can see why most people don't get them. A little search of Krugle shows very few matches for either "callcc" or "Continuation" within the Ruby projects indexed there. It would appear very few people are using them. No wonder they might be dropping them from Ruby 2.0.

8 comments:

Unknown said...

It should be noted that you haven't actually given an example of closures in your post: your first example is merely the definition and usage of an anonymous first-class function (a block in ruby-lingo): it doesn't use its definition context, and therefore even though the function does technically close over its context, that fact is not demonstrated and completely useless.

Now if we want to demonstrate a closure we need a few things: a scope in which the block is created (a function will do), and data local to that scope. For example:

class GreetingGenerator
def initialize(greeting)
@block = lambda {|name| puts "#{greeting}, #{name}"}
end
def greet(name)
@block.call name
end
end

frenchGreeting = GreetingGenerator.new "Bonjour"
englishGreeting = GreetingGenerator.new "Hello"
southernGreeting = GreetingGenerator.new "Howdy"
japaneseGreeting = GreetingGenerator.new "Ohayou"
germanGreeting = GreetingGenerator.new "Guten Tag"

frenchGreeting.greet "John"
englishGreeting.greet "John"
southernGreeting.greet "John"
japaneseGreeting.greet "John"
germanGreeting.greet "John"

puts greeting # just checking that the `greeting` variable doesn't exist in the main scope

In this case, each block carries a little information with it ("greeting").

This information doesn't come from its application context (the `greet` method), nor does it come from the global scope (as we check in the script's last line): it comes from the definition scope of the block (the `initialize` method).

The blocks close over their definition context and carry that context with them long after the definition scope has been closed and their definition contexts (potentially) reclaimed.

We also check that the `closure` is block-specific by creating multiple blocks and calling them only after all of them have been created.

Ron Evans said...

It is indeed true that an important distiction between blocks and closures is that there is data available within the scope of the closure. Thank you for your very nice example, which does show that better than mine did.

jaaron said...

I'm glad the issues with your closure example were pointed out. This is very abstract stuff that is really hard to wrap one's head around (which has led to a lot of contradictory and just plain wrong descriptions of closures, and continuations, and so forth).

Your continuation example doesn't illustrate one of the most compelling uses of call/cc: coroutines. The idea is that you can achieve a form of cooperative concurrency in a single threaded environment using call/cc.

Example:

def foo(cont)
5.times do
puts "hello world"
callcc{ |cc| cont.call(cc) }
end
end

def bar(cont)
5.times do
puts "goodbye world"
callcc {|cc| cont.call(cc) }
end
end

# invoke foo
foo(lambda{|x| bar x})

# Note: foo is invoked with a block that calls bar as its argument.
# this will cause the two routines to alternate execution.
# the output will look like:
hello world
goodbye world
hello world
goodbye world
hello world
goodbye world
hello world
goodbye world
hello world
goodbye world
=> 5

Continuations also provide an alternate model of computation; instead of calling a subroutine, waiting for it to return, then using its result, the subroutine is called with the current continuation the callee does its thing then rather than returning executes the continuation it received.

jaaron said...

my coroutine example is somewhat flawed, in truth, the bar is just getting invoked 5 times (rather than invoked once and continued four times). A better example is:

def foo(cont)
  puts "hello world foo"
  cont2 = callcc{|cc| cont.call cc}
  puts "goodbye world foo"
  callcc{|cc| cont2.call cc}
end

def bar(cont)
  puts "hello world bar"
  cont2 = callcc{|cc| cont.call cc}
  puts "goodbye world bar"
  callcc{|cc| cont2.call cc}
end

Calling this as
foo(lambda{|x| bar x})

will cause it to output
hello world foo
hello world bar
goodbye world foo
goodbye world bar
=> #<Continuation:0x4991c>

(note that it returns a continuation, this is key for coroutines, each one returns a continuation for the other one to invoke some time later).

Ron Evans said...

Good point about using continuations as a form of concurrency. As as example, here is what looks like the equivalent code using Ruby Threads:

def foo
  puts "hello world foo"
  Thread.pass
  puts "goodbye world foo"
  Thread.pass
end

def bar
  puts "hello world bar"
  Thread.pass
  puts "goodbye world bar"
  Thread.pass
end

f = Thread.new do
  foo
end

b = Thread.new do
  bar
end

f.join

Will output:
hello world foo
hello world bar
goodbye world foo
goodbye world bar

However, even though the output is the same, the path of execution followed is very different in these two bits of code.

The "Thread.pass" method does also preserve the state, just like the "callcc" method. Also, like "callcc", eventually control will return to the next line of code after the "Thread.pass".

However, "Thread.pass" just gives up control to the thread scheduler. You do not have control over which thread executes next without using some kind of thread synchronization object. If there was another thread waiting to be scheduled, it might execute first.

The "callcc" method, on the other hand, allows you to have control by passing in the specific block of code to be executed before returning to continue execution immediately after where the callcc method was invoked. Likewise, local state is preserved for objects in scope.

Interestingly, in to the Fowler/Weirich presentation, they say that the Ruby language implementation of Threads and Continuations uses a shared code base. In part, this may explain why the Ruby.NET project is not yet supporting either one.

서광열 said...

One big problem of callc is that it often obscures the control flow of a program. People never know what really happens when they see a bunch of callcc and call methods.

setjmp/jongjmp functions in C are similar to callcc and extensive use of those function makes the code less redable. However, in some limited circumstances such as error handling, they are very powerful tools to hack.

Peter Kofler said...

[...] Some code examples were taken from Ruby Blocks, Closures, and Continuations. [...]

Thank you for your examples. They helped me creating a presentation about Concepts of Functional Programming, given at a local user group event.

larin said...

Good post, I hadn't realized that a block and a closure had differences.