Ruby: recursion, stack size and tail call optimization

To understand recursion, you must understand recursion.
- Author Unknown

TL;DR

Recursion is a tricky programming technique. It could be very useful or very harmful, depending of its use.

The default Ruby VM (MRI) has a heap size limit to handle recursions. This may cause the catastrophic error SystemStackError for big recursion calls.

There are two ways to avoid this: increase the stack size limit (without change the code) or enable tail call optimization (changing the code).

Stack level too deep

The most famous example of recursion usage is the factorial implementation:

def factorial(n)
  raise InvalidArgument, "negative input given" if n < 0

  return 1 if n == 0
  return factorial(n - 1) * n
end

Works like a charm for small numbers:

irb> factorial(1_000).to_s.size
=> 2568

Yes, the factorial of 1000 returns a number of 2568 digits! But, let’s push the limits a little:

irb> factorial(100_000).to_s.size
SystemStackError: stack level too deep

Fail! Take that SystemStackError: stack level too deep on your face :(

Increasing Stack Size

A factorial of 100000 is sufficient to cause a SystemStackError. The easiest way to fix this is increasing the Stack Size of Ruby VM, setting the RUBY_THREAD_VM_STACK_SIZE environment variable:

export RUBY_THREAD_VM_STACK_SIZE=50000000

The stack size was setted to 50MB. And now, running the factorial again:

irb> factorial(100_000).to_s.size
=> 456574

Works like a charm. But let’s try something different…

Tail call optimization

The tail call optimization (TCO) is a programming technique available in many languages where the compiler transform a recursive function into loops.

Let’s refactor the factorial method to be tail call optimizable:

def factorial(n, accumulator = 1)
  raise InvalidArgument, "negative input given" if n < 0

  return accumulator if n == 0
  return factorial(n - 1, accumulator * n)
end

But TCO is not enabled by default on RubyVM. This should be made on runtime:

RubyVM::InstructionSequence.compile_option = {
  tailcall_optimization: true,
  trace_instruction: false
}

And now, running the factorial again:

irb> factorial(100_000).to_s.size
=> 456574

Works… but do you really need to activate TCO to use recursion with safety?

Recursionless

Recursion is great, but in most scenarios, you don’t need to use it. See this factorial implementation without recursion:

def factorial(n)
  raise InvalidArgument, "negative input given" if n < 0

  result = 1
  for i in (1..n) do
    result *= i
  end
  result
end

And running the factorial for the last time:

irb> factorial(100_000).to_s.size
=> 456574

That’s it! The same behavior without recursion, TCO or any VM aditional configuration :)

Conclusion

  • The stack size could be configured via RUBY_THREAD_VM_STACK_SIZE environment variable;
  • TCO could be enabled in runtime via RubyVM compile options;
  • TCO should be avoided since it messes with stack trace, making harder to debug;
  • If you can, don’t use recursion! Period.

References

Learn something new about recursion, TCO or RubyVM? Consider to share this post with your friends or co-workers. For questions, comments or suggestions, use the comments below. Code hard and success!