Array#flatten -> Array#collapse (my first C extension)
If you work in Ruby, and probably in other dynamically typed languages,
you’ve likely had to make use of flatten. Say you have an array like
this: [[3, [3, 3, 3]], [3, [3, 3, 3]], [3, [3, 3, 3]]] and what you
want is an array like this: [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3].
Trivial, you say: array.flatten.
What if you didn’t have access to flatten? If you had to reimplement it,
what would you write?
Well, it’s helpful, to me anyway, to think about how lists (arrays) are recursive structures like trees. When we want to deal with deeply nested recursive stuff, we can always turn to a recursive function:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
def flatten(array)
  flat = []
  rec_flat = proc {|a,f|
    a.each do |e|
      if e.is_a? Array
        rec_flat.call(e,f)
      else
        f << e
      end
    end
  }
  rec_flat.call(array,flat)
  flat
end
Though as I discussed in more recursion we want to be careful about exhausting the stack. So, let’s manage the stack ourselves manually:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
class Array
  def flat_stack
    flat = []
    stack = [] << self
    until stack.empty?
      e = stack.pop
      e.each do |a|
        if a.is_a? Array
          stack << a
        else
          flat << a
        end
      end
    end
    flat
  end
end
This is roughly how the Ruby source code handles the situation, though you’ll see they also take care to make sure recursive references to the array inside the array don’t cause an infinite loop.
I don’t really recommend rewriting this in Ruby, but the slowdown/memory bloat is only a constant factor:
 

So, not super useful to rewrite this in Ruby. But there’s one thing that
bugs me about flatten in Ruby. It’s not open to a block. You can do
flatten {|e| p e } but you’ll see Ruby just throws it away. Another
common use case is flatten.compact. Likewise, there’s no way to tell
flatten to ignore any nil values. This is kinda a bummer to me because
given flatten has to traverse the entire data structure, we might want
to save ourselves a constant factor by piggybacking on that traversal.
So here’s a version of flat_stack that gives us that flexibility:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
class Array
  def flat_stack(keep_nil = true)
    flat = []
    stack = [] << self
    until stack.empty?
      e = stack.pop
      e.each do |a|
        if a.is_a? Array
          stack << a
        else
          a = yield a if block_given?
          if keep_nil
            flat << a
          else
            flat << a unless a.nil?
          end
        end
      end
    end
    flat
  end
end
However, it still is better, performance-wise, to just use
flatten.compact:

(Unpictured: Just utilizing the block and not the compaction is a little better performing, still not worth using over the normal idiom.)
This got me a bit curious, and my curiosity was such to overcome my fear: why not try to write a C extension to accomplish this?
Unfortunately, this is me writing C:

But! With help from my partner, a slew of blog posts, and the kind people on LGBT#programming I dove in and gave it a shot. First, here’s the resulting performance:

Colors might be a bit hard to see, but collapse performs better than
flatten + compact. And because I wrote mine with the exact same
allocations as the original core Ruby, the memory performance is the same:

If you want to write your own C extension, my original experiment
(patterned on
this)
is a decent template of what you’ll minimally need, and it’s on
github. Essentially all I needed
was an extconf.rb, a .c file, and for my own TDD preference I also
used a _spec.rb file to actually drive the C code.
If you’d like to make use of Array#collapse because you trust what
a golden lab writes on his day off, feel free to grab the
gem off
Rubygems.
 
				 
				