Carving Ruby arrays
I ran across an interesting method in a colleague’s commit
recently, which monkey patched Ruby’s Array
class. To me, it nearly
epitomizes Ruby’s elegance by packing a number of powerful
techniques into a single short line of code, without being unreadably
complicated. No line noise here. While I’m not sold one way or the other
on monkey patching, I was intrigued enough by the following carve!
method to dig a bit deeper.
Note: this replicates functionality provided by Enumerable.partition
,
so there’s probably no reason to use this in production. On the other
hand, as noted by Benjamin Tan in The Ruby Closures
Book, implementing a cheap
counterfeit
is usually enlightening.
Here’s the snippet under consideration:
class Array
def carve!
dup.tap { delete_if &Proc.new } - self
end
end
It does exactly what the name carve!
implies, separating one Array
into two Arrays based on the condition passed in as a block.
How it works:
dup
makes a copy of the values in the array and returns those values in a new array. That’s pretty straightforward.tap
yields self to the block then returns itself. This is the first key part of the technique.delete_if
takes a block, and deletes every element matching the condition given by the block. Note that the deletion occurs on every match, not after the iteration is complete. Note thatdelete_if
operates on the original array, not on the duplicated array.&Proc.new
is an odd construction, even for Ruby, which can appear to have a number of odd constructions. We’ll examine this in more detail shortly. This is the third key part.
Then the original array (self
) is subtracted from the duplicated and
modified array. That looks simple enough.
The slightly mind-bending aspect of carve!
is how it processes the
block passed in as the condition. It was not at all obvious to me, so
let’s take a deeper look at that.
dup and tap
The dup.tap
calls warrant a closer look. Especially tap
which to me is one
of those super simple constructions which I don’t use very often, hence
haven’t developed an intuitive feel for it. Using tap
saves two lines of
code for the same functionality:
def carve2!
other = dup
delete_if &Proc.new
other - self
end
A case could be made for carve2
being easier to understand. A (strong) case
could also be made for mastering Ruby, tap
and all. I choose mastery.
More about #tap
Ruby 1.9 introduced the tap
method, which passes self
to a block and
returns self
after processing the block. Normally, the result of the
block would be returned. In combinatory logic, this behavior is defined
by a function called kestrel, or \(K\) Combinator:
The theoretical importance of \(K\) cannot be overstated. Combined with the \(S\) combinator, the \(SK\) calculus is equivalent to an untyped lambda calculus, i.e., is Turing complete.
But that takes us too far afield for today.
The important thing when using tap
is understanding that the block has
no implicit return, it is only used as a generator of side effects.
delete_if and passing procs
Let’s take a closer look at delete_if
. This is one of those
Ruby methods which “just works” usually without having to think much about it.
For our investigation, we need to get drill a bit deeper. We’ll start
with some REPL code:
$ irb
2.2.3 :001 > [1, 2].delete_if { |e| e < 2 }
=> [2] # as expected
2.2.3 :002 > deleter = ->(e) { e < 2 } # set us up the lambda
=> #<Proc:0x007f8f0981bcb8@(irb):3 (lambda)>
2.2.3 :003 > [1, 2].delete_if(deleter) # we have to try...
ArgumentError: wrong number of arguments (1 for 0) # as expected...
from (irb):4:in 'delete_if'
from (irb):4
from /Users/doolin/.rvm/rubies/ruby-2.2.3/bin/irb:15:in '<main>'
2.2.3 :004 > [1, 2].delete_if(&deleter)
=> [2] # success
2.2.3 :005 >
As can be seen, delete_if
doesn’t take any arguments, unless it takes a single
argument prefixed with an ampersand &
. In which case, that argument is received
as Proc object.
&Proc.new
In Ruby, Proc
is an object created from a block.
When the last argument of a method is prefixed with an ampersand &
,
that argument will receive the block as a Proc object.
From ruby-doc, Proc.new:
Creates a newProc
object, bound to the current context.Proc::new
may be called without a block only within a method with an attached block, in which case that block is converted to theProc
object.
What I need to do here is create my own method which yields, and see how
&Proc
works with that. Using map
and delete_if
doesn’t give me enough
insight into how &Proc
really works. The dup.tap
gets in the way.
How about a custom map method:
class Array
def newmap
map &Proc.new
end
end
nm = [1, 2].newmap { |element| element + 1 }
puts nm # 2, 3
And that’s the long and short of carve!
.
Carving arrays
Without making any judgement call on monkey patching, carve!
feels like an
elegant addition to Ruby’s Array
class. The implementation given here is not
likely performant due to delete_if
’s behavior of deleting by each item. On the
other hand, implementing something more computationally efficient becomes very
difficult very quickly, as the deletion criteria can be arbitrary.
The coolest thing about carve!
for me as a Ruby programmer is running across
code which stretches my comprehension of the language, something which is
increasingly rare as my Ruby ability grows.