You are here

Array#compact vs. checking for nil manually

In the prototype code for a project, I wrote something like:

  1. items << item if item

As is, it's not an ugly line. However, it's inside a loop that creates an object based on each line in a text file. So, to provide context:

  1. items = []
  2.  
  3. File.open( filename, 'r' ) do |file|
  4.   file.each_lines do |line|
  5.     item = build_item( line.strip )
  6.     items << item if item
  7.   end
  8. end

So if the line contains one million lines, it runs that statement one million times. (As to why I'm testing, build_item may return nil if the line matches a given set of criteria.)

Array#compact returns a copy of the array with all nil elements removed from the array. (Array#compact! removes all nil elements from the current array.) Removing the if condition from the loop and adding items.compact! after the File.open block should do the same job as what I originally wrote.

To test, I wrote this program:

  1. require 'benchmark'
  2.  
  3. test_array1 = Array.new
  4.  
  5. 10_000_000.times do
  6.   test_array1 << "Foo"
  7.   test_array1 << nil
  8. end
  9.  
  10. test_array2 = Array.new( test_array1 )
  11. test_array3 = Array.new( test_array1 )
  12.  
  13. new_array = Array.new
  14.  
  15. Benchmark.bm do |x|
  16.   x.report( '#compact' ) { test_array1.compact }
  17.   x.report( '#compact!' ) { test_array2.compact! }
  18.   x.report( '#delete' ) { test_array3.delete( nil ) }
  19.   x.report( '#each' ) do
  20.     test_array1.each do |item|
  21.       new_array << item if item
  22.     end
  23.   end
  24. end

Running this under Ruby 1.9.1 (p243) returned these results:

      user     system      total        real
#compact  0.240000   0.100000   0.340000 (  0.336496)
#compact!  1.100000   0.120000   1.220000 (  1.218401)
#delete  3.140000   0.070000   3.210000 (  3.213278)
#each  3.130000   0.050000   3.180000 (  4.720402)

(For the curious, twenty runs of the same script returned similar numbers.)

The numbers indicate that Array#compact is significantly faster than the original code. Whether or not this optimization is needed is another matter entirely. Even for an array of twenty million elements, the difference in time is about four seconds. I suppose the difference between the two should come down to which is easier to read and understand.

I also decided to test Array#delete since items.delete( nil ) should be equivalent to item.compact!. Array#compact!, somewhat unsurprisingly, is faster although only by about two seconds. I'm not sure why Array#compact! should be almost a second slower than Array#compact.

Topics: 

Add new comment