Miscellanea from Rails-land

I have been experimenting with a Ruby on Rails project recently. Part of the project requires importing large (almost 150 MB, almost one million lines) text files of a given format into a normalized database. I have been working with this part of the project first because I want to find out how much space each log file will take up in the database.

Here are some of my notes and observations:

  • The original draft of the code parsed each line and then called save on the object for that line. Unfortunately, this script took six hours to run using script/runner, and I really need this process to take around one minute.

    ActiveRecord::Extensions (found via this post on the Accelerate HR blog) provides a method to import a large number of records at once. Configured to write 1,000 records at a time to the database, the SQLite version took about four and a half hours. With a "chunk" size of 10,000, the script ran for over nine hours before I stopped it manually, because chunks that large caused it to start swapping to disk. (Servers with only 512 MB of memory are no longer as useful as they used to be.)

    Switching to MySQL and normalizing the data further results in a faster run time with a chunk size of 1,000. However, even then, the import script takes about three hours to run.

  • Mixins can be used to share behavior between related models without repeating yourself. Jamis Buck and DHH call these "concerns". The use of mixins in this way does clean up the models considerably since there's a lot of similar code. However, I do have models that look like:
    class Model
      include Mixin1
      include Mixin2
    end
  • At least on my test server (running CentOS 5.3), the version of Ruby packaged with the OS misbehaves when using large amounts of memory. Something causes a bit to be unset, which yields strange errors. Sometimes it's just a segmentation fault. Sometimes ActiveRecord complains that the object has no "pime" field (when it should have been "time") or reports other such aberrations. And then there was the time the SQLite driver complained that "INSART" was not a valid SQL command.

    These issues do not appear under the version of Ruby Enterprise Edition I installed, which suggests that the Ruby RPM shipped with the OS is broken.

  • I installed Ruby Enterprise Edition because of the suggested gains in performance and because I plan to eventually run the completed application through Passenger. However, I wonder how the performance of REE compares to Ruby 1.9.1 for this application.
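
The chunked import from the first item can be sketched roughly as follows. This is a minimal illustration, not the actual import script: the `parse_line` helper, the three-field line format, and the sample log lines are all hypothetical, and the block passed to `import_in_chunks` stands in for whatever bulk-insert call the database library provides. The point is that only one chunk of parsed rows is held in memory at a time, which is why the chunk size matters on a small server.

```ruby
require "stringio"

# Hypothetical parser: assumes each line is "time host message".
def parse_line(line)
  time, host, message = line.chomp.split(" ", 3)
  { time: time, host: host, message: message }
end

# Reads the input line by line and yields arrays of at most
# chunk_size parsed rows. Returns the total number of rows seen.
def import_in_chunks(io, chunk_size: 1_000)
  total = 0
  io.each_line.each_slice(chunk_size) do |lines|
    rows = lines.map { |line| parse_line(line) }
    yield rows # a bulk insert of this chunk would go here
    total += rows.size
  end
  total
end

# Made-up sample data standing in for a real log file.
log = StringIO.new(<<~LOG)
  12:00:01 web1 GET /
  12:00:02 web1 GET /about
  12:00:03 web2 POST /login
LOG

count = import_in_chunks(log, chunk_size: 2) do |rows|
  puts "would insert #{rows.size} rows"
end
puts "#{count} rows total"
```

With a real file, `File.open(path)` would take the place of the `StringIO`, and a larger chunk size trades fewer database round trips for more memory per chunk.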
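
The "bit being unset" theory from the broken-Ruby item can be checked against the two corrupted strings quoted there. The sketch below uses a hypothetical helper, `differing_bits`, to compare the ASCII codes of the expected and observed characters. It only shows that the quoted symptoms are consistent with a single cleared bit; it says nothing about where in the interpreter the corruption happens.

```ruby
# Returns the bit positions (0 = least significant) at which the
# ASCII codes of two single-character strings differ.
def differing_bits(expected, observed)
  diff = expected.ord ^ observed.ord
  (0..7).select { |bit| diff[bit] == 1 }
end

# "time" showed up as "pime", and "INSERT" as "INSART".
p differing_bits("t", "p")
p differing_bits("E", "A")

# Clearing that one bit (mask 0x04) in the expected character
# reproduces the corrupted character in both cases.
p ("t".ord & ~0x04).chr
p ("E".ord & ~0x04).chr
```

Both pairs differ in exactly one bit, and it is the same bit position in each case, with the bit set in the expected character and clear in the observed one.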