Skip to content
This repository has been archived by the owner on Dec 6, 2018. It is now read-only.

Performance comparision for exporting CSV data in Rails/PostgreSQL using CSV library vs. COPY SQL

Notifications You must be signed in to change notification settings

robdimarco/csv_export_example

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CSV Export Example

Problem to Solve

  • Wanted to speed up an export to CSV
  • Use PostgreSQL COPY function to improve performance

How It Works

  1. Create a view in PostgreSQL with 1,000,000 rows
  2. Create a ActiveRecord object named ExampleModel backed by this view
  3. Extend ActiveRecord::Relation with two methods to_csv and to_csv_copy. to_csv uses Ruby's built in CSV while to_csv_copy uses the PostgreSQL COPY command.

Using PostgreSQL COPY

To use the PostgreSQL COPY command we need to do a few things:

  1. Get access to the raw pg connection with connection.raw_connection
  2. Execute the COPY command. We do this by using the to_sql method of ActiveRecord::Relation to get the base SQL we want to execute and then running the copy, printing to stdout.
  3. We then call the get_copy_data to get the copy data in CSV form.

What We See

For small number of records (~1,000), there is no noticeable difference in performance between the to_csv and the to_csv_copy. But as the amount of data increases, you can start to see gigantic performance improvements.

On my development machine, I see speed ups of ~9x for 50,000 rows and 30x for 500,000 rows.

To Run The Benchmark

$ bundle install
$ rake db:migrate
$ rake benchmark

Example Results

       user     system      total        real
1000 records with CSV library  :  0.090000   0.010000   0.100000 (  0.450539)
1000 records with PG COPY      :  0.000000   0.000000   0.000000 (  0.327618)
       user     system      total        real
50000 records with CSV library  :  2.650000   0.100000   2.750000 (  3.157916)
50000 records with PG COPY      :  0.040000   0.020000   0.060000 (  0.431258)
       user     system      total        real
500000 records with CSV library  : 29.190000   0.820000  30.010000 ( 31.398112)
500000 records with PG COPY      :  0.460000   0.180000   0.640000 (  1.308529)

About

Performance comparision for exporting CSV data in Rails/PostgreSQL using CSV library vs. COPY SQL

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published