Recipes for Parsing CSV

These recipes are specific code examples for specific CSV parsing tasks.

For other recipes, see Recipes for CSV.

All code snippets on this page assume that the following has been executed:

require 'csv'

Contents

Source Formats

You can parse CSV data from a String, from a File (via its path), or from an IO stream.

Parsing from a String

You can parse CSV data from a String, with or without headers.

Recipe: Parse from String with Headers

Use class method CSV.parse with option headers to read a source String all at once (may have memory resource implications):

string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
CSV.parse(string, headers: true) # => #<CSV::Table mode:col_or_row row_count:4>

Use instance method CSV#each with option headers to read a source String one row at a time:

CSV.new(string, headers: true).each do |row|
  p row
end

Output:

#<CSV::Row "Name":"foo" "Value":"0">
#<CSV::Row "Name":"bar" "Value":"1">
#<CSV::Row "Name":"baz" "Value":"2">
Recipe: Parse from String Without Headers

Use class method CSV.parse without option headers to read a source String all at once (may have memory resource implications):

string = "foo,0\nbar,1\nbaz,2\n"
CSV.parse(string) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]

Use instance method CSV#each without option headers to read a source String one row at a time:

CSV.new(string).each do |row|
  p row
end

Output:

["foo", "0"]
["bar", "1"]
["baz", "2"]

Parsing from a File

You can parse CSV data from a File, with or without headers.

Recipe: Parse from File with Headers

Use class method CSV.read with option headers to read a file all at once:

string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
CSV.read(path, headers: true) # => #<CSV::Table mode:col_or_row row_count:4>

Use class method CSV.foreach with option headers to read one row at a time:

CSV.foreach(path, headers: true) do |row|
  p row
end

Output:

#<CSV::Row "Name":"foo" "Value":"0">
#<CSV::Row "Name":"bar" "Value":"1">
#<CSV::Row "Name":"baz" "Value":"2">
Recipe: Parse from File Without Headers

Use class method CSV.read without option headers to read a file all at once:

string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
CSV.read(path) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]

Use class method CSV.foreach without option headers to read one row at a time:

CSV.foreach(path) do |row|
  p row
end

Output:

["foo", "0"]
["bar", "1"]
["baz", "2"]

Parsing from an IO Stream

You can parse CSV data from an IO stream, with or without headers.

Recipe: Parse from IO Stream with Headers

Use class method CSV.parse with option headers to read an IO stream all at once:

string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
File.open(path) do |file|
  CSV.parse(file, headers: true)
end # => #<CSV::Table mode:col_or_row row_count:4>

Use class method CSV.foreach with option headers to read one row at a time:

File.open(path) do |file|
  CSV.foreach(file, headers: true) do |row|
    p row
  end
end

Output:

#<CSV::Row "Name":"foo" "Value":"0">
#<CSV::Row "Name":"bar" "Value":"1">
#<CSV::Row "Name":"baz" "Value":"2">
Recipe: Parse from IO Stream Without Headers

Use class method CSV.parse without option headers to read an IO stream all at once:

string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
File.open(path) do |file|
  CSV.parse(file)
end # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]

Use class method CSV.foreach without option headers to read one row at a time:

File.open(path) do |file|
  CSV.foreach(file) do |row|
    p row
  end
end

Output:

["foo", "0"]
["bar", "1"]
["baz", "2"]

RFC 4180 Compliance

By default, CSV parses data that is compliant with RFC 4180 with respect to:

Row Separator

RFC 4180 specifies the row separator CRLF (Ruby "\r\n").

Although the CSV default row separator is "\n", the parser also by default handles row separator "\r" and the RFC-compliant "\r\n".

Recipe: Handle Compliant Row Separator

For strict compliance, use option :row_sep to specify row separator "\r\n", which allows the compliant row separator:

source = "foo,1\r\nbar,1\r\nbaz,2\r\n"
CSV.parse(source, row_sep: "\r\n") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]

But rejects other row separators:

source = "foo,1\nbar,1\nbaz,2\n"
CSV.parse(source, row_sep: "\r\n") # Raised MalformedCSVError
source = "foo,1\rbar,1\rbaz,2\r"
CSV.parse(source, row_sep: "\r\n") # Raised MalformedCSVError
source = "foo,1\n\rbar,1\n\rbaz,2\n\r"
CSV.parse(source, row_sep: "\r\n") # Raised MalformedCSVError
Recipe: Handle Non-Compliant Row Separator

For data with non-compliant row separators, use option :row_sep. This example source uses semicolon (";") as its row separator:

source = "foo,1;bar,1;baz,2;"
CSV.parse(source, row_sep: ';') # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]

Column Separator

RFC 4180 specifies column separator COMMA (Ruby ",").

Recipe: Handle Compliant Column Separator

Because the CSV default comma separator is ',', you need not specify option :col_sep for compliant data:

source = "foo,1\nbar,1\nbaz,2\n"
CSV.parse(source) # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
Recipe: Handle Non-Compliant Column Separator

For data with non-compliant column separators, use option :col_sep. This example source uses TAB ("\t") as its column separator:

source = "foo,1\tbar,1\tbaz,2"
CSV.parse(source, col_sep: "\t") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]

Quote Character

RFC 4180 specifies quote character DQUOTE (Ruby "\"").

Recipe: Handle Compliant Quote Character

Because the CSV default quote character is "\"", you need not specify option :quote_char for compliant data:

source = "\"foo\",\"1\"\n\"bar\",\"1\"\n\"baz\",\"2\"\n"
CSV.parse(source) # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
Recipe: Handle Non-Compliant Quote Character

For data with non-compliant quote characters, use option :quote_char. This example source uses SQUOTE ("'") as its quote character:

source = "'foo','1'\n'bar','1'\n'baz','2'\n"
CSV.parse(source, quote_char: "'") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]

Recipe: Allow Liberal Parsing

Use option :liberal_parsing to specify that CSV should attempt to parse input not conformant with RFC 4180, such as double quotes in unquoted fields:

source = 'is,this "three, or four",fields'
CSV.parse(source) # Raises MalformedCSVError
CSV.parse(source, liberal_parsing: true) # => [["is", "this \"three", " or four\"", "fields"]]

Special Handling

You can use parsing options to specify special handling for certain lines and fields.

Special Line Handling

Use parsing options to specify special handling for blank lines, or for other selected lines.

Recipe: Ignore Blank Lines

Use option :skip_blanks to ignore blank lines:

source = <<-EOT
foo,0

bar,1
baz,2

,
EOT
parsed = CSV.parse(source, skip_blanks: true)
parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]]
Recipe: Ignore Selected Lines

Use option :skip_lines to ignore selected lines.

source = <<-EOT
# Comment
foo,0
bar,1
baz,2
# Another comment
EOT
parsed = CSV.parse(source, skip_lines: /^#/)
parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]

Special Field Handling

Use parsing options to specify special handling for certain field values.

Recipe: Strip Fields

Use option :strip to strip parsed field values:

CSV.parse_line(' a , b ', strip: true) # => ["a", "b"]
Recipe: Handle Null Fields

Use option :nil_value to specify a value that will replace each field that is null (no text):

CSV.parse_line('a,,b,,c', nil_value: 0) # => ["a", 0, "b", 0, "c"]
Recipe: Handle Empty Fields

Use option :empty_value to specify a value that will replace each field that is empty (String of length 0);

CSV.parse_line('a,"",b,"",c', empty_value: 'x') # => ["a", "x", "b", "x", "c"]

Converting Fields

You can use field converters to change parsed String fields into other objects, or to otherwise modify the String fields.

Converting Fields to Objects

Use field converters to change parsed String objects into other, more specific, objects.

There are built-in field converters for converting to objects of certain classes:

Other built-in field converters include:

You can also define field converters to convert to objects of other classes.

Recipe: Convert Fields to Integers

Convert fields to Integer objects using built-in converter :integer:

source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, converters: :integer)
parsed.map {|row| row['Value'].class} # => [Integer, Integer, Integer]
Recipe: Convert Fields to Floats

Convert fields to Float objects using built-in converter :float:

source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, converters: :float)
parsed.map {|row| row['Value'].class} # => [Float, Float, Float]
Recipe: Convert Fields to Numerics

Convert fields to Integer and Float objects using built-in converter :numeric:

source = "Name,Value\nfoo,0\nbar,1.1\nbaz,2.2\n"
parsed = CSV.parse(source, headers: true, converters: :numeric)
parsed.map {|row| row['Value'].class} # => [Integer, Float, Float]
Recipe: Convert Fields to Dates

Convert fields to Date objects using built-in converter :date:

source = "Name,Date\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2001-02-03\n"
parsed = CSV.parse(source, headers: true, converters: :date)
parsed.map {|row| row['Date'].class} # => [Date, Date, Date]
Recipe: Convert Fields to DateTimes

Convert fields to DateTime objects using built-in converter :date_time:

source = "Name,DateTime\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2020-05-07T14:59:00-05:00\n"
parsed = CSV.parse(source, headers: true, converters: :date_time)
parsed.map {|row| row['DateTime'].class} # => [DateTime, DateTime, DateTime]
Recipe: Convert Fields to Times

Convert fields to Time objects using built-in converter :time:

source = "Name,Time\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2020-05-07T14:59:00-05:00\n"
parsed = CSV.parse(source, headers: true, converters: :time)
parsed.map {|row| row['Time'].class} # => [Time, Time, Time]
Recipe: Convert Assorted Fields to Objects

Convert assorted fields to objects using built-in converter :all:

source = "Type,Value\nInteger,0\nFloat,1.0\nDateTime,2001-02-04\n"
parsed = CSV.parse(source, headers: true, converters: :all)
parsed.map {|row| row['Value'].class} # => [Integer, Float, DateTime]
Recipe: Convert Fields to Other Objects

Define a custom field converter to convert String fields into other objects. This example defines and uses a custom field converter that converts each column-1 value to a Rational object:

rational_converter = proc do |field, field_context|
  field_context.index == 1 ? field.to_r : field
end
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, converters: rational_converter)
parsed.map {|row| row['Value'].class} # => [Rational, Rational, Rational]

Recipe: Filter Field Strings

Define a custom field converter to modify String fields. This example defines and uses a custom field converter that strips whitespace from each field value:

strip_converter = proc {|field| field.strip }
source = "Name,Value\n foo , 0 \n bar , 1 \n baz , 2 \n"
parsed = CSV.parse(source, headers: true, converters: strip_converter)
parsed['Name'] # => ["foo", "bar", "baz"]
parsed['Value'] # => ["0", "1", "2"]

Recipe: Register Field Converters

Register a custom field converter, assigning it a name; then refer to the converter by its name:

rational_converter = proc do |field, field_context|
  field_context.index == 1 ? field.to_r : field
end
CSV::Converters[:rational] = rational_converter
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, converters: :rational)
parsed['Value'] # => [(0/1), (1/1), (2/1)]

Using Multiple Field Converters

You can use multiple field converters in either of these ways:

Recipe: Specify Multiple Field Converters in Option :converters

Apply multiple field converters by specifying them in option :converters:

source = "Name,Value\nfoo,0\nbar,1.0\nbaz,2.0\n"
parsed = CSV.parse(source, headers: true, converters: [:integer, :float])
parsed['Value'] # => [0, 1.0, 2.0]
Recipe: Specify Multiple Field Converters in a Custom Converter List

Apply multiple field converters by defining and registering a custom converter list:

strip_converter = proc {|field| field.strip }
CSV::Converters[:strip] = strip_converter
CSV::Converters[:my_converters] = [:integer, :float, :strip]
source = "Name,Value\n foo , 0 \n bar , 1.0 \n baz , 2.0 \n"
parsed = CSV.parse(source, headers: true, converters: :my_converters)
parsed['Name'] # => ["foo", "bar", "baz"]
parsed['Value'] # => [0, 1.0, 2.0]

Converting Headers

You can use header converters to modify parsed String headers.

Built-in header converters include:

You can also define header converters to otherwise modify header Strings.

Recipe: Convert Headers to Lowercase

Convert headers to lowercase using built-in converter :downcase:

source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, header_converters: :downcase)
parsed.headers # => ["name", "value"]

Recipe: Convert Headers to Symbols

Convert headers to downcased Symbols using built-in converter :symbol:

source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, header_converters: :symbol)
parsed.headers # => [:name, :value]
parsed.headers.map {|header| header.class} # => [Symbol, Symbol]

Recipe: Filter Header Strings

Define a custom header converter to modify String fields. This example defines and uses a custom header converter that capitalizes each header String:

capitalize_converter = proc {|header| header.capitalize }
source = "NAME,VALUE\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, header_converters: capitalize_converter)
parsed.headers # => ["Name", "Value"]

Recipe: Register Header Converters

Register a custom header converter, assigning it a name; then refer to the converter by its name:

capitalize_converter = proc {|header| header.capitalize }
CSV::HeaderConverters[:capitalize] = capitalize_converter
source = "NAME,VALUE\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, header_converters: :capitalize)
parsed.headers # => ["Name", "Value"]

Using Multiple Header Converters

You can use multiple header converters in either of these ways:

Recipe: Specify Multiple Header Converters in Option :header_converters

Apply multiple header converters by specifying them in option :header_converters:

source = "Name,Value\nfoo,0\nbar,1.0\nbaz,2.0\n"
parsed = CSV.parse(source, headers: true, header_converters: [:downcase, :symbol])
parsed.headers # => [:name, :value]
Recipe: Specify Multiple Header Converters in a Custom Header Converter List

Apply multiple header converters by defining and registering a custom header converter list:

CSV::HeaderConverters[:my_header_converters] = [:symbol, :downcase]
source = "NAME,VALUE\nfoo,0\nbar,1.0\nbaz,2.0\n"
parsed = CSV.parse(source, headers: true, header_converters: :my_header_converters)
parsed.headers # => [:name, :value]

Diagnostics

Recipe: Capture Unconverted Fields

To capture unconverted field values, use option :unconverted_fields:

source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, converters: :integer, unconverted_fields: true)
parsed # => [["Name", "Value"], ["foo", 0], ["bar", 1], ["baz", 2]]
parsed.each {|row| p row.unconverted_fields }

Output:

["Name", "Value"]
["foo", "0"]
["bar", "1"]
["baz", "2"]

Recipe: Capture Field Info

To capture field info in a custom converter, accept two block arguments. The first is the field value; the second is a CSV::FieldInfo object:

strip_converter = proc {|field, field_info| p field_info; field.strip }
source = " foo , 0 \n bar , 1 \n baz , 2 \n"
parsed = CSV.parse(source, converters: strip_converter)
parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]

Output:

#<struct CSV::FieldInfo index=0, line=1, header=nil>
#<struct CSV::FieldInfo index=1, line=1, header=nil>
#<struct CSV::FieldInfo index=0, line=2, header=nil>
#<struct CSV::FieldInfo index=1, line=2, header=nil>
#<struct CSV::FieldInfo index=0, line=3, header=nil>
#<struct CSV::FieldInfo index=1, line=3, header=nil>