When the special match variables (such as $1, $&, and $`) are not needed afterwards,
Regexp#match? or String#match? performs better than Regexp#=~ or String#=~, because it does not set them.
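
The difference can be seen in a minimal sketch. The helper names `probe_match_p` and `probe_match_op`, the pattern, and the sample string are illustrative only and do not appear in lib/csv.rb. Each helper runs in its own method so that `$~` starts out as nil.

```ruby
# Hypothetical helpers for illustration; not part of this patch.

# Regexp#match? answers true/false and leaves $~ (and $1, $&, ...) untouched.
def probe_match_p(pattern, text)
  result = pattern.match?(text)
  { result: result, last_match: $~ }
end

# Regexp#=~ returns the match offset (or nil) and populates $~, $1, ...
def probe_match_op(pattern, text)
  result = pattern =~ text
  { result: result, last_match: $~, first_group: $1 }
end

pattern = /(\d+)-(\d+)/
text = "range 10-20"

p probe_match_p(pattern, text)
p probe_match_op(pattern, text)
```

Every call site changed in this patch only tests the result for truthiness, which is why the swap should be safe there.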

This patch applies the same idea as https://github.com/ruby/ruby/pull/1836

[Fix GH-1842]

## Environment
* OS : Ubuntu 17.10
* Compiler : gcc version 7.2.0
* CPU : Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
* Memory : 16 GB

## TL;DR
Methods     | Before (i/s) | After (i/s) | Speed up
----------- | ------------ | ----------- | --------
CSV.foreach | 44.825       | 48.201      | 7.5%
CSV#shift   | 45.200       | 49.584      | 9.7%
CSV.read    | 42.968       | 46.853      | 9.0%
CSV.table   | 10.933       | 11.277      | 3.1%
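
The "Speed up" column appears to be the relative gain in iterations per second, (after / before - 1) * 100. A small sketch that recomputes it from the figures above (the `speed_up` helper and `RESULTS` constant are hypothetical names used only for this check):

```ruby
# Hypothetical helper: relative gain in percent, rounded to one decimal place.
def speed_up(before, after)
  ((after / before - 1) * 100).round(1)
end

# Iterations per second copied from the table above: [before, after].
RESULTS = {
  "CSV.foreach" => [44.825, 48.201],
  "CSV#shift"   => [45.200, 49.584],
  "CSV.read"    => [42.968, 46.853],
  "CSV.table"   => [10.933, 11.277],
}

RESULTS.each do |name, (before, after)|
  puts format("%-11s %.1f%%", name, speed_up(before, after))
end
```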

## Before
```
Calculating -------------------------------------
         CSV.foreach     44.825  (± 0.0%) i/s -    228.000  in   5.086576s
           CSV#shift     45.200  (± 0.0%) i/s -    228.000  in   5.044297s
            CSV.read     42.968  (± 0.0%) i/s -    216.000  in   5.027504s
           CSV.table     10.933  (± 0.0%) i/s -     55.000  in   5.031098s
```

## After
```
Calculating -------------------------------------
         CSV.foreach     48.201  (± 0.0%) i/s -    244.000  in   5.062256s
           CSV#shift     49.584  (± 0.0%) i/s -    248.000  in   5.001652s
            CSV.read     46.853  (± 0.0%) i/s -    236.000  in   5.037044s
           CSV.table     11.277  (± 0.0%) i/s -     57.000  in   5.054694s
```

## Benchmark code
```ruby
require 'csv'
require 'benchmark/ips'

CSV.open("/tmp/file.csv", "w") do |csv|
  csv << ["player", "gameA", "gameB"]
  1000.times do
    csv << ['"Alice"', "84.0", "79.5"]
    csv << ['"Bob"', "20.0", "56.5"]
  end
end

Benchmark.ips do |x|
  x.report "CSV.foreach" do
    CSV.foreach("/tmp/file.csv") do |row|
    end
  end

  x.report "CSV#shift" do
    CSV.open("/tmp/file.csv") do |csv|
      while line = csv.shift
      end
    end
  end

  x.report "CSV.read" do
    CSV.read("/tmp/file.csv")
  end

  x.report "CSV.table" do
    CSV.table("/tmp/file.csv")
  end
end
```

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62806 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
watson1978 2018-03-18 10:28:58 +00:00
Parent 01e9d9ac7e
Commit dce4a3f58c
1 changed file: 7 additions and 7 deletions

lib/csv.rb Normal file → Executable file

@@ -970,7 +970,7 @@ class CSV
 date: lambda { |f|
 begin
 e = f.encode(ConverterEncoding)
-e =~ DateMatcher ? Date.parse(e) : f
+e.match?(DateMatcher) ? Date.parse(e) : f
 rescue # encoding conversion or date parse errors
 f
 end
@@ -978,7 +978,7 @@ class CSV
 date_time: lambda { |f|
 begin
 e = f.encode(ConverterEncoding)
-e =~ DateTimeMatcher ? DateTime.parse(e) : f
+e.match?(DateTimeMatcher) ? DateTime.parse(e) : f
 rescue # encoding conversion or date parse errors
 f
 end
@@ -1271,7 +1271,7 @@ class CSV
 begin
 f = File.open(filename, mode, file_opts)
 rescue ArgumentError => e
-raise unless /needs binmode/ =~ e.message and mode == "r"
+raise unless /needs binmode/.match?(e.message) and mode == "r"
 mode = "rb"
 file_opts = {encoding: Encoding.default_external}.merge(file_opts)
 retry
@@ -1870,7 +1870,7 @@ class CSV
 if part.end_with?(@quote_char) && part.count(@quote_char) % 2 != 0
 # extended column ends
 csv.last << part[0..-2]
-if csv.last =~ @parsers[:stray_quote]
+if csv.last.match?(@parsers[:stray_quote])
 raise MalformedCSVError,
 "Missing or stray quote in line #{lineno + 1}"
 end
@@ -1888,7 +1888,7 @@ class CSV
 elsif part.end_with?(@quote_char)
 # regular quoted column
 csv << part[1..-2]
-if csv.last =~ @parsers[:stray_quote]
+if csv.last.match?(@parsers[:stray_quote])
 raise MalformedCSVError,
 "Missing or stray quote in line #{lineno + 1}"
 end
@@ -1899,9 +1899,9 @@ class CSV
 raise MalformedCSVError,
 "Missing or stray quote in line #{lineno + 1}"
 end
-elsif part =~ @parsers[:quote_or_nl]
+elsif part.match?(@parsers[:quote_or_nl])
 # Unquoted field with bad characters.
-if part =~ @parsers[:nl_or_lf]
+if part.match?(@parsers[:nl_or_lf])
 raise MalformedCSVError, "Unquoted fields do not allow " +
 "\\r or \\n (line #{lineno + 1})."
 else