Skip to content

Commit

Permalink
Merge pull request #174 from fatkodima/constant_regexp
Browse files Browse the repository at this point in the history
Add new `Performance/ConstantRegexp` cop
  • Loading branch information
marcandre authored Oct 26, 2020
2 parents b8e8ff7 + d50ca82 commit 05bddff
Show file tree
Hide file tree
Showing 10 changed files with 185 additions and 6 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
### New features

* [#173](https://github.com/rubocop-hq/rubocop-performance/pull/173): Add new `Performance/BlockGivenWithExplicitBlock` cop. ([@fatkodima][])
* [#151](https://github.com/rubocop-hq/rubocop-performance/issues/151): Add new `Performance/ConstantRegexp` cop. ([@fatkodima][])

### Changes

Expand Down
5 changes: 5 additions & 0 deletions config/default.yml
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,11 @@ Performance/CompareWithBlock:
Enabled: true
VersionAdded: '0.46'

Performance/ConstantRegexp:
Description: 'Finds regular expressions with dynamic components that are all constants.'
Enabled: pending
VersionAdded: '1.9'

Performance/Count:
Description: >-
Use `count` instead of `{select,find_all,filter,reject}...{size,count,length}`.
Expand Down
1 change: 1 addition & 0 deletions docs/modules/ROOT/pages/cops.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ Performance cops optimization analysis for your projects.
* xref:cops_performance.adoc#performancechainarrayallocation[Performance/ChainArrayAllocation]
* xref:cops_performance.adoc#performancecollectionliteralinloop[Performance/CollectionLiteralInLoop]
* xref:cops_performance.adoc#performancecomparewithblock[Performance/CompareWithBlock]
* xref:cops_performance.adoc#performanceconstantregexp[Performance/ConstantRegexp]
* xref:cops_performance.adoc#performancecount[Performance/Count]
* xref:cops_performance.adoc#performancedeleteprefix[Performance/DeletePrefix]
* xref:cops_performance.adoc#performancedeletesuffix[Performance/DeleteSuffix]
Expand Down
40 changes: 40 additions & 0 deletions docs/modules/ROOT/pages/cops_performance.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -405,6 +405,46 @@ array.min_by(&:foo)
array.sort_by { |a| a[:foo] }
----

== Performance/ConstantRegexp

|===
| Enabled by default | Safe | Supports autocorrection | VersionAdded | VersionChanged

| Pending
| Yes
| Yes
| 1.9
| -
|===

This cop finds regular expressions with dynamic components that are all constants.

Ruby allocates a new Regexp object every time it executes a code containing such
a regular expression. It is more efficient to extract it into a constant
or add an `/o` option to perform `#{}` interpolation only once and reuse that
Regexp object.

=== Examples

[source,ruby]
----
# bad
def tokens(pattern)
pattern.scan(TOKEN).reject { |token| token.match?(/\A#{SEPARATORS}\Z/) }
end
# good
ALL_SEPARATORS = /\A#{SEPARATORS}\Z/
def tokens(pattern)
pattern.scan(TOKEN).reject { |token| token.match?(ALL_SEPARATORS) }
end
# good
def tokens(pattern)
pattern.scan(TOKEN).reject { |token| token.match?(/\A#{SEPARATORS}\Z/o) }
end
----

== Performance/Count

|===
Expand Down
8 changes: 4 additions & 4 deletions lib/rubocop/cop/mixin/regexp_metacharacter.rb
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ def literal_at_start_with_backslash_a?(regex_str)
# (tricky: \s, \d, and so on are metacharacters, but other characters
# escaped with a slash are just literals. LITERAL_REGEX takes all
# that into account.)
/\A\\A(?:#{Util::LITERAL_REGEX})+\z/.match?(regex_str)
/\A\\A(?:#{Util::LITERAL_REGEX})+\z/o.match?(regex_str)
end

def literal_at_start_with_caret?(regex_str)
Expand All @@ -35,21 +35,21 @@ def literal_at_start_with_caret?(regex_str)
# (tricky: \s, \d, and so on are metacharacters, but other characters
# escaped with a slash are just literals. LITERAL_REGEX takes all
# that into account.)
/\A\^(?:#{Util::LITERAL_REGEX})+\z/.match?(regex_str)
/\A\^(?:#{Util::LITERAL_REGEX})+\z/o.match?(regex_str)
end

def literal_at_end_with_backslash_z?(regex_str)
# is this regexp 'literal' in the sense of only matching literal
# chars, rather than using metachars like . and * and so on?
# also, is it anchored at the end of the string?
/\A(?:#{Util::LITERAL_REGEX})+\\z\z/.match?(regex_str)
/\A(?:#{Util::LITERAL_REGEX})+\\z\z/o.match?(regex_str)
end

def literal_at_end_with_dollar?(regex_str)
# is this regexp 'literal' in the sense of only matching literal
# chars, rather than using metachars like . and * and so on?
# also, is it anchored at the end of the string?
/\A(?:#{Util::LITERAL_REGEX})+\$\z/.match?(regex_str)
/\A(?:#{Util::LITERAL_REGEX})+\$\z/o.match?(regex_str)
end

def drop_start_metacharacter(regexp_string)
Expand Down
68 changes: 68 additions & 0 deletions lib/rubocop/cop/performance/constant_regexp.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
# frozen_string_literal: true

module RuboCop
module Cop
module Performance
# This cop finds regular expressions with dynamic components that are all constants.
#
# Ruby allocates a new Regexp object every time it executes a code containing such
# a regular expression. It is more efficient to extract it into a constant
# or add an `/o` option to perform `#{}` interpolation only once and reuse that
# Regexp object.
#
# @example
#
# # bad
# def tokens(pattern)
# pattern.scan(TOKEN).reject { |token| token.match?(/\A#{SEPARATORS}\Z/) }
# end
#
# # good
# ALL_SEPARATORS = /\A#{SEPARATORS}\Z/
# def tokens(pattern)
# pattern.scan(TOKEN).reject { |token| token.match?(ALL_SEPARATORS) }
# end
#
# # good
# def tokens(pattern)
# pattern.scan(TOKEN).reject { |token| token.match?(/\A#{SEPARATORS}\Z/o) }
# end
#
class ConstantRegexp < Base
extend AutoCorrector

MSG = 'Extract this regexp into a constant or append an `/o` option to its options.'

def on_regexp(node)
return if within_const_assignment?(node) ||
!include_interpolated_const?(node) ||
node.single_interpolation?

add_offense(node) do |corrector|
corrector.insert_after(node, 'o')
end
end

private

def within_const_assignment?(node)
node.each_ancestor(:casgn).any?
end

def_node_matcher :regexp_escape?, <<~PATTERN
(send
(const nil? :Regexp) :escape const_type?)
PATTERN

def include_interpolated_const?(node)
return false unless node.interpolation?

node.each_child_node(:begin).all? do |begin_node|
inner_node = begin_node.children.first
inner_node && (inner_node.const_type? || regexp_escape?(inner_node))
end
end
end
end
end
end
2 changes: 1 addition & 1 deletion lib/rubocop/cop/performance/squeeze.rb
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ def on_send(node)
private

def repeating_literal?(regex_str)
regex_str.match?(/\A(?:#{Util::LITERAL_REGEX})\+\z/)
regex_str.match?(/\A(?:#{Util::LITERAL_REGEX})\+\z/o)
end
end
end
Expand Down
2 changes: 1 addition & 1 deletion lib/rubocop/cop/performance/string_include.rb
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@ def on_send(node)
private

def literal?(regex_str)
regex_str.match?(/\A#{Util::LITERAL_REGEX}+\z/)
regex_str.match?(/\A#{Util::LITERAL_REGEX}+\z/o)
end
end
end
Expand Down
1 change: 1 addition & 0 deletions lib/rubocop/cop/performance_cops.rb
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
require_relative 'performance/casecmp'
require_relative 'performance/collection_literal_in_loop'
require_relative 'performance/compare_with_block'
require_relative 'performance/constant_regexp'
require_relative 'performance/count'
require_relative 'performance/delete_prefix'
require_relative 'performance/delete_suffix'
Expand Down
63 changes: 63 additions & 0 deletions spec/rubocop/cop/performance/constant_regexp_spec.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
# frozen_string_literal: true

RSpec.describe RuboCop::Cop::Performance::ConstantRegexp do
subject(:cop) { described_class.new }

it 'registers an offense and corrects when regexp contains interpolated constant' do
expect_offense(<<~RUBY)
str.match?(/\A\#{CONST}/)
^^^^^^^^^^^ Extract this regexp into a constant or append an `/o` option to its options.
RUBY

expect_correction(<<~RUBY)
str.match?(/\A\#{CONST}/o)
RUBY
end

it 'registers an offense and corrects when regexp contains multiple interpolated constants' do
expect_offense(<<~RUBY)
str.match?(/\A\#{CONST1}something\#{CONST2}\z/)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Extract this regexp into a constant or append an `/o` option to its options.
RUBY

expect_correction(<<~RUBY)
str.match?(/\A\#{CONST1}something\#{CONST2}\z/o)
RUBY
end

it 'registers an offense and corrects when regexp contains `Regexp.escape` on constant' do
expect_offense(<<~RUBY)
str.match?(/\A\#{CONST1}something\#{Regexp.escape(CONST2)}\z/)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Extract this regexp into a constant or append an `/o` option to its options.
RUBY

expect_correction(<<~RUBY)
str.match?(/\A\#{CONST1}something\#{Regexp.escape(CONST2)}\z/o)
RUBY
end

it 'does not register an offense when regexp does not contain interpolated constant' do
expect_no_offenses(<<~RUBY)
str.match?(/foo/)
RUBY
end

it 'does not register an offense when regexp is within assignment to a constant' do
expect_no_offenses(<<~RUBY)
CONST = str.match?(/\#{ANOTHER_CONST}/)
RUBY
end

it 'does not register an offense when regexp has `/o` option' do
expect_no_offenses(<<~RUBY)
str.match?(/\#{CONST}/o)
RUBY
end

it 'does not register an offense when regexp contains interpolated constant and '\
'and interpolated non constant' do
expect_no_offenses(<<~RUBY)
str.match?(/\#{CONST}\#{do_something(1)}/)
RUBY
end
end

0 comments on commit 05bddff

Please sign in to comment.