Arif Arif - 8 days ago 4
Ruby Question

Check if string has sequential characters in Ruby on Rails

I'm trying to validate a string to find out whether it contains any runs of 3 or more sequential characters.

Example:

"11abcd$4567" => ['abcd', '4567']


I tried to do this with a regular expression, but listing every case gets very long:

(?!abc|bcd|cde|.....)


Is there an easy way to check for sequential characters, either with a regex or in plain Ruby?

Answer

A regexp is not appropriate here. Regexps are not flexible enough to express the general case: Unicode is vast, and a regexp that matches any ascending sequence of characters would have to list each of the tens or hundreds of thousands of cases. It could be built programmatically, but that would take time and would be pretty costly memory-wise.
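
To make the "programmatically" part concrete, here is a rough sketch of what building such a regexp could look like for a small, fixed alphabet (a-z and 0-9 only). The method name streak_regexp and its parameters are made up for illustration; this is not a recommendation, just a way to see where the cost comes from.

```ruby
# Hypothetical helper: builds a regexp that matches ascending runs,
# but only within the given ranges (an assumption, not general Unicode).
def streak_regexp(ranges = [('a'..'z'), ('0'..'9')], min_length = 3)
  # One alternative per adjacent pair: "a, if the next char is b", and so on.
  steps = ranges.flat_map do |range|
    range.each_cons(2).map do |a, b|
      "#{Regexp.escape(a)}(?=#{Regexp.escape(b)})"
    end
  end
  # (min_length - 1) or more such steps, then the final character of the run.
  Regexp.new("(?:#{steps.join('|')}){#{min_length - 1},}.")
end

p '11abcd$4567'.scan(streak_regexp)   # => ["abcd", "4567"]
```

Even this toy version needs 34 alternatives. Covering more scripts means one alternative per adjacent pair of characters, which is where the pattern gets out of hand; the plain Ruby method below needs no such table.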

def find_streaks(string, min_length=3)
  string                                 # "xabcy"
    .each_char                           # ['x', 'a', 'b', 'c', 'y']
    .chunk_while { |a, b| a.succ == b }  # [['x'], ['a', 'b', 'c'], ['y']]
    .select { |c| c.size >= min_length } # [['a', 'b', 'c']]
    .map(&:join)                         # ['abc']
end
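
For reference, a quick usage sketch with the string from the question. The method is repeated so the snippet runs on its own (Ruby 2.3+ assumed, since it relies on chunk_while); the extra inputs are just made-up test cases.

```ruby
# Same method as above, repeated so this snippet is self-contained.
def find_streaks(string, min_length = 3)
  string
    .each_char
    .chunk_while { |a, b| a.succ == b }
    .select { |c| c.size >= min_length }
    .map(&:join)
end

p find_streaks('11abcd$4567')   # => ["abcd", "4567"]
p find_streaks('xabcy')         # => ["abc"]
p find_streaks('ab12')          # => []  (both runs are shorter than 3)
p find_streaks('xyzab')         # => ["xyz"]  ("z".succ is "aa", so the run stops at "z")
p find_streaks('abcd', 5)       # => []  (min_length raised to 5)
```

Note that "sequential" here means whatever String#succ says it is, so a run does not wrap around from "z" to "a" or from "9" to "0".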

EDIT: chunk_while only exists on Ruby 2.3+. For older versions, I guess this might work as a polyfill... Give it a try?

                                         # skip this thing on Ruby 2.3+, unneeded
unless Enumerable.instance_methods.include?(:chunk_while)
  module Enumerable
    def chunk_while                      # let's polyfill!
      streak = nil                       # twofold purpose: init `streak` outside
                                         # the block, and `nil` as flag to spot
                                         # the first element.

      Enumerator.new do |y|              # `chunk_while` returns an `Enumerator`.
        each do |element|                # go through all the elements.
          if streak                      # except on first element:
            if yield streak[-1], element # give the previous element and current
                                         # one to the comparator block.
                                         # `streak` will always have an element. 
              streak << element          # if the two elements are "similar",
                                         # add this one to the streak;
            else                         # otherwise
              y.yield streak             # output the current streak and
              streak = [element]         # start a new one with the current element.
            end
          else                           # for the first element, nothing to compare
            streak = [element]           # so just start the streak.
          end
        end
        y.yield streak if streak         # output the last streak;
                                         # but if `streak` is `nil`, there were
                                         # no elements, so no output.
      end
    end
  end
end

EDIT: Tested, works. Simplified a bit by removing an unnecessary temporary variable.

EDIT: Added comments[1]. Responded to comments[2].

EDIT: Well, derp. Here I go writing all this by hand... when it could have been as easy as this:

unless Enumerable.instance_methods.include?(:chunk_while)
  module Enumerable
    def chunk_while
      slice_when { |a, b| !yield(a, b) }
    end
  end
end

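Yup, chunk_while is just slice_when with the condition negated. I could even have substituted it in the original code, as .slice_when { |a, b| a.succ != b }. Keep in mind that slice_when itself only arrived in Ruby 2.2, so this shortcut does not help on anything older. Sometimes I'm slow.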
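
Spelled out, the slice_when substitution would look roughly like this. The name find_streaks_sliced is only there to keep it apart from the original method, and Ruby 2.2+ is assumed.

```ruby
# Variant of find_streaks that cuts between elements instead of
# grouping them; find_streaks_sliced is a made-up name for illustration.
def find_streaks_sliced(string, min_length = 3)
  string
    .each_char
    .slice_when { |a, b| a.succ != b }   # cut wherever the run breaks
    .select { |c| c.size >= min_length }
    .map(&:join)
end

p find_streaks_sliced('11abcd$4567')   # => ["abcd", "4567"]
```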