Rafael Viana Rafael Viana - 18 days ago 5x
Ruby Question

ruby - possible inefficient algorithm

This was my solution to a function that should return the first pair of two prime numbers spaced with a gap of g between the limits m, n if these numbers exist otherwise nil.

This is a kata from codewars.com, it passed the preliminary test. But, When I submit it I receive an error message saying that due to an inefficiency in my algorithm it takes too much time(8000ms+).

Can someone point out to me what exactly is slowing down the code, and how it should be optimized?

require 'prime'

def gap(g, m, n)
range = (m..n).to_a
primes = []
range.each { |num| primes.push(num) if Prime.prime?(num)}

primes.each_index do |count|
prv , nxt = primes[count], primes[count+1]
if !nxt.is_a? Integer
return nil

if nxt - prv == g
return [prv, nxt]



Try this:

require 'prime'

def gap(g, m, n)
  primes = Prime.each(n).select { |p| p >= m }
  primes[0..-2].zip(primes[1..-1]).find { |a, b| b - a == g } 

gap(2, 1, 1000)
#=> [3, 5]

gap(6, 1, 1000)
#=> [23, 29]

Prime.each(n).select { |p| p >= m } returns you the list of all primes between m and n. This has a better performance than building an array with all numbers between m and n and the checking each number in this array if it is a prime. It is also worth noting that Prime.each uses the eratosthenes' sieve algorithm as a default.

primes[0..-2].zip(primes[1..-1]) builds an array of each pair. This is not the most efficient way to iterate over each pair in the primes array, but I think it read better than dealing with indexes.

This might be another option:

require 'prime'
require 'set'

def gap(g, m, n)
  primes = Set.new(Prime.each(n).select { |p| p>= m })
  primes.each { |prime| return [prime, prime + g] if primes.include?(prime + g) }

the second version doesn't build an new array with all pairs, but instead just checks for each prime if prime + g is included in the primes array too. I use a Set, because it improves include? lookups to O(1) (whereas include? on arrays would be O(n).

I am not sure which version will be faster and it might be worth it to run some benchmarks. The first versions needs more memory, but does less calculations. The performance probably depends on the size of the range.