I have a data dump; the following is one row from it:
{,lat:26.3832456,distance:678.4075116373302,lon:120.4731951,address:tourism:viewpoint,},{,lat:26.3830149,distance:622.2862561842148,lon:120.473753,address:name:xe7,xbe,x85,xe6,xbc,xa2,xe5,x9d,xaa,tourism:viewpoint,},{,lat:26.3833609,distance:363.7364243757184,lon:120.4763708,address:name:xe5,x9c,x8b,xe4,xb9,x8b,xe5,x8c,x97,xe7,x96,x86,tourism:viewpoint,},{,lat:26.3823648,distance:223.60523114628876,lon:120.4821298,address:name:xe5,x90,x8e,xe6,xbe,xb3,natural:bay,},{,lat:26.3788243,distance:470.02293394005875,lon:120.480733,address:name:xe5,x90,x8e,xe6,xbe,xb3,xe5,xb1,xb1,source:GNS,natural:peak,},{,lat:26.3750042,distance:893.4290785528082,lon:120.4808826,address:name:xe8,x93,xae,xe8,x8a,xb1,xe5,x9c,x92,source:GNS,natural:peak,},{,lat:26.3763331,distance:742.92090763674,lon:120.4795115,address:name:xe8,xa5,xbf,xe5,xbc,x95,xe5,xb3,xb6,place:hamlet,source:GNS,},{,lat:26.378645,distance:623.327734488774,lon:120.4839399,address:source:PGS,natural:coastline,},{,lat:26.3801244,distance:418.6308872217763,lon:120.4772875,address:highway:residential,},{,lat:26.3791422,distance:434.6736862343828,lon:120.4792953,address:highway:residential,},{,lat:26.3779802,distance:739.2129423740619,lon:120.4751349,address:highway:unclassified,},{,lat:26.3770924,distance:675.0424314750977,lon:120.4815607,address:highway:residential,},{,lat:26.3760869,distance:798.0261247167285,lon:120.4821517,address:highway:path,},{,lat:26.3766434,distance:737.1372670528466,lon:120.4821003,address:highway:path,},{,lat:26.3813278,distance:384.84440601318613,lon:120.4766175,address:highway:path,},{,lat:26.3755092,distance:833.3985359252805,lon:120.4802778,address:highway:road,},{,lat:26.3785345,distance:496.6253230490143,lon:120.4799081,address:highway:road,}
From each row I want to extract the record with the smallest distance, which for the row above is:
{,lat:26.3823648,distance:223.60523114628876,lon:120.4821298,address:name:xe5,x90,x8e,xe6,xbe,xb3,natural:bay,}
This is my attempt so far at pulling out the distance:
require 'rubygems'
require 'mechanize'
require 'csv'

CSV.open('Output.csv', "wb") do |csv|
  CSV.foreach('Original.csv', :headers=>true) do |row|
    vector = row.split(",")
    dist = vector.match("^.*\/distance:\/(.*)\/")
    csv << dist
  end
end
Not very elegant, but it seems to work:
s.scan(/\{[^{}]*\}/).min_by { |r| r =~ /distance:(.*),/; $1.to_f }
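where s would be your initial data dump as a string. scan splits the initial data into an array of records (anything between a pair of braces that is not itself a brace is considered part of a record). min_by then loops through that array looking for the record with the minimum value returned by the block passed to it; in this case the block is just a regex match that pulls the distance value out of the record. Note that the greedy (.*) captures everything up to the last comma in the record, not just the number; it still works because to_f only parses the leading numeric part of the string and ignores the rest.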
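If relying on to_f to discard the trailing text feels too fragile, here is a slightly more explicit sketch of the same idea. The method name nearest_record and the shortened sample string are my own, made up for illustration (the records are abbreviated from the row in the question), and I am assuming the distance is always a plain non-negative decimal number:

```ruby
# Hypothetical helper (name is illustrative): returns the record with the
# smallest distance, or nil if the dump contains no records.
def nearest_record(dump)
  # Same record split as above: anything between a pair of braces.
  dump.scan(/\{[^{}]*\}/).min_by do |record|
    # Capture only the number that follows "distance:".
    value = record[/distance:(\d+(?:\.\d+)?)/, 1]
    # Records without a distance sort last instead of raising an error.
    value ? value.to_f : Float::INFINITY
  end
end

# Shortened sample built from three records of the row in the question.
sample = "{,lat:26.3833609,distance:363.7364243757184,lon:120.4763708,address:tourism:viewpoint,}," \
         "{,lat:26.3823648,distance:223.60523114628876,lon:120.4821298,address:natural:bay,}," \
         "{,lat:26.3801244,distance:418.6308872217763,lon:120.4772875,address:highway:residential,}"

puts nearest_record(sample)
```

To process a whole file you could call the helper once per line and write each result out, but how the rows are delimited in your real dump is something I can only guess at from the single row shown.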