Andriy Andriy - 3 years ago 179
Ruby Question

Parse index.html with Nokogiri and pair each a link with the text that follows it

Please help me figure out how to pair each build name with its date and then sort all the links in ascending order by upload date.

An example index.html looks like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<head><title>Index of localhost/BUILD</title>
</head>
<body>
<h1>Index of localhost/BUILD</h1>
<pre>Name Last modified Size</pre><hr/>
<pre><a href="../">../</a>
<a href="BUILD.10.tar">BUILD.10.tar</a> 27-Sep-2017 15:46 250 bytes
<a href="BUILD.13.tar">BUILD.13.tar</a> 28-Sep-2017 12:14 254 bytes
<a href="BUILD.15.tar">BUILD.15.tar</a> 29-Sep-2017 08:56 257 bytes
<a href="BUILD.16.tar">BUILD.16.tar</a> 29-Sep-2017 08:56 258 bytes
<a href="BUILD.17.tar">BUILD.17.tar</a> 29-Sep-2017 08:56 256 bytes
<a href="BUILD.9.tar">BUILD.9.tar</a> 27-Sep-2017 15:44 247 bytes
</pre>
<hr/><address style="font-size:small;">Artifactory/5.2.1 Server</address></body></html>


Currently my script looks like this:

require 'open-uri'
require 'nokogiri'

build_url = "/home/index.html"
index_html = open(build_url).read
index_dom = Nokogiri::HTML.parse index_html

builds = []
index_dom.css('a').each do |link|
  build = link.text
  builds.push(build) if build.end_with?(".tar")
end
rc_builds = builds.sort
p rc_builds


This needs to be changed to get both the build name and Last modified, and to output the rc_builds array sorted in ascending order by Last modified.

No changes can be made to index.html, so the solution should be based on the example page above.

The problem is that I cannot figure out how to access the Last modified text.

Answer Source

That's how I would do it:

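# index_html is the page source loaded in the question's script
dom = Nokogiri::HTML.parse index_html

# The second <pre> holds the file listing; the first is the column header
build_info = dom.css('pre')[1].text

result = []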

build_info.split("\n").each do |line|
  next unless line =~ /BUILD/
  arr = line.split(/\s+/)
  result.push({
    build: arr[0],
    modified: "#{arr[1]} #{arr[2]}",
    size: "#{arr[3]}",
    size_unit: "#{arr[4]}"
  })
end


p result

#[{:build=>"BUILD.10.tar", :modified=>"27-Sep-2017 15:46", :size=>"250", :size_unit=>"bytes"}, {:build=>"BUILD.13.tar", :modified=>"28-Sep-2017 12:14", :size=>"254", :size_unit=>"bytes"}, {:build=>"BUILD.15.tar", :modified=>"29-Sep-2017 08:56", :size=>"257", :size_unit=>"bytes"}, {:build=>"BUILD.16.tar", :modified=>"29-Sep-2017 08:56", :size=>"258", :size_unit=>"bytes"}, {:build=>"BUILD.17.tar", :modified=>"29-Sep-2017 08:56", :size=>"256", :size_unit=>"bytes"}, {:build=>"BUILD.9.tar", :modified=>"27-Sep-2017 15:44", :size=>"247", :size_unit=>"bytes"}]
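
The snippet above collects the rows but leaves them in listing order, so the sort the question asks for is still missing. Here is a minimal sketch of that step, assuming result holds hashes shaped like the output above and that Last modified always follows the "27-Sep-2017 15:46" pattern. The sample rows are copied from the listing so the block runs on its own. Breaking ties by build number is my own choice, not something the question specifies.

```ruby
require 'time'

# Sample rows in the shape produced above, kept in the listing's original order.
result = [
  { build: 'BUILD.10.tar', modified: '27-Sep-2017 15:46' },
  { build: 'BUILD.13.tar', modified: '28-Sep-2017 12:14' },
  { build: 'BUILD.15.tar', modified: '29-Sep-2017 08:56' },
  { build: 'BUILD.16.tar', modified: '29-Sep-2017 08:56' },
  { build: 'BUILD.17.tar', modified: '29-Sep-2017 08:56' },
  { build: 'BUILD.9.tar',  modified: '27-Sep-2017 15:44' }
]

# Parse the timestamp into a Time so the comparison is chronological rather
# than alphabetical. Rows sharing a timestamp fall back to the build number
# (an assumption; any stable secondary key would do).
rc_builds = result.sort_by do |row|
  [Time.strptime(row[:modified], '%d-%b-%Y %H:%M'), row[:build][/\d+/].to_i]
end

p rc_builds.map { |row| row[:build] }
```

Comparing the raw strings would not work here, because "27-Sep-2017" and "28-Sep-2017" sort by day of month before month and year.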