Marcus Marcus - 2 months ago 10
Java Question

Java: Parsing values from an HTML file filled by JS DataTables

I have a problem parsing a website for its contents.
It's a "data storage" site (network documentation) from which I need the data to generate some configuration.

Anyway: I'm connecting to the page with HttpURLConnection without any problem, reading the site into a String and parsing it with Jsoup.
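For context, the "read the site into a String" step might look roughly like the following. This is only a minimal sketch: the names PageReader, readAll and fetch are made up for illustration, and UTF-8 is assumed as the page encoding, which the question does not state.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not taken from the original post.
class PageReader {
    // Reads the whole stream and decodes it as UTF-8 (assumed encoding).
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    // Opens the address with HttpURLConnection and returns the response body.
    static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        try (InputStream in = conn.getInputStream()) {
            return readAll(in);
        } finally {
            conn.disconnect();
        }
    }
}
```

A String obtained this way holds the markup exactly as the server sent it. No script is executed, so anything the browser adds later through JavaScript will not be in it.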

When I open the page in my browser, I get the following element:

<input type="text" name="b2" value="XXXXX" size="6" onfocus="return sbnrSel()" autocomplete="OFF" onkeyup="searchSuggest(this.id,'b2sel','getSTG?b2='+this.value,1)" onclick="document.getElementById('b2sel').style.display='none'" id="b2" class="muss" />


When I look into my String, I find the following element:

<input type="text" name="b2" size="6" onfocus="return sbnrSel()" autocomplete="OFF" onkeyup="searchSuggest(this.id,'b2sel','getSTG?b2='+this.value,1)" onclick="document.getElementById('b2sel').style.display='none'" id="b2" class="muss" />


I want to parse the value (in this case XXXXX).
My code does not find it because there is no "value" attribute in the element.
Here is how I try to reference it:

doc.getElementsByAttributeValue("name", "b2").first().attr("value")


As far as I understand the page, it gets updated on loading by JavaScript inside the page. But I have no idea how to access this data from my Java code.

This is the beginning of the HTML page with the JS code:

<script src="/js/cuba-ng.js" type="text/javascript"></script>
<script src="/js/fchng.js" type="text/javascript"></script>
<script src="/js/jquery.js" type="text/javascript"></script>
<script src="/js/jquery.dataTables.js" type="text/javascript"></script>
<script type="text/javascript">//<![CDATA[
$(document).ready
( function()
{ $("#chan").dataTable( { "aaSorting": [[ 4, "desc" ]], "bPaginate": false, "bFilter": false } );
}
);
function sbnrSel()
{ if (document.forms[0].b3.value > 39) return;
var bnr = "xxxx";
bnr = document.forms[0].b1.value;
var sbnr = bnr.substr(1,3);
var ba = bnr.charAt(0);
switch (ba)
{ case "1": sba="I";break;
case "2": sba="O";break;
case "3": sba="B";break;
case "4": sba="D";break;
default: sba="Z";break;
}
document.forms[0].b2.value=sba+sbnr;
}

//]]></script>
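To make the logic of sbnrSel() easier to follow from the Java side, here is a rough Java port of it. This is a sketch under assumptions: the class name B2Value and the sample input "1234" are invented, b3 is taken as an already parsed int, and nothing in the post confirms that this function is what fills b2 when the page loads (it is wired to onfocus).

```java
// Hypothetical Java port of the sbnrSel() function shown above.
class B2Value {
    // Mirrors the JS switch: first character of b1 selects a letter prefix.
    static String prefixFor(char ba) {
        switch (ba) {
            case '1': return "I";
            case '2': return "O";
            case '3': return "B";
            case '4': return "D";
            default:  return "Z";
        }
    }

    // Returns the value sbnrSel() would write into b2.
    // If b3 > 39 the JS returns early, so the current b2 is kept.
    static String derive(String b1, int b3, String currentB2) {
        if (b3 > 39) {
            return currentB2;
        }
        if (b1.isEmpty()) {
            return "Z"; // JS: charAt(0) is "", so the default branch applies
        }
        // JS substr(1,3): up to three characters starting at index 1.
        int end = Math.min(4, b1.length());
        String sbnr = b1.substring(1, end);
        return prefixFor(b1.charAt(0)) + sbnr;
    }

    public static void main(String[] args) {
        System.out.println(derive("1234", 0, "")); // prints I234
    }
}
```

This only helps if the b1 and b3 values are present in the fetched HTML, which the post does not show.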


Unfortunately, I am not able to provide the whole code and data because of security restrictions. I hope you understand.

I was able to look into the referenced JS scripts via a Firefox plugin.
It seems the HTML is filled by the JS DataTables libraries.

Looking forward to your support.
And sorry if any crucial information is missing.

Answer

I solved this by using a different way to access the site. Instead of using HttpURLConnection, I found the HtmlUnit WebClient. With this, I am able to run the JavaScript code contained in the response. Here is the code I used:

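// Imports for HtmlUnit 2.x; newer releases use the org.htmlunit package instead.
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.WebRequest;
import java.net.URL;

WebClient webClient = new WebClient();
WebRequest request = new WebRequest(new URL(<enter URL here>));
Page page = webClient.getPage(request);
String text = page.getWebResponse().getContentAsString();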