Oleksandr Gubchenko - 1 month ago
Java Question

HtmlUnit login on GitHub - can't identify form id

I'm trying to write a Java application whose main purpose is to log in to a website and parse some data. I've chosen to use HtmlUnit and jsoup, but I'm stuck at the very beginning. I'm trying to find the form id on the https://github.com/login page so that I can use it in my HtmlUnit code and proceed with the login, but the page source contains the following:

<form accept-charset="UTF-8" action="/session" data-form-nonce="39175dde4169cc3f2ad998cac114a63525a17f3f" method="post">


The form doesn't have an id, so how can HtmlUnit identify it?

A code example would be appreciated.

Thanks.

Answer

There is only one form on the GitHub login page, so identifying it is not really an issue here. If you want to select an element without using getElementById, you can use querySelector("...") with a CSS selector instead:

Example Code

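// HtmlUnit 2.x imports; HtmlUnit 3.x uses the org.htmlunit package prefix instead
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

WebClient webClient = new WebClient(BrowserVersion.CHROME);

String url = "https://github.com/login";

// Run the page's JavaScript, but don't abort on script errors or failing HTTP status codes
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);

HtmlPage page = webClient.getPage(url);
// "form" is a CSS type selector: it matches the first <form> element, no id required
DomElement form = page.querySelector("form");

System.out.println(form.asXml());

webClient.close();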

Output

<form accept-charset="UTF-8" action="/session" data-form-nonce="0cd9f59e177729dbfe5a1b275514fdcc21be8c84" method="post">
  <div style="margin:0;padding:0;display:inline">
    <input name="utf8" type="hidden" value="✓"/>
    <input name="authenticity_token" type="hidden" value="3rrjjZbyJ6n310XnDR9mXCi5pJ6OsA+HvLJ0oem8k/XHj37Sd26GXxG7IQk5tcbDnPQnE7WvIjNgU77428iajw=="/>
  </div>
  <div class="auth-form-header p-0">
    <h1>
      Sign in to GitHub
    </h1>
  </div>
  <div id="js-flash-container">
  </div>
  <div class="auth-form-body mt-3">
    <label for="login_field">

          Username or email address

    </label>
    <input autocapitalize="off" autocorrect="off" autofocus="autofocus" class="form-control input-block" id="login_field" name="login" tabindex="1" type="text"/>
    <label for="password">

          Password 
      <a href="/password_reset" class="label-link">
        Forgot password?
      </a>
    </label>
    <input class="form-control form-control input-block" id="password" name="password" tabindex="2" type="password"/>
    <input class="btn btn-primary btn-block" data-disable-with="Signing in…" name="commit" tabindex="3" type="submit" value="Sign in"/>
  </div>
</form>
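
Logging in

To go one step further and actually submit the form, something along these lines may work. It is only a sketch: the class name GitHubLoginSketch, the login(...) helper and the placeholder credentials are made up for illustration, and the field names login, password and commit are taken from the output above, so they may no longer match the live page. The imports assume HtmlUnit 2.x (in 3.x the packages start with org.htmlunit), and GitHub may ask for additional verification that this does not handle.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlInput;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

// Hypothetical helper class, not part of HtmlUnit
public class GitHubLoginSketch {

    // Attribute selector for the id-less form: matches on action="/session" (value taken from the output above)
    static final String FORM_SELECTOR = "form[action='/session']";

    // Loads the login page, fills in the credentials and returns the page that follows the submit
    static Page login(WebClient webClient, String loginUrl, String user, String password) throws IOException {
        HtmlPage page = webClient.getPage(loginUrl);

        // querySelector returns null when nothing matches
        HtmlForm form = page.querySelector(FORM_SELECTOR);
        if (form == null) {
            throw new IllegalStateException("No form matching " + FORM_SELECTOR);
        }

        // Input names as they appear in the form markup (assumption: unchanged on the live page)
        HtmlInput loginField = form.getInputByName("login");
        HtmlInput passwordField = form.getInputByName("password");
        HtmlInput submitButton = form.getInputByName("commit");

        // type(...) simulates keyboard input into each field
        loginField.type(user);
        passwordField.type(password);

        // Clicking the submit input posts the form, including its hidden fields
        return submitButton.click();
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the WebClient automatically
        try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);

            // Placeholder credentials, replace with real ones
            Page result = login(webClient, "https://github.com/login", "your-username", "your-password");
            System.out.println(result.getUrl());
        }
    }
}
```

Because the hidden authenticity_token input is part of the same form, it is sent along with the credentials when the submit input is clicked, so there should be no need to extract it by hand.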