Afsane Fadaei - 22 days ago
Ruby Question

Nokogiri : NoMethodError (undefined method `inner_html' for nil:NilClass)

I'm trying to parse some simple XML data with Nokogiri.
This is my XML:

POST /.... HTTP/1.1
Host: ....
Content-Type: text/xml; charset=utf-8
Content-Length: length
SOAPAction: "http://...."

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="...." xmlns:xsd="...." xmlns:soap="....">
<soap:Body>
<WS_QueryOnSec xmlns="......">
<type>string</type>
<ID>string</ID>
</WS_QueryOnSec>
</soap:Body>
</soap:Envelope>


And this is my simple request:

require "nokogiri"
@doc = Nokogiri::XML(request.body.read)
@something = @doc.at('type').inner_html


But Nokogiri cannot find the type or ID node.
When I change the data to this, everything works fine:

<soap:Body>
<type>string</type>
<ID>string</ID>
</soap:Body>


It seems the problem is the raw text above the data and the nodes with xmlns or other attributes.
How would you recommend resolving this?

Answer

The first "XML" isn't XML. It's text that contains XML. Remove the header information down to the blank line and try it again.
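
As a rough illustration of "remove the header information down to the blank line", here is a minimal sketch in plain Ruby. It assumes the raw text really does start with header lines followed by one blank line; the sample request and the variable names (raw, body, xml) are made up for the example.

```ruby
# Hypothetical raw text: header lines, a blank line, then the XML document.
raw = <<~EOT
  POST /example HTTP/1.1
  Content-Type: text/xml; charset=utf-8

  <?xml version="1.0" encoding="utf-8"?>
  <root>
    <node />
  </root>
EOT

# Split once at the first blank line (LF or CRLF line endings).
# The first part is the header block, the second part is the body.
_headers, body = raw.split(/\r?\n\r?\n/, 2)

# If no blank line was found, fall back to the whole text.
xml = body || raw

puts xml
```

Only do this when you know the text begins with headers. Applied to a document that is already pure XML and happens to contain a blank line, the split would cut off the start of the document.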

I think it'd help you to read the XML spec, or some tutorials about creating XML, to understand how it's defined. XML is a tight specification and doesn't allow any deviation. The syntax is pretty flexible, but you have to play by its rules.

Consider these examples:

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
foo

<root>
  <node />
</root>
EOT

doc.errors # => [#<Nokogiri::XML::SyntaxError: Start tag expected, '<' not found>]

Removing the text that is outside the root tag results in a proper parse:

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
<root>
  <node />
</root>
EOT

doc.errors # => []

<root> isn't necessarily the name of the "root" node; the root is just the outermost tag:

doc = Nokogiri::XML(<<EOT)
<foo>
  <node />
</foo>
EOT

doc.errors # => []

and still results in a valid DOM/internal representation of the document:

puts doc.to_html 

# >> <foo>
# >>   <node></node>
# >> </foo>

Your XML sample is using namespaces, which complicate matters somewhat. The Nokogiri documentation talks about how to deal with them, so you'll want to understand that part of parsing XML because you'll encounter it again. Here's the easy way of working with them:

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="utf-8"?>
<Envelope xmlns:xsi="...." xmlns:xsd="...." xmlns:soap="....">
  <Body>
    <WS_QueryOnSec xmlns="......">
      <type>string</type>
      <ID>string</ID>
    </WS_QueryOnSec>
  </Body>
</Envelope>
EOT

namespaces = doc.collect_namespaces

doc.at('type', namespaces).text # => "string"