An Illusion - 2 months ago
Scala Question

Scala: What is the easiest way to get all leaf nodes and their paths in an XML document?

I am currently implementing a DFS traversal of an XML document that visits each leaf node and generates the path to it.

Given XML:

<vehicles>
<vehicle>
gg
</vehicle>
<variable>
</variable>
</vehicles>


Output (something like):

Map("gg" -> "vehicles/vehicle", "" -> "vehicles/variable")


It would be great if there were a library that does this, so I don't have to maintain the code.


Thanks. Any help is appreciated.

Answer

Here is a solution using the standard Scala XML library (scala.xml). It prints a map from each leaf's path to its node text.

import scala.xml._
val x = <div class="content"><a></a><p><q>hello</q></p><r><p>world</p></r><s></s></div>
var map = Map[String, String]()
// Depth-first walk; brc holds the labels of the ancestors, joined by ">".
def dfs(n: Seq[Node], brc: String): Unit =
  n.foreach { node =>
    if (node.child.isEmpty) {
      // Leaf: either an empty element or a text node (whose label is "#PCDATA").
      if (node.text == "") map = map + (brc + node.label -> "")
      else map = map + (brc + node.label + " " -> node.text)
    } else {
      // Inner element: extend the path and recurse into its children.
      dfs(node.child, brc + node.label + ">")
    }
  }

dfs(x, "")
print(map)
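
If you want the paths in the form the question shows (segments joined with "/"), here is a minimal sketch that avoids the mutable map. The name leafPaths is a hypothetical helper, and the sketch assumes a leaf is an element with no child elements. It returns a list of (path, text) pairs rather than a Map, because sibling elements with the same label share a path and would overwrite each other as Map keys.

```scala
import scala.xml._

// Hypothetical helper (not part of scala.xml): collects (path, text) pairs
// for every leaf element, i.e. every element without child elements.
def leafPaths(node: Node, parent: String = ""): List[(String, String)] = {
  val path = if (parent.isEmpty) node.label else parent + "/" + node.label
  // Keep only element children; text and comment nodes are not path segments.
  val elems = node.child.collect { case e: Elem => e }.toList
  if (elems.isEmpty) List(path -> node.text.trim)
  else elems.flatMap(e => leafPaths(e, path))
}

val doc = XML.loadString(
  "<vehicles><vehicle> gg </vehicle><variable></variable></vehicles>")
val leaves = leafPaths(doc)
// leaves == List(("vehicles/vehicle", "gg"), ("vehicles/variable", ""))
```

If the leaf texts are known to be unique, leaves.map(_.swap).toMap gives the text -> path map from the question.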