peep peep - 6 months ago 174
Ruby Question

logstash filter: get all array elements as new event

I'm trying to get the array elements produced by the xml filter, as follows:

filter {
    xml {
        source => "message"
        target => "xmldata"
        store_xml => "false"
        xpath => ["/OMA/ESMLog/LogEntry/Index/text()","index"]
        xpath => ["/OMA/ESMLog/LogEntry/Status/text()","status"]
        xpath => ["/OMA/ESMLog/LogEntry/TimeStampRaw/text()","timestampraw"]
        xpath => ["/OMA/ESMLog/LogEntry/Description/text()","description"]
    }
    mutate {
        remove_field => [ "message", "inxml", "xmldata" ]
    }

    mutate {
        replace => {
            "index" => "%{[index][0]}"
            "status" => "%{[status][0]}"
            "timestampraw" => "%{[timestampraw][0]}"
            "description" => "%{[description][0]}"
        }
    }
    date {
        match => [ "timestampraw", "UNIX" ]
    }
}


As you can see, I'm able to get the first element of each array, but how can I get all of the elements, each as a new event?
In other words, I want every 'LogEntry' element in the XML to become its own event.
Here is some example XML (raw output from OMSA):

<?xml version="1.0" encoding="UTF-8"?>
<OMA>
<ESMLog>
<LogEntry>
<Index>0</Index>
<Status>2</Status>
<TimeStamp>Tue Nov 3 07:22:57 2015</TimeStamp>
<TimeStampRaw>1446535377</TimeStampRaw>
<Description>The system board Mem2 temperature is within range.</Description>
</LogEntry>
<LogEntry>
<Index>1</Index>
<Status>3</Status>
<TimeStamp>System Boot</TimeStamp>
<TimeStampRaw>1446535378</TimeStampRaw>
<Description>The system board Mem2 temperature is less than the lower warning threshold.</Description>
</LogEntry>
<LogEntry>
<Index>2</Index>
<Status>2</Status>
<TimeStamp>Mon Nov 2 14:17:09 2015</TimeStamp>
<TimeStampRaw>1446473829</TimeStampRaw>
<Description>Drive 0 is installed in disk drive bay 1.</Description>
</LogEntry>
<LogEntry>
<Index>3</Index>
<Status>4</Status>
<TimeStamp>Mon Nov 2 14:17:04 2015</TimeStamp>
<TimeStampRaw>1446473824</TimeStampRaw>
<Description>Drive 0 is removed from disk drive bay 1.</Description>
</LogEntry>
<LogEntry>
<Index>4</Index>
<Status>2</Status>
<TimeStamp>Mon Nov 2 14:15:54 2015</TimeStamp>
<TimeStampRaw>1446473754</TimeStampRaw>
<Description>Drive 0 is installed in disk drive bay 1.</Description>
</LogEntry>
<LogEntry>
<Index>5</Index>
<Status>4</Status>
<TimeStamp>Mon Nov 2 13:58:54 2015</TimeStamp>
<TimeStampRaw>1446472734</TimeStampRaw>
<Description>Drive 0 is removed from disk drive bay 1.</Description>
</LogEntry>
<LogEntry>
<Index>6</Index>
<Status>2</Status>
<TimeStamp>Fri Feb 5 11:07:27 2010</TimeStamp>
<TimeStampRaw>1265368047</TimeStampRaw>
<Description>Drive 0 is installed in disk drive bay 1.</Description>
</LogEntry>
<LogEntry>
<Index>7</Index>
<Status>2</Status>
<TimeStamp>Fri Feb 5 11:07:08 2010</TimeStamp>
<TimeStampRaw>1265368028</TimeStampRaw>
<Description>Drive 0 in disk drive bay 1 is operating normally.</Description>
</LogEntry>
<LogEntry>
<Index>8</Index>
<Status>4</Status>
<TimeStamp>Fri Feb 5 11:07:07 2010</TimeStamp>
<TimeStampRaw>1265368027</TimeStampRaw>
<Description>Drive 0 is removed from disk drive bay 1.</Description>
</LogEntry>
<LogEntry>
<Index>9</Index>
<Status>4</Status>
<TimeStamp>Fri Jan 29 09:33:27 2010</TimeStamp>
<TimeStampRaw>1264757607</TimeStampRaw>
<Description>Fault detected on drive 0 in disk drive bay 1.</Description>
</LogEntry>
<LogEntry>
<Index>10</Index>
<Status>2</Status>
<TimeStamp>Mon Feb 25 16:14:15 2008</TimeStamp>
<TimeStampRaw>1203956055</TimeStampRaw>
<Description>Log cleared.</Description>
</LogEntry>
<NumRecords>11</NumRecords>
</ESMLog>
<ObjStatus>2</ObjStatus>
<SMStatus>0</SMStatus>
</OMA>


Here is the solution I came up with, based on Jettro's example:

filter {
    xml {
        source => "message"
        target => "xmldata"
        store_xml => "false"
        xpath => ["/OMA/ESMLog//LogEntry","logentry"]
    }

    mutate {
        remove_field => [ "message", "inxml", "xmldata" ]
    }

    split {
        field => "[logentry]"
    }

    xml {
        source => "logentry"
        store_xml => "false"
        xpath => ["/LogEntry/Index/text()","index"]
        xpath => ["/LogEntry/Status/text()","status"]
        xpath => ["/LogEntry/TimeStampRaw/text()","timestampraw"]
        xpath => ["/LogEntry/Description/text()","description"]
    }
    mutate {
        replace => {
            "index" => "%{[index][0]}"
            "status" => "%{[status][0]}"
            "timestampraw" => "%{[timestampraw][0]}"
            "description" => "%{[description][0]}"
        }
    }
    date {
        match => [ "timestampraw", "UNIX" ]
    }
    mutate {
        remove_field => [ "logentry" , "timestampraw" ]
    }
}
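
As a sanity check on the last two steps, here is a small plain-Ruby sketch of what taking the first array element and reading it as a UNIX epoch value amounts to. It only illustrates the idea behind the mutate and date filters and is not how Logstash implements them; the variable names are made up for the example.

```ruby
# Plain-Ruby illustration only; not Logstash's implementation.
# xpath results arrive as arrays of strings, even for a single match.
timestampraw = ["1446535377"]

# Equivalent in spirit to "%{[timestampraw][0]}": keep the first element.
first = timestampraw[0]

# Equivalent in spirit to match => [ "timestampraw", "UNIX" ]:
# interpret the value as seconds since 1970-01-01 00:00:00 UTC.
parsed = Time.at(first.to_i).utc

formatted = parsed.strftime("%Y-%m-%dT%H:%M:%SZ")
puts formatted   # 2015-11-03T07:22:57Z
```

The result lines up with the TimeStamp of the first LogEntry in the sample XML (Tue Nov 3 07:22:57 2015).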


It seems that after the split, Logstash starts a "loop" and runs the remaining filters on each array element separately.
Thanks.

Answer

Since your example is a bit verbose to try out, I made a simpler XML document, but you should be able to take what you need from it. The trick is to use the split filter. Below are the config I used and the resulting output.

# <result><logline><description>item 1</description></logline><logline><description>item 2</description></logline></result>
input {
    stdin{}
}
filter {
    xml {
        source => "message"
        store_xml => "false"
        xpath => ["/result/logline","loglines"]
        remove_field => [ "message", "host" ] 
    }
    split {
        field => "loglines"
    }
    xml {
        source => "loglines"
        store_xml => "false"
        xpath => ["/logline/description/text()","description"]
        remove_field => [ "loglines" ] 
    }    
}
output {
    stdout{ codec => rubydebug }
}

And the output then becomes:

{
       "@version" => "1",
     "@timestamp" => "2016-06-07T09:40:35.420Z",
           "host" => "Jettros-MBP.fritz.box",
    "description" => [
        [0] "item 1"
    ]
}
{
       "@version" => "1",
     "@timestamp" => "2016-06-07T09:40:35.420Z",
           "host" => "Jettros-MBP.fritz.box",
    "description" => [
        [0] "item 2"
    ]
}

As you can see, there are now two events.
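
To make the effect of split more concrete, here is a rough plain-Ruby model of it, assuming an event can be treated as a simple hash. The helper `split_event` and the host value are hypothetical and exist only for this sketch; the real filter does more (for example, it can also split strings on a terminator), so treat this as an approximation of the behaviour rather than its implementation.

```ruby
# Rough model of the split filter for array fields only.
# `split_event` is a hypothetical helper, not part of Logstash.
def split_event(event, field)
  values = event[field]
  # This sketch handles arrays only; anything else is passed through unchanged.
  return [event] unless values.is_a?(Array)

  values.map do |value|
    copy = event.dup      # shallow copy: all other fields are carried over
    copy[field] = value   # the array is replaced by one of its elements
    copy
  end
end

event = {
  "host" => "example-host",
  "loglines" => [
    "<logline><description>item 1</description></logline>",
    "<logline><description>item 2</description></logline>"
  ]
}

events = split_event(event, "loglines")
events.each { |e| puts e.inspect }
```

Each resulting hash carries exactly one logline, which is why the second xml filter in the config above can then extract a single description per event.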