DaTebe - 4 months ago

Access nested JSON Field in Logstash

I have a problem accessing a nested JSON field in Logstash (latest version).

My config file is the following:

input {
  http {
    port => 5001
    codec => "json"
  }
}

filter {
  mutate {
    add_field => {"es_index" => "%{[statements][authority][name]}"}
  }
  mutate {
    gsub => [
      "es_index", " ", "_"
    ]
  }
  mutate {
    lowercase => ["es_index"]
  }
  ruby {
    init => "
      def remove_dots hash
        new = Hash.new
        hash.each { |k,v|
          if v.is_a? Hash
            v = remove_dots(v)
          end
          new[ k.gsub('.','_') ] = v
          if v.is_a? Array
            v.each { |elem|
              if elem.is_a? Hash
                elem = remove_dots(elem)
              end
              new[ k.gsub('.','_') ] = elem
            } unless v.nil?
          end
        } unless hash.nil?
        return new
      end
    "
    code => "
      event.instance_variable_set(:@data,remove_dots(event.to_hash))
    "
  }
}

output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "golab-%{+YYYY.MM.dd}"
  }
}
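A side note on the ruby filter: as written, remove_dots assigns every element of an array to the same key in turn, so an array value ends up replaced by its last element. Below is a minimal standalone sketch of a variant that keeps arrays intact. It works on plain Ruby hashes rather than on a Logstash event, and the name remove_dots_deep and the sample data are hypothetical.

```ruby
# Hypothetical helper, not part of Logstash: returns a copy of the structure
# with every '.' in a hash key replaced by '_', recursing into nested hashes
# and into the elements of arrays.
def remove_dots_deep(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, val), out|
      out[key.to_s.gsub('.', '_')] = remove_dots_deep(val)
    end
  when Array
    # Keep the array itself; only its elements are cleaned.
    value.map { |elem| remove_dots_deep(elem) }
  else
    value
  end
end

# Made-up sample data with dotted keys at several levels.
sample = { 'a.b' => [{ 'c.d' => 1 }, { 'e' => 2 }], 'f' => { 'g.h' => 'x' } }
cleaned = remove_dots_deep(sample)
```

Whether this could be dropped into the init block unchanged depends on the Logstash version, so treat it as an illustration of the recursion only.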


My filter uses mutate to add a field that I want to use as part of the index name. When I use "%{[statements][authority][name]}", the reference is not resolved: the literal text %{[statements][authority][name]} is saved in the es_index field. Logstash seems to treat it as a plain string, but why?

I've also tried the expression "%{statements}". It works as expected: everything in the statements field is passed to es_index. If I use "%{[statements][authority]}", strange things happen: es_index is filled with exactly the same output that "%{statements}" produces. What am I missing?

Logstash output with "%{[statements][authority]}":

{
    "statements" => {
        "verb" => {
            "id" => "http://adlnet.gov/expapi/verbs/answered",
            "display" => {
                "en-US" => "answered"
            }
        },
        "version" => "1.0.1",
        "timestamp" => "2016-07-21T07:41:18.013880+00:00",
        "object" => {
            "definition" => {
                "name" => {
                    "en-US" => "Example Activity"
                },
                "description" => {
                    "en-US" => "Example activity description"
                }
            },
            "id" => "http://adlnet.gov/expapi/activities/example"
        },
        "actor" => {
            "account" => {
                "homePage" => "http://example.com",
                "name" => "xapiguy"
            },
            "objectType" => "Agent"
        },
        "stored" => "2016-07-21T07:41:18.013880+00:00",
        "authority" => {
            "mbox" => "mailto:info@golab.eu",
            "name" => "GoLab",
            "objectType" => "Agent"
        },
        "id" => "0771b9bc-b1b8-4cb7-898e-93e8e5a9c550"
    },
    "id" => "a7e31874-780e-438a-874c-964373d219af",
    "@version" => "1",
    "@timestamp" => "2016-07-21T07:41:19.061Z",
    "host" => "172.23.0.3",
    "headers" => {
        "request_method" => "POST",
        "request_path" => "/",
        "request_uri" => "/",
        "http_version" => "HTTP/1.1",
        "http_host" => "logstasher:5001",
        "content_length" => "709",
        "http_accept_encoding" => "gzip, deflate",
        "http_accept" => "*/*",
        "http_user_agent" => "python-requests/2.9.1",
        "http_connection" => "close",
        "content_type" => "application/json"
    },
    "es_index" => "{\"verb\":{\"id\":\"http://adlnet.gov/expapi/verbs/answered\",\"display\":{\"en-us\":\"answered\"}},\"version\":\"1.0.1\",\"timestamp\":\"2016-07-21t07:41:18.013880+00:00\",\"object\":{\"definition\":{\"name\":{\"en-us\":\"example_activity\"},\"description\":{\"en-us\":\"example_activity_description\"}},\"id\":\"http://adlnet.gov/expapi/activities/example\",\"objecttype\":\"activity\"},\"actor\":{\"account\":{\"homepage\":\"http://example.com\",\"name\":\"xapiguy\"},\"objecttype\":\"agent\"},\"stored\":\"2016-07-21t07:41:18.013880+00:00\",\"authority\":{\"mbox\":\"mailto:info@golab.eu\",\"name\":\"golab\",\"objecttype\":\"agent\"},\"id\":\"0771b9bc-b1b8-4cb7-898e-93e8e5a9c550\"}"
}


You can see that authority is only one part of es_index, so it was not selected as a field on its own.

Many thanks in advance

Answer

I found a solution. Credit goes to jpcarey (Elasticsearch forum).

I had to remove codec => "json". That leads to a different data structure: statements is now an array and not an object. So I needed to change %{[statements][authority][name]} to %{[statements][0][authority][name]}. That works without problems.
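To illustrate why the extra [0] matters, here is a small plain-Ruby sketch of the same lookup against both shapes. The two hashes are made-up stand-ins for the event, cut down to the fields involved. Hash#dig is standard Ruby and only an analogy for Logstash's field references, not how Logstash resolves them internally.

```ruby
# Assumed shapes: the same payload with "statements" as an object,
# and with "statements" as a one-element array.
as_object = { 'statements' => { 'authority' => { 'name' => 'GoLab' } } }
as_array  = { 'statements' => [{ 'authority' => { 'name' => 'GoLab' } }] }

# Object shape: the three-part path is enough.
name_from_object = as_object.dig('statements', 'authority', 'name')

# Array shape: the element index has to be part of the path.
name_from_array = as_array.dig('statements', 0, 'authority', 'name')

# Leaving the index out cannot work on the array shape. In plain Ruby
# this raises a TypeError, because an Array cannot be indexed by a String.
missing_index_error =
  begin
    as_array.dig('statements', 'authority', 'name')
    nil
  rescue TypeError => e
    e.class
  end
```

Ruby fails loudly here. Logstash, as described in the question, instead left the %{...} reference unresolved in the field.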

If you follow the given link you'll find a better implementation of my mutate filters.
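The linked implementation is not reproduced here. For reference, the gsub and lowercase steps from the question amount to the string transformation sketched below in plain Ruby. The helper name index_suffix and the second sample value are hypothetical.

```ruby
# Hypothetical helper mirroring the two mutate steps from the question:
# replace spaces with underscores, then lowercase the result.
def index_suffix(name)
  name.gsub(' ', '_').downcase
end

single_word = index_suffix('GoLab')
with_spaces = index_suffix('Go Lab Portal')
```

Lowercasing matters because Elasticsearch index names must be lowercase.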