Sophia Feng - 4 months ago
Java Question

ElasticSearch drops a field in indexing with Java

I recently came across a problem with ElasticSearch indexing in Java. When I write a record into ElasticSearch from a serialized JSON byte array, one of the fields is missing (dropped).

Pretty-printed example of the byte[] content:

{
  "created_at": 1468390585000,
  "name": "Lucy",
  "id": 123,
  "message": "Hi how are you",
  "thread_id": 456,
  "user_id": 789
}


The Java index call:

IndexRequest indexRequest = new IndexRequest(INDEX, TYPE, data.getId().toString())
        .source(content)
        .versionType(VersionType.EXTERNAL)
        .version(data.getCreatedAt().getTime());


After indexing, all the fields are present in the result except name:

GET /my_index/post/123

{
  "_index": "my_index",
  "_type": "post",
  "_id": "123",
  "_version": 1468390585000,
  "found": true,
  "_source": {
    "id": 123,
    "user_id": 456,
    "created_at": 1468390585000,
    "message": "Hi how are you",
    "thread_id": 789
  }
}


name is a field I added recently. It is present in the mapping:

{
  "my_index": {
    "mappings": {
      "post": {
        "properties": {
          "created_at": {
            "type": "long"
          },
          "name": {
            "type": "string"
          },
          "id": {
            "type": "long"
          },
          "message": {
            "type": "string",
            "analyzer": "english_text"
          },
          "thread_id": {
            "type": "long"
          },
          "user_id": {
            "type": "long"
          }
        }
      }
    }
  }
}


The other fields were created together with the post type.

I suspect that there is some kind of filtering when writing/indexing the data through the Java API. I can PUT the same JSON from the command line and see name included in the result. It seems that only the Java API is dropping the field, but I am not sure.

If you have any ideas, I'll appreciate it!

Answer

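It turned out that documents with the same ids already existed, indexed before the name field was added. The request uses external versioning with created_at as the version, and external versioning accepts a write only when the supplied version is higher than the stored one. Re-indexing a document with the same id and the same version therefore results in a version conflict, even though the body now contains name, and the stored document stays unchanged. The PUT from the command line worked because it created a new document. It has nothing to do with the Java API.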
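To make the version check concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not Elasticsearch code: the class ExternalVersionDemo and its methods are hypothetical names, and the only behaviour it imitates is the rule that an externally versioned write must carry a version strictly greater than the stored one.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of external versioning; NOT Elasticsearch code. */
class ExternalVersionDemo {

    // Hypothetical stand-ins for the index: stored version and source per id.
    private final Map<String, Long> versions = new HashMap<>();
    private final Map<String, String> sources = new HashMap<>();

    /**
     * Imitates the VersionType.EXTERNAL rule: accept the write only if the
     * supplied version is strictly greater than the stored version.
     */
    boolean index(String id, long version, String source) {
        Long current = versions.get(id);
        if (current != null && version <= current) {
            return false; // Elasticsearch would report a version conflict here
        }
        versions.put(id, version);
        sources.put(id, source);
        return true;
    }

    String get(String id) {
        return sources.get(id);
    }

    public static void main(String[] args) {
        ExternalVersionDemo index = new ExternalVersionDemo();
        long createdAt = 1468390585000L;

        // First write: the document as it existed before "name" was added.
        System.out.println(index.index("123", createdAt, "{\"id\":123}"));
        // Same id and same version with an amended body: rejected.
        System.out.println(index.index("123", createdAt, "{\"id\":123,\"name\":\"Lucy\"}"));
        // The stored source is still the old one, without "name".
        System.out.println(index.get("123"));
        // A new id has no stored version, so the write is accepted.
        System.out.println(index.index("124", createdAt, "{\"id\":124,\"name\":\"Lucy\"}"));
    }
}
```

In this model the second write is refused because created_at, and therefore the version, has not changed, which matches the symptom in the question: the old document without name keeps being returned.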

I should use the update API to do a partial update.