jvataman - 1 month ago
Java Question

ElasticSearch - Too many open files

I know that this is a known and discussed problem, but I would just like to get the dimensions right here:

I am running ElasticSearch 2.4 on a single Ubuntu Server 16.04 node (12 cores, 256 GB RAM). I have increased the ulimit to > 130k (and verified it via _nodes/stats/process).

I have two indices with 10 shards each (since multiple nodes will join the cluster soon).

Now I am writing with up to 900 concurrent Java TransportClients, which causes the ElasticSearch server to collapse within seconds, throwing a "Too many open files" exception.

Am I missing something here? Are 900 concurrent writes too much for a single instance to handle? Or are 10 shards too many for one node?
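
For reference, each writer creates its own client roughly like this; the cluster name, host, and index/type names below are placeholders, not the real values:

```java
import java.net.InetAddress;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class TransportWriter {

    public static void main(String[] args) throws Exception {
        // Placeholder cluster name; adjust to the actual setup.
        Settings settings = Settings.settingsBuilder()
                .put("cluster.name", "my-cluster")
                .build();

        // Each TransportClient instance keeps its own thread pool and
        // node connections alive for as long as it exists.
        TransportClient client = TransportClient.builder()
                .settings(settings)
                .build()
                .addTransportAddress(new InetSocketTransportAddress(
                        InetAddress.getByName("localhost"), 9300));

        try {
            // Index a single document via the binary transport protocol.
            client.prepareIndex("my-index", "my-type")
                    .setSource("{\"field\":\"value\"}")
                    .get();
        } finally {
            client.close();
        }
    }
}
```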

Answer

Here is what turned out to be the case:

  • Connecting via the Java TransportClient creates considerable overhead. It does not use the HTTP REST API but the ES binary transport protocol. (as explained here)
    • Queries via the TransportClient are only negligibly faster than via REST.
    • The TransportClient creates a thread pool on the client which, as of now, is not configurable. It maintains several connections to the nodes so that it can cope with failover, retrieve cluster statistics, etc. This leads to a considerable long-term load on the client.
    • In our case every additional connected TransportClient produced ~1000 open file descriptors on the ES machine.

We switched to the Jest client, which significantly reduced the load on both client and server. 900 concurrently active clients now result in fewer than 2000 open file descriptors on the server.
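
For illustration, here is a minimal sketch of the Jest setup; the endpoint and connection-pool sizes shown are placeholders, not our production values:

```java
import io.searchbox.client.JestClient;
import io.searchbox.client.JestClientFactory;
import io.searchbox.client.config.HttpClientConfig;
import io.searchbox.core.Index;

public class JestWriter {

    public static void main(String[] args) throws Exception {
        // Placeholder HTTP endpoint and pool sizes; tune for your workload.
        JestClientFactory factory = new JestClientFactory();
        factory.setHttpClientConfig(new HttpClientConfig.Builder("http://localhost:9200")
                .multiThreaded(true)
                .maxTotalConnection(50)
                .defaultMaxTotalConnectionPerRoute(50)
                .build());
        JestClient client = factory.getObject();

        try {
            // Index one document over plain HTTP; connections are pooled and
            // reused, so the per-client footprint on the server stays small.
            Index index = new Index.Builder("{\"field\":\"value\"}")
                    .index("my-index")
                    .type("my-type")
                    .build();
            client.execute(index);
        } finally {
            client.shutdownClient();
        }
    }
}
```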

Thanks to Andrei Stefan for pointing us in the right direction.