Raymond Raymond - 3 years ago 89
Java Question

Should I use Java String Pool for synchronization based on unique customer id?

We have server APIs to support clients running on ten millions devices. Normally clients call server once a day. That is about 116 clients seen per second. For each client (each with unique ID), it may make several APIs calls concurrently. Server then need to sequence those API calls from the same client. Because, those API calls will update the same document in the Mongodb database. For example: last seen time and other embedded documents.

Therefore, I need to create a synchronization mechanism based on client's unique Id. After some research, I found String Pool is appealing and easy to implement. But, someone made a comment that locking on String Pool may conflict with other library/module which also use it. And, therefore, String Pool should never be used for synchronization purpose. Is the statement true? Or should I implement my own "String Pool" by WeakHashMap as mentioned in the link below?

Good explanation of String Pool implementation in Java:
http://java-performance.info/string-intern-in-java-6-7-8/

Article stating String Pool should not be use for synchronization:
http://www.journaldev.com/1061/thread-safety-in-java

==================================

Thanks for BeeOnRope's suggestion, I will use Guava's Interner to explain the solution. This way, client that don't send multiple requests at the same time will not be blocked. In addition, it guarantees only one API request from one client is processed at the same time.

Interner<String> myIdInterner = Interners.newWeakInterner();

public void processApi1(String clientUniqueId, RequestType1 request) {
synchronized(myIdInterner.intern(clientUniqueId)) {
// code to process request
}
}

public void processApi2(String clientUniqueId, RequestType2 request) {
synchronized(myIdInterner.intern(clientUniqueId)) {
// code to process request
}
}

Answer Source

Well if your strings are unique enough (e.g., generated via a cryptographic hash1) synchronizing on client IDs will probably work, as long as you call String.intern() on them first. Since the IDs are unique, you aren't likely to run into conflicts with other modules, unless you happen to pass your IDs in to them and they follow the bad practice of locking on them.

That said, it is probably a bad idea. In addition to the small chance of one day running into unnecessary contention if someone else locks on the same String instance, the main problem is that you have to intern() all your String objects, and this often suffers from poor performance because of the native implementation of the string intern table, it's fixed size, etc. If you really need to lock based only on a String, you are better off using Guava's Interners.newWeakInterner() interner implementation, which is likely to perform much better. Wrap your string in another class to avoid clashing on the built-in String lock. More details on that approach in this answer.

Besides that, there is often another natural object to lock on, such as a lock in a session object, etc.

This is quite similar to this question which has more fleshed out answers.


1 ... or, at a minimum, have at least have enough bits to make collision unlikely enough and if your client IDs aren't part of your attack surface.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download