user1950349 user1950349 - 4 years ago 89
Java Question

How to make sure total bytes of a map (sum of all keys and values length) stays within the limit?

I receive a lot of records from a particular source and need to send them to our database. Here is what I am doing:


  • I store all these records in a ConcurrentHashMap where the key is an Integer and the value is a ConcurrentLinkedQueue. This CHM is populated by multiple threads in a thread-safe way.

  • A single background thread (running every 1 minute) reads from this map and passes the events to another method, which validates them and sends them to our database.
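
The producer side described above could look roughly like the sketch below. The question does not show this code, so the names PartitionBuffer, add and queueFor are hypothetical, and DataHolder is reduced to the two getters the question uses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for the question's DataHolder (only the getters it uses).
final class DataHolder {
    private final String clientKey;
    private final byte[] processBytes;

    DataHolder(String clientKey, byte[] processBytes) {
        this.clientKey = clientKey;
        this.processBytes = processBytes;
    }

    String getClientKey() { return clientKey; }
    byte[] getProcessBytes() { return processBytes; }
}

// Hypothetical wrapper around the CHM of per-partition queues.
final class PartitionBuffer {
    private final Map<Integer, ConcurrentLinkedQueue<DataHolder>> queues =
            new ConcurrentHashMap<>();

    // Safe to call from many producer threads: computeIfAbsent creates the
    // queue for a partition at most once, and CLQ.add is thread safe.
    void add(int partition, DataHolder holder) {
        queues.computeIfAbsent(partition, p -> new ConcurrentLinkedQueue<>()).add(holder);
    }

    // Used by the background thread; returns null if the partition was never seen.
    ConcurrentLinkedQueue<DataHolder> queueFor(int partition) {
        return queues.get(partition);
    }
}
```

The background thread would then hand each queue to validateAndSend together with its partition number.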



Below is my method which will be called by a single background thread every 1 minute.

private void validateAndSend(final int partition,
        final ConcurrentLinkedQueue<DataHolder> dataHolders) {

    Map<byte[], byte[]> clientKeyBytesAndProcessBytesHolder = new HashMap<>();
    int totalSize = 0;
    while (!dataHolders.isEmpty()) {
        DataHolder dataHolder = dataHolders.poll();
        byte[] clientKeyBytes = dataHolder.getClientKey().getBytes(StandardCharsets.UTF_8);
        if (clientKeyBytes.length > 255)
            continue;
        byte[] processBytes = dataHolder.getProcessBytes();
        int clientKeyLength = clientKeyBytes.length;
        int processBytesLength = processBytes.length;

        totalSize += clientKeyLength + processBytesLength;
        if (totalSize > 64000) {
            sendToDatabase(partition, clientKeyBytesAndProcessBytesHolder);
            clientKeyBytesAndProcessBytesHolder.clear(); // watch out for gc
            totalSize = 0;
        }
        clientKeyBytesAndProcessBytesHolder.put(clientKeyBytes, processBytes);
    }
    // calling again with remaining values
    sendToDatabase(partition, clientKeyBytesAndProcessBytesHolder);
}


In the above method, I iterate over the dataHolders CLQ and extract clientKeyBytes and processBytes from each element. Here is the validation that I am supposed to do:


  • If the clientKeyBytes length is greater than 255, I skip the record and continue iterating.

  • Otherwise I keep incrementing the totalSize variable by the sum of clientKeyLength and processBytesLength. This totalSize should always stay below 64000.

  • As soon as it reaches the 64000 limit, I send the clientKeyBytesAndProcessBytesHolder map to the sendToDatabase method, clear the map, reset totalSize to 0 and start populating again.

  • If it doesn't reach that limit and dataHolders becomes empty, I send whatever I have.



Basically, I have to make sure that whenever the sendToDatabase method is called, the clientKeyBytesAndProcessBytesHolder map has a size less than 64000 (the sum of all key and value lengths). It should never be called with a size greater than 64000.

Is this the best and most efficient way to do this, or is there a better way to accomplish the same thing?

Update:

Is this how it should be?

private void validateAndSend(final int partition,
        final ConcurrentLinkedQueue<DataHolder> dataHolders) {

    Map<byte[], byte[]> clientKeyBytesAndProcessBytesHolder = new HashMap<>();
    int totalSize = 0;
    while (!dataHolders.isEmpty()) {
        DataHolder dataHolder = dataHolders.poll();
        byte[] clientKeyBytes = dataHolder.getClientKey().getBytes(StandardCharsets.UTF_8);
        if (clientKeyBytes.length > 255)
            continue;
        byte[] processBytes = dataHolder.getProcessBytes();
        int clientKeyLength = clientKeyBytes.length;
        int processBytesLength = processBytes.length;

        int additionalLength = clientKeyLength + processBytesLength;
        if (totalSize + additionalLength > 64000) {
            Message message = new Message(partition, clientKeyBytesAndProcessBytesHolder);
            sendToDatabase(message.getAddress(), message.getLocation());
            clientKeyBytesAndProcessBytesHolder.clear(); // watch out for gc
            totalSize = 0;
        }
        clientKeyBytesAndProcessBytesHolder.put(clientKeyBytes, processBytes);
        totalSize += additionalLength;
    }
    // calling again with remaining values
    Message message = new Message(partition, clientKeyBytesAndProcessBytesHolder);
    sendToDatabase(message.getAddress(), message.getLocation());
}

Answer Source

Looks good, but there is a small bug in your original version: totalSize is reset to 0 where it should be set to clientKeyLength + processBytesLength. The bytes of the current entry are not counted after the data is sent, even though that entry is added to the map after the if statement.

I'd change the code as follows (the whole question might be better suited to Code Review Stack Exchange):

int additionalLength = clientKeyLength + processBytesLength;
if (totalSize + additionalLength > 64000) {
    sendToDatabase(partition, clientKeyBytesAndProcessBytesHolder);
    clientKeyBytesAndProcessBytesHolder.clear(); // watch out for gc
    totalSize = 0;
}
clientKeyBytesAndProcessBytesHolder.put(clientKeyBytes, processBytes);
totalSize += additionalLength;
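
To check the limit in isolation, here is a self-contained sketch of the same loop that collects the batches instead of sending them. It is only an illustration under a few assumptions: BatchSketch, Item, drainIntoBatches and sizeOf are made-up names, Item stands in for DataHolder, an entry that could never fit in a batch on its own is skipped (the question does not say what should happen to it), and each batch gets a fresh map instead of clear() so earlier batches stay intact.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class BatchSketch {
    static final int MAX_KEY_BYTES = 255;
    static final int MAX_BATCH_BYTES = 64000;

    // Hypothetical stand-in for the question's DataHolder.
    static final class Item {
        final String clientKey;
        final byte[] processBytes;

        Item(String clientKey, byte[] processBytes) {
            this.clientKey = clientKey;
            this.processBytes = processBytes;
        }
    }

    // Drains the queue into batches whose key+value byte total never exceeds
    // MAX_BATCH_BYTES (mirrors the "> 64000" check, so exactly 64000 is allowed).
    static List<Map<byte[], byte[]>> drainIntoBatches(Queue<Item> queue) {
        List<Map<byte[], byte[]>> batches = new ArrayList<>();
        Map<byte[], byte[]> current = new LinkedHashMap<>();
        int totalSize = 0;
        Item item;
        // poll() returns null on an empty queue, so no separate isEmpty() check is needed.
        while ((item = queue.poll()) != null) {
            byte[] keyBytes = item.clientKey.getBytes(StandardCharsets.UTF_8);
            if (keyBytes.length > MAX_KEY_BYTES) {
                continue; // key too long: skip, as in the question
            }
            int additionalLength = keyBytes.length + item.processBytes.length;
            if (additionalLength > MAX_BATCH_BYTES) {
                continue; // assumption: an entry that can never fit is skipped
            }
            if (totalSize + additionalLength > MAX_BATCH_BYTES) {
                batches.add(current);            // this is where sendToDatabase would go
                current = new LinkedHashMap<>(); // fresh map instead of clear()
                totalSize = 0;
            }
            current.put(keyBytes, item.processBytes);
            totalSize += additionalLength;
        }
        if (!current.isEmpty()) {
            batches.add(current); // remaining entries; never an empty batch
        }
        return batches;
    }

    // Sum of all key and value lengths in one batch.
    static int sizeOf(Map<byte[], byte[]> batch) {
        int size = 0;
        for (Map.Entry<byte[], byte[]> e : batch.entrySet()) {
            size += e.getKey().length + e.getValue().length;
        }
        return size;
    }
}
```

With this shape, a unit test can assert that sizeOf never exceeds the limit for any returned batch before the real sendToDatabase call is wired in.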

P.S.: What is the expected behavior when the same key is inserted multiple times? Your code currently inserts every instance, because a HashMap compares byte[] keys by identity rather than by content.
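
A small sketch of the duplicate-key behavior: Java arrays inherit equals and hashCode from Object, so two byte[] instances with the same content are different HashMap keys. Wrapping the bytes in a ByteBuffer, which compares by content, is one possible way to deduplicate. Whether deduplication is wanted at all is an assumption here, and ByteKeyDemo and its methods are illustrative names only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class ByteKeyDemo {
    // Two arrays with equal content are still two distinct keys.
    static int countWithArrayKeys() {
        Map<byte[], byte[]> map = new HashMap<>();
        map.put("k".getBytes(StandardCharsets.UTF_8), new byte[] {1});
        map.put("k".getBytes(StandardCharsets.UTF_8), new byte[] {2});
        return map.size();
    }

    // ByteBuffer.equals/hashCode look at the remaining bytes, so the
    // second put replaces the first value.
    static int countWithBufferKeys() {
        Map<ByteBuffer, byte[]> map = new HashMap<>();
        map.put(ByteBuffer.wrap("k".getBytes(StandardCharsets.UTF_8)), new byte[] {1});
        map.put(ByteBuffer.wrap("k".getBytes(StandardCharsets.UTF_8)), new byte[] {2});
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(countWithArrayKeys());  // prints 2
        System.out.println(countWithBufferKeys()); // prints 1
    }
}
```

Note that with content-based keys a replaced value changes the byte total, so totalSize would have to account for the old value being dropped.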
