Tacitus86 Tacitus86 - 1 month ago 12
Java Question

Java String from byte array

I am reading a UDP byte array that I know contains a string, and I know the MAXIMUM possible length of that string. The actual string is usually shorter than the maximum. I can print it, but the text is followed by junk characters. Is there a way to trim the junk binary data without knowing the actual length of the valid text?

String result = new String(input, Charset.forName("US-ASCII"));


I'll try to add more detail for those asking. Here is how the UDP message is read:

sock.receive(incoming);
byte[] data = incoming.getData();
String s = new String(data, 0, incoming.getLength());


The UDP message contains a fixed-size header followed by a block of data (max size 1024 bytes). The data may be int, string, byte, etc.; the header determines the type. Depending on the type, I chop the data into chunks of the appropriate size. The problem I am focusing on is the string type. I know that each string occupies at most 128 bytes, so I read the data in chunks of that size, where dataArray is the byte array:

for (int i = 0; i < msg.length; i += readSize)
{
    dataArray = Arrays.copyOfRange(msg, i, i + readSize);
}


Then I use the code from the first snippet in this post to put the data into a String object. The problem is that the text sent is usually shorter than the 128 bytes allocated as the maximum size. So when I print the string, I get the valid text followed by whitespace and non-printable ASCII characters (junk data). I hope this addition helps.

An example of the output is here. Everything up to the .mof is valid:

https://1drv.ms/i/s!Ai0t7Oj1PUFBpRP9K_2RlocAK4B7

Answer

OK, here is how I was able to get it to work. It's a rather manual method, but before using

String result = new String(input, Charset.forName("US-ASCII"));

to combine the byte array into a string, I looked at each byte and made sure it was within the printable range 0x20 - 0x7e. If not, I replaced the value with a space (0x20). Then I finished off with a .trim() on the string.
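
A minimal sketch of that approach, for anyone who wants something to paste in. The class name PrintableAscii, the method name clean, and the sample bytes are made up for illustration; only the 0x20 - 0x7e range, the space replacement, and the trim come from the description above. It works on a copy so the original buffer is left untouched.

```java
import java.nio.charset.StandardCharsets;

public class PrintableAscii {

    // Hypothetical helper: replaces every byte outside the printable
    // ASCII range 0x20 - 0x7e with a space, decodes as US-ASCII, then trims.
    static String clean(byte[] input) {
        byte[] copy = input.clone();
        for (int i = 0; i < copy.length; i++) {
            // Java bytes are signed, so values 0x80 - 0xff are negative
            // and are caught by the "< 0x20" comparison.
            if (copy[i] < 0x20 || copy[i] > 0x7e) {
                copy[i] = 0x20;
            }
        }
        return new String(copy, StandardCharsets.US_ASCII).trim();
    }

    public static void main(String[] args) {
        // Simulated 128-byte string field: valid text followed by junk.
        byte[] field = new byte[128];
        byte[] text = "report.mof".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(text, 0, field, 0, text.length);
        field[20] = (byte) 0x9c; // non-ASCII junk byte
        field[21] = 0x07;        // control character

        System.out.println("[" + clean(field) + "]");
    }
}
```

One caveat: this only removes junk that falls outside the printable range. If the unused part of the field happens to contain printable bytes, they will survive the trim, and non-printable bytes in the middle of the valid text turn into spaces.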
