George Wang - 4 months ago
Java Question

byte array length varies before and after transformation

I need to send and receive large byte arrays over the internet (via an HTTP RESTful service).

The simplest way I can think of is to convert the byte array into a string.

I searched around and found this post: "Java Byte Array to String to Byte Array".

I wrote the following code to verify the accuracy of the transformation.

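String message = "Die Strahlengriffelgewächse stammen...";
System.out.println("message");
System.out.println(message);

// Pack the message into a byte array (Fbs.packExce() is my own method)
byte[] pack = Fbs.packExce(message);
System.out.println("pack");
System.out.println(pack); // prints the array's type and identity hash, not its contents
System.out.println("packlenght:" + pack.length);

// Decode the bytes into a String using the platform default charset
String toString = new String(pack);
System.out.println("toString");
System.out.println(toString);

// Encode the String back into bytes, again using the platform default charset
byte[] toBytes = toString.getBytes();
System.out.println("toBytes");
System.out.println(toBytes);
System.out.println("toByteslength:" + toBytes.length);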


"Fbs.packExce()" is a method that takes a large string and produces a large byte array.

I varied the length of the message and printed the length of the byte array before converting it to a string and after converting it back.

I got the following results:

...
pack
[B@5680a178
packlenght:748
...
toBytes
[B@5fdef03a
toByteslength:750

----------------------

...
pack
[B@5680a178
packlenght:1016
...
toBytes
[B@5fdef03a
toByteslength:1018


I have omitted the "message" output since it is too long.

About 8 times out of 10, the derived byte array ("toBytes") is 2 bytes longer than the original byte array ("pack").

I say 8 out of 10 because there were also cases where the derived and the original arrays had the same length, see below:

...
pack
[B@5680a178
packlenght:824
toString
...
toBytes
[B@5fdef03a
toByteslength:824
...


I cannot figure out the exact rule.

Does anyone have any idea?

Or is there a better way of converting a byte array to and from a string?

cheers

Answer

The simplest way I can think of is to convert the byte array into a string.

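The simplest way is the wrong way. For most character encodings, converting an arbitrary byte sequence to text and back is likely to be lossy, because byte sequences that are not valid in the encoding are replaced during decoding. If your default charset is UTF-8, for example, a malformed byte is decoded to the replacement character U+FFFD, which encodes back as three bytes. That would account for a result that is sometimes 2 bytes longer and sometimes the same length, depending on the data.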
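To illustrate the lossiness, here is a minimal sketch. It assumes UTF-8 as the charset (your platform default may differ), and the class and method names are made up for this example. A single byte that is not valid UTF-8 comes back as three bytes after the round trip:

```java
import java.nio.charset.StandardCharsets;

class LossyRoundTripDemo {
    // Decodes the bytes as UTF-8 text, then encodes that text back to bytes.
    static byte[] roundTripUtf8(byte[] input) {
        String text = new String(input, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFF can never appear in valid UTF-8, so it is decoded to U+FFFD.
        byte[] original = { 0x61, (byte) 0xFF, 0x62 };
        byte[] result = roundTripUtf8(original);
        System.out.println("original length:   " + original.length); // 3
        System.out.println("round-trip length: " + result.length);   // 5
    }
}
```

Input that happens to be valid UTF-8 survives unchanged, which is why the lengths only differ for some messages.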

A better (i.e. more robust) way is to use Base64 encoding. Read the javadoc for the java.util.Base64 class (Java 8 and later) and its nested Encoder and Decoder classes.
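A minimal sketch of a Base64 round trip, assuming Java 8 or later and java.util.Base64. The class and method names are invented for this example, and the sample bytes stand in for whatever Fbs.packExce() returns:

```java
import java.util.Arrays;
import java.util.Base64;

class Base64RoundTripDemo {
    // Encodes arbitrary bytes as an ASCII-only Base64 string.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decodes a Base64 string back into the original bytes.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Sample bytes that are not valid text in most encodings.
        byte[] pack = { 0x00, (byte) 0xFF, (byte) 0x80, 0x7F };
        String encoded = encode(pack);
        byte[] restored = decode(encoded);
        System.out.println("same content: " + Arrays.equals(pack, restored));
    }
}
```

The encoded string is roughly a third longer than the raw bytes, but it contains only ASCII characters, so it does not depend on any charset agreement between client and server. If the string ends up in a URL, Base64.getUrlEncoder() and Base64.getUrlDecoder() may be the better fit.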


If you do persist in trying to convert arbitrary bytes to characters and back using new String(byte[]) and the like:

  • Be sure that you choose a character encoding where a bytes -> characters -> bytes conversion sequence is not lossy. (ISO-8859-1, also known as Latin-1, will work.)

  • Don't rely on the current execution platform's default character encoding for the encoding / decoding charset.

  • In a client / server system, the client and server have to use the same encoding.
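As a rough sketch of the points above, the following passes an explicit charset on both sides instead of relying on the platform default. ISO-8859-1 maps each of the 256 byte values to exactly one character, so the round trip preserves every byte. The class and method names are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTripDemo {
    // Decodes bytes to text with an explicit charset, not the platform default.
    static String bytesToText(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Encodes text back to bytes with the same explicit charset.
    static byte[] textToBytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Exercise every possible byte value.
        byte[] original = new byte[256];
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) i;
        }
        byte[] restored = textToBytes(bytesToText(original));
        System.out.println("same content: " + Arrays.equals(original, restored));
    }
}
```

Note that the resulting string can contain control characters and other unprintable values, which the transport layer (JSON, HTTP headers, and so on) may still alter. That is one reason Base64 tends to be the safer choice.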
