Why doesn't Java's
new String(byte[], charset)
These methods don't perform encoding; they simply return a copy of the
String instance's internal state.
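Assuming the methods meant here are the char-based accessors such as toCharArray() and charAt(), here is a small sketch of what returning a copy of the internal state looks like in practice (the class name and literal are arbitrary):

    import java.util.Arrays;

    public class CopyDemo {
        public static void main(String[] args) {
            String s = "hello";

            // No charset is involved: toCharArray() just copies the char
            // values the String already holds.
            char[] first = s.toCharArray();
            char[] second = s.toCharArray();

            System.out.println(first == second);              // false: each call returns a fresh copy
            System.out.println(Arrays.equals(first, second)); // true: same contents

            // Mutating the copy leaves the String itself untouched.
            first[0] = 'J';
            System.out.println(s);                            // hello
        }
    }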
Encoding is the process of converting logical characters into a numeric representation: a series of bytes. Think of a
String as representing a sequence of Unicode characters. The
String class has APIs to access those characters as 32-bit code points, as a sequence of 16-bit char values in UTF-16 (the representation the char-based API exposes), or as a series of bytes in a chosen encoding. Only the last case requires you to specify the encoding.
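For example, a minimal sketch showing the three views side by side (the string literal and class name are arbitrary); only the byte view needs a charset:

    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;

    public class StringViews {
        public static void main(String[] args) {
            String s = "héllo"; // one non-ASCII character

            // 32-bit code points: no charset needed.
            System.out.println(Arrays.toString(s.codePoints().toArray())); // [104, 233, 108, 108, 111]

            // 16-bit char values (the UTF-16 view): no charset needed.
            System.out.println(s.toCharArray().length); // 5

            // Bytes: only here must an encoding be chosen.
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            System.out.println(utf8.length); // 6, because é takes two bytes in UTF-8

            // Decoding reverses the process: bytes plus a charset back to a String.
            System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(s)); // true
        }
    }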
Some encodings, like UTF-8, can represent every Unicode character, while many others, like US-ASCII, cover only a tiny subset. The
char-based APIs don't let you choose a different UTF-16 variant (UTF-16-LE, or UTF-16 with a BOM) because a single form is sufficient, and uniformity minimizes errors from mismatched encodings.
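As an illustration of that coverage difference (a sketch; the euro sign is just a convenient character outside US-ASCII's range):

    import java.nio.charset.StandardCharsets;

    public class CoverageDemo {
        public static void main(String[] args) {
            String s = "€10";

            byte[] utf8  = s.getBytes(StandardCharsets.UTF_8);    // 5 bytes: the euro sign needs 3
            byte[] ascii = s.getBytes(StandardCharsets.US_ASCII); // 3 bytes: the euro sign can't be
                                                                   // represented and becomes '?'

            System.out.println(new String(utf8,  StandardCharsets.UTF_8));    // €10
            System.out.println(new String(ascii, StandardCharsets.US_ASCII)); // ?10

            // The char-based view, by contrast, has no such choice to make.
            System.out.println(s.toCharArray().length); // 3, regardless of any encoding
        }
    }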