If we have a character in our text file that is in Unicode, mustn't it be 2 bytes of data?
int x = fin.read();
Good question! You're right that a Java char is always two bytes, but that isn't true elsewhere (e.g. in the contents of a file).
A file is not encoded "in Unicode" because Unicode is a specification, not an encoding. Encodings map Unicode code points to byte sequences, and not all encodings use two bytes per character. A Java char is a 16-bit UTF-16 code unit, so it is always two bytes wide (characters outside the Basic Multilingual Plane need two chars, a surrogate pair), but many files are stored as UTF-8, which is variable-width: ASCII characters take one byte, and all others take two to four.
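To make the width difference concrete, here is a minimal sketch that measures how many bytes a few sample characters occupy in each encoding. The class and method names (EncodingWidths, utf8Length, utf16Length) are made up for illustration, and the sample characters are written as Unicode escapes so the result does not depend on the source file's own encoding.

```java
import java.nio.charset.StandardCharsets;

public class EncodingWidths {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies when encoded as UTF-16
    // (big-endian, so no byte-order mark is added to the count).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Hypothetical samples: U+0041 (A), U+00E9 (e with acute accent),
        // U+20AC (euro sign), U+1F600 (an emoji outside the BMP).
        String[] samples = {"A", "\u00e9", "\u20ac", "\uD83D\uDE00"};
        for (String s : samples) {
            System.out.printf("U+%04X: UTF-8=%d bytes, UTF-16=%d bytes, Java chars=%d%n",
                    s.codePointAt(0), utf8Length(s), utf16Length(s), s.length());
        }
    }
}
```

The last sample is the interesting one: it is a single character, but it needs four bytes in either encoding and two Java chars.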
More to the point, however, InputStream is designed to read binary data, not characters, and binary data is (essentially) always read one byte at a time: read() returns the next byte as an int from 0 to 255, or -1 at the end of the stream. If you want to read text, wrap your stream in a Reader (preferably specifying the encoding explicitly) to convert the binary data into text. Internally it reads one or more bytes from the underlying stream to construct each character according to the encoding.
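As a rough illustration of the difference, the sketch below reads the same bytes twice: once directly from an InputStream, and once through an InputStreamReader that decodes UTF-8. It uses an in-memory ByteArrayInputStream as a stand-in for the file stream in the question, and the class and method names (BytesVersusChars, countBytes, readText) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class BytesVersusChars {
    // Reads the raw stream to the end; each successful read() yields one byte.
    static int countBytes(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Wraps the stream in a Reader that decodes UTF-8; each successful
    // read() on the Reader yields one Java char (a UTF-16 code unit).
    static String readText(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "h\u00e9llo" is five characters, but U+00E9 takes two bytes in UTF-8.
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes read: " + countBytes(new ByteArrayInputStream(data)));
        System.out.println("chars read: " + readText(new ByteArrayInputStream(data)).length());
    }
}
```

With a real file you would pass a FileInputStream instead of the ByteArrayInputStream; the decoding step is the same.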