Lin Ma - 4 months ago
Bash Question

Get the UTF-8 encoded hex value of an international character

I'm on Mac OS X and have a UTF-8 encoded file that contains international (non-ASCII) characters. Is there a tool or simple command (e.g. in Python 2.7 or the shell) that shows the hex (base-16) values of the file's byte stream? For example, if I write some Asian characters into the file, I want to see the hex values of their bytes.

My current solution is to open the file and read it byte by byte as a Python str. Is there a simpler way that doesn't require writing code? :)
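
For reference, the byte-by-byte approach described above might look like the following sketch. It is written for Python 3 (binascii.hexlify is also available in 2.7); the helper names hex_bytes and hex_of_file are made up for illustration and do not come from the original post.

```python
import binascii


def hex_bytes(data):
    """Return the bytes of `data` as space-separated two-digit hex values."""
    hex_string = binascii.hexlify(data).decode("ascii")
    # Split the continuous hex string into one pair of digits per byte.
    return " ".join(hex_string[i:i + 2] for i in range(0, len(hex_string), 2))


def hex_of_file(path):
    """Read a file in binary mode and return its bytes as hex."""
    with open(path, "rb") as handle:
        return hex_bytes(handle.read())


# U+4E2D (the character 中) occupies three bytes in UTF-8.
print(hex_bytes(u"\u4e2d".encode("utf-8")))  # e4 b8 ad
```

Opening the file in binary mode matters here: the goal is the raw UTF-8 byte stream, not decoded characters.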

Edit 1: it seems the output of od is not correct:

$ cat ~/Downloads/12
1

$ od ~/Downloads/12
0000000 000061
0000001


Edit 2: I also tried od with the -t x1 option:

$ od -t x1 ~/Downloads/12
0000000 31
0000001


Thanks in advance,
Lin

Answer

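od is the right command, and its default output was not wrong: by default od prints two-byte words in octal, and octal 061 is hex 31, the ASCII code for "1". To print one byte at a time in hex, pass the -t x1 option: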

$ od -t x1 ~/Downloads/12
0000000 31
0000001
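
To confirm that 000061 (od's default output, which is octal) and 31 (hex) describe the same byte, here is a minimal check in Python 3. The variable names are only illustrative.

```python
# The single byte stored in the file: the ASCII character "1".
data = b"1"

byte_value = data[0]  # indexing bytes in Python 3 yields an int (49 here)

# od's default view: the value in octal, zero-padded to six digits.
as_octal = format(byte_value, "06o")
# od -t x1 view: the same value as two hex digits.
as_hex = format(byte_value, "02x")

print(as_octal)  # 000061
print(as_hex)    # 31
```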

If you prefer not to see the file offsets, try adding -A none:

$ od -A none -t x1 ~/Downloads/12
 31

Additionally, the Linux man page (but not the OS X man page) lists this example: od -A x -t x1z -v, "Display hexdump format output."

Reference: http://www.unix.com/man-page/osx/1/od/
