Lin Ma - 1 year ago
Bash Question

get UTF-8 encoded hex value for international character

I'm on Mac OS X and have a UTF-8 encoded file that contains international (non-ASCII) characters. Is there a tool or simple command (e.g. in Python 2.7 or the shell) that shows the hex (base-16) values of the file's bytes? For example, if I write some Asian characters into the file, I would like to see the corresponding hex values.

My current solution is to open the file in Python and read it byte by byte as a str. Is there a simpler way that needs no coding? :)

Edit 1: it seems the output of od is not correct,

cat ~/Downloads/12

od ~/Downloads/12
0000000 000061

Edit 2: I tried the od -t x1 option as well,

od -t x1 ~/Downloads/12
0000000 31

thanks in advance,

Answer

od is the right command, but you need to pass -t x1 to get one hex value per byte:

$ od -t x1 ~/Downloads/12
0000000 31
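
For context on Edit 1 in the question: without -t, od groups the input into two-byte words and prints them in octal, so 000061 and 31 describe the same single byte. A quick way to convince yourself in the shell (a small sketch, nothing specific to the file above):

```shell
# Octal 061 and hex 31 are the same value: 49, the ASCII code for "1".
# printf reads a numeric argument with a leading 0 as octal
# and one with a leading 0x as hex.
printf '%x\n' 061    # prints 31
printf '%d\n' 0x31   # prints 49
```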

If you prefer not to see the file offsets, try adding -A none:

$ od -A none -t x1 ~/Downloads/12

Additionally, the Linux man page (but not the OS X man page) lists this example: od -A x -t x1z -v, "Display hexdump format output."
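
To see this with an actual international character rather than "1", here is a minimal sketch. The function name utf8_hex is made up for illustration, and it assumes your terminal or script file is itself UTF-8 encoded. It uses -A n, the POSIX spelling of the "no offsets" option, and squeezes whitespace because the column padding differs between the OS X and GNU versions of od:

```shell
# Hypothetical helper: print the UTF-8 bytes of a string as space-separated hex.
utf8_hex() {
  # printf '%s' passes the bytes through unchanged, with no trailing newline.
  # od -A n -t x1 -v prints one hex value per byte, without offsets.
  # tr and sed collapse the padding and trim the ends.
  printf '%s' "$1" | od -A n -t x1 -v | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

utf8_hex '中'   # e4 b8 ad
```

The character 中 (U+4E2D) takes three bytes in UTF-8, which is why three hex values come back. For a file, the same pipeline works with od reading the file directly instead of the printf output.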