Victor Mezrin - 1 year ago
C++ Question

SQLite - preferred encoding for the Windows platform

I am developing a C++ app for Windows.

I will use SQLite 3 to store:

  • paths to different files on HDD

  • strings from the GUI (these strings may be in any language: English, Spanish, Chinese, etc.)

  • different ASCII strings

I would like to have a UNIQUE index on the column with file path strings, but it's not required: I can enforce uniqueness in my C++ code.

What encoding should I use: UTF-8, UTF-16le, or UTF-16be?

SQLite has 3 functions to open DB:
Seems that for Windows I have to use
because the path may contain non-ASCII symbols. Is that right?

Answer Source

Just use UTF-8, which is the default.

The various UTF-16 encodings waste space (except when the vast majority of the text in the DB is non-ASCII, and in particular in a script such as Chinese, whose characters take three bytes in UTF-8 but two in UTF-16). The extra space requires more I/O, which makes everything slower. Furthermore, most functions with 16 in their name convert their parameters from/to UTF-8 and then call an internal function that uses UTF-8, so they will always be slower.
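To make the space argument concrete, here is a minimal sketch that counts how many bytes a string needs in each encoding. The function names are made up for this illustration, and the counts cover only the text itself, not any per-record overhead SQLite adds.

```cpp
#include <cstddef>
#include <string>

// Bytes needed to encode one Unicode code point in UTF-8.
std::size_t utf8_size(char32_t cp) {
    if (cp < 0x80) return 1;      // ASCII
    if (cp < 0x800) return 2;     // e.g. accented Latin, Cyrillic, Greek
    if (cp < 0x10000) return 3;   // rest of the BMP, e.g. most CJK
    return 4;                     // supplementary planes
}

// Bytes needed to encode one code point in UTF-16
// (one 16-bit unit, or a surrogate pair outside the BMP).
std::size_t utf16_size(char32_t cp) {
    return cp < 0x10000 ? 2 : 4;
}

// Total encoded size of a whole string in UTF-8.
std::size_t utf8_bytes(const std::u32string& s) {
    std::size_t n = 0;
    for (char32_t cp : s) n += utf8_size(cp);
    return n;
}

// Total encoded size of a whole string in UTF-16.
std::size_t utf16_bytes(const std::u32string& s) {
    std::size_t n = 0;
    for (char32_t cp : s) n += utf16_size(cp);
    return n;
}
```

For the ASCII path C:\data\file.txt (16 characters) this gives 16 bytes in UTF-8 versus 32 in UTF-16. For a two-character Chinese string such as U"\u4E2D\u6587" it gives 6 bytes in UTF-8 versus 4 in UTF-16, which is the exception mentioned above.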

While functions with 16 in their name receive and return UTF-16 strings, this is independent of the database's actual encoding (all functions convert from/to UTF-8 or UTF-16 as required).

Functions without 16 use UTF-8, which is just a different encoding. The set of characters you can use is exactly the same in both cases, and the SQL always behaves the same.

Some functions (e.g., sqlite3_open_v2) are not available in a 16 version.

Using the 16 functions makes sense only if you are forced to use UTF-16 strings for other reasons, and would have to convert anyway.
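If your paths arrive as UTF-16 (as they typically do from the Windows API), one option is to convert them to UTF-8 once and then pass the result to the UTF-8 functions such as sqlite3_open_v2, whose filename argument is interpreted as UTF-8. The following is a sketch of such a conversion using only the standard library; utf16_to_utf8 is a hypothetical helper name, and replacing unpaired surrogates with U+FFFD is a choice made for this example. On Windows you could probably use WideCharToMultiByte with CP_UTF8 instead.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: convert a UTF-16 string to UTF-8.
// Unpaired surrogates are replaced with U+FFFD (an assumption of this sketch).
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high surrogate
        }

        // Encode the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The returned std::string can then be handed to the UTF-8 API via c_str(), so a path with non-ASCII characters does not by itself force you to use the 16 functions.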