LP. Gonçalves - 2 months ago
C# Question

MySQL Field Space Allocation

First I create a model in my application, then Entity Framework generates the SQL to create the table.
The first property generates a column of type varchar(20), the second generates longtext.


[StringLength(20)]
public string Code { get; set; }

public string CodeTwo { get; set; }


Is there any difference between these two declarations in terms of space allocation?
(Even if they store the same value, such as "test", which has 4 characters.)

If I know that a field's length varies between, say, 10 and 15 characters, is the best approach to limit it to the maximum length or to leave it "unlimited" (in terms of space allocation)?

Thanks in advance.
Sorry for my poor English.


Translated answer from user @Marconcílio Souza to the same question asked in another language.

When Entity Framework generates the tables in your database, it checks the type of each field. For the string type, when you specify a size, it passes the same specification on to the database using the corresponding column type.

In your case:

[StringLength(20)]
public string Code { get; set; }

The corresponding MySQL type is varchar(20). When the same string property is declared without a fixed size, Entity Framework allocates the largest possible type for it in the database, which in the case of MySQL is longtext.
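For the 10 to 15 character field from the question, a bounded declaration could be sketched as follows. The `Product` class and the `ModelInspector` helper are hypothetical names used only for illustration. The snippet reads the declared limit through reflection instead of running Entity Framework, so the column type it mentions (varchar(15), by analogy with the varchar(20) case above) is an expectation that still depends on the provider.

```csharp
using System;
using System.ComponentModel.DataAnnotations;
using System.Reflection;

// Hypothetical model, used only for illustration.
public class Product
{
    // Bounded: by analogy with the varchar(20) example, expected to map to varchar(15).
    [StringLength(15)]
    public string Code { get; set; }

    // Unbounded: mapped to longtext on MySQL, according to the answer.
    public string CodeTwo { get; set; }
}

public static class ModelInspector
{
    // Returns the declared maximum length, or null when no limit is declared.
    public static int? DeclaredMaxLength(string propertyName)
    {
        var property = typeof(Product).GetProperty(propertyName);
        var attribute = property?.GetCustomAttribute<StringLengthAttribute>();
        return attribute?.MaximumLength;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(ModelInspector.DeclaredMaxLength("Code"));            // 15
        Console.WriteLine(ModelInspector.DeclaredMaxLength("CodeTwo") == null); // True
    }
}
```

Declaring the limit on the model keeps the intended size in one place, so the generated schema and any validation based on the attribute stay in agreement.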

Columns of BLOB-like types such as LONGTEXT are inherently variable-length and take up almost no storage when they are not used. A NULL value requires essentially no extra space, and when a value such as 'test' is stored, the space allocated follows the size of the string actually passed.
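To put rough numbers on the question's "test" example, the sketch below applies the per-value storage formulas from the MySQL documentation: a VARCHAR value takes its byte length plus a 1-byte or 2-byte length prefix, and a LONGTEXT value takes its byte length plus a 4-byte prefix. The class and method names are hypothetical, the 4 bytes per character assumes a utf8mb4 column, and row or storage-engine overhead is ignored.

```csharp
using System;
using System.Text;

public static class MySqlStringStorage
{
    // VARCHAR(M): value bytes plus a 1-byte length prefix when the column can
    // hold at most 255 bytes, otherwise a 2-byte prefix.
    public static int VarcharBytes(string value, int declaredChars, int maxBytesPerChar)
    {
        int valueBytes = Encoding.UTF8.GetByteCount(value);
        int prefixBytes = declaredChars * maxBytesPerChar <= 255 ? 1 : 2;
        return valueBytes + prefixBytes;
    }

    // LONGTEXT: value bytes plus a 4-byte length prefix.
    public static int LongTextBytes(string value)
    {
        return Encoding.UTF8.GetByteCount(value) + 4;
    }

    public static void Main()
    {
        // Assumption: utf8mb4, i.e. up to 4 bytes per character.
        Console.WriteLine(VarcharBytes("test", 20, 4)); // 5
        Console.WriteLine(LongTextBytes("test"));       // 8
    }
}
```

Under these assumptions the difference for a short value is a few bytes per row; neither column type reserves its declared maximum.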

* Advantages / disadvantages of BLOBs vs. VARCHARs *

The points below were written for Firebird / InterBase, as the version references in them show. All comments about the VARCHAR type are valid for the CHAR type too. Each comment ends with a BLOB + or VARCHAR + mark to indicate which data type is better.

  • Do you know the maximum length of your data?

With VARCHARs you need to declare the maximum length of the string. With blobs you do not have to worry about it. BLOB +

  • Do you need to store very long strings?

A single VARCHAR is limited to 32K bytes (i.e., about 10 thousand Unicode characters). The maximum blob size depends on the page size (according to the Service Guide):

  - Page size 1 KB => 64 MB
  - Page size 2 KB => 512 MB
  - Page size 4 KB => 4 GB
  - Page size 8 KB => 32 GB


  • Do you need to store many long text columns in a single table?

The total (uncompressed) row length is restricted to 64K. VARCHARs are stored inline, directly in the row, so you cannot store many long strings in one row. Blobs are represented by their blob-id, which uses only 8 bytes of the 64K maximum. BLOB +

  • Do you want to minimize the number of calls between client and server?

VARCHAR data is fetched along with the other row data in a fetch operation, and usually several rows are sent over the network at the same time. Each blob requires an extra open / fetch operation. VARCHAR +

  • Do you want to minimize the amount of data transferred between client and server?

The advantage of blobs is that when you fetch a row you get only the blob-id, so you can decide whether or not to fetch the BLOB data. In older versions of InterBase there was a problem that VARCHARs were sent over the network at their full declared length. This problem was fixed in Firebird 1.5 and InterBase 6.5. Draw (BLOB + for older versions of the server)

  • Do you want to minimize the space used?

VARCHARs are RLE-compressed (in fact, entire rows are compressed, except for blobs). At most 128 bytes can be compressed to 2 bytes. This means that even an empty VARCHAR(32000) will occupy 500 + 2 bytes.
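The 500 + 2 figure can be reproduced with a short calculation: 32000 bytes split into runs of at most 128 bytes gives 250 runs, each compressed to 2 bytes. The sketch below is only that arithmetic, not the server's compression code. The names are hypothetical, and the extra 2 bytes are taken from the text as a fixed per-column cost, which is an assumption.

```csharp
using System;

public static class RleEstimate
{
    // Figures quoted in the text: a run of up to 128 bytes compresses to 2 bytes.
    private const int MaxRunBytes = 128;
    private const int CompressedRunBytes = 2;

    // The extra "+ 2" from the text, taken as given (assumption: fixed per-column cost).
    private const int ExtraBytes = 2;

    // Bytes needed for the compressed padding of an empty column.
    public static int CompressedPadding(int declaredBytes)
    {
        int runs = (declaredBytes + MaxRunBytes - 1) / MaxRunBytes; // ceiling division
        return runs * CompressedRunBytes;
    }

    public static int EmptyVarcharBytes(int declaredBytes)
    {
        return CompressedPadding(declaredBytes) + ExtraBytes;
    }

    public static void Main()
    {
        Console.WriteLine(CompressedPadding(32000)); // 500
        Console.WriteLine(EmptyVarcharBytes(32000)); // 502
        Console.WriteLine(EmptyVarcharBytes(20));    // 4
    }
}
```

By the same estimate, a short declared length such as 20 costs only a few bytes when empty, so the overhead matters mainly for very large declared lengths.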

Blobs are not compressed, but an empty (i.e. NULL) blob occupies only the 8 bytes of its blob-id (which is later RLE-compressed). A non-empty blob may be stored on the same page as the other row data (if it fits) or on a separate page. A small blob that fits on the data page has an overhead of 40 bytes (or a little more). A big blob has the same 40-byte overhead on the data page, plus 28 bytes of overhead on each blob page (30 bytes on the first). A blob page cannot contain more than one blob (i.e. blob pages are not shared the way data pages are). For example, with a 4K page size, if you store a 5K blob, two blob pages will be allocated, which means that you lose 3K of space! In other words, the larger the page size, the higher the probability that small blobs will fit on a data page, but also the more space is wasted if separate blob pages are needed for large blobs. VARCHAR + (except for VARCHARs with an extremely large declared length, or tables with lots of NULL blobs)
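The 4K page / 5K blob example can be checked with a small calculation. This is a simplified sketch with hypothetical names: it models only the per-page headers quoted above (30 bytes on the first blob page, 28 on the others) and assumes a blob of at least one byte that is stored on its own pages.

```csharp
using System;

public static class BlobPageEstimate
{
    // Per-page overhead figures quoted in the text.
    private const int FirstPageHeaderBytes = 30;
    private const int OtherPageHeaderBytes = 28;

    // Number of dedicated blob pages needed, given that pages are not shared.
    public static int PagesNeeded(int blobBytes, int pageSize)
    {
        int firstCapacity = pageSize - FirstPageHeaderBytes;
        if (blobBytes <= firstCapacity)
        {
            return 1;
        }

        int otherCapacity = pageSize - OtherPageHeaderBytes;
        int remaining = blobBytes - firstCapacity;
        return 1 + (remaining + otherCapacity - 1) / otherCapacity; // ceiling division
    }

    // Allocated space that does not hold blob data (headers plus unused tail).
    public static int WastedBytes(int blobBytes, int pageSize)
    {
        return PagesNeeded(blobBytes, pageSize) * pageSize - blobBytes;
    }

    public static void Main()
    {
        Console.WriteLine(PagesNeeded(5 * 1024, 4096)); // 2
        Console.WriteLine(WastedBytes(5 * 1024, 4096)); // 3072
    }
}
```

The result matches the text: two pages are allocated and 3072 bytes (3K) of the allocation hold no blob data.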

  • Do you need a table with an extremely large number of rows?

Each row is identified by a DB_KEY, which is a 64-bit value: 32 bits hold the table ID and 32 bits are used to locate the row. The theoretical maximum number of rows in a table is therefore 2^32 (but for various reasons the true maximum is even lower). Blob-ids are allocated from the same address space as DB_KEYs, which means that the more blobs there are in the table, the fewer DB_KEYs remain to address rows. On the other hand, when the stored rows are wide (e.g. they contain long VARCHARs), fewer rows fit on a data page and many DB_KEY values remain unassigned anyway. VARCHAR +?

  • Do you want good performance?

Because large blobs are stored outside the data pages, they increase the "density" of rows on data pages and thus cache efficiency (reducing the number of I/O operations during a search). BLOB +

  • Do you need to search the contents of text columns?

With VARCHAR you can use operators such as '=', '>', BETWEEN, IN(), case-sensitive LIKE and STARTING, and case-insensitive CONTAINING. In most cases an index can be used to speed up the search. Blobs cannot be indexed, and you are restricted to the LIKE, STARTING and CONTAINING operators. You cannot directly compare blobs with operators such as '=' or '>' (unless you use a UDF), so you cannot, for example, join tables on blob fields. VARCHAR +

  • Do you want to search the content of these texts with CONTAINING?

CONTAINING can be used to perform a case-insensitive search of a VARCHAR field's content (no index is used). Because you cannot set a collation order for BLOB columns, you cannot use a fully case-insensitive search with national characters in BLOB columns (only the lower half of the character set is case-insensitive); alternatively, you can use a UDF. Firebird 2 already allows you to set the collation for text (and binary) blob columns. VARCHAR +

  • Do you need to uppercase the contents of the text column?

You can use the built-in UPPER() function on a VARCHAR, but not on a blob. (CAST, MIN and MAX cannot be used with blobs either.) VARCHAR +

You cannot sort by a blob column (the same goes for GROUP BY, DISTINCT, UNION and JOIN ON), and you cannot concatenate blob columns. VARCHAR +

There is no built-in conversion function (CAST) for converting a blob to VARCHAR or a VARCHAR to a blob (but you can write a UDF for this purpose). Since Firebird 1.5 you can use the built-in SUBSTRING function to convert a blob to VARCHAR (but the FROM and FOR clauses cannot exceed 32K). Draw

You cannot assign a value to a blob directly in an SQL command, e.g. INSERT INTO tab (MyBlob) VALUES ('abc'); (but you can use a UDF to convert a string to a blob). VARCHAR +

Firebird 0.9.4 already has this functionality. Draw

  • Do you need good security on these text columns?

To retrieve table data, you must be granted the SELECT privilege. To retrieve a blob, you only need to know its blob-id (stored in the table), and Firebird / InterBase will not check whether you have any rights on the table the blob belongs to. This means that anyone who knows or guesses the right blob-id can read the blob without any rights on the table. (You can try it with ISQL and the BLOBDUMP command.) VARCHAR +
