Jeki Maulana Jeki Maulana - 2 months ago 14
C# Question

MySQL field changed from VARCHAR(20) to INT after LOAD DATA LOCAL INFILE

I have problem when load .csv file from local. My field has changed from VARCHAR(20) to INT by it self. I know that after used this syntax (this is what I want):

SELECT * FROM `len_enq_spph` WHERE po = '240000388';


its return null

But if I used this syntax:

SELECT * FROM `len_enq_spph` WHERE po = 240000388;


its return what I want.

po field is VARCHAR(20).
But why that field must INT in where clause?
This example of my csv file

id|order_id|ext_ord_ref|order_id2__1
1|160000402|110005678|240000388
2|160000402|110005678|240000388
3|160000402|110005678|240000388


And this is my LOAD syntax:

LOAD DATA LOCAL INFILE 'c:\\temp\\SPPH.csv' INTO TABLE len_enq_spph
FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' IGNORE 1 LINES;


I tried to specific fields, but still not working.

Answer

If that's a MS-DOS format csv file, the line terminator probably includes a carriage return character. The line terminator is probably '\r\n'.

When you specify LINES TERMINATED BY '\n', the '\r' character is being included in the last field.

One way to see the difference between what is stored, and the literal is to use the HEX function. Something like this:

  SELECT HEX(q.po)         AS hex_po_col
       , HEX('240000388')  AS hex_literal
    FROM len_enq_spph q
   WHERE q.po = 240000388

The comparison to integer "works" to match the row because MySQL is doing an implicit conversion of the po column to numeric. MySQL does that by reading character by character until it encounters a character which makes the string not form a valid number.

As another test:

  SELECT HEX(q.po)         AS hex_po_col
    FROM len_enq_spph q
   WHERE q.po = '240000388\r'

It's also possible it's a different, or additional "hidden" characters loaded into the column.

  SELECT HEX(q.po)         AS hex_po_col
    FROM len_enq_spph q
   WHERE q.po LIKE '240000388%'

If you only want to store valid representations of numeric values, you could change the LOAD DATA statement to load the field into an intermediate user-defined variable, and then use a SET clause with an expression that does a conversion to numeric, to clean it up.

I'm just guessing at the other column names in the table, based on the first line of the csv file...

 LOAD DATA LOCAL INFILE 'c:/temp/SPPH.csv'
 INTO TABLE len_enq_spph
 FIELDS TERMINATED BY '|'
 LINES TERMINATED BY '\r\n'
 IGNORE 1 LINES
 ( `id`
 , `order_id`
 , `ext_ord_ref`
 , @order_id2_1
 )
 SET `po` = @order_id2_1 + 0