Zack Zack - 2 months ago 13
SQL Question

Scan and find data type

Can we scan a table and find data type for it because when we load csv file into a table using import export wizard it defaults even a numeric column to varchar can we scan the table after import and find the datatype which could be ideal?

For example consider a csv file A.csv it contains

ColumnA ColumnB ColumnC ColumnD ColumnE
1234 xyz 123.4 1 abc123
4258 acv 785.6 0 abs58
785 fgf 879.6 1 ftrd15
448 wsd 87878.6 0 wewe
78528 ews 6968.2 1 awaq
525554 qwe 2.3 0 afgd87


so when I load this using import export wizard all columns are varchar but I need to scan the table and then find data type. I am not concerned about loading the data with correct data types initially I only care after I load the data into the table and then scan the table to find the data type.

Answer

It looks like you'd be dealing with 6 main datatypes.

  1. DATETIME
  2. INTEGER
  3. DECIMAL
  4. CHARACTER
  5. VARCHAR
  6. BIT

So one way to do this without explicitly trying to convert each one and catch an error, since you can't use TRY_CONVERT in 2008, is by using ISDATE, ISNUMERIC, and CHARINDEX. For each column, you can do something like this. Of course, you could do this in a cursor for each column, or just copy the case statement a few times, or with a cross join.

SELECT DISTINCT
'ColumnA' as ColumnName,
CASE
   WHEN ISNUMERIC(ColumnA) = 1 AND LEN(ColumnA) = 1 AND ColumnA NOT LIKE '%[2-9]%' THEN 'Bit',
   WHEN ISNUMERIC(ColumnA) = 1 AND CHARINDEX('.',ColumnA) > 0 THEN 'Decimal'
   WHEN ISNUMERIC(ColumnA) = 1 AND CHARINDEX('.',ColumnA) = 0 THEN 'Integer'
   WHEN ISDATE(ColumnA) = 1 THEN 'Date'
   WHEN LEN(ColumnA) = 1 AND ColumnA LIKE '%[a-z]%' THEN 'Character'
   ELSE 'VARCHAR'
END AS DataTypeCheck
FROM YourTable

It's not perfect since we aren't checking all data types, but it should get you started at least. You can add some more LEN() functions to figure out what you want to set your DECIMAL length and precision to, as well as your VARCHAR() length. However, there is no way of knowing if a follow on insert would cause binary data to be truncated... since the values are unknown. So you'd just need to set these field lengths to be large enough to accept any follow on input. Also this will give you all possible data types for that column. so if you have 12 and 12.34 it will return INT and DECIMAL, in which you should choose DECIMAL. This can be taken care of in a follow on query if need be.