ThoseKind ThoseKind - 4 months ago 9
MySQL Question

Determining Data Type From Value in Row of CSV File

I wrote a Python program that reads CSV files and spits them out as insert statements for MySQL. Now I want an additional program that is able to form the

CREATE
statements for each new table using the column headers. Due to MySQL formatting I need to be able to determine the type of each value in a column similar to the
type()
function in shell. My idea is to read the second row of a CSV file, after the column headers, and pull out values as separated by a
,
(i.e. one value from each column). Then take that value and run it through the
type()
function and return
int, str, float, etc.
. Later I will do some
if statements
with the value returned by
type()
to append the proper data type to the end of the column header for the
CREATE
statement.

So far I have managed to get as far as pulling each value out of a compiled list of the first row and printing the
type()
that it is. Only problem is that everything in a list is treated as a string so the only return is
<class 'str'>
. If I knew what type of value it was I would be able to convert from string to the correct type, but that defeats the whole purpose of the program. I am sure there is a better way to do it that doesn't spit the first row out as a list of strings, but I am not sure how to do that. Here is what I have so far as well as ideas with how to continue:

import csv, os

path = 'C:/Users/user/Desktop/file/test/'
for file in os.listdir(path):
if file.endswith('.csv'):
with open(path +file) as inFile:
with open(path + file[:-4] + ".txt", "w") as outFile:
csvFile = csv.reader(inFile)
columnHeader = next(csvFile)
firstRow = next(csvFile)
i = 0
for value in firstRow:
valueType = (type(value))
for header in columnHeader:
if valueType = class 'str': # Don't think this formatting is correct
columnHeader.append(' varchar (255)')
if valueType = class 'int':
columnHeader.append(' int')
if valueType = class 'float':
columnHeader.append(' float')


I have also looked into using this library but would prefer to do it myself.

Are the methods I chose to do this plausible/possible? Are there any better ways you can think of to do it? If so, how?

Answer

This would be the simple way to do it:

def find_type(a):
    try:
        var_type = type(int(a))
    except:
        try:
            var_type = type(float(a))
        except:
            var_type = type(a)
    return var_type

a = ['123123', '11.21', 'Some Bank', '11/2/1995']

print([find_type(x) for x in a])
#[<class 'int'>, <class 'float'>, <class 'str'>, <class 'str'>]

I am pretty sure that there are more elegant ways though.