TJ72 - 1 year ago
Python Question

Apache Zeppelin Python UDF into SQL

I am trying to use a function that I wrote in Python to add a new column to an SQL table. I can't figure out how to apply that function to the table as a UDF. I believe withColumn is the way to do this, but I don't know how to use it.

The goal is to read day/week/year from the SQL table and calculate the month from those values. The function below works if I set day/week/year to fixed values.

Here is the Function:

def getmonth(day,week,year):
    x = datetime.strptime('{}_{}_{}{}'.format(day,week,year,-0), '%d_%W_%Y%w')
    month = x.strftime('%m')
udf(getmonth)
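
Before wiring the function into Spark, the date arithmetic can be checked in plain Python. The following is a minimal sketch that reuses the format string from the function above and adds the return statement it lacks. The name `month_from_week` is hypothetical, and the hard-coded rows are copied from the sample table in this question.

```python
from datetime import datetime

def month_from_week(day, week, year):
    """Return the month number (1-12) for the given day/week/year values.

    Hypothetical helper: same format string as getmonth, with the weekday
    field (%w) fixed at 0 and %W counting weeks that start on Monday.
    """
    parsed = datetime.strptime('{}_{}_{}{}'.format(day, week, year, 0), '%d_%W_%Y%w')
    return parsed.month

# Sample rows from the question's table: (day, week, year)
rows = [(2, 42, 2017), (3, 2, 2011), (1, 14, 2005)]
months = [month_from_week(d, w, y) for d, w, y in rows]
for row, m in zip(rows, months):
    print(row, '->', m)
```

Returning `parsed.month` yields an integer directly, which is convenient if the UDF is later declared with an integer return type.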


The SQL...

DriveConfig = sqlContext.sql("""
SELECT
daymade as day,
weekmade as week,
yearmade as year
FROM datatable2 """)


This is roughly what my table looks like. I want to add a month column between week and year.

day week year
2 42 2017
3 2 2011
1 14 2005
...

Answer Source
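from datetime import datetime
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

def getmonth(day, week, year):
    # The trailing 0 fills the %w (weekday) field, as in the question's format string.
    x = datetime.strptime('{}_{}_{}{}'.format(day, week, year, 0), '%d_%W_%Y%w')
    # Return a plain int so the value matches the IntegerType declared below.
    return int(x.strftime('%m'))

# Wrap the Python function as a Spark UDF that returns an integer column.
month = udf(getmonth, IntegerType())

# Apply the UDF to the existing columns; withColumn appends "month" as a new column.
DriveConfig = DriveConfig.withColumn("month", month(DriveConfig.day, DriveConfig.week, DriveConfig.year))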
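
Because withColumn appends the new column at the end, the result has the order day, week, year, month, whereas the question asks for month to sit between week and year. One way to get that order is to build the column list and pass it to select. The sketch below is plain Python so it can be tried outside Zeppelin. `move_column` is a hypothetical helper, and the column list is assumed to match `DriveConfig.columns` after the withColumn call.

```python
def move_column(columns, name, after):
    """Return a copy of `columns` with `name` placed right after `after`."""
    reordered = [c for c in columns if c != name]
    reordered.insert(reordered.index(after) + 1, name)
    return reordered

# Assumed column order after withColumn has appended "month"
columns = ["day", "week", "year", "month"]
ordered = move_column(columns, "month", after="week")
print(ordered)
```

In the notebook, the reordered list would then be applied with `DriveConfig = DriveConfig.select(*ordered)`.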