Dinosaurius - 1 year ago
Python Question

How to add if-then expression into user-defined function?

I have this user-defined function in python Spark:

result = udf(lambda num1, num2: (num1 - num2) / math.sqrt(1-(num1/num2)), FloatType())


I want to add a check that num1/num2 is greater than 1. If it's lower than 1, then num1 should be equal to num2.

if (num1/num2 > 1):
    num1 = num2


How can I add this simple check to the udf expression?

I tried this, but it seems to fail:

def calculate(num1, num2):
    if (num1/num2 > 1):
        num1 = num2
    result = (num1 - num2) / math.sqrt(1-(num1/num2))
    return result

calc_z = udf(lambda num1, num2: calculate, FloatType())

Answer Source

You can use the when function to get the result, as below:

from pyspark.sql.functions import col, when

df.withColumn("result", when((col("num1") / col("num2")) < 1, col("num2"))
  .otherwise(col("num1")))

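It's generally better to use the built-in functions than a UDF, because Spark can optimize built-in column expressions, while a Python UDF is opaque to the optimizer.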

If you still want to use a UDF, pass the function itself to udf instead of wrapping it in a lambda that returns the function without calling it:

calc_z = udf(calculate, FloatType())
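For reference, here is a minimal sketch of what a more defensive calculate might look like, in plain Python so it can be tried without a Spark session. Note that once num1 is set to num2, the formula from the question becomes 0 divided by sqrt(0), so something has to be returned for that case. Returning None there (and for null inputs or a zero num2) is an assumption on my part, not something the question specifies; in Spark a None returned from a UDF shows up as null.

```python
import math


def calculate(num1, num2):
    # Assumed policy: null inputs or a zero denominator give None (null in Spark).
    if num1 is None or num2 is None or num2 == 0:
        return None
    # The check from the question: cap num1 at num2 when the ratio exceeds 1.
    if num1 / num2 > 1:
        num1 = num2
    denominator = math.sqrt(1 - (num1 / num2))
    # After capping, the ratio is exactly 1 and the denominator is 0.
    # Assumed policy: return None instead of raising ZeroDivisionError.
    if denominator == 0:
        return None
    return float((num1 - num2) / denominator)


# The lambda from the question returns the function object itself
# instead of calling it, which is why that version fails.
broken = lambda num1, num2: calculate
print(broken(3, 4) is calculate)
print(calculate(3, 4))
```

Registered as calc_z = udf(calculate, FloatType()), this could then be applied with something like df.withColumn("result", calc_z("num1", "num2")), assuming the DataFrame has columns named num1 and num2.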

Hope this helps!
