Dinosaurius - 2 years ago 83

Python Question

I have this user-defined function in python Spark:

`result = udf(lambda num1, num2: (num1 - num2) / math.sqrt(1-(num1/num2)), FloatType())`

I want to add a check on `num1/num2`: if it is greater than 1, `num1` should be set to `num2`:

    if (num1/num2 > 1):
        num1 = num2

How can I add this simple check into the `udf`?

I tried this, but it seems to fail:

    def calculate(num1, num2):
        if (num1/num2 > 1):
            num1 = num2
        result = (num1 - num2) / math.sqrt(1-(num1/num2))
        return result

    calc_z = udf(lambda num1, num2: calculate, FloatType())


Answer Source

You can use the `when` function to get the result, as below:

```
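from pyspark.sql.functions import col, when
# Use num2 in place of num1 wherever num1 / num2 exceeds 1
df.withColumn("result", when(col("num1") / col("num2") > 1, col("num2"))
  .otherwise(col("num1")))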
```

It's generally better to use the built-in functions rather than a UDF, since Spark can optimize them and a Python UDF adds serialization overhead.

If you still want to use a UDF, pass `calculate` itself to `udf`; the lambda in your attempt returns the function object instead of calling it:

```
calc_z = udf(calculate, FloatType())
```
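One thing to watch for with the check from the question: once `num1` is set to `num2`, the formula becomes 0 divided by `math.sqrt(0)`, which raises `ZeroDivisionError` inside a Python UDF. Below is a minimal sketch of a guarded `calculate`. Returning `None` (a null in the result column) for the undefined cases is an assumption about the desired behavior, and `make_calc_z` is a hypothetical helper name used only so the plain function can be tried without Spark.

```python
import math


def calculate(num1, num2):
    """Apply the question's check, then its formula, returning None when undefined."""
    # Nulls arrive as None inside a Python UDF; a zero num2 would raise on division.
    if num1 is None or num2 is None or num2 == 0:
        return None
    # The check from the question: never let num1 / num2 exceed 1.
    if num1 / num2 > 1:
        num1 = num2
    denom = 1 - (num1 / num2)
    # denom is 0 whenever num1 == num2, which includes every row changed above.
    # Assumption: return None there instead of raising ZeroDivisionError.
    if denom <= 0:
        return None
    return float((num1 - num2) / math.sqrt(denom))


def make_calc_z():
    """Hypothetical helper: wrap calculate as a Spark UDF, as in the answer."""
    # Imported lazily so calculate() can be exercised without Spark installed.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import FloatType
    return udf(calculate, FloatType())
```

With this version, `calculate(3.0, 4.0)` evaluates the formula normally, while `calculate(5.0, 4.0)` yields `None` because the check makes the denominator zero. If a different fallback (such as `0.0`) fits your data better, only the two `return None` lines need to change.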

Hope this helps!
