Scala Question

How can I perform the equivalent of an UPDATE ... WHERE statement in Spark SQL 2.0?

How can I implement an SQL query like the following in Spark SQL 2.0, using DataFrames and the Scala language? I've read a lot of posts, but none of them seems to achieve what I need (if you can point me to one that does, that would help). Here's the problem:

UPDATE table SET value = 100 WHERE id = 2
UPDATE table SET value = 70 WHERE id = 4
.....


Suppose that you have a table `table` with two columns like this:

id | value
--- | ---
1 | 1
2 | null
3 | 3
4 | null
5 | 5


Is there a way to implement the above query using map, match cases, a UDF, or if-else statements? The values that I need to store in the `value` field are not sequential, so I have specific values to put there. I'm also aware that it is not possible to modify immutable data when dealing with DataFrames. I have no code to share because I can't get it to work, nor reproduce any errors.

Answer

Yes you can, and it's very simple: use `when` and `otherwise`.

val pf = df.select($"id", when($"id" === 2, lit(100)).otherwise(when($"id" === 4, lit(70)).otherwise($"value")).as("value"))