aola - 5 months ago
Scala Question

Joining DataFrames in Spark

I would like to join two DataFrames, edges and selectedComponent, on two keys. I select the component with


val selectedComponent = hiveContext.sql(s"""select * from $tableWithComponents
|where component=$component""".stripMargin)

but I do not want to do the join with raw SQL like this:

val theSelectedComponentEdges = hiveContext.sql(
s"""select * from $tableWithComponents a join $edges b where ( or""")

Instead I want to use the join function:

edges.join(selectedComponent, edges("src")===selectedComponent("id"))

but I am not sure how I am supposed to express the "or" here.

Can anyone help me? :-)

Answer:

edges.join(selectedComponent,
  (edges("src") === selectedComponent("id")) || (edges("dst") === selectedComponent("id")))
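The key point is that === on a Column returns another Column, and Columns support the boolean operators || and &&, so the whole OR condition can be passed as the join expression. Note the explicit parentheses: in Scala, || binds more tightly than ===, so without them the expression would not parse as intended. A minimal self-contained sketch, using the newer SparkSession API and made-up sample data in place of the real edges and component tables:

```scala
import org.apache.spark.sql.SparkSession

object OrJoinExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("or-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-ins for the real edges / selectedComponent tables
    val edges = Seq((1, 2), (3, 4)).toDF("src", "dst")
    val selectedComponent = Seq(2, 3).toDF("id")

    // Keep an edge if EITHER endpoint belongs to the component.
    // Parentheses are required: || has higher precedence than === in Scala.
    val joined = edges.join(
      selectedComponent,
      (edges("src") === selectedComponent("id")) ||
        (edges("dst") === selectedComponent("id")))

    joined.show()
    spark.stop()
  }
}
```

With the sample data above, each edge matches exactly one component id (edge (1,2) via dst=2, edge (3,4) via src=3), so the join yields two rows. Be aware that an OR join condition generally prevents Spark from using an equi-join strategy, so it can be slow on large tables.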