I have two DataFrames in Spark SQL (D1 and D2).
I am trying to inner join them [D1.join(D2, "some column")]
and get back only the columns of D1, not the complete joined data set.
Both D1 and D2 have the same columns.
Could someone please help me with this?
I am using Spark 1.6.
Let's say you want to join on the "id" column. Then you could write:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
d1.as("d1").join(d2.as("d2"), $"d1.id" === $"d2.id").select($"d1.*")
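For a way to try this end to end, here is a minimal sketch. It assumes a local Spark 1.6 setup, and the sample rows and the "value" column are made up for illustration, not taken from the question. It also shows a "leftsemi" join as a possible alternative. Note that a left semi join keeps each d1 row at most once, while the inner join repeats a d1 row for every matching d2 row, so the two only agree when "id" is unique in d2.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Assumption: local mode for testing; in spark-shell the existing context is reused.
val conf = new SparkConf().setAppName("join-keep-d1").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

// Hypothetical sample data: both DataFrames share the same column names.
val d1 = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "value")
val d2 = Seq((1, "x"), (3, "y"), (4, "z")).toDF("id", "value")

// Inner join on id, then keep only the columns that came from d1.
val innerD1 = d1.as("d1")
  .join(d2.as("d2"), $"d1.id" === $"d2.id")
  .select($"d1.*")

// Alternative: a left semi join returns only d1's columns by construction.
val semiD1 = d1.join(d2, d1("id") === d2("id"), "leftsemi")

innerD1.show()
semiD1.show()
```

With this sample data both results contain only the d1 rows whose id also appears in d2 (ids 1 and 3).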