Avijit Avijit - 1 year ago 127
Scala Question

Join Dataframes in Spark

I have joined two DataFrames in Spark, expDataFrame and accountList, using the code below:


val expDetails = expDataFrame.as("fex").join(accountList.as("acctlist"),$"fex.acct_id" === $"acctlist.acct_id", "inner")


Now I am trying to show the acct_id column from both DataFrames, using this code:

expDetails.select($"fex.acct_id",$"acctlist.acct_id").show


but the output shows the same column name, acct_id, twice.

I want two unique column names, such as fex_acct_id and acctlist_acct_id, so I can tell which DataFrame each column came from.

Answer Source

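The DataFrame aliases (fex, acctlist) only qualify the column references; each output column is still named acct_id. To get distinct names, add an alias to each selected column using the as or alias method: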

expDetails.select(
  $"fex.acct_id".as("fex_acct_id"),
  $"acctlist.acct_id".as("acctlist_acct_id")
).show
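
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes Spark SQL is on the classpath and runs with a local master. The object name JoinAliasExample, the helper joinedWithAliases, and the sample rows and extra columns (amount, acct_type) are made up for illustration; only the DataFrame names, the aliases, and the acct_id column come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object JoinAliasExample {

  // Builds two small DataFrames (hypothetical sample data), joins them on
  // acct_id, and returns both key columns under distinct names.
  def joinedWithAliases(spark: SparkSession): DataFrame = {
    import spark.implicits._ // enables toDF and the $"..." column syntax

    // Assumed sample data; the real DataFrames come from the asker's job.
    val expDataFrame = Seq((1, 100.0), (2, 250.0)).toDF("acct_id", "amount")
    val accountList  = Seq((1, "savings"), (3, "checking")).toDF("acct_id", "acct_type")

    // Same join as in the question: DataFrame-level aliases let us
    // refer to each side's acct_id without ambiguity.
    val expDetails = expDataFrame.as("fex")
      .join(accountList.as("acctlist"), $"fex.acct_id" === $"acctlist.acct_id", "inner")

    // Column-level aliases give the output columns distinct names.
    expDetails.select(
      $"fex.acct_id".as("fex_acct_id"),
      $"acctlist.acct_id".as("acctlist_acct_id")
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("join-alias-sketch")
      .master("local[*]") // assumption: local run for demonstration only
      .getOrCreate()

    joinedWithAliases(spark).show()
    spark.stop()
  }
}
```

With this sample data only acct_id 1 appears on both sides, so the inner join returns a single row with the columns fex_acct_id and acctlist_acct_id.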