Avijit Avijit - 3 months ago 35
Scala Question

Join Dataframes in Spark

I have joined two DataFrames in Spark, expDataFrame and accountList, using the code below:


val expDetails = expDataFrame.as("fex").join(accountList.as("acctlist"),$"fex.acct_id" === $"acctlist.acct_id", "inner")


Now I am trying to show the acct_id column from both DataFrames.

I used the following code:

expDetails.select($"fex.acct_id",$"acctlist.acct_id").show


but I get the same column name, acct_id, twice.

I want two distinct column names, such as fex_acct_id and acctlist_acct_id, so I can tell which DataFrame each column came from.

Answer

You simply have to add an alias to each column using the as or alias method. This will do the job:

expDetails.select(
  $"fex.acct_id".as("fex_acct_id"),
  // the DataFrame alias from the join is "acctlist", so qualify the column with it
  $"acctlist.acct_id".as("acctlist_acct_id")
).show
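
For anyone who wants to try this end to end, here is a minimal sketch of the same idea as a standalone program. The sample rows, the amount and acct_type columns, the local SparkSession settings, and the object and method names are all assumptions made up for illustration; only the DataFrame names, the aliases, and acct_id come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object AliasJoinColumns {

  // Builds two small DataFrames, joins them on acct_id, and returns
  // both key columns under distinct names.
  def renamedKeys(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Hypothetical sample data; only the acct_id column comes from the question.
    val expDataFrame = Seq((1, 100.0), (2, 250.0)).toDF("acct_id", "amount")
    val accountList  = Seq((1, "savings"), (3, "checking")).toDF("acct_id", "acct_type")

    // Same join as in the question: each side gets an alias so its columns
    // can be referenced unambiguously afterwards.
    val expDetails = expDataFrame.as("fex")
      .join(accountList.as("acctlist"), $"fex.acct_id" === $"acctlist.acct_id", "inner")

    // Qualify each column with its DataFrame alias, then rename it.
    expDetails.select(
      $"fex.acct_id".as("fex_acct_id"),
      $"acctlist.acct_id".as("acctlist_acct_id")
    )
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only.
    val spark = SparkSession.builder()
      .appName("alias-join-columns")
      .master("local[*]")
      .getOrCreate()

    val result = renamedKeys(spark)
    result.printSchema() // lists fex_acct_id and acctlist_acct_id as separate columns
    result.show()

    spark.stop()
  }
}
```

Note that the prefix in $"fex.acct_id" and $"acctlist.acct_id" is the alias given to each DataFrame with as(...) in the join, not the column name. Column.alias behaves the same as Column.as here, so either spelling works.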