Newb101 - 3 months ago
Scala Question

Apache spark querying columnar data

I have been looking at Apache Spark 2.0 and trying to use Spark SQL to process some data in text files. The data is structured as follows:

1
Bob
London
2014

2
Robert
Paris
2016

3
Sam
Rome
2011

How would I run Spark SQL queries on data structured like this?

I also realize Spark provides various options for reading data sources:

spark.read.
csv, jdbc, load, options, parquet, table, textFile,
format, json, option, orc, schema, text


Could any of these be used?
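As a side note on the `spark.read` list: from Spark 2.3 onward the plain-text reader accepts a `lineSep` option, which could split the file on blank lines directly. That option does not exist in the 2.0 release this question targets, so treat the following as a sketch for newer versions; the local `split` shows what each resulting row would contain.

```scala
object LineSepSketch {
  // Records in the sample file are separated by a blank line, i.e. "\n\n".
  val separator = "\n\n"

  // Local equivalent of the split, to show one multi-line record per element.
  val sample = "1\nBob\nLondon\n2014\n\n2\nRobert\nParis\n2016"
  val records: Array[String] = sample.split(separator)

  // With Spark >= 2.3 (not available in 2.0), the reader form would be:
  // val df = spark.read.option("lineSep", "\n\n").text("File.txt")
  // df.show(truncate = false)  // one blank-line-delimited record per row
}
```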

Answer

This solved the task for me:

// Treat a blank line ("\n\n") as the record delimiter so that each
// four-line block in the file becomes a single RDD element.
spark.sparkContext.hadoopConfiguration.set("textinputformat.record.delimiter", "\n\n")
val records = spark.sparkContext.textFile("File.txt")
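With that delimiter set, each RDD element is one multi-line record; to actually run Spark SQL on it you still need to split the fields and give them a schema. A minimal sketch, assuming the fields are id, name, city, and year in that order (names are my guess from the sample data, not from the original post):

```scala
// Hypothetical schema inferred from the sample records.
case class Person(id: Int, name: String, city: String, year: Int)

// Split one blank-line-delimited record into its four fields.
def parseRecord(record: String): Person = {
  val fields = record.trim.split("\n")
  Person(fields(0).trim.toInt, fields(1).trim, fields(2).trim, fields(3).trim.toInt)
}

// With a SparkSession named `spark` in scope, the parsed RDD becomes
// a temp view that Spark SQL can query:
// import spark.implicits._
// val people = spark.sparkContext.textFile("File.txt").map(parseRecord).toDF()
// people.createOrReplaceTempView("people")
// spark.sql("SELECT name FROM people WHERE year > 2013").show()
```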