HadoopExam Blogs

HadoopExam Learning Resources

Question 2: You have been given the following code written in Scala and Spark. Below is the content of the data.csv file:

IBM,101,20150112

Google,400,20150112

IBM,107,20150113

Apple,230,20150112

Now you have written the following code in the interactive shell:

val myRDD = sc.textFile("data.csv")

val splittedRDD = myRDD.map(_.split(","))

val value = splittedRDD.map(x => (x(0), 1)).XXXXX.collect()

Please replace XXXXX with the correct function, so that value holds the following output:

Array((IBM,2),(Google,1),(Apple,1))

1. reduceByKey((X,Y)=> X+Y)

2. reduce((X,Y)=> X+Y)

3. groupBy((X,Y)=> X+Y)

4. countBy((X,Y)=> X+Y)

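Correct Answer: 1

Explanation: reduceByKey(func) works on an RDD of (key, value) pairs and merges all values that share a key using func. Here every record is mapped to (name, 1), so adding the values for each key gives the number of records per name: IBM appears twice, Google and Apple once each. This is the same pattern as the classic word count example.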
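The per-key merge described above can be tried without a Spark installation. The following is only a local sketch using plain Scala collections. The object name ReduceByKeySketch and the helper reduceByKeyLocal are hypothetical names chosen for illustration and are not part of the Spark API. The helper imitates the result of reduceByKey on a single machine, not how Spark distributes the work.

```scala
// Local sketch of what reduceByKey computes, using plain Scala collections
// instead of an RDD. reduceByKeyLocal is a hypothetical helper, not Spark API.
object ReduceByKeySketch {
  // Merge all values that share a key using the function f.
  def reduceByKeyLocal[K, V](pairs: Seq[(K, V)])(f: (V, V) => V): Map[K, V] =
    pairs.foldLeft(Map.empty[K, V]) { case (acc, (k, v)) =>
      // First value for a key is stored as-is; later values are merged with f.
      acc.updated(k, acc.get(k).map(f(_, v)).getOrElse(v))
    }

  // The four records from the question.
  val lines: Seq[String] = Seq(
    "IBM,101,20150112",
    "Google,400,20150112",
    "IBM,107,20150113",
    "Apple,230,20150112"
  )

  // Same steps as the shell session: split, map to (name, 1), sum per key.
  val counts: Map[String, Int] =
    reduceByKeyLocal(lines.map(_.split(",")).map(x => (x(0), 1)))((x, y) => x + y)

  def main(args: Array[String]): Unit =
    println(counts)
}
```

In the Spark shell itself, the corresponding line would be splittedRDD.map(x => (x(0), 1)).reduceByKey((x, y) => x + y).collect(). Note that the order of the pairs returned by collect() depends on partitioning, so it may differ from the order shown in the question.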

