HadoopExam Blogs

HadoopExam Learning Resources

Question 4: What is an RDD?

1.  When you start a cluster, it creates a pool of RDDs to store data in memory during application execution.

2.  It is a pool of predefined storage, such as 64 MB or 128 MB, that is created lazily whenever applications are submitted.

3.  It is a collection of objects that is distributed across the nodes of a cluster and created by the application code itself.

4.  It is a container for your application's source code and is distributed to all the nodes of the cluster.

Correct Answer: 3

Exp: An RDD (Resilient Distributed Dataset) is the primary abstraction in Apache Spark. It represents a collection of objects that is distributed across the nodes of a cluster. Most of the computations and operations you perform in Spark are carried out on RDDs.
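To make the idea in answer 3 concrete, here is a minimal sketch in plain Python. It is a toy, single-process model, not Spark itself: the class name `ToyRDD` and its internals are hypothetical and exist only for illustration. Each inner list stands in for the slice of data that one node would hold. In real Spark the corresponding calls would be `SparkContext.parallelize`, `RDD.map`, and `RDD.collect`, and transformations would be evaluated lazily rather than eagerly as they are here.

```python
from typing import Callable, List


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical; not Spark's API or implementation)."""

    def __init__(self, partitions: List[list]):
        # Each inner list models the part of the collection held by one node.
        self.partitions = partitions

    @classmethod
    def parallelize(cls, data: list, num_partitions: int) -> "ToyRDD":
        # The application code itself creates the distributed collection
        # by splitting a local list into contiguous chunks.
        chunk = max(1, -(-len(data) // num_partitions))  # ceiling division
        parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]
        return cls(parts)

    def map(self, func: Callable) -> "ToyRDD":
        # Apply func independently to every partition, as each node would.
        # Assumption: done eagerly here for simplicity; Spark defers this work.
        return ToyRDD([[func(x) for x in part] for part in self.partitions])

    def collect(self) -> list:
        # Gather the elements of all partitions back into one local list.
        return [x for part in self.partitions for x in part]


numbers = ToyRDD.parallelize([1, 2, 3, 4, 5, 6], num_partitions=3)
squares = numbers.map(lambda x: x * x)
result = squares.collect()

print(numbers.partitions)  # the data as split across three modelled nodes
print(result)              # the squared values gathered back together
```

The point of the sketch is that the collection is created by the application (the call to `parallelize`), not pre-allocated by the cluster, and that operations such as `map` run on each partition separately.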


