CRT020: Databricks Spark Certification


Last week, I cleared my Spark Certification from Databricks with 91.3%. In this post, I’ll try to cover everything required to clear this exam: where to study, the exam pattern, what to study, how to study, and the syllabus.

Exam Pattern:

The exam consists of two sections: one with multiple-choice questions (MCQs) and one with coding questions. There are 20 MCQs and 19 coding questions. Some key points regarding the exam:

  • The exam costs $300.
  • Three hours are allowed to complete the test.
  • Only one washroom break is allowed.
  • A good, continuous Internet connection is required.
  • A minimum score of 70% is required to clear the exam.
  • Links allowed to be open during the exam: Scala Doc, Spark Doc, and Databricks Doc.
  • Don’t panic during the exam; attend it with a cool head.
  • The result is declared within 48 hours, and the certificate is made available within a week.

Where to Study:

The foremost question that comes to mind is where exactly to study for it. The answer is not straightforward, because plenty of resources are available. Here is what I referred to for this exam:

  1. Spark Book: Spark Definitive Guide
  2. Spark Documentation
  3. Databricks Documentation

With these resources, I believe you can clear the exam comfortably. Without wasting another minute, let’s see what we need to study for this exam.

What to Study:

With Spark 2.x, the focus is more on the Structured APIs: DataFrames and Datasets. So, a considerable amount of time needs to be given to these APIs. They also expect you to be familiar with topics like Spark architecture and Spark tuning. Since RDDs are de-emphasized going into Spark 3.x, less time should be spent on them, as their syllabus also indicates. So, here are the topics one should focus on, in priority order:

  • DataFrames/Datasets
    For this, one can refer to Chapters 4 to 11 of the Spark Definitive Guide. You should practice each and every example given to get a good grip on the APIs. Although the API docs will be accessible during the exam, searching for everything will consume time. This part is very important, as it comprises the majority of the syllabus.
  • Spark Architecture and Key Concepts
    For this, refer to Chapters 1 to 3 and 15 to 19 of the Spark Definitive Guide. You should understand concepts like jobs, stages, and tasks, and also be familiar with Spark deployment, cluster configuration, tuning, etc. The majority of this part is asked in the MCQs.

How to Study:

Tips on how to study for the exam:

  • Book your exam by paying the fee; this will make you super serious.
  • Practice most, if not all, of the DataFrame/Dataset APIs.
  • Hands-on familiarity with the APIs will save you time in the exam.
  • Try to learn with the help of a case study; otherwise, it might feel boring at times. You can refer to the case study used in the Definitive Guide.


So, it’s a good exam that will test your knowledge of Spark’s Structured APIs, its architecture, configuration, etc.

I hope I covered all the required details and cleared up any confusion regarding the exam. If you still have any doubts, I’ll be more than happy to answer them in the comments.




Written by 

Ayush is a Software Consultant with more than 11 months of experience. He has knowledge of various programming languages like C, C++, Java, Scala, and JavaScript, and is currently working on Big Data technologies like Spark, Kafka, and ElasticSearch. He is always eager to learn new and advanced concepts in order to expand his horizons and apply them, along with his existing knowledge, in project development. His hobbies include playing cricket, travelling, and watching movies.

3 thoughts on “CRT020: Databricks Spark Certification”

  1. Congratulations on your certification. Have you gone through any video learning course or material that you can suggest? I am also interested in the certification.

    1. I haven’t gone through any video course, but you can always refer to the Databricks Spark Summit videos; they are quite good.

  2. For the coding challenges, do we need to write our code in a Databricks notebook, or will it be like: for the below code, what is the expected output? Refer to question 8 of the below link for a better understanding of my question.

    In conclusion, could you please shed some more light on the coding challenge and how to complete it with respect to timing and the notebook experience?
