Last week, I cleared the Spark Certification from Databricks with 91.3%. Here is the link to the exam. In this post, I’ll cover everything you need to clear this exam: the exam pattern, where to study, what to study, how to study, and the syllabus.
The exam has two sections: one with multiple-choice questions (MCQs) and one with coding questions. There are 20 MCQs and 19 coding questions. Some key points regarding the exam:
- The exam costs $300.
- You get 3 hours to complete the test.
- Only one washroom break is allowed.
- A stable, uninterrupted Internet connection is required.
- A minimum of 70% is required to clear the exam.
- Links you may open during the exam: Scala Doc, Spark Doc, and Databricks Doc.
- Don’t panic; take the exam with a cool head.
- The result is declared within 48 hours, and the certificate is made available within 1 week.
Where to Study:
The first question that comes to mind is where exactly to study for the exam. The answer is not that straightforward, because plenty of resources are available. Here are the ones I referred to for this exam:
I think these are enough to clear the exam comfortably. Without wasting any time, let’s see what we need to study for this exam.
What to Study:
With Spark 2.X, the focus is on Structured APIs like DataFrames and Datasets, so a considerable amount of time needs to be given to these APIs. You are also expected to be familiar with topics like Spark architecture and Spark tuning. RDDs carry much less weight in the syllabus, so spend less time on them. Here are the topics to focus on, in priority order:
- Structured APIs (DataFrames and Datasets)
For this, refer to Chapters 4 to 11 of Spark Definitive Guide. Practice every example in these chapters to get a good grip on the APIs. Although the API docs are accessible during the exam, looking everything up will consume time. This part is very important, as it makes up the majority of the syllabus.
- Spark Architecture and Key Concepts
For this, refer to Chapters 1 to 3 and 15 to 19 of Spark Definitive Guide. You should understand concepts like jobs, stages, and tasks, and be familiar with Spark deployment, cluster configuration, and tuning. Most of the MCQs come from this part.
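To make the jobs, stages, and tasks idea a bit more concrete, here is a small sketch you can try in a local session or notebook. It is my own illustration, not an exam question: the app name, the `local[4]` master, the partition counts, and the sample range are all assumptions chosen for the demo. The exact number of stages and tasks you see in the Spark UI depends on your Spark version and configuration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical local session; in spark-shell or a notebook, getOrCreate()
// simply returns the session that already exists.
val spark = SparkSession.builder()
  .appName("jobs-stages-tasks")
  .master("local[4]")
  .getOrCreate()

// Assumed value, picked only for this demo: partitions used after a shuffle.
spark.conf.set("spark.sql.shuffle.partitions", "4")

// Transformations are lazy: nothing is executed by these two lines.
val numbers = spark.range(0L, 1000L, 1L, 8) // 8 input partitions
val buckets = numbers.groupBy((col("id") % 10).as("bucket")).count()

// Print the physical plan. The Exchange operator marks the shuffle,
// which is where Spark splits the work into separate stages.
buckets.explain()

// An action triggers a job; each stage runs roughly one task per partition.
val result = buckets.collect()
println(s"input partitions: ${numbers.rdd.getNumPartitions}, result rows: ${result.length}")
```

After running it, open the Spark UI and compare the job, its stages, and the task counts with what the plan suggested. Changing the partition numbers and re-running is a quick way to build intuition for the MCQs.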
How to Study:
Tips on how to study for the exam:
- Book your exam and pay the fee first, as this will make you super serious.
- Practice nearly all of the DataFrame/Dataset APIs.
- Hands-on familiarity with the APIs will save you time in the exam.
- Try to learn with the help of a case study; otherwise, it can get boring. You can use the case study from the Definitive Guide book.
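As an example of the kind of hands-on practice suggested in the tips above, here is a minimal sketch that touches a few common DataFrame/Dataset operations. The `Sale` case class, the column names, and the sample rows are made up for illustration; they are not from the book’s case study or from the exam.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, col, count, desc}

// Hypothetical record type used only for this practice example.
case class Sale(country: String, item: String, quantity: Int, unitPrice: Double)

// Hypothetical local session; in spark-shell or a notebook, getOrCreate()
// simply returns the session that already exists.
val spark = SparkSession.builder()
  .appName("dataframe-practice")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Made-up sample data as a typed Dataset.
val salesDs = Seq(
  Sale("UK", "widget", 3, 2.50),
  Sale("UK", "gadget", 1, 10.00),
  Sale("IN", "widget", 5, 2.50),
  Sale("IN", "gadget", 2, 10.00)
).toDS()

// Typed (Dataset) style: filter with a plain Scala lambda.
val bulkSales = salesDs.filter(_.quantity > 2)

// Untyped (DataFrame) style: derive a column, then filter on it.
val withTotal = salesDs.toDF().withColumn("total", col("quantity") * col("unitPrice"))
val bigOrders = withTotal.filter(col("total") >= 10.0)

// Grouping and aggregation, followed by a sort.
val perCountry = withTotal
  .groupBy("country")
  .agg(count("item").as("orders"), avg("total").as("avgTotal"))
  .orderBy(desc("avgTotal"))

bulkSales.show()
bigOrders.show()
perCountry.show()
```

Try rewriting each step without looking at the docs, for example swapping `filter` for `where` or adding a `join`, since that is roughly the fluency the coding questions reward.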
So, it’s a good exam that will test your knowledge of Spark Structured APIs, its architecture, configuration, etc.
I hope I covered all the required details and cleared up any confusion regarding the exam. If you still have any doubts, I’ll be more than happy to answer them in the comments.