Databricks (DE)
Databricks is a cloud-based Software-as-a-Service (SaaS) data and analytics platform available on the major cloud providers (AWS, Azure, GCP).
The Databricks Lakehouse Platform enables organizations to:
- Ingest, process and transform massive volumes and varied types of data
- Explore data through data science techniques, including ML
- Provide data engineers, data scientists and data analysts with their own specialized tools and a common collaboration space
- Guarantee that the data available for business queries is reliable and up to date
Attribute Name | Description | SFIA Skills |
---|---|---|
Databricks (DE) - Lvl 1 | A consultant at this level understands and can clearly explain what the Databricks Lakehouse Platform is (at a high level), what Delta Lake is (general concepts), and what the Databricks SQL and Databricks Machine Learning (ML) workspaces are. Able to navigate between and within workspaces and perform basic operations on existing data sets. Can support existing routines with guidance and requires support in unfamiliar circumstances. | DENG |
Databricks (DE) - Lvl 2 | A consultant at this level should be able to work comfortably with databases, tables and views on Databricks. Using basic Databricks SQL, is comfortable creating database objects, writing data into tables, cleaning data, and combining and reshaping tables in the SQL Analytics workspace. Familiar with basic Python and able to perform basic data manipulations using PySpark in the Data Engineering workspace. | DENG |
Databricks (DE) - Lvl 3 | A consultant at this level is comfortable ingesting data into the Databricks platform, integrating BI tools with Databricks, and creating data visualizations on Databricks. Comfortable with the theory and implementation of the multi-hop architecture (bronze-silver-gold), and able to work freely with Delta Live Tables (DLT) and DLT pipelines. | DENG |
Databricks (DE) - Lvl 4 | A consultant at this level should have deep knowledge and understanding of the Databricks Lakehouse architecture and its benefits. Familiar with the different cluster types and compute configurations. | DENG |
Databricks (DE) - Lvl 5 | A consultant at this level should be comfortable optimizing data storage and should understand Delta Lake transactions and their properties, features and limitations. Able to perform quality enforcement on streaming data and implement slowly changing dimensions (Type 1 or Type 2). Comfortable with securing data on the Databricks platform, including managing access to PII and implementing role-based access control (RBAC). Comfortable with monitoring, logging and error handling, as well as programmatic platform interactions (Databricks CLI, REST API). | DENG |
Databricks (DE) - Lvl 6 | A consultant at this level should be comfortable with cluster policies and with configuring Spark clusters for high performance. Comfortable with query optimization tasks, and able to identify and rectify the five most common performance issues in Spark applications: "Spill", "Skew", "Shuffle", "Storage" and "Serialization". With guidance from a Databricks architect, should be able to produce a cost-effective Databricks Lakehouse solution architecture diagram and implementation plan for a given environment. | DENG |
Databricks (DE) - Lvl 7 | A consultant at this level should be able to facilitate strategic and architectural conversations with a customer and should deeply understand the Databricks product, its use cases, benefits and limitations. Able to guide internal and external teams on any Databricks-related topic. Has achieved wide industry recognition (e.g. Databricks Solution Architect Champion or equivalent). | DENG |