Round 1: Technical
✅ Databricks vs. Hadoop
📍 How is Databricks different from Hadoop?
✅ Scaling in Databricks
📍 What are the methods to scale in Databricks?
📍 How can you ensure autoscaling in a cluster?
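For the autoscaling question, here is a minimal sketch of a cluster spec. It assumes the Databricks Clusters API shape, in which an `autoscale` block with `min_workers` and `max_workers` takes the place of a fixed `num_workers`. The helper name `build_autoscaling_cluster`, the cluster name, and the runtime and node-type placeholders are all hypothetical.

```python
import json


def build_autoscaling_cluster(name, min_workers, max_workers):
    """Return a cluster spec dict that autoscales instead of using a fixed size."""
    if not 0 < min_workers <= max_workers:
        raise ValueError("need 0 < min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": "<runtime-version>",  # placeholder, fill in for your workspace
        "node_type_id": "<node-type>",         # placeholder, fill in for your cloud
        # With autoscale set, the worker count floats between these bounds
        # and no fixed num_workers is given.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
    }


spec = build_autoscaling_cluster("etl-cluster", 2, 8)
print(json.dumps(spec, indent=2))
```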
✅ Spark Architecture
📍 Provide a walkthrough of Spark architecture.
✅ Cluster Transformations
📍 What types of transformations can a cluster handle?
📍 Explain the concept of lazy evaluation of transformations.
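The idea behind the lazy-evaluation question can be sketched in plain Python. This is a toy stand-in, not Spark itself: the class `LazyDataset` and its methods are hypothetical. Transformations only record a plan, and nothing runs until an action (`collect` here) is called.

```python
class LazyDataset:
    """Toy lazily evaluated dataset: transformations are recorded, not run."""

    def __init__(self, source, plan=()):
        self._source = source
        self._plan = plan  # tuple of (kind, function) steps, not yet executed

    def map(self, fn):
        # Transformation: returns a new dataset with one more step in the plan.
        return LazyDataset(self._source, self._plan + (("map", fn),))

    def filter(self, pred):
        # Transformation: also only extends the plan.
        return LazyDataset(self._source, self._plan + (("filter", pred),))

    def collect(self):
        # Action: only now is the recorded plan applied to the data.
        rows = iter(self._source)
        for kind, fn in self._plan:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return list(rows)


calls = []


def double(x):
    calls.append(x)  # records when the function is actually invoked
    return x * 2


ds = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
ran_before_action = list(calls)  # still empty: no work has happened yet
result = ds.collect()            # triggers the whole plan at once
```

Because the full plan is known before anything executes, an engine built this way has the chance to reorder or combine steps before running them.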
✅ Adaptive Query Execution (AQE)
📍 What is AQE, and how does it enhance query performance?
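AQE re-optimizes a query plan at runtime using statistics collected at shuffle boundaries, for example by coalescing small shuffle partitions or splitting skewed join partitions. A minimal sketch of the related settings follows. The configuration keys are standard Spark SQL options, while the helper `apply_aqe_settings` is hypothetical and assumes an existing `spark` session object.

```python
# Standard Spark SQL configuration keys related to AQE.
AQE_SETTINGS = {
    "spark.sql.adaptive.enabled": "true",                     # master switch
    "spark.sql.adaptive.coalescePartitions.enabled": "true",  # merge small shuffle partitions
    "spark.sql.adaptive.skewJoin.enabled": "true",            # split skewed join partitions
}


def apply_aqe_settings(spark, settings=AQE_SETTINGS):
    """Apply each setting through spark.conf.set on an existing session."""
    for key, value in settings.items():
        spark.conf.set(key, value)
    return settings
```

In a notebook this would be called as `apply_aqe_settings(spark)`. Recent Spark versions enable AQE by default, so the call mainly makes the intent explicit.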
✅ Delta Live Tables (DLT)
📍 Explain DLT in Databricks.
📍 Provide the code flow for the DLT framework.
✅ Optimization in Databricks
📍 What are the key optimization techniques in Databricks?
✅ Pipeline Design
📍 What is your thought process for designing a pipeline where data arrives daily and historical data must be managed at the same time?
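One common way to reason about this question is a watermark: the pipeline remembers the last date it loaded and processes every daily partition after it, so the same logic serves both the daily run and a historical backfill. The sketch below assumes date-partitioned data; the function name `dates_to_process` and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def dates_to_process(last_loaded, run_date):
    """Return the daily partitions after the watermark, up to and including run_date."""
    days = (run_date - last_loaded).days
    return [last_loaded + timedelta(days=i) for i in range(1, days + 1)]


# Normal daily run: the watermark is yesterday, so one partition is due.
daily = dates_to_process(date(2024, 1, 9), date(2024, 1, 10))

# Backfill or catch-up after missed runs: an older watermark yields the whole gap.
backfill = dates_to_process(date(2024, 1, 6), date(2024, 1, 10))

# Re-running on the same day is a no-op, which keeps reruns safe.
rerun = dates_to_process(date(2024, 1, 10), date(2024, 1, 10))
```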
✅ Data Flow vs. Control Flow
📍 What is the difference between data flow and control flow?
✅ Triggers in Azure Data Factory (ADF)
📍 What are the types of triggers available in ADF?
Round 2: Technical
✅ Connecting to Different Sources in ADF
📍 What are the basic steps to establish a connection with different sources in ADF?
✅ Integration Runtime (IR)
📍 What is the role of the Integration Runtime in ADF?
✅ Linked Service
📍 What is a Linked Service in ADF?
📍 Can a single Linked Service be used to connect two different Salesforce instances via IR?
✅ Dynamic Scheduling
📍 When pulling multiple Excel files over SFTP without a schedule trigger, how can you set up a trigger that fires whenever a file arrives?
✅ Databricks vs. ADF for Data Ingestion
📍 Databricks and ADF both support data ingestion. Which is preferred and why?
✅ Large Data Ingestion
📍 If you need to ingest 10 TB of data from on-premises systems, which tool or approach would you prefer?
✅ Pipeline Troubleshooting
📍 A pipeline processes only 50 GB of data but takes over 6 hours. Where would you start troubleshooting as a data engineer?
Round 3: Technical
✅ PySpark and SQL
📍 Questions focused on PySpark and SQL problem-solving.
✅ Verdict: Selected