Round 1: Technical
Introduction and Project Overview
🔹Introduce yourself and go over all your completed projects, detailing the technologies used.
Project-Specific Questions
🔹Follow-up questions related to the specifics and challenges of projects.
SQL Questions
🔹Given an employee table with columns name, dept, and salary, find the 2nd highest salary for each department.
🔹As I solved this using the DENSE_RANK function, the interviewer asked why I hadn’t used ROW_NUMBER instead.
🔹Given a transaction table with columns trans_id, trans_date, and trans_amt, create a new column showing the cumulative amount for each month.
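One possible way to answer the two window-function questions above, sketched with Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The table names `employee` and `transactions`, the sample rows, and the reading of "cumulative amount for each month" as a running total that restarts every month are all assumptions, not details from the interview. The sketch also shows the likely point of the ROW_NUMBER follow-up: DENSE_RANK gives tied salaries the same rank, so rank 2 is the second distinct salary, while ROW_NUMBER numbers tied rows 1, 2, ... and can return a repeat of the top salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        ('A', 'HR', 500), ('B', 'HR', 500), ('C', 'HR', 400),
        ('D', 'IT', 900), ('E', 'IT', 700);
    CREATE TABLE transactions (trans_id INTEGER, trans_date TEXT, trans_amt INTEGER);
    INSERT INTO transactions VALUES
        (1, '2024-01-05', 100), (2, '2024-01-20', 50),
        (3, '2024-02-03', 30),  (4, '2024-02-10', 70);
""")

# Same query, parameterised only by the ranking function being compared.
SECOND_HIGHEST = """
    SELECT DISTINCT dept, salary
    FROM (
        SELECT dept, salary,
               {fn}() OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk
        FROM employee
    )
    WHERE rnk = 2
    ORDER BY dept
"""
dense = conn.execute(SECOND_HIGHEST.format(fn="DENSE_RANK")).fetchall()
row_num = conn.execute(SECOND_HIGHEST.format(fn="ROW_NUMBER")).fetchall()

# Running total that restarts for every calendar month (assumed interpretation).
RUNNING_TOTAL = """
    SELECT trans_id, trans_date, trans_amt,
           SUM(trans_amt) OVER (
               PARTITION BY strftime('%Y-%m', trans_date)
               ORDER BY trans_date, trans_id
           ) AS cumulative_amt
    FROM transactions
    ORDER BY trans_date, trans_id
"""
running = conn.execute(RUNNING_TOTAL).fetchall()

print(dense)    # HR has a tie at 500, so DENSE_RANK reports 400
print(row_num)  # ROW_NUMBER reports the tied 500 again for HR
print(running)
```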
Spark and PySpark Questions
🔹Describe the projects where you’ve used Spark and explain your approach.
🔹How do you manage Spark clusters?
🔹Solve the 2nd highest salary question using PySpark.
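A minimal sketch of how the 2nd highest salary question might be answered with the PySpark DataFrame API, assuming a DataFrame with the same name, dept, and salary columns as in the SQL question. The function name `second_highest_salary` is hypothetical, and this is one possible approach rather than the answer given in the interview.

```python
def second_highest_salary(df):
    """Return one row per department holding its 2nd highest distinct salary.

    Assumes `df` is a PySpark DataFrame with columns name, dept, salary.
    """
    # Imported lazily so this sketch can be loaded without Spark installed.
    from pyspark.sql import Window, functions as F

    # Rank salaries inside each department, highest first; ties share a rank.
    by_dept = Window.partitionBy("dept").orderBy(F.col("salary").desc())
    return (
        df.withColumn("rnk", F.dense_rank().over(by_dept))
          .filter(F.col("rnk") == 2)
          .select("dept", "salary")
          .dropDuplicates()
    )
```

Departments with only one distinct salary produce no row here; whether that is the desired behaviour would be worth confirming with the interviewer.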
Round 2: Client Interview (Gartner)
Introduction and Project Details
🔹Provide a brief introduction and describe your projects, highlighting your contributions.
In-Depth Project Discussion
🔹Detailed questions about the architecture and design choices of my projects.
Technical AWS Questions
🔹Explain the difference between AWS Lambda and AWS Glue.
🔹Describe the different AWS S3 storage classes.
🔹How do you use AWS Step Functions?
🔹Given a long-running PySpark script on a Production EMR cluster, how would you optimize the code and identify bottlenecks?
🔹Follow-up questions related to the specifics of Spark jobs.
SQL Questions
🔹Based on the two tables below, predict the output for each join type: Inner, Left, Right, and Cross.
Table 1
| ID |
|------|
| 1 |
| 1 |
| 1 |
| 1 |
| 2 |
| 2 |
| 2 |
| NULL |
Table 2
| ID |
|----|
| 1 |
| 1 |
| 2 |
| 2 |
| 2 |
| 2 |
| 3 |
| 3 |
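The row counts for each join can be checked with a short sketch using Python's built-in sqlite3 module. The table names `t1` and `t2` are assumptions, and the join is assumed to be on ID. Each matching ID contributes (rows in Table 1) x (rows in Table 2) output rows, and NULL never equals anything, so the NULL row survives only in the left join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER)")
conn.execute("CREATE TABLE t2 (id INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)",
                 [(v,) for v in [1, 1, 1, 1, 2, 2, 2, None]])
conn.executemany("INSERT INTO t2 VALUES (?)",
                 [(v,) for v in [1, 1, 2, 2, 2, 2, 3, 3]])

queries = {
    "inner": "SELECT COUNT(*) FROM t1 JOIN t2 ON t1.id = t2.id",
    "left": "SELECT COUNT(*) FROM t1 LEFT JOIN t2 ON t1.id = t2.id",
    # RIGHT JOIN needs SQLite 3.39+, so it is written as a swapped LEFT JOIN.
    "right": "SELECT COUNT(*) FROM t2 LEFT JOIN t1 ON t1.id = t2.id",
    "cross": "SELECT COUNT(*) FROM t1 CROSS JOIN t2",
}
row_counts = {name: conn.execute(sql).fetchone()[0]
              for name, sql in queries.items()}
print(row_counts)  # {'inner': 20, 'left': 21, 'right': 22, 'cross': 64}
```

The arithmetic behind the counts: inner = 4x2 + 3x4 = 20; left adds the unmatched NULL row (21); right adds the two unmatched 3s (22); cross = 8x8 = 64.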
🔹Given the Employee table below, write a query to find employees who earn more than their managers.
| Id | Name | Salary | ManagerId |
|----|-------|--------|-----------|
| 1 | Joe | 70000 | 3 |
| 2 | Henry | 80000 | 4 |
| 3 | Sam | 60000 | NULL |
| 4 | Max | 90000 | NULL |
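One common way to answer this is a self-join of Employee against itself on ManagerId = Id. The sketch below runs that query through Python's built-in sqlite3 module on the rows from the table above; the aliases `e` and `m` are illustrative choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Id INTEGER, Name TEXT, Salary INTEGER, ManagerId INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        (1, "Joe", 70000, 3),
        (2, "Henry", 80000, 4),
        (3, "Sam", 60000, None),
        (4, "Max", 90000, None),
    ],
)

# e is the employee row, m is the matching manager row.
query = """
    SELECT e.Name
    FROM Employee AS e
    JOIN Employee AS m ON e.ManagerId = m.Id
    WHERE e.Salary > m.Salary
"""
higher_paid = [row[0] for row in conn.execute(query)]
print(higher_paid)  # ['Joe']
```

Employees without a manager (NULL ManagerId) drop out of the inner join, which is why Sam and Max never appear in the result.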
Python Questions
🔹Write a function to check whether two given strings are anagrams.
🔹You are given two strings word1 and word2. Merge the strings by adding letters in alternating order, starting with word1. If a string is longer than the other, append the additional letters onto the end of the merged string. Return the merged string.
Input: word1 = "abc", word2 = "pqr" Output: "apbqcr"
Input: word1 = "ab", word2 = "pqrs" Output: "apbqrs"
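A minimal sketch of both Python questions. The function names are hypothetical, and the anagram check is assumed to be case-sensitive and to count spaces as characters, since the question did not say otherwise.

```python
from collections import Counter
from itertools import zip_longest


def is_anagram(a: str, b: str) -> bool:
    """True when both strings contain the same characters with the same counts."""
    return Counter(a) == Counter(b)


def merge_alternately(word1: str, word2: str) -> str:
    """Interleave letters starting with word1; leftover letters go at the end."""
    # zip_longest pads the shorter word with "" so the longer tail is kept.
    pairs = zip_longest(word1, word2, fillvalue="")
    return "".join(c1 + c2 for c1, c2 in pairs)


print(is_anagram("listen", "silent"))   # True
print(merge_alternately("abc", "pqr"))  # apbqcr
print(merge_alternately("ab", "pqrs"))  # apbqrs
```

Comparing character counts runs in linear time; sorting both strings and comparing them is an equally valid answer at O(n log n).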
PySpark Coding Questions
🔹Write PySpark code to rename columns efficiently for a dataset with 100 columns.
🔹Write PySpark code to filter given data and count the results.
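For the rename question, one approach is to compute all new names in plain Python and apply them in a single `toDF` call rather than chaining 100 `withColumnRenamed` calls. The sketch below covers only the name-cleaning step so it runs without Spark; the snake_case rule and the helper name `to_snake_case` are assumptions, since the question did not state a naming convention.

```python
import re


def to_snake_case(columns):
    """Return cleaned names: trimmed, lower-cased, non-alphanumerics -> '_'."""
    cleaned = []
    for name in columns:
        name = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
        cleaned.append(name.strip("_").lower())
    return cleaned


print(to_snake_case(["First Name", " Dept-ID ", "salary"]))

# With a PySpark DataFrame `df` (assumed to exist), the rename is one call:
#   df = df.toDF(*to_snake_case(df.columns))
# and a filter-and-count, with an illustrative condition, could look like:
#   df.filter(df["salary"] > 50000).count()
```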
Round 3: Managerial
Introduction and Project Summary
🔹Describe yourself and your project experience.
Career and Challenges
🔹Why are you looking for a change?
🔹Discuss the challenges you've faced in your projects.
🔹What have you learned from any failures?
This summarizes my interview experience: the questions I encountered and the key areas covered during the selection process.