Bitwise | Senior Data Engineer Interview Experience | 3+ YoE



Round 1: Technical

Introduction and Project Overview

🔹Introduce yourself and go over all your completed projects, detailing the technologies used.

Project-Specific Questions

🔹Follow-up questions related to the specifics and challenges of projects.

SQL Questions

🔹Given an employee table with columns name, dept, and salary, find the 2nd highest salary for each department.

🔹As I solved this using the DENSE_RANK function, the interviewer asked why I hadn’t used ROW_NUMBER instead.

🔹Given a transaction table with columns trans_id, trans_date, and trans_amt, create a new column showing the cumulative amount for each month.
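A minimal sketch of both SQL questions above, run through Python's built-in sqlite3 so it is self-contained. The sample rows are made up for illustration, and "cumulative amount for each month" is assumed to mean a running total that resets every month; the interviewer's exact intent is not recorded here. It also shows why DENSE_RANK is safer than ROW_NUMBER when salaries tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER);
    -- Hypothetical sample data: two employees tie for the top salary in 'eng'.
    INSERT INTO employee VALUES
        ('A', 'eng', 300), ('B', 'eng', 300), ('C', 'eng', 200), ('D', 'eng', 100),
        ('E', 'ops', 150), ('F', 'ops', 120);

    CREATE TABLE transactions (trans_id INTEGER, trans_date TEXT, trans_amt INTEGER);
    INSERT INTO transactions VALUES
        (1, '2024-01-05', 100), (2, '2024-01-20', 50),
        (3, '2024-02-03', 70),  (4, '2024-02-15', 30);
""")


def second_highest(rank_fn):
    """Return rows ranked 2nd per department using the given window function."""
    query = f"""
        SELECT dept, name, salary FROM (
            SELECT dept, name, salary,
                   {rank_fn}() OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk
            FROM employee
        ) AS ranked
        WHERE rnk = 2
        ORDER BY dept, name
    """
    return conn.execute(query).fetchall()


# DENSE_RANK gives tied salaries the same rank, so rank 2 is the 2nd distinct salary.
dense = second_highest("DENSE_RANK")

# ROW_NUMBER numbers tied rows 1 and 2, so "row 2" in 'eng' is still the top salary.
rownum = second_highest("ROW_NUMBER")

# Running total of trans_amt that restarts at the beginning of each month.
cumulative = conn.execute("""
    SELECT trans_id, trans_date, trans_amt,
           SUM(trans_amt) OVER (
               PARTITION BY strftime('%Y-%m', trans_date)
               ORDER BY trans_date, trans_id
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS cumulative_amt
    FROM transactions
    ORDER BY trans_date, trans_id
""").fetchall()

print(dense)
print(rownum)
print(cumulative)
```

With this data, DENSE_RANK returns 200 for 'eng', while ROW_NUMBER returns one of the two tied 300 rows, which is the usual argument for DENSE_RANK in this question.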

Spark and PySpark Questions

🔹Describe the projects where you’ve used Spark and explain your approach.

🔹How do you manage Spark clusters?

🔹Solve the 2nd highest salary question using PySpark.

Round 2: Client Interview (Gartner)

Introduction and Project Details

🔹Provide a brief introduction and describe your projects, highlighting your contributions.

In-Depth Project Discussion

🔹Detailed questions about the architecture and design choices of my projects.

Technical AWS Questions

🔹Explain the difference between AWS Lambda and AWS Glue.

🔹Describe the types of AWS S3 storage options.

🔹How do you use AWS Step Functions?

🔹Given a long-running PySpark script on a Production EMR cluster, how would you optimize the code and identify bottlenecks?

🔹Follow-up questions related to the specifics of Spark jobs.

SQL Questions

🔹Based on the two tables below, predict the output for each join type: Inner, Left, Right, and Cross.

Table 1

| ID   |
|------|
| 1    |
| 1    |
| 1    |
| 1    |
| 2    |
| 2    |
| 2    |
| NULL |

Table 2

| ID |
|----|
| 1  |
| 1  |
| 2  |
| 2  |
| 2  |
| 2  |
| 3  |
| 3  |
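One way to reason about the join question is to count matches per key instead of writing out every row. The sketch below assumes the join condition is equality on ID and uses a hypothetical helper, join_row_counts; it relies on the SQL rule that NULL never equals anything, including another NULL.

```python
from collections import Counter

# ID columns of the two tables; None stands in for SQL NULL.
table1 = [1, 1, 1, 1, 2, 2, 2, None]
table2 = [1, 1, 2, 2, 2, 2, 3, 3]


def join_row_counts(left, right):
    """Return the number of output rows for each join type on an equality key."""
    # NULL keys never match, so they are left out of the match counters.
    left_counts = Counter(k for k in left if k is not None)
    right_counts = Counter(k for k in right if k is not None)

    # Each key contributes (occurrences on the left) * (occurrences on the right).
    inner = sum(n * right_counts[k] for k, n in left_counts.items())

    # Outer joins keep unmatched rows from the preserved side exactly once.
    left_unmatched = sum(1 for k in left if k is None or right_counts[k] == 0)
    right_unmatched = sum(1 for k in right if k is None or left_counts[k] == 0)

    return {
        "inner": inner,
        "left": inner + left_unmatched,
        "right": inner + right_unmatched,
        "cross": len(left) * len(right),
    }


counts = join_row_counts(table1, table2)
print(counts)
```

For these tables, ID 1 contributes 4 x 2 = 8 rows and ID 2 contributes 3 x 4 = 12, so the inner join has 20 rows; the left join adds the NULL row (21), the right join adds the two 3s (22), and the cross join has 8 x 8 = 64.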

🔹Given the Employee table below, write a query to find employees who earn more than their managers.

| Id | Name  | Salary | ManagerId |
|----|-------|--------|-----------|
| 1  | Joe   | 70000  | 3         |
| 2  | Henry | 80000  | 4         |
| 3  | Sam   | 60000  | NULL      |
| 4  | Max   | 90000  | NULL      |
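A common approach to this question is a self-join that pairs each employee with their manager's row. The sketch below loads the table above into an in-memory SQLite database via Python's sqlite3; the aliases e and m are just illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Id INTEGER, Name TEXT, Salary INTEGER, ManagerId INTEGER);
    INSERT INTO Employee VALUES
        (1, 'Joe',   70000, 3),
        (2, 'Henry', 80000, 4),
        (3, 'Sam',   60000, NULL),
        (4, 'Max',   90000, NULL);
""")

# e is the employee row, m is the matching manager row.
# Employees with a NULL ManagerId drop out because the join condition never matches.
rows = conn.execute("""
    SELECT e.Name
    FROM Employee AS e
    JOIN Employee AS m ON e.ManagerId = m.Id
    WHERE e.Salary > m.Salary
    ORDER BY e.Name
""").fetchall()

names = [row[0] for row in rows]
print(names)
```

Here only Joe qualifies: he earns 70000 against Sam's 60000, while Henry's 80000 is below Max's 90000.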

Python Questions

🔹Write a function to check whether two given strings are anagrams.

🔹You are given two strings word1 and word2. Merge the strings by adding letters in alternating order, starting with word1. If a string is longer than the other, append the additional letters onto the end of the merged string. Return the merged string.

Input: word1 = "abc", word2 = "pqr"
Output: "apbqcr"

Input: word1 = "ab", word2 = "pqrs"
Output: "apbqrs"
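Both Python questions above have short standard-library solutions. This is one possible sketch, not necessarily what was written in the interview; it assumes the anagram check is case-sensitive and treats every character (including spaces) as significant. The function names is_anagram and merge_alternately are illustrative.

```python
from collections import Counter
from itertools import zip_longest


def is_anagram(first: str, second: str) -> bool:
    """Two strings are anagrams when every character occurs equally often in both."""
    return Counter(first) == Counter(second)


def merge_alternately(word1: str, word2: str) -> str:
    """Interleave letters starting with word1; leftover letters go at the end."""
    # zip_longest pads the shorter word with "" so the longer word's tail is kept.
    return "".join(a + b for a, b in zip_longest(word1, word2, fillvalue=""))


print(is_anagram("listen", "silent"))
print(merge_alternately("abc", "pqr"))
print(merge_alternately("ab", "pqrs"))
```

Counting characters runs in linear time, which avoids the sort used by the sorted(a) == sorted(b) variant; either would likely be acceptable in an interview.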

PySpark Coding Questions

🔹Write PySpark code to rename columns efficiently for a dataset with 100 columns.

🔹Write PySpark code to filter given data and count the results.
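For the column-rename question, the usual idea is to compute all new names in plain Python first and then apply them to the DataFrame in a single call, rather than chaining one withColumnRenamed call per column. The sketch below only covers the name-building step so it runs without a Spark installation; the snake_case rule and the helper clean_column_names are assumptions made for illustration, since the actual renaming rule was not specified.

```python
import re


def clean_column_names(columns):
    """Return new names in the same order as the input.

    The rule used here (trim, replace runs of non-alphanumerics with "_",
    lowercase) is a made-up example, not the rule from the interview.
    """
    return [
        re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()
        for name in columns
    ]


old_names = ["Employee ID", "First-Name", " Salary (USD) "]
new_names = clean_column_names(old_names)
print(new_names)

# With a PySpark DataFrame named df (assumed here, not created in this sketch),
# every column can then be renamed in one call:
#     df = df.toDF(*clean_column_names(df.columns))
```

The same list-driven approach scales to 100 columns without any per-column code, because the names come from df.columns rather than being typed by hand.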

Round 3: Managerial 

Introduction and Project Summary

🔹Describe yourself and your project experience.

Career and Challenges

🔹Why are you looking for a change?

🔹Discuss the challenges you've faced in your projects.

🔹What have you learned from any failures?

This summarizes my interview experience: the questions I encountered and the key areas covered during the selection process.