LTIMindtree
Global technology consulting and digital solutions company.
4 Rounds
~21 Days
Medium
The Interview Loop
Recruiter Screen (30 min)
Standard fit check, behavioral questions, and resume overview.
Technical Loop (3-4 Rounds)
Deep dive into domain knowledge, coding, and system design.
Interview Question Bank
Data Engineer • Technical • hard
How do you handle data skewness in PySpark when joining a massive transaction table with a customer table?
#PySpark
#Performance Tuning
#Salting
#Broadcast Joins
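A common line of answer to the skew question above is key salting. The following is a minimal pure-Python sketch of the idea, not PySpark code: the salt count, the sample data, and the helper names `salt_key` and `explode_key` are all hypothetical. It shows a hot key on the large side being split across several salted keys while the small side is replicated once per salt so every salted key still finds its match.

```python
import random
from collections import Counter

NUM_SALTS = 4  # hypothetical salt count; in practice tuned to the observed skew


def salt_key(key, num_salts=NUM_SALTS, rng=random):
    """Large side: append a random salt so one hot key becomes several join keys."""
    return f"{key}_{rng.randrange(num_salts)}"


def explode_key(key, num_salts=NUM_SALTS):
    """Small side: replicate the key once per salt value."""
    return [f"{key}_{i}" for i in range(num_salts)]


# Skewed "transaction" side: customer c1 dominates the data (made-up sample).
transactions = [("c1", amt) for amt in range(1000)] + [("c2", 5), ("c3", 7)]
customers = {"c1": "Asha", "c2": "Ben", "c3": "Chen"}

rng = random.Random(0)  # seeded only to make the sketch repeatable
salted_txns = [(salt_key(k, rng=rng), amt) for k, amt in transactions]
salted_customers = {
    sk: name for k, name in customers.items() for sk in explode_key(k)
}

# Join on the salted key; every transaction still matches exactly one customer.
joined = [(sk, amt, salted_customers[sk]) for sk, amt in salted_txns]
rows_per_key = Counter(sk for sk, _, _ in joined)
```

In Spark terms this would roughly correspond to adding a salt column to the large table and exploding the small table over the salt range before joining. When the customer table is small enough, a broadcast join usually avoids the shuffle altogether and makes salting unnecessary.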
Data Engineer • Technical • medium
Explain the difference between repartition() and coalesce() in PySpark. When should you use which?
#PySpark
#Partitioning
#Optimization
Data Engineer • Technical • hard
Explain the Catalyst Optimizer in Spark. How does it generate the physical plan from a logical plan?
#Spark Internals
#Catalyst Optimizer
#Query Execution
Data Engineer • Technical • hard
How do you ensure exactly-once processing semantics in a Kafka to Spark Streaming pipeline?
#Kafka
#Spark Streaming
#Exactly-Once Semantics
Data Engineer • Technical • medium
What is a Broadcast Join in Spark? What is the default threshold, and what happens if the broadcasted table exceeds driver memory?
#PySpark
#Joins
#Optimization
Data Engineer • Technical • easy
Explain the concept of lazy evaluation in Spark. Can you give an example of an action vs. a transformation?
#Spark Basics
#Lazy Evaluation
#Transformations vs Actions
Data Engineer • Technical • medium
What is Delta Lake? Explain the ACID transaction capabilities it brings to Apache Spark.
#Databricks
#Delta Lake
#ACID Transactions
Data Engineer • Technical • hard
How do you optimize a Spark job that is failing with an OutOfMemory (OOM) error during a groupByKey operation?
#PySpark
#OOM
#Optimization
#reduceByKey
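The #reduceByKey tag points at the usual fix for the question above: aggregate within each partition before the shuffle instead of buffering every value per key. The sketch below is a pure-Python simulation under stated assumptions, not Spark code. The function names and the sample partitions are hypothetical, and "records shuffled" is simply a count of items that would cross partition boundaries.

```python
from collections import defaultdict


def group_then_sum(partitions):
    """groupByKey-style: move every (key, value) pair, buffer all values, then sum."""
    buffered = defaultdict(list)
    records_shuffled = 0
    for part in partitions:
        for key, value in part:
            buffered[key].append(value)  # the whole value list is held per key
            records_shuffled += 1
    return {k: sum(v) for k, v in buffered.items()}, records_shuffled


def combine_then_sum(partitions):
    """reduceByKey-style: pre-aggregate per partition, move one partial per key."""
    totals = defaultdict(int)
    records_shuffled = 0
    for part in partitions:
        partial = defaultdict(int)
        for key, value in part:
            partial[key] += value  # map-side combine keeps one number per key
        for key, subtotal in partial.items():
            totals[key] += subtotal
            records_shuffled += 1
    return dict(totals), records_shuffled


# Two made-up partitions where key "a" is very frequent.
partitions = [
    [("a", 1)] * 500 + [("b", 2)] * 10,
    [("a", 1)] * 500 + [("c", 3)] * 10,
]
grouped, grouped_shuffle = group_then_sum(partitions)
combined, combined_shuffle = combine_then_sum(partitions)
```

Both functions produce the same totals, but the second moves one partial sum per key per partition rather than every record, which is the reason a reduceByKey or aggregateByKey rewrite tends to relieve memory pressure.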
Data Engineer • Technical • medium
What are accumulators in Spark? Provide a real-world use case for using them.
#Spark Internals
#Accumulators
#Shared Variables
Data Engineer • Technical • hard
Explain the Z-Ordering optimization in Delta Lake. When should you use it instead of partitioning?
#Databricks
#Delta Lake
#Z-Ordering
#Performance Tuning
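For the Z-Ordering question above, it can help to see the space-filling curve the technique is named after. This is a minimal pure-Python sketch of a two-column Morton (Z-order) code. It illustrates the general idea only and is not a description of Delta Lake's internal implementation; the function name `interleave_bits` and the 4x4 grid are hypothetical.

```python
def interleave_bits(x, y, bits=16):
    """Morton (Z-order) code: bits of x go to even positions, bits of y to odd."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


# Sort a small grid of (x, y) points along the Z-order curve.
points = [(x, y) for x in range(4) for y in range(4)]
z_sorted = sorted(points, key=lambda p: interleave_bits(*p))
```

Rows that are close in both columns end up close in the sorted order, so per-file min/max statistics stay narrow on both columns and more files can be skipped. That is the usual argument for Z-ordering on high-cardinality filter columns, where partitioning would create too many small directories.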
Meet Your Interviewers
The "Standard" Interviewer
Senior Engineer
Focuses on core competencies, system constraints, and clear communication.
Unwritten Rules
Think Out Loud
Always explain your thought process before writing code or drawing architecture.