Tech Mahindra
Multinational IT services and consulting company.
4 Rounds
~21 Days
Medium
The Interview Loop
Recruiter Screen (30 min)
Standard fit check, behavioral questions, and resume overview.
Technical Loop (3-4 Rounds)
Deep dive into domain knowledge, coding, and system design.
Interview Question Bank
Data Scientist
•
Behavioral
•
medium
Tell me about a time you had to explain a complex machine learning model's predictions to a non-technical client stakeholder.
#Stakeholder Management
#Explainable AI
#SHAP
#LIME
Data Scientist
•
Behavioral
•
medium
Describe a situation where your model's performance degraded in production. How did you troubleshoot and resolve it?
#Model Drift
#Monitoring
#Debugging
Data Scientist
•
Behavioral
•
medium
How do you prioritize tasks when working on multiple client deliverables with tight, overlapping deadlines?
#Time Management
#Prioritization
#Client Management
Data Scientist
•
Behavioral
•
hard
Tell me about a time you disagreed with a client's approach to a data problem. How did you handle it?
#Conflict Resolution
#Consulting
#Client Interaction
Data Scientist
•
Behavioral
•
medium
Describe a project where you had to clean and process a massive, messy dataset. What was your approach?
#Data Cleaning
#Big Data
#Problem Solving
Data Scientist
•
Coding
•
medium
Write a SQL query to find the second highest salary of employees within each department.
#Window Functions
#DENSE_RANK
#Subqueries
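One way to approach this question, sketched with Python's built-in sqlite3 module so the query can be run as-is. The employees table, its columns, and the sample rows are assumptions made up for illustration; DENSE_RANK is used so that tied salaries share a rank and a department with only one distinct salary returns no row.

```python
import sqlite3

# Hypothetical schema and sample rows, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Asha", "Analytics", 120),
        ("Ben", "Analytics", 100),
        ("Chen", "Analytics", 100),
        ("Dev", "Analytics", 90),
        ("Eva", "Platform", 80),
        ("Finn", "Platform", 70),
        ("Gita", "Research", 60),
    ],
)

SECOND_HIGHEST_SQL = """
WITH ranked AS (
    SELECT department, name, salary,
           DENSE_RANK() OVER (
               PARTITION BY department ORDER BY salary DESC
           ) AS rnk
    FROM employees
)
SELECT department, name, salary
FROM ranked
WHERE rnk = 2
ORDER BY department, name
"""

# Ties at the second-highest salary are all returned.
second_highest = conn.execute(SECOND_HIGHEST_SQL).fetchall()
print(second_highest)
```

Whether ties should produce one row or several is worth confirming with the interviewer; swapping DENSE_RANK for ROW_NUMBER would return exactly one row per department.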
Data Scientist
•
Coding
•
medium
Write a Pandas script to merge two large datasets on a common key, and fill missing numerical values with the mean of their respective categories.
#Pandas
#Data Imputation
#Groupby
#Merge
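A minimal sketch of one possible answer, assuming pandas is installed. The frames, the customer_id key, and the segment and amount columns are hypothetical; the question does not specify them.

```python
import pandas as pd

# Hypothetical data: column names and values are invented for illustration.
orders = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5],
    "amount": [100.0, None, 300.0, None, 50.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5],
    "segment": ["retail", "retail", "enterprise", "enterprise", "retail"],
})

# A left join keeps every order even if the customer lookup has no match.
merged = orders.merge(customers, on="customer_id", how="left")

# transform("mean") returns a series aligned to merged's index that holds
# each row's category mean; missing values are skipped when averaging.
category_mean = merged.groupby("segment")["amount"].transform("mean")
merged["amount"] = merged["amount"].fillna(category_mean)
print(merged)
```

For genuinely large inputs, a follow-up discussion might cover narrowing dtypes before the merge or moving the join into a database or a distributed engine; that is outside this sketch.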
Data Scientist
•
Coding
•
medium
Write a SQL query to calculate the 7-day rolling average of daily active users (DAU).
#Window Functions
#Rolling Average
#Date Functions
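A possible answer, run through sqlite3 so it can be checked locally. The daily_active_users table and its columns are assumptions, and the ROWS frame only equals a true 7-day window when there is exactly one row per calendar day with no gaps.

```python
import sqlite3

# Hypothetical table: one pre-aggregated row per calendar day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_active_users (activity_date TEXT, dau INTEGER)")
rows = [(f"2024-01-{day:02d}", day * 10) for day in range(1, 10)]
conn.executemany("INSERT INTO daily_active_users VALUES (?, ?)", rows)

ROLLING_SQL = """
SELECT activity_date,
       AVG(dau) OVER (
           ORDER BY activity_date
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS dau_7d_avg
FROM daily_active_users
ORDER BY activity_date
"""

# The first six rows average over fewer than seven days of history.
rolling = conn.execute(ROLLING_SQL).fetchall()
for activity_date, dau_7d_avg in rolling:
    print(activity_date, dau_7d_avg)
```

If the source is a raw event log instead, DAU would first be computed with COUNT(DISTINCT user_id) grouped by day, and missing days would need to be filled from a calendar table before applying the window.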
Data Scientist
•
Coding
•
hard
Implement a Python function to calculate the TF-IDF of a given corpus of text from scratch (without using Scikit-Learn).
#Python
#TF-IDF
#Math
#NLP
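A from-scratch sketch using only the standard library. TF-IDF has several common variants; this one assumes tf = term count divided by document length, idf = ln(N / df), and naive lowercase whitespace tokenization. Scikit-Learn's defaults differ (smoothed idf and L2 normalization), so the numbers will not match its output.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


corpus = ["the cat sat", "the dog sat", "the cat ran"]
scores = tf_idf(corpus)
for doc, doc_scores in zip(corpus, scores):
    print(doc, doc_scores)
```

A term that appears in every document, such as "the" here, gets a weight of zero under this variant, which is the behaviour interviewers usually want explained.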
Data Scientist
•
Coding
•
medium
Write a SQL query to find the top 3 customers by revenue in each region.
#Window Functions
#RANK
#CTEs
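One possible answer, again executed through sqlite3. The orders table, its columns, and the rows are hypothetical. Revenue is first summed per customer in a CTE, then ranked within each region.

```python
import sqlite3

# Hypothetical order-level data, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("A", "North", 50), ("A", "North", 30),
        ("B", "North", 70), ("C", "North", 60), ("D", "North", 10),
        ("E", "South", 40), ("F", "South", 20),
    ],
)

TOP3_SQL = """
WITH customer_revenue AS (
    SELECT region, customer, SUM(revenue) AS total_revenue
    FROM orders
    GROUP BY region, customer
),
ranked AS (
    SELECT region, customer, total_revenue,
           RANK() OVER (
               PARTITION BY region ORDER BY total_revenue DESC
           ) AS rnk
    FROM customer_revenue
)
SELECT region, customer, total_revenue
FROM ranked
WHERE rnk <= 3
ORDER BY region, rnk, customer
"""

top_customers = conn.execute(TOP3_SQL).fetchall()
print(top_customers)
```

With RANK, a tie at third place returns more than three rows for that region; ROW_NUMBER would cap the result at exactly three.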
Data Scientist
•
Coding
•
easy
Write a Python program to detect and remove outliers from a Pandas DataFrame using the Interquartile Range (IQR) method.
#Python
#Pandas
#Outlier Detection
#Statistics
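A short sketch of the IQR rule, assuming pandas is installed. The function name, the value column, and the conventional multiplier k = 1.5 are illustrative choices, not something the question prescribes.

```python
import pandas as pd


def remove_iqr_outliers(df, column, k=1.5):
    """Keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # between() is inclusive on both ends; rows with NaN are dropped too.
    return df[df[column].between(lower, upper)]


# Hypothetical data with one obvious outlier.
df = pd.DataFrame({"value": [10, 12, 11, 13, 12, 11, 100]})
cleaned = remove_iqr_outliers(df, "value")
print(cleaned)
```

Here Q1 is 11 and Q3 is 12.5, so the accepted range is 8.75 to 14.75 and only the value 100 is removed.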
Data Scientist
•
Coding
•
easy
Write a SQL query to find all employees who joined in the last 30 days and have the job title 'Data Scientist'.
#Date Functions
#Filtering
#WHERE Clause
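A minimal sketch with a hypothetical employees table. To keep the example deterministic, the 30-day cutoff is computed in Python and passed as a bound parameter; inside a database the same cutoff is usually written with a date function, for example DATE('now', '-30 days') in SQLite.

```python
import sqlite3
from datetime import date, timedelta

today = date.today()

# Hypothetical rows with join dates relative to today, stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, job_title TEXT, join_date TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Asha", "Data Scientist", (today - timedelta(days=5)).isoformat()),
        ("Ben", "Data Scientist", (today - timedelta(days=45)).isoformat()),
        ("Chen", "Data Engineer", (today - timedelta(days=3)).isoformat()),
    ],
)

RECENT_SQL = """
SELECT name
FROM employees
WHERE job_title = 'Data Scientist'
  AND join_date >= :cutoff
ORDER BY name
"""

# ISO-8601 date strings compare correctly as plain text.
cutoff = (today - timedelta(days=30)).isoformat()
recent = conn.execute(RECENT_SQL, {"cutoff": cutoff}).fetchall()
print(recent)
```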
Data Scientist
•
Coding
•
easy
Write a Python function to reverse a string without using built-in functions like reversed() or slicing [::-1].
#Python
#Strings
#Loops
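One possible answer is a two-pointer swap. This sketch assumes that list() and str.join() are acceptable, since the question only rules out reversed() and slicing; an interviewer may tighten that constraint.

```python
def reverse_string(text):
    """Reverse a string by swapping characters from both ends inward."""
    chars = list(text)
    left, right = 0, len(chars) - 1
    while left < right:
        chars[left], chars[right] = chars[right], chars[left]
        left += 1
        right -= 1
    return "".join(chars)


print(reverse_string("hello"))
```

The loop runs in O(n) time; the character list is needed because Python strings are immutable.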
Data Scientist
•
System Design
•
hard
Design a predictive maintenance system for a manufacturing client using IoT sensor data.
#IoT
#Time Series
#Streaming Data
#Architecture
Data Scientist
•
System Design
•
medium
How would you deploy a machine learning model as a REST API using Docker and a cloud provider like AWS or Azure?
#Docker
#FastAPI/Flask
#Cloud Deployment
#Containerization
Data Scientist
•
System Design
•
hard
Design a recommendation engine for a telecom provider's value-added services (e.g., Netflix bundles, international roaming).
#Recommendation Systems
#Collaborative Filtering
#Cold Start
Data Scientist
•
System Design
•
hard
Architect an MLOps pipeline for continuous training and deployment of a customer churn model.
#CI/CD
#Model Registry
#Airflow
#Kubernetes
Data Scientist
•
System Design
•
hard
Design a real-time fraud detection system for credit card transactions.
#Real-time Processing
#Fraud Detection
#Latency
#Architecture
Data Scientist
•
Technical
•
medium
In a telecom churn prediction project, the dataset is highly imbalanced (95% non-churn, 5% churn). How do you handle this?
#Imbalanced Data
#SMOTE
#Class Weights
#Evaluation Metrics
Data Scientist
•
Technical
•
medium
Explain the difference between Random Forest and Gradient Boosting. When would you choose one over the other?
#Ensemble Learning
#Bagging
#Boosting
#Bias-Variance Tradeoff
Data Scientist
•
Technical
•
hard
How would you extract key entities and intent from unstructured IT support tickets to automate ticket routing?
#NER
#Intent Classification
#Transformers
#Spacy
Data Scientist
•
Technical
•
medium
What are the core assumptions of Linear Regression, and how do you validate them?
#Linear Regression
#Assumptions
#Residual Analysis
Data Scientist
•
Technical
•
medium
What is the Curse of Dimensionality, and what techniques do you use to mitigate it?
#Dimensionality Reduction
#PCA
#Feature Selection
Data Scientist
•
Technical
•
hard
Explain the difference between ARIMA and LSTM for time series forecasting. When is one preferred over the other?
#ARIMA
#LSTM
#Deep Learning
#Forecasting
Data Scientist
•
Technical
•
easy
How does L1 (Lasso) regularization differ from L2 (Ridge) regularization?
#Regularization
#Lasso
#Ridge
#Overfitting
Data Scientist
•
Technical
•
medium
What evaluation metrics would you use for a highly skewed fraud detection dataset, and why?
#Evaluation Metrics
#Fraud Detection
#Precision-Recall
Data Scientist
•
Technical
•
hard
Explain the architecture of Transformers and the concept of the self-attention mechanism.
#Transformers
#Self-Attention
#NLP
Data Scientist
•
Technical
•
easy
What is a p-value? How do you use it to make decisions in A/B testing?
#Hypothesis Testing
#A/B Testing
#p-value
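To make the decision rule concrete, here is a sketch of a two-sided, pooled two-proportion z-test using only the standard library. The conversion counts and the 0.05 significance level are made-up assumptions, and the normal approximation is only reasonable for large samples.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for H0: both variants share one conversion rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # For a standard normal, 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical experiment: 100/1000 conversions in A, 130/1000 in B.
alpha = 0.05
p_value = two_proportion_p_value(100, 1000, 130, 1000)
print(f"p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The p-value is the probability of seeing a difference at least this large if the null hypothesis were true; it is not the probability that the null hypothesis is true.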
Data Scientist
•
Technical
•
medium
Explain the working of the K-Means clustering algorithm. How do you choose the optimal 'K'?
#Clustering
#K-Means
#Elbow Method
#Silhouette Score
Data Scientist
•
Technical
•
medium
What is Data Leakage in machine learning, and how can you prevent it during cross-validation?
#Data Leakage
#Cross-Validation
#Pipelines
Data Scientist
•
Technical
•
medium
How do you prevent overfitting in deep neural networks?
#Overfitting
#Dropout
#Early Stopping
#Regularization
Data Scientist
•
Technical
•
hard
Explain the mathematical intuition behind Support Vector Machines (SVM) and the kernel trick.
#SVM
#Math
#Kernel Trick
Data Scientist
•
Technical
•
medium
How do you handle out-of-vocabulary (OOV) words in NLP tasks?
#OOV
#Tokenization
#Word Embeddings
#Subword Tokenization
Data Scientist
•
Technical
•
medium
What is SMOTE, and how does it work under the hood?
#SMOTE
#Imbalanced Data
#Algorithms
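The core idea can be sketched in a few lines of standard-library Python: pick a minority sample, pick one of its k nearest minority neighbours, and create a synthetic point somewhere on the line segment between them. This is a simplified illustration with hypothetical names, not the imbalanced-learn implementation; it assumes numeric features and at least two minority samples.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of base, excluding base itself.
        neighbours = sorted(
            (point for point in minority if point is not base),
            key=lambda point: math.dist(base, point),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class samples with two numeric features.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority, n_new=5)
print(new_points)
```

Oversampling of this kind should be applied to the training folds only; applying it before the train/test split leaks synthetic copies of test information into training.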
Data Scientist
•
Technical
•
medium
Explain ROC AUC. Can a model have a high accuracy but a low AUC? Give an example.
#ROC AUC
#Evaluation Metrics
#Imbalanced Data
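A small worked example of high accuracy with low AUC, using only the standard library. The labels and scores are invented: with 95% negatives, a model that predicts "negative" for everything is 95% accurate, yet here it ranks every positive below every negative, so its AUC is 0. AUC is computed as the probability that a random positive outscores a random negative, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as P(score of positive > score of negative), ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
scores = [0.3] * 95 + [0.1] * 5  # positives get the lowest scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # everything predicted 0

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy = {acc}, AUC = {auc}")
```

The pairwise loop is quadratic in the number of samples, which is fine for an illustration but not for production-sized data.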
Meet Your Interviewers
The "Standard" Interviewer
Senior Engineer
Focuses on core competencies, system constraints, and clear communication.
Unwritten Rules
Think Out Loud
Always explain your thought process before writing code or drawing architecture.