Capgemini
Global leader in partnering with companies to transform and manage their business by harnessing the power of technology.
4 Rounds
~21 Days
Medium
The Interview Loop
Recruiter Screen (30 min)
Standard fit check, behavioral questions, and resume overview.
Technical Loop (3-4 Rounds)
Deep dive into domain knowledge, coding, and system design.
Interview Question Bank
Machine Learning Engineer • Behavioral • medium
Tell me about a time you had to explain a complex Machine Learning model to a non-technical stakeholder or client.
#Stakeholder Management
#Consulting
#Explainable AI
Machine Learning Engineer • Behavioral • hard
Describe a situation where your ML model performed well in training/testing but failed or degraded in production. How did you troubleshoot and resolve it?
#Debugging
#Production Issues
#Experience
Machine Learning Engineer • Behavioral • easy
Working at Capgemini often means juggling multiple client deliverables. How do you prioritize your tasks when faced with tight, competing deadlines?
#Agile
#Prioritization
#Consulting
Machine Learning Engineer • Behavioral • medium
Tell me about a time you disagreed with a senior engineer or solutions architect on a technical approach. How did you handle it?
#Conflict Resolution
#Teamwork
Machine Learning Engineer • Behavioral • medium
Describe a project where you had to quickly learn a new technology, cloud service, or ML framework to meet a client's specific requirement.
#Continuous Learning
#Consulting
Machine Learning Engineer • Behavioral • easy
Why do you want to work at Capgemini, and how do you think working in IT consulting differs from working at a traditional product company?
#Company Knowledge
#Consulting Mindset
Machine Learning Engineer • Coding • medium
Write a Python function to merge overlapping intervals from a list of time intervals.
#Python
#Arrays
#Sorting
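One possible answer, as a minimal sketch: sort the intervals by start time, then sweep once and extend the last merged interval whenever the next one overlaps it. The function name `merge_intervals` and the choice to treat touching intervals (for example `(1, 4)` and `(4, 5)`) as overlapping are assumptions, since the question does not specify either.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals; touching intervals are merged too."""
    merged = []
    # After sorting by start, a new interval can only overlap the last merged one.
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlap: extend the previous interval if this one ends later.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(pair) for pair in merged]


print(merge_intervals([(1, 3), (2, 6), (8, 10), (15, 18)]))
# [(1, 6), (8, 10), (15, 18)]
```

The sort dominates the cost, so this runs in O(n log n) time with O(n) extra space.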
Machine Learning Engineer • Coding • medium
Write a SQL query to calculate the 7-day rolling average of daily sales for a retail client.
#SQL
#Window Functions
#Time Series
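A hedged sketch of one way to answer, run through Python's built-in `sqlite3` module so it can be tried locally. The table `daily_sales` and its columns `sale_date` and `amount` are hypothetical, and the query assumes exactly one row per calendar day with no gaps; with missing days, a row-based window no longer covers seven calendar days. Window functions require SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical schema: daily_sales(sale_date TEXT, amount REAL), one row per day.
ROLLING_AVG_SQL = """
SELECT
    sale_date,
    amount,
    AVG(amount) OVER (
        ORDER BY sale_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS rolling_avg_7d
FROM daily_sales
ORDER BY sale_date
"""


def rolling_average(rows):
    """Load (sale_date, amount) rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_sales (sale_date TEXT PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO daily_sales VALUES (?, ?)", rows)
    result = conn.execute(ROLLING_AVG_SQL).fetchall()
    conn.close()
    return result


# Nine consecutive days with amounts 10, 20, ..., 90.
sample = [(f"2024-01-{day:02d}", float(day * 10)) for day in range(1, 10)]
for row in rolling_average(sample):
    print(row)
```

For the first six days the window holds fewer than seven rows, so the average covers only the days available so far; an interviewer may ask whether those rows should be kept or filtered out.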
Machine Learning Engineer • Coding • hard
Implement a basic K-Means clustering algorithm from scratch in Python using NumPy.
#Python
#NumPy
#Machine Learning Algorithms
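A minimal sketch of Lloyd's algorithm, assuming Euclidean distance, plain random initialization from the data points (not k-means++), and a stopping rule based on how far the centroids move. The function names and default parameters are illustrative choices, not part of the question.

```python
import numpy as np


def _assign(X, centroids):
    """Return the index of the nearest centroid for every row of X."""
    # Broadcasting yields an (n_samples, k) matrix of Euclidean distances.
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)


def kmeans(X, k, n_iters=100, tol=1e-6, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids as k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        labels = _assign(X, centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            # Keep the old centroid if a cluster ends up empty.
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, _assign(X, centroids)


# Two well-separated synthetic blobs around (0, 0) and (5, 5).
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(5.0, 0.5, size=(50, 2)),
])
centroids, labels = kmeans(X, k=2)
print(np.round(centroids, 1))
```

Because the result depends on initialization, a fuller answer would typically mention running several restarts and keeping the solution with the lowest within-cluster sum of squares.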
Machine Learning Engineer • Coding • easy
Given a Pandas DataFrame containing client transaction data with missing values, write code to impute missing numeric values with the median and categorical values with the mode.
#Python
#Pandas
#Data Cleaning
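A short sketch of one approach, assuming that "numeric" means any column pandas reports as a numeric dtype and that everything else is treated as categorical. The function name `impute_missing` and the sample columns `amount` and `channel` are hypothetical.

```python
import pandas as pd


def impute_missing(df):
    """Return a copy with numeric NaNs filled by the median and others by the mode."""
    out = df.copy()
    for col in out.columns:
        if not out[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            modes = out[col].mode(dropna=True)
            # mode() is empty when the whole column is missing; leave it untouched.
            if not modes.empty:
                out[col] = out[col].fillna(modes.iloc[0])
    return out


# Hypothetical transaction data with one gap in each column.
df = pd.DataFrame({
    "amount": [10.0, None, 30.0, 100.0],
    "channel": ["web", "store", None, "web"],
})
cleaned = impute_missing(df)
print(cleaned)
```

In a modeling pipeline, the median and mode would normally be computed on the training split only and then reused on validation and production data, to avoid leakage.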
Machine Learning Engineer • Coding • medium
Write a SQL query to find the second highest salary within each department.
#SQL
#Window Functions
#Ranking
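One way to approach this, sketched with `DENSE_RANK()` and run through Python's built-in `sqlite3` module. The `employees` table, its columns, and the sample rows are hypothetical. `DENSE_RANK()` is used here so that ties for the top salary share rank 1 and the next distinct salary gets rank 2; `ROW_NUMBER()` would give a different answer when the top salary is tied.

```python
import sqlite3

# Hypothetical schema: employees(department, employee, salary).
SECOND_HIGHEST_SQL = """
SELECT department, employee, salary
FROM (
    SELECT
        department,
        employee,
        salary,
        DENSE_RANK() OVER (
            PARTITION BY department
            ORDER BY salary DESC
        ) AS salary_rank
    FROM employees
) AS ranked
WHERE salary_rank = 2
ORDER BY department, employee
"""


def second_highest(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (department TEXT, employee TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    result = conn.execute(SECOND_HIGHEST_SQL).fetchall()
    conn.close()
    return result


rows = [
    ("eng", "Asha", 150), ("eng", "Ben", 150), ("eng", "Chen", 120), ("eng", "Dee", 100),
    ("ops", "Eli", 90), ("ops", "Fay", 80),
    ("hr", "Gus", 70),  # single-person department: no second-highest salary
]
print(second_highest(rows))
# [('eng', 'Chen', 120), ('ops', 'Fay', 80)]
```

Departments with only one distinct salary simply drop out of the result, which is worth stating explicitly in an interview.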
Machine Learning Engineer • Coding • hard
Write a Python function to calculate the TF-IDF scores for a given list of text documents without using Scikit-Learn.
#Python
#NLP
#Math
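A minimal sketch using only the standard library. It assumes whitespace tokenization with lowercasing, term frequency defined as count divided by document length, and the unsmoothed inverse document frequency `log(N / df)`. These are common textbook definitions, but other variants exist (Scikit-Learn, for instance, defaults to a smoothed IDF with L2 normalization), so the numbers will not match every library.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


docs = ["the cat sat", "the dog sat", "the cat ran"]
scores = tf_idf(docs)
for doc, score in zip(docs, scores):
    print(doc, {term: round(value, 3) for term, value in score.items()})
```

With this definition, a term that appears in every document (here "the") scores exactly zero, which illustrates why TF-IDF down-weights ubiquitous words.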
Machine Learning Engineer • Coding • medium
Find the top K most frequent elements in an array of user interaction logs.
#Python
#Heaps
#Hash Maps
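A possible solution sketch matching the hash map and heap tags: count occurrences with `collections.Counter`, then select the k largest counts with `heapq.nlargest`. The function name and the sample event strings are illustrative assumptions.

```python
import heapq
from collections import Counter


def top_k_frequent(events, k):
    """Return the k most frequent events as (event, count) pairs, most frequent first."""
    counts = Counter(events)  # hash map: event -> number of occurrences
    # nlargest maintains a heap of size k, giving O(n log k) instead of a full sort.
    return heapq.nlargest(k, counts.items(), key=lambda item: item[1])


logs = ["click", "view", "click", "purchase", "view", "click"]
print(top_k_frequent(logs, 2))
# [('click', 3), ('view', 2)]
```

`Counter.most_common(k)` provides the same result in one call; writing the heap step out explicitly is mainly useful for discussing the complexity.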
Machine Learning Engineer • Coding • easy
Write a Python script to extract specific entities (like dates and amounts) from a text string using Regular Expressions.
#Python
#Regex
#Text Processing
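A small sketch with the standard `re` module. The formats are assumptions, since the question leaves them open: dates are taken to be ISO-style `YYYY-MM-DD`, and amounts are taken to be a currency symbol followed by digits with optional thousands separators and two optional decimals. The patterns check shape only, so an impossible date such as `2024-13-45` would still match.

```python
import re

# Assumed formats: ISO dates, and amounts such as $1,250.00 or €35.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
AMOUNT_RE = re.compile(r"[$€£]\s?\d+(?:,\d{3})*(?:\.\d{2})?")


def extract_entities(text):
    """Return all date-like and amount-like substrings found in text."""
    return {
        "dates": DATE_RE.findall(text),
        "amounts": AMOUNT_RE.findall(text),
    }


text = "Invoice dated 2024-03-15 for $1,250.00, due 2024-04-14. Late fee: $35."
print(extract_entities(text))
# {'dates': ['2024-03-15', '2024-04-14'], 'amounts': ['$1,250.00', '$35']}
```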
Machine Learning Engineer • System Design • hard
Design a personalized recommendation system for a large retail client's e-commerce platform.
#Recommendation Systems
#Collaborative Filtering
#Scalability
Machine Learning Engineer • System Design • hard
Design a real-time credit card fraud detection system for a banking client.
#Fraud Detection
#Streaming
#Real-time Processing
Machine Learning Engineer • System Design • medium
How would you design an end-to-end MLOps pipeline on AWS or Azure for a model that needs weekly retraining?
#MLOps
#Cloud
#CI/CD
Machine Learning Engineer • System Design • medium
Design a document extraction system to automatically parse and extract key fields from scanned invoices using OCR and NLP.
#Computer Vision
#NLP
#OCR
Machine Learning Engineer • System Design • medium
Design a system to predict customer churn for a telecommunications company. What features would you use and how would you serve the predictions?
#Tabular Data
#Classification
#Feature Engineering
Machine Learning Engineer • System Design • hard
A client wants to deploy a Large Language Model (LLM) for an internal knowledge base but is highly concerned about data privacy. How do you design this?
#LLMs
#RAG
#Data Privacy
#Security
Machine Learning Engineer • Technical • easy
Explain the bias-variance tradeoff. How do you know if your model is suffering from high bias or high variance?
#Model Evaluation
#Overfitting
#Underfitting
Machine Learning Engineer • Technical • medium
How do you handle highly imbalanced datasets in a classification problem for a fraud detection client?
#Data Imbalance
#Classification
#Fraud Detection
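One of several techniques a candidate might mention (alongside resampling, threshold tuning, and precision-recall based metrics) is class weighting. The sketch below computes inverse-frequency weights with the formula `n_samples / (n_classes * class_count)`, the same heuristic Scikit-Learn documents for `class_weight="balanced"`. The function name and the 1% fraud rate are illustrative assumptions.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency so every class contributes equally."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical fraud dataset: 990 legitimate (0) and 10 fraudulent (1) transactions.
labels = [0] * 990 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

Here each fraud example weighs 50.0 against roughly 0.505 for a legitimate one, so both classes carry the same total weight in the loss.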
Machine Learning Engineer • Technical • medium
Explain the difference between Bagging and Boosting. Give examples of algorithms for each.
#Ensemble Methods
#Random Forest
#XGBoost
Machine Learning Engineer • Technical • hard
How does XGBoost handle missing values under the hood?
#XGBoost
#Missing Data
#Tree Algorithms
Machine Learning Engineer • Technical • medium
What are the mathematical and practical differences between L1 (Lasso) and L2 (Ridge) regularization?
#Regularization
#Linear Models
#Feature Selection
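To make the practical difference concrete, the sketch below solves the one-dimensional version of each penalized problem in closed form, where `z` stands for the unregularized coefficient and `lam` for the penalty strength. This simplified setting (a single coefficient, or equivalently an orthonormal design) is an assumption chosen for illustration; real models are fitted iteratively.

```python
def ridge_shrink(z, lam):
    """Minimizer of 0.5*(w - z)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return z / (1.0 + lam)


def lasso_shrink(z, lam):
    """Minimizer of 0.5*(w - z)**2 + lam*abs(w): soft-thresholding."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small coefficients are set exactly to zero


for z in (3.0, 0.8, -0.3):
    print(z, ridge_shrink(z, 1.0), lasso_shrink(z, 1.0))
```

L2 scales every coefficient toward zero without reaching it, while L1 subtracts a fixed amount and zeroes out anything smaller than `lam`, which is why Lasso can act as a feature selector.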
Machine Learning Engineer • Technical • hard
Explain the architecture of a Transformer model and how the self-attention mechanism works.
#NLP
#Transformers
#Attention Mechanism
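A minimal NumPy sketch of single-head scaled dot-product self-attention, the core operation a full answer would build on. It deliberately omits batching, masking, multiple heads, positional encodings, and the feed-forward and normalization layers. The projection matrices `W_q`, `W_k`, and `W_v` are random placeholders rather than learned weights.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(X, W_q, W_k, W_v):
    """X has shape (seq_len, d_model); returns (output, attention_weights)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    # Every token's query is scored against every token's key, scaled by sqrt(d_k).
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights


rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))  # 4 tokens, model dimension 8
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
output, weights = self_attention(X, W_q, W_k, W_v)
print(output.shape, weights.shape)
```

Each output row is a weighted average of the value vectors, with weights determined by how strongly that token's query matches every key.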
Machine Learning Engineer • Technical • medium
How do you evaluate an unsupervised clustering model when ground truth labels are not available?
#Clustering
#Model Evaluation
#Unsupervised Learning
Machine Learning Engineer • Technical • medium
What is the difference between data drift and concept drift? How do you monitor for them in a production MLOps pipeline?
#Model Monitoring
#Data Drift
#Production ML
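For the monitoring half of the question, one commonly used data drift statistic is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. The sketch below is a simplified standard-library version; the bin edges, the `eps` floor for empty bins, and the synthetic data are assumptions. It addresses data drift on a single numeric feature only. Detecting concept drift generally requires ground-truth labels or a proxy for model performance.

```python
import math
from bisect import bisect_right


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between two samples using shared bin edges."""
    def proportions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


edges = [0.25, 0.5, 0.75]                  # assumed bin boundaries
training = [i / 100 for i in range(100)]   # reference distribution
production = [v + 0.3 for v in training]   # same shape, shifted upward

print(psi(training, training, edges))      # identical samples give 0.0
print(psi(training, production, edges))    # the shifted sample gives a large value
```

Alert thresholds are usually tuned per feature; values such as 0.1 and 0.25 are often quoted as rules of thumb rather than fixed standards.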
Machine Learning Engineer • Technical • medium
Explain the ROC curve and AUC. In what scenario would you prefer to use the Precision-Recall curve over ROC?
#Metrics
#Classification
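AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition. It is quadratic in the number of examples, so it is meant for illustration rather than for large datasets. Labels are assumed to be 0 and 1.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
# 0.75
```

Because the denominator is dominated by negative examples when positives are rare, ROC AUC can look strong on heavily imbalanced data even when precision is poor, which is the usual argument for preferring the Precision-Recall curve in that setting.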
Machine Learning Engineer • Technical • hard
How do you optimize a Deep Learning model that is taking too long to train on a cloud GPU instance?
#Optimization
#GPU
#PyTorch/TensorFlow
Machine Learning Engineer • Technical • medium
What is the difference between generative and discriminative models? Provide examples of each.
#Generative AI
#Classification
#Statistics
Machine Learning Engineer • Technical • hard
Explain the vanishing gradient problem in deep neural networks and how architectures like LSTMs or ResNets solve it.
#Neural Networks
#Backpropagation
#Architecture
Machine Learning Engineer • Technical • medium
Walk me through the steps of containerizing a Python machine learning model using Docker for deployment.
#Docker
#Deployment
#Containerization
Machine Learning Engineer • Technical • medium
What is MLflow, and how have you used it in the machine learning lifecycle?
#MLflow
#Experiment Tracking
#Model Registry
Machine Learning Engineer • Technical • easy
Explain the architectural differences between Batch inference and Real-time inference. When would you use which?
#Inference
#Architecture
#Deployment
Meet Your Interviewers
The "Standard" Interviewer
Senior Engineer
Focuses on core competencies, system constraints, and clear communication.
Unwritten Rules
Think Out Loud
Always explain your thought process before writing code or drawing architecture.