Amazon

E-commerce and cloud computing giant with AWS, the world's leading cloud platform.

5 Rounds • ~28 Days • Very Hard

The Interview Loop

Recruiter Screen (30 min)

Standard fit check, behavioral questions, and resume overview.

Technical Loop (3-4 Rounds)

Deep dive into domain knowledge, coding, and system design.

Interview Question Bank

Questions by role: AI Engineer (47), Cloud Engineer (69), Data Analyst (43), Data Engineer (76), Data Scientist (65), Machine Learning Engineer (15), ML Engineer (52), Product Manager (15), Software Engineer (15)

Questions by topic: LLMs (5), MLOps (5), Deep Learning (4), Deployment (3), Algorithms (3), Model Optimization (3), ML Fundamentals (2), Inference (2)

ML Engineer • Behavioral • medium

Describe a model you deployed to production. What were the biggest challenges?

#Deployment #Challenges

ML Engineer • Behavioral • hard

Tell me about a time you had to optimize a model for latency without sacrificing too much accuracy.

#Latency #Accuracy

ML Engineer • Behavioral • medium

Describe how you collaborated with data scientists to productionize their research code.

#Research to Production

ML Engineer • Behavioral • hard

Tell me about a time an ML model caused an unexpected real-world impact.

#Responsibility #AI Safety

ML Engineer • Behavioral • easy

How do you keep up with the rapidly evolving ML landscape?

#Continuous Learning

ML Engineer • Behavioral • hard

Describe a time you had to re-architect a system because the original ML approach didn't scale.

#Scalability

ML Engineer • Behavioral • medium

Tell me about a disagreement you had with a researcher. How did you resolve it?

#Communication

ML Engineer • Behavioral • medium

How do you decide when a model is 'good enough' to ship?

#Quality #Judgment

ML Engineer • Behavioral • medium

Tell me about a time you demonstrated customer obsession in an ML project. (Amazon Leadership Principle)

#Customer Obsession

ML Engineer • Coding • hard

Implement a K-means clustering algorithm from scratch in Python.

#K-Means #Clustering
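
One possible answer sketch for this question, using Lloyd's algorithm in NumPy. The function name `kmeans`, its parameters, and the random-point initialisation are illustrative assumptions; an interviewer may expect k-means++ initialisation or multiple restarts instead.

```python
import numpy as np

def kmeans(X, k, n_iters=100, tol=1e-6, seed=0):
    """Lloyd's algorithm: alternate point assignment and centroid updates."""
    rng = np.random.default_rng(seed)
    # Assumption: initialise centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Squared Euclidean distance from every point to every centroid: (n, k).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:  # centroids stopped moving
            break
    return centroids, labels
```

Points worth raising out loud: the result depends on initialisation, the distance computation costs O(n·k·d) per iteration, and empty clusters need an explicit policy.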

ML Engineer • Coding • hard

Implement logistic regression with gradient descent in NumPy.

#Logistic Regression #NumPy
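
A minimal sketch of one way to answer this, assuming binary labels in {0, 1}, full-batch gradient descent on the mean log loss, and no regularisation. The function names and default hyperparameters are illustrative choices, not a reference solution.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, lr=0.1, n_iters=1000):
    """Full-batch gradient descent on the mean binary cross-entropy."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)        # predicted probabilities
        error = p - y                 # gradient of the loss w.r.t. the logits
        w -= lr * (X.T @ error) / n   # gradient w.r.t. the weights
        b -= lr * error.mean()        # gradient w.r.t. the bias
    return w, b

def predict(X, w, b, threshold=0.5):
    return (sigmoid(X @ w + b) >= threshold).astype(int)
```

A follow-up usually concerns numerical stability (very large logits overflow `np.exp`) and adding an L2 penalty, neither of which this sketch handles.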

ML Engineer • Coding • hard

Write a custom PyTorch Dataset and DataLoader for irregular time series data.

#PyTorch #DataLoader

ML Engineer • Coding • medium

Implement a sliding window approach to detect anomalies in a time series.

#Anomaly Detection #Time Series
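
One way this could be sketched is a trailing-window z-score: each point is compared with the mean and standard deviation of the previous `window` values. The function name, the window size, and the threshold of 3 standard deviations are assumptions for illustration.

```python
from collections import deque
from statistics import mean, pstdev

def detect_anomalies(series, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` standard deviations of that window."""
    recent = deque(maxlen=window)  # the last `window` observations
    anomalies = []
    for i, x in enumerate(series):
        if len(recent) == window:  # only score once the window is full
            mu = mean(recent)
            sigma = pstdev(recent)
            # A perfectly flat window (sigma == 0) is skipped in this sketch.
            if sigma > 0 and abs(x - mu) > threshold * sigma:
                anomalies.append(i)
        recent.append(x)  # note: anomalies also enter the window
    return anomalies

print(detect_anomalies([10, 11, 10, 9, 10, 11, 50, 10, 9, 11]))  # [6]
```

Because a flagged point still enters the window, it inflates the statistics for the next few points. Excluding flagged points or using a median-based scale are common refinements to mention.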

ML Engineer • Coding • hard

How would you write a batched inference pipeline using Python and Triton Inference Server?

#Triton #Batching

ML Engineer • System Design • hard

Design a CI/CD pipeline for ML models.

#CI/CD #Deployment

ML Engineer • System Design • hard

What is a feature store? Design one from scratch.

#Feature Engineering #MLOps

ML Engineer • System Design • hard

How would you serve a model that needs to respond in under 10ms?

#Low Latency #Serving

ML Engineer • System Design • hard

Design a system to retrain models automatically when performance degrades.

#Retraining #Automation

ML Engineer • System Design • hard

Design YouTube's video recommendation system end to end.

#Recommendations #Ranking

ML Engineer • System Design • hard

Design a real-time content moderation system.

#NLP #Real-Time

ML Engineer • System Design • hard

Design a search ranking system for an e-commerce platform.

#Ranking #Relevance

ML Engineer • System Design • hard

Design a training and serving architecture for a large language model at scale.

#Infrastructure #Scale

ML Engineer • System Design • hard

How would you build a personalized ad targeting system?

#Targeting #ML Systems

ML Engineer • Technical • easy

What is the difference between a data scientist and an ML engineer?

#Roles #MLOps

ML Engineer • Technical • medium

Explain the model training pipeline from raw data to deployment.

#Pipeline #Training

ML Engineer • Technical • medium

What is the difference between online learning and offline learning?

#Online Learning #Batch Learning

ML Engineer • Technical • medium

How do you handle missing data in ML model features?

#Imputation #Missing Data
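
As one illustration of a possible answer, the sketch below does mean imputation and appends a missing-value indicator per feature, so the model can still learn from the fact that a value was absent. The helper names are hypothetical, and mean imputation is only one of several strategies (median, model-based, or leaving NaNs to tree models that support them).

```python
import numpy as np

def fit_imputer(X_train):
    """Learn per-column fill values from the training split only."""
    return np.nanmean(X_train, axis=0)

def impute_with_indicator(X, fill_values):
    """Replace NaNs with the learned fill values and append 0/1 missing flags."""
    mask = np.isnan(X)
    X_filled = np.where(mask, fill_values, X)  # fill_values broadcasts per column
    return np.hstack([X_filled, mask.astype(float)])
```

Fitting the fill values on training data and reusing them at serving time avoids leakage and train/serve skew, which is usually the point interviewers probe.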

ML Engineer • Technical • medium

Explain gradient descent variants: batch, stochastic, and mini-batch.

#Gradient Descent #Optimization
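
The three variants differ only in how many examples feed each gradient step. A hedged sketch on least-squares linear regression, where `batch_size=n` gives batch gradient descent, `batch_size=1` gives stochastic gradient descent, and anything in between is mini-batch. The function name and hyperparameters are assumptions for illustration.

```python
import numpy as np

def sgd_linear_regression(X, y, batch_size, lr=0.05, epochs=300, seed=0):
    """Minimise mean squared error with a configurable batch size."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            residual = X[idx] @ w - y[idx]
            grad = 2.0 * X[idx].T @ residual / len(idx)  # MSE gradient on the batch
            w -= lr * grad
    return w

x = np.linspace(-1.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])  # bias column plus one feature
y = 1.0 + 2.0 * x                          # noise-free target
w_batch = sgd_linear_regression(X, y, batch_size=len(X))  # batch
w_sgd = sgd_linear_regression(X, y, batch_size=1)         # stochastic
w_mini = sgd_linear_regression(X, y, batch_size=4)        # mini-batch
```

On this noise-free toy problem all three settings approach the weights `[1, 2]`. With noisy data, smaller batches give noisier updates but more of them per epoch.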

ML Engineer • Technical • medium

What are learning rate schedulers and why are they important?

#Learning Rate #Training
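
To make an answer concrete, here is a sketch of one widely used schedule shape: linear warmup followed by cosine decay. The function name `lr_at_step` and all default values are illustrative assumptions, not settings from any particular framework.

```python
import math

def lr_at_step(step, base_lr=1e-3, warmup_steps=100, total_steps=1000, min_lr=0.0):
    """Linear warmup to base_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Ramp up linearly so early, noisy gradients take small steps.
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)  # hold at min_lr after total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

The design argument to give: a large rate early helps make fast progress, while a decaying rate later lets the optimiser settle instead of bouncing around a minimum.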

ML Engineer • Technical • hard

Explain the attention mechanism in transformers with mathematical detail.

#Attention #Transformers
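
The core formula to write down is scaled dot-product attention, $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}(QK^\top / \sqrt{d_k})\,V$. A single-head NumPy sketch follows; the function names and the boolean `mask` convention (True means "may attend") are assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v). Returns (output, weights)."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)  # (n_q, n_k) similarities
    if mask is not None:
        scores = np.where(mask, scores, -1e9)       # block disallowed positions
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ V, weights
```

Dividing by the square root of `d_k` keeps the variance of the scores roughly independent of the key dimension, which stops the softmax from saturating. A lower-triangular mask gives causal (decoder-style) attention.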

ML Engineer • Technical • hard

What is quantization in neural networks? How does it reduce inference cost?

#Quantization #Inference
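
A small sketch can ground the answer. This assumes symmetric, per-tensor, post-training quantization of float32 weights to int8, which is only one of several schemes (per-channel, asymmetric with a zero point, quantization-aware training). The function names are illustrative.

```python
import numpy as np

def quantize_int8(x):
    """Map float values to int8 using a single scale for the whole tensor."""
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original floats."""
    return q.astype(np.float32) * scale

weights = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, scale = quantize_int8(weights)
max_error = np.abs(dequantize(q, scale) - weights).max()  # about scale / 2 at most
```

The int8 tensor takes a quarter of the memory of the float32 original, which reduces memory bandwidth and allows integer arithmetic on hardware that supports it. The price is a rounding error of up to roughly half a quantization step per value.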

ML Engineer • Technical • hard

Explain knowledge distillation. When would you use it?

#Distillation #Compression
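
A sketch of the loss commonly associated with distillation: a KL term between temperature-softened teacher and student distributions, blended with the ordinary cross-entropy on hard labels. The function names, the temperature `T`, and the mixing weight `alpha` are assumptions chosen for illustration.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """alpha * T^2 * KL(teacher_T || student_T) + (1 - alpha) * CE(student, labels)."""
    log_p_student = log_softmax(student_logits / T)
    log_p_teacher = log_softmax(teacher_logits / T)
    p_teacher = np.exp(log_p_teacher)
    # Soft-target term; T^2 keeps gradient magnitudes comparable across temperatures.
    kd = (p_teacher * (log_p_teacher - log_p_student)).sum(axis=-1).mean() * T ** 2
    # Hard-label cross-entropy at temperature 1.
    hard = -log_softmax(student_logits)[np.arange(len(labels)), labels].mean()
    return alpha * kd + (1.0 - alpha) * hard
```

A higher temperature exposes more of the teacher's relative preferences among wrong classes, which is the extra signal a small student gains over training on labels alone.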

ML Engineer • Technical • hard

What is the difference between model parallelism and data parallelism in distributed training?

#Parallelism #Training

ML Engineer • Technical • medium

How do you version ML models and datasets? What tools do you use?

#Versioning #DVC #MLflow

ML Engineer • Technical • hard

Explain blue-green deployment vs canary deployment for ML models.

#Blue-Green #Canary
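
The routing difference between the two strategies can be sketched with a deterministic traffic splitter. Everything here is hypothetical (the `route` function, the `"canary"`/`"stable"` labels, hashing on a user ID); in practice a load balancer or service mesh would usually perform this split.

```python
import hashlib

def route(user_id: str, canary_percent: float) -> str:
    """Send a stable, pseudo-random share of users to the new model version."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10000  # bucket in 0..9999, fixed per user
    return "canary" if bucket < canary_percent * 100 else "stable"

# Canary: ramp the share gradually, e.g. 1 -> 10 -> 50 -> 100 percent.
share = sum(route(f"user-{i}", 10) == "canary" for i in range(10000)) / 10000

# Blue-green: the same switch, but flipped from 0 to 100 in a single step.
before_cutover = route("user-42", 0)    # always "stable"
after_cutover = route("user-42", 100)   # always "canary"
```

Hashing the user ID keeps each user on one model version during the ramp, which makes metric comparisons between the two groups cleaner than per-request random routing.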

ML Engineer • Technical • hard

How do you detect data drift vs model drift? How do you respond to each?

#Drift #Production
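
For the data drift half of the question, one commonly cited statistic is the Population Stability Index (PSI), which compares a feature's binned distribution in production against a reference sample. The sketch below is an assumption-laden illustration: quantile bins from the reference data, and a small `eps` to avoid taking the log of zero.

```python
import numpy as np

def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI = sum over bins of (a - e) * ln(a / e), with bin fractions e and a."""
    # Interior bin edges taken from quantiles of the reference sample.
    interior = np.quantile(expected, np.linspace(0, 1, n_bins + 1))[1:-1]
    e_counts = np.bincount(np.searchsorted(interior, expected), minlength=n_bins)
    a_counts = np.bincount(np.searchsorted(interior, actual), minlength=n_bins)
    e_frac = np.clip(e_counts / len(expected), eps, None)
    a_frac = np.clip(a_counts / len(actual), eps, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 5000)
shifted = rng.normal(1.0, 1.0, 5000)
psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
```

A rule of thumb often quoted is that PSI below about 0.1 suggests a stable feature and above about 0.25 a significant shift, though such thresholds should be validated per feature. Model drift, by contrast, needs labels or proxy metrics: input distributions can look stable while the relationship between inputs and targets changes.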

ML Engineer • Technical • medium

What is shadow mode deployment in ML?

#Shadow Mode #A/B Testing

ML Engineer • Technical • medium

Explain model serialization formats: ONNX, TorchScript, SavedModel.

#ONNX #Serialization

ML Engineer • Technical • medium

What is Kubernetes? How is it used for ML model serving?

#Kubernetes #Serving

ML Engineer • Technical • hard

How do you optimize GPU utilization during training?

#GPU #Performance

ML Engineer • Technical • hard

Explain mixed precision training (FP16/BF16). What are the risks?

#Mixed Precision #Performance
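
The main FP16 risk, gradient underflow, can be shown in a few lines. This sketch uses NumPy's `float16` to mimic what happens to a small gradient and how loss scaling recovers it. The helper name and the scale of 1024 are illustrative assumptions; NumPy has no native BF16 type, so only FP16 is shown.

```python
import numpy as np

def fp16_roundtrip(grad, loss_scale=1.0):
    """Scale a gradient, store it in FP16, then unscale it in FP32."""
    scaled = np.float16(np.float32(grad) * np.float32(loss_scale))
    return np.float32(scaled) / np.float32(loss_scale)

tiny_grad = 1e-8
without_scaling = fp16_roundtrip(tiny_grad)                 # underflows to 0.0
with_scaling = fp16_roundtrip(tiny_grad, loss_scale=1024)   # close to 1e-8
```

FP16 cannot represent magnitudes much below about 6e-8, so unscaled small gradients silently become zero. BF16 keeps the FP32 exponent range and largely avoids underflow, but it has fewer mantissa bits and therefore less precision.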

ML Engineer • Technical • medium

What are the differences between PyTorch and TensorFlow for production?

#PyTorch #TensorFlow

ML Engineer • Technical • medium

How do you profile and debug a slow training run?

#Profiling #Debugging

ML Engineer • Technical • hard

Explain the RLHF (Reinforcement Learning from Human Feedback) training approach.

#RLHF #Fine-Tuning

ML Engineer • Technical • hard

What is LoRA (Low-Rank Adaptation)? How does it reduce fine-tuning costs?

#LoRA #Fine-Tuning
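
The idea can be sketched as a forward pass: the pretrained weight `W` stays frozen and only a low-rank update `B @ A` is trained, so the effective weight is `W + (alpha / r) * B @ A`. The class below is a NumPy illustration with hypothetical names and no training loop, not the API of any fine-tuning library.

```python
import numpy as np

class LoRALinear:
    """Frozen dense layer plus a trainable rank-r update."""

    def __init__(self, W, r=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = W.shape
        self.W = W                                        # frozen pretrained weight
        self.A = rng.normal(0.0, 0.01, size=(r, d_in))    # trainable, small random init
        self.B = np.zeros((d_out, r))                     # trainable, zero init
        self.scale = alpha / r

    def forward(self, x):
        # Base projection plus the low-rank correction.
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_params(self):
        return self.A.size + self.B.size

W = np.random.default_rng(1).normal(size=(64, 64))
layer = LoRALinear(W, r=4)
full_params = W.size                     # 4096 if fine-tuning W directly
lora_params = layer.trainable_params()   # 512 with rank 4
```

Because `B` starts at zero, the adapted layer initially reproduces the pretrained layer exactly. The saving comes from training `r * (d_in + d_out)` parameters instead of `d_in * d_out`, which also shrinks optimizer state and checkpoint size.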

ML Engineer • Technical • hard

What is RAG (Retrieval-Augmented Generation)? Describe its architecture.

#RAG #Vector Search

ML Engineer • Technical • hard

How would you evaluate an LLM for a production use case?

#Evaluation #Benchmarking

ML Engineer • Technical • medium

Explain vector databases. What are FAISS, Pinecone, and Weaviate?

#Vector DB #Embeddings

ML Engineer • Technical • medium

What is model ensembling? When does it help, and when does it hurt?

#Ensembling #Performance

ML Engineer • Technical • hard

How would you use SageMaker for end-to-end MLOps?

#SageMaker #AWS

ML Engineer • Technical • hard

Explain how Amazon Personalize works internally.

#Personalize #AWS

ML Engineer • Technical • hard

How would you deploy a fraud detection model on AWS Lambda?

#Lambda #Fraud

Difficulty Radar

Based on recent AI-sourced data.

Meet Your Interviewers

The "Standard" Interviewer

Senior Engineer

Focuses on core competencies, system constraints, and clear communication.

Unwritten Rules

Think Out Loud

Always explain your thought process before writing code or drawing architecture.
