Amazon
E-commerce and cloud computing giant with AWS, the world's leading cloud platform.
5 Rounds
~28 Days
Very Hard
The Interview Loop
Recruiter Screen (30 min)
Standard fit check, behavioral questions, and resume overview.
Technical Loop (3-4 Rounds)
Deep dive into domain knowledge, coding, and system design.
Interview Question Bank
AI Engineer
•
Behavioral
•
hard
Tell me about a time an AI system you built produced unexpected or harmful outputs.
#Responsibility
#Ethics
AI Engineer
•
Behavioral
•
hard
Describe an AI product you built from scratch. What were the key technical decisions?
#Product Development
AI Engineer
•
Behavioral
•
hard
Tell me about an AI project where you had to balance innovation with reliability.
#Reliability
#Innovation
AI Engineer
•
Behavioral
•
hard
Describe a situation where you had to debug a hard-to-reproduce AI model failure.
#Problem Solving
AI Engineer
•
Behavioral
•
medium
How do you handle stakeholder uncertainty around AI capabilities and limitations?
#Stakeholders
#Expectations
AI Engineer
•
Behavioral
•
medium
Tell me about a time you optimized an LLM application for cost or latency.
#Cost
#Latency
AI Engineer
•
Behavioral
•
medium
Describe a time you had to choose between using an AI model and a simpler rule-based system.
#Tradeoffs
#Pragmatism
AI Engineer
•
Behavioral
•
easy
How do you stay current with the fast-moving AI/ML research landscape?
#Research
#Continuous Learning
AI Engineer
•
Coding
•
hard
Implement a semantic chunking strategy for long documents.
#Chunking
#Embeddings
AI Engineer
•
Coding
•
hard
Implement a simple RAG pipeline using Python, LangChain, and FAISS.
#RAG
#Python
AI Engineer
•
Coding
•
medium
Write a retry mechanism with exponential backoff for LLM API calls.
#Reliability
#APIs
AI Engineer
•
Coding
•
medium
Write a Python class to manage conversation history for a multi-turn chatbot.
#Chatbot
#Memory
AI Engineer
•
System Design
•
hard
Design an AI agent system that can autonomously browse the web and complete tasks.
#Agents
#Tool Use
AI Engineer
•
System Design
•
hard
Design a real-time AI safety filter for user-generated content.
#Content Moderation
#Real-Time
AI Engineer
•
System Design
•
hard
How would you build a multi-modal AI system that processes both text and images?
#Multi-Modal
#Vision
AI Engineer
•
System Design
•
hard
Design an AI code review system that integrates with GitHub PRs.
#Code Review
#LLM
AI Engineer
•
System Design
•
hard
Design a document question-answering system using RAG.
#RAG
#Vector Search
AI Engineer
•
System Design
•
hard
Design an AI-powered customer support chatbot for an e-commerce platform.
#Chatbot
#LLM
AI Engineer
•
System Design
•
hard
How would you architect an AI platform that supports 1000 concurrent LLM requests?
#Scaling
#LLM Serving
AI Engineer
•
Technical
•
medium
What are guardrails in LLM applications? How do they work?
#Guardrails
#Output Filtering
AI Engineer
•
Technical
•
medium
What is LangChain? What are its key components (Chains, Agents, Tools)?
#LangChain
#Agents
AI Engineer
•
Technical
•
medium
Explain structured output generation from LLMs (JSON mode, Instructor library).
#Structured Output
#JSON
AI Engineer
•
Technical
•
medium
What is streaming response from an LLM API? How do you implement it in a web app?
#Streaming
#API
AI Engineer
•
Technical
•
medium
How do you manage LLM API rate limits and costs in production?
#Rate Limiting
#Cost
AI Engineer
•
Technical
•
hard
Explain function calling / tool use in LLMs. How do you implement it?
#Function Calling
#Tool Use
AI Engineer
•
Technical
•
hard
Explain the difference between GPT, BERT, and T5 architectures.
#GPT
#BERT
#T5
AI Engineer
•
Technical
•
medium
What is prompt engineering? What are few-shot, zero-shot, and chain-of-thought prompting?
#Prompt Engineering
#Few-Shot
AI Engineer
•
Technical
•
hard
Explain how RLHF (Reinforcement Learning from Human Feedback) improves LLMs.
#RLHF
#Alignment
AI Engineer
•
Technical
•
hard
What is RAG (Retrieval-Augmented Generation)? When would you use it over fine-tuning?
#RAG
#Fine-Tuning
AI Engineer
•
Technical
•
medium
Explain the difference between fine-tuning and in-context learning.
#Fine-Tuning
#ICL
AI Engineer
•
Technical
•
medium
What is token context window? How do you handle documents longer than the context limit?
#Context Window
#Chunking
AI Engineer
•
Technical
•
hard
Explain positional encoding in transformers. What are the differences between absolute and rotary position embeddings?
#Positional Encoding
#RoPE
AI Engineer
•
Technical
•
hard
What is hallucination in LLMs? How do you detect and mitigate it?
#Hallucination
#Safety
AI Engineer
•
Technical
•
medium
Explain the difference between autoregressive and masked language modeling.
#Autoregressive
#Masked LM
AI Engineer
•
Technical
•
hard
What is a mixture of experts (MoE) architecture? How does it scale?
#MoE
#Scaling
AI Engineer
•
Technical
•
hard
Explain how vector similarity search works. What are HNSW and IVF indices?
#HNSW
#Similarity Search
AI Engineer
•
Technical
•
medium
Compare vector databases: Pinecone, Weaviate, Qdrant, and pgvector.
#Vector DB
#Embeddings
AI Engineer
•
Technical
•
medium
How do you choose the right embedding model for a domain-specific search task?
#Embedding Models
#Search
AI Engineer
•
Technical
•
medium
What is semantic search? How does it differ from keyword-based search?
#Semantic Search
#NLP
AI Engineer
•
Technical
•
hard
Explain the difference between dense and sparse retrieval in RAG.
#Dense Retrieval
#BM25
AI Engineer
•
Technical
•
hard
How do you evaluate retrieval quality in a RAG system?
#Evaluation
#Retrieval
AI Engineer
•
Technical
•
hard
How do you evaluate the quality of an LLM-generated response?
#LLM Evaluation
#RAGAS
AI Engineer
•
Technical
•
hard
What is AI alignment? What are the key safety concerns with large-scale AI deployment?
#Alignment
#Safety
AI Engineer
•
Technical
•
hard
Explain the concept of AI bias. How do you detect and mitigate it in production?
#Bias
#Fairness
AI Engineer
•
Technical
•
hard
What is Constitutional AI? How does Anthropic use it?
#Constitutional AI
#Anthropic
AI Engineer
•
Technical
•
hard
How do you red-team an AI system?
#Red Teaming
#Security
AI Engineer
•
Technical
•
medium
How do you integrate OpenAI API or Gemini API into a production application?
#OpenAI
#Gemini
Cloud Engineer
•
Behavioral
•
hard
Tell me about a time you had to dive deep into a complex technical problem that others couldn't solve. What was the root cause and how did you find it?
#Dive Deep
#Root Cause Analysis
#Problem Solving
Cloud Engineer
•
Behavioral
•
medium
Describe a situation where you had a tight deadline to migrate a workload to the cloud. How did you prioritize your tasks to ensure you delivered on time without compromising security?
#Deliver Results
#Time Management
#Cloud Migration
Cloud Engineer
•
Behavioral
•
medium
Tell me about a time you significantly reduced cloud infrastructure costs.
#FinOps
#Impact
Cloud Engineer
•
Behavioral
•
medium
Describe a situation where you had to choose between two cloud architectures. How did you decide?
#Architecture
#Tradeoffs
Cloud Engineer
•
Behavioral
•
medium
Tell me about a time you improved the reliability of a cloud-based data system.
#SRE
#Impact
Cloud Engineer
•
Behavioral
•
medium
How do you communicate a complex cloud architecture to non-technical stakeholders?
#Stakeholders
Cloud Engineer
•
Behavioral
•
medium
Describe your experience with incident post-mortems. What do you include?
#Post-Mortem
#Learning
Cloud Engineer
•
Behavioral
•
hard
Tell me about a major cloud outage you experienced. How did you respond?
#Outage
#On-Call
Cloud Engineer
•
Behavioral
•
medium
Tell me about a time when you had to make a critical architectural decision with incomplete information. What was the risk, and how did you proceed?
#Bias for Action
#Decision Making
#Risk Management
Cloud Engineer
•
Behavioral
•
easy
How do you stay updated with new cloud services and features?
#Continuous Learning
Cloud Engineer
•
Behavioral
•
medium
Tell me about a time when you had to deal with a difficult customer or internal stakeholder who was unhappy with your cloud infrastructure delivery. How did you resolve it?
#Customer Obsession
#Conflict Resolution
#Communication
Cloud Engineer
•
Behavioral
•
hard
Describe a time you migrated a critical workload to the cloud with zero downtime.
#Cloud Migration
Cloud Engineer
•
Coding
•
easy
Write a bash one-liner or short script to parse an Apache access log file, find the top 10 IP addresses making the most requests, and count how many requests each made.
#Bash
#Linux
#Log Analysis
Cloud Engineer
•
Coding
•
medium
Write a Python script using Boto3 to find all unattached EBS volumes in a specific region and delete them if they have been unattached for more than 30 days.
#Python
#Boto3
#EBS
#Automation
Cloud Engineer
•
System Design
•
hard
A financial services client requires a Disaster Recovery plan with a Recovery Point Objective (RPO) of 5 minutes and a Recovery Time Objective (RTO) of 1 hour. How would you design this across two AWS regions?
#Disaster Recovery
#RPO/RTO
#Cross-Region Replication
#Route53
Cloud Engineer
•
System Design
•
medium
Design an event-driven serverless architecture to process image uploads. When a user uploads an image, it needs to be resized, watermarked, and its metadata stored in a database.
#Serverless
#Lambda
#S3
#DynamoDB
#Event-Driven Architecture
Cloud Engineer
•
System Design
•
hard
How do you design a multi-region active-active architecture on AWS?
#Multi-Region
#HA
Cloud Engineer
•
System Design
•
hard
Design a highly available, scalable web application on AWS that handles sudden spikes in traffic, similar to Prime Day. The application consists of a stateless web tier and a relational database.
#Auto Scaling
#ALB
#RDS Multi-AZ
#ElastiCache
#High Availability
Cloud Engineer
•
System Design
•
hard
How do you implement disaster recovery for a cloud data warehouse?
#DR
#RTO
#RPO
Cloud Engineer
•
System Design
•
hard
How would you architect a data platform that reduces spend by 40% without impacting performance?
#FinOps
#Cloud
Cloud Engineer
•
System Design
•
hard
How would you set up a streaming data pipeline on GCP using Pub/Sub and Dataflow?
#GCP
#Pub/Sub
#Dataflow
Cloud Engineer
•
System Design
•
hard
Design a data lake on AWS using S3, Glue, and Athena.
#AWS
#S3
#Athena
Cloud Engineer
•
Technical
•
easy
Explain the difference between regions, availability zones, and edge locations.
#Regions
#AZs
Cloud Engineer
•
Technical
•
medium
What is a cloud-native application? How does it differ from a lifted-and-shifted one?
#Cloud Native
#Migration
Cloud Engineer
•
Technical
•
hard
Explain multi-cloud vs hybrid cloud architectures and their tradeoffs.
#Multi-Cloud
#Hybrid
Cloud Engineer
•
Technical
•
hard
Explain Kubernetes architecture: control plane, nodes, pods, and services.
#K8s
#Containers
Cloud Engineer
•
Technical
•
hard
What is a Kubernetes Operator and when would you build one?
#Operators
#CRD
Cloud Engineer
•
Technical
•
medium
Explain Kubernetes resource requests vs limits. What happens if a pod exceeds its memory limit?
#Resources
#OOM
Cloud Engineer
•
Technical
•
hard
What is a service mesh? Explain how Istio works.
#Istio
#Service Mesh
Cloud Engineer
•
Technical
•
hard
How would you set up horizontal pod autoscaling based on custom metrics?
#HPA
#Custom Metrics
Cloud Engineer
•
Technical
•
medium
An EC2 Linux instance is experiencing high CPU utilization, but when you run 'top', the CPU usage from user processes is low, while 'wa' (iowait) is very high. What does this mean and how do you fix it?
#Linux
#Performance Tuning
#EBS
#I/O
Cloud Engineer
•
Technical
•
hard
You have an S3 bucket receiving thousands of PUT requests per second. Users are reporting 503 Slow Down errors. What is causing this and how do you architect around S3 request limits?
#S3
#Performance Optimization
#Throttling
Cloud Engineer
•
Technical
•
hard
How would you design an IAM strategy for a large enterprise moving to AWS to ensure least privilege while allowing developers to innovate? Explain how you would use SCPs, IAM Roles, and Permission Boundaries.
#IAM
#AWS Organizations
#SCPs
#Security
Cloud Engineer
•
Technical
•
medium
A customer complains that their EC2 instance in a private subnet cannot reach the internet to download updates, even though a NAT Gateway is configured. Walk me through your troubleshooting steps.
#VPC
#EC2
#NAT Gateway
#Troubleshooting
Cloud Engineer
•
Technical
•
hard
How do you implement cost governance in a large AWS organization?
#Cost
#AWS
Cloud Engineer
•
Technical
•
medium
What is AWS Transit Gateway? When would you use it?
#Transit Gateway
#Networking
Cloud Engineer
•
Technical
•
medium
Explain the AWS Well-Architected Framework's five pillars.
#Well-Architected
Cloud Engineer
•
Technical
•
hard
How does container networking work in Kubernetes?
#Networking
#CNI
Cloud Engineer
•
Technical
•
easy
Compare and contrast Amazon RDS and Amazon DynamoDB. In what specific scenarios would you choose DynamoDB over RDS for a microservice?
#RDS
#DynamoDB
#NoSQL
#Relational
Cloud Engineer
•
Technical
•
medium
You are deploying infrastructure using AWS CloudFormation. A stack update fails and is stuck in the UPDATE_ROLLBACK_FAILED state. Why does this happen and how do you recover the stack?
#CloudFormation
#IaC
#Troubleshooting
Cloud Engineer
•
Technical
•
medium
What is OpenTelemetry? How does it standardize observability?
#OpenTelemetry
#Tracing
Cloud Engineer
•
Technical
•
medium
How would you set up CloudWatch dashboards for a data pipeline?
#CloudWatch
#AWS
Cloud Engineer
•
Technical
•
medium
Explain the three pillars of observability: logs, metrics, and traces.
#Logs
#Metrics
#Traces
Cloud Engineer
•
Technical
•
easy
What is a runbook? How do you create effective runbooks for data infrastructure?
#Runbook
#On-Call
Cloud Engineer
•
Technical
•
medium
How do you do capacity planning for a cloud data platform?
#Scaling
#Planning
Cloud Engineer
•
Technical
•
hard
Explain chaos engineering. How would you implement it for a data pipeline?
#Chaos Engineering
#Fault Injection
Cloud Engineer
•
Technical
•
medium
What are SLOs, SLAs, and SLIs? How do you define them for a data platform?
#SLO
#Reliability
Cloud Engineer
•
Technical
•
hard
How would you implement network segmentation for a multi-tier application?
#Security
#Subnets
Cloud Engineer
•
Technical
•
medium
What is AWS PrivateLink? When would you use it?
#PrivateLink
#VPC
Cloud Engineer
•
Technical
•
medium
How do cloud IAM roles and policies work? Explain least-privilege principle.
#IAM
#Permissions
Cloud Engineer
•
Technical
•
medium
Explain TLS/SSL termination in a cloud load balancer.
#TLS
#Load Balancer
Cloud Engineer
•
Technical
•
hard
What is zero-trust networking? How do you implement it on cloud?
#Zero Trust
#Networking
Cloud Engineer
•
Technical
•
medium
How does AWS Glue Data Catalog work with Athena?
#Glue
#Athena
Cloud Engineer
•
Technical
•
medium
Explain AWS S3 storage classes and lifecycle policies.
#S3
#Cost
Cloud Engineer
•
Technical
•
hard
What is BigQuery Slots? How do you optimize BigQuery query costs?
#GCP
#Cost
Cloud Engineer
•
Technical
•
medium
Explain the difference between AWS Lambda and EC2 for data processing.
#Lambda
#Serverless
Cloud Engineer
•
Technical
•
hard
Compare AWS EMR, GCP Dataproc, and Azure HDInsight for Spark workloads.
#EMR
#Dataproc
#Spark
Cloud Engineer
•
Technical
•
hard
How do you handle Terraform state across multiple teams?
#State Management
#Collaboration
Cloud Engineer
•
Technical
•
medium
Explain idempotency in infrastructure provisioning.
#Idempotency
#Terraform
Cloud Engineer
•
Technical
•
medium
How do you manage secrets in cloud infrastructure? (HashiCorp Vault, AWS Secrets Manager)
#Secrets Management
#Vault
Cloud Engineer
•
Technical
•
medium
What is the difference between Terraform and Pulumi?
#Terraform
#Pulumi
Cloud Engineer
•
Technical
•
hard
Explain Terraform's state management. What happens if the state file is corrupted?
#IaC
#State
Cloud Engineer
•
Technical
•
medium
How does a Kubernetes Ingress controller work?
#Ingress
#Load Balancing
Cloud Engineer
•
Technical
•
medium
Explain the difference between Docker and containerd.
#Docker
#containerd
Cloud Engineer
•
Technical
•
hard
Compare AWS, GCP, and Azure for a data-intensive workload. What are the key differentiators?
#AWS
#GCP
#Azure
Cloud Engineer
•
Technical
•
medium
What is the shared responsibility model in cloud security?
#Cloud Security
#IAM
Cloud Engineer
•
Technical
•
easy
Explain IaaS, PaaS, and SaaS with examples.
#IaaS
#PaaS
#SaaS
Cloud Engineer
•
Technical
•
hard
What is a VPC (Virtual Private Cloud)? How do you design a secure VPC architecture?
#VPC
#Security
Cloud Engineer
•
Technical
•
medium
How does auto-scaling work? What are the different scaling strategies?
#Auto-Scaling
#EC2
Data Analyst
•
Behavioral
•
medium
Tell me about an analysis that changed a major business decision.
#Business Impact
#Influence
Data Analyst
•
Behavioral
•
medium
How do you handle a situation where a stakeholder challenges your analysis?
#Stakeholders
#Confidence
Data Analyst
•
Behavioral
•
medium
Describe a time you found an insight that was counterintuitive.
#Curiosity
Data Analyst
•
Behavioral
•
hard
Tell me about a time you had incomplete data but still needed to deliver analysis.
#Ambiguity
Data Analyst
•
Behavioral
•
easy
How do you ensure your analyses are reproducible?
#Reproducibility
Data Analyst
•
Behavioral
•
medium
Tell me about a time you discovered data quality issues mid-analysis. What did you do?
#Problem Solving
Data Analyst
•
Behavioral
•
medium
How do you prioritize analytical requests when multiple teams need you?
#Time Management
Data Analyst
•
Behavioral
•
medium
Describe a dashboard you built that was widely adopted. What made it successful?
#Visualization
Data Analyst
•
Coding
•
hard
Write a SQL query to find customers who made purchases in both January and February but not March.
#Set Operations
Data Analyst
•
Coding
•
easy
Explain how groupby and agg work in pandas with an example.
#Pandas
#GroupBy
Data Analyst
•
Coding
•
hard
What is a funnel query? Write one for a 3-step user onboarding flow.
#Funnel Analysis
Data Analyst
•
Coding
•
medium
Explain window functions. Write a query using LAG() to compute day-over-day change.
#Window Functions
Data Analyst
•
Coding
•
hard
Write a SQL query to calculate the rolling 28-day average session duration per user.
#Rolling Average
#Sessions
Data Analyst
•
Coding
•
hard
How would you detect anomalies in a daily revenue time series using SQL?
#Anomaly Detection
#SQL
Data Analyst
•
Coding
•
medium
What is a pivot table in SQL? How would you implement it without native PIVOT support?
#Pivot
#Data Transformation
Data Analyst
•
Coding
•
medium
How would you merge two large DataFrames efficiently in pandas?
#Pandas
#Merging
Data Analyst
•
Coding
•
medium
Describe how to detect and handle outliers in a dataset using Python.
#Outliers
#Data Cleaning
Data Analyst
•
Coding
•
easy
Write Python code to load a CSV, clean missing values, and compute summary statistics.
#Data Cleaning
#Pandas
Data Analyst
•
Coding
•
medium
Write a SQL query to calculate month-over-month revenue growth.
#Revenue
#Growth Analytics
Data Analyst
•
Coding
•
hard
How would you build a cohort analysis for user retention in SQL?
#Cohort Analysis
#Retention
Data Analyst
•
Coding
•
medium
How would you use pandas to compute a 7-day rolling average of sessions?
#Pandas
#Time Series
Data Analyst
•
Technical
•
medium
Describe your process for creating an executive-level analytics presentation.
#Executive Reporting
Data Analyst
•
Technical
•
easy
How do you choose between a bar chart, line chart, and scatter plot?
#Charts
#Design
Data Analyst
•
Technical
•
easy
Explain the difference between a HAVING clause and a WHERE clause.
#SQL Basics
Data Analyst
•
Technical
•
medium
How do you handle timezone conversions in SQL analytics?
#Timezones
#Analytics
Data Analyst
•
Technical
•
hard
Daily Active Users dropped 15% yesterday. Walk me through how you'd investigate.
#Root Cause Analysis
#Metrics
Data Analyst
•
Technical
•
medium
What is customer lifetime value (LTV)? How would you calculate it?
#LTV
#Retention
Data Analyst
•
Technical
•
easy
Explain the difference between DAU, WAU, and MAU. Which is most useful and when?
#Engagement
#KPIs
Data Analyst
•
Technical
•
medium
How would you measure the success of a new feature launch?
#Feature Success
#Metrics
Data Analyst
•
Technical
•
easy
What is ARPU (Average Revenue Per User)? How do you segment ARPU analysis?
#ARPU
#Revenue
Data Analyst
•
Technical
•
hard
Explain the concept of attribution modeling. What are last-click vs multi-touch models?
#Marketing Analytics
Data Analyst
•
Technical
•
medium
How would you build a dashboard to monitor e-commerce funnel health?
#Visualization
#Funnel
Data Analyst
•
Technical
•
hard
What metrics would you use to measure the health of a marketplace?
#Marketplace
#Supply & Demand
Data Analyst
•
Technical
•
easy
What is net promoter score (NPS)? How do you analyse NPS trends?
#NPS
#Customer Satisfaction
Data Analyst
•
Technical
•
hard
How would you measure the impact of a pricing change on revenue?
#Pricing
#A/B Test
Data Analyst
•
Technical
•
hard
Explain how you'd set up an A/B test to validate a new checkout flow.
#A/B Testing
#Statistics
Data Analyst
•
Technical
•
hard
What sample size do you need for an A/B test? How do you calculate it?
#Sample Size
#Power
Data Analyst
•
Technical
•
hard
A/B test shows p=0.04, but the effect size is tiny. Would you ship?
#Practical Significance
#Decision Making
Data Analyst
•
Technical
•
medium
What is a novelty effect in experimentation? How do you account for it?
#Novelty Effect
#Bias
Data Analyst
•
Technical
•
hard
How do you handle multiple metrics in an A/B test (metric tradeoffs)?
#Multiple Metrics
#Tradeoffs
Data Analyst
•
Technical
•
medium
What makes a good data visualization? Walk me through your design principles.
#Design
#Communication
Data Analyst
•
Technical
•
medium
How would you explain statistical significance to a non-technical product manager?
#Storytelling
#Statistics
Data Analyst
•
Technical
•
easy
What tools do you use for dashboarding? Compare Tableau, Looker, and Metabase.
#Tableau
#Looker
Data Engineer
•
Behavioral
•
medium
Describe a situation where a data pipeline you owned went down in production. How did you handle it?
#On-Call
#Problem Solving
Data Engineer
•
Behavioral
•
medium
Tell me about a time you simplified a complex data platform decision across multiple teams.
#Communication
#Stakeholders
Data Engineer
•
Behavioral
•
medium
How do you handle disagreements with data analysts or scientists who want features that compromise pipeline reliability?
#Conflict Resolution
Data Engineer
•
Behavioral
•
medium
Tell me about a time you significantly improved the performance of a data system.
#Performance
#Optimization
Data Engineer
•
Behavioral
•
hard
Describe how you've balanced technical debt vs. new feature development in a data platform.
#Prioritization
Data Engineer
•
Behavioral
•
medium
Tell me about a time you onboarded a new data source that had significant quality issues.
#Problem Solving
Data Engineer
•
Behavioral
•
easy
Describe your experience mentoring junior data engineers.
#Mentoring
#Collaboration
Data Engineer
•
Behavioral
•
easy
How do you stay current with rapidly evolving data engineering tools and practices?
#Growth Mindset
Data Engineer
•
Behavioral
•
medium
Tell me about a time you had to dive deep into a complex data discrepancy issue between a source system and your data warehouse. How did you find the root cause?
#Dive Deep
#Debugging
#Root Cause Analysis
Data Engineer
•
Behavioral
•
medium
Tell me about a time you had to make a technical compromise in your data pipeline design to meet an urgent business deadline. How did you handle the tech debt?
#Deliver Results
#Trade-offs
#Tech Debt
Data Engineer
•
Behavioral
•
easy
Tell me about a time you received feedback from a customer (or internal stakeholder) that your data or dashboard was incorrect. How did you respond?
#Customer Obsession
#Earn Trust
#Communication
Data Engineer
•
Coding
•
medium
Write a SQL query to compute a 7-day rolling average of daily sales.
#Window Functions
#Analytics
Data Engineer
•
Coding
•
medium
Write a Python function to flatten a deeply nested JSON object representing an Amazon product catalog, where keys of nested dictionaries should be concatenated with a dot ('.').
#Python
#Recursion
#Data Structures
#JSON Parsing
Data Engineer
•
Coding
•
medium
Given a list of strings representing Amazon search queries, write a Python script to return the top K most frequent queries. Your solution must be optimized for large datasets.
#Python
#Heaps
#Hash Maps
#Big O Notation
Data Engineer
•
Coding
•
hard
Write a SQL query to identify 'loyal' customers who have made at least one purchase in 3 consecutive months.
#Self Joins
#Window Functions
#Gaps and Islands
Data Engineer
•
Coding
•
medium
Write a SQL query to find the rolling 7-day average of daily sales per product category, given an 'orders' table with order_id, product_id, category_id, order_date, and order_amount.
#Window Functions
#Time Series
#Aggregations
Data Engineer
•
Coding
•
medium
Write a SQL query to find the second highest salary per department.
#Window Functions
#SQL
Data Engineer
•
Coding
•
medium
Write a SQL query to calculate the Year-over-Year (YoY) growth rate of total revenue for each product sub-category.
#Date Functions
#CTEs
#Math Operations
Data Engineer
•
System Design
•
hard
Design a data model for an e-commerce platform tracking orders, users, and products.
#ER Modeling
#Dimensional Modeling
Data Engineer
•
System Design
•
hard
Design an ETL pipeline to migrate 100TB of historical order data from an on-premise Oracle database to AWS Redshift, ensuring zero data loss and minimal downtime.
#Data Migration
#AWS DMS
#AWS S3
#AWS Redshift
Data Engineer
•
System Design
•
hard
How would you design a data pipeline that needs exactly-once delivery guarantees?
#Exactly-Once
#Kafka
Data Engineer
•
System Design
•
hard
Design an ETL pipeline that ingests 10TB of raw clickstream data daily.
#ETL
#Batch Processing
Data Engineer
•
System Design
•
hard
Design a real-time streaming pipeline to process and aggregate Amazon clickstream data to detect anomalous user behavior (e.g., bot scraping) within a 1-minute window.
#AWS Kinesis
#Apache Flink
#Stream Processing
#Anomaly Detection
Data Engineer
•
System Design
•
hard
Design a data pipeline for Prime Video's recommendation signals.
#Prime Video
#Pipeline
Data Engineer
•
System Design
•
hard
How would you design a real-time anomaly detection pipeline for 100K events/sec?
#Real-Time
#Anomaly Detection
Data Engineer
•
System Design
•
medium
Design a dimensional data model (Star Schema) for Amazon Prime Video to track user viewership, subscription changes, and content metadata.
#Star Schema
#Fact Tables
#Dimension Tables
#SCD
Data Engineer
•
System Design
•
hard
How would you design a data warehouse for a ride-sharing company from scratch?
#Architecture
#Design
Data Engineer
•
System Design
•
hard
Design a real-time inventory tracking system for Amazon's fulfillment network.
#Inventory
#Streaming
Data Engineer
•
System Design
•
hard
Design a scalable Data Lake architecture on AWS to support both ad-hoc querying by data scientists and daily aggregated reporting by BI tools.
#Data Lake
#AWS S3
#AWS Athena
#AWS Glue
#Parquet/Iceberg
Data Engineer
•
Technical
•
medium
Explain how Parquet and ORC file formats work and when you'd use each.
#Parquet
#ORC
#Columnar
Data Engineer
•
Technical
•
medium
What is the CAP theorem? Give an example of a real-world system tradeoff.
#CAP
#Consistency
#Availability
Data Engineer
•
Technical
•
medium
How does Kafka handle message ordering guarantees?
#Ordering
#Partitions
Data Engineer
•
Technical
•
medium
What is Apache Kafka? Explain topics, partitions, consumer groups, and offsets.
#Kafka
#Streaming
Data Engineer
•
Technical
•
hard
Explain the difference between map-side and reduce-side joins in MapReduce/Spark.
#Joins
#MapReduce
Data Engineer
•
Technical
•
hard
What is data skew in Spark? How do you diagnose and fix it?
#Data Skew
#Performance
Data Engineer
•
Technical
•
hard
Explain how Apache Spark's execution model works. What is a DAG in Spark?
#Spark
#DAG
#Distributed Computing
Data Engineer
•
Technical
•
easy
Explain the difference between push-based and pull-based data ingestion.
#Push
#Pull
#CDC
Data Engineer
•
Technical
•
hard
How would you use AWS Glue and Athena to build a serverless data lake?
#Glue
#Athena
Data Engineer
•
Technical
•
hard
Explain how Amazon Redshift Spectrum enables querying S3 data.
#Spectrum
#S3
Data Engineer
•
Technical
•
hard
How do you implement CDC (Change Data Capture) using AWS DMS?
#DMS
#Replication
Data Engineer
•
Technical
•
hard
What is Amazon's Write Every Read (WEAR) approach and why?
#WEAR
#Data Modeling
Data Engineer
•
Technical
•
hard
How would you optimize a SQL query that is running slowly on a 1 billion row table?
#Query Optimization
#Indexing
Data Engineer
•
Technical
•
hard
What is backfilling? How do you handle a backfill of 2 years of historical data without impacting production?
#Backfill
#Airflow
Data Engineer
•
Technical
•
medium
Describe how you'd implement circuit breakers in a data pipeline.
#Circuit Breakers
#Fault Tolerance
Data Engineer
•
Technical
•
medium
How do you monitor data pipeline health in production? What metrics do you track?
#Monitoring
#Alerting
Data Engineer
•
Technical
•
medium
What is Apache Airflow? How does it differ from Prefect or Dagster?
#Airflow
#Prefect
#Dagster
Data Engineer
•
Technical
•
medium
Explain the difference between OLAP and OLTP systems. When would you use each?
#OLAP
#OLTP
#Databases
Data Engineer
•
Technical
•
medium
Explain the difference between RANK(), DENSE_RANK(), and ROW_NUMBER().
#Window Functions
#SQL
Data Engineer
•
Technical
•
medium
Explain compaction in Delta Lake / Iceberg. Why is it important?
#Compaction
#Performance
Data Engineer
•
Technical
•
medium
How do you handle dependency management, backfilling, and failure recovery in a complex Apache Airflow DAG processing daily e-commerce transactions?
#Apache Airflow
#DAGs
#Fault Tolerance
#Idempotency
Data Engineer
•
Technical
•
medium
What is infrastructure as code (IaC)? Have you used Terraform for data infrastructure?
#Terraform
#IaC
Data Engineer
•
Technical
•
medium
How would you reduce costs in a cloud-based data platform?
#Cloud
#Cost
Data Engineer
•
Technical
•
medium
Explain the difference between S3, HDFS, and GCS for data storage.
#S3
#HDFS
#GCS
Data Engineer
•
Technical
•
hard
How does BigQuery handle large joins efficiently? What is its columnar storage approach?
#BigQuery
#Columnar Storage
Data Engineer
•
Technical
•
hard
Compare AWS Redshift, Google BigQuery, and Snowflake for a petabyte-scale warehouse.
#Redshift
#BigQuery
#Snowflake
Data Engineer
•
Technical
•
medium
Explain the concept of a data catalog. What tools have you used?
#Data Catalog
#Metadata
Data Engineer
•
Technical
•
medium
What is PII (Personally Identifiable Information) and how do you handle it in a data pipeline?
#PII
#Privacy
#Compliance
Data Engineer
•
Technical
•
hard
How would you detect and handle data drift in a production system?
#Data Drift
#Monitoring
Data Engineer
•
Technical
•
medium
What is data lineage and why is it important? How do you implement it?
#Lineage
#Metadata
Data Engineer
•
Technical
•
medium
How do you implement data quality checks in a production pipeline?
#Great Expectations
#Data Validation
Data Engineer
•
Technical
•
medium
What is a medallion architecture (Bronze/Silver/Gold)?
#Medallion
#Data Lake
Data Engineer
•
Technical
•
hard
How do you handle schema evolution in a data pipeline without breaking downstream consumers?
#Schema Evolution
#Backward Compatibility
Data Engineer
•
Technical
•
medium
Explain the concept of a data lakehouse. What are its advantages over a traditional data warehouse?
#Data Lakehouse
#Data Warehouse
Data Engineer
•
Technical
•
hard
What is Data Vault methodology? How does it differ from Kimball?
#Data Vault
#Kimball
Data Engineer
•
Technical
•
medium
What is the star schema vs snowflake schema? When would you use each?
#Star Schema
#Snowflake Schema
Data Engineer
•
Technical
•
hard
What is Delta Lake? How does it provide ACID transactions on data lakes?
#Delta Lake
#ACID
#Time Travel
Data Engineer
•
Technical
•
medium
What is a materialized view? How does it differ from a regular view?
#Materialized Views
#Performance
Data Engineer
•
Technical
•
hard
Describe partitioning strategies in a data warehouse. When would you use range vs hash partitioning?
#Partitioning
#Performance
Data Engineer
•
Technical
•
medium
What are CTEs (Common Table Expressions) and how do they differ from subqueries?
#CTEs
#SQL
Data Engineer
•
Technical
•
medium
Explain ACID properties. Which databases sacrifice ACID for performance and why?
#ACID
#Distributed Systems
Data Engineer
•
Technical
•
hard
How do you handle late-arriving data in a streaming pipeline?
#Kafka
#Watermarks
Data Engineer
•
Technical
•
medium
What is idempotency and why is it critical in data pipelines?
#Idempotency
#Data Quality
Data Engineer
•
Technical
•
hard
Explain the Lambda architecture. What are its tradeoffs vs Kappa architecture?
#Lambda
#Kappa
#Streaming
Data Engineer
•
Technical
•
hard
How would you optimize a slow-running Apache Spark job on AWS EMR that is suffering from severe data skew during a large join operation?
#Apache Spark
#Performance Tuning
#Data Skew
#AWS EMR
Data Engineer
•
Technical
•
medium
Explain the difference between distribution styles (KEY, ALL, EVEN) in Amazon Redshift. Given a massive 'orders' table and a small 'date' dimension table, which distribution styles would you choose and why?
#AWS Redshift
#Distributed Databases
#Query Optimization
Data Engineer
•
Technical
•
hard
What is a slowly changing dimension (SCD)? Describe SCD Type 1, 2, and 3 with examples.
#SCD
#Dimensional Modeling
Data Scientist
•
Behavioral
•
medium
Tell me about a data science project where the results surprised you. What did you do?
#Analytical Thinking
Data Scientist
•
Behavioral
•
hard
Describe a time you used data to challenge a widely held assumption in your organization.
#Influence
#Analytics
Data Scientist
•
Behavioral
•
medium
Describe how you communicated a complex model result to a non-technical stakeholder.
#Storytelling
Data Scientist
•
Behavioral
•
medium
Tell me about a time you strongly disagreed with a Product Manager about the direction of a machine learning project. How did you resolve it?
#Have Backbone
#Disagree and Commit
#Stakeholder Management
Data Scientist
•
Behavioral
•
hard
Tell me about a time you had to push back on a business request for an analysis that would be misleading.
#Ethics
#Communication
Data Scientist
•
Behavioral
•
medium
How do you approach ethical considerations in ML model building?
#Fairness
#Bias
Data Scientist
•
Behavioral
•
medium
Describe a situation where you found a critical flaw in your own model or analysis after it was deployed. What did you do?
#Dive Deep
#Ownership
#Insist on Highest Standards
Data Scientist
•
Behavioral
•
medium
Describe a project where you had to iterate significantly on your initial approach.
#Iteration
#Learning
Data Scientist
•
Behavioral
•
medium
Tell me about a time you used data to uncover a customer pain point that wasn't immediately obvious. How did you address it?
#Customer Obsession
#Dive Deep
#Data Storytelling
Data Scientist
•
Behavioral
•
medium
How do you prioritize between multiple data science requests from different teams?
#Stakeholder Management
Data Scientist
•
Behavioral
•
hard
Tell me about a time your model failed in production. What did you learn?
#Production
#MLOps
Data Scientist
•
Behavioral
•
medium
Tell me about a time you had to deliver a machine learning model under a very tight deadline. What trade-offs did you make?
#Deliver Results
#Bias for Action
#Trade-offs
Data Scientist
•
Coding
•
hard
Write a SQL query to calculate the month-over-month retention rate of Amazon Prime members.
#Self Joins
#Date Functions
#Cohort Analysis
Data Scientist
•
Coding
•
hard
Write a SQL query to calculate 30-day user retention.
#Retention
#Analytics
Data Scientist
•
Coding
•
medium
Write a SQL query to find the top 3 best-selling products in each category for the last 30 days, considering only orders that were successfully delivered.
#Window Functions
#Filtering
#Joins
Data Scientist
•
Coding
•
medium
Given a list of customer reviews (strings) and a list of banned keywords, write a Python function to return the top K most frequent valid words across all reviews.
#Hash Maps
#Heaps
#String Manipulation
#NLP
Data Scientist
•
Coding
•
medium
Write a query to identify duplicate records and deduplicate them.
#Deduplication
#Data Quality
Data Scientist
•
Coding
•
medium
Given a Pandas DataFrame containing user clickstream data (user_id, timestamp, page_url), write code to calculate the average session length. A new session starts if a user is inactive for more than 30 minutes.
#Pandas
#Time Series
#Sessionization
Data Scientist
•
Coding
•
hard
How would you write a funnel analysis query in SQL?
#Funnel
#Analytics
Data Scientist
•
System Design
•
hard
Design a feature store. What are its key components?
#Feature Store
#MLOps
Data Scientist
•
System Design
•
hard
How would you design a recommendation system for Amazon Prime Video to suggest movies to users who have just finished watching a series?
#Collaborative Filtering
#Cold Start
#Deep Learning
#Real-time Inference
Data Scientist
•
System Design
•
hard
How would you build a recommendation system? Compare collaborative vs content-based filtering.
#Collaborative Filtering
#Content-Based
Data Scientist
•
System Design
•
hard
Design a machine learning system to rank search results for Amazon.com. How do you balance relevance, profitability, and shipping speed?
#Learning to Rank
#Multi-objective Optimization
#Personalization
Data Scientist
•
System Design
•
hard
How would you build and deploy a churn prediction model?
#Churn
#MLOps
Data Scientist
•
System Design
•
hard
Design a real-time fraud detection system for a payments platform.
#Fraud Detection
#Real-Time ML
Data Scientist
•
Technical
•
hard
How do you design an A/B test for a new product feature?
#A/B Testing
#Statistics
Data Scientist
•
Technical
•
hard
We want to test a new pricing algorithm for Amazon third-party sellers. If we randomize at the seller level, how might network effects bias the A/B test results, and how would you design the experiment to mitigate this?
#A/B Testing
#Network Effects
#Switchback Testing
#Cluster Randomization
Data Scientist
•
Technical
•
easy
You are evaluating a binary classifier for detecting defective items in an Amazon fulfillment center. The defect rate is 0.1%. Why is accuracy a poor metric here, and what metrics would you use instead?
#Evaluation Metrics
#Precision
#Recall
#PR-AUC
Data Scientist
•
Technical
•
hard
How would you build a time-series forecasting model to predict the inventory demand for a highly seasonal product during Prime Day?
#Time Series
#Forecasting
#ARIMA
#XGBoost
Data Scientist
•
Technical
•
medium
Explain the difference between Random Forest and Gradient Boosting. In an Amazon fraud detection use case where data is highly imbalanced, which would you prefer and why?
#Ensemble Methods
#Imbalanced Data
#Fraud Detection
Data Scientist
•
Technical
•
hard
Amazon is testing a new 'Buy Now' button color. The A/B test shows a statistically significant increase in click-through rate, but overall revenue decreased. How do you investigate this?
#A/B Testing
#Cannibalization
#Metrics
#Causal Inference
Data Scientist
•
Technical
•
hard
How do you monitor model performance in production? What is model drift?
#Model Drift
#Monitoring
Data Scientist
•
Technical
•
easy
Explain the difference between INNER JOIN, LEFT JOIN, and CROSS JOIN.
#Joins
#SQL
Data Scientist
•
Technical
•
easy
What is an experiment holdout group?
#Holdout
#Control Group
Data Scientist
•
Technical
•
hard
How would you identify the root cause of a sudden 20% drop in DAU?
#Root Cause Analysis
#Debugging
Data Scientist
•
Technical
•
easy
Explain the difference between a leading indicator and a lagging indicator.
#Metrics
#KPIs
Data Scientist
•
Technical
•
medium
How do you choose a north star metric for a product?
#Metrics
#Product Strategy
Data Scientist
•
Technical
•
hard
What is a network effect in experimentation? How do you handle SUTVA violation?
#SUTVA
#Network Effects
Data Scientist
•
Technical
•
hard
How would you design an experiment to measure the impact of a new ranking algorithm?
#Experimentation
#Metrics
Data Scientist
•
Technical
•
medium
How would you detect and mitigate overfitting in a neural network?
#Overfitting
#Dropout
#Regularization
Data Scientist
•
Technical
•
medium
Explain batch normalization and why it helps training.
#Batch Normalization
#Training
Data Scientist
•
Technical
•
medium
What is embedding? How do word embeddings like Word2Vec and GloVe work?
#Embeddings
#Word2Vec
Data Scientist
•
Technical
•
medium
How would you approach an NLP problem like sentiment analysis from scratch?
#Sentiment Analysis
#Text Classification
Data Scientist
•
Technical
•
medium
What is transfer learning? How would you fine-tune a pre-trained model?
#Transfer Learning
#Fine-Tuning
Data Scientist
•
Technical
•
hard
Explain the transformer architecture. What are attention mechanisms?
#Transformers
#Attention
#BERT
Data Scientist
•
Technical
•
hard
What is the vanishing gradient problem? How do LSTM and ResNet address it?
#LSTM
#ResNet
#Gradients
Data Scientist
•
Technical
•
medium
Explain how backpropagation works.
#Backpropagation
#Neural Networks
Data Scientist
•
Technical
•
medium
What is principal component analysis (PCA)? What are its limitations?
#PCA
#SVD
Data Scientist
•
Technical
•
medium
Explain the difference between bagging and boosting.
#Bagging
#Boosting
Data Scientist
•
Technical
•
medium
How do you approach feature selection?
#Feature Selection
#LASSO
Data Scientist
•
Technical
•
medium
What is cross-validation? Explain k-fold and stratified k-fold.
#Cross Validation
#k-Fold
Data Scientist
•
Technical
•
medium
Explain the ROC curve and AUC metric. When would you prefer AUC over accuracy?
#ROC
#AUC
#Metrics
Data Scientist
•
Technical
•
medium
How do you handle class imbalance in a classification problem?
#Imbalanced Data
#SMOTE
Data Scientist
•
Technical
•
medium
What is regularization? Explain L1 vs L2 regularization and their effects.
#Regularization
#L1
#L2
Data Scientist
•
Technical
•
medium
How does a Random Forest work? What are its hyperparameters and how do you tune them?
#Random Forest
#Hyperparameter Tuning
Data Scientist
•
Technical
•
hard
Explain gradient boosting. How does XGBoost differ from a standard gradient boosting machine?
#Gradient Boosting
#XGBoost
Data Scientist
•
Technical
•
medium
How would you detect and handle multicollinearity in a regression model?
#Multicollinearity
#Regression
Data Scientist
•
Technical
•
hard
Explain the curse of dimensionality and its implications for ML models.
#Dimensionality
#Feature Engineering
Data Scientist
•
Technical
•
medium
What is a confidence interval? How does it differ from a prediction interval?
#Confidence Interval
#Intervals
Data Scientist
•
Technical
•
hard
Explain Bayesian vs Frequentist statistics. When would you use each?
#Bayesian
#Frequentist
Data Scientist
•
Technical
•
hard
What is the multiple testing problem? How do you correct for it?
#Bonferroni
#FDR
Data Scientist
•
Technical
•
easy
What is the difference between Type I and Type II errors?
#Hypothesis Testing
#Errors
Data Scientist
•
Technical
•
medium
Explain the central limit theorem and its importance in data science.
#CLT
#Sampling
Data Scientist
•
Technical
•
medium
What is a p-value? Why is a p-value of 0.05 not always sufficient?
#Hypothesis Testing
#p-value
Data Scientist
•
Technical
•
medium
Explain the bias-variance tradeoff. How does it influence model selection?
#Bias-Variance
#Model Selection
Machine Learning Engineer
•
Behavioral
•
medium
Tell me about a time you had to push back on a product manager's request because the proposed ML solution would negatively impact the customer experience.
#Customer Obsession
#Have Backbone; Disagree and Commit
#Stakeholder Management
Machine Learning Engineer
•
Behavioral
•
medium
Describe a situation where you simplified a complex machine learning pipeline or model architecture without sacrificing significant performance.
#Invent and Simplify
#Model Compression
#Engineering Excellence
Machine Learning Engineer
•
Behavioral
•
medium
Tell me about a time when a machine learning model you deployed degraded in production. How did you detect it, root-cause the issue, and resolve it?
#Deliver Results
#Dive Deep
#MLOps
#Model Monitoring
Machine Learning Engineer
•
Coding
•
medium
Given an array of Amazon product prices and a target gift card balance, find all unique combinations of products that sum exactly to the target balance. Each product can be used an unlimited number of times.
#Backtracking
#Array
#Combinatorics
Machine Learning Engineer
•
Coding
•
medium
Design a data structure for an Amazon shopping cart analytics tool that supports adding a product ID, removing a product ID, and getting a random product ID in O(1) time.
#Hash Map
#Array
#System Design
Machine Learning Engineer
•
Coding
•
easy
Given the root of a binary tree representing an Amazon product category hierarchy, return the level order traversal of its nodes' values.
#Trees
#BFS
#Queue
Machine Learning Engineer
•
Coding
•
medium
Given a string of customer review text, find the length of the longest substring without repeating characters.
#Sliding Window
#Hash Set
#Strings
Machine Learning Engineer
•
System Design
•
hard
Design a machine learning system to predict the estimated time of arrival (ETA) for Amazon Prime packages. What features would you use and how would you handle weather anomalies?
#Regression
#Geospatial Data
#Time Series
#Feature Engineering
Machine Learning Engineer
•
System Design
•
hard
Design a machine learning system to detect fraudulent reviews on Amazon in real-time. How do you handle cold-start problems for newly created accounts?
#Fraud Detection
#Classification
#Graph Neural Networks
#Cold Start
Machine Learning Engineer
•
System Design
•
hard
Design a real-time product recommendation system for the Amazon homepage that personalizes items based on a user's browsing history from the last 10 minutes.
#Recommender Systems
#Real-time Inference
#Streaming Data
#Collaborative Filtering
Machine Learning Engineer
•
System Design
•
hard
Design the ML architecture for Amazon's autocomplete/typeahead search feature. How do you balance personalization with global popularity while maintaining sub-50ms latency?
#Search Ranking
#NLP
#Tries
#Personalization
Machine Learning Engineer
•
Technical
•
medium
You are building a click-through rate (CTR) prediction model for Amazon Sponsored Ads. The baseline click rate is less than 0.5%. What loss function and evaluation metrics would you use, and why?
#CTR Prediction
#Imbalanced Data
#Evaluation Metrics
#Loss Functions
Machine Learning Engineer
•
Technical
•
medium
Describe how you would set up an automated retraining pipeline for an inventory forecasting model using AWS SageMaker, Step Functions, and EventBridge.
#AWS SageMaker
#CI/CD
#Model Retraining
#Cloud Architecture
Machine Learning Engineer
•
Technical
•
medium
How does the self-attention mechanism work in Transformer models? Explain the mathematical operations involving Query, Key, and Value matrices.
#Deep Learning
#Transformers
#Attention Mechanism
#Mathematics
Machine Learning Engineer
•
Technical
•
hard
Explain how you would fine-tune a Large Language Model for Amazon's customer service chatbot (Rufus) using LoRA. What are the mathematical and computational trade-offs compared to full fine-tuning?
#LLMs
#PEFT
#LoRA
#NLP
ML Engineer
•
Behavioral
•
hard
Describe a time you had to re-architecture a system because the original ML approach didn't scale.
#Scalability
ML Engineer
•
Behavioral
•
hard
Tell me about a time an ML model caused an unexpected real-world impact.
#Responsibility
#AI Safety
ML Engineer
•
Behavioral
•
medium
Describe how you collaborated with data scientists to productionize their research code.
#Research to Production
ML Engineer
•
Behavioral
•
hard
Tell me about a time you had to optimize a model for latency without sacrificing too much accuracy.
#Latency
#Accuracy
ML Engineer
•
Behavioral
•
medium
Describe a model you deployed to production. What were the biggest challenges?
#Deployment
#Challenges
ML Engineer
•
Behavioral
•
medium
Tell me about a time you demonstrated customer obsession in an ML project. (LP)
#Customer Obsession
ML Engineer
•
Behavioral
•
medium
How do you decide when a model is 'good enough' to ship?
#Quality
#Judgment
ML Engineer
•
Behavioral
•
medium
Tell me about a disagreement you had with a researcher. How did you resolve it?
#Communication
ML Engineer
•
Behavioral
•
easy
How do you keep up with the rapidly evolving ML landscape?
#Continuous Learning
ML Engineer
•
Coding
•
hard
How would you write a batched inference pipeline using Python and Triton server?
#Triton
#Batching
ML Engineer
•
Coding
•
medium
Implement a sliding window approach to detect anomalies in a time series.
#Anomaly Detection
#Time Series
ML Engineer
•
Coding
•
hard
Write a custom PyTorch Dataset and DataLoader for irregular time series data.
#PyTorch
#DataLoader
ML Engineer
•
Coding
•
hard
Implement logistic regression with gradient descent in NumPy.
#Logistic Regression
#NumPy
ML Engineer
•
Coding
•
hard
Implement a K-means clustering algorithm from scratch in Python.
#K-Means
#Clustering
ML Engineer
•
System Design
•
hard
Design a system to retrain models automatically when performance degrades.
#Retraining
#Automation
ML Engineer
•
System Design
•
hard
Design a search ranking system for an e-commerce platform.
#Ranking
#Relevance
ML Engineer
•
System Design
•
hard
Design a real-time content moderation system.
#NLP
#Real-Time
ML Engineer
•
System Design
•
hard
How would you build a personalized ad targeting system?
#Targeting
#ML Systems
ML Engineer
•
System Design
•
hard
Design a CI/CD pipeline for ML models.
#CI/CD
#Deployment
ML Engineer
•
System Design
•
hard
Design YouTube's video recommendation system end to end.
#Recommendations
#Ranking
ML Engineer
•
System Design
•
hard
What is a feature store? Design one from scratch.
#Feature Engineering
#MLOps
ML Engineer
•
System Design
•
hard
How would you serve a model that needs to respond in under 10ms?
#Low Latency
#Serving
ML Engineer
•
System Design
•
hard
Design a training and serving architecture for a large language model at scale.
#Infrastructure
#Scale
ML Engineer
•
Technical
•
hard
Explain knowledge distillation. When would you use it?
#Distillation
#Compression
ML Engineer
•
Technical
•
hard
What is the difference between model parallelism and data parallelism in distributed training?
#Parallelism
#Training
ML Engineer
•
Technical
•
medium
How do you version ML models and datasets? What tools do you use?
#Versioning
#DVC
#MLflow
ML Engineer
•
Technical
•
hard
Explain blue-green deployment vs canary deployment for ML models.
#Blue-Green
#Canary
ML Engineer
•
Technical
•
hard
How do you detect data drift vs model drift? How do you respond to each?
#Drift
#Production
ML Engineer
•
Technical
•
medium
What is shadow mode deployment in ML?
#Shadow Mode
#A/B Testing
ML Engineer
•
Technical
•
medium
Explain model serialization formats: ONNX, TorchScript, SavedModel.
#ONNX
#Serialization
ML Engineer
•
Technical
•
medium
What is Kubernetes? How is it used for ML model serving?
#Kubernetes
#Serving
ML Engineer
•
Technical
•
hard
How do you optimize GPU utilization during training?
#GPU
#Performance
ML Engineer
•
Technical
•
hard
Explain mixed precision training (FP16/BF16). What are the risks?
#Mixed Precision
#Performance
ML Engineer
•
Technical
•
medium
What are the differences between PyTorch and TensorFlow for production?
#PyTorch
#TensorFlow
ML Engineer
•
Technical
•
medium
How do you profile and debug a slow training run?
#Profiling
#Debugging
ML Engineer
•
Technical
•
hard
Explain the RLHF (Reinforcement Learning from Human Feedback) training approach.
#RLHF
#Fine-Tuning
ML Engineer
•
Technical
•
hard
What is LoRA (Low-Rank Adaptation)? How does it reduce fine-tuning costs?
#LoRA
#Fine-Tuning
ML Engineer
•
Technical
•
hard
What is RAG (Retrieval-Augmented Generation)? Describe its architecture.
#RAG
#Vector Search
ML Engineer
•
Technical
•
hard
How would you evaluate an LLM for a production use case?
#Evaluation
#Benchmarking
ML Engineer
•
Technical
•
medium
Explain vector databases. What are FAISS, Pinecone, and Weaviate?
#Vector DB
#Embeddings
ML Engineer
•
Technical
•
medium
What is model ensembling? When does it help, and when does it hurt?
#Ensembling
#Performance
ML Engineer
•
Technical
•
hard
How would you deploy a fraud detection model on AWS Lambda?
#Lambda
#Fraud
ML Engineer
•
Technical
•
hard
Explain how Amazon Personalize works internally.
#Personalize
#AWS
ML Engineer
•
Technical
•
hard
How would you use SageMaker for end-to-end MLOps?
#SageMaker
#AWS
ML Engineer
•
Technical
•
easy
What is the difference between a data scientist and an ML engineer?
#Roles
#MLOps
ML Engineer
•
Technical
•
medium
Explain the model training pipeline from raw data to deployment.
#Pipeline
#Training
ML Engineer
•
Technical
•
medium
What is the difference between online learning and offline learning?
#Online Learning
#Batch Learning
ML Engineer
•
Technical
•
medium
How do you handle missing data in ML model features?
#Imputation
#Missing Data
ML Engineer
•
Technical
•
medium
Explain gradient descent variants: batch, stochastic, and mini-batch.
#Gradient Descent
#Optimization
ML Engineer
•
Technical
•
medium
What are learning rate schedulers and why are they important?
#Learning Rate
#Training
ML Engineer
•
Technical
•
hard
Explain the attention mechanism in transformers with mathematical detail.
#Attention
#Transformers
ML Engineer
•
Technical
•
hard
What is quantization in neural networks? How does it reduce inference cost?
#Quantization
#Inference
Product Manager
•
Behavioral
•
medium
Describe a time when you had to make a critical product decision with incomplete data. How did you mitigate the risk?
#Bias for Action
#Risk Management
#Two-Way Doors
Product Manager
•
Behavioral
•
medium
Tell me about a time you used customer feedback to drive a significant pivot in your product roadmap. How did you balance this with existing business goals?
#Customer Obsession
#Roadmap Prioritization
#Stakeholder Management
Product Manager
•
Behavioral
•
medium
Tell me about a time you stepped outside your core PM responsibilities to ensure a product launch was successful.
#Ownership
#Cross-functional Collaboration
#Delivery
Product Manager
•
Behavioral
•
medium
Describe a situation where a key product metric dropped unexpectedly. How did you investigate the root cause, and what was the outcome?
#Dive Deep
#Data Analysis
#Incident Response
Product Manager
•
Behavioral
•
medium
Tell me about a time you had to deliver a critical product feature with a severely constrained timeline and limited engineering resources.
#Deliver Results
#Prioritization
#MVP
Product Manager
•
Behavioral
•
hard
Tell me about a time you strongly disagreed with your manager or a senior stakeholder about a product direction. How did you handle it?
#Have Backbone; Disagree and Commit
#Conflict Resolution
#Data-driven Influence
Product Manager
•
Behavioral
•
hard
Tell me about a time you proposed a bold, radical idea that completely changed the trajectory of your product.
#Think Big
#Innovation
#Risk Taking
Product Manager
•
Behavioral
•
hard
Tell me about a time you inherited a product team that was underperforming or had low morale. How did you build trust and turn the team around?
#Earn Trust
#Team Leadership
#Empathy
Product Manager
•
System Design
•
hard
Design a system for Amazon Lockers that handles package routing, locker availability, and user notifications during peak holiday seasons.
#Scalability
#Logistics
#Microservices
#Edge Cases
Product Manager
•
System Design
•
hard
Design the backend architecture for the 'Customers who bought this item also bought' recommendation widget on the Amazon product detail page.
#Recommendation Engines
#Data Pipelines
#Low Latency
Product Manager
•
Technical
•
medium
Amazon Fresh is seeing a 15% decline in repeat orders month-over-month. Walk me through how you would analyze this and what metrics you would look at.
#Retention
#Root Cause Analysis
#E-commerce Grocery
Product Manager
•
Technical
•
hard
If you were the PM for Amazon Prime Video, how would you increase engagement among users who only log in to watch live sports?
#User Engagement
#Market Expansion
#Prime Video
Product Manager
•
Technical
•
medium
Estimate the total addressable market (TAM) for Amazon introducing a smart, Alexa-enabled microwave in Europe.
#TAM/SAM/SOM
#Fermi Problem
#Smart Home IoT
Product Manager
•
Technical
•
medium
Design a new feature for the Amazon Shopping app specifically tailored for elderly users who are not tech-savvy.
#Accessibility
#UX Design
#Customer Segmentation
Product Manager
•
Technical
•
hard
How would you design the pricing model and feature tiers for a new AWS serverless database product targeting early-stage startups?
#Pricing Strategy
#AWS
#B2B SaaS
#Go-to-Market
Software Engineer
•
Behavioral
•
medium
Tell me about a complex process or system you inherited or worked on that you felt was overly complicated. What steps did you take to simplify it?
#Invent and Simplify
#Refactoring
#Process Improvement
Software Engineer
•
Behavioral
•
medium
Tell me about a time when you had to make a technical decision that prioritized the customer experience over engineering convenience or project timelines.
#Customer Obsession
#Trade-offs
#Decision Making
Software Engineer
•
Behavioral
•
medium
Describe a situation where you had a tight deadline to deliver a feature for a major launch, but you realized the initial architecture was flawed. How did you handle it?
#Deliver Results
#Bias for Action
#Agile Problem Solving
Software Engineer
•
Behavioral
•
hard
Tell me about a time you strongly disagreed with your manager or a senior engineer about a system design choice. How did you push back, and what was the outcome?
#Have Backbone; Disagree and Commit
#Conflict Resolution
#Communication
Software Engineer
•
Behavioral
•
medium
Give me an example of a time when a system was failing intermittently, and the root cause was not obvious. How did you go about diagnosing and fixing the issue?
#Dive Deep
#Debugging
#Root Cause Analysis
Software Engineer
•
Coding
•
medium
Amazon is deploying delivery robots in a grid-like warehouse. You are given a 2D grid representing the warehouse, where '0' is an empty space, '1' is an obstacle, and '2' is a charging station. Find the shortest distance from every empty space to the nearest charging station.
#Breadth-First Search
#Matrix
#Shortest Path
Software Engineer
•
Coding
•
hard
AWS has several data centers. You are given an array of integers representing the processing power of servers. You want to partition the servers into two clusters such that the absolute difference between the total processing power of the two clusters is minimized. Return the minimum difference.
#Dynamic Programming
#Knapsack
#Optimization
Software Engineer
•
Coding
•
medium
You are given a list of Amazon fulfillment centers and the roads connecting them. Find the minimum cost to connect all fulfillment centers such that there is a path between any two centers. If it's not possible, return -1.
#Graphs
#Minimum Spanning Tree
#Union Find
Software Engineer
•
Coding
•
hard
Given a string representing a sequence of customer page views on Amazon.com, find the length of the longest contiguous sequence of page views that contains at most k distinct product categories.
#Sliding Window
#Hash Map
#Strings
Software Engineer
•
Coding
•
medium
Amazon Logistics needs to optimize delivery routes. Given an array of points representing delivery locations on a 2D plane and an integer k, return the k closest delivery locations to the origin (0, 0).
#Heap
#Priority Queue
#Sorting
#Geometry
Software Engineer
•
System Design
•
hard
Design the Amazon shopping cart service. It needs to be highly available, handle millions of concurrent users during Prime Day, and ensure that items added to the cart are never lost, even in the event of database node failures.
#High Availability
#DynamoDB
#Eventual Consistency
#Conflict Resolution
Software Engineer
•
System Design
•
hard
Design a flash sale system for Amazon Prime Day where a highly sought-after item has only 100 units in stock, but millions of users are trying to buy it simultaneously. How do you prevent overselling?
#Distributed Locking
#Redis
#Concurrency
#Database Transactions
Software Engineer
•
System Design
•
hard
Design the backend for Amazon Prime Video's 'Continue Watching' feature, ensuring cross-device synchronization with sub-second latency.
#Microservices
#WebSockets
#Data Synchronization
#Cassandra/DynamoDB
Software Engineer
•
System Design
•
medium
Design a system to track the top 10 best-selling products on Amazon in real-time across different categories.
#Stream Processing
#Top K Problem
#Count-Min Sketch
#Kafka/Kinesis
Software Engineer
•
Technical
•
medium
Explain how you would implement a thread-safe bounded blocking queue for an order processing pipeline. What synchronization primitives would you use and why?
#Multithreading
#Mutex
#Condition Variables
#Producer-Consumer
Difficulty Radar
Based on recent AI-sourced data.
Meet Your Interviewers
The "Standard" Interviewer
Senior EngineerFocuses on core competencies, system constraints, and clear communication.
SimulateUnwritten Rules
Think Out Loud
Always explain your thought process before writing code or drawing architecture.