Meta

Social media and metaverse company behind Facebook, Instagram, and WhatsApp.

4 Rounds | ~21 Days | Very Hard
The Interview Loop

Recruiter Screen (30 min)

Standard fit check, behavioral questions, and resume overview.

Technical Loop (3-4 Rounds)

Deep dive into domain knowledge, coding, and system design.

Interview Question Bank

Machine Learning Engineer Behavioral medium

Tell me about a time you had a fundamental disagreement with a cross-functional partner, such as a Product Manager, regarding the choice of an ML metric versus a business metric.

#Conflict Resolution #Cross-functional Collaboration #Business Acumen
Machine Learning Engineer Behavioral medium

Describe a situation where a machine learning model you deployed degraded in production. How did you detect the degradation, and what steps did you take to resolve it?

#Model Monitoring #Incident Response #Ownership
Machine Learning Engineer Behavioral medium

Tell me about a time you had to make a difficult trade-off between model accuracy and inference latency. How did you approach the decision?

#Trade-offs #Optimization #System Constraints
Machine Learning Engineer Behavioral medium

Give an example of a project where you had to pivot your technical strategy halfway through due to changing business requirements or unexpected technical roadblocks.

#Agility #Problem Solving #Resilience
Machine Learning Engineer Coding medium

Given two sparse vectors represented as arrays of non-zero elements and their indices, write a function to compute their dot product. Optimize for both time and space complexity.

#Arrays #Hash Table #Two Pointers
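
One possible approach to the sparse dot product question, sketched under the assumption that each vector is given as a list of (index, value) pairs already sorted by index. The names `SparseVector` and `sparse_dot` are illustrative and not part of the question. The two-pointer walk runs in O(n + m) time with O(1) extra space.

```python
from typing import List, Tuple

# Assumed representation: (index, value) pairs sorted by ascending index.
SparseVector = List[Tuple[int, float]]


def sparse_dot(a: SparseVector, b: SparseVector) -> float:
    """Dot product of two sparse vectors using a two-pointer merge."""
    i = j = 0
    total = 0.0
    while i < len(a) and j < len(b):
        index_a, value_a = a[i]
        index_b, value_b = b[j]
        if index_a == index_b:
            # Only indices present in both vectors contribute.
            total += value_a * value_b
            i += 1
            j += 1
        elif index_a < index_b:
            i += 1
        else:
            j += 1
    return total
```

If one vector is far shorter than the other, a reasonable follow-up is to binary-search the longer vector for each index of the shorter one, or to store one vector as a hash map when the inputs are not sorted.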
Machine Learning Engineer Coding medium

Write a function to sample a batch of data from a large dataset on disk without loading the entire dataset into memory. Implement a custom PyTorch-like DataLoader class with __iter__ and __next__ methods that supports shuffling and batching.

#Object-Oriented Design #Data Structures #Generators
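
A minimal sketch of one way to answer the data loading question, assuming the dataset is a text file with one record per line. The class name `LineBatchLoader` and its parameters are hypothetical. It keeps only the byte offset of each line in memory, shuffles those offsets, and seeks to read the records of each batch on demand.

```python
import random
from typing import List, Optional


class LineBatchLoader:
    """Iterates over a line-oriented file in batches without loading it whole."""

    def __init__(self, path: str, batch_size: int, shuffle: bool = True,
                 seed: Optional[int] = None) -> None:
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self.path = path
        self.batch_size = batch_size
        self.shuffle = shuffle
        self._rng = random.Random(seed)
        # One pass over the file to record where each line starts.
        self._offsets: List[int] = []
        with open(path, "rb") as f:
            while True:
                position = f.tell()
                if not f.readline():
                    break
                self._offsets.append(position)
        self._order: List[int] = []
        self._cursor = 0

    def __iter__(self) -> "LineBatchLoader":
        # Each new pass (epoch) gets a fresh ordering of the offsets.
        self._order = list(self._offsets)
        if self.shuffle:
            self._rng.shuffle(self._order)
        self._cursor = 0
        return self

    def __next__(self) -> List[str]:
        if self._cursor >= len(self._order):
            raise StopIteration
        chunk = self._order[self._cursor:self._cursor + self.batch_size]
        self._cursor += len(chunk)
        batch = []
        with open(self.path, "rb") as f:
            for offset in chunk:
                f.seek(offset)
                batch.append(f.readline().decode("utf-8").rstrip("\r\n"))
        return batch
```

The offset index still grows linearly with the number of records (one integer each). For datasets where even that is too large, shuffling within fixed-size shards or using reservoir sampling are common alternatives worth mentioning in an interview.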
Machine Learning Engineer Coding medium

Given a binary tree, write an algorithm to find the lowest common ancestor (LCA) of two given nodes. Assume each node has a pointer to its parent.

#Trees #Pointers #Hash Table
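
Because each node has a parent pointer, the LCA question reduces to finding the intersection of two linked lists that run from each node up to the root. The sketch below assumes a hypothetical `Node` class with a `parent` attribute and uses the two-pointer switching trick, which needs O(h) time and O(1) extra space, where h is the tree height.

```python
from typing import Optional


class Node:
    """Hypothetical tree node with a parent pointer."""

    def __init__(self, val: int, parent: Optional["Node"] = None) -> None:
        self.val = val
        self.parent = parent


def lowest_common_ancestor(p: Node, q: Node) -> Optional[Node]:
    """Return the lowest common ancestor, or None if p and q share no root."""
    a: Optional[Node] = p
    b: Optional[Node] = q
    # Each pointer climbs its own path, then restarts on the other node's
    # path. Both cover the same total distance, so they align on the
    # first shared ancestor (or both end at None if there is none).
    while a is not b:
        a = a.parent if a is not None else q
        b = b.parent if b is not None else p
    return a
```

An alternative that matches the hash table tag is to store all ancestors of one node in a set and walk up from the other node until a member is found, trading O(h) extra space for simpler reasoning.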
Machine Learning Engineer Coding medium

Implement a function to calculate the Intersection over Union (IoU) of two bounding boxes. Extend this to implement Non-Maximum Suppression (NMS) for a list of bounding boxes and their confidence scores.

#Computer Vision #Geometry #Sorting
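
A sketch of IoU and greedy NMS in plain Python, assuming boxes are given as (x1, y1, x2, y2) corner coordinates with x1 <= x2 and y1 <= y2. The function names and the default threshold of 0.5 are illustrative choices, not part of the question.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that was already kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

This version is O(n^2) in the worst case after the O(n log n) sort. In a multi-class detector, NMS is usually applied per class, which is a natural extension to discuss.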
Machine Learning Engineer System Design hard

Design the machine learning architecture for Instagram Reels recommendations. How would you structure the funnel from candidate generation to final ranking?

#Recommendation Systems #Two-Tower Models #Ranking #Candidate Generation
Machine Learning Engineer System Design hard

Design an Ads Click-Through Rate (CTR) prediction system for Meta's news feed. How do you handle the extreme class imbalance and delayed feedback in ad clicks?

#Ads Ranking #Imbalanced Data #Streaming Pipelines #DLRM
Machine Learning Engineer System Design hard

Design a multimodal content moderation system to detect hate speech in Facebook posts containing both text and images. How do you fuse the modalities?

#Multimodal ML #Classification #NLP #Computer Vision
Machine Learning Engineer System Design hard

Design the 'People You May Know' (PYMK) feature. How would you scale the graph traversals and ML inference to billions of users?

#Graph Neural Networks #Link Prediction #Batch Processing #Scalability
Machine Learning Engineer Technical medium

In a deep learning recommendation model (DLRM), how do you handle the explosion of vocabulary size for categorical features like user IDs or item IDs?

#Embeddings #Hashing #Memory Optimization
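
One common answer to the vocabulary question is the hashing trick: map each raw ID to one of a fixed number of buckets and share an embedding row per bucket. The sketch below is a pure-Python illustration of that idea, not how any particular DLRM implementation does it. `HashedEmbedding`, its parameters, and the use of MD5 as a stable hash are assumptions made for the example.

```python
import hashlib
import random
from typing import List


class HashedEmbedding:
    """Fixed-size embedding table indexed by a stable hash of the raw ID."""

    def __init__(self, num_buckets: int, dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        # Memory is num_buckets * dim regardless of how many IDs exist.
        self.table: List[List[float]] = [
            [rng.gauss(0.0, 0.01) for _ in range(dim)]
            for _ in range(num_buckets)
        ]

    def bucket(self, raw_id: str) -> int:
        # A stable hash is used because Python's built-in hash() for
        # strings is randomized per process.
        digest = hashlib.md5(raw_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_buckets

    def lookup(self, raw_id: str) -> List[float]:
        return self.table[self.bucket(raw_id)]
```

The cost of this design is collisions: unrelated IDs can share a row. Typical mitigations to discuss include combining several hash functions, reserving dedicated rows for the most frequent IDs, and reducing the embedding dimension for rare features.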
Machine Learning Engineer Technical medium

Explain the difference between Contrastive Loss and Triplet Loss. In what scenarios would you choose one over the other for training a retrieval model?

#Loss Functions #Metric Learning #Retrieval
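
For reference, the two losses can be written side by side. This is a minimal sketch using Euclidean distance on plain Python lists, with the common margin-based forms: contrastive loss scores one labeled pair at a time, while triplet loss scores an (anchor, positive, negative) triple by relative distance. Function names and default margins are illustrative.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def contrastive_loss(u: Sequence[float], v: Sequence[float],
                     is_similar: bool, margin: float = 1.0) -> float:
    """Pairwise loss: pull similar pairs together, push dissimilar
    pairs apart until they are at least `margin` away."""
    d = euclidean(u, v)
    if is_similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                 negative: Sequence[float], margin: float = 1.0) -> float:
    """Relative loss: the positive must be closer to the anchor than
    the negative by at least `margin`."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

The practical difference follows from the inputs: contrastive loss needs absolute similar/dissimilar labels per pair, whereas triplet loss only needs relative preferences, which makes negative sampling and hard-negative mining the main design question for a retrieval model.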
Machine Learning Engineer Technical hard

You are training a large PyTorch model across multiple GPUs using DistributedDataParallel (DDP) and notice that the GPU utilization is consistently low (around 30%). How do you diagnose and fix this?

#PyTorch #Distributed Training #Performance Profiling

Meet Your Interviewers

The "Standard" Interviewer

Senior Engineer

Focuses on core competencies, system constraints, and clear communication.

Unwritten Rules

Think Out Loud

Always explain your thought process before writing code or drawing architecture.
