Twitter / X

Real-time social platform with petabyte-scale data and ML ranking systems.

Rounds: 4 | Timeline: ~14 days | Difficulty: Hard

The Interview Loop

Recruiter Screen (30 min)

Standard fit check, behavioral questions, and resume overview.

Technical Loop (3-4 Rounds)

Deep dive into domain knowledge, coding, and system design.

Interview Question Bank

Roles (35 questions each): Backend Engineer, Cloud Engineer, Data Engineer, Data Scientist, DevOps Engineer, Frontend Engineer, Full Stack Engineer, Machine Learning Engineer, Product Manager, Software Engineer

Topics: Algorithms (9), SQL (7), System Design (7), Culture Fit (5), Big Data Frameworks (5), Data Modeling (1), Data Validation (1)

Data Engineer • Behavioral • medium

Tell me about a time you had to ship a critical data pipeline under an extremely tight, almost impossible deadline.

#Time Management #Prioritization #Hardcore Work Ethic

Data Engineer • Behavioral • medium

Describe a situation where you identified a massive cost inefficiency in cloud infrastructure and the steps you took to fix it.

#Cost Optimization #Proactivity #Cloud Architecture

Data Engineer • Behavioral • medium

How do you handle working in an environment with high ambiguity, minimal documentation, and rapidly changing product requirements?

#Adaptability #Ambiguity #Communication

Data Engineer • Behavioral • medium

Tell me about a time you disagreed with a senior engineer about a technical architecture decision. How was it resolved?

#Conflict Resolution #Technical Communication #Ego

Data Engineer • Behavioral • medium

Walk me through a time when a data pipeline you owned failed in production, causing downstream impact. How did you debug and resolve it?

#Incident Management #Debugging #Accountability

Data Engineer • Coding • medium

Parse a log file of Twitter events to find the top 10 most active users in a given 1-hour window.

#Log Parsing #Hash Maps #Sorting #Priority Queue
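
One possible approach to the question above, as a minimal sketch: count events per user with a hash map, then select the top entries with a heap. The log format (tab-separated ISO timestamp, user id, event type) and the function name top_active_users are assumptions made for illustration; the real log layout is not specified here.

```python
import heapq
from collections import Counter
from datetime import datetime, timedelta


def top_active_users(lines, window_start, k=10):
    """Return up to k (user_id, event_count) pairs for the hour starting at window_start.

    Assumed (hypothetical) line format: "<ISO timestamp>\t<user_id>\t<event_type>".
    """
    window_end = window_start + timedelta(hours=1)
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # skip malformed lines
        try:
            ts = datetime.fromisoformat(parts[0])
        except ValueError:
            continue  # skip lines with an unparseable timestamp
        if window_start <= ts < window_end:
            counts[parts[1]] += 1
    # Highest count first; ties broken by user id so the result is deterministic.
    return heapq.nsmallest(k, counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

The heap selection costs O(n log k) over n distinct users, which is cheaper than a full sort when k is small.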

Data Engineer • Coding • medium

Implement a sliding window algorithm to count the number of tweets containing a specific hashtag in the last 5 minutes.

#Sliding Window #Queues #Real-time Processing
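
A queue-based sketch of one way to answer this, assuming timestamps are plain seconds that arrive in non-decreasing order and that hashtags are whitespace-separated tokens (no punctuation handling). The class name HashtagWindowCounter is hypothetical.

```python
from collections import deque


class HashtagWindowCounter:
    """Counts tweets containing one hashtag within the trailing window."""

    def __init__(self, hashtag, window_seconds=300):
        self.hashtag = hashtag.lower()
        self.window_seconds = window_seconds
        self._timestamps = deque()  # timestamps of matching tweets, oldest first

    def _evict(self, now):
        # Drop entries that are window_seconds old or older.
        while self._timestamps and self._timestamps[0] <= now - self.window_seconds:
            self._timestamps.popleft()

    def record(self, text, timestamp):
        """Register a tweet; assumes timestamps never go backwards."""
        if self.hashtag in text.lower().split():
            self._timestamps.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        """Number of matching tweets in the window ending at now."""
        self._evict(now)
        return len(self._timestamps)
```

Each timestamp is appended and removed at most once, so the amortized cost per tweet is O(1). Memory grows with the number of matching tweets in the window; bucketing counts per second would bound it if that matters.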

Data Engineer • Coding • hard

Given K sorted streams of tweet IDs (in chronological order), merge them into a single sorted stream.

#Heaps #Pointers #Stream Processing
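
A minimal sketch of the heap-based k-way merge, assuming each stream is a Python iterable of integer tweet IDs already in ascending order. The function name merge_sorted_streams is illustrative; Python's standard library offers heapq.merge for the same job.

```python
import heapq


def merge_sorted_streams(streams):
    """Lazily yield tweet IDs from k ascending streams in ascending order."""
    iterators = [iter(s) for s in streams]
    heap = []
    for idx, it in enumerate(iterators):
        first = next(it, None)  # None marks an exhausted stream (IDs are ints)
        if first is not None:
            heap.append((first, idx))
    heapq.heapify(heap)
    while heap:
        tweet_id, idx = heapq.heappop(heap)
        yield tweet_id
        nxt = next(iterators[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt, idx))
```

The heap holds at most one entry per stream, so memory is O(k) and each emitted ID costs O(log k).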

Data Engineer • Coding • hard

Find the shortest path between two users in the Twitter follower graph.

#Graphs #BFS #Bidirectional Search
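
As a starting point, here is a sketch of plain single-direction BFS, assuming the follower graph fits in memory as a dict mapping each user to the users they follow. This is the simpler baseline, not the bidirectional variant named in the tags; the function name shortest_follow_path is hypothetical.

```python
from collections import deque


def shortest_follow_path(graph, source, target):
    """Return the shortest list of users from source to target, or None.

    graph: dict mapping user -> iterable of users they follow (assumed layout).
    """
    if source == target:
        return [source]
    parents = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        user = queue.popleft()
        for followee in graph.get(user, ()):
            if followee in parents:
                continue
            parents[followee] = user
            if followee == target:
                # Walk parent links back to the source, then reverse.
                path = [target]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(followee)
    return None
```

Bidirectional search runs the same loop from both ends and stops when the frontiers meet, which typically explores far fewer nodes on a high-fanout graph. Searching backwards from the target requires a reverse (followers) index.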

Data Engineer • Coding • medium

Implement a rate limiter for the Twitter API using a token bucket algorithm.

#Concurrency #System Design #Object-Oriented Design
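
A single-process sketch of a token bucket, assuming one bucket per client and a lock for thread safety. The class name, parameters, and injectable clock are illustrative choices, not a description of how the real API enforces limits.

```python
import threading
import time


class TokenBucket:
    """Allows bursts up to capacity, refilled at refill_rate tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock  # injectable so tests can control time
        self._tokens = float(capacity)
        self._last = clock()
        self._lock = threading.Lock()

    def allow(self, cost=1):
        """Consume cost tokens if available; return whether the request may proceed."""
        with self._lock:
            now = self._clock()
            elapsed = now - self._last
            self._last = now
            # Refill lazily based on elapsed time, capped at capacity.
            self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False
```

Refilling lazily on each call avoids a background timer thread. A distributed deployment would need shared state (for example an atomic update in a shared store), which this sketch does not cover.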

Data Engineer • Coding • easy

Given a list of trending search terms, group the anagrams together.

#Strings #Hash Maps
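
One common way to solve this is to key each term by its sorted letters. The sketch below assumes case-insensitive matching and keeps the original spelling in the output; the function name group_anagrams is illustrative.

```python
from collections import defaultdict


def group_anagrams(terms):
    """Group terms that are anagrams of each other, preserving input order."""
    groups = defaultdict(list)
    for term in terms:
        # Anagrams share the same sorted-letter signature.
        key = "".join(sorted(term.lower()))
        groups[key].append(term)
    return list(groups.values())
```

Sorting costs O(m log m) per term of length m. A 26-slot letter-count tuple would make the key O(m), at the cost of assuming a fixed alphabet.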

Data Engineer • Coding • medium

Design an algorithm to find the Top K frequent words in a continuous stream of tweets (Heavy Hitters problem).

#Count-Min Sketch #Heaps #Streaming Algorithms
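
A rough sketch of the Count-Min Sketch idea from the tags, paired with a small candidate set for the top K. The table dimensions, the use of salted BLAKE2b as the per-row hash, and both class names are assumptions for illustration. For brevity the candidate set is a plain dict scanned in O(k) rather than a heap.

```python
import hashlib


class CountMinSketch:
    """Approximate counter: estimates never undercount, but may overcount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        data = item.encode("utf-8")
        for row in range(self.depth):
            # A different salt per row gives independent-looking hash functions.
            digest = hashlib.blake2b(
                data, digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item):
        for row, col in self._cells(item):
            self.table[row][col] += 1

    def estimate(self, item):
        return min(self.table[row][col] for row, col in self._cells(item))


class TopKWords:
    """Tracks up to k candidate heavy hitters using sketch estimates."""

    def __init__(self, k, sketch=None):
        self.k = k
        self.sketch = sketch if sketch is not None else CountMinSketch()
        self.candidates = {}  # word -> latest estimated count

    def add(self, word):
        self.sketch.add(word)
        est = self.sketch.estimate(word)
        if word in self.candidates or len(self.candidates) < self.k:
            self.candidates[word] = est
            return
        weakest = min(self.candidates, key=self.candidates.get)
        if est > self.candidates[weakest]:
            del self.candidates[weakest]
            self.candidates[word] = est

    def top(self):
        return sorted(self.candidates.items(), key=lambda kv: (-kv[1], kv[0]))
```

Memory stays fixed at width × depth counters regardless of vocabulary size, which is the point of the sketch. The trade-off is that hash collisions can inflate estimates, so results are approximate on large streams.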

Data Engineer • Coding • medium

Implement an LRU Cache to store recently accessed user profiles.

#Linked Lists #Hash Maps #Caching
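
A compact sketch using Python's OrderedDict, which combines the hash map and doubly linked list that the tags refer to. The class name and the choice to return None on a miss are assumptions; an interviewer may ask for the linked list to be written by hand.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used profile."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, user_id):
        if user_id not in self._items:
            return None
        self._items.move_to_end(user_id)  # mark as most recently used
        return self._items[user_id]

    def put(self, user_id, profile):
        if user_id in self._items:
            self._items.move_to_end(user_id)
        self._items[user_id] = profile
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
```

Both get and put run in O(1) time.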

Data Engineer • Coding • medium

Implement a Trie data structure to support Twitter search autocomplete.

#Trees #Trie #String Manipulation
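
A minimal Trie sketch with prefix lookup, assuming suggestions are returned in lexicographic order and capped by a limit. A production autocomplete would likely rank by popularity instead; the class and method names here are illustrative.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # character -> TrieNode
        self.is_end = False  # True if a stored term ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True

    def autocomplete(self, prefix, limit=5):
        """Return up to limit stored terms starting with prefix, in sorted order."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, word = stack.pop()
            if current.is_end:
                results.append(word)
            # Push children in reverse so the smallest character is popped first.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], word + ch))
        return results
```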

Data Engineer • Coding • easy

Write a function to validate a JSON payload representing a Tweet object, ensuring all required fields are present and correctly typed.

#JSON #Type Checking #Error Handling
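
A sketch of a validator that collects every problem instead of failing on the first one. The required fields and their types below are a hypothetical schema chosen for illustration, not the actual Tweet object definition.

```python
import json

# Hypothetical schema: field name -> expected Python type after JSON decoding.
REQUIRED_FIELDS = {
    "id": str,
    "text": str,
    "author_id": str,
    "created_at": str,
    "like_count": int,
}


def validate_tweet(payload):
    """Return a list of error messages; an empty list means the payload is valid."""
    try:
        tweet = json.loads(payload)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(tweet, dict):
        return ["payload must be a JSON object"]
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in tweet:
            errors.append(f"missing field: {field}")
            continue
        value = tweet[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return errors
```

Returning all errors at once tends to be friendlier for pipeline debugging than raising on the first failure.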

Data Engineer • System Design • medium

Design a relational data model for Twitter Spaces analytics, tracking hosts, listeners, and duration.

#Entity-Relationship #Normalization #Fact/Dimension Tables

Data Engineer • System Design • hard

Design the data pipeline for Twitter's View Count feature, ensuring real-time updates and high throughput.

#Stream Processing #Kafka #Redis #Event Sourcing

Data Engineer • System Design • hard

Design a real-time trending topics system capable of processing millions of tweets per second.

#Heavy Hitters #Stream Processing #Distributed Systems

Data Engineer • System Design • hard

How would you architect the migration of a massive on-premise Hadoop cluster to GCP BigQuery with zero downtime?

#Cloud Migration #BigQuery #Dual Writes #Data Validation

Data Engineer • System Design • hard

Design an ad-click attribution pipeline that handles late-arriving events and ensures exactly-once processing.

#Exactly-Once Semantics #Watermarks #Data Pipelines

Data Engineer • System Design • hard

Architect a streaming system to detect spam and bot activity in real-time as tweets are published.

#Machine Learning Pipelines #Real-time Streaming #Feature Engineering

Data Engineer • System Design • hard

Design a data lake architecture for storing, partitioning, and querying 10PB of daily tweet logs efficiently.

#Data Lake #Partitioning #Parquet #Iceberg/Hudi

Data Engineer • System Design • hard

How would you design the batch and streaming data pipelines to generate features for the 'For You' timeline recommendation engine?

#Feature Store #Lambda Architecture #Graph Processing

Data Engineer • Technical • medium

Write a SQL query to calculate the 7-day rolling average of tweets per user.

#Window Functions #Aggregations #Time Series

Data Engineer • Technical • medium

Write a SQL query to find the top 3 trending hashtags per country on a given day using window functions.

#Window Functions #Ranking #CTEs

Data Engineer • Technical • medium

Write a SQL query to find users who have retweeted a specific tweet but do not follow the original author.

#Joins #Subqueries #Set Operations

Data Engineer • Technical • hard

Write a SQL query to calculate the conversion rate of ad impressions to clicks within a 1-hour window for each ad campaign.

#Time-based Joins #Aggregations #Performance Tuning

Data Engineer • Technical • easy

Write a SQL query to identify potential bots by finding users who tweeted more than 100 times in a single minute.

#GROUP BY #HAVING #Date Truncation

Data Engineer • Technical • medium

Write a SQL query to find the median number of followers for users who joined X in 2023.

#Percentiles #Window Functions #Statistics

Data Engineer • Technical • medium

Given a table of user follows, write a SQL query to find all mutuals (users who follow each other).

#Self Joins #Filtering
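
A self-join sketch, run against an in-memory SQLite database so it can be tried locally. The table name follows, its columns (follower_id, followee_id), and the assumption that each follow pair appears at most once are all hypothetical.

```python
import sqlite3

# Join the table to itself on the reversed pair; the WHERE clause keeps one row per mutual pair.
MUTUALS_SQL = """
SELECT a.follower_id AS user_a, a.followee_id AS user_b
FROM follows AS a
JOIN follows AS b
  ON a.follower_id = b.followee_id
 AND a.followee_id = b.follower_id
WHERE a.follower_id < a.followee_id
ORDER BY user_a, user_b
"""


def find_mutuals(rows):
    """rows: iterable of (follower_id, followee_id) pairs, assumed unique."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER)")
        conn.executemany("INSERT INTO follows VALUES (?, ?)", rows)
        return conn.execute(MUTUALS_SQL).fetchall()
    finally:
        conn.close()
```

Without the `a.follower_id < a.followee_id` filter, each mutual pair would be reported twice, once in each direction.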

Data Engineer • Technical • hard

Explain how you would optimize a PySpark job that is suffering from severe data skew due to a viral tweet from Elon Musk.

#Spark #Data Skew #Salting #Broadcast Joins

Data Engineer • Technical • medium

How does Kafka handle message ordering, and how would you ensure ordered processing of a single user's tweets across partitions?

#Kafka #Partitioning #Message Ordering

Data Engineer • Technical • medium

Compare Apache Flink and Spark Streaming. Which would you choose for calculating real-time engagement metrics at X, and why?

#Flink #Spark Streaming #Micro-batching vs Native Streaming

Data Engineer • Technical • easy

Explain the differences between Parquet and Avro file formats. When would you use each in our data ecosystem?

#File Formats #Parquet #Avro #Columnar vs Row-based

Data Engineer • Technical • hard

How would you handle exactly-once processing semantics in a Kafka to BigQuery streaming pipeline?

#Exactly-Once #Kafka #BigQuery #Idempotency

Difficulty Radar

Based on recent AI-sourced data.

Meet Your Interviewers

The "Standard" Interviewer

Senior Engineer

Focuses on core competencies, system constraints, and clear communication.

Unwritten Rules

Think Out Loud

Always explain your thought process before writing code or drawing architecture.
