Twitter / X
Real-time social platform with petabyte-scale data and ML ranking systems.
4 Rounds
~14 Days
Hard
The Interview Loop
Recruiter Screen (30 min)
Standard fit check, behavioral questions, and resume overview.
Technical Loop (3-4 Rounds)
Deep dive into domain knowledge, coding, and system design.
Interview Question Bank
DevOps Engineer • Behavioral • medium
You receive an alert that API latency has spiked by 400% globally. Walk me through your incident response process from acknowledgment to resolution.
#SRE
#Incident Response
#On-call
#Post-mortem
DevOps Engineer • Behavioral • hard
X moves extremely fast and sometimes breaks things. Tell me about a time you had to bypass standard procedures to ship something critical on a tight deadline.
#Agility
#Risk Management
#Decision Making
DevOps Engineer • Behavioral • medium
Tell me about a time you had to work extremely long hours to resolve a critical production outage. How did you handle the pressure and team dynamics?
#Resilience
#Teamwork
#High Pressure
DevOps Engineer • Behavioral • medium
You are given 5 critical P0 issues at the exact same time during a major site-wide event. How do you prioritize and handle them?
#Prioritization
#Incident Management
#Communication
DevOps Engineer • Behavioral • medium
Tell me about a time you strongly disagreed with a technical decision made by leadership. How did you handle it, and what was the outcome?
#Conflict Resolution
#Communication
#Ownership
DevOps Engineer • Coding • easy
Write a script in Python or Bash to parse a massive Nginx access log file, extract all IP addresses, and return the top 10 IPs that encountered 5xx HTTP status codes.
#Bash
#Python
#Log Parsing
#Regex
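One possible starting point for the log-parsing question above, offered as a sketch rather than a reference answer. It assumes the default Nginx `combined` log format, where the client IP comes first and the status code follows the quoted request line. The names `LINE_RE` and `top_5xx_ips` are made up for this example. Lines are consumed lazily, so a large file can be streamed instead of loaded into memory.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumption: default Nginx "combined" format, e.g.
# 1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 502 157 "-" "curl/8.0"
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')


def top_5xx_ips(lines: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Count 5xx responses per client IP and return the n most frequent."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        # Lines that do not fit the assumed format are skipped silently.
        if match and match.group(2).startswith("5"):
            counts[match.group(1)] += 1
    return counts.most_common(n)
```

Passing an open file handle (for example `top_5xx_ips(open("access.log"))`, where the path is hypothetical) iterates line by line. A custom `log_format` would need a different pattern.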
DevOps Engineer • Coding • hard
Implement a basic rate limiter in Python using a token bucket algorithm to protect our Tweet posting API from abuse.
#Python
#Rate Limiting
#Concurrency
#Algorithms
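A minimal token bucket sketch for the rate-limiter question above, assuming a single process and one bucket per client. The class name `TokenBucket`, the `allow` method, and the injectable `clock` parameter are illustrative choices, not part of any real X API. A lock guards the state because the question is tagged with concurrency.

```python
import threading
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity` and refills at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._tokens = float(capacity)  # start with a full bucket
        self._clock = clock             # injectable so tests can control time
        self._last = clock()
        self._lock = threading.Lock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if enough are available."""
        with self._lock:
            now = self._clock()
            elapsed = now - self._last
            self._last = now
            # Refill lazily based on elapsed time, capped at capacity.
            self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False
```

In a multi-host deployment the bucket state would have to live in shared storage; this sketch does not cover that case.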
DevOps Engineer • Coding • easy
Write a script to recursively find and delete files older than 30 days in a given directory, but keep files with a '.log' extension.
#Bash
#Linux
#File Management
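A hedged Python sketch for the cleanup question above. It assumes "older than 30 days" means last-modified time, and it defaults to a dry run so nothing is deleted by accident. The function name `purge_old_files` and its parameters are invented for illustration.

```python
import os
import time
from pathlib import Path
from typing import List, Optional


def purge_old_files(root, max_age_days: int = 30, keep_suffix: str = ".log",
                    dry_run: bool = True,
                    now: Optional[float] = None) -> List[Path]:
    """Return files under root older than max_age_days; delete them unless dry_run."""
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    matched: List[Path] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(keep_suffix):
                continue  # files with the protected extension are always kept
            path = Path(dirpath) / name
            # lstat() inspects a symlink itself instead of following it.
            if path.lstat().st_mtime < cutoff:
                if not dry_run:
                    path.unlink()
                matched.append(path)
    return matched
```

Running once with `dry_run=True` and reviewing the returned list before deleting is a cheap safeguard.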
DevOps Engineer • Coding • medium
Given a list of server logs with timestamps, write a program to find the peak traffic window of exactly 5 minutes.
#Sliding Window
#Data Processing
#Time Series
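One way to approach the peak-window question above is a two-pointer sliding window over sorted timestamps. The sketch below assumes timestamps are already parsed into epoch seconds and treats the window as half-open, `[start, start + 300)`. The function name `peak_window` is hypothetical.

```python
from typing import Iterable, Tuple


def peak_window(timestamps: Iterable[float],
                window_seconds: float = 300) -> Tuple[float, int]:
    """Return (window_start, request_count) for the busiest window."""
    ts = sorted(timestamps)
    if not ts:
        raise ValueError("no timestamps given")
    best_start, best_count = ts[0], 0
    left = 0
    for right, current in enumerate(ts):
        # Shrink from the left until the window spans less than window_seconds.
        while current - ts[left] >= window_seconds:
            left += 1
        count = right - left + 1
        if count > best_count:
            best_start, best_count = ts[left], count
    return best_start, best_count
```

Only windows that begin at an event are considered, which is sufficient because any busiest window can be shifted right until it starts on an event without losing one. After sorting, the scan itself is linear.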
DevOps Engineer • Coding • easy
Write a function to validate if a given string is a valid IPv4 or IPv6 address without using built-in IP validation libraries.
#String Manipulation
#Networking
#Validation
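A sketch for the validation question above, using only string operations. The assumptions are stated up front: IPv4 octets with leading zeros are rejected, and IPv6 supports a single `::` compression but not embedded IPv4 suffixes or zone IDs. All function names here are invented for the example.

```python
import string

_HEX = set(string.hexdigits)


def is_ipv4(s: str) -> bool:
    parts = s.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        if not (part.isascii() and part.isdigit()) or len(part) > 3:
            return False
        if len(part) > 1 and part[0] == "0":  # assumption: no leading zeros
            return False
        if int(part) > 255:
            return False
    return True


def _hex_groups_ok(groups) -> bool:
    return all(1 <= len(g) <= 4 and all(c in _HEX for c in g) for g in groups)


def is_ipv6(s: str) -> bool:
    if s.count("::") > 1:
        return False
    if "::" in s:
        head, tail = s.split("::")
        groups = (head.split(":") if head else []) + (tail.split(":") if tail else [])
        # "::" stands for at least one all-zero group, so at most 7 remain.
        return len(groups) <= 7 and _hex_groups_ok(groups)
    groups = s.split(":")
    return len(groups) == 8 and _hex_groups_ok(groups)


def classify_ip(s: str) -> str:
    if is_ipv4(s):
        return "IPv4"
    if is_ipv6(s):
        return "IPv6"
    return "Neither"
```

An interviewer may expect different edge-case rules (for example, accepting `::ffff:192.0.2.1`), so it is worth stating these assumptions aloud.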
DevOps Engineer • Coding • medium
Write a script to query a Prometheus HTTP API endpoint and trigger a Slack webhook alert if CPU usage across a cluster exceeds 90% for 5 consecutive minutes.
#Python
#Prometheus
#API Integration
#Alerting
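The decision logic for the alerting question above can be sketched separately from the network calls. The code below assumes the caller has already fetched a range query from the Prometheus HTTP API (`/api/v1/query_range`) with a 60-second step, so each series carries `[timestamp, "value"]` pairs, and that the query itself yields CPU usage as a percentage. The function names, the threshold handling, and the message text are illustrative. Slack incoming webhooks accept a JSON body with a `text` field.

```python
import json
from typing import Any, Dict


def sustained_breach(response: Dict[str, Any], threshold: float = 90.0,
                     required_points: int = 5) -> bool:
    """True if any series has `required_points` consecutive samples above threshold."""
    for series in response["data"]["result"]:
        streak = 0
        for _timestamp, raw_value in series["values"]:
            # Prometheus encodes sample values as strings.
            streak = streak + 1 if float(raw_value) > threshold else 0
            if streak >= required_points:
                return True
    return False


def slack_payload(threshold: float, required_points: int) -> str:
    """Build the JSON body to POST to a Slack incoming webhook URL."""
    message = (f"CPU usage above {threshold:.0f}% for "
               f"{required_points} consecutive samples")
    return json.dumps({"text": message})
```

Fetching the data and posting the payload (for example with `urllib.request`) are left out so the logic stays testable offline. In production, a Prometheus alerting rule with a `for:` clause routed through Alertmanager is a common alternative to a polling script.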
DevOps Engineer • Coding • hard
Write a Terraform module to provision an AWS EKS cluster with managed node groups, VPC CNI, and IAM OIDC integration.
#Terraform
#AWS
#EKS
#IAM
DevOps Engineer • Coding • medium
Write a Bash script to monitor disk usage. If any partition exceeds 85%, the script should find the top 5 largest directories in that partition and email the report.
#Bash
#Linux
#Monitoring
DevOps Engineer • System Design • hard
We recently migrated a significant portion of our infrastructure from cloud back to bare-metal to optimize costs. Walk me through how you would architect the automated provisioning of 10,000 bare-metal servers across multiple data centers.
#Bare-metal
#Automation
#PXE
#Ansible
#Data Center
DevOps Engineer • System Design • hard
Design a multi-region Kubernetes cluster architecture for X's timeline service to ensure 99.99% uptime, even if an entire region goes offline.
#Kubernetes
#High Availability
#Disaster Recovery
#Global Load Balancing
DevOps Engineer • System Design • hard
Design the caching layer for the X timeline using Redis to handle millions of concurrent reads and thousands of writes per second.
#Redis
#Caching
#High Throughput
#Data Modeling
DevOps Engineer • System Design • medium
Design a monitoring and alerting stack for a newly launched live video streaming feature on X. What metrics are most critical?
#Prometheus
#Grafana
#Video Streaming
#SLIs/SLOs
DevOps Engineer • System Design • medium
How do you implement zero-downtime deployments for a stateless microservice receiving 100k requests per second?
#Deployment Strategies
#Load Balancing
#Kubernetes
DevOps Engineer • System Design • hard
Explain how you would implement aggressive auto-scaling for stateless microservices to handle sudden viral events (like the Super Bowl) while minimizing idle compute costs.
#Auto-scaling
#KEDA
#Kubernetes
#Cost Optimization
DevOps Engineer • System Design • hard
We are moving away from managed cloud services to self-hosted solutions to save money. How would you design a highly available, self-hosted Kafka cluster across 3 data centers?
#Kafka
#Distributed Systems
#Bare-metal
#High Availability
DevOps Engineer • Technical • hard
X has a massive monorepo. How would you optimize our CI pipeline to reduce build and test times from 45 minutes to under 10 minutes?
#Bazel
#Caching
#Parallelization
#Monorepo
DevOps Engineer • Technical • medium
We need to reduce our AWS/GCP cloud bill by $10M a month. Walk me through your strategy to identify and eliminate waste in a Kubernetes-heavy environment.
#FinOps
#Kubernetes
#Cloud Compute
#Resource Requests/Limits
DevOps Engineer • Technical • medium
What exactly happens under the hood when you run `kubectl apply -f deployment.yaml`?
#Kubernetes Architecture
#API Server
#etcd
#Kubelet
#Controllers
DevOps Engineer • Technical • easy
How do you troubleshoot a Kubernetes pod that is stuck in CrashLoopBackOff, especially if the container logs are completely empty?
#Kubernetes
#Debugging
#Containers
DevOps Engineer • Technical • medium
A user complains that images on X are loading slowly in a specific geographic region (e.g., Southeast Asia). How do you troubleshoot this?
#CDN
#DNS
#Latency
#BGP
#Traceroute
DevOps Engineer • Technical • hard
Explain the TCP 3-way handshake. How would you tune TCP parameters on a Linux kernel to handle millions of high-throughput, low-latency connections?
#TCP/IP
#Linux Kernel
#Sysctl
#Networking
DevOps Engineer • Technical • hard
How do you perform a schema migration on a massive PostgreSQL database table with billions of rows without locking the table or causing downtime?
#PostgreSQL
#Database Migrations
#Zero-downtime
DevOps Engineer • Technical • medium
How do you manage Terraform state files in a team of 50 engineers to prevent race conditions, state corruption, and security leaks?
#Terraform
#State Management
#Security
DevOps Engineer • Technical • easy
What is an inode? What happens when a Linux system runs out of inodes even if there is plenty of disk space left, and how do you fix it?
#Linux Filesystem
#Inodes
#Troubleshooting
DevOps Engineer • Technical • hard
How would you use StatefulSets in Kubernetes to run a high-throughput, distributed database like Cassandra?
#StatefulSets
#Cassandra
#Persistent Volumes
#Storage
DevOps Engineer • Technical • medium
Explain the difference between eventual consistency and strong consistency. Give an example of where you would use each within X's architecture.
#Distributed Systems
#CAP Theorem
#Databases
DevOps Engineer • Technical • medium
Describe your approach to managing secrets in a CI/CD pipeline. How do you prevent developers from accidentally hardcoding API keys?
#Secret Management
#Vault
#CI/CD
#DevSecOps
DevOps Engineer • Technical • medium
How does DNS resolution work? Walk me through the steps, and explain how you would configure DNS failover for a global service.
#DNS
#Failover
#Routing
DevOps Engineer • Technical • medium
A PostgreSQL database is experiencing 100% CPU utilization and extremely slow queries. How do you identify the root cause and resolve it?
#PostgreSQL
#Performance Tuning
#Troubleshooting
DevOps Engineer • Technical • medium
How do you calculate and enforce Error Budgets and SLOs for a critical service like the Tweet posting API?
#SLOs
#Error Budgets
#SRE Practices
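The arithmetic behind the error-budget question above is small enough to sketch. This example assumes a request-based availability SLO over a fixed window; the names `BudgetReport`, `error_budget`, and `releases_allowed` are hypothetical, and the release-freeze rule is one common policy rather than a known X practice.

```python
from dataclasses import dataclass


@dataclass
class BudgetReport:
    allowed_failures: float    # failures the SLO tolerates in the window
    remaining: float           # allowed failures not yet spent (may go negative)
    consumed_fraction: float   # 1.0 means the budget is fully spent


def error_budget(slo_target: float, total_requests: int,
                 failed_requests: int) -> BudgetReport:
    """Compute the error budget for a request-based SLO such as 0.999."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed = (1 - slo_target) * total_requests
    consumed = failed_requests / allowed if allowed > 0 else 0.0
    return BudgetReport(allowed, allowed - failed_requests, consumed)


def releases_allowed(report: BudgetReport) -> bool:
    """Example enforcement policy: freeze risky releases once the budget is spent."""
    return report.remaining > 0
```

With a 99.9% target and 1,000,000 requests in the window, the budget is about 1,000 failed requests, so 250 failures would consume roughly a quarter of it.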
Meet Your Interviewers
The "Standard" Interviewer
Senior Engineer
Focuses on core competencies, system constraints, and clear communication.
Unwritten Rules
Think Out Loud
Always explain your thought process before writing code or drawing architecture.