Anthropic
AI safety and research company behind Claude, focusing on constitutional AI.
5 Rounds
~20 Days
Very Hard
The Interview Loop
Recruiter Screen (30 min)
Standard fit check, behavioral questions, and resume overview.
Technical Loop (3-4 Rounds)
Deep dive into domain knowledge, coding, and system design.
Interview Question Bank
Backend Engineer
•
Coding
•
hard
Write a function to merge K sorted asynchronous streams of data into a single sorted stream. You cannot load all data into memory at once.
#Heaps
#Asynchronous Programming
#Streaming
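A minimal Python sketch of one common approach to the question above: a K-way merge driven by a min-heap that holds at most one pending item per stream, so memory stays proportional to K rather than to the data size. The names `merge_sorted_streams` and `_demo_stream` are hypothetical, and the sketch assumes each stream is an async iterator that is already sorted in ascending order.

```python
import asyncio
import heapq
from typing import AsyncIterator, List, Tuple


async def merge_sorted_streams(streams: List[AsyncIterator[int]]) -> AsyncIterator[int]:
    """Yield items from K individually sorted async streams in global sorted order."""
    iterators = [stream.__aiter__() for stream in streams]
    heap: List[Tuple[int, int]] = []  # (value, stream index); at most K entries

    # Prime the heap with the first item of every non-empty stream.
    for index, iterator in enumerate(iterators):
        try:
            heap.append((await iterator.__anext__(), index))
        except StopAsyncIteration:
            pass
    heapq.heapify(heap)

    while heap:
        value, index = heapq.heappop(heap)
        yield value
        # Refill from the stream that produced the item just emitted.
        try:
            heapq.heappush(heap, (await iterators[index].__anext__(), index))
        except StopAsyncIteration:
            pass


async def _demo_stream(values):
    """Hypothetical helper that simulates an asynchronous sorted source."""
    for value in values:
        await asyncio.sleep(0)
        yield value


async def _collect() -> List[int]:
    sources = [_demo_stream([1, 4, 9]), _demo_stream([2, 3, 10]), _demo_stream([])]
    return [item async for item in merge_sorted_streams(sources)]


merged = asyncio.run(_collect())
```

This version awaits one stream at a time. A follow-up discussion could cover prefetching from all streams concurrently to hide latency.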
Backend Engineer
•
Coding
•
medium
Given a massive log file of API requests, write a script to find the 99th percentile latency. The file is too large to fit into memory.
#Data Processing
#Approximation Algorithms
#File I/O
Backend Engineer
•
Coding
•
hard
Given a stream of tokens (strings), implement a data structure to efficiently find the top K most frequent tokens in a sliding window of the last N minutes.
#Streaming Data
#Heaps
#Sliding Window
Backend Engineer
•
Coding
•
hard
Write a program to justify text. Given an array of words and a maximum width, format the text so that each line has exactly that many characters and is fully (left and right) justified.
#String Manipulation
#Array
#Simulation
Backend Engineer
•
Coding
•
hard
Given a string representing a user prompt, find the longest repeating substring. This is useful for detecting repetitive loops in context windows.
#String Manipulation
#Dynamic Programming
#Suffix Trees
Backend Engineer
•
Coding
•
hard
Implement a streaming JSON parser that can take chunks of a JSON string (as they are generated by an LLM) and yield valid parsed objects as soon as they are complete.
#Parsing
#State Machines
#String Manipulation
Backend Engineer
•
Coding
•
medium
Implement a thread-safe Rate Limiter using the Token Bucket algorithm. It should support multiple users and handle concurrent requests efficiently.
#Concurrency
#Data Structures
#API Design
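As a rough illustration of the Token Bucket idea, the Python sketch below keeps one bucket per user behind a single lock. The class name `TokenBucketLimiter`, its parameters, and the injectable `clock` are assumptions made for this example. A single global lock is the simplest correct choice, and a per-user lock would be a natural refinement under heavy contention.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-user token buckets guarded by one lock (illustrative sketch)."""

    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock  # injectable so tests can control time
        self._lock = threading.Lock()
        # user id -> (tokens currently available, time of last refill)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, user_id: str, cost: float = 1.0) -> bool:
        """Return True and consume `cost` tokens if the user has enough."""
        now = self._clock()
        with self._lock:
            tokens, last = self._buckets.get(user_id, (self.capacity, now))
            # Lazily refill based on elapsed time, capped at capacity.
            tokens = min(self.capacity, tokens + (now - last) * self.refill_per_second)
            allowed = tokens >= cost
            if allowed:
                tokens -= cost
            self._buckets[user_id] = (tokens, now)
            return allowed
```

Refilling lazily on each call avoids a background timer thread. The bucket only needs the timestamp of its last update.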
Backend Engineer
•
Coding
•
hard
Write an asynchronous task scheduler in Python (using asyncio) or Rust (using tokio) that executes a DAG (Directed Acyclic Graph) of tasks with maximum concurrency.
#Graph Theory
#Asynchronous Programming
#Concurrency
Backend Engineer
•
Coding
•
medium
Implement a deep copy function for a complex graph data structure that may contain cycles. Ensure that nodes are duplicated correctly without infinite loops.
#Graph Theory
#Recursion
#Hash Map
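One way to approach this, sketched in Python: keep a map from each original node to its copy, and consult it before creating a new node so that cycles terminate. The `Node` class and `clone_graph` function are hypothetical names for this example. The traversal is iterative to avoid hitting the recursion limit on deep graphs.

```python
from typing import Dict, List, Optional


class Node:
    """Hypothetical graph node with a value and outgoing edges."""

    def __init__(self, value: int) -> None:
        self.value = value
        self.neighbors: List["Node"] = []


def clone_graph(start: Optional[Node]) -> Optional[Node]:
    """Deep-copy every node reachable from `start`, preserving cycles."""
    if start is None:
        return None
    copies: Dict[Node, Node] = {start: Node(start.value)}  # original -> copy
    stack = [start]
    while stack:
        original = stack.pop()
        copy = copies[original]
        for neighbor in original.neighbors:
            if neighbor not in copies:
                # First visit: create the copy and schedule its edges.
                copies[neighbor] = Node(neighbor.value)
                stack.append(neighbor)
            copy.neighbors.append(copies[neighbor])
    return copies[start]
```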
Cloud Engineer
•
Coding
•
medium
Write a Go program that concurrently health-checks a list of internal model endpoints. It should implement a worker pool, timeout after 2 seconds per request, and aggregate the results into a summary report.
#Go
#Concurrency
#Networking
#Error Handling
Cloud Engineer
•
Coding
•
hard
Given a JSON response from a cloud API containing nested resource dependencies, write an algorithm to determine the correct deletion order.
#Graphs
#Topological Sort
#DFS
#JSON Parsing
Data Engineer
•
Coding
•
medium
Write a Python generator function to efficiently parse a 500GB JSONL file containing web crawl data, filtering out documents that do not contain a specific set of keywords, without loading the entire file into memory.
#Python
#Generators
#Memory Management
#File I/O
Data Engineer
•
Coding
•
hard
Write a Python function to efficiently find near-duplicate text documents in a large corpus. You do not need to implement the full distributed system, but implement the core hashing logic (e.g., MinHash) and explain how you would scale it across a cluster.
#Hashing
#Text Processing
#Optimization
Data Engineer
•
Coding
•
medium
Write a Python program that takes a massive JSONL file of Wikipedia articles and chunks the text into overlapping segments of exactly 512 tokens (assume a simple whitespace tokenizer for this exercise), while preserving the document metadata in each chunk. The file is larger than available RAM.
#Generators
#Memory Management
#Text Processing
Data Engineer
•
Coding
•
medium
Implement a rate limiter in Python for our API. The rate limiter should allow a user to make up to N requests per minute, but also enforce a maximum of M tokens generated per day. How would you make this distributed across multiple API servers?
#Data Structures
#Concurrency
#API Design
Data Engineer
•
Coding
•
medium
Implement a Trie (Prefix Tree) data structure in Python. Then, write a method to find all words in the Trie that share a given prefix. Explain how this relates to LLM tokenization.
#Data Structures
#Trees
#String Manipulation
Data Engineer
•
Coding
•
hard
You have a stream of incoming chat logs. Write a Python algorithm to maintain the top K most frequent words over a sliding window of 1 hour.
#Streaming Algorithms
#Heaps
#Sliding Window
Data Engineer
•
Coding
•
medium
Write a Python script that implements a custom MapReduce framework using the `multiprocessing` library to count the frequency of n-grams in a large corpus of text files.
#Concurrency
#MapReduce
#Python
Data Engineer
•
Coding
•
hard
Given a directed acyclic graph (DAG) representing data pipeline dependencies, write a Python function to execute the tasks in parallel where possible, respecting the dependency order. Assume each task is a sleep function.
#Graphs
#Topological Sort
#Concurrency
Data Engineer
•
Coding
•
hard
Given a massive string of text, write an algorithm to find the longest repeating substring. This is a simplified version of finding duplicated boilerplate text in web scrapes.
#String Algorithms
#Suffix Arrays
#Dynamic Programming
Data Engineer
•
Coding
•
medium
We need to create a pre-training dataset with a specific language distribution (e.g., 60% English, 20% Spanish, 20% French). Write a script to sample proportionally from a massive, unsorted stream of multilingual documents.
#Sampling
#Probability
#Streaming Algorithms
Data Engineer
•
Coding
•
medium
Write a function that takes a stream of text and a target keyword, and returns a sliding window of N tokens before and after every occurrence of the keyword. Handle edge cases like overlapping windows.
#Sliding Window
#Text Processing
#Queues
Data Engineer
•
Coding
•
hard
Given a massive dataset of text documents, implement a MinHash and Locality-Sensitive Hashing (LSH) algorithm in Python to identify near-duplicate documents. How would you scale this across a distributed cluster?
#Hashing
#Deduplication
#Big Data
#Distributed Systems
Data Engineer
•
Coding
•
hard
Given two large documents, write an algorithm to find the longest common contiguous substring. This is used in our pipeline to detect data contamination between training and evaluation sets.
#Dynamic Programming
#Suffix Trees
#Strings
Data Engineer
•
Coding
•
medium
Write a program to compute the top K most frequent tokens in a continuous, infinite stream of text. Optimize for both time and space complexity.
#Heaps
#Hash Maps
#Streaming
Data Engineer
•
Coding
•
hard
Implement a thread-safe Token Bucket rate limiter in Python. This will be used to throttle incoming requests to our data ingestion API to prevent overwhelming the downstream Kafka cluster.
#Concurrency
#Rate Limiting
#System Design
Data Engineer
•
Coding
•
easy
Given a list of text spans representing PII (Personally Identifiable Information) redactions in a document, where each span is a tuple of (start_index, end_index), write a function to merge all overlapping spans.
#Intervals
#Arrays
#Sorting
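A short Python sketch of the usual sort-then-sweep approach. The function name `merge_spans` is hypothetical, and the sketch assumes that spans which merely touch (one ends where the next starts) should also be merged.

```python
from typing import List, Tuple


def merge_spans(spans: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping (start_index, end_index) spans."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous span: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Sorting dominates the cost, so the function runs in O(n log n) time.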
Data Engineer
•
Coding
•
medium
Write a Python function to process a 500GB JSONL file of raw text data. You need to filter out documents containing specific blocklisted keywords, compute a basic word count across the valid documents, and output the clean data to a new file. You have 8GB of RAM.
#Python
#Generators
#Memory Management
#I/O
Data Engineer
•
Coding
•
hard
Implement a distributed rate limiter in Python. Assume this will be used to throttle API requests for our Claude models based on a user's tier (e.g., tokens per minute).
#Concurrency
#Redis
#Token Bucket
#Distributed Systems
Data Engineer
•
Coding
•
medium
Given a list of overlapping time intervals representing periods when a GPU cluster was fully utilized, write a function to merge all overlapping intervals and return the total duration of full utilization.
#Sorting
#Intervals
#Python
Data Scientist
•
Coding
•
hard
Implement an algorithm to find the longest common substring between two large text prompts. We use this to identify potential prompt injection templates spreading among users.
#Dynamic Programming
#String Manipulation
#Security
Data Scientist
•
Coding
•
medium
Write a Python function to efficiently deduplicate a massive dataset of text documents (billions of tokens) prior to model pre-training. What algorithmic approach would you use?
#Python
#Data Deduplication
#MinHash
#LSH
Data Scientist
•
Coding
•
medium
Implement a function in Python to calculate the Elo rating update for two LLMs given a human preference rating (win, loss, or tie).
#Python
#Math
#Algorithms
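A sketch of the standard Elo update in Python. The function name `elo_update` and the default K-factor of 32 are assumptions for this example. The outcome is encoded as a score for model A: 1.0 for a win, 0.5 for a tie, 0.0 for a loss.

```python
from typing import Tuple


def elo_update(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> Tuple[float, float]:
    """Return the new (rating_a, rating_b) after one comparison."""
    # Expected score of A under the logistic Elo model.
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    delta = k * (score_a - expected_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta
```

For two models rated 1500 each, a win for A moves the ratings to 1516 and 1484, and a tie leaves both unchanged.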
Data Scientist
•
Coding
•
medium
Write a Python function using NumPy to efficiently compute the cosine similarity between a single target embedding vector and a matrix of 1 million document embeddings.
#Python
#NumPy
#Linear Algebra
Data Scientist
•
Coding
•
medium
Implement a stratified sampling algorithm to select 10,000 prompt-response pairs for human evaluation, ensuring the sample exactly matches the real-world distribution of 15 different safety categories.
#Python
#Sampling
#Statistics
DevOps Engineer
•
Coding
•
easy
Write a function to implement a basic Round Robin load balancer. It should take a list of servers and return the next server to route a request to.
#Load Balancing
#Data Structures
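A minimal Python sketch, assuming servers are identified by strings and the list is fixed at construction time. The class name `RoundRobinBalancer` is hypothetical. The lock is there so that concurrent callers do not receive the same server twice in a row.

```python
import itertools
import threading
from typing import List


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(list(servers))
        self._lock = threading.Lock()

    def next_server(self) -> str:
        with self._lock:
            return next(self._cycle)
```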
DevOps Engineer
•
Coding
•
hard
Given a list of overlapping IP CIDR blocks, write a function to merge them into the minimum number of non-overlapping CIDR blocks.
#Networking
#Algorithms
#Intervals
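In Python, the standard library's `ipaddress.collapse_addresses` already performs this merge, so a sketch can lean on it rather than hand-rolling the range arithmetic. An interviewer may well ask for the from-scratch version, so treat this as a reference point. The wrapper name `merge_cidrs` is hypothetical, and the sketch assumes all blocks share one IP version and have no host bits set.

```python
import ipaddress
from typing import List


def merge_cidrs(blocks: List[str]) -> List[str]:
    """Collapse overlapping or adjacent CIDR blocks into a minimal set."""
    networks = [ipaddress.ip_network(block) for block in blocks]
    return [str(network) for network in ipaddress.collapse_addresses(networks)]
```

For example, `10.0.0.0/25` and `10.0.0.128/25` are siblings and collapse into `10.0.0.0/24`, while `192.168.1.0/24` is absorbed by `192.168.0.0/16`.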
DevOps Engineer
•
Coding
•
medium
Implement a basic rate limiter class in Python or Go using the Token Bucket algorithm.
#Concurrency
#Algorithms
#System Design
Frontend Engineer
•
Coding
•
medium
Write a utility function to deeply merge two complex JavaScript objects, handling arrays and nested objects appropriately.
#JavaScript
#Recursion
#Data Structures
Frontend Engineer
•
Coding
•
hard
Implement a diff viewer component that takes two strings (e.g., an original prompt and an AI-edited prompt) and highlights the insertions and deletions.
#String Manipulation
#Dynamic Programming
#React
Frontend Engineer
•
Coding
•
medium
Implement a robust retry mechanism with exponential backoff for a fetch request that calls an unreliable LLM inference API.
#Asynchronous JavaScript
#Promises
#Error Handling
Frontend Engineer
•
Coding
•
easy
Implement a rate-limiter utility on the frontend to prevent a user from accidentally spamming the 'Generate' button and exhausting their API quota.
#JavaScript
#Throttling
#UX
Frontend Engineer
•
Coding
•
easy
Write a function that takes a deeply nested JSON object representing an AI's structured output and flattens it into a single-level object with dot-notation keys.
#JavaScript
#Recursion
#Object Manipulation
Full Stack Engineer
•
Coding
•
medium
Implement an LRU (Least Recently Used) cache with a Time-To-Live (TTL) feature to temporarily store frequent, identical prompt responses and reduce inference load.
#Data Structures
#Caching
#Hash Maps
#Linked Lists
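One possible shape for this in Python uses `collections.OrderedDict`, which plays the role of the hash map plus doubly linked list. The class name `LRUCacheWithTTL` and the injectable `clock` are assumptions for this example. Expired entries are dropped lazily on access, and the sketch is not thread-safe.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional, Tuple


class LRUCacheWithTTL:
    """LRU cache whose entries also expire after `ttl_seconds`."""

    def __init__(self, capacity: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        # key -> (value, expiry timestamp), ordered oldest to newest use
        self._entries: "OrderedDict[Hashable, Tuple[Any, float]]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop lazily on access
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (value, self._clock() + self.ttl_seconds)
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
```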
Full Stack Engineer
•
Coding
•
medium
Write a function to merge overlapping text highlights. Given an array of objects representing start and end indices of safety flags in a text, return a merged array of non-overlapping intervals.
#Intervals
#Sorting
#Arrays
Full Stack Engineer
•
Coding
•
hard
Write an algorithm to efficiently diff two versions of a large text document and highlight the insertions and deletions. This is used to show users how their prompt edits changed the context.
#Dynamic Programming
#Strings
#Diff Algorithms
Full Stack Engineer
•
Coding
•
hard
Implement a custom JSON parser that can gracefully handle and 'fix' truncated JSON strings. This is common when an LLM output stops mid-generation due to max token limits.
#Parsing
#Strings
#Error Handling
#AST
Full Stack Engineer
•
Coding
•
hard
Implement a Markdown parser function in TypeScript that can render code blocks with syntax highlighting *while* the text is still streaming in chunk by chunk.
#Parsing
#TypeScript
#Streaming
#State Machines
Full Stack Engineer
•
Coding
•
medium
Given a massive log file of API requests, write a Python script to find the top 5 users who consumed the most tokens in any sliding 1-hour window.
#Python
#Sliding Window
#Data Processing
Machine Learning Engineer
•
Coding
•
medium
Write an algorithm to find the longest common substring between two large text documents efficiently.
#Dynamic Programming
#Strings
#Suffix Trees
Machine Learning Engineer
•
Coding
•
medium
Write an algorithm to efficiently sample from a logits distribution using Top-K and Top-P (Nucleus) sampling.
#Probability
#Sampling
#Sorting
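A pure-Python sketch of combined Top-K and Top-P sampling. It assumes Top-K is applied first, the surviving logits are renormalized with a softmax, and the nucleus is then the smallest prefix whose cumulative probability reaches `p`. Other orderings exist. The function name `sample_top_k_top_p` and the optional `rng` parameter are hypothetical.

```python
import math
import random
from typing import List, Optional


def sample_top_k_top_p(logits: List[float], k: int, p: float,
                       rng: Optional[random.Random] = None) -> int:
    """Return the index of one token sampled after Top-K then Top-P filtering."""
    rng = rng or random.Random()
    # Top-K: indices of the k largest logits, in descending order.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the survivors; subtracting the max keeps exp() stable.
    top = logits[ranked[0]]
    weights = [math.exp(logits[i] - top) for i in ranked]
    total = sum(weights)
    # Top-P: keep the smallest prefix whose cumulative probability reaches p.
    nucleus_indices: List[int] = []
    nucleus_weights: List[float] = []
    cumulative = 0.0
    for index, weight in zip(ranked, weights):
        nucleus_indices.append(index)
        nucleus_weights.append(weight)
        cumulative += weight / total
        if cumulative >= p:
            break
    # random.choices renormalizes the weights internally.
    return rng.choices(nucleus_indices, weights=nucleus_weights, k=1)[0]
```

The full sort costs O(V log V) for vocabulary size V. A heap-based selection of the top k would bring that down to O(V log k).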
Machine Learning Engineer
•
Coding
•
medium
Given a stream of generated tokens, write a highly optimized Trie-based data structure to filter out a dynamic list of toxic phrases in real-time.
#Data Structures
#Trie
#Streaming
Machine Learning Engineer
•
Coding
•
hard
Given a sequence of characters and a vocabulary of merges, implement the Byte-Pair Encoding (BPE) tokenization merging algorithm.
#Tokenization
#NLP
#Greedy Algorithms
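A Python sketch of the merge step, assuming the vocabulary of merges is given as an ordered list where earlier pairs have higher priority. The function name `apply_bpe_merges` is hypothetical. On each pass it finds the highest-priority adjacent pair present in the sequence and merges every occurrence of it.

```python
from typing import Dict, List, Tuple


def apply_bpe_merges(symbols: List[str], merges: List[Tuple[str, str]]) -> List[str]:
    """Greedily apply ranked BPE merges to a sequence of symbols."""
    ranks: Dict[Tuple[str, str], int] = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(symbols)
    while len(symbols) > 1:
        pairs = set(zip(symbols, symbols[1:]))
        # Pick the adjacent pair with the lowest rank (highest priority).
        best = min(pairs, key=lambda pair: ranks.get(pair, float("inf")))
        if best not in ranks:
            break  # no known merge applies any more
        merged: List[str] = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols
```

With the merges `("l", "o")`, `("lo", "w")`, `("e", "r")`, the characters of "lower" become `["low", "er"]`.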
Machine Learning Engineer
•
Coding
•
medium
Implement a basic tokenizer using Byte-Pair Encoding (BPE) given a corpus of text and a target vocabulary size.
#NLP
#Tokenization
#String Processing
Machine Learning Engineer
•
Coding
•
easy
Given a string representing a mathematical expression, write a tokenizer that converts it into a list of valid tokens (numbers, operators, parentheses). Handle multi-digit numbers and ignore whitespace.
#Tokenization
#Parsing
#Strings
#State Machines
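A small hand-written scanner in Python, assuming non-negative integer literals, the four operators `+ - * /`, and parentheses. The function name `tokenize_expression` is hypothetical. Anything outside that alphabet raises a `ValueError`.

```python
from typing import List

DIGITS = set("0123456789")
SYMBOLS = set("+-*/()")


def tokenize_expression(expression: str) -> List[str]:
    """Split an arithmetic expression into number, operator, and paren tokens."""
    tokens: List[str] = []
    i = 0
    while i < len(expression):
        char = expression[i]
        if char.isspace():
            i += 1  # whitespace separates tokens but is not emitted
        elif char in DIGITS:
            start = i
            # Consume the whole run of digits as one number token.
            while i < len(expression) and expression[i] in DIGITS:
                i += 1
            tokens.append(expression[start:i])
        elif char in SYMBOLS:
            tokens.append(char)
            i += 1
        else:
            raise ValueError(f"unexpected character {char!r} at position {i}")
    return tokens
```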
Machine Learning Engineer
•
Coding
•
medium
Write a Python function to efficiently perform top-k and nucleus (top-p) sampling given a 1D tensor of logits.
#Sampling
#Inference
#Probability
#PyTorch
Machine Learning Engineer
•
Coding
•
medium
Implement a Trie data structure to efficiently filter out a large list of toxic words from a continuous stream of generated tokens.
#Data Structures
#Trie
#String Manipulation
Software Engineer
•
Coding
•
medium
Implement a token bucket rate limiter for an API endpoint. Extend it to handle distributed rate limiting across multiple servers.
#Concurrency
#API Design
#Distributed Systems
Software Engineer
•
Coding
•
hard
Implement a text diffing algorithm. Given two strings (an original prompt and an edited prompt), return a list of operations (Insert, Delete, Keep) to transform the original into the edited version.
#Dynamic Programming
#Strings
Software Engineer
•
Coding
•
easy
Given a list of conversation logs with start and end timestamps, write a function to merge overlapping intervals to find the total continuous time a user spent interacting with the model.
#Sorting
#Arrays
#Intervals
Software Engineer
•
Coding
•
medium
Given a massive log file of API requests, write a script to find the top K users who experienced the highest error rates in a specific 5-minute sliding window.
#Sliding Window
#Heaps
#Log Parsing
Software Engineer
•
Coding
•
medium
Write a rate limiter for an API. The rate limiter should support different limits based on the user's tier (e.g., free vs. paid) and should be based on the number of tokens generated, not just the number of requests.
#Concurrency
#Token Bucket
#Object-Oriented Design
Software Engineer
•
Coding
•
hard
Design a streaming JSON parser. In our LLM inference API, Claude streams responses token by token. Sometimes the output is a JSON object, but the client receives it in incomplete chunks. Write a function that takes a stream of characters and yields the deepest valid JSON structure possible at any given moment.
#Parsing
#State Machines
#Trees
#Streaming
Software Engineer
•
Coding
•
hard
Implement a basic Byte Pair Encoding (BPE) tokenizer. Given a string of text and a target vocabulary size, write a function to iteratively merge the most frequent adjacent pairs of characters or subwords.
#Strings
#Hash Maps
#Priority Queue
#LLM Fundamentals
Software Engineer
•
Coding
•
hard
Write a concurrent web scraper that fetches a list of URLs. It must respect robots.txt, enforce a maximum of N concurrent requests per domain, and handle retries with exponential backoff.
#Concurrency
#Web Scraping
#Error Handling
Software Engineer
•
Coding
•
medium
Implement a text chunking algorithm that takes a large document and splits it into chunks of maximum N tokens, ensuring that chunks only break on sentence boundaries.
#NLP
#String Manipulation
#Edge Cases
Software Engineer
•
Coding
•
medium
Write a function to parse a raw stream of Server-Sent Events (SSE) and yield complete JSON objects. The network can chunk the data at arbitrary byte boundaries.
#String Manipulation
#Networking
#Streaming
Software Engineer
•
Coding
•
hard
Write a simplified Byte Pair Encoding (BPE) tokenizer. Given a corpus of text and a target vocabulary size, implement the training loop to find the most frequent adjacent character pairs and merge them.
#String Manipulation
#Hash Maps
#Heaps
Software Engineer
•
Coding
•
medium
Implement a parser for Server-Sent Events (SSE) that consumes a raw byte stream from an LLM and yields complete JSON objects, handling network interruptions and fragmented chunks.
#I/O Streaming
#State Machines
#String Parsing
Software Engineer
•
Coding
•
hard
Write an asynchronous task batcher. It should accept individual requests, wait for either a maximum batch size or a maximum time window, and then process the batch together.
#Asynchronous Programming
#Concurrency
#System Timers
Software Engineer
•
Coding
•
medium
Implement a Trie-based caching mechanism to store and retrieve LLM prompt prefixes, returning the longest matching cached prefix for a new prompt.
#Trees
#Caching
#String Matching
Software Engineer
•
Coding
•
medium
Write a function to compute the cosine similarity between two dense vectors. Then, optimize it to find the top K most similar vectors from a massive list of vectors (e.g., 1 million) as quickly as possible.
#Math
#Arrays
#Heaps
#Optimization
Software Engineer
•
Coding
•
hard
Implement a basic Key-Value (KV) cache data structure used in transformer attention mechanisms. It needs to support appending new tokens, evicting the oldest tokens when a max length is reached, and fast retrieval.
#Data Structures
#Linked Lists
#Hash Maps
Software Engineer
•
Coding
•
medium
Given a set of Constitutional AI rules represented as a directed acyclic graph (where edges represent dependencies between rules), write a function to determine a valid execution order.
#Graphs
#Topological Sort
#DFS/BFS
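A sketch using Kahn's algorithm (the BFS flavor of topological sort) in Python. It assumes the graph arrives as a dictionary mapping each rule to the rules that must run before it. The function name `execution_order` is hypothetical. A cycle is reported as an error, since no valid order exists in that case.

```python
from collections import deque
from typing import Dict, List


def execution_order(dependencies: Dict[str, List[str]]) -> List[str]:
    """Return one valid order in which every rule follows its prerequisites."""
    nodes = set(dependencies)
    for prerequisites in dependencies.values():
        nodes.update(prerequisites)

    in_degree = {node: 0 for node in nodes}
    dependents: Dict[str, List[str]] = {node: [] for node in nodes}
    for rule, prerequisites in dependencies.items():
        for prerequisite in prerequisites:
            dependents[prerequisite].append(rule)
            in_degree[rule] += 1

    # Start with rules that have no prerequisites (sorted for determinism).
    ready = deque(sorted(node for node in nodes if in_degree[node] == 0))
    order: List[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for dependent in dependents[node]:
            in_degree[dependent] -= 1
            if in_degree[dependent] == 0:
                ready.append(dependent)

    if len(order) != len(nodes):
        raise ValueError("dependency cycle detected")
    return order
```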
Software Engineer
•
Coding
•
medium
Given a string of text and a list of overlapping highlight annotations (start_index, end_index, label), write a function to merge overlapping intervals and return a flattened list of text segments.
#Intervals
#Sorting
#Arrays
Software Engineer
•
Coding
•
easy
Write a function to manage a sliding context window for an LLM. Given a list of messages and a maximum token limit, return the optimal subset of messages that fits, ensuring the system prompt is always included.
#Arrays
#Greedy Algorithms
#Logic
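The question leaves "optimal" open. The Python sketch below assumes it means keeping the system prompt plus the longest run of most recent messages that fits the budget. The function name `fit_context` and the message shape (a dict with hypothetical `role` and `tokens` keys) are assumptions for this example.

```python
from typing import Dict, List


def fit_context(messages: List[Dict], max_tokens: int) -> List[Dict]:
    """Keep system messages plus the newest contiguous history that fits."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(m["tokens"] for m in system)
    if budget < 0:
        raise ValueError("system prompt alone exceeds the token limit")

    kept: List[Dict] = []
    for message in reversed(others):  # walk from newest to oldest
        if message["tokens"] > budget:
            break  # stop at the first miss so the kept history stays contiguous
        kept.append(message)
        budget -= message["tokens"]
    return system + kept[::-1]  # restore chronological order
```

Stopping at the first message that does not fit, instead of skipping it, avoids handing the model a conversation with holes in the middle.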
Software Engineer
•
Coding
•
medium
Implement a thread-safe asynchronous queue from scratch using basic concurrency primitives (mutexes, condition variables).
#Concurrency
#Data Structures
#Synchronization
Software Engineer
•
Coding
•
hard
Write a custom JSON parser that can recover from common malformed outputs generated by LLMs (e.g., missing closing brackets, trailing commas, unescaped quotes).
#Parsing
#String Manipulation
#Heuristics
Software Engineer
•
Coding
•
hard
Given an array of integers representing the execution times of tasks and an integer K representing the number of available workers, write a function to assign tasks to workers to minimize the maximum time spent by any worker.
#Binary Search
#Greedy Algorithms
#Optimization
Software Engineer
•
Coding
•
medium
Implement a token bucket rate limiter to throttle incoming API requests based on a user's tier. It should handle concurrent requests safely.
#Concurrency
#Data Structures
#API Design
Software Engineer
•
Coding
•
medium
Write a program to parse a massive log file (e.g., 50GB) to find the top 10 most frequent IP addresses. You have limited RAM (e.g., 1GB).
#File I/O
#Hashing
#Heaps
#Memory Management
Software Engineer
•
Coding
•
easy
Implement a sliding window algorithm to manage an LLM's context window. Given an array of text chunks with token counts and a maximum token limit, find the contiguous subarray of chunks that maximizes the token count without exceeding the limit.
#Sliding Window
#Arrays
#Two Pointers
Software Engineer
•
Coding
•
medium
Given a Directed Acyclic Graph (DAG) representing a chain of LLM prompts where some prompts depend on the outputs of others, write an execution engine that runs the prompts in the correct order, maximizing concurrency.
#Graphs
#Topological Sort
#Concurrency
#Asyncio
Software Engineer
•
Coding
•
easy
Write a retry decorator in Python that implements exponential backoff with jitter. It should take parameters for maximum retries, base delay, and exceptions to catch.
#Python
#Decorators
#Networking
#Math
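A possible Python sketch using the "full jitter" variant, where each wait is drawn uniformly from zero up to `base_delay * 2**attempt`. The decorator name `retry` and the injectable `sleep` parameter (added here so the behavior can be tested without real waiting) are assumptions for this example.

```python
import functools
import random
import time
from typing import Callable, Tuple, Type


def retry(max_retries: int = 3, base_delay: float = 0.5,
          exceptions: Tuple[Type[BaseException], ...] = (Exception,),
          sleep: Callable[[float], None] = time.sleep):
    """Retry the wrapped function on `exceptions` with exponential backoff."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_retries:
                        raise  # out of retries: surface the last error
                    # Full jitter: random wait in [0, base_delay * 2^attempt].
                    sleep(random.uniform(0, base_delay * (2 ** attempt)))

        return wrapper

    return decorator
```

Jitter spreads out retries from many clients so they do not all hit a recovering service at the same instant.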
Software Engineer
•
Coding
•
medium
Write a function that takes a long string of text and a maximum line length, and returns the text word-wrapped. Words longer than the line length should be broken with a hyphen.
#Strings
#Formatting
#Edge Cases
Difficulty Radar
Based on recent AI-sourced data.
Meet Your Interviewers
The "Standard" Interviewer
Senior Engineer
Focuses on core competencies, system constraints, and clear communication.
Unwritten Rules
Think Out Loud
Always explain your thought process before writing code or drawing architecture.