Anthropic

AI safety and research company behind Claude, focusing on constitutional AI.

5 Rounds | ~20 Days | Very Hard

The Interview Loop

Recruiter Screen (30 min)

Standard fit check, behavioral questions, and resume overview.

Technical Loop (3-4 Rounds)

Deep dive into domain knowledge, coding, and system design.

Interview Question Bank

Backend Engineer Coding hard

Write a function to merge K sorted asynchronous streams of data into a single sorted stream. You cannot load all data into memory at once.

#Heaps #Asynchronous Programming #Streaming
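
One possible approach to the question above is a k-way merge over a min-heap that buffers a single pending item per stream. The sketch below assumes each stream is a Python async iterator that yields mutually comparable items in ascending order; the names `merge_sorted_streams`, `from_list`, and `collect` are illustrative, not part of the original question.

```python
import asyncio
import heapq


async def merge_sorted_streams(streams):
    """Yield items from several sorted async iterators in ascending order.

    Only one pending item per stream is buffered, so memory use is O(K).
    """
    iterators = [stream.__aiter__() for stream in streams]
    heap = []
    # Prime the heap with the first item of every non-empty stream.
    for index, iterator in enumerate(iterators):
        try:
            first = await iterator.__anext__()
        except StopAsyncIteration:
            continue
        heap.append((first, index))
    heapq.heapify(heap)

    while heap:
        value, index = heapq.heappop(heap)
        yield value
        # Refill from the stream that just produced the smallest item.
        try:
            following = await iterators[index].__anext__()
        except StopAsyncIteration:
            continue
        heapq.heappush(heap, (following, index))


async def from_list(items):
    """Demo helper (assumption): expose a list as an async stream."""
    for item in items:
        await asyncio.sleep(0)  # simulate an asynchronous source
        yield item


async def collect(streams):
    """Demo helper (assumption): drain the merged stream into a list."""
    return [item async for item in merge_sorted_streams(streams)]
```

Pairing each value with its stream index breaks ties between equal values, so items themselves only need to be orderable against each other.
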
Backend Engineer Coding medium

Given a massive log file of API requests, write a script to find the 99th percentile latency. The file is too large to fit into memory.

#Data Processing #Approximation Algorithms #File I/O
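
A hedged sketch of one way to approach this: stream the latencies once into a fixed-resolution histogram, then walk the buckets to the nearest-rank percentile. It assumes the latencies have already been parsed out of the log (the log format is not given in the question) and that rounding down to `resolution_ms` is acceptable; the function name and parameters are illustrative.

```python
import math
from collections import Counter


def percentile_latency(latencies_ms, pct=99.0, resolution_ms=1.0):
    """Approximate a percentile in one pass using a histogram.

    Memory grows with the number of distinct buckets, not with the
    number of requests, so the input can be a lazy iterator over a file.
    """
    histogram = Counter()
    total = 0
    for value in latencies_ms:
        histogram[int(value // resolution_ms)] += 1
        total += 1
    if total == 0:
        raise ValueError("no latencies provided")

    # Nearest-rank definition: the smallest value with at least
    # pct percent of the observations at or below it.
    rank = max(1, math.ceil(pct * total / 100))
    seen = 0
    for bucket in sorted(histogram):
        seen += histogram[bucket]
        if seen >= rank:
            return bucket * resolution_ms
```

The result is accurate to within one bucket width. If latencies are unbounded or need finer relative precision, a quantile sketch would be the usual alternative to discuss.
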
Backend Engineer Coding hard

Given a stream of tokens (strings), implement a data structure to efficiently find the top K most frequent tokens in a sliding window of the last N minutes.

#Streaming Data #Heaps #Sliding Window
Backend Engineer Coding hard

Write a program to fully justify text. Given an array of words and a maximum width, format the text so that each line has exactly that many characters and is justified on both the left and the right.

#String Manipulation #Array #Simulation
Backend Engineer Coding hard

Given a string representing a user prompt, find the longest repeating substring. This is useful for detecting repetitive loops in context windows.

#String Manipulation #Dynamic Programming #Suffix Trees
Backend Engineer Coding hard

Implement a streaming JSON parser that can take chunks of a JSON string (as they are generated by an LLM) and yield valid parsed objects as soon as they are complete.

#Parsing #State Machines #String Manipulation
Backend Engineer Coding medium

Implement a thread-safe Rate Limiter using the Token Bucket algorithm. It should support multiple users and handle concurrent requests efficiently.

#Concurrency #Data Structures #API Design
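
As a minimal sketch of the token bucket idea, the class below keeps one bucket per user and refills it lazily on each call. The single lock, the `clock` parameter (injected to make testing deterministic), and all names are assumptions for illustration.

```python
import threading
import time


class TokenBucketLimiter:
    """Per-user token bucket guarded by a single lock."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity        # maximum tokens a bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock
        self._lock = threading.Lock()
        self._buckets = {}              # user_id -> (tokens, last_timestamp)

    def allow(self, user_id, cost=1.0):
        """Return True and consume `cost` tokens if the user has enough."""
        with self._lock:
            now = self._clock()
            # New users start with a full bucket.
            tokens, last = self._buckets.get(user_id, (self.capacity, now))
            # Lazy refill: credit the tokens accrued since the last call.
            tokens = min(self.capacity, tokens + (now - last) * self.refill_rate)
            allowed = tokens >= cost
            if allowed:
                tokens -= cost
            self._buckets[user_id] = (tokens, now)
            return allowed
```

One global lock keeps the sketch short. Under heavy contention, per-user locks or sharded bucket maps are a common refinement worth mentioning to the interviewer.
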
Backend Engineer Coding hard

Write an asynchronous task scheduler in Python (using asyncio) or Rust (using tokio) that executes a DAG (Directed Acyclic Graph) of tasks with maximum concurrency.

#Graph Theory #Asynchronous Programming #Concurrency
Backend Engineer Coding medium

Implement a deep copy function for a complex graph data structure that may contain cycles. Ensure that nodes are duplicated correctly without infinite loops.

#Graph Theory #Recursion #Hash Map
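
A sketch of the usual hash-map approach: map every original node to its copy the first time it is seen, so a cycle simply resolves to an already-created copy. The `Node` class here is a hypothetical stand-in for whatever graph type the interviewer provides.

```python
class Node:
    """Hypothetical graph node: a value plus a list of neighbor nodes."""

    def __init__(self, value):
        self.value = value
        self.neighbors = []


def clone_graph(start):
    """Deep-copy the graph reachable from `start`, including cycles."""
    if start is None:
        return None
    # original node -> its copy; doubles as the visited set.
    copies = {start: Node(start.value)}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in node.neighbors:
            if neighbor not in copies:
                copies[neighbor] = Node(neighbor.value)
                stack.append(neighbor)
            # Wire the copy to the copied neighbor, never to the original.
            copies[node].neighbors.append(copies[neighbor])
    return copies[start]
```

The traversal is iterative rather than recursive so that very deep graphs do not hit Python's recursion limit.
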
Cloud Engineer Coding medium

Write a Go program that concurrently health-checks a list of internal model endpoints. It should implement a worker pool, timeout after 2 seconds per request, and aggregate the results into a summary report.

#Go #Concurrency #Networking #Error Handling
Cloud Engineer Coding hard

Given a JSON response from a cloud API containing nested resource dependencies, write an algorithm to determine the correct deletion order.

#Graphs #Topological Sort #DFS #JSON Parsing
Data Engineer Coding medium

Write a Python generator function to efficiently parse a 500GB JSONL file containing web crawl data, filtering out documents that do not contain a specific set of keywords, without loading the entire file into memory.

#Python #Generators #Memory Management #File I/O
Data Engineer Coding hard

Write a Python function to efficiently find near-duplicate text documents in a large corpus. You do not need to implement the full distributed system, but implement the core hashing logic (e.g., MinHash) and explain how you would scale it across a cluster.

#Hashing #Text Processing #Optimization
Data Engineer Coding medium

Write a Python program that takes a massive JSONL file of Wikipedia articles and chunks the text into overlapping segments of exactly 512 tokens (assume a simple whitespace tokenizer for this exercise), while preserving the document metadata in each chunk. The file is larger than available RAM.

#Generators #Memory Management #Text Processing
Data Engineer Coding medium

Implement a rate limiter in Python for our API. The rate limiter should allow a user to make up to N requests per minute, but also enforce a maximum of M tokens generated per day. How would you make this distributed across multiple API servers?

#Data Structures #Concurrency #API Design
Data Engineer Coding medium

Implement a Trie (Prefix Tree) data structure in Python. Then, write a method to find all words in the Trie that share a given prefix. Explain how this relates to LLM tokenization.

#Data Structures #Trees #String Manipulation
Data Engineer Coding hard

You have a stream of incoming chat logs. Write a Python algorithm to maintain the top K most frequent words over a sliding window of 1 hour.

#Streaming Algorithms #Heaps #Sliding Window
Data Engineer Coding medium

Write a Python script that implements a custom MapReduce framework using the `multiprocessing` library to count the frequency of n-grams in a large corpus of text files.

#Concurrency #MapReduce #Python
Data Engineer Coding hard

Given a directed acyclic graph (DAG) representing data pipeline dependencies, write a Python function to execute the tasks in parallel where possible, respecting the dependency order. Assume each task is a sleep function.

#Graphs #Topological Sort #Concurrency
Data Engineer Coding hard

Given a massive string of text, write an algorithm to find the longest repeating substring. This is a simplified version of finding duplicated boilerplate text in web scrapes.

#String Algorithms #Suffix Arrays #Dynamic Programming
Data Engineer Coding medium

We need to create a pre-training dataset with a specific language distribution (e.g., 60% English, 20% Spanish, 20% French). Write a script to sample proportionally from a massive, unsorted stream of multilingual documents.

#Sampling #Probability #Streaming Algorithms
Data Engineer Coding medium

Write a function that takes a stream of text and a target keyword, and returns a sliding window of N tokens before and after every occurrence of the keyword. Handle edge cases like overlapping windows.

#Sliding Window #Text Processing #Queues
Data Engineer Coding hard

Given a massive dataset of text documents, implement a MinHash and Locality-Sensitive Hashing (LSH) algorithm in Python to identify near-duplicate documents. How would you scale this across a distributed cluster?

#Hashing #Deduplication #Big Data #Distributed Systems
Data Engineer Coding hard

Given two large documents, write an algorithm to find the longest common contiguous substring. This is used in our pipeline to detect data contamination between training and evaluation sets.

#Dynamic Programming #Suffix Trees #Strings
Data Engineer Coding medium

Write a program to compute the top K most frequent tokens in a continuous, infinite stream of text. Optimize for both time and space complexity.

#Heaps #Hash Maps #Streaming
Data Engineer Coding hard

Implement a thread-safe Token Bucket rate limiter in Python. This will be used to throttle incoming requests to our data ingestion API to prevent overwhelming the downstream Kafka cluster.

#Concurrency #Rate Limiting #System Design
Data Engineer Coding easy

Given a list of text spans representing PII (Personally Identifiable Information) redactions in a document, where each span is a tuple of (start_index, end_index), write a function to merge all overlapping spans.

#Intervals #Arrays #Sorting
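
The standard sort-then-sweep approach can be sketched as follows. It assumes spans are `(start_index, end_index)` tuples and treats spans that merely touch (one ends where the next begins) as overlapping, which is a choice to confirm with the interviewer.

```python
def merge_spans(spans):
    """Merge overlapping (start, end) spans into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous span: extend it.
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            merged.append((start, end))
    return merged
```

Sorting dominates the cost, so the function runs in O(n log n) time.
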
Data Engineer Coding medium

Write a Python function to process a 500GB JSONL file of raw text data. You need to filter out documents containing specific blocklisted keywords, compute a basic word count across the valid documents, and output the clean data to a new file. You have 8GB of RAM.

#Python #Generators #Memory Management #I/O
Data Engineer Coding hard

Implement a distributed rate limiter in Python. Assume this will be used to throttle API requests for our Claude models based on a user's tier (e.g., tokens per minute).

#Concurrency #Redis #Token Bucket #Distributed Systems
Data Engineer Coding medium

Given a list of overlapping time intervals representing periods when a GPU cluster was fully utilized, write a function to merge all overlapping intervals and return the total duration of full utilization.

#Sorting #Intervals #Python
Data Scientist Coding hard

Implement an algorithm to find the longest common substring between two large text prompts. We use this to identify potential prompt injection templates spreading among users.

#Dynamic Programming #String Manipulation #Security
Data Scientist Coding medium

Write a Python function to efficiently deduplicate a massive dataset of text documents (billions of tokens) prior to model pre-training. What algorithmic approach would you use?

#Python #Data Deduplication #MinHash #LSH
Data Scientist Coding medium

Implement a function in Python to calculate the Elo rating update for two LLMs given a human preference rating (win, loss, or tie).

#Python #Math #Algorithms
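
For reference, the conventional Elo update can be sketched in a few lines. The K-factor of 32 and the 400-point scale are common defaults rather than values stated in the question, and the `outcome` encoding is an assumption.

```python
def elo_update(rating_a, rating_b, outcome, k=32.0):
    """Return the updated (rating_a, rating_b) after one comparison.

    outcome: 1.0 if model A is preferred, 0.0 if model B is preferred,
    0.5 for a tie.
    """
    # Expected score of A under the logistic Elo model.
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    delta = k * (outcome - expected_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta
```

With equal ratings the expected score is 0.5, so a win moves each rating by half the K-factor and a tie leaves both unchanged.
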
Data Scientist Coding medium

Write a Python function using NumPy to efficiently compute the cosine similarity between a single target embedding vector and a matrix of 1 million document embeddings.

#Python #NumPy #Linear Algebra
Data Scientist Coding medium

Implement a stratified sampling algorithm to select 10,000 prompt-response pairs for human evaluation, ensuring the sample exactly matches the real-world distribution of 15 different safety categories.

#Python #Sampling #Statistics
DevOps Engineer Coding easy

Write a function to implement a basic Round Robin load balancer. It should take a list of servers and return the next server to route a request to.

#Load Balancing #Data Structures
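
A minimal sketch, assuming the server list is fixed at construction time; the class and method names are illustrative. The lock is included because a load balancer is typically called from several threads at once.

```python
import threading


class RoundRobinBalancer:
    """Hand out servers in a repeating cycle."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._index = 0
        self._lock = threading.Lock()

    def next_server(self):
        """Return the next server and advance the cursor."""
        with self._lock:
            server = self._servers[self._index]
            self._index = (self._index + 1) % len(self._servers)
            return server
```

Natural follow-ups include adding or removing servers at runtime, weighting, and skipping unhealthy servers.
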
DevOps Engineer Coding hard

Given a list of overlapping IP CIDR blocks, write a function to merge them into the minimum number of non-overlapping CIDR blocks.

#Networking #Algorithms #Intervals
DevOps Engineer Coding medium

Implement a basic rate limiter class in Python or Go using the Token Bucket algorithm.

#Concurrency #Algorithms #System Design
Frontend Engineer Coding medium

Write a utility function to deeply merge two complex JavaScript objects, handling arrays and nested objects appropriately.

#JavaScript #Recursion #Data Structures
Frontend Engineer Coding hard

Implement a diff viewer component that takes two strings (e.g., an original prompt and an AI-edited prompt) and highlights the insertions and deletions.

#String Manipulation #Dynamic Programming #React
Frontend Engineer Coding medium

Implement a robust retry mechanism with exponential backoff for a fetch request that calls an unreliable LLM inference API.

#Asynchronous JavaScript #Promises #Error Handling
Frontend Engineer Coding easy

Implement a rate-limiter utility on the frontend to prevent a user from accidentally spamming the 'Generate' button and exhausting their API quota.

#JavaScript #Throttling #UX
Frontend Engineer Coding easy

Write a function that takes a deeply nested JSON object representing an AI's structured output and flattens it into a single-level object with dot-notation keys.

#JavaScript #Recursion #Object Manipulation
Full Stack Engineer Coding medium

Implement an LRU (Least Recently Used) cache with a Time-To-Live (TTL) feature to temporarily store frequent, identical prompt responses and reduce inference load.

#Data Structures #Caching #Hash Maps #Linked Lists
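
One compact way to sketch this in Python is to lean on `collections.OrderedDict` for recency order and store an expiry timestamp next to each value. The class name, the lazy expiry on read, and the injectable `clock` are assumptions made for illustration; an interviewer may instead expect a hand-written hash map plus doubly linked list.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """LRU cache whose entries also expire after `ttl_seconds`."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.ttl = ttl_seconds
        self._clock = clock
        self._data = OrderedDict()  # key -> (value, expires_at), oldest first

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on access.
            del self._data[key]
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        if key in self._data:
            del self._data[key]
        self._data[key] = (value, self._clock() + self.ttl)
        while len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used
```

Because expiry is lazy, stale entries still occupy capacity until they are read or evicted; a periodic sweep would be one way to address that.
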
Full Stack Engineer Coding medium

Write a function to merge overlapping text highlights. Given an array of objects representing start and end indices of safety flags in a text, return a merged array of non-overlapping intervals.

#Intervals #Sorting #Arrays
Full Stack Engineer Coding hard

Write an algorithm to efficiently diff two versions of a large text document and highlight the insertions and deletions. This is used to show users how their prompt edits changed the context.

#Dynamic Programming #Strings #Diff Algorithms
Full Stack Engineer Coding hard

Implement a custom JSON parser that can gracefully handle and 'fix' truncated JSON strings. This is common when an LLM output stops mid-generation due to max token limits.

#Parsing #Strings #Error Handling #AST
Full Stack Engineer Coding hard

Implement a Markdown parser function in TypeScript that can render code blocks with syntax highlighting *while* the text is still streaming in chunk by chunk.

#Parsing #TypeScript #Streaming #State Machines
Full Stack Engineer Coding medium

Given a massive log file of API requests, write a Python script to find the top 5 users who consumed the most tokens in any sliding 1-hour window.

#Python #Sliding Window #Data Processing
Machine Learning Engineer Coding medium

Write an algorithm to find the longest common substring between two large text documents efficiently.

#Dynamic Programming #Strings #Suffix Trees
Machine Learning Engineer Coding medium

Write an algorithm to efficiently sample from a logits distribution using Top-K and Top-P (Nucleus) sampling.

#Probability #Sampling #Sorting
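
A dependency-free sketch of the filtering logic is shown below. It applies top-k first and then computes the nucleus over the original softmax probabilities; libraries differ on that ordering and on whether to renormalize between the two steps, so treat the convention here as an assumption. Function names are illustrative.

```python
import math
import random


def top_k_top_p_filter(logits, k, p):
    """Return {token_id: probability} after top-k then top-p filtering."""
    # Numerically stable softmax.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    norm = sum(exps)
    probs = [e / norm for e in exps]

    # Top-k: keep the k most probable token ids (k >= 1 assumed).
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]

    # Top-p: keep the shortest prefix whose cumulative mass reaches p.
    kept = []
    cumulative = 0.0
    for token_id in order:
        kept.append(token_id)
        cumulative += probs[token_id]
        if cumulative >= p:
            break

    # Renormalize over the surviving tokens.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, k, p, rng=random):
    """Draw one token id from the filtered distribution."""
    dist = top_k_top_p_filter(logits, k, p)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]
```

The full sort costs O(V log V) in the vocabulary size; selecting only the top k with a heap or a partial sort is the usual optimization to discuss.
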
Machine Learning Engineer Coding medium

Given a stream of generated tokens, write a highly optimized Trie-based data structure to filter out a dynamic list of toxic phrases in real-time.

#Data Structures #Trie #Streaming
Machine Learning Engineer Coding hard

Given a sequence of characters and a vocabulary of merges, implement the Byte-Pair Encoding (BPE) tokenization merging algorithm.

#Tokenization #NLP #Greedy Algorithms
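
A simple sketch of the merging step, assuming `merges` is an ordered list of `(left, right)` pairs with the highest-priority merge first. It makes one left-to-right pass per merge rule, which is easy to reason about but not the fastest formulation.

```python
def apply_bpe_merges(symbols, merges):
    """Apply BPE merge rules, in priority order, to a symbol sequence."""
    tokens = list(symbols)
    for left, right in merges:
        merged = []
        i = 0
        while i < len(tokens):
            # Merge every non-overlapping occurrence of the pair.
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens
```

This runs in O(len(merges) * len(symbols)). A faster variant looks up the rank of each adjacent pair and repeatedly merges the best-ranked pair present.
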
Machine Learning Engineer Coding medium

Implement a basic tokenizer using Byte-Pair Encoding (BPE) given a corpus of text and a target vocabulary size.

#NLP #Tokenization #String Processing
Machine Learning Engineer Coding easy

Given a string representing a mathematical expression, write a tokenizer that converts it into a list of valid tokens (numbers, operators, parentheses). Handle multi-digit numbers and ignore whitespace.

#Tokenization #Parsing #Strings #State Machines
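
A minimal hand-written scanner for this question might look like the sketch below. It assumes non-negative integer literals and the operator set `+ - * /`; decimals and unary minus are left as follow-ups.

```python
DIGITS = "0123456789"
OPERATORS = "+-*/()"


def tokenize(expression):
    """Split an arithmetic expression into number, operator, and paren tokens."""
    tokens = []
    i = 0
    while i < len(expression):
        ch = expression[i]
        if ch.isspace():
            i += 1
        elif ch in DIGITS:
            # Consume the whole run of digits as one number token.
            j = i
            while j < len(expression) and expression[j] in DIGITS:
                j += 1
            tokens.append(expression[i:j])
            i = j
        elif ch in OPERATORS:
            tokens.append(ch)
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tokens
```
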
Machine Learning Engineer Coding medium

Write a Python function to efficiently perform top-k and nucleus (top-p) sampling given a 1D tensor of logits.

#Sampling #Inference #Probability #PyTorch
Machine Learning Engineer Coding medium

Implement a Trie data structure to efficiently filter out a large list of toxic words from a continuous stream of generated tokens.

#Data Structures #Trie #String Manipulation
Software Engineer Coding medium

Implement a token bucket rate limiter for an API endpoint. Extend it to handle distributed rate limiting across multiple servers.

#Concurrency #API Design #Distributed Systems
Software Engineer Coding hard

Implement a text diffing algorithm. Given two strings (an original prompt and an edited prompt), return a list of operations (Insert, Delete, Keep) to transform the original into the edited version.

#Dynamic Programming #Strings
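
One textbook way to produce such an edit script is to build a longest-common-subsequence table and then walk it from the start of both strings. The sketch below works character by character and uses O(n*m) memory, so it illustrates the technique rather than an approach for very large inputs; the tuple format of the operations is an assumption.

```python
def diff_ops(original, edited):
    """Return a list of ("Keep" | "Delete" | "Insert", char) operations."""
    n, m = len(original), len(edited)
    # lcs[i][j] = length of the LCS of original[i:] and edited[j:].
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if original[i] == edited[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    ops = []
    i = j = 0
    while i < n and j < m:
        if original[i] == edited[j]:
            ops.append(("Keep", original[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            ops.append(("Delete", original[i]))
            i += 1
        else:
            ops.append(("Insert", edited[j]))
            j += 1
    # Whatever remains on either side is pure deletion or insertion.
    ops.extend(("Delete", ch) for ch in original[i:])
    ops.extend(("Insert", ch) for ch in edited[j:])
    return ops
```

For long documents, diffing by line or word, or using Myers' O(ND) algorithm, is the usual next step.
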
Software Engineer Coding easy

Given a list of conversation logs with start and end timestamps, write a function to merge overlapping intervals to find the total continuous time a user spent interacting with the model.

#Sorting #Arrays #Intervals
Software Engineer Coding medium

Given a massive log file of API requests, write a script to find the top K users who experienced the highest error rates in a specific 5-minute sliding window.

#Sliding Window #Heaps #Log Parsing
Software Engineer Coding medium

Write a rate limiter for an API. The rate limiter should support different limits based on the user's tier (e.g., free vs. paid) and should be based on the number of tokens generated, not just the number of requests.

#Concurrency #Token Bucket #Object-Oriented Design
Software Engineer Coding hard

Design a streaming JSON parser. In our LLM inference API, Claude streams responses token by token. Sometimes the output is a JSON object, but the client receives it in incomplete chunks. Write a function that takes a stream of characters and yields the deepest valid JSON structure possible at any given moment.

#Parsing #State Machines #Trees #Streaming
Software Engineer Coding hard

Implement a basic Byte Pair Encoding (BPE) tokenizer. Given a string of text and a target vocabulary size, write a function to iteratively merge the most frequent adjacent pairs of characters or subwords.

#Strings #Hash Maps #Priority Queue #LLM Fundamentals
Software Engineer Coding hard

Write a concurrent web scraper that fetches a list of URLs. It must respect robots.txt, enforce a maximum of N concurrent requests per domain, and handle retries with exponential backoff.

#Concurrency #Web Scraping #Error Handling
Software Engineer Coding medium

Implement a text chunking algorithm that takes a large document and splits it into chunks of maximum N tokens, ensuring that chunks only break on sentence boundaries.

#NLP #String Manipulation #Edge Cases
Software Engineer Coding medium

Write a function to parse a raw stream of Server-Sent Events (SSE) and yield complete JSON objects. The network can chunk the data at arbitrary byte boundaries.

#String Manipulation #Networking #Streaming
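
A hedged sketch of the buffering logic: accumulate raw bytes, process only complete lines, and emit an event when a blank line arrives. It handles only the `data:` field, assumes every event payload is a JSON document, and accepts `\n` or `\r\n` line endings (a bare `\r` terminator is not handled); the function name is illustrative.

```python
import json


def parse_sse(chunks):
    """Yield parsed JSON payloads from an iterable of raw SSE byte chunks."""
    buffer = b""
    data_lines = []
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is complete; keep the tail.
        *lines, buffer = buffer.split(b"\n")
        for raw in lines:
            # Decoding per complete line is safe even when a chunk
            # boundary falls inside a multi-byte UTF-8 character.
            line = raw.rstrip(b"\r").decode("utf-8")
            if line == "":
                # A blank line terminates the current event.
                if data_lines:
                    payload = "\n".join(data_lines)
                    data_lines = []
                    yield json.loads(payload)
            elif line.startswith("data:"):
                value = line[len("data:"):]
                if value.startswith(" "):
                    value = value[1:]  # a single leading space is optional
                data_lines.append(value)
            # Other fields (event:, id:, retry:) and comments are ignored.
```

Multiple `data:` lines within one event are joined with newlines before parsing, which matches how the event stream format defines multi-line data.
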
Software Engineer Coding hard

Write a simplified Byte Pair Encoding (BPE) tokenizer. Given a corpus of text and a target vocabulary size, implement the training loop to find the most frequent adjacent character pairs and merge them.

#String Manipulation #Hash Maps #Heaps
Software Engineer Coding medium

Implement a parser for Server-Sent Events (SSE) that consumes a raw byte stream from an LLM and yields complete JSON objects, handling network interruptions and fragmented chunks.

#I/O Streaming #State Machines #String Parsing
Software Engineer Coding hard

Write an asynchronous task batcher. It should accept individual requests, wait for either a maximum batch size or a maximum time window, and then process the batch together.

#Asynchronous Programming #Concurrency #System Timers
Software Engineer Coding medium

Implement a Trie-based caching mechanism to store and retrieve LLM prompt prefixes, returning the longest matching cached prefix for a new prompt.

#Trees #Caching #String Matching
Software Engineer Coding medium

Write a function to compute the cosine similarity between two dense vectors. Then, optimize it to find the top K most similar vectors from a massive list of vectors (e.g., 1 million) as quickly as possible.

#Math #Arrays #Heaps #Optimization
Software Engineer Coding hard

Implement a basic Key-Value (KV) cache data structure used in transformer attention mechanisms. It needs to support appending new tokens, evicting the oldest tokens when a max length is reached, and fast retrieval.

#Data Structures #Linked Lists #Hash Maps
Software Engineer Coding medium

Given a set of Constitutional AI rules represented as a directed acyclic graph (where edges represent dependencies between rules), write a function to determine a valid execution order.

#Graphs #Topological Sort #DFS/BFS
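
Kahn's algorithm is one standard way to answer this. The sketch below assumes the graph arrives as a dictionary mapping each rule name to the list of rules it depends on, and that rule names are sortable strings; both are illustrative choices.

```python
from collections import deque


def execution_order(dependencies):
    """Return the rules in an order where every prerequisite comes first.

    dependencies: dict mapping a rule to the rules that must run before it.
    Raises ValueError if the graph contains a cycle.
    """
    indegree = {}
    dependents = {}
    for rule, prerequisites in dependencies.items():
        indegree.setdefault(rule, 0)
        for prerequisite in prerequisites:
            indegree.setdefault(prerequisite, 0)
            indegree[rule] += 1
            dependents.setdefault(prerequisite, []).append(rule)

    # Start with every rule that has no unmet prerequisites.
    ready = deque(sorted(r for r, degree in indegree.items() if degree == 0))
    order = []
    while ready:
        rule = ready.popleft()
        order.append(rule)
        for dependent in dependents.get(rule, []):
            indegree[dependent] -= 1
            if indegree[dependent] == 0:
                ready.append(dependent)

    if len(order) != len(indegree):
        raise ValueError("cycle detected: no valid execution order exists")
    return order
```

The same ready-queue structure extends to the concurrent variants of this question: everything in the queue at a given moment can run in parallel.
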
Software Engineer Coding medium

Given a string of text and a list of overlapping highlight annotations (start_index, end_index, label), write a function to merge overlapping intervals and return a flattened list of text segments.

#Intervals #Sorting #Arrays
Software Engineer Coding easy

Write a function to manage a sliding context window for an LLM. Given a list of messages and a maximum token limit, return the optimal subset of messages that fits, ensuring the system prompt is always included.

#Arrays #Greedy Algorithms #Logic
Software Engineer Coding medium

Implement a thread-safe asynchronous queue from scratch using basic concurrency primitives (mutexes, condition variables).

#Concurrency #Data Structures #Synchronization
Software Engineer Coding hard

Write a custom JSON parser that can recover from common malformed outputs generated by LLMs (e.g., missing closing brackets, trailing commas, unescaped quotes).

#Parsing #String Manipulation #Heuristics
Software Engineer Coding hard

Given an array of integers representing the execution times of tasks and an integer K representing the number of available workers, write a function to assign tasks to workers to minimize the maximum time spent by any worker.

#Binary Search #Greedy Algorithms #Optimization
Software Engineer Coding medium

Implement a token bucket rate limiter to throttle incoming API requests based on a user's tier. It should handle concurrent requests safely.

#Concurrency #Data Structures #API Design
Software Engineer Coding medium

Write a program to parse a massive log file (e.g., 50GB) to find the top 10 most frequent IP addresses. You have limited RAM (e.g., 1GB).

#File I/O #Hashing #Heaps #Memory Management
Software Engineer Coding easy

Implement a sliding window algorithm to manage an LLM's context window. Given an array of text chunks with token counts and a maximum token limit, find the contiguous subarray of chunks that maximizes the token count without exceeding the limit.

#Sliding Window #Arrays #Two Pointers
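
A two-pointer sketch for this question, assuming token counts are non-negative integers. It returns the window as a half-open index range plus its total; that return shape is an illustrative choice.

```python
def best_window(token_counts, limit):
    """Find the contiguous run of chunks with the largest total <= limit.

    Returns (start, end, total) with `end` exclusive; (0, 0, 0) if no
    chunk fits.
    """
    best_start, best_end, best_total = 0, 0, 0
    start = 0
    total = 0
    for end, count in enumerate(token_counts):
        total += count
        # Shrink from the left until the window fits again.
        while total > limit and start <= end:
            total -= token_counts[start]
            start += 1
        if total > best_total:
            best_start, best_end, best_total = start, end + 1, total
    return best_start, best_end, best_total
```

Each index enters and leaves the window at most once, so the scan is O(n).
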
Software Engineer Coding medium

Given a Directed Acyclic Graph (DAG) representing a chain of LLM prompts where some prompts depend on the outputs of others, write an execution engine that runs the prompts in the correct order, maximizing concurrency.

#Graphs #Topological Sort #Concurrency #Asyncio
Software Engineer Coding easy

Write a retry decorator in Python that implements exponential backoff with jitter. It should take parameters for maximum retries, base delay, and exceptions to catch.

#Python #Decorators #Networking #Math
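
A sketch of such a decorator using the "full jitter" strategy, where each wait is drawn uniformly between zero and the exponential ceiling. The `sleep` parameter is an addition for testability and is not part of the question as posed.

```python
import functools
import random
import time


def retry(max_retries=3, base_delay=0.5, exceptions=(Exception,), sleep=time.sleep):
    """Retry the wrapped function with exponential backoff and full jitter."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_retries:
                        raise  # out of retries: surface the last error
                    # Ceiling doubles each attempt; jitter spreads out
                    # clients that failed at the same moment.
                    ceiling = base_delay * (2 ** attempt)
                    sleep(random.uniform(0, ceiling))

        return wrapper

    return decorator
```

With `max_retries=3` the function is called at most four times. Capping the ceiling at a maximum delay is a common extension.
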
Software Engineer Coding medium

Write a function that takes a long string of text and a maximum line length, and returns the text word-wrapped. Words longer than the line length should be broken with a hyphen.

#Strings #Formatting #Edge Cases

Difficulty Radar

Based on recent AI-sourced data.

Meet Your Interviewers

The "Standard" Interviewer

Senior Engineer

Focuses on core competencies, system constraints, and clear communication.

Unwritten Rules

Think Out Loud

Always explain your thought process before writing code or drawing architecture.
