Anthropic

AI safety and research company behind Claude, focusing on constitutional AI.

5 Rounds | ~20 Days | Very Hard

The Interview Loop

Recruiter Screen (30 min)

Standard fit check, behavioral questions, and resume overview.

Technical Loop (3-4 Rounds)

Deep dive into domain knowledge, coding, and system design.

Interview Question Bank

Data Engineer Coding medium

Given a table of API requests containing `user_id`, `timestamp`, `prompt_tokens`, and `completion_tokens`, write a SQL query to find the top 3 users by total token usage for each day over the last 30 days, including a rolling 7-day average of their token usage.

#Window Functions #Aggregations #Time-series Data
Data Engineer Coding medium

Given a table of raw chat interactions (`interaction_id`, `user_id`, `timestamp`, `message`), write a SQL query to group these interactions into 'sessions'. A new session starts if there is a gap of more than 30 minutes between messages from the same user.

#Gaps and Islands #Window Functions #Data Modeling
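
One common way to approach the sessionization question above is to flag each row that starts a new session with `LAG`, then take a running sum of those flags. The sketch below runs the query in an in-memory SQLite database through Python's `sqlite3` module (window functions need SQLite 3.25 or newer). The table name `interactions` and the `ts` column, which stores the timestamp as epoch seconds, are assumptions made for illustration.

```python
import sqlite3

# Assumption: `ts` holds the message timestamp as epoch seconds.
SESSION_SQL = """
WITH flagged AS (
    SELECT
        interaction_id,
        user_id,
        ts,
        -- 1 for a user's first message or when the gap exceeds 30 minutes (1800 s)
        CASE
            WHEN LAG(ts) OVER (PARTITION BY user_id ORDER BY ts) IS NULL
              OR ts - LAG(ts) OVER (PARTITION BY user_id ORDER BY ts) > 1800
            THEN 1 ELSE 0
        END AS is_new_session
    FROM interactions
)
SELECT
    interaction_id,
    user_id,
    -- running count of session starts yields a per-user session number
    SUM(is_new_session) OVER (PARTITION BY user_id ORDER BY ts) AS session_no
FROM flagged
ORDER BY user_id, ts
"""


def sessionize(rows):
    """rows: iterable of (interaction_id, user_id, ts, message) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE interactions "
        "(interaction_id INTEGER, user_id TEXT, ts INTEGER, message TEXT)"
    )
    conn.executemany("INSERT INTO interactions VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(SESSION_SQL).fetchall()
    conn.close()
    return result
```

A gap of exactly 30 minutes stays in the same session here, since the question asks for gaps of more than 30 minutes.
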
Data Engineer Coding medium

Given a table of user prompts, write a SQL query to find the top 3 most frequent prompt categories for each user. Include ties if they exist.

#Window Functions #Ranking #CTEs
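
A possible sketch for the top-3-with-ties question: count prompts per user and category in one CTE, then rank the counts with `DENSE_RANK` so tied categories share a rank. The table `prompts(user_id, category)` is a hypothetical schema, and reading "include ties" as `DENSE_RANK` rather than `RANK` is an assumption worth confirming with the interviewer.

```python
import sqlite3

TOP3_SQL = """
WITH counts AS (
    SELECT user_id, category, COUNT(*) AS n
    FROM prompts
    GROUP BY user_id, category
),
ranked AS (
    SELECT
        user_id,
        category,
        n,
        -- DENSE_RANK gives tied counts the same rank without skipping ranks
        DENSE_RANK() OVER (PARTITION BY user_id ORDER BY n DESC) AS rnk
    FROM counts
)
SELECT user_id, category, n
FROM ranked
WHERE rnk <= 3
ORDER BY user_id, n DESC, category
"""


def top_categories(rows):
    """rows: iterable of (user_id, category) tuples, one per prompt."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prompts (user_id TEXT, category TEXT)")
    conn.executemany("INSERT INTO prompts VALUES (?, ?)", rows)
    result = conn.execute(TOP3_SQL).fetchall()
    conn.close()
    return result
```

With `RANK` instead, a tie at second place would consume rank 3, so fewer distinct count levels would survive the `rnk <= 3` filter.
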
Data Engineer Coding medium

Given a massive table of web crawl documents with `doc_id`, `url`, `content_hash`, and `crawled_at`, write a highly optimized SQL query to keep only the most recent version of each document per URL, but flag URLs that have multiple distinct content hashes over time.

#Window Functions #Deduplication #Data Cleaning
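
The deduplication question above could be sketched with a single pass over the table: `ROW_NUMBER` picks the latest row per URL, and comparing the minimum and maximum `content_hash` per URL flags URLs whose content changed. The table name `crawl_docs` and the integer `crawled_at` are assumptions; the min/max comparison is used because SQLite does not allow `COUNT(DISTINCT ...)` as a window function.

```python
import sqlite3

DEDUP_SQL = """
WITH ranked AS (
    SELECT
        doc_id,
        url,
        content_hash,
        crawled_at,
        -- newest crawl first; doc_id breaks ties deterministically
        ROW_NUMBER() OVER (
            PARTITION BY url ORDER BY crawled_at DESC, doc_id DESC
        ) AS rn,
        -- 1 when the URL has at least two distinct hashes, else 0
        MIN(content_hash) OVER (PARTITION BY url)
            <> MAX(content_hash) OVER (PARTITION BY url) AS has_multiple_hashes
    FROM crawl_docs
)
SELECT doc_id, url, content_hash, crawled_at, has_multiple_hashes
FROM ranked
WHERE rn = 1
ORDER BY url
"""


def latest_docs(rows):
    """rows: iterable of (doc_id, url, content_hash, crawled_at) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE crawl_docs "
        "(doc_id INTEGER, url TEXT, content_hash TEXT, crawled_at INTEGER)"
    )
    conn.executemany("INSERT INTO crawl_docs VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(DEDUP_SQL).fetchall()
    conn.close()
    return result
```
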
Data Engineer Coding hard

Write a SQL query to calculate the 7-day rolling average of token usage per user, but only for users who have exceeded 10,000 tokens on at least three distinct days within the last month.

#Advanced SQL #Rolling Averages #Subqueries
Data Engineer Coding hard

Write a SQL query to sessionize user interactions. Group consecutive prompts from the same user into a single session if they occur within 30 minutes of each other. Output the `user_id`, `session_start`, `session_end`, and `prompt_count`.

#Sessionization #Window Functions #Time Series
Data Engineer Coding medium

Write a SQL query to find the top 3 most frequently used prompt templates per user, but exclude templates that consist entirely of stop words (assume a `stop_words` table exists).

#Joins #Filtering #Window Functions
Data Engineer Coding medium

Write a SQL query to calculate the 30-day rolling average of tokens processed per model version, given a table of daily token usage logs.

#Window Functions #Aggregations #Time Series
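
A minimal sketch of the 30-day rolling average, assuming a hypothetical table `daily_token_usage(model_version, usage_date, tokens)` with exactly one row per model version per calendar day and no missing days. Under that assumption a 30-row frame equals a 30-day window; if days can be missing, a date-range frame or a calendar join would be needed instead.

```python
import sqlite3

ROLLING_SQL = """
SELECT
    model_version,
    usage_date,
    -- current row plus the 29 rows before it: 30 days under the one-row-per-day assumption
    AVG(tokens) OVER (
        PARTITION BY model_version
        ORDER BY usage_date
        ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
    ) AS rolling_avg_30d
FROM daily_token_usage
ORDER BY model_version, usage_date
"""


def rolling_30d(rows):
    """rows: iterable of (model_version, usage_date 'YYYY-MM-DD', tokens) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE daily_token_usage "
        "(model_version TEXT, usage_date TEXT, tokens INTEGER)"
    )
    conn.executemany("INSERT INTO daily_token_usage VALUES (?, ?, ?)", rows)
    result = conn.execute(ROLLING_SQL).fetchall()
    conn.close()
    return result
```

For the first 29 days of a model version the frame holds fewer than 30 rows, so the average covers only the days available so far.
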
Data Engineer Coding hard

We have a log table of safety filter triggers. Write a SQL query to identify all user sessions where a user triggered a safety filter more than 3 times within any 5-minute window.

#Self Joins #Time Series #Complex Window Functions
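
Although the tags suggest a self join, one alternative sketch uses `LAG(ts, 3)`: if the trigger three rows earlier in the same session happened at most 300 seconds ago, then four triggers fall inside one 5-minute window. The table `safety_triggers(user_id, session_id, ts)`, the epoch-second `ts`, and treating the window boundary as inclusive are all assumptions.

```python
import sqlite3

BURST_SQL = """
WITH lagged AS (
    SELECT
        user_id,
        session_id,
        ts,
        -- timestamp of the trigger three rows earlier in the same session
        LAG(ts, 3) OVER (
            PARTITION BY user_id, session_id ORDER BY ts
        ) AS ts_3_back
    FROM safety_triggers
)
SELECT DISTINCT user_id, session_id
FROM lagged
-- four triggers (this one plus three earlier) within 300 seconds
WHERE ts_3_back IS NOT NULL
  AND ts - ts_3_back <= 300
ORDER BY user_id, session_id
"""


def flagged_sessions(rows):
    """rows: iterable of (user_id, session_id, ts) tuples, one per trigger."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE safety_triggers (user_id TEXT, session_id TEXT, ts INTEGER)"
    )
    conn.executemany("INSERT INTO safety_triggers VALUES (?, ?, ?)", rows)
    result = conn.execute(BURST_SQL).fetchall()
    conn.close()
    return result
```
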
Data Engineer Coding hard

Write a SQL query to find the median model response latency per day from a massive logs table, assuming your SQL dialect does not have a built-in MEDIAN() function.

#Percentiles #Math #Advanced SQL
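
Without a built-in `MEDIAN()`, one standard approach is to number the rows within each day by latency, then average the one or two middle rows. The sketch below assumes a hypothetical table `response_logs(log_date, latency_ms)` and relies on SQLite's integer division to locate the middle positions.

```python
import sqlite3

MEDIAN_SQL = """
WITH ordered AS (
    SELECT
        log_date,
        latency_ms,
        ROW_NUMBER() OVER (PARTITION BY log_date ORDER BY latency_ms) AS rn,
        COUNT(*) OVER (PARTITION BY log_date) AS cnt
    FROM response_logs
)
SELECT log_date, AVG(latency_ms) AS median_latency_ms
FROM ordered
-- odd cnt: both expressions pick the single middle row
-- even cnt: they pick the two middle rows, which AVG then combines
WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2)
GROUP BY log_date
ORDER BY log_date
"""


def daily_median(rows):
    """rows: iterable of (log_date 'YYYY-MM-DD', latency_ms) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE response_logs (log_date TEXT, latency_ms INTEGER)")
    conn.executemany("INSERT INTO response_logs VALUES (?, ?)", rows)
    result = conn.execute(MEDIAN_SQL).fetchall()
    conn.close()
    return result
```
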
Data Engineer Coding medium

In our distributed logging system, log IDs are supposed to be sequential. Write a SQL query to find all gaps (missing sequential IDs) in the log table.

#Gaps and Islands #Sequences #Self Joins
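
For the missing-ID question, a compact sketch pairs each ID with the next one via `LEAD` and reports every pair that is more than 1 apart as a gap range. The table name `logs` and its single `log_id` column are assumptions; the query returns gap ranges rather than one row per missing ID.

```python
import sqlite3

GAPS_SQL = """
WITH nexts AS (
    SELECT
        log_id,
        -- the next ID present in the table, in ascending order
        LEAD(log_id) OVER (ORDER BY log_id) AS next_id
    FROM logs
)
SELECT log_id + 1 AS gap_start, next_id - 1 AS gap_end
FROM nexts
WHERE next_id - log_id > 1
ORDER BY gap_start
"""


def find_gaps(ids):
    """ids: iterable of integer log IDs; returns (gap_start, gap_end) ranges."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logs (log_id INTEGER)")
    conn.executemany("INSERT INTO logs VALUES (?)", [(i,) for i in ids])
    result = conn.execute(GAPS_SQL).fetchall()
    conn.close()
    return result
```
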
Data Engineer Coding medium

Write a SQL query to calculate the Day-1, Day-7, and Day-30 retention rate of users interacting with the Claude API, grouped by the month they signed up.

#Cohorts #Retention #Date Math
Data Engineer Coding medium

You have a table of model evaluation scores in a long format: (model_id, eval_metric, score). Write a SQL query to pivot this table so that 'Helpfulness', 'Honesty', and 'Harmlessness' are columns.

#Pivot #Data Transformation #Aggregations
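
The pivot can be sketched with conditional aggregation, which works in dialects that lack a `PIVOT` keyword. The table name `eval_scores` is an assumption, as is using `MAX` to collapse rows (it presumes at most one score per model and metric).

```python
import sqlite3

PIVOT_SQL = """
SELECT
    model_id,
    -- each CASE keeps only the score for one metric; MAX ignores the NULLs
    MAX(CASE WHEN eval_metric = 'Helpfulness'  THEN score END) AS helpfulness,
    MAX(CASE WHEN eval_metric = 'Honesty'      THEN score END) AS honesty,
    MAX(CASE WHEN eval_metric = 'Harmlessness' THEN score END) AS harmlessness
FROM eval_scores
GROUP BY model_id
ORDER BY model_id
"""


def pivot_scores(rows):
    """rows: iterable of (model_id, eval_metric, score) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE eval_scores (model_id TEXT, eval_metric TEXT, score REAL)"
    )
    conn.executemany("INSERT INTO eval_scores VALUES (?, ?, ?)", rows)
    result = conn.execute(PIVOT_SQL).fetchall()
    conn.close()
    return result
```

A model with no row for a given metric gets `NULL` (Python `None`) in that column.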

Difficulty Radar

Based on recent AI-sourced data.

Meet Your Interviewers

The "Standard" Interviewer

Senior Engineer

Focuses on core competencies, system constraints, and clear communication.


Unwritten Rules

Think Out Loud

Always explain your thought process before writing code or drawing architecture.
