Coinbase

Cryptocurrency exchange platform

4 Rounds | ~21 Days | Hard

The Interview Loop

Recruiter Screen (30 min)

Standard fit check, behavioral questions, and resume overview.

Technical Loop (3-4 Rounds)

Deep dive into domain knowledge, coding, and system design.

Interview Question Bank

Data Engineer Behavioral medium

Tell me about a time you took ownership of a failing or highly unstable data pipeline and turned it around. What was your approach?

#Ownership #Problem Solving #Resilience
Data Engineer Behavioral medium

Describe a situation where you had to disagree with a senior engineer or manager about a technical architecture choice. How did you handle it?

#Conflict Resolution #Communication #Data-Driven Decisions
Data Engineer Behavioral medium

Coinbase values clear communication. Give an example of how you explained a complex data engineering concept to a non-technical stakeholder.

#Communication #Stakeholder Management
Data Engineer Behavioral medium

Tell me about a time you had to learn a new technology completely from scratch to deliver a project on a tight deadline.

#Continuous Learning #Adaptability #Execution
Data Engineer Behavioral medium

Describe a time you identified a major data quality issue in production. How did you handle it and prevent it from happening again?

#Data Quality #Proactiveness #Process Improvement
Data Engineer Behavioral hard

Tell me about a time you made a mistake that impacted production data or systems. What was the aftermath and what did you learn?

#Accountability #Post-mortems #Growth Mindset
Data Engineer Behavioral medium

How do you prioritize addressing technical debt versus building new data features requested by the product team?

#Prioritization #Technical Debt #Product Alignment
Data Engineer Behavioral easy

Give an example of a time you stepped outside your defined role as a Data Engineer to help the team or company succeed.

#Teamwork #Initiative #Cross-functional
Data Engineer Behavioral easy

Why do you want to work at Coinbase, and what are your thoughts on the future of decentralized finance (DeFi) from a data perspective?

#Domain Interest #Crypto #Company Mission
Data Engineer Behavioral medium

Tell me about a time you had to 'act like an owner' and take responsibility for a failing data pipeline that wasn't originally yours.

#Ownership #Troubleshooting #Accountability
Data Engineer Behavioral medium

Describe a situation where you disagreed with a product manager or stakeholder about a data architecture decision. How did you resolve it?

#Conflict Resolution #Communication #Stakeholder Management
Data Engineer Behavioral medium

Tell me about a time you had to learn a new technology or framework very quickly to deliver a critical project.

#Adaptability #Continuous Learning #Execution
Data Engineer Behavioral medium

Give an example of a time you simplified a complex data process or architecture. What was the impact?

#Simplification #Efficiency #Engineering Excellence
Data Engineer Behavioral medium

Tell me about a time you discovered a critical data discrepancy that impacted business reporting. How did you communicate and fix it?

#Integrity #Communication #Incident Management
Data Engineer Behavioral medium

Describe a project where you had to balance shipping quickly versus building a perfect, highly scalable solution.

#Trade-offs #Execution #Pragmatism
Data Engineer Behavioral medium

Tell me about a time you received critical feedback from a peer or manager. How did you handle it?

#Self-Awareness #Growth Mindset #Feedback
Data Engineer Behavioral medium

How do you prioritize addressing technical debt versus building new feature requests in a data engineering sprint?

#Prioritization #Stakeholder Management #Agile
Data Engineer Behavioral medium

Describe a time you mentored a junior engineer through a difficult technical challenge. What was your approach?

#Mentorship #Team Building #Empathy
Data Engineer Coding hard

Design a class to keep track of the top K most traded cryptocurrencies in a sliding window of the last 1 hour.

#Heaps #Sliding Window #Hash Maps
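
One possible shape for an answer, as a minimal sketch rather than a reference solution. It assumes trades arrive in non-decreasing timestamp order and that "most traded" means trade count, not volume. The class name TopKTraded and its methods are hypothetical.

```python
from collections import Counter, deque
import heapq


class TopKTraded:
    """Tracks trade counts per symbol over a trailing time window (hypothetical API)."""

    def __init__(self, k, window_seconds=3600):
        self.k = k
        self.window_seconds = window_seconds
        self._events = deque()    # (timestamp, symbol), oldest first
        self._counts = Counter()  # symbol -> trades currently inside the window

    def record(self, symbol, ts):
        # Assumption: ts values are non-decreasing across calls.
        self._events.append((ts, symbol))
        self._counts[symbol] += 1

    def _evict(self, now):
        # Drop events that are a full window or more older than `now`.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, symbol = self._events.popleft()
            self._counts[symbol] -= 1
            if self._counts[symbol] == 0:
                del self._counts[symbol]

    def top_k(self, now):
        # Returns up to k (symbol, count) pairs, highest count first.
        self._evict(now)
        return heapq.nlargest(self.k, self._counts.items(), key=lambda kv: kv[1])


tracker = TopKTraded(k=2, window_seconds=3600)
for ts, symbol in [(0, "BTC"), (1, "BTC"), (2, "BTC"), (100, "ETH"), (101, "ETH"), (102, "SOL")]:
    tracker.record(symbol, ts)
```

Eviction is amortized O(1) per event and each query costs O(S log k) for S distinct symbols. An interviewer may push for a structure that avoids rescanning all symbols on every query.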
Data Engineer Coding easy

Write a Python function to parse a massive JSON log file of crypto trades and return the total trading volume per currency pair.

#JSON Parsing #Dictionaries #File I/O
Data Engineer Coding medium

Implement a rate limiter for a crypto price API that allows a maximum of 100 requests per minute per user.

#Data Structures #Queues #System Clocks
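
A sliding-window-log approach is one common way to answer this. The sketch below keeps a queue of request timestamps per user. The class name and the optional now parameter (injected so the logic can be tested without sleeping) are assumptions, and the code is single-threaded only.

```python
from collections import defaultdict, deque
import time


class SlidingWindowRateLimiter:
    """Allows at most max_requests per user within any trailing window (hypothetical API)."""

    def __init__(self, max_requests=100, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id, now=None):
        # A monotonic clock avoids surprises when the wall clock is adjusted.
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Discard accepted requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowRateLimiter(max_requests=2, window_seconds=60)
```

Memory grows with max_requests per active user. A token bucket or fixed-window counter trades accuracy for a constant-size state, which is worth mentioning as a follow-up.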
Data Engineer Coding medium

Given a list of user login session intervals [start_time, end_time], write a function to merge all overlapping sessions.

#Arrays #Sorting #Intervals
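
A typical answer sorts by start time and folds each interval into the last merged one. This sketch assumes intervals that merely touch (one ends exactly where the next starts) should be merged; the function name is hypothetical.

```python
def merge_sessions(sessions):
    """Merge overlapping [start, end] intervals and return them sorted by start."""
    merged = []
    for start, end in sorted(sessions):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


result = merge_sessions([[8, 10], [1, 3], [2, 6], [15, 18]])
```

Sorting dominates the cost at O(n log n); the merge pass itself is linear.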
Data Engineer Coding easy

Write a Python function to validate if a given string is a valid Bitcoin address based on specific prefix and length constraints.

#String Manipulation #Regex #Validation
Data Engineer Coding medium

Write a recursive function to flatten a deeply nested dictionary representing a complex blockchain transaction payload into a single-level dictionary.

#Recursion #Dictionaries #Data Transformation
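
A minimal recursive sketch, assuming nested keys are joined with a dot and that lists and empty dictionaries are kept as leaf values. Both choices are assumptions an interviewer may want changed, and the payload shape in the example is invented for illustration.

```python
def flatten(payload, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in payload.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict) and value:
            # Recurse into non-empty dicts; everything else is a leaf.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical payload shape, for illustration only.
tx = {"tx": {"from": "a", "value": {"amount": 1, "unit": "wei"}}, "block": 7}
flat_tx = flatten(tx)
```

Very deep payloads can hit Python's recursion limit, so an explicit stack is a reasonable follow-up to discuss.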
Data Engineer Coding medium

Write a Python script using Pandas to backfill missing daily price data for a crypto asset using linear interpolation.

#Pandas #Data Cleaning #Interpolation
Data Engineer Coding hard

Implement a function to detect cycles in a directed graph representing crypto wallet transfers, to help flag potential money laundering.

#Graphs #DFS #Cycle Detection
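
One way to approach this is a depth-first search with three node states (unvisited, in progress, done). The sketch below uses an explicit stack instead of recursion so that long transfer chains do not hit the recursion limit. It assumes transfers are given as (sender, receiver) pairs, and the function name is hypothetical.

```python
def has_cycle(transfers):
    """Return True if the directed graph built from (src, dst) pairs contains a cycle."""
    graph = {}
    for src, dst in transfers:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored
    color = dict.fromkeys(graph, WHITE)

    for root in graph:
        if color[root] != WHITE:
            continue
        color[root] = GRAY
        stack = [(root, iter(graph[root]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if color[nxt] == GRAY:
                    return True  # back edge to a node on the current path
                if color[nxt] == WHITE:
                    color[nxt] = GRAY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours explored: this node cannot be part of a cycle.
                color[node] = BLACK
                stack.pop()
    return False


looped = has_cycle([("a", "b"), ("b", "c"), ("c", "a")])
```

A cycle on its own is only a signal. Real flagging would likely also weigh amounts and timing, which this sketch ignores.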
Data Engineer Coding medium

Write a SQL query to find the 7-day rolling average of daily transaction volume per user.

#SQL #Window Functions #Time-series
Data Engineer Coding medium

Write a SQL query to identify 'whale' wallets that have transferred more than $1,000,000 in aggregate volume in a single calendar day.

#SQL #Aggregations #GROUP BY
Data Engineer Coding medium

Parse a log file of cryptocurrency trades and calculate the VWAP (Volume Weighted Average Price) for a specific asset over a given time period.

#Python #Data Processing #Math
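
VWAP is the sum of price times volume divided by total volume. The question does not specify a log format, so this sketch assumes one JSON object per line with hypothetical fields asset, price, volume, and ts, and treats the time range as half-open.

```python
import json


def vwap(lines, asset, start_ts, end_ts):
    """Volume-weighted average price for `asset` over start_ts <= ts < end_ts.

    `lines` is any iterable of JSON strings (for example, an open file).
    Returns None when no matching volume was traded.
    """
    notional = 0.0
    volume = 0.0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        trade = json.loads(line)
        if trade["asset"] != asset or not (start_ts <= trade["ts"] < end_ts):
            continue
        notional += trade["price"] * trade["volume"]
        volume += trade["volume"]
    return notional / volume if volume else None


sample = [
    '{"asset": "BTC", "price": 100, "volume": 1, "ts": 10}',
    '{"asset": "BTC", "price": 200, "volume": 3, "ts": 20}',
    '{"asset": "ETH", "price": 10, "volume": 5, "ts": 15}',
    '{"asset": "BTC", "price": 999, "volume": 1, "ts": 99}',
]
btc_vwap = vwap(sample, "BTC", 0, 50)  # (100*1 + 200*3) / 4 = 175.0
```

Streaming line by line keeps memory flat for large files. For money-sensitive output, decimal.Decimal may be a better fit than float.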
Data Engineer Coding medium

Write a function to flatten a deeply nested JSON object representing a blockchain transaction payload into a single-level dictionary.

#Python #JSON #Recursion
Data Engineer Coding medium

Given a list of user login sessions with start and end timestamps, merge overlapping intervals to calculate the total active time for a user.

#Python #Arrays #Sorting
Data Engineer Coding hard

Implement a rate limiter for an internal API endpoint that pulls real-time Bitcoin prices, allowing a maximum of N requests per M seconds per IP address.

#Python #Concurrency #Data Structures
Data Engineer Coding hard

Given a continuous stream of order book updates (bids and asks), write a class to maintain and efficiently query the current best bid and best ask.

#Python #Heaps #Data Structures
Data Engineer Coding hard

Write a script to detect cycles in a directed graph representing cryptocurrency wallet transfers to flag potential money laundering.

#Python #Graphs #DFS
Data Engineer Coding medium

Find the top K most frequently traded assets in a rolling window of N minutes from a stream of trade events.

#Python #Sliding Window #Heaps #Queues
Data Engineer Coding easy

Write a function to validate if a given string is a valid Ethereum address (starts with 0x, followed by 40 hexadecimal characters).

#Python #Regex #Strings
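
A short sketch of the format check described in the question. It validates only the 0x prefix and the 40 hexadecimal characters; it does not verify the mixed-case EIP-55 checksum. The function name is hypothetical.

```python
import re

# 0x followed by exactly 40 hexadecimal characters, matched against the whole string.
_ETH_ADDRESS = re.compile(r"0x[0-9a-fA-F]{40}")


def is_valid_eth_address(address):
    """Return True if `address` has the basic Ethereum address shape."""
    return isinstance(address, str) and _ETH_ADDRESS.fullmatch(address) is not None


ok = is_valid_eth_address("0x" + "ab" * 20)
```

Using fullmatch rather than match with a trailing $ avoids accepting a string that ends in a newline.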
Data Engineer Coding medium

Given a table of user logins and a table of trades, write a query to find the percentage of users who made a trade within 1 hour of logging in.

#SQL #Joins #Date Math
Data Engineer Coding hard

Write a SQL query to find the maximum consecutive days a user has made at least one trade (trading streak).

#SQL #Gaps and Islands #Window Functions
Data Engineer Coding medium

Write a query to calculate the cumulative balance of a wallet address given a ledger of debit and credit transactions.

#SQL #Window Functions #Running Totals
Data Engineer System Design hard

Design a real-time ETL pipeline to ingest, validate, and store transaction data from multiple blockchain nodes (Bitcoin, Ethereum, Solana).

#Streaming #Kafka #Data Lake #Architecture
Data Engineer System Design hard

Design a data warehouse architecture for Coinbase's compliance team to run daily Anti-Money Laundering (AML) reports on petabytes of data.

#Data Warehousing #Batch Processing #Security #Snowflake
Data Engineer System Design hard

How would you design a system to calculate and serve real-time crypto portfolio balances to millions of concurrent users?

#Caching #Event Sourcing #High Availability #Redis
Data Engineer System Design medium

Design an Airflow DAG architecture to handle dependencies between fiat deposits, crypto trades, and daily financial reconciliation.

#Airflow #Orchestration #DAG Design #Idempotency
Data Engineer System Design hard

Explain how you would migrate a massive legacy Redshift database to Snowflake with zero downtime for downstream analytics consumers.

#Cloud Migration #Dual Writes #Data Validation
Data Engineer System Design hard

Design a streaming pipeline using Kafka and Flink to detect fraudulent login attempts based on IP geolocation and velocity.

#Stream Processing #Flink #Fraud Detection #Stateful Processing
Data Engineer System Design medium

How do you handle schema evolution in a data lake storing raw JSON payloads from various rapidly updating blockchain networks?

#Schema Registry #Data Lake #Parquet/Avro
Data Engineer System Design medium

Design a system to ingest exchange rate data from 10 different external crypto exchanges and calculate a consolidated, volume-weighted global price.

#API Integration #Data Aggregation #Fault Tolerance
Data Engineer System Design hard

Design a real-time data pipeline to ingest and process blockchain node data (e.g., Ethereum blocks) into a data warehouse for analytics.

#Streaming #Kafka #Data Lake #Architecture
Data Engineer System Design hard

Design a system to detect anomalous trading patterns (potential wash trading) in near real-time.

#Real-time #Flink #Fraud Detection #Event Processing
Data Engineer System Design medium

How would you design a scalable ETL pipeline to aggregate daily trading fees across millions of users and multiple assets?

#Batch Processing #Airflow #Spark #ETL
Data Engineer System Design hard

Design a data lakehouse architecture for Coinbase's compliance team to run ad-hoc queries on petabytes of historical transaction data.

#Iceberg/Hudi #Snowflake #Storage #Data Governance
Data Engineer System Design medium

Explain how you would handle late-arriving data in a daily batch pipeline calculating user portfolio balances.

#Data Engineering #Airflow #Idempotency #Backfilling
Data Engineer System Design hard

Design a system to ingest, process, and serve real-time exchange rates for 10,000+ crypto pairs to internal microservices.

#Pub/Sub #Caching #Redis #Microservices
Data Engineer System Design medium

How would you migrate a legacy daily batch pipeline to a streaming architecture using Kafka and Flink?

#Kafka #Flink #Migration #Architecture
Data Engineer System Design medium

Design a data pipeline to sync user account balances from a highly transactional PostgreSQL database to Snowflake.

#CDC #Debezium #Snowflake #Data Replication
Data Engineer System Design medium

How do you ensure data quality and anomaly detection in a pipeline that ingests third-party market data?

#Data Quality #Anomaly Detection #Observability
Data Engineer System Design hard

Design a metrics aggregation system for Coinbase Wallet telemetry data (e.g., button clicks, screen views) handling millions of events per second.

#High Throughput #OLAP #Druid/ClickHouse #Telemetry
Data Engineer Technical medium

Write a SQL query to calculate the 7-day moving average of trading volume for each cryptocurrency asset on the platform.

#Window Functions #Time Series #Aggregations
Data Engineer Technical medium

Given a 'transactions' table, write a query to find the top 3 users by total transaction volume for each month.

#CTEs #Ranking Functions #Date Truncation
Data Engineer Technical hard

Write a SQL query to identify potential 'wash trading'—instances where a user buys and sells the same asset to themselves within a 1-minute window.

#Self Joins #Time Intervals #Fraud Detection
Data Engineer Technical hard

Given a table of user login and logout timestamps, write a query to find the maximum number of concurrent active user sessions on the exchange.

#Cumulative Sum #Event Sourcing #Advanced SQL
Data Engineer Technical medium

Calculate the Daily Active Users (DAU) and the month-over-month retention rate for the Coinbase Pro platform.

#Cohort Analysis #Retention #Aggregations
Data Engineer Technical medium

Given a table of order book snapshots (bids and asks), write a query to calculate the bid-ask spread for BTC-USD at the end of every hour.

#Window Functions #Filtering #Financial Data
Data Engineer Technical hard

Design a relational data model for a new NFT marketplace. Then, write a query to find the highest-grossing NFT collection of the week.

#Schema Design #Foreign Keys #Joins
Data Engineer Technical easy

Write a SQL query to find the first and last transaction timestamp, along with the respective amounts, for every wallet address.

#Aggregations #Window Functions
Data Engineer Technical hard

How would you handle late-arriving blockchain transaction data in a daily aggregate table? Write a SQL MERGE statement to update the aggregates.

#Idempotency #MERGE/UPSERT #Data Warehousing
Data Engineer Technical medium

Write a query to identify users who made a fiat deposit but did not execute any crypto trades within 24 hours of that deposit.

#Left Joins #Time Intervals #Funnel Analysis
Data Engineer Technical medium

Design a relational data model for a cryptocurrency exchange order book, including tables for users, orders, and executions.

#Schema Design #RDBMS #Normalization
Data Engineer Technical hard

Design a dimensional model (star schema) for tracking NFT mints and secondary market sales for Coinbase NFT.

#Star Schema #Data Warehouse #Fact/Dimension Tables
Data Engineer Technical medium

How would you handle slowly changing dimensions (SCD Type 2) for user KYC (Know Your Customer) status changes in a data warehouse?

#SCD #Data Warehouse #ETL

Meet Your Interviewers

The "Standard" Interviewer

Senior Engineer

Focuses on core competencies, system constraints, and clear communication.

Unwritten Rules

Think Out Loud

Always explain your thought process before writing code or drawing architecture.
