Leading technology company specializing in search, cloud, and AI.
4 Rounds
~21 Days
Very Hard
The Interview Loop
Recruiter Screen (30 min)
Standard fit check, behavioral questions, and resume overview.
Technical Loop (3-4 Rounds)
Deep dive into domain knowledge, coding, and system design.
Interview Question Bank
AI Engineer
•
Behavioral
•
hard
Tell me about an AI project where you had to balance innovation with reliability.
#Reliability
#Innovation
AI Engineer
•
Behavioral
•
medium
How do you handle stakeholder uncertainty around AI capabilities and limitations?
#Stakeholders
#Expectations
AI Engineer
•
Behavioral
•
medium
Tell me about a time you optimized an LLM application for cost or latency.
#Cost
#Latency
AI Engineer
•
Behavioral
•
medium
Describe a time you had to choose between using an AI model and a simpler rule-based system.
#Tradeoffs
#Pragmatism
AI Engineer
•
Behavioral
•
easy
How do you stay current with the fast-moving AI/ML research landscape?
#Research
#Continuous Learning
AI Engineer
•
Behavioral
•
hard
Tell me about a time an AI system you built produced unexpected or harmful outputs.
#Responsibility
#Ethics
AI Engineer
•
Behavioral
•
hard
Describe an AI product you built from scratch. What were the key technical decisions?
#Product Development
AI Engineer
•
Behavioral
•
hard
Describe a situation where you had to debug a hard-to-reproduce AI model failure.
#Problem Solving
AI Engineer
•
Coding
•
hard
Implement a semantic chunking strategy for long documents.
#Chunking
#Embeddings
AI Engineer
•
Coding
•
medium
Write a Python class to manage conversation history for a multi-turn chatbot.
#Chatbot
#Memory
AI Engineer
•
Coding
•
hard
Implement a simple RAG pipeline using Python, LangChain, and FAISS.
#RAG
#Python
AI Engineer
•
Coding
•
medium
Write a retry mechanism with exponential backoff for LLM API calls.
#Reliability
#APIs
AI Engineer
•
System Design
•
hard
How would you architect an AI platform that supports 1000 concurrent LLM requests?
#Scaling
#LLM Serving
AI Engineer
•
System Design
•
hard
Design an AI-powered customer support chatbot for an e-commerce platform.
#Chatbot
#LLM
AI Engineer
•
System Design
•
hard
Design a document question-answering system using RAG.
#RAG
#Vector Search
AI Engineer
•
System Design
•
hard
Design an AI code review system that integrates with GitHub PRs.
#Code Review
#LLM
AI Engineer
•
System Design
•
hard
How would you build a multi-modal AI system that processes both text and images?
#Multi-Modal
#Vision
AI Engineer
•
System Design
•
hard
Design a real-time AI safety filter for user-generated content.
#Content Moderation
#Real-Time
AI Engineer
•
System Design
•
hard
Design an AI agent system that can autonomously browse the web and complete tasks.
#Agents
#Tool Use
AI Engineer
•
Technical
•
medium
How do you choose the right embedding model for a domain-specific search task?
#Embedding Models
#Search
AI Engineer
•
Technical
•
hard
Explain positional encoding in transformers. What are the differences between absolute and rotary position embeddings?
#Positional Encoding
#RoPE
AI Engineer
•
Technical
•
hard
What is hallucination in LLMs? How do you detect and mitigate it?
#Hallucination
#Safety
AI Engineer
•
Technical
•
medium
Explain the difference between autoregressive and masked language modeling.
#Autoregressive
#Masked LM
AI Engineer
•
Technical
•
hard
What is a mixture of experts (MoE) architecture? How does it scale?
#MoE
#Scaling
AI Engineer
•
Technical
•
hard
What is Constitutional AI? How does Anthropic use it?
#Constitutional AI
#Anthropic
AI Engineer
•
Technical
•
hard
How do you red-team an AI system?
#Red Teaming
#Security
AI Engineer
•
Technical
•
medium
What are guardrails in LLM applications? How do they work?
#Guardrails
#Output Filtering
AI Engineer
•
Technical
•
medium
How do you integrate OpenAI API or Gemini API into a production application?
#OpenAI
#Gemini
AI Engineer
•
Technical
•
medium
What is LangChain? What are its key components (Chains, Agents, Tools)?
#LangChain
#Agents
AI Engineer
•
Technical
•
medium
What is streaming response from an LLM API? How do you implement it in a web app?
#Streaming
#API
AI Engineer
•
Technical
•
medium
Explain structured output generation from LLMs (JSON mode, Instructor library).
#Structured Output
#JSON
AI Engineer
•
Technical
•
hard
Explain how vector similarity search works. What are HNSW and IVF indices?
#HNSW
#Similarity Search
AI Engineer
•
Technical
•
medium
Compare vector databases: Pinecone, Weaviate, Qdrant, and pgvector.
#Vector DB
#Embeddings
AI Engineer
•
Technical
•
medium
What is semantic search? How does it differ from keyword-based search?
#Semantic Search
#NLP
AI Engineer
•
Technical
•
hard
Explain the difference between dense and sparse retrieval in RAG.
#Dense Retrieval
#BM25
AI Engineer
•
Technical
•
hard
How do you evaluate retrieval quality in a RAG system?
#Evaluation
#Retrieval
AI Engineer
•
Technical
•
hard
How do you evaluate the quality of an LLM-generated response?
#LLM Evaluation
#RAGAS
AI Engineer
•
Technical
•
hard
What is AI alignment? What are the key safety concerns with large-scale AI deployment?
#Alignment
#Safety
AI Engineer
•
Technical
•
hard
Explain the concept of AI bias. How do you detect and mitigate it in production?
#Bias
#Fairness
AI Engineer
•
Technical
•
medium
How do you manage LLM API rate limits and costs in production?
#Rate Limiting
#Cost
AI Engineer
•
Technical
•
hard
Explain function calling / tool use in LLMs. How do you implement it?
#Function Calling
#Tool Use
AI Engineer
•
Technical
•
hard
Explain the difference between GPT, BERT, and T5 architectures.
#GPT
#BERT
#T5
AI Engineer
•
Technical
•
medium
What is prompt engineering? What are few-shot, zero-shot, and chain-of-thought prompting?
#Prompt Engineering
#Few-Shot
AI Engineer
•
Technical
•
hard
Explain how RLHF (Reinforcement Learning from Human Feedback) improves LLMs.
#RLHF
#Alignment
AI Engineer
•
Technical
•
hard
What is RAG (Retrieval-Augmented Generation)? When would you use it over fine-tuning?
#RAG
#Fine-Tuning
AI Engineer
•
Technical
•
medium
Explain the difference between fine-tuning and in-context learning.
#Fine-Tuning
#ICL
AI Engineer
•
Technical
•
medium
What is token context window? How do you handle documents longer than the context limit?
#Context Window
#Chunking
Cloud Engineer
•
Behavioral
•
medium
Tell me about a time you significantly reduced cloud infrastructure costs.
#FinOps
#Impact
Cloud Engineer
•
Behavioral
•
medium
Describe a situation where a critical production system went down, and there was no runbook. How did you handle it?
#Incident Management
#Ambiguity
#Ownership
#SRE
Cloud Engineer
•
Behavioral
•
medium
Tell me about a time you had to work with a difficult stakeholder or team member who strongly disagreed with your technical approach. How did you resolve it?
#Conflict Resolution
#Communication
#Teamwork
#Influence
Cloud Engineer
•
Behavioral
•
medium
Tell me about a time you had to push back on a customer's architectural choice because you knew it would lead to scalability issues down the line.
#Customer Empathy
#Communication
#Pushback
#Consulting
Cloud Engineer
•
Behavioral
•
medium
Describe your experience with incident post-mortems. What do you include?
#Post-Mortem
#Learning
Cloud Engineer
•
Behavioral
•
medium
How do you communicate a complex cloud architecture to non-technical stakeholders?
#Stakeholders
Cloud Engineer
•
Behavioral
•
medium
Tell me about a time you improved the reliability of a cloud-based data system.
#SRE
#Impact
Cloud Engineer
•
Behavioral
•
medium
Describe a situation where you had to choose between two cloud architectures. How did you decide?
#Architecture
#Tradeoffs
Cloud Engineer
•
Behavioral
•
hard
Tell me about a major cloud outage you experienced. How did you respond?
#Outage
#On-Call
Cloud Engineer
•
Behavioral
•
hard
Describe a time you migrated a critical workload to the cloud with zero downtime.
#Cloud Migration
Cloud Engineer
•
Behavioral
•
easy
How do you stay updated with new cloud services and features?
#Continuous Learning
Cloud Engineer
•
Coding
•
medium
Write a function to validate if a given string is a valid IPv4 address, and then extend it to check if it belongs to a specific CIDR block.
#String Manipulation
#Bitwise Operations
#Networking
Cloud Engineer
•
Coding
•
medium
Given a list of log entries with timestamps and error codes, write a function to find the top 3 most frequent error codes within a sliding window of 5 minutes.
#Sliding Window
#Hash Map
#Queue
#Data Structures
Cloud Engineer
•
Coding
•
medium
Write a Python script to find and delete all unattached persistent disks in a GCP project that are older than 30 days to save costs.
#Python
#GCP API
#Cost Optimization
#Scripting
Cloud Engineer
•
System Design
•
hard
How would you set up a streaming data pipeline on GCP using Pub/Sub and Dataflow?
#GCP
#Pub/Sub
#Dataflow
Cloud Engineer
•
System Design
•
hard
Design a data lake on AWS using S3, Glue, and Athena.
#AWS
#S3
#Athena
Cloud Engineer
•
System Design
•
hard
Design a real-time streaming data pipeline on GCP to ingest, process, and analyze millions of IoT sensor events per second.
#Pub/Sub
#Dataflow
#BigQuery
#IoT
Cloud Engineer
•
System Design
•
hard
Design a highly available, globally distributed web application on GCP that handles sudden, massive spikes in traffic (e.g., a viral news site).
#Global Load Balancer
#Cloud CDN
#Cloud Run
#Cloud Spanner
Cloud Engineer
•
System Design
•
hard
A customer wants to migrate a monolithic on-premise application backed by an Oracle database to GCP. Walk me through your migration strategy.
#Cloud Migration
#Strangler Fig
#Database Migration Service
#Bare Metal Solution
Cloud Engineer
•
System Design
•
hard
How do you implement disaster recovery for a cloud data warehouse?
#DR
#RTO
#RPO
Cloud Engineer
•
System Design
•
hard
How would you architect a data platform that reduces spend by 40% without impacting performance?
#FinOps
#Cloud
Cloud Engineer
•
Technical
•
medium
What is the shared responsibility model in cloud security?
#Cloud Security
#IAM
Cloud Engineer
•
Technical
•
hard
What is a VPC (Virtual Private Cloud)? How do you design a secure VPC architecture?
#VPC
#Security
Cloud Engineer
•
Technical
•
easy
Explain the difference between regions, availability zones, and edge locations.
#Regions
#AZs
Cloud Engineer
•
Technical
•
medium
How does auto-scaling work? What are the different scaling strategies?
#Auto-Scaling
#EC2
Cloud Engineer
•
Technical
•
medium
What is a cloud-native application? How does it differ from a lifted-and-shifted one?
#Cloud Native
#Migration
Cloud Engineer
•
Technical
•
hard
Explain multi-cloud vs hybrid cloud architectures and their tradeoffs.
#Multi-Cloud
#Hybrid
Cloud Engineer
•
Technical
•
hard
Explain Kubernetes architecture: control plane, nodes, pods, and services.
#K8s
#Containers
Cloud Engineer
•
Technical
•
hard
What is a Kubernetes Operator and when would you build one?
#Operators
#CRD
Cloud Engineer
•
Technical
•
hard
How does container networking work in Kubernetes?
#Networking
#CNI
Cloud Engineer
•
Technical
•
hard
Explain how you would design a cross-project IAM strategy for a large enterprise using Shared VPCs and least privilege principles.
#IAM
#Shared VPC
#Security
#Resource Hierarchy
Cloud Engineer
•
Technical
•
hard
What happens exactly when you type `ls -l` in a Linux terminal? Go as deep into the OS level as possible.
#Linux
#Syscalls
#File Systems
#Process Management
Cloud Engineer
•
Technical
•
medium
A customer complains that their GKE pods cannot reach an external API. Walk me through your troubleshooting steps.
#GKE
#Networking
#VPC
#Cloud NAT
Cloud Engineer
•
Technical
•
medium
How would you implement a zero-downtime deployment strategy for a microservice running on Cloud Run?
#Cloud Run
#CI/CD
#Traffic Splitting
#SRE
Cloud Engineer
•
Technical
•
medium
Explain the difference between a Readiness probe and a Liveness probe in Kubernetes. What happens if you misconfigure them?
#Kubernetes
#GKE
#Reliability
#Microservices
Cloud Engineer
•
Technical
•
easy
Compare and contrast Cloud Storage, Persistent Disk, and Filestore. Give specific use cases for when you would choose one over the others.
#Storage
#GCS
#Block Storage
#File Storage
Cloud Engineer
•
Technical
•
medium
What is OpenTelemetry? How does it standardize observability?
#OpenTelemetry
#Tracing
Cloud Engineer
•
Technical
•
medium
How would you set up CloudWatch dashboards for a data pipeline?
#CloudWatch
#AWS
Cloud Engineer
•
Technical
•
medium
Explain the three pillars of observability: logs, metrics, and traces.
#Logs
#Metrics
#Traces
Cloud Engineer
•
Technical
•
easy
What is a runbook? How do you create effective runbooks for data infrastructure?
#Runbook
#On-Call
Cloud Engineer
•
Technical
•
medium
How do you do capacity planning for a cloud data platform?
#Scaling
#Planning
Cloud Engineer
•
Technical
•
hard
Explain chaos engineering. How would you implement it for a data pipeline?
#Chaos Engineering
#Fault Injection
Cloud Engineer
•
Technical
•
medium
What are SLOs, SLAs, and SLIs? How do you define them for a data platform?
#SLO
#Reliability
Cloud Engineer
•
Technical
•
hard
How would you implement network segmentation for a multi-tier application?
#Security
#Subnets
Cloud Engineer
•
Technical
•
medium
What is AWS PrivateLink? When would you use it?
#PrivateLink
#VPC
Cloud Engineer
•
Technical
•
medium
How do cloud IAM roles and policies work? Explain least-privilege principle.
#IAM
#Permissions
Cloud Engineer
•
Technical
•
medium
Explain TLS/SSL termination in a cloud load balancer.
#TLS
#Load Balancer
Cloud Engineer
•
Technical
•
hard
What is zero-trust networking? How do you implement it on cloud?
#Zero Trust
#Networking
Cloud Engineer
•
Technical
•
medium
How does AWS Glue Data Catalog work with Athena?
#Glue
#Athena
Cloud Engineer
•
Technical
•
medium
Explain AWS S3 storage classes and lifecycle policies.
#S3
#Cost
Cloud Engineer
•
Technical
•
hard
What is BigQuery Slots? How do you optimize BigQuery query costs?
#GCP
#Cost
Cloud Engineer
•
Technical
•
medium
Explain the difference between AWS Lambda and EC2 for data processing.
#Lambda
#Serverless
Cloud Engineer
•
Technical
•
hard
Compare AWS EMR, GCP Dataproc, and Azure HDInsight for Spark workloads.
#EMR
#Dataproc
#Spark
Cloud Engineer
•
Technical
•
hard
How do you handle Terraform state across multiple teams?
#State Management
#Collaboration
Cloud Engineer
•
Technical
•
medium
Explain idempotency in infrastructure provisioning.
#Idempotency
#Terraform
Cloud Engineer
•
Technical
•
medium
How do you manage secrets in cloud infrastructure? (HashiCorp Vault, AWS Secrets Manager)
#Secrets Management
#Vault
Cloud Engineer
•
Technical
•
medium
What is the difference between Terraform and Pulumi?
#Terraform
#Pulumi
Cloud Engineer
•
Technical
•
hard
Explain Terraform's state management. What happens if the state file is corrupted?
#IaC
#State
Cloud Engineer
•
Technical
•
medium
How does a Kubernetes Ingress controller work?
#Ingress
#Load Balancing
Cloud Engineer
•
Technical
•
medium
Explain the difference between Docker and containerd.
#Docker
#containerd
Cloud Engineer
•
Technical
•
hard
How would you set up horizontal pod autoscaling based on custom metrics?
#HPA
#Custom Metrics
Cloud Engineer
•
Technical
•
hard
What is a service mesh? Explain how Istio works.
#Istio
#Service Mesh
Cloud Engineer
•
Technical
•
medium
Explain Kubernetes resource requests vs limits. What happens if a pod exceeds its memory limit?
#Resources
#OOM
Cloud Engineer
•
Technical
•
hard
Compare AWS, GCP, and Azure for a data-intensive workload. What are the key differentiators?
#AWS
#GCP
#Azure
Cloud Engineer
•
Technical
•
easy
Explain IaaS, PaaS, and SaaS with examples.
#IaaS
#PaaS
#SaaS
Data Analyst
•
Behavioral
•
medium
Tell me about an analysis that changed a major business decision.
#Business Impact
#Influence
Data Analyst
•
Behavioral
•
medium
How do you handle a situation where a stakeholder challenges your analysis?
#Stakeholders
#Confidence
Data Analyst
•
Behavioral
•
medium
Describe a time you found an insight that was counterintuitive.
#Curiosity
Data Analyst
•
Behavioral
•
hard
Tell me about a time you had incomplete data but still needed to deliver analysis.
#Ambiguity
Data Analyst
•
Behavioral
•
easy
How do you ensure your analyses are reproducible?
#Reproducibility
Data Analyst
•
Behavioral
•
medium
Tell me about a time you discovered data quality issues mid-analysis. What did you do?
#Problem Solving
Data Analyst
•
Behavioral
•
medium
How do you prioritize analytical requests when multiple teams need you?
#Time Management
Data Analyst
•
Behavioral
•
medium
Describe a dashboard you built that was widely adopted. What made it successful?
#Visualization
Data Analyst
•
Coding
•
hard
Write a SQL query to find customers who made purchases in both January and February but not March.
#Set Operations
Data Analyst
•
Coding
•
easy
Explain how groupby and agg work in pandas with an example.
#Pandas
#GroupBy
Data Analyst
•
Coding
•
hard
What is a funnel query? Write one for a 3-step user onboarding flow.
#Funnel Analysis
Data Analyst
•
Coding
•
medium
Explain window functions. Write a query using LAG() to compute day-over-day change.
#Window Functions
Data Analyst
•
Coding
•
hard
Write a SQL query to calculate the rolling 28-day average session duration per user.
#Rolling Average
#Sessions
Data Analyst
•
Coding
•
hard
How would you detect anomalies in a daily revenue time series using SQL?
#Anomaly Detection
#SQL
Data Analyst
•
Coding
•
medium
What is a pivot table in SQL? How would you implement it without native PIVOT support?
#Pivot
#Data Transformation
Data Analyst
•
Coding
•
medium
How would you merge two large DataFrames efficiently in pandas?
#Pandas
#Merging
Data Analyst
•
Coding
•
medium
Describe how to detect and handle outliers in a dataset using Python.
#Outliers
#Data Cleaning
Data Analyst
•
Coding
•
easy
Write Python code to load a CSV, clean missing values, and compute summary statistics.
#Data Cleaning
#Pandas
Data Analyst
•
Coding
•
medium
Write a SQL query to calculate month-over-month revenue growth.
#Revenue
#Growth Analytics
Data Analyst
•
Coding
•
hard
How would you build a cohort analysis for user retention in SQL?
#Cohort Analysis
#Retention
Data Analyst
•
Coding
•
medium
How would you use pandas to compute a 7-day rolling average of sessions?
#Pandas
#Time Series
Data Analyst
•
Technical
•
medium
Describe your process for creating an executive-level analytics presentation.
#Executive Reporting
Data Analyst
•
Technical
•
easy
How do you choose between a bar chart, line chart, and scatter plot?
#Charts
#Design
Data Analyst
•
Technical
•
easy
Explain the difference between a HAVING clause and a WHERE clause.
#SQL Basics
Data Analyst
•
Technical
•
medium
How do you handle timezone conversions in SQL analytics?
#Timezones
#Analytics
Data Analyst
•
Technical
•
hard
Daily Active Users dropped 15% yesterday. Walk me through how you'd investigate.
#Root Cause Analysis
#Metrics
Data Analyst
•
Technical
•
medium
What is customer lifetime value (LTV)? How would you calculate it?
#LTV
#Retention
Data Analyst
•
Technical
•
easy
Explain the difference between DAU, WAU, and MAU. Which is most useful and when?
#Engagement
#KPIs
Data Analyst
•
Technical
•
medium
How would you measure the success of a new feature launch?
#Feature Success
#Metrics
Data Analyst
•
Technical
•
easy
What is ARPU (Average Revenue Per User)? How do you segment ARPU analysis?
#ARPU
#Revenue
Data Analyst
•
Technical
•
hard
Explain the concept of attribution modeling. What are last-click vs multi-touch models?
#Marketing Analytics
Data Analyst
•
Technical
•
medium
How would you build a dashboard to monitor e-commerce funnel health?
#Visualization
#Funnel
Data Analyst
•
Technical
•
hard
What metrics would you use to measure the health of a marketplace?
#Marketplace
#Supply & Demand
Data Analyst
•
Technical
•
easy
What is net promoter score (NPS)? How do you analyse NPS trends?
#NPS
#Customer Satisfaction
Data Analyst
•
Technical
•
hard
How would you measure the impact of a pricing change on revenue?
#Pricing
#A/B Test
Data Analyst
•
Technical
•
hard
Explain how you'd set up an A/B test to validate a new checkout flow.
#A/B Testing
#Statistics
Data Analyst
•
Technical
•
hard
What sample size do you need for an A/B test? How do you calculate it?
#Sample Size
#Power
Data Analyst
•
Technical
•
hard
A/B test shows p=0.04, but the effect size is tiny. Would you ship?
#Practical Significance
#Decision Making
Data Analyst
•
Technical
•
medium
What is a novelty effect in experimentation? How do you account for it?
#Novelty Effect
#Bias
Data Analyst
•
Technical
•
hard
How do you handle multiple metrics in an A/B test (metric tradeoffs)?
#Multiple Metrics
#Tradeoffs
Data Analyst
•
Technical
•
medium
What makes a good data visualization? Walk me through your design principles.
#Design
#Communication
Data Analyst
•
Technical
•
medium
How would you explain statistical significance to a non-technical product manager?
#Storytelling
#Statistics
Data Analyst
•
Technical
•
easy
What tools do you use for dashboarding? Compare Tableau, Looker, and Metabase.
#Tableau
#Looker
Data Engineer
•
Behavioral
•
medium
Describe a situation where you disagreed with a senior engineer or product manager on a technical design choice. How did you resolve it?
#Conflict Resolution
#Collaboration
#Googleyness
Data Engineer
•
Behavioral
•
medium
Tell me about a time you simplified a complex data platform decision across multiple teams.
#Communication
#Stakeholders
Data Engineer
•
Behavioral
•
medium
Describe a situation where a data pipeline you owned went down in production. How did you handle it?
#On-Call
#Problem Solving
Data Engineer
•
Behavioral
•
medium
How do you handle disagreements with data analysts or scientists who want features that compromise pipeline reliability?
#Conflict Resolution
Data Engineer
•
Behavioral
•
medium
Tell me about a time you significantly improved the performance of a data system.
#Performance
#Optimization
Data Engineer
•
Behavioral
•
hard
Describe how you've balanced technical debt vs. new feature development in a data platform.
#Prioritization
Data Engineer
•
Behavioral
•
medium
Tell me about a time you onboarded a new data source that had significant quality issues.
#Problem Solving
Data Engineer
•
Behavioral
•
easy
Describe your experience mentoring junior data engineers.
#Mentoring
#Collaboration
Data Engineer
•
Behavioral
•
easy
How do you stay current with rapidly evolving data engineering tools and practices?
#Growth Mindset
Data Engineer
•
Behavioral
•
medium
Tell me about a time you had to design a data pipeline with highly ambiguous requirements. How did you figure out what to build?
#Ambiguity
#Googleyness
#Communication
Data Engineer
•
Behavioral
•
medium
Tell me about a time a data pipeline you owned failed in production. What was the business impact, and what steps did you take to fix it and prevent it from happening again?
#Incident Management
#Ownership
#Post-mortem
Data Engineer
•
Coding
•
medium
Given a list of user session time intervals on YouTube represented as [start_time, end_time], write a Python function to merge all overlapping sessions and return the consolidated active viewing periods.
#Arrays
#Sorting
#Intervals
Data Engineer
•
Coding
•
hard
Implement a rate limiter in Python for an API using a sliding window approach. The rate limiter should allow a maximum of N requests per minute per user.
#Queues
#Sliding Window
#Concurrency
Data Engineer
•
Coding
•
medium
Write a SQL query to calculate the rolling 7-day average of daily video views per category on YouTube, ensuring days with zero views are still accounted for in the average.
#Window Functions
#Aggregation
#Rolling Averages
Data Engineer
•
Coding
•
medium
Write a Python function to parse a massive (100GB+) log file of Google Search queries and return the top K most frequent IP addresses. You have limited RAM.
#Heaps
#Hash Maps
#Log Parsing
#Big O
Data Engineer
•
Coding
•
medium
Write a SQL query to find the top 3 highest-grossing apps in each region, but only include regions that have at least 100 active apps.
#CTEs
#Window Functions
#Filtering
Data Engineer
•
Coding
•
medium
Write a SQL query to find the second highest salary per department.
#Window Functions
#SQL
Data Engineer
•
Coding
•
medium
Write a SQL query to compute a 7-day rolling average of daily sales.
#Window Functions
#Analytics
Data Engineer
•
Coding
•
hard
Write a SQL query to find the longest streak of consecutive days a user has logged into Google Workspace. The input table has user_id and login_date.
#Window Functions
#Gaps and Islands
#CTEs
Data Engineer
•
System Design
•
hard
Design the data warehouse architecture for Google Play Store analytics. Stakeholders need daily reports on app downloads, revenue, and crash rates by region and device type.
#Data Warehousing
#BigQuery
#Schema Design
#ETL
Data Engineer
•
System Design
•
hard
Design a data pipeline for Google Search query logs at 100K QPS.
#Scale
#Google
Data Engineer
•
System Design
•
hard
Design a data model for an e-commerce platform tracking orders, users, and products.
#ER Modeling
#Dimensional Modeling
Data Engineer
•
System Design
•
hard
Design an ETL pipeline that ingests 10TB of raw clickstream data daily.
#ETL
#Batch Processing
Data Engineer
•
System Design
•
hard
How would you design a data pipeline that needs exactly-once delivery guarantees?
#Exactly-Once
#Kafka
Data Engineer
•
System Design
•
hard
How would you design a real-time anomaly detection pipeline for 100K events/sec?
#Real-Time
#Anomaly Detection
Data Engineer
•
System Design
•
hard
Design a real-time streaming data pipeline to detect click fraud in Google Ads. How would you ingest, process, and store the data to flag fraudulent clicks within seconds?
#Streaming
#Pub/Sub
#Dataflow
#Fraud Detection
Data Engineer
•
System Design
•
hard
How would you design a data warehouse for a ride-sharing company from scratch?
#Architecture
#Design
Data Engineer
•
System Design
•
hard
Design a batch processing pipeline to update Google Maps ETA prediction models based on daily historical traffic data. The data volume is petabytes per day.
#Batch Processing
#MapReduce
#DAGs
#Orchestration
Data Engineer
•
Technical
•
medium
Explain the difference between S3, HDFS, and GCS for data storage.
#S3
#HDFS
#GCS
Data Engineer
•
Technical
•
medium
Explain the concept of a data lakehouse. What are its advantages over a traditional data warehouse?
#Data Lakehouse
#Data Warehouse
Data Engineer
•
Technical
•
hard
How do you handle schema evolution in a data pipeline without breaking downstream consumers?
#Schema Evolution
#Backward Compatibility
Data Engineer
•
Technical
•
medium
What is a medallion architecture (Bronze/Silver/Gold)?
#Medallion
#Data Lake
Data Engineer
•
Technical
•
medium
How do you implement data quality checks in a production pipeline?
#Great Expectations
#Data Validation
Data Engineer
•
Technical
•
medium
What is data lineage and why is it important? How do you implement it?
#Lineage
#Metadata
Data Engineer
•
Technical
•
hard
How would you detect and handle data drift in a production system?
#Data Drift
#Monitoring
Data Engineer
•
Technical
•
medium
What is PII (Personally Identifiable Information) and how do you handle it in a data pipeline?
#PII
#Privacy
#Compliance
Data Engineer
•
Technical
•
medium
Explain the concept of a data catalog. What tools have you used?
#Data Catalog
#Metadata
Data Engineer
•
Technical
•
hard
Compare AWS Redshift, Google BigQuery, and Snowflake for a petabyte-scale warehouse.
#Redshift
#BigQuery
#Snowflake
Data Engineer
•
Technical
•
hard
How does BigQuery handle large joins efficiently? What is its columnar storage approach?
#BigQuery
#Columnar Storage
Data Engineer
•
Technical
•
medium
How would you reduce costs in a cloud-based data platform?
#Cloud
#Cost
Data Engineer
•
Technical
•
medium
What is infrastructure as code (IaC)? Have you used Terraform for data infrastructure?
#Terraform
#IaC
Data Engineer
•
Technical
•
hard
What is Data Vault methodology? How does it differ from Kimball?
#Data Vault
#Kimball
Data Engineer
•
Technical
•
hard
You have a PySpark job running on Dataproc that joins a massive user table with a smaller transaction table. The job is taking hours and failing with OOM errors due to data skew. How do you optimize it?
#Spark
#Data Skew
#Salting
#Performance Tuning
Data Engineer
•
Technical
•
medium
Explain the difference between partitioning and clustering in BigQuery. When would you use one over the other, and when would you use both?
#BigQuery
#Partitioning
#Clustering
#Optimization
Data Engineer
•
Technical
•
medium
In Apache Beam or Google Cloud Dataflow, how do you handle late-arriving data in a windowed streaming pipeline?
#Streaming
#Watermarks
#Late Data
#Apache Beam
Data Engineer
•
Technical
•
medium
What is the star schema vs snowflake schema? When would you use each?
#Star Schema
#Snowflake Schema
Data Engineer
•
Technical
•
medium
Explain compaction in Delta Lake / Iceberg. Why is it important?
#Compaction
#Performance
Data Engineer
•
Technical
•
hard
What is Delta Lake? How does it provide ACID transactions on data lakes?
#Delta Lake
#ACID
#Time Travel
Data Engineer
•
Technical
•
medium
Explain how Parquet and ORC file formats work and when you'd use each.
#Parquet
#ORC
#Columnar
Data Engineer
•
Technical
•
medium
What is the CAP theorem? Give an example of a real-world system tradeoff.
#CAP
#Consistency
#Availability
Data Engineer
•
Technical
•
medium
How does Kafka handle message ordering guarantees?
#Ordering
#Partitions
Data Engineer
•
Technical
•
medium
What is Apache Kafka? Explain topics, partitions, consumer groups, and offsets.
#Kafka
#Streaming
Data Engineer
•
Technical
•
hard
Explain the difference between map-side and reduce-side joins in MapReduce/Spark.
#Joins
#MapReduce
Data Engineer
•
Technical
•
hard
What is data skew in Spark? How do you diagnose and fix it?
#Data Skew
#Performance
Data Engineer
•
Technical
•
hard
Explain how Apache Spark's execution model works. What is a DAG in Spark?
#Spark
#DAG
#Distributed Computing
Data Engineer
•
Technical
•
easy
Explain the difference between push-based and pull-based data ingestion.
#Push
#Pull
#CDC
Data Engineer
•
Technical
•
medium
What is Apache Airflow? How does it differ from Prefect or Dagster?
#Airflow
#Prefect
#Dagster
Data Engineer
•
Technical
•
medium
How do you monitor data pipeline health in production? What metrics do you track?
#Monitoring
#Alerting
Data Engineer
•
Technical
•
medium
Describe how you'd implement circuit breakers in a data pipeline.
#Circuit Breakers
#Fault Tolerance
Data Engineer
•
Technical
•
hard
What is backfilling? How do you handle a backfill of 2 years of historical data without impacting production?
#Backfill
#Airflow
Data Engineer
•
Technical
•
hard
Explain the Lambda architecture. What are its tradeoffs vs Kappa architecture?
#Lambda
#Kappa
#Streaming
Data Engineer
•
Technical
•
medium
What is idempotency and why is it critical in data pipelines?
#Idempotency
#Data Quality
Data Engineer
•
Technical
•
hard
How do you handle late-arriving data in a streaming pipeline?
#Kafka
#Watermarks
Data Engineer
•
Technical
•
medium
Explain ACID properties. Which databases sacrifice ACID for performance and why?
#ACID
#Distributed Systems
Data Engineer
•
Technical
•
medium
What are CTEs (Common Table Expressions) and how do they differ from subqueries?
#CTEs
#SQL
Data Engineer
•
Technical
•
hard
Describe partitioning strategies in a data warehouse. When would you use range vs hash partitioning?
#Partitioning
#Performance
Data Engineer
•
Technical
•
medium
What is a materialized view? How does it differ from a regular view?
#Materialized Views
#Performance
Data Engineer
•
Technical
•
medium
Explain the difference between RANK(), DENSE_RANK(), and ROW_NUMBER().
#Window Functions
#SQL
Data Engineer
•
Technical
•
hard
How would you optimize a SQL query that is running slowly on a 1 billion row table?
#Query Optimization
#Indexing
Data Engineer
•
Technical
•
hard
What is a slowly changing dimension (SCD)? Describe SCD Type 1, 2, and 3 with examples.
#SCD
#Dimensional Modeling
Data Engineer
•
Technical
•
medium
Explain the difference between OLAP and OLTP systems. When would you use each?
#OLAP
#OLTP
#Databases
Data Engineer
•
Technical
•
hard
How would you model and optimize a BigQuery dataset for petabyte-scale ad-click attribution?
#BigQuery
#Attribution
Data Engineer
•
Technical
•
hard
Explain how Google Spanner achieves global consistency with TrueTime.
#Spanner
#TrueTime
Data Engineer
•
Technical
•
hard
How would you use Dataflow (Apache Beam) for a streaming aggregation job?
#Dataflow
#Beam
Data Engineer
•
Technical
•
hard
What is Bigtable? When would you choose it over BigQuery?
#Bigtable
#GCP
Data Engineer
•
Technical
•
medium
How do you optimize BigQuery costs for ad-hoc analytical queries?
#Cost
#Optimization
Data Scientist
•
Behavioral
•
medium
Describe a project where you had to iterate significantly on your initial approach.
#Iteration
#Learning
Data Scientist
•
Behavioral
•
medium
How do you prioritize between multiple data science requests from different teams?
#Stakeholder Management
Data Scientist
•
Behavioral
•
medium
How do you approach ethical considerations in ML model building?
#Fairness
#Bias
Data Scientist
•
Behavioral
•
hard
Tell me about a time your model failed in production. What did you learn?
#Production
#MLOps
Data Scientist
•
Behavioral
•
hard
Describe a time you used data to challenge a widely held assumption in your organization.
#Influence
#Analytics
Data Scientist
•
Behavioral
•
medium
Tell me about a data science project where the results surprised you. What did you do?
#Analytical Thinking
Data Scientist
•
Behavioral
•
medium
Describe how you communicated a complex model result to a non-technical stakeholder.
#Storytelling
Data Scientist
•
Behavioral
•
hard
Tell me about a time you had to push back on a business request for an analysis that would be misleading.
#Ethics
#Communication
Data Scientist
•
Coding
•
hard
Write a SQL query to calculate 30-day user retention.
#Retention
#Analytics
Data Scientist
•
Coding
•
hard
How would you write a funnel analysis query in SQL?
#Funnel
#Analytics
Data Scientist
•
Coding
•
medium
Write a query to identify duplicate records and deduplicate them.
#Deduplication
#Data Quality
Data Scientist
•
System Design
•
hard
Design a feature store. What are its key components?
#Feature Store
#MLOps
Data Scientist
•
System Design
•
hard
How would you build and deploy a churn prediction model?
#Churn
#MLOps
Data Scientist
•
System Design
•
hard
Design a real-time fraud detection system for a payments platform.
#Fraud Detection
#Real-Time ML
Data Scientist
•
System Design
•
hard
How would you build a recommendation system? Compare collaborative vs content-based filtering.
#Collaborative Filtering
#Content-Based
Data Scientist
•
Technical
•
medium
How do you choose a north star metric for a product?
#Metrics
#Product Strategy
Data Scientist
•
Technical
•
medium
How do you handle class imbalance in a classification problem?
#Imbalanced Data
#SMOTE
Data Scientist
•
Technical
•
medium
What is regularization? Explain L1 vs L2 regularization and their effects.
#Regularization
#L1
#L2
Data Scientist
•
Technical
•
medium
How does a Random Forest work? What are its hyperparameters and how do you tune them?
#Random Forest
#Hyperparameter Tuning
Data Scientist
•
Technical
•
hard
Explain gradient boosting. How does XGBoost differ from a standard gradient boosting machine?
#Gradient Boosting
#XGBoost
Data Scientist
•
Technical
•
medium
How would you detect and handle multicollinearity in a regression model?
#Multicollinearity
#Regression
Data Scientist
•
Technical
•
hard
Explain the curse of dimensionality and its implications for ML models.
#Dimensionality
#Feature Engineering
Data Scientist
•
Technical
•
medium
What is a confidence interval? How does it differ from a prediction interval?
#Confidence Interval
#Intervals
Data Scientist
•
Technical
•
hard
Explain Bayesian vs Frequentist statistics. When would you use each?
#Bayesian
#Frequentist
Data Scientist
•
Technical
•
hard
How do you evaluate the quality of a search ranking change at Google's scale?
#Search Ranking
#Evaluation
Data Scientist
•
Technical
•
hard
How do you design an A/B test for a new product feature?
#A/B Testing
#Statistics
Data Scientist
•
Technical
•
easy
What is the difference between Type I and Type II errors?
#Hypothesis Testing
#Errors
Data Scientist
•
Technical
•
medium
Explain the central limit theorem and its importance in data science.
#CLT
#Sampling
Data Scientist
•
Technical
•
medium
What is a p-value? Why is a p-value of 0.05 not always sufficient?
#Hypothesis Testing
#p-value
Data Scientist
•
Technical
•
hard
Explain how Google's NDCG metric works for search relevance.
#NDCG
#Relevance
Data Scientist
•
Technical
•
medium
Explain the bias-variance tradeoff. How does it influence model selection?
#Bias-Variance
#Model Selection
Data Scientist
•
Technical
•
hard
What statistical techniques would you use to analyse Search CTR experiments?
#CTR
#Statistics
Data Scientist
•
Technical
•
hard
What is the multiple testing problem? How do you correct for it?
#Bonferroni
#FDR
Data Scientist
•
Technical
•
medium
Explain the ROC curve and AUC metric. When would you prefer AUC over accuracy?
#ROC
#AUC
#Metrics
Data Scientist
•
Technical
•
hard
What is a network effect in experimentation? How do you handle SUTVA violation?
#SUTVA
#Network Effects
Data Scientist
•
Technical
•
hard
How would you design an experiment to measure the impact of a new ranking algorithm?
#Experimentation
#Metrics
Data Scientist
•
Technical
•
medium
How would you detect and mitigate overfitting in a neural network?
#Overfitting
#Dropout
#Regularization
Data Scientist
•
Technical
•
medium
Explain batch normalization and why it helps training.
#Batch Normalization
#Training
Data Scientist
•
Technical
•
medium
What is embedding? How do word embeddings like Word2Vec and GloVe work?
#Embeddings
#Word2Vec
Data Scientist
•
Technical
•
medium
How would you approach an NLP problem like sentiment analysis from scratch?
#Sentiment Analysis
#Text Classification
Data Scientist
•
Technical
•
medium
What is transfer learning? How would you fine-tune a pre-trained model?
#Transfer Learning
#Fine-Tuning
Data Scientist
•
Technical
•
hard
Explain the transformer architecture. What are attention mechanisms?
#Transformers
#Attention
#BERT
Data Scientist
•
Technical
•
hard
What is the vanishing gradient problem? How do LSTM and ResNet address it?
#LSTM
#ResNet
#Gradients
Data Scientist
•
Technical
•
medium
Explain how backpropagation works.
#Backpropagation
#Neural Networks
Data Scientist
•
Technical
•
medium
What is principal component analysis (PCA)? What are its limitations?
#PCA
#SVD
Data Scientist
•
Technical
•
medium
Explain the difference between bagging and boosting.
#Bagging
#Boosting
Data Scientist
•
Technical
•
medium
How do you approach feature selection?
#Feature Selection
#LASSO
Data Scientist
•
Technical
•
medium
What is cross-validation? Explain k-fold and stratified k-fold.
#Cross Validation
#k-Fold
Data Scientist
•
Technical
•
hard
How do you monitor model performance in production? What is model drift?
#Model Drift
#Monitoring
Data Scientist
•
Technical
•
easy
Explain the difference between INNER JOIN, LEFT JOIN, and CROSS JOIN.
#Joins
#SQL
Data Scientist
•
Technical
•
easy
What is an experiment holdout group?
#Holdout
#Control Group
Data Scientist
•
Technical
•
hard
How would you identify the root cause of a sudden 20% drop in DAU?
#Root Cause Analysis
#Debugging
Data Scientist
•
Technical
•
easy
Explain the difference between a leading indicator and a lagging indicator.
#Metrics
#KPIs
Machine Learning Engineer
•
Behavioral
•
medium
Describe a situation where you had to push back on a Product Manager who wanted to launch an ML feature that achieved high accuracy but failed to meet the required P99 latency SLA.
#Conflict Resolution
#Prioritization
#Cross-functional Collaboration
Machine Learning Engineer
•
Behavioral
•
medium
Tell me about a time you discovered a significant bias or data leakage in your ML model right before deployment. How did you handle it, and how did you communicate the delay to stakeholders?
#Googleyness
#Communication
#Model Debugging
#Ethics
Machine Learning Engineer
•
Coding
•
medium
Given a 2D grid representing a cluster of TPU v5e pods where '1' is an active pod and '0' is inactive, write an algorithm to find the maximum area of connected active pods. Pods are connected horizontally or vertically.
#Graph Theory
#Depth-First Search
#Breadth-First Search
#Matrix
Machine Learning Engineer
•
Coding
•
medium
Implement a custom sparse matrix-vector multiplication (SpMV) algorithm. Assume the sparse matrix is provided in Compressed Sparse Row (CSR) format.
#Linear Algebra
#Data Structures
#Performance Optimization
Machine Learning Engineer
•
Coding
•
hard
Write a function to sample a random node from a massive, distributed graph where you only have access to an API `get_neighbors(node_id)`. You do not know the total number of nodes.
#Randomized Algorithms
#Graph Theory
#Reservoir Sampling
#Markov Chains
Machine Learning Engineer
•
Coding
•
hard
Given an array of integers representing the execution times of ML training jobs and an integer K representing the number of available GPUs, partition the jobs to minimize the maximum execution time on any single GPU. Jobs must be scheduled in contiguous subarrays.
#Binary Search
#Greedy Algorithms
#Dynamic Programming
Machine Learning Engineer
•
System Design
•
hard
Design an autocomplete/typeahead system for Google Docs using a neural language model. The system must run within strict latency constraints (<50ms). How do you optimize the model and serving infrastructure?
#Low Latency Serving
#Model Quantization
#Sequence-to-Sequence
#Edge ML
Machine Learning Engineer
•
System Design
•
hard
Design the recommendation system for YouTube Shorts. Specifically, how would you handle the cold-start problem for new creators and optimize for real-time engagement metrics like watch time and swipe-aways?
#Recommendation Systems
#Two-Tower Models
#Cold Start
#Real-time Streaming
Machine Learning Engineer
•
System Design
•
hard
Design a system to predict Ad Click-Through Rate (CTR) for Google Search. How do you handle categorical features with massive cardinality, and how do you update the model with fresh data throughout the day?
#CTR Prediction
#Feature Engineering
#Continuous Training
#Embeddings
Machine Learning Engineer
•
System Design
•
medium
Design a system to detect policy-violating images (e.g., hate speech, extreme violence) uploaded to Google Drive. The system must process millions of images per minute with extreme precision to avoid false positives on user data.
#Computer Vision
#High Throughput
#Active Learning
#Anomaly Detection
Machine Learning Engineer
•
Technical
•
medium
Explain the mathematical difference between Layer Normalization and Batch Normalization. Why is Layer Normalization almost exclusively used in Transformer architectures instead of Batch Normalization?
#Normalization
#Transformers
#Mathematics
Machine Learning Engineer
•
Technical
•
hard
Explain how you would implement KV-caching in a Transformer model during autoregressive inference. What are the memory bottlenecks, and how do techniques like PagedAttention address them?
#Transformers
#LLM Inference
#Memory Optimization
#Attention Mechanisms
Machine Learning Engineer
•
Technical
•
medium
How would you evaluate the quality of a Retrieval-Augmented Generation (RAG) system built for Google Cloud enterprise search? What specific metrics would you use for the retrieval component vs. the generation component?
#RAG
#LLM Evaluation
#Information Retrieval
#Metrics
Machine Learning Engineer
•
Technical
•
medium
You are training a multimodal model (text and image) using a contrastive loss similar to CLIP. You notice the text loss converges much faster than the image loss, leading to poor alignment. How do you diagnose and fix this?
#Multimodal ML
#Loss Optimization
#Contrastive Learning
#Debugging
Machine Learning Engineer
•
Technical
•
hard
How does Distributed Data Parallel (DDP) differ from Fully Sharded Data Parallel (FSDP) or ZeRO optimization when training large language models? When would you choose one over the other?
#Distributed Training
#Model Parallelism
#Data Parallelism
#Memory Management
ML Engineer
•
Behavioral
•
medium
Tell me about a disagreement you had with a researcher. How did you resolve it?
#Communication
ML Engineer
•
Behavioral
•
easy
How do you keep up with the rapidly evolving ML landscape?
#Continuous Learning
ML Engineer
•
Behavioral
•
hard
Tell me about a time an ML model caused an unexpected real-world impact.
#Responsibility
#AI Safety
ML Engineer
•
Behavioral
•
medium
Describe how you collaborated with data scientists to productionize their research code.
#Research to Production
ML Engineer
•
Behavioral
•
hard
Tell me about a time you had to optimize a model for latency without sacrificing too much accuracy.
#Latency
#Accuracy
ML Engineer
•
Behavioral
•
medium
Describe a model you deployed to production. What were the biggest challenges?
#Deployment
#Challenges
ML Engineer
•
Behavioral
•
medium
How do you decide when a model is 'good enough' to ship?
#Quality
#Judgment
ML Engineer
•
Behavioral
•
hard
Describe a time you had to re-architecture a system because the original ML approach didn't scale.
#Scalability
ML Engineer
•
Coding
•
hard
Write a custom PyTorch Dataset and DataLoader for irregular time series data.
#PyTorch
#DataLoader
ML Engineer
•
Coding
•
hard
Implement logistic regression with gradient descent in NumPy.
#Logistic Regression
#NumPy
ML Engineer
•
Coding
•
hard
Implement a K-means clustering algorithm from scratch in Python.
#K-Means
#Clustering
ML Engineer
•
Coding
•
hard
How would you write a batched inference pipeline using Python and Triton server?
#Triton
#Batching
ML Engineer
•
Coding
•
medium
Implement a sliding window approach to detect anomalies in a time series.
#Anomaly Detection
#Time Series
ML Engineer
•
System Design
•
hard
Design a system to retrain models automatically when performance degrades.
#Retraining
#Automation
ML Engineer
•
System Design
•
hard
Design YouTube's video recommendation system end to end.
#Recommendations
#Ranking
ML Engineer
•
System Design
•
hard
How would you build a personalized ad targeting system?
#Targeting
#ML Systems
ML Engineer
•
System Design
•
hard
Design a training and serving architecture for a large language model at scale.
#Infrastructure
#Scale
ML Engineer
•
System Design
•
hard
Design a search ranking system for an e-commerce platform.
#Ranking
#Relevance
ML Engineer
•
System Design
•
hard
What is a feature store? Design one from scratch.
#Feature Engineering
#MLOps
ML Engineer
•
System Design
•
hard
Design a CI/CD pipeline for ML models.
#CI/CD
#Deployment
ML Engineer
•
System Design
•
hard
Design a real-time content moderation system.
#NLP
#Real-Time
ML Engineer
•
System Design
•
hard
How would you serve a model that needs to respond in under 10ms?
#Low Latency
#Serving
ML Engineer
•
Technical
•
hard
How do you detect data drift vs model drift? How do you respond to each?
#Drift
#Production
ML Engineer
•
Technical
•
medium
How would you deploy a model with Vertex AI Predictions?
#Vertex AI
#GCP
ML Engineer
•
Technical
•
medium
What is model ensembling? When does it help, and when does it hurt?
#Ensembling
#Performance
ML Engineer
•
Technical
•
medium
Explain vector databases. What are FAISS, Pinecone, and Weaviate?
#Vector DB
#Embeddings
ML Engineer
•
Technical
•
hard
How would you evaluate an LLM for a production use case?
#Evaluation
#Benchmarking
ML Engineer
•
Technical
•
medium
What is Vertex AI? How does it compare to SageMaker?
#Vertex AI
#SageMaker
ML Engineer
•
Technical
•
hard
What is quantization in neural networks? How does it reduce inference cost?
#Quantization
#Inference
ML Engineer
•
Technical
•
hard
Explain knowledge distillation. When would you use it?
#Distillation
#Compression
ML Engineer
•
Technical
•
hard
What is the difference between model parallelism and data parallelism in distributed training?
#Parallelism
#Training
ML Engineer
•
Technical
•
medium
How do you version ML models and datasets? What tools do you use?
#Versioning
#DVC
#MLflow
ML Engineer
•
Technical
•
hard
Explain blue-green deployment vs canary deployment for ML models.
#Blue-Green
#Canary
ML Engineer
•
Technical
•
medium
What is shadow mode deployment in ML?
#Shadow Mode
#A/B Testing
ML Engineer
•
Technical
•
hard
What is RAG (Retrieval-Augmented Generation)? Describe its architecture.
#RAG
#Vector Search
ML Engineer
•
Technical
•
hard
What is LoRA (Low-Rank Adaptation)? How does it reduce fine-tuning costs?
#LoRA
#Fine-Tuning
ML Engineer
•
Technical
•
hard
Explain the RLHF (Reinforcement Learning from Human Feedback) training approach.
#RLHF
#Fine-Tuning
ML Engineer
•
Technical
•
medium
How do you profile and debug a slow training run?
#Profiling
#Debugging
ML Engineer
•
Technical
•
medium
What are the differences between PyTorch and TensorFlow for production?
#PyTorch
#TensorFlow
ML Engineer
•
Technical
•
hard
Explain mixed precision training (FP16/BF16). What are the risks?
#Mixed Precision
#Performance
ML Engineer
•
Technical
•
hard
How do you optimize GPU utilization during training?
#GPU
#Performance
ML Engineer
•
Technical
•
medium
What is Kubernetes? How is it used for ML model serving?
#Kubernetes
#Serving
ML Engineer
•
Technical
•
medium
Explain model serialization formats: ONNX, TorchScript, SavedModel.
#ONNX
#Serialization
ML Engineer
•
Technical
•
hard
Explain how Google's Two-Tower model works for recommendations.
#Two-Tower
#Embeddings
ML Engineer
•
Technical
•
hard
How does Google's TensorFlow Extended (TFX) pipeline work?
#TFX
#Pipelines
ML Engineer
•
Technical
•
easy
What is the difference between a data scientist and an ML engineer?
#Roles
#MLOps
ML Engineer
•
Technical
•
medium
Explain the model training pipeline from raw data to deployment.
#Pipeline
#Training
ML Engineer
•
Technical
•
medium
What is the difference between online learning and offline learning?
#Online Learning
#Batch Learning
ML Engineer
•
Technical
•
medium
How do you handle missing data in ML model features?
#Imputation
#Missing Data
ML Engineer
•
Technical
•
medium
Explain gradient descent variants: batch, stochastic, and mini-batch.
#Gradient Descent
#Optimization
ML Engineer
•
Technical
•
medium
What are learning rate schedulers and why are they important?
#Learning Rate
#Training
ML Engineer
•
Technical
•
hard
Explain the attention mechanism in transformers with mathematical detail.
#Attention
#Transformers
Product Manager
•
Behavioral
•
medium
Tell me about a time you had to align conflicting stakeholders across engineering and design on a tight deadline.
#Stakeholder Management
#Conflict Resolution
#Cross-functional Collaboration
Product Manager
•
Behavioral
•
hard
Describe a situation where you had to convince an engineering team to build a feature they strongly disagreed with.
#Influence without Authority
#Engineering Collaboration
#Communication
Product Manager
•
Behavioral
•
medium
Tell me about a product or feature you launched that failed. What metrics indicated it failed, and what did you learn?
#Post-mortem
#Resilience
#Data-driven
Product Manager
•
Coding
•
easy
Write a SQL query to find the top 3 most watched YouTube video categories in the last 30 days, given a 'views' table and a 'videos' table.
#SQL
#Data Analysis
#YouTube
Product Manager
•
System Design
•
hard
Explain how Google Search autocomplete works at a high level and how you would scale it for a newly supported language.
#Google Search
#Latency
#Scalability
#Data Structures
Product Manager
•
System Design
•
hard
Design the backend architecture for a real-time collaborative editing feature in Google Docs.
#Google Docs
#Concurrency
#Distributed Systems
Product Manager
•
Technical
•
medium
You are the PM for Google Chrome. You have a proposed feature that increases page load speed by 5% but increases memory usage by 15%. Do you launch it?
#Google Chrome
#Trade-offs
#Performance Metrics
Product Manager
•
Technical
•
medium
Design a product for Google Maps that helps users find parking using AI.
#Google Maps
#Artificial Intelligence
#User Experience
Product Manager
•
Technical
•
hard
Microsoft is aggressively integrating ChatGPT into Bing. What should be Google Search's strategic response over the next 2 years?
#Google Search
#Competitive Analysis
#Generative AI
#Gemini
Product Manager
•
Technical
•
medium
YouTube Shorts engagement has dropped by 10% week-over-week. How do you investigate and resolve this?
#YouTube Shorts
#Root Cause Analysis
#Data Analytics
Product Manager
•
Technical
•
hard
How would you integrate Gemini (Google's LLM) into Google Workspace (Docs/Sheets) specifically for enterprise B2B users?
#Generative AI
#Google Workspace
#B2B Enterprise
Product Manager
•
Technical
•
medium
Estimate the total bandwidth consumed by Google Photos backups globally in a single day.
#Google Photos
#Fermi Problem
#Data Storage
Product Manager
•
Technical
•
easy
Design a Google Nest smart display device specifically for the elderly.
#Google Nest
#Accessibility
#Hardware PM
Product Manager
•
Technical
•
medium
How would you monetize Google Maps further without showing intrusive ads on the core map interface?
#Google Maps
#Monetization
#B2B APIs
Product Manager
•
Technical
•
hard
Should Google spin off YouTube into a completely separate company? Walk me through your strategic reasoning.
#YouTube
#Business Strategy
#Market Dynamics
Software Engineer
•
Behavioral
•
medium
Tell me about a time you pushed back on a product manager's feature request because you believed it would negatively impact system latency or reliability. How did you resolve the conflict?
#Communication
#Googlyness
#Prioritization
#Conflict Resolution
Software Engineer
•
Behavioral
•
medium
Describe a time when you had to lead a project across multiple timezones or distributed teams. How did you ensure alignment and handle communication breakdowns?
#Cross-functional Collaboration
#Leadership
#Project Management
Software Engineer
•
Behavioral
•
easy
Tell me about a time you discovered a critical bug or security vulnerability in a production system. What was your immediate action, and how did you ensure it wouldn't happen again?
#Incident Management
#Post-mortem
#Ownership
Software Engineer
•
Behavioral
•
medium
Tell me about a time you had to pivot your technical approach completely because of changing business requirements from product managers. How did you manage the transition and team morale?
#Adaptability
#Conflict Resolution
#Team Dynamics
Software Engineer
•
Behavioral
•
medium
Describe a situation where you had to navigate a highly ambiguous project with no clear technical direction. How did you define the milestones and deliver the solution?
#Ambiguity
#Project Management
#Googlyness
Software Engineer
•
Behavioral
•
medium
Tell me about a time you discovered a significant flaw in a system's architecture right before a major launch. How did you balance the need to ship on time with the need to fix the technical debt?
#Googlyness
#Decision Making
#Risk Management
#Communication
Software Engineer
•
Coding
•
medium
You are given a stream of Google Ads click events. Implement a sliding window counter that returns the number of clicks in the exact last 5 minutes. The stream is highly concurrent.
#Concurrency
#Sliding Window
#Queues
#Data Streams
Software Engineer
•
Coding
•
medium
Given a list of Google Calendar events for 'n' users, where each event consists of a start and end time, and a required meeting duration 'k', find all available time slots where all 'n' users can attend a meeting.
#Intervals
#Two Pointers
#Sorting
Software Engineer
•
Coding
•
medium
Implement a system for Google Calendar that takes in a list of N users' daily schedules (lists of busy intervals) and their working hours, and returns all available time slots of duration T where all N users can meet.
#Intervals
#Sorting
#Two Pointers
#Arrays
Software Engineer
•
Coding
•
hard
Given a list of daily budgets and expected returns for various Google Ads campaigns, write an algorithm to allocate a total budget B to maximize the overall return. You can allocate partial budgets to campaigns, but returns diminish non-linearly.
#Dynamic Programming
#Greedy Algorithms
#Optimization
#Math
Software Engineer
•
Coding
•
hard
Design a data structure for Google Search Autocomplete that supports adding new search queries, updating the frequency of a query, and retrieving the top 'k' most frequent queries that start with a given prefix in real-time.
#Trie
#Heap
#Hash Map
#Design
Software Engineer
•
Coding
•
hard
You are given a map represented as a weighted directed graph. You need to route an electric vehicle from point A to point B. The EV has a maximum battery capacity, and certain nodes have charging stations. Find the shortest path such that the EV never runs out of battery.
#Graphs
#Dijkstra's Algorithm
#State Space Search
Software Engineer
•
Coding
•
medium
You are given a map represented as a 2D grid where 0 is a road, 1 is a building, and 2 is an EV charging station. Given a starting position, find the shortest path to an EV charging station. Follow-up: How would you optimize this if we have millions of queries per second for different starting positions on a static map?
#Graph Theory
#BFS
#Dynamic Programming
#Caching
Software Engineer
•
Coding
•
medium
Implement a thread-safe LRU (Least Recently Used) cache. This cache will be used in a high-throughput microservice. Explain how you would minimize lock contention.
#Concurrency
#Data Structures
#Linked List
#Hash Map
Software Engineer
•
Coding
•
hard
You are building a feature for Google Docs. Given a string representing a document and a list of operations (insert, delete, replace at specific indices), apply the operations efficiently. How do you handle overlapping operations?
#String Manipulation
#Operational Transformation
#Design
Software Engineer
•
Coding
•
medium
Given a stream of user watch events on YouTube (user_id, video_id, timestamp, duration_watched), write a function to find the longest contiguous sequence of videos a user watched where they completed at least 90% of each video.
#Sliding Window
#Hash Map
#Stream Processing
Software Engineer
•
Coding
•
hard
Design a data structure for Google Search autocomplete. It must support inserting a string with a frequency, and querying the top K most frequent strings that start with a given prefix. Optimize for query latency.
#Trie
#Priority Queue
#Design
#String Manipulation
Software Engineer
•
Coding
•
medium
Implement a function to evaluate a mathematical expression given as a string (e.g., '3 + 5 / (2 - 1) * 4'). The expression can contain parentheses, and you must follow standard order of operations. This is used in Google Search's calculator widget.
#Stacks
#String Parsing
#Math
Software Engineer
•
Coding
•
medium
You are given an array of integers representing the CPU load of a Google server cluster over time. Find the maximum contiguous subarray sum, but you are allowed to delete exactly one element to maximize the sum.
#Dynamic Programming
#Arrays
#Kadane's Algorithm
Software Engineer
•
Coding
•
hard
Given a 2D grid representing a Google Maps satellite image where '1' is land and '0' is water, find the minimum number of days to connect two disconnected islands. You can change one '0' to '1' per day.
#BFS
#DFS
#Matrix
Software Engineer
•
System Design
•
medium
Design a distributed rate limiter for Google Cloud API Gateway that can handle millions of requests per second with minimal latency overhead.
#Redis
#Token Bucket
#Distributed Systems
#Hashing
Software Engineer
•
System Design
•
hard
Design the real-time view counter for YouTube live streams. The system must handle massive spikes in traffic (e.g., a Super Bowl stream) and provide eventually consistent view counts to the UI with sub-second latency.
#Stream Processing
#Event Sourcing
#Scalability
#Data Aggregation
Software Engineer
•
System Design
•
hard
Design Google Photos' auto-backup feature for mobile devices. How do you handle intermittent network connectivity, deduplication, and efficient storage?
#Blob Storage
#Checksum
#Mobile Sync
#Resumable Uploads
Software Engineer
•
System Design
•
hard
Design Google Search's ranking pipeline.
#Ranking
#Scale
Software Engineer
•
System Design
•
hard
Design the backend for Google Docs collaborative editing. How do you handle concurrent edits from multiple users offline and online to ensure eventual consistency?
#Operational Transformation
#CRDTs
#WebSockets
#Concurrency
Software Engineer
•
System Design
•
hard
Design a distributed, highly available rate limiter for Google Cloud APIs. It needs to support millions of requests per second, enforce limits per customer per API, and add minimal latency to the critical path.
#Distributed Caching
#Token Bucket
#Redis
#High Availability
Software Engineer
•
System Design
•
hard
How would you design a distributed key-value store like Bigtable?
#Key-Value Store
Software Engineer
•
System Design
•
hard
Design the block-level file synchronization mechanism for Google Drive. How do you handle concurrent edits offline, minimize bandwidth for large files, and resolve conflicts when the client reconnects?
#Distributed Systems
#Data Synchronization
#Concurrency
#Network Optimization
Software Engineer
•
System Design
•
hard
Design the video recommendation feed for YouTube. Focus on how you would fetch, rank, and serve the recommendations at scale within a 200ms latency budget.
#Machine Learning Infra
#Caching
#Microservices
#Recommendation Systems
Software Engineer
•
System Design
•
hard
Design a distributed web crawler for Google Search. How do you handle DNS resolution bottlenecks, avoid crawler traps, prioritize high-quality domains, and ensure you don't DDoS the target servers?
#Distributed Systems
#Graph Traversal
#Politeness Policies
#Queueing
Software Engineer
•
Technical
•
medium
Implement a thread-safe LRU cache in your language of choice. It must support get() and put() in O(1) time, and handle concurrent access from multiple threads without race conditions or deadlocks.
#Concurrency
#Hash Map
#Doubly Linked List
#Mutex
Software Engineer
•
Technical
•
medium
Given two tables: `search_logs` (query_id, user_id, query_string, timestamp, region) and `clicks` (query_id, url_clicked, rank_position), write an optimized SQL query to find the top 3 queries with the highest click-through rate in the 'US' region over the last 7 days, partitioned by day.
#Window Functions
#Joins
#Aggregations
#Performance Optimization
Software Engineer
•
Technical
•
hard
What is MapReduce? How does it work at Google's scale?
#MapReduce
Difficulty Radar
Based on recent AI-sourced data.
Meet Your Interviewers
The "Standard" Interviewer
Senior EngineerFocuses on core competencies, system constraints, and clear communication.
SimulateUnwritten Rules
Think Out Loud
Always explain your thought process before writing code or drawing architecture.