KPMG
Multinational professional services network and one of the Big Four accounting organizations.
4 Rounds
~21 Days
Medium
The Interview Loop
Recruiter Screen (30 min)
Standard fit check, behavioral questions, and resume overview.
Technical Loop (3-4 Rounds)
Deep dive into domain knowledge, coding, and system design.
Interview Question Bank
Backend Engineer
•
Behavioral
•
medium
Tell me about a time you had to explain a complex backend architectural decision to a non-technical client or stakeholder.
#Communication
#Consulting
#Stakeholder Management
Backend Engineer
•
Behavioral
•
medium
Describe a situation where you had to push back on a client's unrealistic technical requirement. How did you handle it?
#Negotiation
#Client Management
#Requirements Gathering
Backend Engineer
•
Behavioral
•
medium
How do you handle working on a legacy codebase that lacks documentation, which is a common scenario during our audit technology migrations?
#Adaptability
#Legacy Systems
#Problem Solving
Backend Engineer
•
Behavioral
•
hard
Tell me about a time you found a critical security or compliance vulnerability in a system you were developing.
#Security
#Compliance
#Integrity
Backend Engineer
•
Behavioral
•
medium
Describe a time when you had to deliver a backend feature under a very tight deadline for a major tax or advisory client.
#Time Management
#Prioritization
#Stress Management
Backend Engineer
•
Behavioral
•
easy
How do you prioritize tasks when assigned to multiple client engagements simultaneously?
#Organization
#Multitasking
#Agile
Backend Engineer
•
Behavioral
•
medium
Tell me about a time you disagreed with a Senior Architect's design decision. How did you resolve it?
#Conflict Resolution
#Technical Discussions
#Teamwork
Backend Engineer
•
Coding
•
medium
Find the Kth largest element in an unsorted array.
#Heaps
#Quickselect
#Arrays
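One possible approach to the question above, sketched with a bounded min-heap from Python's standard `heapq` module. The function name `kth_largest` and the convention that `k=1` means the maximum are assumptions made for illustration. Quickselect is another common answer and is not shown here.

```python
import heapq
from typing import List


def kth_largest(nums: List[int], k: int) -> int:
    """Return the k-th largest element, where k=1 is the maximum."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")
    heap: List[int] = []  # min-heap holding the k largest values seen so far
    for n in nums:
        if len(heap) < k:
            heapq.heappush(heap, n)
        elif n > heap[0]:
            # n beats the smallest of the current top k, so swap it in
            heapq.heapreplace(heap, n)
    return heap[0]  # smallest of the k largest values
```

This runs in O(n log k) time with O(k) extra space, which tends to suit streaming input better than sorting the whole array.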
Backend Engineer
•
Coding
•
medium
Write a function to group anagrams together from an array of strings.
#Hash Tables
#Strings
#Sorting
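A minimal sketch for the anagram-grouping question, assuming that two strings are anagrams when their sorted characters match. The name `group_anagrams` is illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_anagrams(words: List[str]) -> List[List[str]]:
    """Group words that are anagrams of one another."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for word in words:
        # Sorted characters act as a canonical key shared by all anagrams
        groups[tuple(sorted(word))].append(word)
    return list(groups.values())
```

Sorting each word costs O(m log m) for a word of length m. A character-count key would avoid the sort for fixed alphabets, at the cost of a slightly longer implementation.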
Backend Engineer
•
Coding
•
hard
Implement a rate limiter (e.g., a token bucket).
#System Design
#Concurrency
#Algorithms
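A sketch of a single-process token bucket, offered as one way to answer the question above. The class name `TokenBucket`, the `allow` method, and the injectable `clock` parameter are assumptions made for illustration. A distributed limiter would need shared state, which this sketch does not cover.

```python
import threading
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if capacity <= 0 or refill_rate <= 0:
            raise ValueError("capacity and refill_rate must be positive")
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()
        self._lock = threading.Lock()   # guards token state across threads

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available and report whether the call may proceed."""
        with self._lock:
            now = self._clock()
            elapsed = now - self._last
            self._last = now
            # Refill lazily based on elapsed time, capped at capacity
            self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False
```

Refilling lazily on each call avoids a background timer thread, and passing a fake `clock` keeps the behavior easy to unit test.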
Backend Engineer
•
Coding
•
easy
Given a string containing just the characters '(', ')', '{', '}', '[' and ']', determine if the input string is valid.
#Stacks
#Strings
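A stack-based sketch for the bracket-validation question. The function name `is_valid` is illustrative, and the input is assumed to contain only the six bracket characters named in the prompt.

```python
def is_valid(s: str) -> bool:
    """Return True if every bracket is closed by the same type in the correct order."""
    pairs = {")": "(", "}": "{", "]": "["}
    stack = []
    for ch in s:
        if ch in pairs:
            # A closer must match the most recently opened bracket
            if not stack or stack.pop() != pairs[ch]:
                return False
        else:
            stack.append(ch)
    return not stack  # leftover openers mean the string is unbalanced
```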
Backend Engineer
•
Coding
•
medium
Write an algorithm to merge overlapping intervals (useful for scheduling audit tasks).
#Arrays
#Sorting
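A minimal sort-and-sweep sketch for the interval question. It assumes closed intervals given as `[start, end]` pairs and treats touching intervals such as `[1, 4]` and `[4, 5]` as overlapping. The name `merge_intervals` is illustrative.

```python
from typing import List


def merge_intervals(intervals: List[List[int]]) -> List[List[int]]:
    """Merge overlapping [start, end] intervals and return them sorted by start."""
    merged: List[List[int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the last merged interval, so extend it if needed
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged
```

Sorting dominates the cost at O(n log n). The sweep itself is linear.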
Backend Engineer
•
Coding
•
easy
Given an array of integers, return indices of the two numbers such that they add up to a specific target.
#Arrays
#Hash Tables
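A one-pass hash-map sketch for the two-sum question. The name `two_sum` and the choice to return `None` when no pair exists are assumptions, since the prompt does not say how to handle a missing answer.

```python
from typing import Dict, List, Optional, Tuple


def two_sum(nums: List[int], target: int) -> Optional[Tuple[int, int]]:
    """Return indices (i, j) with i < j and nums[i] + nums[j] == target, or None."""
    seen: Dict[int, int] = {}  # value -> index where it was first seen
    for i, n in enumerate(nums):
        j = seen.get(target - n)
        if j is not None:
            return (j, i)
        # Keep the first index for each value so duplicates still pair up
        seen.setdefault(n, i)
    return None
```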
Backend Engineer
•
Coding
•
easy
Write a function to detect if there is a cycle in a linked list.
#Linked Lists
#Two Pointers
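A sketch of Floyd's slow and fast pointer technique for the cycle question. The `ListNode` class is a hypothetical minimal node type defined here only so the example is self-contained.

```python
from typing import Optional


class ListNode:
    """Hypothetical singly linked list node used for illustration."""

    def __init__(self, val: int, next: "Optional[ListNode]" = None) -> None:
        self.val = val
        self.next = next


def has_cycle(head: Optional[ListNode]) -> bool:
    """Return True if following next pointers from head eventually loops."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next       # advances one node per step
        fast = fast.next.next  # advances two nodes per step
        if slow is fast:
            return True        # the pointers can only meet inside a cycle
    return False
```

This uses O(1) extra space, unlike the alternative of recording visited nodes in a set.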
Backend Engineer
•
Coding
•
medium
Design and implement a Least Recently Used (LRU) cache.
#Data Structures
#Hash Tables
#Linked Lists
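A compact sketch of an LRU cache. Instead of a hand-written hash map plus doubly linked list, it leans on `collections.OrderedDict` from the standard library, which keeps keys in order and supports `move_to_end` and `popitem(last=False)`. An interviewer may still ask for the explicit linked-list version. The class and method names are illustrative.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Optional[Any] = None) -> Any:
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry
```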
Backend Engineer
•
Coding
•
medium
Write a SQL query to find the top 3 highest-paid employees in each department.
#SQL
#Window Functions
Backend Engineer
•
Coding
•
hard
Given a binary tree, write a function to serialize and deserialize it.
#Trees
#BFS
#DFS
#Design
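One way to approach the serialization question is a preorder (DFS) traversal with an explicit marker for missing children. The sketch below assumes integer node values, a comma-separated string format, and `#` as the null marker. All of these are illustrative choices, as is the `TreeNode` class.

```python
from typing import Optional


class TreeNode:
    """Hypothetical binary tree node used for illustration."""

    def __init__(self, val: int, left: "Optional[TreeNode]" = None,
                 right: "Optional[TreeNode]" = None) -> None:
        self.val = val
        self.left = left
        self.right = right


def serialize(root: Optional[TreeNode]) -> str:
    """Encode the tree in preorder, writing '#' for each missing child."""
    out = []

    def walk(node: Optional[TreeNode]) -> None:
        if node is None:
            out.append("#")
            return
        out.append(str(node.val))
        walk(node.left)
        walk(node.right)

    walk(root)
    return ",".join(out)


def deserialize(data: str) -> Optional[TreeNode]:
    """Rebuild a tree from the string produced by serialize."""
    tokens = iter(data.split(","))

    def build() -> Optional[TreeNode]:
        tok = next(tokens)
        if tok == "#":
            return None
        node = TreeNode(int(tok))
        node.left = build()   # preorder: left subtree comes first
        node.right = build()
        return node

    return build()
```

Because both functions recurse, very deep trees could hit Python's recursion limit. An iterative or level-order (BFS) encoding would avoid that.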
Backend Engineer
•
System Design
•
hard
Design a system to synchronize offline audit data collected by field consultants once they regain internet connectivity.
#Offline-First
#Conflict Resolution
#Synchronization
Backend Engineer
•
System Design
•
hard
How would you design a data ingestion pipeline that processes daily CSV dumps from legacy client systems into our modern data lake?
#ETL
#Data Pipelines
#Cloud Architecture
Backend Engineer
•
System Design
•
medium
Design a real-time notification system to alert consultants when a client uploads a time-sensitive compliance document.
#WebSockets
#Pub/Sub
#Real-time
Backend Engineer
•
System Design
•
medium
Design an API rate-limiting service to protect our public-facing advisory APIs from DDoS attacks or abuse.
#API Gateway
#Caching
#Security
Backend Engineer
•
System Design
•
hard
How would you design a distributed reporting engine that generates end-of-month financial reports for millions of accounts?
#Batch Processing
#Distributed Systems
#MapReduce
Backend Engineer
•
System Design
•
medium
Design a scalable audit logging service that records every action taken by users across multiple KPMG microservices.
#Event Sourcing
#High Throughput
#Databases
Backend Engineer
•
System Design
•
hard
Design a secure document management system for tax professionals to upload, parse, and store client financial records.
#Storage
#Security
#Asynchronous Processing
#Microservices
Backend Engineer
•
System Design
•
medium
Design a URL shortener service. How would you ensure high availability and low latency?
#Hashing
#Caching
#Scalability
Backend Engineer
•
Technical
•
hard
How do you implement secure authentication and authorization in a Spring Boot or .NET Core application handling sensitive financial data?
#OAuth2
#JWT
#Spring Security
#Identity Access Management
Backend Engineer
•
Technical
•
medium
Explain the differences between Monolithic and Microservices architectures. When would you recommend a client stick with a Monolith?
#Microservices
#Monolith
#System Architecture
Backend Engineer
•
Technical
•
medium
What are the common ways to optimize a slow-running SQL query in SQL Server or PostgreSQL?
#Query Optimization
#Indexing
#Execution Plans
Backend Engineer
•
Technical
•
medium
Describe the differences between REST and GraphQL. When would you choose one over the other for a client portal?
#REST
#GraphQL
#Web Services
Backend Engineer
•
Technical
•
hard
How do you handle database migrations and schema changes with zero downtime?
#Database Migrations
#CI/CD
#Zero Downtime Deployment
Backend Engineer
•
Technical
•
easy
Explain the concept of Dependency Injection and its benefits in enterprise application development.
#Dependency Injection
#Inversion of Control
#Testing
Backend Engineer
•
Technical
•
easy
What is the difference between an Inner Join and a Left Join? How do they impact performance on large financial datasets?
#SQL
#Database Optimization
#Joins
Backend Engineer
•
Technical
•
hard
How do you ensure ACID properties in a distributed microservices environment?
#Distributed Systems
#Transactions
#Saga Pattern
#Two-Phase Commit
Backend Engineer
•
Technical
•
hard
Explain how Garbage Collection works in Java or .NET and how you would troubleshoot a memory leak in a production backend.
#Garbage Collection
#Memory Management
#Profiling
Backend Engineer
•
Technical
•
medium
What are the SOLID principles? Can you give an example of how you violated one in the past and fixed it?
#OOP
#SOLID
#Clean Code
Cloud Engineer
•
Behavioral
•
hard
Tell me about a time you disagreed with a senior architect or a client regarding a cloud design choice (e.g., choosing IaaS over PaaS). How did you resolve it?
#Conflict Resolution
#Technical Influence
#Collaboration
Cloud Engineer
•
Behavioral
•
medium
Tell me about a time you had to explain a complex cloud architecture to a non-technical client stakeholder. How did you ensure they understood the value and risks?
#Communication
#Client Management
#Consulting
Cloud Engineer
•
Behavioral
•
medium
Describe a situation where a client's cloud migration project was falling behind schedule due to unforeseen technical debt. How did you handle it?
#Project Management
#Problem Solving
#Agile
Cloud Engineer
•
Behavioral
•
medium
Describe a time you had to quickly learn a new cloud service or tool to meet a strict client deadline.
#Adaptability
#Continuous Learning
Cloud Engineer
•
Behavioral
•
easy
Why are you interested in a Cloud Engineering role at KPMG specifically, as opposed to a product-focused tech company?
#Motivation
#Consulting Mindset
Cloud Engineer
•
Coding
•
medium
Implement a retry mechanism with exponential backoff in Python for an API call to a cloud service that frequently rate-limits.
#Python
#API
#Resiliency
#Algorithms
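A minimal sketch of retry with exponential backoff and full jitter. The function name `retry_with_backoff`, its parameters, and the injectable `sleep` hook are assumptions made for illustration. A real client would likely retry only on the specific rate-limit error its SDK raises, rather than on every exception.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on failure with exponentially growing, jittered delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            # Delay ceiling doubles each attempt: base, 2*base, 4*base, ...
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads retries so clients do not retry in lockstep
            sleep(random.uniform(0, ceiling))
    raise RuntimeError("unreachable")  # loop always returns or raises
```

Usage would look like `retry_with_backoff(lambda: call_cloud_api())`, where `call_cloud_api` stands in for whatever request is being rate-limited.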
Cloud Engineer
•
Coding
•
medium
Write a Python script using Boto3 (or Azure SDK) to find and stop all EC2 instances (or VMs) that do not have a specific 'Environment' tag.
#Python
#Boto3
#Automation
#Cloud SDK
Cloud Engineer
•
Coding
•
medium
Write a Bash script to parse a web server log file, count the number of 500 HTTP status codes, and output the top 5 IP addresses causing them.
#Bash
#Linux
#Log Parsing
#Awk/Sed
Cloud Engineer
•
Coding
•
easy
Given a JSON payload representing cloud billing data, write a Python function to aggregate the total cost per service.
#Python
#Data Manipulation
#JSON
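A sketch for the billing aggregation question. The payload shape, a top-level `line_items` list whose entries carry `service` and `cost` fields, is a made-up example because the prompt does not specify a schema. `Decimal` is used to avoid floating-point rounding in currency totals.

```python
import json
from collections import defaultdict
from decimal import Decimal
from typing import Dict


def total_cost_per_service(payload: str) -> Dict[str, Decimal]:
    """Sum the cost of every line item, grouped by service name."""
    # parse_float=Decimal keeps JSON numbers exact instead of converting to float
    data = json.loads(payload, parse_float=Decimal)
    totals: Dict[str, Decimal] = defaultdict(Decimal)  # Decimal() is zero
    for item in data["line_items"]:  # hypothetical field name
        totals[item["service"]] += Decimal(str(item["cost"]))
    return dict(totals)
```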
Cloud Engineer
•
Coding
•
hard
Write a Python function to check if a given CIDR block overlaps with a list of existing CIDR blocks in a VPC.
#Python
#Networking
#IP Addressing
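A sketch for the CIDR question using the standard library `ipaddress` module, whose network objects provide an `overlaps` method. The function name `find_overlaps` and the choice to return the conflicting blocks rather than a boolean are assumptions.

```python
import ipaddress
from typing import List


def find_overlaps(candidate: str, existing: List[str]) -> List[str]:
    """Return the existing CIDR blocks that overlap the candidate block."""
    new_net = ipaddress.ip_network(candidate)
    conflicts = []
    for cidr in existing:
        net = ipaddress.ip_network(cidr)
        # IPv4 and IPv6 ranges never overlap, so compare like with like
        if net.version == new_net.version and new_net.overlaps(net):
            conflicts.append(cidr)
    return conflicts
```

An empty result means the candidate block is safe to allocate against the listed ranges. Note that `ip_network` raises `ValueError` when host bits are set, for example `10.0.1.5/24`.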
Cloud Engineer
•
System Design
•
hard
Design a centralized logging and monitoring solution for a multi-cloud environment spanning AWS and Azure.
#Multi-cloud
#Logging
#Monitoring
#SIEM
Cloud Engineer
•
System Design
•
medium
Explain how you would design an event-driven architecture using serverless components for a tax document processing system.
#Serverless
#Event-Driven
#AWS Lambda
#Azure Functions
Cloud Engineer
•
System Design
•
hard
Design a secure hub-and-spoke network topology in Azure. How do you handle routing, firewall rules, and isolation for different client departments?
#Azure Networking
#Hub and Spoke
#VNet Peering
#Azure Firewall
Cloud Engineer
•
System Design
•
medium
A client is experiencing unexpectedly high cloud costs. How would you architect a cost-optimization strategy for their AWS environment?
#FinOps
#Cost Optimization
#AWS
#Right-sizing
Cloud Engineer
•
System Design
•
hard
Design a highly available, multi-region web application on Azure for a financial services client with strict data residency and compliance requirements.
#Azure
#High Availability
#Compliance
#Traffic Manager
Cloud Engineer
•
System Design
•
medium
A client wants to migrate their legacy monolithic application to AWS. Walk me through your assessment and migration strategy.
#AWS
#Migration
#6 R's
#Assessment
Cloud Engineer
•
System Design
•
hard
Walk me through the architecture of a secure data lake in GCP for an audit analytics platform.
#GCP
#Data Lake
#BigQuery
#Security
Cloud Engineer
•
System Design
•
hard
How would you design a disaster recovery strategy for an enterprise database hosted in the cloud with an RPO of 5 minutes and RTO of 1 hour?
#Disaster Recovery
#RPO/RTO
#Database
#Replication
Cloud Engineer
•
Technical
•
hard
How does Azure Active Directory (now Microsoft Entra ID) integrate with on-premises Active Directory? Explain the authentication flow.

#Azure AD
#Hybrid Identity
#Authentication
Cloud Engineer
•
Technical
•
medium
Explain the concept of least privilege in IAM. How do you audit and enforce it in a large AWS environment?
#IAM
#Security
#AWS IAM Access Analyzer
Cloud Engineer
•
Technical
•
hard
How do you troubleshoot a scenario where a pod in a Kubernetes cluster cannot connect to an external managed database?
#Kubernetes
#Troubleshooting
#Networking
Cloud Engineer
•
Technical
•
easy
Explain how you would use Azure DevOps to enforce branch policies and code quality checks before merging.
#Azure DevOps
#Git
#Code Quality
Cloud Engineer
•
Technical
•
hard
How do you implement infrastructure drift detection and remediation using Terraform?
#Terraform
#Drift Detection
#Automation
Cloud Engineer
•
Technical
•
medium
How do you write unit and integration tests for Infrastructure as Code?
#IaC
#Testing
#Terratest
#Checkov
Cloud Engineer
•
Technical
•
medium
Describe your approach to blue/green deployments versus canary deployments. When is each appropriate for a client?
#Deployment Strategies
#Blue/Green
#Canary
Cloud Engineer
•
Technical
•
hard
How do you ensure compliance (e.g., HIPAA or PCI-DSS) when designing a cloud infrastructure for a healthcare client?
#Compliance
#Security
#Encryption
#Audit
Cloud Engineer
•
Technical
•
medium
Compare and contrast AKS (Azure Kubernetes Service) and Azure App Service. When would you recommend one over the other to a client?
#Azure
#Containers
#PaaS
#Kubernetes
Cloud Engineer
•
Technical
•
medium
How do you manage Terraform state files securely in a team environment, specifically for a client with strict access controls?
#Terraform
#IaC
#Security
Cloud Engineer
•
Technical
•
medium
Explain the difference between Terraform modules and workspaces. Give an example of how you've used them in an enterprise environment.
#Terraform
#IaC
#Code Reusability
Cloud Engineer
•
Technical
•
hard
Describe the steps to build a CI/CD pipeline to deploy a Dockerized application to an EKS cluster securely.
#CI/CD
#Kubernetes
#Docker
#Security
Cloud Engineer
•
Technical
•
medium
What is GitOps, and how would you implement it using ArgoCD or Flux for a client's Kubernetes workloads?
#GitOps
#Kubernetes
#ArgoCD
#Continuous Deployment
Cloud Engineer
•
Technical
•
medium
How do you handle secrets management in a CI/CD pipeline and within the cloud environment (e.g., Azure Key Vault, HashiCorp Vault)?
#Secrets Management
#CI/CD
#Azure Key Vault
Cloud Engineer
•
Technical
•
hard
A deployment to production just failed and brought down the client's live environment. Walk me through your troubleshooting and rollback steps.
#Incident Response
#Troubleshooting
#Rollback
Cloud Engineer
•
Technical
•
medium
Explain the difference between a NAT Gateway, an Internet Gateway, and a Transit Gateway in AWS.
#AWS Networking
#VPC
#Routing
Cloud Engineer
•
Technical
•
medium
What are VPC endpoints (or Azure Private Link), and why are they critical for enterprise security?
#Cloud Networking
#Security
#Private Link
Data Engineer
•
Behavioral
•
medium
Tell me about a time you discovered a critical data quality issue right before a client deliverable was due.
#Data Quality
#Crisis Management
#Integrity
Data Engineer
•
Behavioral
•
medium
Describe a situation where a client changed the requirements of a data pipeline midway through the sprint. How did you handle it?
#Agile
#Scope Creep
#Client Management
Data Engineer
•
Behavioral
•
medium
Tell me about a time you had to explain a complex data engineering concept to a non-technical client stakeholder.
#Stakeholder Management
#Communication
#Consulting
Data Engineer
•
Behavioral
•
medium
Describe a time you disagreed with a senior architect or manager on a technical design. How was it resolved?
#Teamwork
#Technical Disagreement
#Professionalism
Data Engineer
•
Behavioral
•
easy
How do you prioritize tasks when working on multiple client engagements with competing deadlines?
#Prioritization
#Consulting
#Organization
Data Engineer
•
Coding
•
easy
Implement a Python function to detect and remove duplicate records based on a composite key, keeping the most recently updated record.
#Deduplication
#Pandas
#Data Cleaning
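A plain-Python sketch of the deduplication question. The card is tagged with Pandas, but this version uses only the standard library so it stays self-contained. The function name, the `updated_at` field name, and the assumption that timestamps are mutually comparable (for example ISO 8601 strings or `datetime` objects) are all illustrative.

```python
from typing import Any, Dict, Iterable, List, Sequence, Tuple


def dedupe_latest(records: Iterable[Dict[str, Any]],
                  key_fields: Sequence[str],
                  updated_field: str = "updated_at") -> List[Dict[str, Any]]:
    """Keep one record per composite key, choosing the most recently updated."""
    latest: Dict[Tuple[Any, ...], Dict[str, Any]] = {}
    for rec in records:
        key = tuple(rec[f] for f in key_fields)  # composite key as a tuple
        current = latest.get(key)
        if current is None or rec[updated_field] > current[updated_field]:
            latest[key] = rec
    return list(latest.values())
```

When two records share the same timestamp, this keeps the first one encountered. That tie-break rule is a choice the interviewer may want stated explicitly.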
Data Engineer
•
Coding
•
medium
Calculate a rolling 7-day average of daily transactions for a financial client using SQL.
#Window Functions
#Time Series
#Financial Data
Data Engineer
•
Coding
•
medium
Write a Python algorithm to find the longest consecutive sequence of days a user logged into a client portal.
#Python
#Arrays
#Logic
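A set-based sketch for the login streak question. It assumes logins have already been reduced to `datetime.date` values for a single user. The function name `longest_login_streak` is illustrative.

```python
from datetime import date, timedelta
from typing import Iterable


def longest_login_streak(login_dates: Iterable[date]) -> int:
    """Return the length of the longest run of consecutive calendar days."""
    days = set(login_dates)  # duplicates on the same day collapse here
    one_day = timedelta(days=1)
    best = 0
    for d in days:
        if d - one_day in days:
            continue  # not the start of a streak, so skip it
        length = 1
        while d + one_day * length in days:
            length += 1
        best = max(best, length)
    return best
```

Only streak starts trigger the inner loop, so each date is visited a constant number of times and the whole pass is O(n) on average.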
Data Engineer
•
Coding
•
easy
Given a list of dictionaries representing financial transactions, write a Python function to aggregate total spend by category without using external libraries.
#Data Structures
#Dictionaries
#Aggregation
Data Engineer
•
Coding
•
medium
Write a Python script to process and merge multiple large CSV files (50GB+) that do not fit into memory.
#Chunking
#Generators
#Memory Management
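A streaming sketch for the large-CSV question. It interprets "merge" as concatenating files that share the same header, which is an assumption. A key-based join of files too large for memory would need a different technique, such as an external sort. Rows are copied one at a time with the standard `csv` module, so memory use stays roughly constant regardless of file size. The function name `merge_csv_files` is illustrative.

```python
import csv
from typing import Iterable, List, Optional


def merge_csv_files(input_paths: Iterable[str], output_path: str) -> int:
    """Concatenate CSV files with identical headers and return the data rows written."""
    header: Optional[List[str]] = None
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)  # yields one row at a time
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)  # write the header only once
                elif file_header != header:
                    raise ValueError(f"header mismatch in {path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```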
Data Engineer
•
Coding
•
medium
Write a SQL query to identify gaps in sequential invoice numbers for an audit client.
#Audit Data
#Sequential Gaps
#LEAD/LAG
Data Engineer
•
Coding
•
medium
Write a SQL query to find the top 3 highest-paid employees in each department, handling ties appropriately.
#Window Functions
#DENSE_RANK
#Aggregations
Data Engineer
•
Coding
•
medium
Write a Python function to parse a deeply nested JSON file from a REST API and flatten it into a tabular pandas DataFrame.
#JSON Parsing
#Pandas
#Data Transformation
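A sketch of the flattening step only, in plain Python. The prompt asks for a pandas DataFrame, and a list of dictionaries flattened this way could then be passed to the DataFrame constructor. The dotted-key convention, the use of list indices as key segments, and the function name `flatten` are assumptions.

```python
from typing import Any, Dict


def flatten(obj: Any, parent_key: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested dicts and lists into a single-level dict with dotted keys."""
    flat: Dict[str, Any] = {}
    if isinstance(obj, dict):
        for k, v in obj.items():
            key = f"{parent_key}{sep}{k}" if parent_key else str(k)
            flat.update(flatten(v, key, sep))
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            # List positions become key segments, e.g. tags.0, tags.1
            key = f"{parent_key}{sep}{i}" if parent_key else str(i)
            flat.update(flatten(v, key, sep))
    else:
        flat[parent_key] = obj  # scalar leaf
    return flat
```

Empty dicts and lists produce no columns in this sketch. Whether to expand lists into columns or into separate rows depends on the shape the downstream table needs.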
Data Engineer
•
Coding
•
hard
Write a SQL query to identify overlapping date ranges in a client's software subscription dataset.
#Self Joins
#Date Functions
#Complex Logic
Data Engineer
•
System Design
•
hard
Design a system to migrate on-premises legacy SQL Server data to a cloud-native Snowflake environment with minimal downtime.
#Cloud Migration
#Snowflake
#Change Data Capture (CDC)
Data Engineer
•
System Design
•
hard
Design a batch ETL pipeline to ingest daily transaction data from 50 different regional banks into a centralized Azure Data Lake.
#Batch Processing
#Azure
#Data Ingestion
#Scalability
Data Engineer
•
System Design
•
hard
Design a real-time fraud detection data pipeline for a credit card company.
#Streaming
#Kafka
#Real-time Processing
#Fraud Detection
Data Engineer
•
System Design
•
medium
How would you design a data model for a retail client's Customer 360 dashboard?
#Dimensional Modeling
#Customer 360
#Star Schema
Data Engineer
•
System Design
•
medium
Architect a logging, alerting, and monitoring solution for a complex data pipeline to ensure data quality and pipeline reliability.
#Observability
#Monitoring
#Data Quality
Data Engineer
•
Technical
•
medium
What strategies do you use for testing data pipelines before deploying them to production?
#Data Testing
#Unit Testing
#Integration Testing
Data Engineer
•
Technical
•
hard
How do you optimize a slow-running SQL query with multiple joins and aggregations that is timing out on a client's database?
#Query Optimization
#Indexing
#Execution Plans
Data Engineer
•
Technical
•
easy
Explain the difference between RANK(), DENSE_RANK(), and ROW_NUMBER(). Give a specific use case for each in an audit context.
#Window Functions
#Data Ranking
Data Engineer
•
Technical
•
medium
How do you handle missing, null, or corrupted data in a large dataset before loading it into a data warehouse?
#Data Cleansing
#Imputation
#ETL
Data Engineer
•
Technical
•
medium
Explain how Spark handles data partitioning and why it matters for pipeline performance.
#PySpark
#Partitioning
#Distributed Computing
Data Engineer
•
Technical
•
hard
How do you handle data skew in a PySpark join operation where one key has millions of records and others have very few?
#PySpark
#Data Skew
#Performance Tuning
Data Engineer
•
Technical
•
medium
What is a broadcast join in Spark, and when would you use it in a client's ETL pipeline?
#PySpark
#Joins
#Optimization
Data Engineer
•
Technical
•
easy
Explain the difference between transformations and actions in Spark. Give examples of each.
#PySpark
#Lazy Evaluation
Data Engineer
•
Technical
•
hard
How would you troubleshoot and optimize a PySpark job that is failing with an OutOfMemory (OOM) error on the driver node?
#PySpark
#Troubleshooting
#Memory Management
Data Engineer
•
Technical
•
medium
Describe how you would set up a CI/CD pipeline for Databricks notebooks and data pipelines using Azure DevOps.
#CI/CD
#Databricks
#Azure DevOps
Data Engineer
•
Technical
•
easy
What is the difference between a Data Warehouse, a Data Lake, and a Data Lakehouse? Why are clients moving towards Lakehouses?
#Data Lakehouse
#Data Warehouse
#Delta Lake
Data Engineer
•
Technical
•
medium
How do you implement Slowly Changing Dimensions (SCD Type 2) in Snowflake or Databricks?
#SCD Type 2
#Data Warehousing
#Snowflake
#Databricks
Data Engineer
•
Technical
•
medium
Explain the architecture of Azure Data Factory (ADF). How do you use it to orchestrate complex ETL pipelines?
#Azure Data Factory
#Orchestration
#ETL
Data Engineer
•
Technical
•
hard
How do you ensure data security, masking, and governance when building a cloud data platform for a highly regulated healthcare or financial client?
#Data Security
#PII/PHI
#RBAC
#Data Governance
Data Engineer
•
Technical
•
medium
Explain the concept of Delta Lake and its advantages over traditional Parquet files in a data lake.
#Delta Lake
#ACID Transactions
#Time Travel
Data Engineer
•
Technical
•
medium
How would you implement incremental loading in an ETL pipeline using a watermark column?
#Incremental Load
#Watermarking
#Data Integration
Data Scientist
•
Behavioral
•
medium
Describe a time you had to push back on a client's request because the data did not support their hypothesis.
#Stakeholder Management
#Communication
#Conflict Resolution
Data Scientist
•
Behavioral
•
easy
Describe a time when you had to learn a completely new technology or tool on the fly to complete a client project.
#Adaptability
#Continuous Learning
#Consulting
Data Scientist
•
Behavioral
•
hard
Tell me about a time you discovered a significant error in your analysis after you had already presented the preliminary findings to a stakeholder.
#Integrity
#Accountability
#Communication
Data Scientist
•
Behavioral
•
medium
Describe a situation where you had to meet a tight deadline for a client deliverable but encountered a major technical roadblock.
#Time Management
#Problem Solving
#Resilience
Data Scientist
•
Behavioral
•
easy
Why do you want to work in Data Science consulting at KPMG specifically, rather than a traditional tech company?
#Motivation
#Consulting
#Company Knowledge
Data Scientist
•
Behavioral
•
easy
Tell me about a time you had to clean and process a severely messy dataset from a client. What steps did you take?
#Data Cleaning
#Problem Solving
#Attention to Detail
Data Scientist
•
Coding
•
medium
Write a SQL query using window functions to find the top 3 highest-paid employees in each department.
#Window Functions
#Ranking
#CTEs
Data Scientist
•
Coding
•
easy
Write a Pandas script to find the percentage of missing values in each column of a DataFrame and drop columns where the missing percentage exceeds 40%.
#Pandas
#Data Cleaning
Data Scientist
•
Coding
•
easy
Given a string containing just the characters '(', ')', '{', '}', '[' and ']', determine if the input string is valid.
#Stacks
#String Parsing
Data Scientist
•
Coding
•
easy
Given an array of integers, write a Python function to return the indices of the two numbers that add up to a specific target.
#Arrays
#Hash Maps
#Optimization
Data Scientist
•
Coding
•
medium
Write a Python function to merge overlapping time intervals. This is often used when analyzing user session logs.
#Arrays
#Sorting
#Intervals
Data Scientist
•
Coding
•
medium
Write a SQL query to calculate the cumulative sum of revenue per client, ordered by the transaction date.
#Window Functions
#Cumulative Sum
Data Scientist
•
Coding
•
medium
Write a SQL query to calculate the 7-day rolling average of daily transactions for a client's retail dataset.
#Window Functions
#Time Series
#Aggregation
Data Scientist
•
Coding
•
medium
Write a Python function to group a list of strings into anagrams.
#Strings
#Hash Maps
Data Scientist
•
Coding
•
hard
Write a SQL query to find all clients who have made a purchase in every single month of the year 2023.
#Aggregation
#Filtering
#Date Functions
Data Scientist
•
System Design
•
hard
Design an end-to-end machine learning pipeline to automatically extract and classify entities from unstructured tax documents.
#NLP
#OCR
#Pipeline Design
#Azure ML
Data Scientist
•
System Design
•
hard
Design a credit risk scoring system for a regional bank. What data would you need, and what models would you evaluate?
#Credit Risk
#Classification
#Feature Engineering
#Explainability
Data Scientist
•
System Design
•
hard
Design a churn prediction architecture for a telecommunications client. Include data ingestion, modeling, and deployment on a cloud platform like Azure.
#Churn Prediction
#Cloud Architecture
#Azure ML
#MLOps
Data Scientist
•
System Design
•
hard
Design an anomaly detection system to identify potentially fraudulent expense claims within an organization's internal audit data.
#Anomaly Detection
#Audit
#Fraud
#Unsupervised Learning
Data Scientist
•
System Design
•
hard
Design a recommendation system for a retail client to suggest products to users based on their browsing history and past purchases.
#Recommendation Engines
#Collaborative Filtering
#Matrix Factorization
#Cold Start
Data Scientist
•
Technical
•
easy
In SQL, explain the difference between a LEFT JOIN and an INNER JOIN, and provide a scenario where you would strictly use a LEFT JOIN.
#Joins
#Relational Databases
#Data Manipulation
Data Scientist
•
Technical
•
medium
A client notices a sudden 15% drop in user engagement on their platform. Walk me through your analytical approach to find the root cause.
#Root Cause Analysis
#Metrics
#Hypothesis Testing
Data Scientist
•
Technical
•
medium
Explain the concept of a p-value to a business stakeholder who is deciding whether to launch a new marketing campaign based on your A/B test results.
#A/B Testing
#Hypothesis Testing
#Communication
Data Scientist
•
Technical
•
hard
How do you ensure that your machine learning models are fair and unbiased, especially when dealing with sensitive attributes in financial lending?
#AI Ethics
#Bias Mitigation
#Fairness
#Explainability
Data Scientist
•
Technical
•
medium
What is the curse of dimensionality, and how do you handle it when working with high-dimensional client datasets?
#Dimensionality Reduction
#PCA
#Feature Selection
Data Scientist
•
Technical
•
medium
How does a Gradient Boosting Machine (GBM) differ from a Random Forest? When would you choose one over the other?
#Ensemble Methods
#Trees
#GBM
Data Scientist
•
Technical
•
medium
Explain how you would deploy a trained machine learning model into production using Docker and an API framework like FastAPI or Flask.
#Model Deployment
#Docker
#API
#FastAPI
Data Scientist
•
Technical
•
easy
What evaluation metrics would you use for a highly imbalanced classification problem, and why is accuracy a poor choice?
#Evaluation Metrics
#Precision
#Recall
#F1-Score
Data Scientist
•
Technical
•
medium
Explain how a Random Forest model works to a non-technical audit partner.
#Random Forest
#Communication
#Ensemble Methods
Data Scientist
•
Technical
•
medium
How do you handle highly imbalanced datasets when building a fraud detection model for a financial services client?
#Imbalanced Data
#Fraud Detection
#SMOTE
#Class Weights
Data Scientist
•
Technical
•
medium
What is the difference between L1 (Lasso) and L2 (Ridge) regularization, and when would you use each in a risk scoring model?
#Regularization
#Regression
#Feature Selection
Data Scientist
•
Technical
•
hard
How would you approach a time series forecasting problem to predict next quarter's revenue for a manufacturing client?
#Time Series
#Forecasting
#ARIMA
#Prophet
Data Scientist
•
Technical
•
medium
Explain the trade-off between bias and variance. How do you identify if your model is suffering from high bias or high variance?
#Model Evaluation
#Bias-Variance Tradeoff
#Overfitting/Underfitting
Data Scientist
•
Technical
•
medium
Given a dataset of client feedback text, how would you approach building a sentiment analysis model from scratch?
#Sentiment Analysis
#Text Preprocessing
#NLP
#TF-IDF
Data Scientist
•
Technical
•
medium
How do you evaluate the performance of an unsupervised learning model, such as K-Means clustering used for customer segmentation?
#Clustering
#Unsupervised Learning
#Evaluation Metrics
DevOps Engineer
•
Behavioral
•
easy
How do you stay updated with the latest DevOps tools, and how do you decide when it is appropriate to introduce a new tool to a client's stack?
#Continuous Learning
#Consulting
#Tooling
DevOps Engineer
•
Behavioral
•
medium
Tell me about a time you had to push back on a client's architectural request because it did not meet security or compliance standards.
#Stakeholder Management
#Security
#Conflict Resolution
DevOps Engineer
•
Behavioral
•
easy
Tell me about a time you automated a manual process that saved your team or client significant time. What was the impact?
#Automation
#Efficiency
#Impact
DevOps Engineer
•
Behavioral
•
hard
Describe a situation where a deployment failed in production. How did you handle the rollback, the root cause analysis, and the client communication?
#Troubleshooting
#Client Communication
#Post-mortem
DevOps Engineer
•
Behavioral
•
medium
KPMG often works with highly regulated clients. Tell me about your experience working with compliance frameworks like SOC2, HIPAA, or PCI-DSS in your DevOps practices.
#Security
#Audit
#Frameworks
DevOps Engineer
•
Behavioral
•
medium
Describe a time you had to explain a complex technical DevOps concept (like containerization or IaC) to a non-technical audit partner or client executive.
#Communication
#Consulting
#Empathy
DevOps Engineer
•
Coding
•
medium
Write a Python script using Boto3 to find and delete all unattached EBS volumes in an AWS account.
#Python
#AWS
#Automation
DevOps Engineer
•
Coding
•
medium
Write a Dockerfile for a Python Flask application that ensures the application runs as a non-root user for security compliance.
#Docker
#Security
#Python
DevOps Engineer
•
Coding
•
easy
Write a Bash script to parse an Nginx access log and count the number of occurrences of 500 HTTP status codes.
#Bash
#Linux
#Log Analysis
DevOps Engineer
•
Coding
•
medium
Write a Terraform snippet to provision an AWS S3 bucket with versioning enabled, server-side encryption (KMS), and public access blocked.
#Terraform
#AWS
#Security
DevOps Engineer
•
Coding
•
medium
Write a Python function that takes a domain name as input and returns the number of days until its SSL certificate expires.
#Python
#Networking
#Security
DevOps Engineer
•
System Design
•
hard
Explain how you would design a secure CI/CD pipeline for a financial client using Azure DevOps.
#Azure DevOps
#DevSecOps
#Pipelines
DevOps Engineer
•
System Design
•
hard
Describe how you would migrate a legacy monolithic application to a containerized microservices architecture on Azure Kubernetes Service (AKS).
#Azure
#AKS
#Microservices
#Migration
DevOps Engineer
•
System Design
•
hard
Design a centralized logging and monitoring solution for a multi-cloud environment (AWS and Azure).
#Multi-cloud
#Logging
#Architecture
DevOps Engineer
•
System Design
•
hard
Design a highly available and disaster-recovery-ready architecture for a 3-tier web application on AWS.
#AWS
#High Availability
#Disaster Recovery
DevOps Engineer
•
Technical
•
hard
How do you handle database schema migrations in an automated deployment pipeline without causing downtime?
#Databases
#Pipelines
#Zero-Downtime
DevOps Engineer
•
Technical
•
medium
Explain how VPC peering works in AWS and discuss the limitations associated with it.
#AWS
#Networking
#VPC
DevOps Engineer
•
Technical
•
medium
What metrics do you monitor to ensure the health and performance of a Kubernetes cluster?
#Kubernetes
#Prometheus
#Metrics
DevOps Engineer
•
Technical
•
medium
How do you implement Role-Based Access Control (RBAC) in Kubernetes?
#Kubernetes
#RBAC
#IAM
DevOps Engineer
•
Technical
•
medium
Explain the GitOps workflow. How does it differ from traditional push-based CI/CD?
#GitOps
#ArgoCD
#Flux
DevOps Engineer
•
Technical
•
hard
How do you enforce compliance and governance policies across an enterprise Azure environment?
#Azure Policy
#Governance
#Compliance
DevOps Engineer
•
Technical
•
easy
What is the purpose of a Terraform provider, and how do you lock provider versions to ensure reproducible deployments?
#Terraform
#Version Control
DevOps Engineer
•
Technical
•
medium
How do you integrate Static Application Security Testing (SAST) and Dynamic Application Security Testing (DAST) into a Jenkins pipeline?
#Jenkins
#Security Scanning
#Pipelines
DevOps Engineer
•
Technical
•
easy
What are the differences between AWS IAM Roles and IAM Policies?
#AWS
#IAM
DevOps Engineer
•
Technical
•
medium
Explain the difference between a Kubernetes Deployment and a StatefulSet. When would you use each?
#Kubernetes
#Architecture
DevOps Engineer
•
Technical
•
medium
How do you manage dependencies and avoid 'dependency hell' in your infrastructure code (e.g., Terraform modules)?
#Terraform
#Architecture
#Best Practices
DevOps Engineer
•
Technical
•
medium
Describe the process of setting up an Azure Application Gateway with Web Application Firewall (WAF) for a web application.
#Azure
#Networking
#Security
DevOps Engineer
•
Technical
•
medium
How do you handle secrets management in a CI/CD pipeline to ensure credentials are never exposed in logs or source control?
#Secrets Management
#CI/CD
#DevSecOps
DevOps Engineer
•
Technical
•
medium
How do you troubleshoot a pod in Kubernetes that is stuck in the CrashLoopBackOff state?
#Kubernetes
#Debugging
DevOps Engineer
•
Technical
•
medium
How do you manage Terraform state files in a multi-developer, multi-environment setup?
#Terraform
#State Management
#Collaboration
DevOps Engineer
•
Technical
•
easy
What does it mean to 'taint' or 'replace' a resource in Terraform, and in what scenarios would you use this feature?
#Terraform
#State Management
DevOps Engineer
•
Technical
•
medium
How do you optimize Docker image sizes for faster deployments and reduced attack surfaces?
#Docker
#Optimization
#Security
DevOps Engineer
•
Technical
•
medium
Explain the concept of immutable infrastructure. Why is it particularly beneficial for auditability in a consulting environment like KPMG?
#Architecture
#Security
#Compliance
DevOps Engineer
•
Technical
•
medium
How do you configure cross-account access in AWS using IAM roles?
#AWS
#IAM
#Architecture
DevOps Engineer
•
Technical
•
hard
What is the difference between Blue/Green and Canary deployments? How would you implement a Canary deployment in Kubernetes?
#Kubernetes
#Deployment Strategies
#Traffic Routing
Frontend Engineer
•
Behavioral
•
medium
Tell me about a time you disagreed with a backend engineer regarding an API contract. How did you resolve it?
#Conflict Resolution
#Collaboration
#API Design
Frontend Engineer
•
Behavioral
•
medium
Tell me about a time you had to deliver bad news to a client or stakeholder, such as a delayed feature or a critical bug in production.
#Client Management
#Communication
#Transparency
Frontend Engineer
•
Behavioral
•
medium
Tell me about a time you had to push back on a non-technical client or partner at a consulting firm who requested a feature that was technically unfeasible within the deadline.
#Consulting
#Stakeholder Management
#Communication
Frontend Engineer
•
Behavioral
•
medium
Describe a situation where you had to explain technical debt to a business stakeholder to justify spending a sprint on refactoring rather than new features.
#Communication
#Technical Debt
#Agile
Frontend Engineer
•
Behavioral
•
medium
Consulting often requires jumping into unfamiliar codebases. Tell me about a time you had to learn a new frontend technology or framework very quickly to deliver a client project.
#Learning
#Adaptability
#Consulting
Frontend Engineer
•
Behavioral
•
medium
Give me an example of a time you went above and beyond the basic requirements of a ticket to improve the user experience or code quality for a client.
#Client Success
#Initiative
#Quality
Frontend Engineer
•
Behavioral
•
medium
Describe a time when you had to deliver a project under a strict regulatory deadline. How did you prioritize your tasks and ensure quality?
#Time Management
#Stress
#Consulting
Frontend Engineer
•
Coding
•
medium
Implement an autocomplete/type-ahead input component that fetches suggestions from a mock API. Handle loading states, empty results, and keyboard navigation.
#Debounce
#API
#DOM
#Accessibility
Frontend Engineer
•
Coding
•
medium
Write a function to flatten a deeply nested JSON object containing tax hierarchy data into a single-level object with dot-separated keys.
#JavaScript
#Recursion
#Data Transformation
Frontend Engineer
•
Coding
•
medium
Implement a data grid component in React/Angular that supports client-side sorting, pagination, and filtering for a dataset of 5,000 audit records.
#React
#Angular
#State Management
#Performance
Frontend Engineer
•
Coding
•
medium
Implement a nested dropdown menu component that supports infinite levels of sub-categories, driven by a recursive JSON structure.
#DOM Manipulation
#Recursion
#Events
Frontend Engineer
•
Coding
•
medium
Write a function to deep clone a JavaScript object without using `JSON.parse(JSON.stringify())` or external libraries like Lodash.
#JavaScript
#Recursion
#Memory
Frontend Engineer
•
Coding
•
medium
Implement an LRU (Least Recently Used) Cache class in JavaScript with `get` and `put` methods operating in O(1) time complexity.
#Data Structures
#JavaScript
#Caching
Frontend Engineer
•
Coding
•
easy
Create a progress bar component that takes a percentage prop and smoothly animates to that percentage. Ensure it is accessible.
#CSS
#JavaScript
#Animation
#Accessibility
Frontend Engineer
•
Coding
•
easy
Write a function to validate a password string. It must be at least 8 characters, contain one uppercase, one lowercase, one number, and one special character, without using a single massive Regex.
#String Manipulation
#Validation
#Logic
Frontend Engineer
•
Coding
•
easy
Given a string, find and return the first non-repeating character. If it doesn't exist, return null. Optimize for time complexity.
#Hash Maps
#Strings
#Optimization
Frontend Engineer
•
Coding
•
easy
Implement a debounce function from scratch and explain how you would use it to optimize a client search input field.
#JavaScript
#Closures
#Performance
Frontend Engineer
•
Coding
•
medium
Write a custom React Hook `useFetch` that handles data fetching, loading state, error handling, and implements a basic in-memory cache.
#React Hooks
#API
#Caching
Frontend Engineer
•
System Design
•
medium
Design a Role-Based Access Control (RBAC) system for a frontend application where different users (Admin, Auditor, Client) see entirely different navigation and features.
#Security
#RBAC
#State Management
#Routing
Frontend Engineer
•
System Design
•
medium
Design a secure document upload portal for clients to submit sensitive tax documents. It must support large files, progress tracking, and resumable uploads.
#File Upload
#UI/UX
#Network
#Security
Frontend Engineer
•
System Design
•
hard
We are migrating a monolithic legacy internal tool to a modern stack. Design a micro-frontend architecture using Webpack Module Federation.
#Micro-frontends
#Webpack
#Architecture
#Migration
Frontend Engineer
•
System Design
•
hard
Design an offline-first data collection web application for field auditors who frequently lose internet connectivity.
#Offline Storage
#Service Workers
#Data Synchronization
Frontend Engineer
•
System Design
•
hard
Design the frontend architecture for a real-time financial risk dashboard that receives high-frequency updates via WebSockets and displays complex D3.js charts.
#Frontend Architecture
#WebSockets
#Data Visualization
#Performance
Frontend Engineer
•
System Design
•
hard
Design a collaborative spreadsheet application (similar to Excel Online) for internal tax teams to edit data simultaneously.
#Operational Transformation
#WebSockets
#Canvas/DOM
#Concurrency
Frontend Engineer
•
Technical
•
hard
Security is critical at KPMG. How do you securely store JWT tokens on the client side, and what are the specific mechanisms to prevent XSS and CSRF attacks?
#OWASP
#XSS
#CSRF
#Authentication
Frontend Engineer
•
Technical
•
medium
Explain the difference between React Server Components and Client Components. In an enterprise portal, when would you choose one over the other?
#React
#Next.js
#Architecture
#Rendering
Frontend Engineer
•
Technical
•
medium
How do you ensure a public-facing government or tax portal meets WCAG 2.1 AA accessibility standards? Walk me through your auditing and implementation process.
#Accessibility
#WCAG
#HTML
#ARIA
Frontend Engineer
•
Technical
•
easy
Explain the differences between CSS Grid and Flexbox. How would you choose between them when building a complex enterprise dashboard layout?
#CSS
#Layout
#Responsive Design
Frontend Engineer
•
Technical
•
medium
You have a list of 10,000 client records that needs to be rendered on a single page. How do you optimize the frontend to ensure 60fps scrolling?
#Virtualization
#React
#DOM
#Performance
Frontend Engineer
•
Technical
•
easy
What is Event Delegation in JavaScript? Why is it useful, and can you provide an example of where you would use it in a large table?
#JavaScript
#DOM Events
#Performance
Frontend Engineer
•
Technical
•
medium
Explain the differences between `useMemo` and `useCallback` in React. More importantly, explain when you should NOT use them.
#React
#Performance
#Hooks
Frontend Engineer
•
Technical
•
hard
Walk me through the Critical Rendering Path. How does the browser convert HTML, CSS, and JavaScript into pixels on the screen?
#Browser Architecture
#Rendering
#Performance
Frontend Engineer
•
Technical
•
medium
Compare Redux, Context API, and modern atomic state managers (like Zustand or Jotai). How do you decide which to use for a new enterprise application?
#State Management
#Architecture
#React
Frontend Engineer
•
Technical
•
medium
What is CORS? Explain why it exists, how the preflight request works, and how you typically resolve CORS issues during local development vs production.
#Network
#Security
#HTTP
Frontend Engineer
•
Technical
•
medium
KPMG operates globally. What are your strategies for managing internationalization (i18n) and localization (l10n) in a large-scale frontend application?
#i18n
#Localization
#Frontend
Full Stack Engineer
•
Behavioral
•
easy
Give an example of how you mentored a junior developer or consultant on your team.
#Mentorship
#Team Building
#Code Reviews
Full Stack Engineer
•
Behavioral
•
medium
Tell me about a time you had to push back on a client or stakeholder's technical request because it wasn't feasible or secure.
#Communication
#Stakeholder Management
#Negotiation
Full Stack Engineer
•
Behavioral
•
medium
Describe a situation where you had to explain a complex technical architecture to a non-technical audit partner or business stakeholder.
#Consulting
#Cross-functional Collaboration
Full Stack Engineer
•
Behavioral
•
medium
How do you handle changing requirements midway through a sprint, especially when driven by sudden regulatory changes?
#Adaptability
#Project Management
#Prioritization
Full Stack Engineer
•
Behavioral
•
easy
Why do you want to work at KPMG? How does our focus on trust, compliance, and consulting align with your engineering career goals?
#Motivation
#Company Knowledge
#Career Goals
Full Stack Engineer
•
Behavioral
•
medium
How do you prioritize addressing technical debt versus delivering new features requested by a client?
#Prioritization
#Technical Debt
#Client Management
Full Stack Engineer
•
Behavioral
•
medium
Tell me about a time you found a critical bug in production. How did you handle the immediate crisis and the post-mortem?
#Incident Response
#Debugging
#Accountability
Full Stack Engineer
•
Coding
•
hard
Write a SQL query to calculate the rolling 7-day average of daily deposits for each client account.
#Window Functions
#Time Series
#Aggregations
Full Stack Engineer
•
Coding
•
easy
Given an array of transaction amounts and a target reconciliation value, return the indices of the two amounts that add up to the target.
#Arrays
#Hash Tables
#Two Pointers
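One possible answer, sketched with a single-pass hash table. The function name `two_sum_indices` is hypothetical, and the sketch assumes amounts are stored as integers (for example cents) so that equality checks are exact.

```python
def two_sum_indices(amounts, target):
    """Return indices (i, j) with i < j whose amounts sum to target, or None."""
    seen = {}  # amount -> index of the most recent occurrence seen so far
    for j, amount in enumerate(amounts):
        complement = target - amount
        if complement in seen:
            return seen[complement], j
        seen[amount] = j
    return None


if __name__ == "__main__":
    # Amounts are in cents (an assumption made for this example).
    print(two_sum_indices([1250, 300, 4700, 950], 5650))
```

This runs in O(n) time and O(n) extra space. Sorting and using two pointers also works, but the original indices would then need to be tracked separately.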
Full Stack Engineer
•
Coding
•
medium
Write a SQL query to find the second highest transaction amount for each regional department in the 'transactions' table.
#Window Functions
#Subqueries
#Grouping
Full Stack Engineer
•
Coding
•
medium
Given an array of strings representing vendor names, group the anagrams together to help identify duplicate vendor entries.
#Strings
#Hash Tables
#Sorting
Full Stack Engineer
•
Coding
•
medium
Find the length of the longest substring without repeating characters in a given data stream string.
#Strings
#Sliding Window
#Hash Tables
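A minimal sliding-window sketch of one way to answer this. The function name is hypothetical, and the input is assumed to be an ordinary in-memory string rather than a true unbounded stream.

```python
def longest_unique_substring_length(s):
    """Length of the longest substring of s with no repeated character."""
    last_seen = {}  # character -> index of its most recent occurrence
    start = 0       # left edge of the current window
    best = 0
    for i, ch in enumerate(s):
        # If ch already occurs inside the window, move the left edge past it.
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1
        last_seen[ch] = i
        best = max(best, i - start + 1)
    return best


if __name__ == "__main__":
    print(longest_unique_substring_length("abcabcbb"))
```

Each character is visited once, so the time cost is O(n), with extra space bounded by the number of distinct characters.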
Full Stack Engineer
•
Coding
•
medium
Implement a Least Recently Used (LRU) Cache to store recent database query results.
#Design
#Hash Tables
#Linked Lists
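A short sketch of one possible answer. It uses Python's `collections.OrderedDict` in place of a hand-written hash map plus doubly linked list; an interviewer may well ask for the explicit linked-list version instead. The class name and the `default` parameter are assumptions made for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


if __name__ == "__main__":
    cache = LRUCache(2)
    cache.put("q1", "rows-1")
    cache.put("q2", "rows-2")
    cache.get("q1")            # q1 becomes most recently used
    cache.put("q3", "rows-3")  # evicts q2
    print(cache.get("q2"), cache.get("q1"))
```

Both `get` and `put` are O(1) on average, since `OrderedDict` maintains insertion order with an internal linked structure.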
Full Stack Engineer
•
Coding
•
easy
Write a function to validate if a given string representing a mathematical tax formula has balanced and properly nested parentheses, brackets, and braces.
#Strings
#Stacks
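A minimal stack-based sketch of one way to approach this. The function name `is_balanced` is hypothetical; characters other than the three bracket types are simply ignored.

```python
def is_balanced(formula):
    """Return True if (), [] and {} in formula are balanced and properly nested."""
    pairs = {")": "(", "]": "[", "}": "{"}
    openers = set(pairs.values())
    stack = []
    for ch in formula:
        if ch in openers:
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftover openers mean the formula is unbalanced


if __name__ == "__main__":
    print(is_balanced("(income - deductions) * [rate + {surcharge / 100}]"))
```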
Full Stack Engineer
•
Coding
•
easy
Write a function to reverse a singly linked list. This is often used as a sub-problem in our data pipeline transformations.
#Linked Lists
#Pointers
Full Stack Engineer
•
Coding
•
medium
Given a list of audit engagement timeframes represented as intervals [start_date, end_date], merge all overlapping engagements.
#Arrays
#Sorting
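One possible approach, sketched as sort-then-sweep. The function name is hypothetical. The sketch assumes start and end values are mutually comparable (for example `datetime.date` objects or ISO-format date strings) and treats an engagement that starts on the day another ends as overlapping.

```python
def merge_engagements(intervals):
    """Merge overlapping [start, end] intervals; returns a new sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend its end if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


if __name__ == "__main__":
    from datetime import date

    engagements = [
        [date(2024, 1, 1), date(2024, 1, 15)],
        [date(2024, 1, 10), date(2024, 2, 1)],
        [date(2024, 3, 1), date(2024, 3, 5)],
    ]
    print(merge_engagements(engagements))
```

Sorting dominates the cost, giving O(n log n) time overall.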
Full Stack Engineer
•
Coding
•
medium
Find the Kth largest transaction amount in an unsorted array of daily transactions.
#Heaps
#Quickselect
#Arrays
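A minimal sketch of the heap-based answer, keeping a min-heap of the k largest amounts seen so far. The function name is hypothetical, and duplicates are counted as separate entries (an assumption; the question does not say). Quickselect is the usual alternative when average O(n) time is wanted.

```python
import heapq


def kth_largest(amounts, k):
    """Return the k-th largest value in amounts (k=1 is the maximum)."""
    if not 1 <= k <= len(amounts):
        raise ValueError("k must be between 1 and len(amounts)")
    heap = []  # min-heap holding the k largest values seen so far
    for amount in amounts:
        if len(heap) < k:
            heapq.heappush(heap, amount)
        elif amount > heap[0]:
            heapq.heapreplace(heap, amount)  # pop smallest, push new value
    return heap[0]


if __name__ == "__main__":
    print(kth_largest([3, 2, 1, 5, 6, 4], 2))
```

The heap never grows beyond k entries, so the cost is O(n log k) time and O(k) space.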
Full Stack Engineer
•
System Design
•
hard
Design an immutable audit logging system for a microservices architecture that processes financial transactions.
#Event Sourcing
#Message Queues
#Database Design
#Compliance
Full Stack Engineer
•
System Design
•
medium
Design a data ingestion pipeline that pulls daily CSV files from external banking partners, cleans the data, and loads it into a central data warehouse.
#ETL
#Data Pipelines
#Cloud Services
#Batch Processing
Full Stack Engineer
•
System Design
•
medium
Design a highly available REST API using AWS or Azure services. How do you ensure it survives a single availability zone failure?
#Cloud Architecture
#High Availability
#Load Balancing
Full Stack Engineer
•
System Design
•
medium
Design a notification system to alert thousands of clients about upcoming tax deadlines via Email, SMS, and In-App push.
#Asynchronous Processing
#Microservices
#Third-party Integrations
Full Stack Engineer
•
System Design
•
medium
Design a secure document upload and storage portal for KPMG tax clients to submit sensitive financial documents.
#Cloud Storage
#Security
#Encryption
#Microservices
Full Stack Engineer
•
System Design
•
hard
Design a Role-Based Access Control (RBAC) system for a multi-tenant advisory platform where different client organizations have different permission levels.
#Security
#Database Schema
#Multi-tenancy
#Authorization
Full Stack Engineer
•
System Design
•
hard
Design a scalable reporting dashboard that aggregates millions of tax records and allows users to filter by date, region, and tax type in real-time.
#Data Warehousing
#Caching
#Indexing
#Frontend Performance
Full Stack Engineer
•
Technical
•
medium
Explain Dependency Injection and how you implement it in a framework like Spring Boot or .NET Core.
#Design Patterns
#OOP
#Spring Boot
#.NET
Full Stack Engineer
•
Technical
•
easy
What is a JWT (JSON Web Token), and how does it prevent tampering of authentication data?
#Authentication
#Cryptography
#Web Security
Full Stack Engineer
•
Technical
•
hard
Explain the concept of Eventual Consistency versus Strong Consistency. Which would you choose for a core banking ledger system and why?
#CAP Theorem
#Database Architecture
#Consistency Models
Full Stack Engineer
•
Technical
•
medium
Explain the differences between clustered and non-clustered indexes in SQL Server or PostgreSQL.
#SQL
#Indexing
#Performance Tuning
Full Stack Engineer
•
Technical
•
easy
How do you prevent SQL Injection and Cross-Site Scripting (XSS) in a full-stack web application?
#Web Security
#OWASP
#Input Validation
Full Stack Engineer
•
Technical
•
medium
How do you optimize the performance of a slow-loading React application that processes large datasets?
#React
#Performance
#Web Vitals
Full Stack Engineer
•
Technical
•
medium
What are the trade-offs between a Microservices architecture and a Monolithic architecture? When would you recommend a Monolith to a client?
#Microservices
#Monolith
#System Design
Full Stack Engineer
•
Technical
•
medium
Describe the event loop in Node.js. How does it handle concurrent requests despite being single-threaded?
#Node.js
#Asynchronous Programming
#Concurrency
Full Stack Engineer
•
Technical
•
easy
What are the ACID properties, and why are they critical for KPMG's financial ledger software?
#Relational Databases
#Transactions
#Data Integrity
Full Stack Engineer
•
Technical
•
medium
How do you secure a REST API that handles sensitive Personally Identifiable Information (PII)?
#API Design
#Authentication
#Authorization
#Data Protection
Full Stack Engineer
•
Technical
•
medium
Explain the difference between React's Context API and Redux. When would you choose one over the other for a large-scale financial dashboard?
#React
#State Management
#Architecture
Machine Learning Engineer
•
Behavioral
•
medium
Describe a time you had to build a model using messy, unstructured, or incomplete client data. How did you handle it?
#Data Cleaning
#Problem Solving
#Resilience
Machine Learning Engineer
•
Behavioral
•
hard
KPMG places a high value on ethical AI. How do you ensure your machine learning models do not introduce or amplify bias, especially in loan approval risk assessments?
#Ethical AI
#Bias Mitigation
#Fairness
Machine Learning Engineer
•
Behavioral
•
medium
Describe a time you collaborated with a cross-functional team (data engineers, SMEs, business analysts) to deliver an end-to-end ML solution.
#Teamwork
#Cross-functional Collaboration
#Project Delivery
Machine Learning Engineer
•
Behavioral
•
medium
Tell me about a time you had to push back on a client or stakeholder who had unrealistic expectations about what AI/ML could achieve.
#Client Management
#Scope Management
#Expectation Setting
Machine Learning Engineer
•
Behavioral
•
medium
Tell me about a situation where a deployed model's performance degraded over time. How did you diagnose and resolve the issue?
#Troubleshooting
#Production ML
#Continuous Improvement
Machine Learning Engineer
•
Behavioral
•
medium
Imagine you are presenting the results of a complex predictive model to a non-technical audit partner at KPMG. How do you explain the model's predictions and build trust in the system?
#Stakeholder Management
#Explainable AI
#Consulting
Machine Learning Engineer
•
Coding
•
hard
Write an algorithm to detect a cycle in a directed graph. This is often used in anti-money laundering (AML) to detect circular financial transactions.
#Graphs
#DFS
#Cycle Detection
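One way to sketch an answer is a depth-first search with three node states. The graph representation (a dict mapping each account to a list of accounts it pays) and the function name are assumptions for illustration. The recursion is kept for readability; very deep graphs would need an iterative version to stay within Python's recursion limit.

```python
def has_cycle(graph):
    """Return True if the directed graph {node: [neighbours]} contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                return True  # back edge to a node on the current path
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


if __name__ == "__main__":
    transfers = {"A": ["B"], "B": ["C"], "C": ["A"]}
    print(has_cycle(transfers))
```

Every node and edge is examined at most once, so the cost is O(V + E).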
Machine Learning Engineer
•
Coding
•
easy
Given a string of unstructured text from a financial report, write a Python script using regex to extract all monetary values (e.g., '$1,000.50', '€500').
#Regex
#Python
#Text Processing
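A minimal sketch of one possible pattern. It assumes only the `$`, `€`, and `£` symbols appear, that the symbol precedes the number, and that commas are thousands separators; real financial reports would likely need more cases (currency codes, trailing symbols, negative amounts). The names `MONEY_PATTERN` and `extract_monetary_values` are hypothetical.

```python
import re

# Currency symbol, optional space, digits with optional ",ddd" groups,
# then an optional decimal part.
MONEY_PATTERN = re.compile(r"[$€£]\s?\d+(?:,\d{3})*(?:\.\d+)?")


def extract_monetary_values(text):
    """Return every monetary value found in text, in order of appearance."""
    return MONEY_PATTERN.findall(text)


if __name__ == "__main__":
    report = "Revenue rose to $1,000.50 while fees of €500 and £2,300 were booked."
    print(extract_monetary_values(report))
```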
Machine Learning Engineer
•
Coding
•
medium
Given a Pandas dataframe of audit system logs, write code to efficiently group by user and calculate the time difference between consecutive logins.
#Pandas
#Data Wrangling
#Time-series
Machine Learning Engineer
•
Coding
•
medium
Write a SQL query using window functions to calculate the cumulative sum of revenue per department, ordered by transaction date.
#Window Functions
#Cumulative Sum
#Data Aggregation
Machine Learning Engineer
•
Coding
•
medium
How would you optimize a Pandas script that is running out of memory on a 10GB dataset?
#Pandas
#Memory Optimization
#Python
Machine Learning Engineer
•
Coding
•
medium
Implement a Python function to compute the cosine similarity between two sparse vectors represented as dictionaries.
#Math
#Data Structures
#Sparse Matrices
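A short sketch of one possible answer. Each vector is assumed to be a dict mapping a feature key to a non-zero weight, with missing keys treated as zero. Returning 0.0 for an all-zero vector is a convention chosen here, not something the question specifies.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {key: value} dicts."""
    # Iterate over the smaller dict so the dot product touches fewer keys.
    if len(a) > len(b):
        a, b = b, a
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for zero vectors (an assumption)
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity({"tax": 3.0}, {"tax": 4.0, "audit": 3.0}))
```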
Machine Learning Engineer
•
Coding
•
easy
Write a Python function to calculate the 7-day moving average of a time series array representing daily transaction volumes.
#Arrays
#Sliding Window
#Python
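A minimal sliding-window sketch of one way to answer this. The function name and the `window` parameter are hypothetical; the sketch returns one average per full window, so the first `window - 1` days produce no output (an assumption about the desired edge behaviour).

```python
def moving_average(values, window=7):
    """Return the moving averages of values over each full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            averages.append(running / window)
    return averages


if __name__ == "__main__":
    daily_volumes = [1, 2, 3, 4, 5, 6, 7, 8]
    print(moving_average(daily_volumes))
```

Maintaining a running sum keeps the cost at O(n) rather than O(n × window). For long series of floating-point values, recomputing the sum periodically limits accumulated rounding error.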
Machine Learning Engineer
•
Coding
•
medium
Write a SQL query to identify the top 3 highest-value suspicious transactions for each corporate client over the past 30 days.
#Window Functions
#Aggregations
#Time-series
Machine Learning Engineer
•
System Design
•
medium
How would you approach building a predictive model to identify advisory clients at risk of churn?
#Predictive Modeling
#Feature Engineering
#Business Strategy
Machine Learning Engineer
•
System Design
•
hard
Explain how you would secure sensitive Personally Identifiable Information (PII) data within an ML training pipeline.
#Data Privacy
#Security
#Compliance
Machine Learning Engineer
•
System Design
•
hard
How would you design a scalable data pipeline using PySpark to process terabytes of transaction logs for feature engineering?
#PySpark
#Big Data
#Distributed Computing
Machine Learning Engineer
•
System Design
•
medium
What are the trade-offs between batch inference and real-time inference? Give an example of a KPMG use case for each.
#Batch Processing
#Real-time Processing
#Architecture
Machine Learning Engineer
•
System Design
•
hard
Design a system to continuously retrain a financial forecasting model as new transaction data arrives weekly.
#CI/CD for ML
#Automated Retraining
#Pipeline Orchestration
Machine Learning Engineer
•
System Design
•
medium
How do you manage machine learning experiments, track hyperparameters, and handle model versioning in a collaborative team setting?
#Experiment Tracking
#Model Registry
#MLflow
Machine Learning Engineer
•
System Design
•
medium
Explain your approach to monitoring data drift and concept drift in a production ML environment.
#Model Monitoring
#Data Drift
#Concept Drift
Machine Learning Engineer
•
System Design
•
hard
Design a real-time fraud detection API. What are the latency requirements, and how do you ensure the model meets them under high load?
#Real-time Inference
#API Design
#Latency Optimization
Machine Learning Engineer
•
System Design
•
hard
Design an end-to-end machine learning system to automatically extract and categorize line items from millions of scanned tax documents and receipts.
#OCR
#NLP
#Batch Processing
#Cloud Architecture
Machine Learning Engineer
•
System Design
•
medium
Walk me through how you would deploy a trained PyTorch model as a scalable web service using Azure Machine Learning.
#Azure ML
#Model Deployment
#Cloud Computing
Machine Learning Engineer
•
Technical
•
medium
How do you evaluate an NLP model used for extracting specific regulatory clauses from lengthy legal contracts?
#NLP
#Information Extraction
#Evaluation Metrics
Machine Learning Engineer
•
Technical
•
medium
Explain the vanishing gradient problem in deep neural networks and discuss methods to mitigate it.
#Deep Learning
#Neural Networks
#Optimization
Machine Learning Engineer
•
Technical
•
medium
What metrics would you use to evaluate a classification model where false positives are extremely costly (e.g., flagging a compliant client as high-risk)?
#Evaluation Metrics
#Precision vs Recall
#Business Impact
Machine Learning Engineer
•
Technical
•
medium
Explain the difference between Random Forest and Gradient Boosting. Which would you prefer for modeling tabular financial risk data, and why?
#Ensemble Methods
#Decision Trees
#Model Selection
Machine Learning Engineer
•
Technical
•
hard
How do you handle missing values in a dataset where the missingness is not at random (MNAR)?
#Data Imputation
#Statistics
#Data Quality
Machine Learning Engineer
•
Technical
•
hard
Describe the attention mechanism in Transformer models. Why is it more effective than RNNs for processing long documents?
#Transformers
#NLP
#Deep Learning
Machine Learning Engineer
•
Technical
•
medium
In the context of credit card fraud detection for a financial client, how would you handle a highly imbalanced dataset where fraudulent transactions represent less than 0.1% of the data?
#Imbalanced Data
#Sampling Techniques
#Evaluation Metrics
Machine Learning Engineer
•
Technical
•
hard
KPMG often works with highly regulated clients. How do you ensure model explainability (XAI) for a complex deep learning model?
#Explainable AI
#SHAP
#LIME
Machine Learning Engineer
•
Technical
•
hard
What is data leakage, and how do you prevent it specifically in time-series forecasting models?
#Time-series
#Data Leakage
#Cross-validation
Machine Learning Engineer
•
Technical
•
hard
Explain how you would use Retrieval-Augmented Generation (RAG) to build a secure Q&A bot for internal tax policy documents.
#LLMs
#RAG
#Vector Databases
Machine Learning Engineer
•
Technical
•
easy
How do you choose between L1 (Lasso) and L2 (Ridge) regularization? When would you use Elastic Net?
#Regularization
#Feature Selection
#Linear Models
Product Manager
•
Behavioral
•
hard
How do you handle a situation where the engineering team says a feature requested by a key Fortune 500 client is impossible to build in the given timeframe?
#Negotiation
#Engineering Collaboration
#Client Management
Product Manager
•
Behavioral
•
medium
Tell me about a time you used data to influence a product decision against strong pushback from leadership.
#Data-Driven Decision Making
#Influence without Authority
Product Manager
•
Behavioral
•
medium
Walk me through your process for conducting user research with busy C-level executives.
#User Research
#Executive Communication
#Discovery
Product Manager
•
Behavioral
•
medium
How would you handle a critical bug discovered in production during the busy tax season?
#Incident Management
#Crisis Communication
#Triage
Product Manager
•
Behavioral
•
medium
How do you align your product roadmap with the broader strategic goals of a professional services firm like KPMG?
#Strategic Alignment
#OKRs
#Business Acumen
Product Manager
•
Behavioral
•
medium
Tell me about a time you had to manage conflicting priorities from different partners or senior stakeholders.
#Stakeholder Management
#Conflict Resolution
#Prioritization
Product Manager
•
Behavioral
•
medium
Tell me about a time you had to step in and lead a team that was lacking direction or demoralized.
#Team Building
#Empathy
#Execution
Product Manager
•
Behavioral
•
medium
Tell me about a time you identified a new market opportunity for an existing product.
#Market Research
#Innovation
#Growth
Product Manager
•
Behavioral
•
easy
Tell me about a time you worked with a distributed engineering team across different time zones (e.g., US and India).
#Remote Work
#Cross-functional Teams
#Communication
Product Manager
•
Behavioral
•
medium
Describe a time you had to pivot a product roadmap due to sudden regulatory or compliance requirements.
#Risk Management
#Agile Delivery
#Roadmapping
Product Manager
•
Behavioral
•
easy
Describe your experience with Agile/Scrum methodologies. How do you run backlog grooming?
#Agile
#Scrum
#Backlog Management
Product Manager
•
Behavioral
•
medium
Tell me about a time you had to learn a complex, highly technical domain quickly to manage a product.
#Continuous Learning
#Technical Acumen
#Onboarding
Product Manager
•
Behavioral
•
medium
Describe a time you had to say 'no' to a senior partner or a major client regarding a feature request.
#Stakeholder Management
#Prioritization
#Communication
Product Manager
•
Behavioral
•
medium
Tell me about a time you failed to deliver a product or feature on time. What happened and what did you learn?
#Accountability
#Retrospectives
#Project Management
Product Manager
•
Coding
•
easy
Write a SQL query to find the top 5 clients by revenue in the last quarter, given 'transactions' and 'clients' tables.
#Data Analysis
#SQL
#Joins
Product Manager
•
Coding
•
hard
Write a SQL query to calculate the month-over-month retention rate of users on a new compliance platform.
#Retention
#Advanced SQL
#Window Functions
Product Manager
•
System Design
•
hard
Design a secure document sharing platform for KPMG clients to upload sensitive financial data.
#Security
#Data Privacy
#Architecture
Product Manager
•
System Design
•
hard
How would you design a machine learning system to detect anomalies in corporate expense reports?
#Machine Learning
#Fraud Detection
#Data Pipelines
Product Manager
•
System Design
•
hard
How would you go about migrating a legacy on-premise financial reporting tool to a cloud-based SaaS model?
#Cloud Migration
#Legacy Systems
#Change Management
Product Manager
•
System Design
•
medium
How would you design a dashboard for audit partners to track engagement profitability in real-time?
#User Experience
#Enterprise Software
#Data Visualization
Product Manager
•
System Design
•
hard
Design an automated risk assessment platform for enterprise clients onboarding vendors.
#B2B SaaS
#Risk Management
#Data Integration
Product Manager
•
System Design
•
hard
How would you design an identity and access management (IAM) flow for a multi-tenant KPMG SaaS product?
#IAM
#Security
#Multi-tenancy
Product Manager
•
System Design
•
hard
Design a system to ingest, process, and visualize millions of rows of transaction data for audit purposes.
#Big Data
#Data Pipelines
#ETL
Product Manager
•
Technical
•
hard
How would you price a new SaaS product developed by KPMG Lighthouse for mid-market accounting firms?
#Pricing Strategy
#Go-to-Market
#B2B SaaS
Product Manager
•
Technical
•
medium
How do you ensure compliance (e.g., GDPR, SOC2) is integrated into your product development lifecycle?
#Compliance
#SDLC
#Risk Management
Product Manager
•
Technical
•
hard
Walk me through how you would prioritize features for a new tax compliance software facing changing IRS regulations.
#Roadmapping
#Compliance
#Agile
Product Manager
•
Technical
•
easy
How do you measure the success of an internal tool used by KPMG consultants to automate data entry?
#Product Metrics
#Internal Tools
#Efficiency
Product Manager
•
Technical
•
easy
Explain how a REST API works to a non-technical audit partner.
#APIs
#Technical Communication
#Client Facing
Product Manager
•
Technical
•
medium
Given a scenario where user engagement on a client advisory portal drops by 20% week-over-week, how would you investigate?
#Root Cause Analysis
#Data Analytics
#Product Sense
Product Manager
•
Technical
•
medium
How do you balance technical debt with delivering new features for a client-facing advisory portal?
#Technical Debt
#Prioritization
#Engineering Collaboration
Product Manager
•
Technical
•
medium
What is your approach to writing PRDs (Product Requirements Documents) for highly regulated industries like banking or healthcare?
#PRDs
#Compliance
#Documentation
Product Manager
•
Technical
•
medium
Imagine you are the PM for KPMG's internal time-tracking software. How would you improve it?
#Product Improvement
#Internal Tools
#UX
Product Manager
•
Technical
•
hard
How do you define an MVP for a complex enterprise data integration platform?
#MVP
#Enterprise Software
#Scope Management
Product Manager
•
Technical
•
hard
What is your strategy for sunsetting an outdated legacy product that a few key clients still use?
#Product Lifecycle
#Sunsetting
#Client Management
Product Manager
•
Technical
•
medium
What metrics would you track to evaluate the adoption of a new AI-driven contract analysis tool?
#AI/ML Products
#Adoption Metrics
#KPIs
Software Engineer
•
Behavioral
•
medium
Tell me about a time you had to explain a complex technical concept to a non-technical stakeholder, such as a Tax Partner or a client executive.
#Communication
#Consulting
#Empathy
Software Engineer
•
Behavioral
•
medium
Describe a time you had to learn a new technology or framework very quickly to deliver a critical project.
#Adaptability
#Continuous Learning
Software Engineer
•
Behavioral
•
medium
Describe a time you found a critical bug right before a major client release. What steps did you take?
#Problem Solving
#Accountability
#Crisis Management
Software Engineer
•
Behavioral
•
medium
Give an example of a time you took the initiative to improve the performance or efficiency of an existing process or codebase.
#Initiative
#Optimization
#Continuous Improvement
Software Engineer
•
Behavioral
•
easy
Why do you want to work at KPMG? What interests you about working in a Big 4 technology consulting environment compared to a traditional tech company?
#Motivation
#Company Knowledge
#Career Goals
Software Engineer
•
Behavioral
•
medium
Tell me about a time you had to manage a difficult client or internal stakeholder who kept changing the project requirements.
#Stakeholder Management
#Communication
#Agile
Software Engineer
•
Behavioral
•
medium
How do you prioritize tasks when working on multiple client engagements simultaneously with overlapping deadlines?
#Time Management
#Prioritization
#Consulting
Software Engineer
•
Behavioral
•
medium
Tell me about a time you disagreed with a senior engineer or a manager regarding a technical decision. How did you handle it?
#Conflict Resolution
#Teamwork
#Communication
Software Engineer
•
Coding
•
medium
Given an array of intervals where intervals[i] = [start_i, end_i], merge all overlapping intervals and return an array of the non-overlapping intervals.
#Arrays
#Sorting
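One possible approach to the interval question above is to sort by start and fold each interval into the last merged one when they overlap. This is a minimal sketch, not a reference answer; the function name `merge_intervals` is hypothetical, and it assumes closed intervals, so touching intervals such as [1, 4] and [4, 5] are merged.

```python
from typing import List


def merge_intervals(intervals: List[List[int]]) -> List[List[int]]:
    """Merge overlapping closed intervals and return them sorted by start."""
    merged: List[List[int]] = []
    # After sorting by start, an interval can only overlap the last merged one.
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlap (or touching endpoints): extend the last interval.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged
```

Sorting dominates the cost, so the sketch runs in O(n log n) time; the input list is not modified.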
Software Engineer
•
Coding
•
medium
Given an array of strings, group the anagrams together. You can return the answer in any order.
#Strings
#Hash Table
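For the anagram question above, one common idea is to use each word's sorted letters as a hash-table key. The sketch below assumes that approach; `group_anagrams` is a hypothetical name, and the order of the returned groups simply follows first appearance in the input.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_anagrams(words: List[str]) -> List[List[str]]:
    """Group words that are anagrams of each other."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for word in words:
        # Anagrams share the same multiset of letters, hence the same sorted key.
        groups[tuple(sorted(word))].append(word)
    return list(groups.values())
```

With n words of length at most k, this takes roughly O(n * k log k) time. A letter-count key would avoid the per-word sort if the alphabet is small and fixed.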
Software Engineer
•
Coding
•
easy
Write a SQL query to find all employees who earn more than their direct managers.
#Database
#Joins
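The question above does not give a schema, so the sketch below assumes a hypothetical `Employee(id, name, salary, manager_id)` table and runs the query through Python's built-in `sqlite3` module so it can be tried locally. The self-join is one possible answer, not the only one.

```python
import sqlite3
from typing import List

# Assumed schema: Employee(id, name, salary, manager_id); manager_id refers to Employee.id.
EARNS_MORE_SQL = """
SELECT e.name
FROM Employee AS e
JOIN Employee AS m ON e.manager_id = m.id
WHERE e.salary > m.salary
ORDER BY e.name;
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employee ("
        "id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, manager_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Employee VALUES (?, ?, ?, ?)",
        [(1, "Joe", 70000, 3), (2, "Henry", 80000, 4),
         (3, "Sam", 60000, None), (4, "Max", 90000, None)],
    )
    return conn


def employees_out_earning_managers(conn: sqlite3.Connection) -> List[str]:
    """Return names of employees whose salary exceeds their direct manager's."""
    return [row[0] for row in conn.execute(EARNS_MORE_SQL)]
```

Employees without a manager drop out of the inner join automatically, which is usually the intended behavior for this question.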
Software Engineer
•
Coding
•
easy
Given the head of a singly linked list, reverse the list, and return the reversed list.
#Linked Lists
#Pointers
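An iterative answer to the linked-list question above can be sketched by re-pointing each node at its predecessor. The `ListNode` class and the helper names are assumptions made for illustration; an interviewer may supply a different node definition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ListNode:
    val: int
    next: Optional["ListNode"] = None


def reverse_list(head: Optional[ListNode]) -> Optional[ListNode]:
    """Reverse a singly linked list in place and return the new head."""
    prev: Optional[ListNode] = None
    while head is not None:
        nxt = head.next   # remember the rest of the list
        head.next = prev  # point the current node backwards
        prev = head       # the current node becomes the new front
        head = nxt
    return prev


def from_values(values: List[int]) -> Optional[ListNode]:
    """Helper: build a linked list from a Python list."""
    head: Optional[ListNode] = None
    for value in reversed(values):
        head = ListNode(value, head)
    return head


def to_values(head: Optional[ListNode]) -> List[int]:
    """Helper: collect node values into a Python list."""
    out: List[int] = []
    while head is not None:
        out.append(head.val)
        head = head.next
    return out
```

The loop uses O(1) extra space; a recursive version is shorter to state but uses O(n) stack depth.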
Software Engineer
•
Coding
•
easy
Given a string s, find the first non-repeating character in it and return its index. If it does not exist, return -1.
#Strings
#Hash Table
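A two-pass sketch for the question above: count every character first, then scan for the first one whose count is 1. The function name `first_unique_char` is hypothetical.

```python
from collections import Counter


def first_unique_char(s: str) -> int:
    """Return the index of the first character that occurs exactly once, or -1."""
    counts = Counter(s)  # first pass: character frequencies
    for index, ch in enumerate(s):
        if counts[ch] == 1:  # second pass: first character seen only once
            return index
    return -1
```

Both passes are linear, so the sketch runs in O(n) time with space bounded by the number of distinct characters.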
Software Engineer
•
Coding
•
medium
Write a SQL query to find the nth highest salary from an Employee table. If there is no nth highest salary, return null.
#Database
#Queries
#Window Functions
Software Engineer
•
Coding
•
easy
Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target. You may assume that each input would have exactly one solution.
#Arrays
#Hash Table
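A single-pass hash-table sketch for the question above: for each value, check whether its complement has already been seen. The name `two_sum` is hypothetical, and raising `ValueError` when no pair exists is an assumption, since the question guarantees exactly one solution.

```python
from typing import Dict, List


def two_sum(nums: List[int], target: int) -> List[int]:
    """Return indices [i, j] with i < j such that nums[i] + nums[j] == target."""
    seen: Dict[int, int] = {}  # value -> index where it was first seen
    for index, value in enumerate(nums):
        earlier = seen.get(target - value)
        if earlier is not None:
            return [earlier, index]
        seen.setdefault(value, index)
    raise ValueError("no two numbers add up to target")
```

This trades O(n) extra space for O(n) time, compared with the O(n^2) brute-force check of every pair.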
Software Engineer
•
Coding
•
medium
Given a string s, find the length of the longest substring without repeating characters.
#Sliding Window
#Strings
#Hash Table
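The sliding-window idea named in the tags above can be sketched as follows: keep the last index of each character and move the left edge past any repeat. The function name is hypothetical.

```python
from typing import Dict


def length_of_longest_substring(s: str) -> int:
    """Return the length of the longest substring with no repeated character."""
    last_seen: Dict[str, int] = {}  # character -> most recent index
    best = 0
    left = 0  # left edge of the current window
    for right, ch in enumerate(s):
        # Only a repeat inside the current window forces the left edge to move.
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best
```

Each character is visited once, so the sketch is O(n) in time. The `>= left` check matters for inputs like "abba", where a stale index would otherwise pull the window backwards.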
Software Engineer
•
Coding
•
medium
Write a SQL query to calculate the cumulative sum of revenue per month for the year 2023.
#Window Functions
#Analytics
#Aggregation
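For the running-total question above, a window function over monthly aggregates is one natural fit. The sketch assumes a hypothetical `Sales(sale_date, amount)` table with ISO-formatted date strings and runs through Python's `sqlite3` module; window functions require SQLite 3.25 or newer, and date functions differ across database engines.

```python
import sqlite3
from typing import List, Tuple

# Assumed schema: Sales(sale_date TEXT in 'YYYY-MM-DD' form, amount INTEGER).
CUMULATIVE_SQL = """
WITH monthly AS (
    SELECT strftime('%Y-%m', sale_date) AS month,
           SUM(amount) AS revenue
    FROM Sales
    WHERE sale_date >= '2023-01-01' AND sale_date < '2024-01-01'
    GROUP BY strftime('%Y-%m', sale_date)
)
SELECT month,
       revenue,
       SUM(revenue) OVER (ORDER BY month) AS running_total
FROM monthly
ORDER BY month;
"""


def build_sales_db(rows: List[Tuple[str, int]]) -> sqlite3.Connection:
    """Create an in-memory Sales table from (sale_date, amount) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sales (sale_date TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO Sales VALUES (?, ?)", rows)
    return conn


def cumulative_revenue_2023(conn: sqlite3.Connection) -> List[Tuple[str, int, int]]:
    """Return (month, monthly revenue, running total) for each 2023 month with sales."""
    return conn.execute(CUMULATIVE_SQL).fetchall()
```

Months with no sales do not appear in the output of this sketch; filling them in would need a calendar table or a generated month series joined to the aggregate.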
Software Engineer
•
Coding
•
medium
Design and implement a data structure for a Least Recently Used (LRU) cache.
#Design
#Hash Table
#Doubly Linked List
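The tags above point to a hash table plus a doubly linked list. As a shortcut, this sketch leans on Python's `collections.OrderedDict`, which provides the same ordering behavior through `move_to_end` and `popitem`; an interviewer may still ask for the linked list to be written by hand. Returning `None` for a missing key is an assumption, since the question does not specify it.

```python
from collections import OrderedDict
from typing import Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items: "OrderedDict[Hashable, object]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[object]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: object) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
```

Both `get` and `put` run in O(1) average time, which is the usual requirement attached to this question.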
Software Engineer
•
Coding
•
medium
Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.
#Heaps
#Hash Table
#Sorting
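For the k-most-frequent question above, a compact sketch is to count with `collections.Counter` and let `most_common(k)` pick the top entries. The function name `top_k_frequent` is hypothetical, and how ties at the cutoff are broken is left unspecified, as in the question.

```python
from collections import Counter
from typing import List


def top_k_frequent(nums: List[int], k: int) -> List[int]:
    """Return the k values that occur most often in nums."""
    counts = Counter(nums)
    # most_common(k) returns (value, count) pairs, highest count first.
    return [value for value, _ in counts.most_common(k)]
```

In an interview it may be worth mentioning the explicit alternatives too: a size-k min-heap over the counts, or bucket sort by frequency for O(n) time.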
Software Engineer
•
Coding
•
hard
Write a SQL query to find the top 3 departments with the highest average salary, including the average salary amount.
#CTEs
#Window Functions
#Aggregation
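The sketch below answers the department question above with a plain `GROUP BY`, `ORDER BY`, and `LIMIT 3`, run through Python's `sqlite3` module. The `Employee(name, department, salary)` schema is an assumption. The tags mention CTEs and window functions; those become useful if tied departments must all be returned, for example by ranking with `DENSE_RANK()`, which this sketch does not do.

```python
import sqlite3
from typing import List, Tuple

# Assumed schema: Employee(name, department, salary).
TOP_DEPARTMENTS_SQL = """
SELECT department, AVG(salary) AS avg_salary
FROM Employee
GROUP BY department
ORDER BY avg_salary DESC, department
LIMIT 3;
"""


def build_department_db(rows: List[Tuple[str, str, int]]) -> sqlite3.Connection:
    """Create an in-memory Employee table from (name, department, salary) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employee (name TEXT, department TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", rows)
    return conn


def top_departments(conn: sqlite3.Connection) -> List[Tuple[str, float]]:
    """Return up to three (department, average salary) pairs, highest average first."""
    return conn.execute(TOP_DEPARTMENTS_SQL).fetchall()
```

The secondary sort on `department` only makes the output deterministic when averages tie; it does not include extra tied rows beyond the third.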
Software Engineer
•
Coding
•
easy
Given a string containing just the characters '(', ')', '{', '}', '[' and ']', determine if the input string is valid.
#Strings
#Stack
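The bracket question above is usually read as: every opener must be closed by the same type of bracket, in the correct order. Under that assumption, a stack-based sketch looks like this; the function name `is_valid` is hypothetical.

```python
def is_valid(s: str) -> bool:
    """Return True if every bracket in s is closed by the right type, in order."""
    pairs = {")": "(", "}": "{", "]": "["}
    stack = []
    for ch in s:
        if ch in pairs:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != pairs[ch]:
                return False
        else:
            stack.append(ch)
    # Any opener left on the stack was never closed.
    return not stack
```

The sketch treats the empty string as valid and runs in O(n) time with O(n) worst-case stack space.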
Software Engineer
•
System Design
•
hard
Design an automated auditing system that ingests millions of transaction records daily from various client ERP systems and flags anomalies.
#Data Pipelines
#Scalability
#Kafka
#Microservices
Software Engineer
•
System Design
•
hard
Design a scalable data pipeline to migrate legacy on-premise financial data to Azure Cloud with zero data loss.
#Cloud Migration
#Azure
#ETL
#Data Engineering
Software Engineer
•
System Design
•
medium
Design a role-based access control (RBAC) system for an enterprise application used by different departments at KPMG.
#Security
#Database Design
#Authorization
Software Engineer
•
System Design
•
medium
Design a microservice architecture for a payroll processing system that integrates with third-party banking APIs.
#Microservices
#Event-Driven Architecture
#Integrations
Software Engineer
•
System Design
•
hard
Design a secure document sharing portal for KPMG tax clients to upload sensitive financial documents.
#Security
#Cloud Storage
#Architecture
#Compliance
Software Engineer
•
Technical
•
medium
How do you secure a RESTful API? What specific mechanisms would you implement for an internal financial tool?
#REST
#Security
#OAuth
#API Design
Software Engineer
•
Technical
•
medium
Explain Dependency Injection and its benefits. How is it implemented in modern frameworks like ASP.NET Core or Spring Boot?
#Design Patterns
#Testing
#Inversion of Control
Software Engineer
•
Technical
•
medium
What are the differences between clustered and non-clustered indexes in a relational database?
#Performance Tuning
#SQL
#Indexing
Software Engineer
•
Technical
•
medium
How do you handle database schema migrations in a CI/CD pipeline to ensure zero downtime?
#CI/CD
#Database Management
#Deployment
Software Engineer
•
Technical
•
medium
Explain the SOLID principles of object-oriented design. Can you give a practical example of the Single Responsibility Principle?
#Design Principles
#Clean Code
#Architecture
Software Engineer
•
Technical
•
medium
Explain the N+1 query problem in the context of ORMs like Entity Framework or Hibernate. How do you resolve it?
#ORMs
#Performance
#SQL
Software Engineer
•
Technical
•
easy
What is the difference between synchronous and asynchronous programming? When would you use async/await in a web application?
#Concurrency
#Async/Await
#Performance
Software Engineer
•
Technical
•
medium
How does Garbage Collection work in .NET (or Java)? How can you optimize an application to reduce GC pressure?
#Memory Management
#Performance
#C#
#Java
Software Engineer
•
Technical
•
medium
Explain the difference between abstract classes and interfaces. When would you use one over the other in a .NET or Java enterprise application?
#OOP
#C#
#Java
#Architecture
Difficulty Radar
Based on recent AI-sourced data.
Meet Your Interviewers
The "Standard" Interviewer
Senior Engineer
Focuses on core competencies, system constraints, and clear communication.
Unwritten Rules
Think Out Loud
Always explain your thought process before writing code or drawing architecture.