Nvidia

Hardware and AI software leader powering the global generative AI revolution.

4 Rounds | ~25 Days | Very Hard

The Interview Loop

Recruiter Screen (30 min)

Standard fit check, behavioral questions, and resume overview.

Technical Loop (3-4 Rounds)

Deep dive into domain knowledge, coding, and system design.

Interview Question Bank

Cloud Engineer Coding hard

Implement a concurrent job scheduler in Go that limits the number of active workers to N. Jobs have different priorities and dependencies. Ensure that high-priority jobs are executed first and dependencies are respected.

#Concurrency #Go #Graph Algorithms
Cloud Engineer Coding medium

Write a script to parse a large distributed system log file (e.g., 50GB) to find all instances of a specific OOM (Out of Memory) error, group them by node ID, and output the top 5 nodes with the most errors. Optimize for memory usage.

#File I/O #Data Structures #Scripting
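
One way to approach the log question above is a single streaming pass that keeps only a per-node counter in memory. This is a minimal sketch, not a reference answer: the line format (`node=<id>`) and the error signature (`Out of memory`) are assumptions, since the question does not specify them, and the name `top_oom_nodes` is hypothetical.

```python
import heapq
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed line format: "... node=<id> ... Out of memory ..." (not given in the question).
NODE_RE = re.compile(r"node=(\S+)")
OOM_MARKER = "Out of memory"  # assumed error signature


def top_oom_nodes(lines: Iterable[str], k: int = 5) -> List[Tuple[str, int]]:
    """Count OOM lines per node in one streaming pass and return the top k."""
    counts: Counter = Counter()
    for line in lines:
        if OOM_MARKER not in line:  # cheap substring test before running the regex
            continue
        match = NODE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    # Highest count first; ties broken by node ID so the output is deterministic.
    return heapq.nsmallest(k, counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Passing an open file object as `lines` reads one line at a time, so memory grows with the number of distinct nodes rather than with the file size.
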
Cloud Engineer Coding medium

Design and implement a thread-safe token bucket rate limiter in Python or Go. How would you scale this across multiple distributed API servers handling requests for Nvidia's NGC container registry?

#Concurrency #Distributed Systems #Python/Go
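
A single-process version of the rate limiter might look like the sketch below. The class name `TokenBucket`, the `allow` method, and the injectable `clock` parameter are illustrative choices, not part of the question. The distributed half of the question is not covered here; it would need shared state with atomic updates across servers.

```python
import threading
import time


class TokenBucket:
    """Thread-safe token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._tokens = capacity      # start with a full bucket
        self._clock = clock          # injectable clock (an assumption, eases testing)
        self._last = clock()
        self._lock = threading.Lock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if available, else return False."""
        with self._lock:             # refill and spend must happen atomically
            now = self._clock()
            elapsed = now - self._last
            self._last = now
            self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False
```

Refilling lazily on each call avoids a background timer thread, and `time.monotonic` keeps the bucket unaffected by wall-clock adjustments.
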
Data Engineer Coding medium

Given a massive log file containing billions of error codes, write a Python program to find the top K most frequent error codes. The file is too large to fit in memory.

#Python #Heaps #External Sorting #Generators
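
A common first answer streams the file and keeps a size-K min-heap over the counts. The sketch below assumes one error code per line and, importantly, that the set of distinct codes fits in memory even though the file does not. If that assumption fails, an external sort or hash partitioning to disk would be needed instead. The name `top_k_codes` is hypothetical.

```python
import heapq
from collections import Counter
from typing import Iterable, List, Tuple


def top_k_codes(lines: Iterable[str], k: int) -> List[Tuple[str, int]]:
    """Return the k most frequent codes as (code, count), most frequent first."""
    if k <= 0:
        return []
    counts: Counter = Counter()
    for line in lines:               # streaming: one line in memory at a time
        code = line.strip()
        if code:
            counts[code] += 1
    heap: List[Tuple[int, str]] = []  # min-heap holding the current top k
    for code, n in counts.items():
        if len(heap) < k:
            heapq.heappush(heap, (n, code))
        elif (n, code) > heap[0]:
            heapq.heapreplace(heap, (n, code))  # evict the smallest of the top k
    return [(code, n) for n, code in sorted(heap, reverse=True)]
```

The heap pass costs O(D log K) for D distinct codes, compared with O(D log D) for sorting every count.
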
Data Engineer Coding hard

Implement an LRU (Least Recently Used) Cache in Python. This is often used to cache database lookups in our ingestion layer.

#Python #Data Structures #Hash Maps #Linked Lists
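
As an illustration, an LRU cache can be sketched on top of `collections.OrderedDict`, which is itself a hash map backed by a doubly linked list. Interviewers may ask for the linked list to be written by hand; this version only shows the behavior. The class and method names are illustrative.

```python
from collections import OrderedDict
from typing import Any, Hashable


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data: OrderedDict = OrderedDict()  # oldest entry first

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._data:
            return default
        self._data.move_to_end(key)      # mark as most recently used
        return self._data[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry
```

Both `get` and `put` run in O(1) average time.
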
Data Engineer Coding medium

Write a Python function to implement a Rate Limiter using the Token Bucket algorithm. This is used to throttle API requests to our internal data services.

#Python #System Design Concepts #Concurrency
Data Engineer Coding hard

Given a list of task dependencies (e.g., Task A must finish before Task B), write a Python function to determine a valid execution order for the tasks. If there is a circular dependency, return an error.

#Graphs #Topological Sort #Python
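
The dependency question above maps onto a topological sort. Here is a minimal sketch using Kahn's algorithm, assuming dependencies arrive as `(before, after)` pairs and that raising `ValueError` is an acceptable way to "return an error"; both are assumptions about the interface.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, Iterable, List, Tuple


def execution_order(deps: Iterable[Tuple[Hashable, Hashable]]) -> List[Hashable]:
    """Return a valid task order; raise ValueError on a circular dependency."""
    successors = defaultdict(list)
    indegree: Dict[Hashable, int] = {}
    for before, after in deps:
        successors[before].append(after)
        indegree.setdefault(before, 0)
        indegree[after] = indegree.get(after, 0) + 1

    ready = deque(task for task, d in indegree.items() if d == 0)
    order: List[Hashable] = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1           # one prerequisite of nxt is now done
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(indegree):      # leftover tasks are stuck in a cycle
        raise ValueError("circular dependency detected")
    return order
```

For example, `execution_order([("A", "B"), ("B", "C"), ("A", "C")])` returns `["A", "B", "C"]`.
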
Data Engineer Coding medium

Design and implement a Least Recently Used (LRU) cache in Python. This is often used in our data access layers to cache frequently queried model metadata.

#Data Structures #Hash Map #Doubly Linked List
Data Engineer Coding medium

Given a massive log file of error codes generated by our DGX systems that cannot fit into memory, write a Python script to find the top K most frequent error codes.

#Python #Heaps #File I/O #Memory Management
Data Engineer Coding medium

Given a list of intervals representing GPU job execution times (start_time, end_time), write a Python function to merge all overlapping intervals.

#Python #Arrays #Sorting
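
A typical solution sorts by start time and then sweeps once. The sketch below treats intervals that merely touch (for example `[1, 4]` and `[4, 5]`) as overlapping, which is an assumption the question leaves open.

```python
from typing import Iterable, List, Sequence


def merge_intervals(intervals: Iterable[Sequence[int]]) -> List[List[int]]:
    """Merge overlapping (start, end) intervals; the result is sorted by start."""
    merged: List[List[int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged
```

Sorting dominates the cost at O(n log n); the sweep itself is linear.
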
Data Engineer Coding medium

Given an array of GPU job execution intervals where intervals[i] = [start_i, end_i], merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the jobs.

#Arrays #Sorting #Python
Data Scientist Coding easy

Given an array of integers representing GPU memory allocations in MB, find the indices of two allocations that sum exactly to a specific target memory limit.

#Hash Maps #Arrays
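
The hash-map approach can be sketched as follows. Returning `None` when no pair exists is an assumption; the question does not say what to do in that case.

```python
from typing import Dict, Optional, Sequence, Tuple


def two_sum(allocations: Sequence[int], target: int) -> Optional[Tuple[int, int]]:
    """Return indices (i, j), i < j, whose values sum to target, or None."""
    seen: Dict[int, int] = {}            # value -> index where it was first seen
    for i, value in enumerate(allocations):
        j = seen.get(target - value)
        if j is not None:
            return (j, i)
        seen.setdefault(value, i)
    return None
```

This makes one pass with O(n) extra space, versus O(n^2) time for checking every pair.
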
Data Scientist Coding medium

Given a string, write a function to find the length of the longest substring without repeating characters.

#Strings #Sliding Window #Hash Map
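
A sliding-window sketch for this question tracks the last index of each character and moves the window start past any repeat. The function name is illustrative.

```python
from typing import Dict


def longest_unique_substring(s: str) -> int:
    """Length of the longest substring of s with no repeated character."""
    last_seen: Dict[str, int] = {}
    start = 0                            # left edge of the current window
    best = 0
    for i, ch in enumerate(s):
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1    # jump past the previous occurrence
        last_seen[ch] = i
        best = max(best, i - start + 1)
    return best
```

The `last_seen[ch] >= start` check matters for inputs such as `"abba"`, where a stale index from outside the window must not move `start` backwards.
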
Data Scientist Coding hard

Given a Directed Acyclic Graph (DAG) representing dependencies of CUDA kernels, write a function to find the critical path (the path with the longest total execution time).

#Graphs #Dynamic Programming #Topological Sort
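
One possible shape for the critical-path answer is dynamic programming over a topological order. The input representation below (a `durations` dict of node execution times plus a list of `(u, v)` edges meaning "u runs before v") is an assumption, as is putting the cost on nodes rather than edges.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, List, Optional, Tuple


def critical_path(durations: Dict[Hashable, float],
                  edges: List[Tuple[Hashable, Hashable]]) -> Tuple[float, list]:
    """Return (total_time, path) for the longest-duration path in a DAG."""
    if not durations:
        return 0, []
    successors = defaultdict(list)
    indegree = {n: 0 for n in durations}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    best = dict(durations)               # best[n]: longest path time ending at n
    parent: Dict[Hashable, Optional[Hashable]] = {n: None for n in durations}
    ready = deque(n for n, d in indegree.items() if d == 0)
    visited = 0
    while ready:                         # Kahn-style topological traversal
        u = ready.popleft()
        visited += 1
        for v in successors[u]:
            if best[u] + durations[v] > best[v]:
                best[v] = best[u] + durations[v]
                parent[v] = u
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if visited != len(durations):
        raise ValueError("graph contains a cycle")

    node = max(best, key=best.get)       # endpoint of the critical path
    total, path = best[node], []
    while node is not None:              # walk parents back to the start
        path.append(node)
        node = parent[node]
    return total, path[::-1]
```

The whole computation is O(V + E), since each node and edge is relaxed exactly once.
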
Data Scientist Coding hard

Write an algorithm to schedule a computational Directed Acyclic Graph (DAG) representing neural network layers across multiple GPUs to minimize cross-device communication overhead.

#Graphs #Topological Sort #Dynamic Programming
Data Scientist Coding medium

Given an M x N matrix representing a batch of images, write a function to perform a 2D convolution with a given K x K kernel without using external libraries like SciPy or PyTorch.

#Arrays #Matrix Manipulation #Computer Vision
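
A plain-Python sketch of the convolution is shown below for a single 2D image with no padding and stride 1 ("valid" mode); batching, padding, and stride are left out and would be assumptions either way. It flips the kernel, which is the mathematical definition of convolution. Deep learning frameworks usually skip the flip and compute cross-correlation, so it is worth asking the interviewer which one is wanted.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def conv2d_valid(image: Sequence[Sequence[float]],
                 kernel: Sequence[Sequence[float]]) -> Matrix:
    """True 2D convolution (kernel flipped), no padding, stride 1."""
    m, n = len(image), len(image[0])
    k = len(kernel)
    # Flip the kernel along both axes; drop this line for cross-correlation.
    flipped = [list(row[::-1]) for row in kernel[::-1]]
    out: Matrix = []
    for i in range(m - k + 1):
        out_row = []
        for j in range(n - k + 1):
            acc = 0
            for di in range(k):
                for dj in range(k):
                    acc += image[i + di][j + dj] * flipped[di][dj]
            out_row.append(acc)
        out.append(out_row)
    return out
```

The output has shape (M - K + 1) x (N - K + 1), and the cost is O(M * N * K^2).
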
Data Scientist Coding medium

Write a Python function to simulate a Monte Carlo estimation of Pi. Then, explain and write the vectorized version using NumPy or CuPy.

#Simulation #Vectorization #Math
Data Scientist Coding medium

Implement a Trie (Prefix Tree) data structure to efficiently store and search through millions of generated text tokens from an LLM.

#Trees #Trie #Strings
Data Scientist Coding medium

Implement a sliding window algorithm to find the maximum GPU temperature over a rolling 5-minute window given a continuous stream of timestamped telemetry data.

#Sliding Window #Queues #Time Series
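
For the rolling maximum, a monotonic deque gives amortized O(1) work per reading. The sketch assumes readings arrive as `(timestamp_seconds, temperature)` pairs in nondecreasing timestamp order and that the window is the half-open span `(t - 300, t]`; neither detail is specified in the question.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Reading = Tuple[float, float]            # (timestamp in seconds, temperature)


def rolling_max(readings: Iterable[Reading],
                window_seconds: float = 300) -> Iterator[Reading]:
    """Yield (timestamp, max temperature within the trailing window)."""
    candidates: Deque[Reading] = deque() # temperatures strictly decreasing
    for ts, temp in readings:
        # Older readings that are not hotter can never be the maximum again.
        while candidates and candidates[-1][1] <= temp:
            candidates.pop()
        candidates.append((ts, temp))
        # Expire readings that fell out of the window (t - window, t].
        while candidates[0][0] <= ts - window_seconds:
            candidates.popleft()
        yield ts, candidates[0][1]
```

Because it is a generator, it can consume an unbounded telemetry stream while holding only the readings that could still become the maximum.
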
Machine Learning Engineer Coding medium

Find the Kth largest element in an unsorted array. Optimize for average time complexity.

#QuickSelect #Heap #Sorting
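
A QuickSelect sketch with a random pivot and three-way partitioning is shown below; it runs in O(n) on average and handles duplicates. It works on a copy so the caller's list is untouched, which is a design choice rather than a requirement of the question.

```python
import random
from typing import List, Sequence


def kth_largest(nums: Sequence[int], k: int) -> int:
    """Return the k-th largest element (k = 1 is the maximum)."""
    if not 1 <= k <= len(nums):
        raise ValueError("k out of range")
    items: List[int] = list(nums)
    target = len(items) - k              # index of the answer in ascending order
    lo, hi = 0, len(items) - 1
    while True:
        pivot = items[random.randint(lo, hi)]
        lt, i, gt = lo, lo, hi           # three-way (Dutch flag) partition
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        # Now items[lo:lt] < pivot, items[lt:gt+1] == pivot, items[gt+1:hi+1] > pivot.
        if target < lt:
            hi = lt - 1
        elif target > gt:
            lo = gt + 1
        else:
            return pivot
```

A size-k min-heap is the usual alternative at O(n log k), and it also works on streams.
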
Machine Learning Engineer Coding medium

Find the Lowest Common Ancestor (LCA) of two nodes in a Binary Tree.

#Trees #Recursion #DFS
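
The recursive DFS answer can be sketched as below. It assumes both nodes are present in the tree and are identified by object identity; the `Node` class is a hypothetical minimal tree node.

```python
from typing import Optional


class Node:
    def __init__(self, val, left: "Optional[Node]" = None,
                 right: "Optional[Node]" = None):
        self.val = val
        self.left = left
        self.right = right


def lowest_common_ancestor(root: Optional[Node], p: Node, q: Node) -> Optional[Node]:
    """Return the deepest node that has both p and q in its subtree."""
    if root is None or root is p or root is q:
        return root
    left = lowest_common_ancestor(root.left, p, q)
    right = lowest_common_ancestor(root.right, p, q)
    if left is not None and right is not None:
        return root                      # p and q sit in different subtrees
    return left if left is not None else right
```

Each node is visited at most once, so the time is O(n) with O(h) recursion depth for a tree of height h.
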
Machine Learning Engineer Coding medium

Implement a sparse matrix multiplication algorithm. Assume the matrices are too large to fit into memory in a dense format.

#Arrays #Math #Data Structures
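
As a rough illustration, sparse multiplication can be written over a dict-of-dicts layout that stores only nonzero entries. The layout `{row: {col: value}}` is an assumption; formats such as CSR are common in practice but need more bookkeeping than fits here.

```python
from collections import defaultdict
from typing import Dict

Sparse = Dict[int, Dict[int, float]]     # {row: {col: nonzero value}}


def sparse_matmul(a: Sparse, b: Sparse) -> Sparse:
    """Multiply two sparse matrices, touching only stored (nonzero) entries."""
    result: Dict[int, Dict[int, float]] = defaultdict(dict)
    for i, a_row in a.items():
        for k, a_ik in a_row.items():
            b_row = b.get(k)
            if not b_row:                # row k of b is all zeros: nothing to add
                continue
            out_row = result[i]
            for j, b_kj in b_row.items():
                out_row[j] = out_row.get(j, 0) + a_ik * b_kj
    return dict(result)
```

The work is proportional to the number of matching nonzero pairs instead of the dense O(n^3). Entries that cancel to zero are kept as explicit zeros in this sketch.
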
Machine Learning Engineer Coding hard

Given an array of k linked lists, each sorted in ascending order, merge them into one sorted linked list and return it.

#Linked Lists #Heaps #Divide and Conquer
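
The heap-based variant can be sketched like this. `ListNode` is a hypothetical minimal node class, and the list index is stored in each heap entry so that ties on value never fall through to comparing node objects.

```python
import heapq
from typing import List, Optional, Tuple


class ListNode:
    def __init__(self, val: int, next: "Optional[ListNode]" = None):
        self.val = val
        self.next = next


def merge_k_lists(lists: List[Optional[ListNode]]) -> Optional[ListNode]:
    """Merge k sorted linked lists by relinking existing nodes."""
    heap: List[Tuple[int, int, ListNode]] = []
    for idx, node in enumerate(lists):
        if node is not None:
            heapq.heappush(heap, (node.val, idx, node))
    dummy = tail = ListNode(0)           # placeholder head simplifies appends
    while heap:
        _, idx, node = heapq.heappop(heap)
        tail.next = node
        tail = node
        if node.next is not None:
            heapq.heappush(heap, (node.next.val, idx, node.next))
    return dummy.next
```

With N total nodes the cost is O(N log k), because the heap never holds more than one node per list.
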
Machine Learning Engineer Coding medium

Given a Directed Acyclic Graph (DAG) representing a neural network computation graph, write an algorithm to find the longest path (critical path) from the input node to the output node.

#Graphs #Dynamic Programming #Topological Sort
Machine Learning Engineer Coding medium

Implement an autocomplete system using a Trie data structure. Include methods to insert a word and return all words that start with a given prefix.

#Trees #Tries #Strings
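
A compact sketch of the autocomplete structure follows. The names `Autocomplete` and `words_with_prefix` are illustrative, and returning matches in sorted order is a choice made here for predictable output, not something the question requires.

```python
from typing import Dict, List


class TrieNode:
    __slots__ = ("children", "is_word")

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


class Autocomplete:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def words_with_prefix(self, prefix: str) -> List[str]:
        node = self.root
        for ch in prefix:                # walk down to the prefix node
            node = node.children.get(ch)
            if node is None:
                return []
        results: List[str] = []
        stack = [(node, prefix)]
        while stack:                     # iterative DFS below the prefix node
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)
```

Insertion costs O(length of word); a prefix query costs O(length of prefix) plus the size of the subtree under it.
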
Machine Learning Engineer Coding hard

Write a function to perform Matrix Multiplication. Optimize it for cache locality using tiling/blocking.

#Matrix Operations #Cache Optimization #C++
Machine Learning Engineer Coding medium

Given a 2D grid map of '1's (land) and '0's (water), count the number of islands. (Context: Autonomous Vehicle occupancy grid analysis).

#Graph Theory #DFS #BFS
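
A BFS flood-fill sketch for the island count is shown below. It assumes 4-directional connectivity and leaves the input grid unmodified by tracking visits in a separate array.

```python
from collections import deque
from typing import Sequence


def count_islands(grid: Sequence[Sequence[str]]) -> int:
    """Count 4-connected groups of '1' cells."""
    if not grid:
        return 0
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "1" or seen[r][c]:
                continue
            count += 1                   # found an unvisited island
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:                 # flood-fill the whole island
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == "1" and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count
```

An iterative BFS avoids the recursion-depth limit that a recursive DFS can hit on large grids in Python.
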
Machine Learning Engineer Coding hard

Merge K sorted linked lists into one sorted linked list.

#Linked Lists #Divide and Conquer #Heap
Software Engineer Coding easy

Find the maximum subarray sum (Kadane's Algorithm).

#Arrays #Dynamic Programming
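
Kadane's algorithm fits in a few lines. This sketch requires a non-empty subarray, so an all-negative input returns its largest element; raising on empty input is an assumption.

```python
from typing import Sequence


def max_subarray_sum(nums: Sequence[int]) -> int:
    """Maximum sum over all non-empty contiguous subarrays."""
    if not nums:
        raise ValueError("nums must be non-empty")
    best = current = nums[0]
    for x in nums[1:]:
        current = max(x, current + x)    # extend the running subarray or restart at x
        best = max(best, current)
    return best
```

It runs in O(n) time and O(1) extra state.
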
Software Engineer Coding medium

Design and implement an LRU (Least Recently Used) cache in C++.

#Hash Map #Doubly Linked List #C++
Software Engineer Coding easy

Given an integer, write a function to determine if it is a power of two using bitwise operators.

#Bit Manipulation #Math
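
The bitwise check relies on the fact that a power of two has exactly one set bit, and `n & (n - 1)` clears the lowest set bit. A minimal sketch:

```python
def is_power_of_two(n: int) -> bool:
    """True if n is 1, 2, 4, 8, ...; zero and negatives are rejected."""
    return n > 0 and (n & (n - 1)) == 0
```

The `n > 0` guard is needed because `0 & -1` is also `0`.
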
Software Engineer Coding hard

You have K sorted streams of telemetry data coming from different sensors. Write an algorithm to merge them into a single sorted stream in real-time.

#Heap #Priority Queue #Linked List
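
A lazy k-way merge over iterators is one way to sketch the stream question. It assumes each stream is already sorted and models a stream as any Python iterable; it blocks while waiting for the next item of a stream, so a real-time system would need timeouts or watermarks that are out of scope here. The standard library's `heapq.merge` offers the same behavior.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

_DONE = object()                         # sentinel marking an exhausted stream


def merge_streams(streams: Iterable[Iterable[float]]) -> Iterator[float]:
    """Lazily merge sorted streams into one sorted stream."""
    iterators = [iter(s) for s in streams]
    heap: List[Tuple[float, int]] = []   # (current head value, stream index)
    for idx, it in enumerate(iterators):
        value = next(it, _DONE)
        if value is not _DONE:
            heap.append((value, idx))
    heapq.heapify(heap)
    while heap:
        value, idx = heap[0]
        yield value
        nxt = next(iterators[idx], _DONE)
        if nxt is _DONE:
            heapq.heappop(heap)          # stream idx is finished
        else:
            heapq.heapreplace(heap, (nxt, idx))
```

Only one pending item per stream is held at a time, so memory stays at O(k) regardless of stream length.
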
Software Engineer Coding medium

Given an array of n + 1 integers where each integer is in the range [1, n] inclusive, find the one repeated number without modifying the array and using only O(1) extra space.

#Two Pointers #Array
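
The constraints point toward Floyd's cycle detection: treating each value as a pointer to the next index turns the array into a linked structure whose cycle entrance is the repeated number. A sketch, assuming the input satisfies the stated range guarantee:

```python
from typing import Sequence


def find_duplicate(nums: Sequence[int]) -> int:
    """Find the repeated value using O(1) extra space, without modifying nums."""
    # Phase 1: slow moves one step, fast moves two, until they meet in the cycle.
    slow = fast = nums[0]
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        if slow == fast:
            break
    # Phase 2: restart slow from the head; they meet at the cycle entrance.
    slow = nums[0]
    while slow != fast:
        slow = nums[slow]
        fast = nums[fast]
    return slow
```

Values lie in [1, n], so index 0 is never pointed to and always sits outside the cycle, which is why it can serve as the starting point.
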
Software Engineer Coding medium

Write a function to multiply two dense matrices. Then, optimize it for CPU cache locality.

#Arrays #Math #Cache Optimization
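
The loop structure of a tiled (blocked) multiply can be sketched in Python as below. This only illustrates the technique: the cache benefit shows up in languages with contiguous arrays such as C or C++, not with Python lists, and the default tile size of 32 is an arbitrary assumption that would be tuned to the cache in practice.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matmul_tiled(a: Sequence[Sequence[float]],
                 b: Sequence[Sequence[float]], tile: int = 32) -> Matrix:
    """Dense matrix multiply with i/k/j loops split into tile-sized blocks."""
    n, m, p = len(a), len(b), len(b[0])
    c: Matrix = [[0] * p for _ in range(n)]
    for ii in range(0, n, tile):             # block row of a and c
        for kk in range(0, m, tile):         # block along the shared dimension
            for jj in range(0, p, tile):     # block column of b and c
                for i in range(ii, min(ii + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + tile, m)):
                        a_ik, row_b = row_a[k], b[k]
                        # Innermost loop walks a row of b and c contiguously.
                        for j in range(jj, min(jj + tile, p)):
                            row_c[j] += a_ik * row_b[j]
    return c
```

The idea is that each tile of `a`, `b`, and `c` is reused many times while it is still resident in cache, instead of streaming whole rows and columns through it.
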
Software Engineer Coding hard

Merge K sorted linked lists.

#Heaps #Linked Lists #Divide and Conquer
Software Engineer Coding medium

Given an array of integers, return the indices of the two numbers that add up to a specific target. How would you optimize this for a highly parallel architecture?

#Parallel Computing #Hash Maps #Arrays

Meet Your Interviewers

The "Standard" Interviewer

Senior Engineer

Focuses on core competencies, system constraints, and clear communication.

Unwritten Rules

Think Out Loud

Always explain your thought process before writing code or drawing architecture.
