Nvidia
Hardware and AI software leader powering the global generative AI revolution.
4 Rounds
~25 Days
Very Hard
The Interview Loop
Recruiter Screen (30 min)
Standard fit check, behavioral questions, and resume overview.
Technical Loop (3-4 Rounds)
Deep dive into domain knowledge, coding, and system design.
Interview Question Bank
Cloud Engineer
•
Coding
•
hard
Implement a concurrent job scheduler in Go that limits the number of active workers to N. Jobs have different priorities and dependencies. Ensure that high-priority jobs are executed first and dependencies are respected.
#Concurrency
#Go
#Graph Algorithms
Cloud Engineer
•
Coding
•
medium
Write a script to parse a large distributed system log file (e.g., 50GB) to find all instances of a specific OOM (Out of Memory) error, group them by node ID, and output the top 5 nodes with the most errors. Optimize for memory usage.
#File I/O
#Data Structures
#Scripting
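One possible approach is a single streaming pass that keeps only a per-node counter in memory. The sketch below assumes a hypothetical log format in which each line carries a `node=<id>` field and OOM lines contain the text `Out of memory`; both are illustrative assumptions, not a known format.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed (hypothetical) format: "... node=<id> ... Out of memory ..."
NODE_RE = re.compile(r"node=(\S+)")

def top_oom_nodes(lines: Iterable[str], top_n: int = 5,
                  marker: str = "Out of memory") -> List[Tuple[str, int]]:
    """Count OOM lines per node while reading one line at a time."""
    counts = Counter()
    for line in lines:
        if marker not in line:      # cheap substring check before the regex
            continue
        match = NODE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common(top_n)
```

Passing an open file object as `lines` keeps memory proportional to the number of distinct nodes rather than the file size.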
Cloud Engineer
•
Coding
•
medium
Design and implement a thread-safe token bucket rate limiter in Python or Go. How would you scale this across multiple distributed API servers handling requests for Nvidia's NGC container registry?
#Concurrency
#Distributed Systems
#Python/Go
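A minimal single-process sketch of the token bucket, assuming a refill rate in tokens per second and an injectable clock (the `clock` parameter is an illustrative addition that makes the class testable). It does not cover the distributed part of the question.

```python
import threading
import time

class TokenBucket:
    """Thread-safe token bucket; tokens refill lazily on each call."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self._clock = clock
        self._tokens = capacity       # start with a full bucket
        self._last = clock()
        self._lock = threading.Lock()

    def allow(self, cost: float = 1.0) -> bool:
        with self._lock:
            now = self._clock()
            elapsed = now - self._last
            self._last = now
            # Refill based on elapsed time, capped at capacity.
            self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False
```

For several API servers, one common direction is to keep the bucket state in a shared store and update it atomically; that is a design discussion rather than something this sketch implements.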
Data Engineer
•
Coding
•
medium
Given a massive log file containing billions of error codes, write a Python program to find the top K most frequent error codes. The file is too large to fit in memory.
#Python
#Heaps
#External Sorting
#Generators
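A hedged sketch of the counting-plus-heap approach. It assumes the number of distinct error codes fits in memory even though the file does not; if that assumption fails, the input would need to be partitioned or externally sorted first, which is not shown.

```python
import heapq
from collections import Counter
from typing import Iterable, List, Tuple

def top_k_codes(codes: Iterable[str], k: int) -> List[Tuple[str, int]]:
    """Return the k most frequent codes as (code, count), most frequent first."""
    if k <= 0:
        return []
    counts = Counter()
    for code in codes:                 # works with a generator over the file
        counts[code] += 1
    heap = []                          # min-heap of (count, code), at most k items
    for code, n in counts.items():
        if len(heap) < k:
            heapq.heappush(heap, (n, code))
        elif (n, code) > heap[0]:
            heapq.heapreplace(heap, (n, code))
    return [(code, n) for n, code in sorted(heap, reverse=True)]
```

A generator such as `(line.strip() for line in open(path))` can feed `codes` so that only one line is held at a time.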
Data Engineer
•
Coding
•
hard
Implement an LRU (Least Recently Used) Cache in Python. This is often used to cache database lookups in our ingestion layer.
#Python
#Data Structures
#Hash Maps
#Linked Lists
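A compact sketch using `collections.OrderedDict`, which internally pairs a hash map with a linked ordering. An interviewer may instead expect the hash map and doubly linked list to be written by hand; this version only illustrates the eviction behavior.

```python
from collections import OrderedDict

class LRUCache:
    """Least-recently-used cache with O(1) get and put."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data = OrderedDict()     # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)    # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # evict least recently used
```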
Data Engineer
•
Coding
•
medium
Write a Python function to implement a Rate Limiter using the Token Bucket algorithm. This is used to throttle API requests to our internal data services.
#Python
#System Design Concepts
#Concurrency
Data Engineer
•
Coding
•
hard
Given a list of task dependencies (e.g., Task A must finish before Task B), write a Python function to determine a valid execution order for the tasks. If there is a circular dependency, return an error.
#Graphs
#Topological Sort
#Python
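One way to answer this is Kahn's algorithm, sketched below. The `(before, after)` pair format and the choice to raise `ValueError` on a cycle are assumptions made for illustration.

```python
from collections import defaultdict, deque

def execution_order(tasks, dependencies):
    """Return a valid order; dependencies are (before, after) pairs."""
    successors = defaultdict(list)
    indegree = {t: 0 for t in tasks}
    for before, after in dependencies:
        successors[before].append(after)
        indegree.setdefault(before, 0)
        indegree[after] = indegree.get(after, 0) + 1

    ready = deque(t for t, d in indegree.items() if d == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:     # all prerequisites are done
                ready.append(nxt)

    if len(order) != len(indegree):    # some tasks never became ready
        raise ValueError("circular dependency detected")
    return order
```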
Data Engineer
•
Coding
•
medium
Design and implement a Least Recently Used (LRU) cache in Python. This is often used in our data access layers to cache frequently queried model metadata.
#Data Structures
#Hash Map
#Doubly Linked List
Data Engineer
•
Coding
•
medium
Given a massive log file of error codes generated by our DGX systems that cannot fit into memory, write a Python script to find the top K most frequent error codes.
#Python
#Heaps
#File I/O
#Memory Management
Data Engineer
•
Coding
•
medium
Given a list of intervals representing GPU job execution times (start_time, end_time), write a Python function to merge all overlapping intervals.
#Python
#Arrays
#Sorting
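A short sketch of the sort-then-sweep approach. It assumes intervals that merely touch (one ends exactly where the next starts) count as overlapping; that choice would be worth confirming with the interviewer.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals; returns a list of [start, end]."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the last merged interval: extend its end if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged
```

Sorting dominates the cost, so the sketch runs in O(n log n) time.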
Data Engineer
•
Coding
•
medium
Given an array of GPU job execution intervals where intervals[i] = [start_i, end_i], merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the jobs.
#Arrays
#Sorting
#Python
Data Scientist
•
Coding
•
easy
Given an array of integers representing GPU memory allocations in MB, find the indices of two allocations that sum up exactly to a specific target memory limit.
#Hash Maps
#Arrays
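A minimal one-pass hash map sketch. Returning `None` when no pair exists is an assumption; the prompt does not say what to do in that case.

```python
def two_sum(allocations, target):
    """Return indices (i, j) with allocations[i] + allocations[j] == target."""
    seen = {}                               # value -> index where it was seen
    for i, value in enumerate(allocations):
        j = seen.get(target - value)
        if j is not None:
            return (j, i)
        seen[value] = i
    return None
```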
Data Scientist
•
Coding
•
medium
Given a string, write a function to find the length of the longest substring without repeating characters.
#Strings
#Sliding Window
#Hash Map
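The sliding-window idea can be sketched as follows: track the last index of each character and move the window start past any repeat.

```python
def longest_unique_substring(s: str) -> int:
    """Length of the longest substring with no repeated characters."""
    last_seen = {}      # character -> most recent index
    start = 0           # left edge of the current window
    best = 0
    for i, ch in enumerate(s):
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1       # jump past the earlier occurrence
        last_seen[ch] = i
        best = max(best, i - start + 1)
    return best
```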
Data Scientist
•
Coding
•
hard
Given a Directed Acyclic Graph (DAG) representing dependencies of CUDA kernels, write a function to find the critical path (the path with the longest total execution time).
#Graphs
#Dynamic Programming
#Topological Sort
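A sketch of one common approach: process nodes in topological order and keep, for each node, the longest finish time of any path ending there. The input shapes (`durations` as a dict, `edges` as `(u, v)` pairs meaning u runs before v) are assumptions, and the graph is assumed to be non-empty.

```python
from collections import defaultdict, deque

def critical_path(durations, edges):
    """Return (total_time, path) for the longest-duration path in a DAG."""
    successors = defaultdict(list)
    indegree = {n: 0 for n in durations}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    finish = dict(durations)             # longest finish time ending at each node
    parent = {n: None for n in durations}
    ready = deque(n for n, d in indegree.items() if d == 0)
    processed = 0
    while ready:
        u = ready.popleft()
        processed += 1
        for v in successors[u]:
            if finish[u] + durations[v] > finish[v]:
                finish[v] = finish[u] + durations[v]
                parent[v] = u
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if processed != len(durations):
        raise ValueError("graph contains a cycle")

    end = max(finish, key=finish.get)
    total = finish[end]
    path = []
    while end is not None:               # walk parents back to a source node
        path.append(end)
        end = parent[end]
    return total, path[::-1]
```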
Data Scientist
•
Coding
•
hard
Write an algorithm to schedule a computational Directed Acyclic Graph (DAG) representing neural network layers across multiple GPUs to minimize cross-device communication overhead.
#Graphs
#Topological Sort
#Dynamic Programming
Data Scientist
•
Coding
•
medium
Given an M x N matrix representing a batch of images, write a function to perform a 2D convolution with a given K x K kernel without using external libraries like SciPy or PyTorch.
#Arrays
#Matrix Manipulation
#Computer Vision
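A plain-Python sketch for a single 2D matrix, assuming stride 1 and no padding ("valid" output). It flips the kernel, which is the mathematical definition of convolution; deep learning frameworks typically skip the flip and compute cross-correlation, so it is worth asking which one is wanted.

```python
def conv2d_valid(image, kernel):
    """2D convolution of an M x N list-of-lists with a K x K kernel."""
    m, n = len(image), len(image[0])
    k = len(kernel)
    if k > m or k > n:
        raise ValueError("kernel larger than image")
    # Flip the kernel in both dimensions (true convolution).
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for i in range(m - k + 1):
        row = []
        for j in range(n - k + 1):
            acc = 0
            for di in range(k):
                for dj in range(k):
                    acc += image[i + di][j + dj] * flipped[di][dj]
            row.append(acc)
        out.append(row)
    return out
```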
Data Scientist
•
Coding
•
medium
Write a Python function to simulate a Monte Carlo estimation of Pi. Then, explain and write the vectorized version using NumPy or CuPy.
#Simulation
#Vectorization
#Math
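A sketch of the scalar version only, using the standard library. The `seed` parameter is an illustrative addition for reproducibility.

```python
import random

def estimate_pi(samples: int, seed=None) -> float:
    """Estimate pi from the fraction of random points inside the unit quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples
```

A vectorized variant would draw both coordinate arrays in one call and replace the loop with an array comparison and a sum; the structure of the estimate stays the same.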
Data Scientist
•
Coding
•
medium
Implement a Trie (Prefix Tree) data structure to efficiently store and search through millions of generated text tokens from an LLM.
#Trees
#Trie
#Strings
Data Scientist
•
Coding
•
medium
Implement a sliding window algorithm to find the maximum GPU temperature over a rolling 5-minute window given a continuous stream of timestamped telemetry data.
#Sliding Window
#Queues
#Time Series
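A monotonic-deque sketch. It assumes timestamps are in seconds and arrive in non-decreasing order, and it treats the window as the half-open range `(t - 300, t]`; these are assumptions about the telemetry, not stated in the prompt.

```python
from collections import deque

class RollingMax:
    """Maximum reading over a trailing time window, amortized O(1) per sample."""

    def __init__(self, window_seconds: float = 300):
        self.window = window_seconds
        self._candidates = deque()   # (timestamp, temp), temps strictly decreasing

    def add(self, timestamp: float, temperature: float) -> float:
        # Older readings that are not hotter can never be the maximum again.
        while self._candidates and self._candidates[-1][1] <= temperature:
            self._candidates.pop()
        self._candidates.append((timestamp, temperature))
        # Drop readings that have fallen out of the window.
        while self._candidates[0][0] <= timestamp - self.window:
            self._candidates.popleft()
        return self._candidates[0][1]
```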
Machine Learning Engineer
•
Coding
•
medium
Find the Kth largest element in an unsorted array. Optimize for average time complexity.
#QuickSelect
#Heap
#Sorting
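A quickselect sketch with a random pivot and three-way partitioning, which gives O(n) average time. It works on a copy so the caller's list is left unchanged; that is a design choice rather than a requirement of the prompt.

```python
import random

def kth_largest(nums, k: int):
    """Return the k-th largest element (k=1 is the maximum)."""
    if not 1 <= k <= len(nums):
        raise ValueError("k out of range")
    items = list(nums)
    target = len(items) - k            # index of the answer in ascending order
    lo, hi = 0, len(items) - 1
    while True:
        pivot = items[random.randint(lo, hi)]
        i, lt, gt = lo, lo, hi
        while i <= gt:                 # three-way partition around pivot
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[gt], items[i] = items[i], items[gt]
                gt -= 1
            else:
                i += 1
        if target < lt:
            hi = lt - 1
        elif target > gt:
            lo = gt + 1
        else:
            return pivot               # target falls in the block equal to pivot
```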
Machine Learning Engineer
•
Coding
•
medium
Find the Lowest Common Ancestor (LCA) of two nodes in a Binary Tree.
#Trees
#Recursion
#DFS
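A recursive DFS sketch. It assumes both target nodes are present in the tree and are identified by object identity; the `TreeNode` class is a hypothetical helper for illustration.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def lowest_common_ancestor(root, p, q):
    """Return the deepest node that has both p and q in its subtree."""
    if root is None or root is p or root is q:
        return root
    left = lowest_common_ancestor(root.left, p, q)
    right = lowest_common_ancestor(root.right, p, q)
    if left is not None and right is not None:
        return root                    # p and q are in different subtrees
    return left if left is not None else right
```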
Machine Learning Engineer
•
Coding
•
medium
Implement a sparse matrix multiplication algorithm. Assume the matrices are too large to fit into memory in a dense format.
#Arrays
#Math
#Data Structures
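A sketch using a dictionary-of-keys layout, where each matrix is a dict mapping `(row, col)` to a nonzero value. That storage format is an assumption; compressed row formats are a common alternative.

```python
from collections import defaultdict

def sparse_matmul(a, b):
    """Multiply sparse matrices stored as {(row, col): value} dicts."""
    # Index B by row so each nonzero of A only visits matching entries.
    b_rows = defaultdict(list)
    for (k, j), value in b.items():
        b_rows[k].append((j, value))

    result = defaultdict(float)
    for (i, k), a_value in a.items():
        for j, b_value in b_rows.get(k, ()):
            result[(i, j)] += a_value * b_value
    # Drop entries that cancelled out to zero.
    return {key: value for key, value in result.items() if value != 0}
```

The work is proportional to the number of matching nonzero pairs rather than to the dense dimensions.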
Machine Learning Engineer
•
Coding
•
hard
Given an array of k linked-lists, each linked-list is sorted in ascending order. Merge all the linked-lists into one sorted linked-list and return it.
#Linked Lists
#Heaps
#Divide and Conquer
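A min-heap sketch that runs in O(N log k) for N total nodes. The `ListNode` class is a hypothetical helper; the list index is stored in each heap entry as a tie-breaker so nodes themselves are never compared.

```python
import heapq

class ListNode:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

def merge_k_lists(lists):
    """Merge k sorted linked lists by relinking their nodes; returns the new head."""
    heap = []
    for idx, node in enumerate(lists):
        if node is not None:
            heapq.heappush(heap, (node.val, idx, node))

    dummy = tail = ListNode(None)
    while heap:
        _, idx, node = heapq.heappop(heap)
        tail.next = node
        tail = node
        if node.next is not None:
            heapq.heappush(heap, (node.next.val, idx, node.next))
    return dummy.next
```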
Machine Learning Engineer
•
Coding
•
medium
Given a Directed Acyclic Graph (DAG) representing a neural network computation graph, write an algorithm to find the longest path (critical path) from the input node to the output node.
#Graphs
#Dynamic Programming
#Topological Sort
Machine Learning Engineer
•
Coding
•
medium
Implement an autocomplete system using a Trie data structure. Include methods to insert a word and return all words that start with a given prefix.
#Trees
#Tries
#Strings
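A minimal sketch with the two requested methods. Returning matches in sorted order is an assumption made so that results are deterministic.

```python
class TrieNode:
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}     # character -> TrieNode
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True

    def starts_with(self, prefix: str):
        """Return all inserted words that begin with prefix, sorted."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]           # iterative DFS below the prefix node
        while stack:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)
```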
Machine Learning Engineer
•
Coding
•
hard
Write a function to perform Matrix Multiplication. Optimize it for cache locality using tiling/blocking.
#Matrix Operations
#Cache Optimization
#C++
Machine Learning Engineer
•
Coding
•
medium
Given a 2D grid map of '1's (land) and '0's (water), count the number of islands. (Context: Autonomous Vehicle occupancy grid analysis).
#Graph Theory
#DFS
#BFS
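A BFS sketch that treats cells as connected only horizontally and vertically (an assumption; diagonal connectivity would change the neighbor list). It avoids recursion so large grids do not hit Python's recursion limit.

```python
from collections import deque

def count_islands(grid) -> int:
    """Count groups of 4-connected '1' cells in a grid of '0'/'1' characters."""
    if not grid:
        return 0
    rows, cols = len(grid), len(grid[0])
    seen = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "1" or (r, c) in seen:
                continue
            count += 1                     # found an unvisited island
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] == "1" and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
    return count
```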
Machine Learning Engineer
•
Coding
•
hard
Merge K sorted linked lists into one sorted linked list.
#Linked Lists
#Divide and Conquer
#Heap
Software Engineer
•
Coding
•
easy
Find the maximum subarray sum (Kadane's Algorithm).
#Arrays
#Dynamic Programming
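A short sketch of Kadane's algorithm, assuming the subarray must be non-empty (so an all-negative input returns its largest element).

```python
def max_subarray_sum(nums):
    """Maximum sum of any non-empty contiguous subarray."""
    if not nums:
        raise ValueError("nums must be non-empty")
    best = current = nums[0]
    for x in nums[1:]:
        current = max(x, current + x)   # extend the running subarray or restart at x
        best = max(best, current)
    return best
```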
Software Engineer
•
Coding
•
medium
Design and implement an LRU (Least Recently Used) cache in C++.
#Hash Map
#Doubly Linked List
#C++
Software Engineer
•
Coding
•
easy
Given an integer, write a function to determine if it is a power of two using bitwise operators.
#Bit Manipulation
#Math
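The usual bit trick can be sketched as follows: a positive power of two has exactly one set bit, so clearing the lowest set bit with `n & (n - 1)` leaves zero.

```python
def is_power_of_two(n: int) -> bool:
    """True if n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0
```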
Software Engineer
•
Coding
•
hard
You have K sorted streams of telemetry data coming from different sensors. Write an algorithm to merge them into a single sorted stream in real-time.
#Heap
#Priority Queue
#Linked List
Software Engineer
•
Coding
•
medium
Given an array of integers containing n + 1 integers where each integer is in the range [1, n] inclusive, find the one repeated number without modifying the array and using only O(1) extra space.
#Two Pointers
#Array
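One approach that meets both constraints is Floyd's cycle detection, sketched below. It treats each value as a pointer to the next index, so the repeated value is the entry point of a cycle; the sketch assumes the input satisfies the stated range guarantee.

```python
def find_duplicate(nums):
    """Find the repeated value using O(1) extra space and no writes to nums."""
    # Phase 1: advance slow by one step and fast by two until they meet.
    slow = fast = nums[0]
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        if slow == fast:
            break
    # Phase 2: restart one pointer; they meet at the cycle entry.
    slow = nums[0]
    while slow != fast:
        slow = nums[slow]
        fast = nums[fast]
    return slow
```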
Software Engineer
•
Coding
•
medium
Write a function to multiply two dense matrices. Then, optimize it for CPU cache locality.
#Arrays
#Math
#Cache Optimization
Software Engineer
•
Coding
•
hard
Merge K sorted linked lists.
#Heaps
#Linked Lists
#Divide and Conquer
Software Engineer
•
Coding
•
medium
Given an array of integers, return the indices of the two numbers that add up to a specific target. How would you optimize this for a highly parallel architecture?
#Parallel Computing
#Hash Maps
#Arrays
Difficulty Radar
Based on recent AI-sourced data.
Meet Your Interviewers
The "Standard" Interviewer
Senior Engineer
Focuses on core competencies, system constraints, and clear communication.
Unwritten Rules
Think Out Loud
Always explain your thought process before writing code or drawing architecture.