Swiggy
Leading Indian food delivery aggregator with complex real-time logistics.
4 Rounds
~15 Days
Hard
The Interview Loop
Recruiter Screen (30 min)
Standard fit check, behavioral questions, and resume overview.
Technical Loop (3-4 Rounds)
Deep dive into domain knowledge, coding, and system design.
Interview Question Bank
DevOps Engineer • Behavioral • medium
Tell me about a time you caused a Sev-1 production outage. How did you handle the immediate mitigation, and what was the outcome of the post-mortem?
#Incident Management
#Post-mortem
#Ownership
DevOps Engineer • Behavioral • medium
Describe a situation where a development team wanted to push a critical feature to production on a Friday evening, but it lacked proper monitoring and alerts. How did you handle it?
#Stakeholder Management
#Pushback
#Production Readiness
DevOps Engineer • Behavioral • medium
How do you balance the need to deliver infrastructure for new product features (like Swiggy Dineout) versus paying down existing technical debt?
#Prioritization
#Technical Debt
#Agile
DevOps Engineer • Behavioral • hard
Tell me about a time you had to collaborate with multiple engineering squads to migrate a legacy system to a new infrastructure platform with zero downtime.
#Collaboration
#Migration
#Project Management
DevOps Engineer • Coding • medium
Write a Python script to parse a 50GB Nginx access log file, find the top 10 IP addresses with the most 5xx errors, and output them in JSON format.
#Python
#Log Parsing
#Data Structures
#Memory Management
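One possible approach to the log-parsing question above, as a minimal sketch: it assumes the default Nginx "combined" log format (client IP first, status code right after the quoted request), and the function name `top_5xx_ips` and the output keys `ip` and `errors` are illustrative choices, not part of the question. The file is read as a stream, so memory use depends on the number of distinct IPs with 5xx responses rather than on the 50GB file size.

```python
import json
import re
import sys
from collections import Counter

# Assumes the default "combined" format:
# IP - user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')


def top_5xx_ips(lines, n=10):
    """Count 5xx responses per client IP from an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        # Lines that do not match the assumed format are skipped silently.
        if match and match.group(2).startswith("5"):
            counts[match.group(1)] += 1
    return [{"ip": ip, "errors": c} for ip, c in counts.most_common(n)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Iterating over the file object reads one line at a time,
    # so the whole log is never held in memory.
    with open(sys.argv[1], "r", errors="replace") as fh:
        print(json.dumps(top_5xx_ips(fh), indent=2))
```

In an interview it is worth stating the format assumption out loud and mentioning that a custom `log_format` would need a different pattern.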
DevOps Engineer • Coding • medium
Write a Bash script to find all pods in a specific Kubernetes namespace that have been in a 'CrashLoopBackOff' state for more than 10 minutes and restart them.
#Bash
#kubectl
#Automation
DevOps Engineer • Coding • medium
Write a Python script using Boto3 to find and delete all unattached EBS volumes older than 30 days across all AWS regions.
#Python
#Boto3
#AWS
#Automation
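A hedged sketch of how the EBS cleanup question above might be approached. It treats a volume in the `available` state as unattached and uses `CreateTime` as the age, which is an assumption: the question does not say whether "older than 30 days" means created or detached 30 days ago. The function names, the `dry_run` flag, and the bootstrap region `ap-south-1` are illustrative. Boto3 is imported inside the function so the selection logic can be exercised without AWS access.

```python
from datetime import datetime, timedelta, timezone


def select_stale_volumes(volumes, now, max_age_days=30):
    """Return IDs of unattached volumes created before the cutoff.

    `volumes` is a list of dicts shaped like the "Volumes" entries
    returned by EC2 describe_volumes.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [
        v["VolumeId"]
        for v in volumes
        if v["State"] == "available" and v["CreateTime"] < cutoff
    ]


def delete_stale_volumes(dry_run=True, max_age_days=30):
    """Walk every enabled region and delete (or just list) stale volumes."""
    import boto3  # third-party dependency, imported lazily

    now = datetime.now(timezone.utc)
    # Any enabled region works for listing regions; ap-south-1 is assumed.
    bootstrap = boto3.client("ec2", region_name="ap-south-1")
    regions = [r["RegionName"] for r in bootstrap.describe_regions()["Regions"]]

    handled = []
    for region in regions:
        ec2 = boto3.client("ec2", region_name=region)
        paginator = ec2.get_paginator("describe_volumes")
        pages = paginator.paginate(
            Filters=[{"Name": "status", "Values": ["available"]}]
        )
        for page in pages:
            for vol_id in select_stale_volumes(page["Volumes"], now, max_age_days):
                if dry_run:
                    print(f"[dry-run] would delete {vol_id} in {region}")
                else:
                    ec2.delete_volume(VolumeId=vol_id)
                handled.append((region, vol_id))
    return handled
```

Defaulting to `dry_run=True` is a deliberate choice for a destructive script; a fuller answer would likely also cover tag-based exclusions and snapshotting before deletion.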
DevOps Engineer • Coding • easy
Write a simple Go program that exposes a /health HTTP endpoint and checks the connectivity to a PostgreSQL database, returning 200 OK or 503 Service Unavailable.
#Golang
#API
#Database Connectivity
DevOps Engineer • System Design • hard
How would you design the Kubernetes cluster architecture to handle Swiggy's New Year's Eve traffic spike where load increases by 10x in minutes?
#Kubernetes
#Autoscaling
#AWS EKS
#Capacity Planning
DevOps Engineer • System Design • hard
Design a centralized logging and monitoring system for Swiggy's 500+ microservices generating terabytes of logs daily.
#ELK/EFK
#Prometheus
#Grafana
#Distributed Tracing
DevOps Engineer • System Design • hard
Design an automated disaster recovery and failover mechanism for Swiggy's primary relational database (Amazon Aurora) across multiple AWS regions.
#Disaster Recovery
#AWS Aurora
#High Availability
#RPO/RTO
DevOps Engineer • System Design • hard
Design a scalable CI/CD pipeline for a monorepo containing 50 microservices, ensuring that only the modified services are built and deployed.
#Monorepo
#Jenkins/GitLab CI
#Build Optimization
DevOps Engineer • System Design • medium
Design the CDN and edge caching strategy for Swiggy's static assets (restaurant images, app icons) to ensure sub-50ms latency for users across India.
#CDN
#AWS CloudFront
#Caching
#Edge Computing
DevOps Engineer • System Design • hard
Swiggy wants to build an Internal Developer Platform (IDP) to allow developers to self-serve infrastructure. How would you design the architecture using tools like Backstage, Terraform, and ArgoCD?
#Platform Engineering
#Backstage
#Self-Service
#GitOps
DevOps Engineer • Technical • medium
Swiggy's AWS bill is growing rapidly due to EC2 and NAT Gateway costs. How would you audit and optimize this without impacting production availability?
#FinOps
#AWS
#Cost Optimization
#Networking
DevOps Engineer • Technical • hard
Explain how you would implement a zero-downtime Canary deployment for the Swiggy Delivery Partner tracking microservice using ArgoCD and Istio.
#ArgoCD
#Istio
#GitOps
#Deployment Strategies
DevOps Engineer • Technical • medium
We use Terraform to manage our AWS infrastructure. How do you handle Terraform state file locking and what happens if the state file gets corrupted?
#Terraform
#State Management
#AWS S3
#DynamoDB
DevOps Engineer • Technical • medium
A Swiggy checkout microservice running on an EC2 instance is suddenly experiencing high CPU utilization (99%). Walk me through your exact troubleshooting steps.
#Troubleshooting
#Linux
#Performance Tuning
DevOps Engineer • Technical • easy
Explain the difference between ClusterIP, NodePort, and LoadBalancer in Kubernetes. Which one would you use for an internal gRPC service at Swiggy and why?
#Networking
#Kubernetes Services
#gRPC
DevOps Engineer • Technical • hard
Swiggy uses Kafka heavily for order state transitions. How do you monitor Kafka consumer lag, and how would you automate the scaling of consumer pods based on this lag?
#Kafka
#KEDA
#Kubernetes Autoscaling
DevOps Engineer • Technical • medium
How would you securely connect a Swiggy VPC in the ap-south-1 region to a third-party payment gateway's VPC without traversing the public internet?
#AWS Networking
#VPC Peering
#Transit Gateway
#PrivateLink
DevOps Engineer • Technical • medium
How do you inject secrets securely into a Jenkins pipeline and subsequently into a Kubernetes pod without exposing them in the source code or environment variables?
#Secret Management
#HashiCorp Vault
#Kubernetes
#Jenkins
DevOps Engineer • Technical • medium
How would you optimize a Dockerfile for a Node.js application to minimize the image size and build time, considering it's deployed hundreds of times a day?
#Docker
#Optimization
#Node.js
DevOps Engineer • Technical • medium
Explain what happens at the OS and network level when a user types swiggy.com into their browser, focusing on DNS resolution and TCP handshake.
#DNS
#TCP/IP
#OSI Model
DevOps Engineer • Technical • medium
We have a batch processing job for Swiggy Instamart inventory that requires high memory. How do you ensure these pods are only scheduled on specific memory-optimized nodes using Taints, Tolerations, and Node Affinity?
#Kubernetes Scheduling
#Taints and Tolerations
#Node Affinity
DevOps Engineer • Technical • medium
How do you structure Terraform code for multiple environments (dev, staging, prod) to ensure DRY principles?
#Terraform
#Code Architecture
#Modules
#Terragrunt
DevOps Engineer • Technical • hard
Swiggy's restaurant menu API relies heavily on Redis. If the Redis cluster experiences a sudden memory eviction issue, how do you troubleshoot and resolve it?
#Redis
#Caching
#Troubleshooting
#Memory Management
DevOps Engineer • Technical • hard
Explain how Prometheus pulls metrics. How would you design the Prometheus architecture to scrape metrics from 10,000 ephemeral pods without overwhelming the Prometheus server?
#Prometheus
#Monitoring
#Architecture
#Thanos/Cortex
DevOps Engineer • Technical • medium
A production server is reporting 'No space left on device', but when you run df -h, it shows 40% free space. What is the likely cause and how do you fix it?
#Linux
#Filesystems
#Troubleshooting
DevOps Engineer • Technical • medium
Explain the concept of IAM Roles for Service Accounts (IRSA) in EKS. Why is it preferred over attaching IAM roles to the underlying EC2 worker nodes?
#AWS IAM
#Kubernetes Security
#EKS
DevOps Engineer • Technical • medium
How would you configure Nginx to rate-limit incoming requests to the Swiggy login API to prevent brute-force attacks?
#Nginx
#Security
#Rate Limiting
DevOps Engineer • Technical • medium
What are Helm hooks? Give an example of how you would use a pre-install hook for a database migration job during a microservice deployment.
#Helm
#Deployments
#Database Migrations
DevOps Engineer • Technical • medium
A developer accidentally committed a file containing AWS access keys to the main branch and pushed it. Walk me through the exact steps to remediate this security breach.
#Git
#Security Incident
#AWS IAM
DevOps Engineer • Technical • easy
Explain the difference between a PersistentVolume (PV), PersistentVolumeClaim (PVC), and StorageClass. How does dynamic provisioning work in AWS EKS?
#Kubernetes Storage
#AWS EBS
#CSI Drivers
DevOps Engineer • Technical • easy
How do you handle sensitive data like database passwords in Ansible playbooks? Explain how Ansible Vault works.
#Ansible
#Security
#Secret Management
Meet Your Interviewers
The "Standard" Interviewer
Senior Engineer
Focuses on core competencies, system constraints, and clear communication.
Unwritten Rules
Think Out Loud
Always explain your thought process before writing code or drawing architecture.