DXC Technology

American multinational B2B IT services provider.

4 Rounds, ~21 Days, Medium difficulty

The Interview Loop

Recruiter Screen (30 min)

Standard fit check, behavioral questions, and resume overview.

Technical Loop (3-4 Rounds)

Deep dive into domain knowledge, coding, and system design.

Interview Question Bank

Data Engineer Behavioral medium

Tell me about a time you had a disagreement with a client or stakeholder regarding technical requirements. How did you resolve it?

#Communication #Conflict Resolution #Consulting
Data Engineer Behavioral medium

How do you manage scope creep in the middle of a complex data migration project?

#Project Management #Client Management #Agile
Data Engineer Behavioral medium

Describe a time you optimized a data pipeline and saved the company or client money.

#Optimization #Cost Reduction #Impact
Data Engineer Behavioral easy

How do you explain complex data engineering concepts, like data lakes or distributed computing, to non-technical business stakeholders?

#Communication #Stakeholder Management
Data Engineer Behavioral medium

Tell me about a time you failed to meet a project deadline. What happened and what did you learn?

#Accountability #Continuous Improvement
Data Engineer Behavioral easy

Why do you want to work at DXC Technology as a Data Engineer?

#Company Knowledge #Motivation
Data Engineer Behavioral medium

How do you ensure data quality and governance in the pipelines you build?

#Data Quality #Governance #Best Practices
Data Engineer Behavioral easy

Describe your experience working in an Agile/Scrum environment with globally distributed teams.

#Agile #Teamwork #Remote Work
Data Engineer Coding medium

Write a SQL query to find the second highest salary from an Employee table without using the LIMIT or TOP keywords.

#SQL #Subqueries #Aggregations
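
One possible answer uses a nested `MAX`, which avoids both `LIMIT` and `TOP`. The sketch below runs the query against an in-memory SQLite database so it can be tried locally; the column names (`id`, `name`, `salary`) and the sample rows are assumptions, since the question only names the `Employee` table.

```python
import sqlite3

# Hypothetical Employee table; columns and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee (name, salary) VALUES (?, ?)",
    [("Ana", 300), ("Ben", 200), ("Cy", 300), ("Di", 100)],
)

# The inner query finds the top salary; the outer MAX picks the
# largest salary strictly below it, so ties at the top are skipped.
SECOND_HIGHEST_SQL = """
SELECT MAX(salary) AS second_highest
FROM Employee
WHERE salary < (SELECT MAX(salary) FROM Employee)
"""

second_highest = conn.execute(SECOND_HIGHEST_SQL).fetchone()[0]
print(second_highest)
```

If the table holds fewer than two distinct salaries, the query returns NULL (`None` in Python), which is worth mentioning to the interviewer as an edge case.
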
Data Engineer Coding medium

Write a SQL query to calculate the cumulative sum of sales by month for each region.

#SQL #Window Functions #Data Aggregation
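
A running total is usually written with `SUM(...) OVER (PARTITION BY ... ORDER BY ...)`. The sketch below is one way to do it, assuming a hypothetical `sales(region, sale_month, amount)` table and an SQLite build with window-function support (3.25 or later); the same SQL shape should carry over to most warehouses.

```python
import sqlite3

# Hypothetical sales table; names and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sale_month TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "2024-01", 100),
        ("East", "2024-01", 50),
        ("East", "2024-02", 70),
        ("West", "2024-01", 20),
        ("West", "2024-03", 30),
    ],
)

# Step 1 (CTE): collapse raw rows to one total per region and month.
# Step 2: accumulate those monthly totals within each region, in month order.
RUNNING_TOTAL_SQL = """
WITH monthly AS (
    SELECT region, sale_month, SUM(amount) AS monthly_sales
    FROM sales
    GROUP BY region, sale_month
)
SELECT region,
       sale_month,
       monthly_sales,
       SUM(monthly_sales) OVER (
           PARTITION BY region
           ORDER BY sale_month
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS cumulative_sales
FROM monthly
ORDER BY region, sale_month
"""

rows = conn.execute(RUNNING_TOTAL_SQL).fetchall()
for row in rows:
    print(row)
```

Stating the frame explicitly (`ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`) makes the intent clear and avoids surprises from the default `RANGE` frame when the ordering column has duplicates.
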
Data Engineer Coding medium

Write a Python function to parse a large server log file and return the top 5 most frequent IP addresses.

#Python #File I/O #Hash Maps #Collections
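
A minimal sketch of one answer: stream the file line by line and count addresses with `collections.Counter`, so memory grows with the number of distinct IPs rather than with file size. It assumes the IPv4 address is the first token on each line, as in common access-log formats; the function names are illustrative.

```python
import re
from collections import Counter

# Assumption: each log line starts with an IPv4 address.
IP_PATTERN = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\b")


def top_ips(lines, n=5):
    """Return the n most frequent leading IP addresses as (ip, count) pairs."""
    counts = Counter()
    for line in lines:
        match = IP_PATTERN.match(line)
        if match:  # silently skip malformed lines
            counts[match.group(1)] += 1
    return counts.most_common(n)


def top_ips_from_file(path, n=5):
    """Stream a log file from disk without loading it into memory."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return top_ips(handle, n)
```

Splitting the counting logic from the file handling keeps `top_ips` easy to test with an in-memory list of lines.
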
Data Engineer Coding easy

Write a Python script using Pandas or PySpark to read a CSV file, filter out rows where the 'status' column is 'failed', and write the output to a Parquet file.

#Python #Pandas #PySpark #ETL
Data Engineer Coding medium

Write a PySpark snippet to group a dataframe by 'department' and calculate the average salary and total employee count for each.

#PySpark #Aggregations #DataFrames
Data Engineer Coding medium

Write a Python generator function that reads a file and yields one line at a time. Why is this useful?

#Python #Generators #Memory Management
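
A short sketch of one possible answer; the function name and the default encoding are assumptions.

```python
def read_lines(path, encoding="utf-8"):
    """Yield one line at a time, without its trailing newline."""
    with open(path, encoding=encoding) as handle:
        # File objects are themselves lazy iterators, so only the
        # current line (plus a small read buffer) is held in memory.
        for line in handle:
            yield line.rstrip("\n")
```

The usefulness comes from laziness: nothing is read until the caller asks for the next line, so a file larger than RAM can be processed, and the caller can stop early without paying for the rest of the file.
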
Data Engineer Coding easy

Given a list of dictionaries representing employee records, write a Python function to sort the list by the 'salary' key in descending order.

#Python #Sorting #Data Structures
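
One way to answer, sketched with `sorted` and `operator.itemgetter`; it assumes every record has a `salary` key holding a comparable value.

```python
from operator import itemgetter


def sort_by_salary_desc(records):
    """Return a new list of records ordered from highest to lowest salary."""
    # sorted() leaves the input untouched and is stable, so records with
    # equal salaries keep their original relative order.
    return sorted(records, key=itemgetter("salary"), reverse=True)


employees = [
    {"name": "Ana", "salary": 90},
    {"name": "Ben", "salary": 120},
    {"name": "Cy", "salary": 90},
]
print(sort_by_salary_desc(employees))
```

If some records might lack the key, a `lambda r: r.get("salary", 0)` key function is a common fallback, at the cost of deciding what a missing salary should mean.
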
Data Engineer System Design hard

Design an ETL pipeline to migrate daily transactional data from an on-premises SQL Server to an AWS S3 data lake, and then to Redshift for reporting.

#AWS #ETL Architecture #Data Lake #Redshift
Data Engineer System Design hard

How would you design a real-time streaming pipeline to process IoT sensor data and generate alerts for anomalies?

#Streaming #Kafka #Spark Streaming #Real-time Processing
Data Engineer System Design medium

Explain the Lambda Architecture. What are its pros and cons compared to the Kappa Architecture?

#Data Architecture #Batch Processing #Stream Processing
Data Engineer System Design hard

Design a Data Lakehouse architecture for a large retail client. How does it differ from a traditional Data Lake?

#Data Lakehouse #Databricks #Delta Lake #Architecture
Data Engineer Technical easy

Explain the difference between RANK(), DENSE_RANK(), and ROW_NUMBER() in SQL. Provide a scenario where you would use each.

#SQL #Window Functions
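
The difference is easiest to see on data with ties. The sketch below runs all three functions side by side in an in-memory SQLite database (window functions need SQLite 3.25 or later); the `Employee` table and its rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("Ana", 300), ("Ben", 200), ("Cy", 300), ("Di", 100)],
)

# RANK leaves a gap after ties, DENSE_RANK does not, and ROW_NUMBER
# always numbers rows 1..n (name is added as a tiebreaker so the
# numbering is deterministic).
RANKING_SQL = """
SELECT name,
       salary,
       RANK()       OVER (ORDER BY salary DESC)       AS rnk,
       DENSE_RANK() OVER (ORDER BY salary DESC)       AS dense_rnk,
       ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS row_num
FROM Employee
ORDER BY salary DESC, name
"""

ranking = conn.execute(RANKING_SQL).fetchall()
for row in ranking:
    print(row)
```

Typical scenarios: `RANK()` for competition-style leaderboards where ties share a place and the next place is skipped, `DENSE_RANK()` for "Nth highest value" questions, and `ROW_NUMBER()` for deduplication, where exactly one row per group must be kept.
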
Data Engineer Technical medium

You have a query that is taking too long to execute. Walk me through the steps you would take to optimize it.

#Performance Tuning #Execution Plans #Indexing
Data Engineer Technical medium

How would you handle processing a 50 GB file in Python on a machine with only 8 GB of RAM?

#Python #Generators #Memory Management #Chunking
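
A common answer is to stream the file in bounded batches and keep only a small running aggregate in memory. The sketch below assumes the file is a CSV with a header row; the function names, column names, and batch size are all illustrative.

```python
import csv
from collections import defaultdict
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size rows from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def total_by_key(path, key_column, value_column, batch_size=100_000):
    """Sum value_column per key_column while holding one batch in memory."""
    totals = defaultdict(float)
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)  # reads lazily, row by row
        for batch in iter_batches(reader, batch_size):
            for row in batch:
                totals[row[key_column]] += float(row[value_column])
    return dict(totals)
```

Peak memory is bounded by the batch size plus the number of distinct keys, not by the file size. With pandas, `read_csv(..., chunksize=...)` follows the same pattern, and for work that does not reduce to a running aggregate a distributed engine such as Spark is the usual next step.
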
Data Engineer Technical medium

Explain the difference between Repartition and Coalesce in PySpark. When would you use one over the other?

#PySpark #Data Partitioning #Performance Optimization
Data Engineer Technical medium

What is a Broadcast Join in Spark? How does it improve performance compared to a Sort Merge Join?

#PySpark #Joins #Distributed Computing
Data Engineer Technical hard

During a PySpark job, you notice that one task takes significantly longer than the others, causing a bottleneck. What is the likely cause and how do you fix it?

#PySpark #Data Skew #Troubleshooting
Data Engineer Technical easy

Explain lazy evaluation in Apache Spark. Why is it beneficial?

#Spark Architecture #DAG #Transformations vs Actions
Data Engineer Technical medium

What is the difference between a Star Schema and a Snowflake Schema? Which one would you prefer for a modern cloud data warehouse?

#Data Warehousing #Dimensional Modeling
Data Engineer Technical medium

Explain Slowly Changing Dimensions (SCD). How do you implement an SCD Type 2 in an ETL pipeline?

#Data Warehousing #ETL #SCD
Data Engineer Technical medium

In AWS, when would you choose to use AWS Glue versus Amazon EMR for your data transformation workloads?

#AWS #Glue #EMR #Serverless
Data Engineer Technical medium

How does Snowflake handle data storage and indexing? Explain the concept of micro-partitions.

#Snowflake #Cloud Data Warehousing #Micro-partitions
Data Engineer Technical medium

What is the optimal file size for storing data in Amazon S3 for querying with Athena or Spark? Why?

#AWS S3 #Big Data Storage #Performance Optimization
Data Engineer Technical medium

How do you handle task dependencies and retries in Apache Airflow?

#Airflow #Orchestration #DAGs
Data Engineer Technical medium

What is the difference between a Common Table Expression (CTE) and a Temporary Table? When would you use one over the other?

#SQL #Database Architecture #Performance
Data Engineer Technical hard

Explain the concept of 'salting' in PySpark. Write a conceptual code snippet showing how you would implement it to fix a skewed join.

#PySpark #Data Skew #Advanced Optimization
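
The idea behind salting can be shown without a cluster. The sketch below is plain Python, not PySpark: it appends a salt to the join key on the large (skewed) side, replicates the small side once per salt value, and checks that the join result is unchanged while the hot key is split into smaller groups. All names, the salt count, and the round-robin salt are assumptions; in PySpark the salt is typically a random integer column on the large side, with the small side exploded across every salt value before joining on the key and salt together.

```python
from collections import Counter, defaultdict

NUM_SALTS = 4  # hypothetical; tune to the degree of skew


def salt_large_side(rows, num_salts=NUM_SALTS):
    """Attach a salt to each key so one hot key becomes num_salts keys."""
    return [((key, i % num_salts), value) for i, (key, value) in enumerate(rows)]


def replicate_small_side(rows, num_salts=NUM_SALTS):
    """Copy every small-side row once per salt so each salted key finds a match."""
    return [((key, salt), value) for key, value in rows for salt in range(num_salts)]


def hash_join(left, right):
    """Inner join of two lists of (key, value) pairs."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


facts = [("hot", i) for i in range(8)] + [("cold", 99)]  # skewed side
dims = [("hot", "H"), ("cold", "C")]                     # small side

plain = hash_join(facts, dims)
salted_facts = salt_large_side(facts)
salted = hash_join(salted_facts, replicate_small_side(dims))
unsalted = [(key, lv, rv) for (key, _salt), lv, rv in salted]

largest_plain_group = Counter(k for k, _ in facts).most_common(1)[0][1]
largest_salted_group = Counter(k for k, _ in salted_facts).most_common(1)[0][1]
print(largest_plain_group, largest_salted_group)
```

The trade-off is that the small side grows by a factor of `NUM_SALTS`, so salting pays off only when one side is small enough to replicate and the skew is severe.
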
Data Engineer Technical medium

Compare Azure Data Factory (ADF) and Azure Databricks. In a modern data stack, how do they complement each other?

#Azure #ADF #Databricks #ETL

Difficulty Radar

Based on recent AI-sourced data.

Meet Your Interviewers

The "Standard" Interviewer

Senior Engineer

Focuses on core competencies, system constraints, and clear communication.

Unwritten Rules

Think Out Loud

Always explain your thought process before writing code or drawing architecture.
