DXC Technology

American multinational B2B IT services provider.

4 Rounds, ~21 Days, Medium difficulty

The Interview Loop

Recruiter Screen (30 min)

Standard fit check, behavioral questions, and resume overview.

Technical Loop (3-4 Rounds)

Deep dive into domain knowledge, coding, and system design.

Interview Question Bank

Data Scientist Behavioral medium

Tell me about a time you had to explain a highly technical machine learning concept to a non-technical client or business stakeholder.

#Stakeholder Management #Communication #Consulting
Data Scientist Behavioral medium

Describe a time when a client had unrealistic expectations about what AI/ML could achieve for their legacy systems. How did you manage the situation?

#Client Management #Conflict Resolution #Expectation Management
Data Scientist Behavioral medium

Tell me about a time you received a dataset from a client that was completely undocumented and messy. How did you proceed?

#Data Exploration #Client Communication #Ambiguity
Data Scientist Behavioral medium

Tell me about a time a machine learning model you built failed or underperformed in production. What was the root cause and how did you fix it?

#Failure Analysis #Continuous Improvement #Accountability
Data Scientist Behavioral easy

Why do you want to work as a Data Scientist at DXC Technology specifically?

#Company Knowledge #Motivation #Career Goals
Data Scientist Behavioral easy

Describe a project where you had to collaborate closely with Data Engineers and DevOps/MLOps teams to deliver a solution.

#Teamwork #Cross-functional Collaboration #Agile
Data Scientist Coding medium

Write a SQL query to calculate the cumulative sum of revenue per month for the year 2023.

#SQL #Window Functions #Cumulative Sum
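
One possible answer, sketched under assumptions: the table `sales` and its columns `sale_date` (ISO date text) and `revenue` are hypothetical, and the SQL is run through Python's built-in `sqlite3` module so it can be tried locally. `strftime` is SQLite-specific, and window functions need SQLite 3.25 or newer; other databases would use their own month-truncation function.

```python
import sqlite3

# Hypothetical schema: sales(sale_date TEXT 'YYYY-MM-DD', revenue INTEGER).
CUMULATIVE_REVENUE_SQL = """
SELECT sale_month,
       monthly_revenue,
       SUM(monthly_revenue) OVER (ORDER BY sale_month) AS cumulative_revenue
FROM (
    SELECT strftime('%Y-%m', sale_date) AS sale_month,
           SUM(revenue) AS monthly_revenue
    FROM sales
    WHERE sale_date >= '2023-01-01' AND sale_date < '2024-01-01'
    GROUP BY sale_month
)
ORDER BY sale_month
"""


def cumulative_revenue_2023(conn):
    """Return (month, monthly revenue, running total) rows for 2023."""
    return conn.execute(CUMULATIVE_REVENUE_SQL).fetchall()
```

The inner query aggregates to one row per month first, so the running total in the outer query accumulates monthly totals rather than individual sales.
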
Data Scientist Coding medium

Write a SQL query using window functions to calculate the 7-day rolling average of daily transaction volumes for a specific enterprise client.

#SQL #Window Functions #Data Aggregation
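
A minimal sketch of one approach, assuming a hypothetical table `daily_transactions(client_id, txn_date, volume)` with exactly one row per client per day and no missing days. It is run through Python's `sqlite3` module (SQLite 3.25+ for window functions). If days can be missing, a row-based frame no longer spans seven calendar days and the data would need to be joined to a calendar table first.

```python
import sqlite3

# Hypothetical schema: daily_transactions(client_id TEXT, txn_date TEXT, volume INTEGER).
ROLLING_AVG_SQL = """
SELECT txn_date,
       volume,
       AVG(volume) OVER (
           ORDER BY txn_date
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS rolling_avg_7d
FROM daily_transactions
WHERE client_id = ?
ORDER BY txn_date
"""


def rolling_average(conn, client_id):
    """Return (date, volume, 7-row rolling average) for one client."""
    return conn.execute(ROLLING_AVG_SQL, (client_id,)).fetchall()
```

Note that the first six rows average over fewer than seven days; whether to keep or drop them is worth raising with the interviewer.
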
Data Scientist Coding easy

Write a Python script using Pandas to merge two large datasets (e.g., client CRM data and transaction logs), handle missing values, and group by a specific category to find the sum.

#Python #Pandas #Data Cleaning
Data Scientist Coding medium

Write a Python function from scratch to calculate the TF-IDF scores for a given list of documents without using Scikit-Learn.

#Python #NLP #Math Implementation
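
A minimal sketch of one way to answer, assuming whitespace tokenization, term frequency defined as count divided by document length, and the unsmoothed inverse document frequency log(N / df). These definitions are assumptions chosen for illustration; Scikit-Learn's defaults use a smoothed variant with normalization, so the numbers will not match its output.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    Assumes tf = count / document length and idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores
```

With this definition, a term that appears in every document scores exactly zero, which is a useful sanity check to mention while talking through the solution.
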
Data Scientist Coding medium

Write a SQL query to find the top 3 highest-paid employees in each department. Handle ties appropriately.

#SQL #Window Functions #Ranking
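
One possible reading of "handle ties appropriately" is to return everyone whose salary is among the top three distinct salaries in their department, which `DENSE_RANK` gives directly. The sketch below assumes a hypothetical table `employees(name, department, salary)` and runs the SQL through Python's `sqlite3` module (SQLite 3.25+).

```python
import sqlite3

# Hypothetical schema: employees(name TEXT, department TEXT, salary INTEGER).
TOP_PAID_SQL = """
SELECT department, name, salary
FROM (
    SELECT department, name, salary,
           DENSE_RANK() OVER (
               PARTITION BY department
               ORDER BY salary DESC
           ) AS salary_rank
    FROM employees
)
WHERE salary_rank <= 3
ORDER BY department, salary DESC, name
"""


def top_paid_per_department(conn):
    """Return employees whose salary is in the top 3 distinct salaries per department."""
    return conn.execute(TOP_PAID_SQL).fetchall()
```

If the interviewer wants exactly three rows per department, `ROW_NUMBER` with a deterministic tie-breaker would be the alternative; `RANK` would skip positions after a tie.
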
Data Scientist Coding hard

Implement the K-Means clustering algorithm from scratch in Python. You do not need to optimize it, but the logic must be sound.

#Python #Machine Learning Algorithms #Math
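
A minimal, unoptimized sketch of the standard assign-then-update loop (Lloyd's algorithm) in pure Python. The function names, the random initialization from the data points, and the `seed` parameter are illustrative choices, not a required interface.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iters=100, seed=0):
    """Cluster `points` (a list of tuples) into k groups.

    Returns (centroids, assignments), where assignments[i] is the
    index of the centroid nearest to points[i].
    """
    rng = random.Random(seed)
    # Initialize centroids as k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None

    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # Converged: no point changed cluster.
        assignments = new_assignments

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # Leave an empty cluster's centroid where it is.
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]

    return centroids, assignments
```

Points worth raising alongside the code: the result depends on initialization, empty clusters need a policy (here the centroid is simply left in place), and the algorithm converges to a local optimum rather than a guaranteed global one.
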
Data Scientist Coding easy

Given an array of integers and a target sum, write a Python function to return the indices of the two numbers that add up to the target. (Two Sum)

#Python #Hash Maps #Time Complexity
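
A short sketch of the usual hash-map approach, which runs in a single pass (O(n) time, O(n) extra space). Returning `None` when no pair exists is an assumption; the question does not say what to do in that case.

```python
def two_sum(nums, target):
    """Return indices (i, j), i < j, with nums[i] + nums[j] == target, else None."""
    seen = {}  # Maps a value already visited to its index.
    for i, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return seen[complement], i
        seen[value] = i
    return None
```

Checking for the complement before storing the current value means an element is never paired with itself, which matters for inputs such as `[3, 3]` with target 6.
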
Data Scientist Coding hard

Write a SQL query to find the retention rate of users. Specifically, find the percentage of users who logged in on day 1 and also logged in on day 2.

#SQL #Self Joins #Cohort Analysis #Product Analytics
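
The question leaves "day 1" and "day 2" open, so the sketch below assumes day 1 is each user's first login date and day 2 is the following calendar day. The table `logins(user_id, login_date)` is hypothetical, and the SQL is run through Python's `sqlite3` module; the `date(..., '+1 day')` call is SQLite-specific.

```python
import sqlite3

# Hypothetical schema: logins(user_id INTEGER, login_date TEXT 'YYYY-MM-DD').
RETENTION_SQL = """
WITH first_login AS (
    SELECT user_id, MIN(login_date) AS first_date
    FROM logins
    GROUP BY user_id
)
SELECT 100.0 * COUNT(DISTINCT l.user_id) / COUNT(DISTINCT f.user_id) AS retention_pct
FROM first_login AS f
LEFT JOIN logins AS l
       ON l.user_id = f.user_id
      AND l.login_date = date(f.first_date, '+1 day')
"""


def day_two_retention(conn):
    """Return the percentage of users who logged in again the day after their first login."""
    return conn.execute(RETENTION_SQL).fetchone()[0]
```

The `LEFT JOIN` keeps users who never returned in the denominator, and `COUNT(DISTINCT ...)` guards against users with several logins on the same day.
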
Data Scientist System Design hard

Design an end-to-end architecture for deploying a predictive maintenance model for a manufacturing client. How do you get data from IoT sensors to the model and return predictions?

#IoT #MLOps #Cloud Architecture #Streaming
Data Scientist System Design medium

How would you monitor a deployed machine learning model for concept drift and data drift? What steps would you take if drift is detected?

#Model Monitoring #Concept Drift #Data Drift #Production ML
Data Scientist System Design hard

Design an NLP pipeline to extract key clauses and entities from thousands of scanned legacy PDF contracts for a legal client.

#NLP #OCR #Information Extraction #LLMs
Data Scientist System Design medium

Design a system to automatically categorize and route incoming IT support tickets to the correct resolution team using machine learning.

#NLP #Text Classification #API Design #Enterprise Architecture
Data Scientist System Design hard

Design a real-time anomaly detection system for network security logs. The system must process 100,000 events per second.

#Streaming #Anomaly Detection #Scalability #Cybersecurity
Data Scientist System Design medium

If a client wants to migrate their on-premise machine learning workloads to the cloud, how would you evaluate whether to use AWS SageMaker vs Azure Machine Learning?

#Cloud Computing #AWS #Azure #MLOps
Data Scientist Technical medium

Explain the difference between Random Forest and Gradient Boosting. In what scenario would you choose one over the other for a client's predictive model?

#Ensemble Methods #Random Forest #XGBoost #Model Selection
Data Scientist Technical medium

We often work with clients in the financial sector. How would you handle a highly imbalanced dataset for a credit card fraud detection model?

#Imbalanced Data #SMOTE #Class Weights #Fraud Detection
Data Scientist Technical easy

Explain the Bias-Variance tradeoff. How does regularizing a model affect its bias and variance?

#Model Evaluation #Regularization #Statistical Theory
Data Scientist Technical medium

What is the mathematical difference between L1 (Lasso) and L2 (Ridge) regularization? When would you use L1 over L2?

#Regularization #Linear Models #Feature Selection
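
For reference, the two penalized least-squares objectives can be written as follows. The notation is assumed here for illustration: $X$ is the feature matrix, $y$ the target, $\beta$ the coefficient vector with $p$ entries, and $\lambda \ge 0$ the penalty strength.

```latex
\hat{\beta}_{\text{ridge}} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \sum_{j=1}^{p} \beta_j^2
\qquad
\hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

The absolute-value penalty can drive some coefficients to exactly zero, which is why L1 is commonly preferred when feature selection is part of the goal; the squared penalty shrinks coefficients toward zero without eliminating them.
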
Data Scientist Technical medium

Explain the concept of statistical power and p-values in the context of A/B testing a new feature on an enterprise web portal.

#A/B Testing #Hypothesis Testing #P-value
Data Scientist Technical hard

How do you scale a machine learning data preparation pipeline when the dataset is too large to fit into a single machine's RAM? Explain how you would use PySpark.

#PySpark #Distributed Computing #Big Data
Data Scientist Technical easy

What are your strategies for handling missing data in a dataset provided by a client? How do you decide between imputation and dropping rows?

#Data Cleaning #Imputation #EDA
Data Scientist Technical medium

Why can't you use standard k-fold cross-validation for time series forecasting? What should you use instead?

#Cross-Validation #Time Series #Model Evaluation
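
One common alternative is an expanding-window (walk-forward) split, where each fold trains only on observations that precede its test window. The helper below is a hypothetical pure-Python sketch of that idea; scikit-learn's `TimeSeriesSplit` offers a similar scheme.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Return a list of (train_indices, test_indices) pairs in time order.

    Every training set ends immediately before its test window, so no
    future observation leaks into training.
    """
    splits = []
    for i in range(n_splits):
        # Test windows are laid out back to back at the end of the series.
        test_end = n_samples - (n_splits - 1 - i) * test_size
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("not enough samples for the requested splits")
        train = list(range(test_start))
        test = list(range(test_start, test_end))
        splits.append((train, test))
    return splits
```

For 10 samples, 3 splits, and a test size of 2, the folds train on indices 0-3, 0-5, and 0-7 and test on 4-5, 6-7, and 8-9 respectively.
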
Data Scientist Technical medium

When evaluating a binary classifier, when would you prefer the Precision-Recall curve over the ROC curve?

#ROC-AUC #Precision-Recall #Imbalanced Data
Data Scientist Technical medium

What is multicollinearity in a regression model? How do you detect it, and how do you resolve it?

#Linear Regression #VIF #Feature Engineering
Data Scientist Technical easy

What are the most effective techniques for detecting outliers in a multi-dimensional dataset?

#Outlier Detection #Isolation Forest #Z-score
Data Scientist Technical hard

Explain the self-attention mechanism in Transformer models. Why has it largely replaced RNNs for NLP tasks?

#NLP #Transformers #Attention Mechanism #LLMs
Data Scientist Technical medium

Explain the kernel trick in Support Vector Machines (SVM). Name two common kernels and when you would use them.

#SVM #Algorithms #Math
Data Scientist Technical medium

Compare ARIMA and Prophet for time series forecasting. What are the pros and cons of each?

#ARIMA #Prophet #Forecasting
Data Scientist Technical easy

What is the time complexity of looking up a value in a Python list versus a Python dictionary? Why?

#Python #Time Complexity #Hash Tables
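
A small illustration of the difference, using a hypothetical `CountingKey` class that counts equality comparisons instead of measuring wall-clock time. A membership test on a list scans elements one by one (O(n)), while a dictionary hashes the key and jumps to the matching slot (O(1) on average). The exact dictionary count reflects CPython behavior and is not asserted as a language guarantee.

```python
class CountingKey:
    """Hypothetical key type that counts how often == is evaluated."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        CountingKey.comparisons += 1
        return isinstance(other, CountingKey) and self.value == other.value


def count_comparisons(container, probe):
    """Return (found, number of == calls) for `probe in container`."""
    CountingKey.comparisons = 0
    found = probe in container
    return found, CountingKey.comparisons


n = 1000
keys = [CountingKey(i) for i in range(n)]
as_list = keys
as_dict = {key: True for key in keys}

# Probe with an equal but distinct object so identity shortcuts do not apply.
list_found, list_comparisons = count_comparisons(as_list, CountingKey(n - 1))
dict_found, dict_comparisons = count_comparisons(as_dict, CountingKey(n - 1))
```

Searching for the last element forces the list to compare against all 1000 entries, while the dictionary needs only a handful of comparisons at most, regardless of size.
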

Difficulty Radar (chart not reproduced here)

Based on recent AI-sourced data.

Meet Your Interviewers

The "Standard" Interviewer

Senior Engineer

Focuses on core competencies, system constraints, and clear communication.

Unwritten Rules

Think Out Loud

Always explain your thought process before writing code or drawing architecture.
