TechyVia

Unlock SQL Mastery: Advanced Concepts for Experts

Advanced Joins and Subqueries

Advanced joins and subqueries allow you to work efficiently with relational databases by combining data from multiple tables and performing complex queries.

Why Use Advanced Joins and Subqueries?

1. Joining Multiple Tables

Real-world databases often store data across multiple tables. Joins help retrieve meaningful information efficiently.

Example: Joining Three Tables

Retrieve order details including customer name and product name:


SELECT o.order_id, c.customer_name, p.product_name, o.order_date
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id;

2. Nested Subqueries

Subquery in the WHERE Clause

Find all customers who have placed an order:


SELECT customer_name 
FROM customers 
WHERE customer_id IN (
SELECT customer_id FROM orders
);

Subquery in the SELECT Clause

Retrieve each customer’s total number of orders:


SELECT customer_name, (
SELECT COUNT(*) FROM orders o 
WHERE o.customer_id = c.customer_id
) AS total_orders 
FROM customers c;

Subquery in the FROM Clause

Calculate the average order amount per customer:


SELECT customer_id, AVG(order_amount) AS avg_order_amount 
FROM (
SELECT customer_id, order_amount FROM orders
) AS customer_orders 
GROUP BY customer_id;

3. Using Subqueries in SELECT, FROM, and WHERE Clauses

Subquery in the WHERE Clause

Find customers who have placed more than 5 orders:


SELECT customer_name 
FROM customers 
WHERE customer_id IN (
SELECT customer_id FROM orders 
GROUP BY customer_id 
HAVING COUNT(*) > 5
);

Common Mistakes to Avoid

What’s Next?

Now that you understand advanced joins and subqueries, the next step is exploring Window Functions, which enable calculations across related rows.

Interactive Practice

Try these challenges in your SQL editor:

Window Functions: Advanced Data Analysis

Window functions are powerful SQL tools that allow calculations across a set of rows while keeping individual rows intact. Unlike aggregate functions, which collapse data, window functions maintain row-level details while providing insights.

Why Use Window Functions?

Key Concepts

1. ROW_NUMBER()

Assigns a unique number to each row within a partition.


SELECT 
author_id, 
title, 
ROW_NUMBER() OVER (PARTITION BY author_id ORDER BY year_published) AS row_num
FROM books;

2. RANK() and DENSE_RANK()

Ranks rows based on a specified order.


SELECT 
title, 
price, 
RANK() OVER (ORDER BY price DESC) AS rank,
DENSE_RANK() OVER (ORDER BY price DESC) AS dense_rank
FROM books;

3. LEAD() and LAG()

Access data from the next or previous row.

Example: Comparing a Book’s Price to the Next Book


SELECT 
title, 
price, 
LEAD(price) OVER (ORDER BY price) AS next_price
FROM books;

4. SUM() and AVG() Over a Window

Running Total of Book Prices


SELECT 
title, 
price, 
SUM(price) OVER (ORDER BY year_published) AS running_total
FROM books;

Moving Average of Book Prices


SELECT 
title, 
year_published, 
price, 
AVG(price) OVER (ORDER BY year_published ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM books;

Common Mistakes to Avoid

What’s Next?

Next, explore Common Table Expressions (CTEs) to simplify complex queries.

Interactive Practice

Try these challenges in your SQL editor:

Common Table Expressions (CTEs): Simplifying Complex Queries

Common Table Expressions (CTEs) allow you to define temporary result sets within a query. They help in breaking down complex queries into smaller, more manageable parts.

Why Use CTEs?

1. Basic Syntax


WITH cte_name AS (
SELECT ...
)
SELECT ...
FROM cte_name;

2. Simplifying a Query

Find authors with an average book price above $15.

Without CTE


SELECT author_id, AVG(price) AS avg_price
FROM books
GROUP BY author_id
HAVING AVG(price) > 15;

With CTE


WITH author_avg_price AS (
SELECT author_id, AVG(price) AS avg_price
FROM books
GROUP BY author_id
)
SELECT author_id, avg_price
FROM author_avg_price
WHERE avg_price > 15;

3. Reusing a CTE

Calculate total sales per author and percentage contribution of each book.


WITH total_sales AS (
SELECT author_id, SUM(price) AS total_price
FROM books
GROUP BY author_id
)
SELECT 
b.title, 
b.price, 
b.price / total_sales.total_price * 100 AS sales_percentage
FROM books b
JOIN total_sales
ON b.author_id = total_sales.author_id;

4. Recursive CTEs

Recursive CTEs allow handling hierarchical data, like employee hierarchies.

Example: Organizational Hierarchy


WITH RECURSIVE org_chart AS (
SELECT employee_id, employee_name, manager_id
FROM employees
WHERE employee_id = 1 -- Manager's ID

UNION ALL

SELECT e.employee_id, e.employee_name, e.manager_id
FROM employees e
INNER JOIN org_chart oc
ON e.manager_id = oc.employee_id
)
SELECT * FROM org_chart;

5. Why CTEs Are Important for Data Engineers

Common Mistakes to Avoid

What’s Next?

Next, explore Indexes and Query Optimization to improve performance on large datasets.

Interactive Practice

Try these challenges in your SQL editor:

Transactions and Concurrency Control: Keeping Your Data Safe and Consistent

A transaction is a sequence of database operations that are treated as a single unit of work. Transactions ensure data integrity, even in cases of system failures or concurrent user access.

Why Are Transactions Important?

1. How Do Transactions Work?

Step 1: Start a Transaction

START TRANSACTION;

Step 2: Perform Operations

UPDATE accounts SET balance = balance - 100 WHERE account_id = 1; -- Deduct from savings
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2; -- Add to checking

Step 3: Commit or Rollback

If successful, commit the transaction:

COMMIT;

If an error occurs, rollback the changes:

ROLLBACK;

2. Concurrency Control: Handling Multiple Users

When multiple users access the database simultaneously, transactions must be managed to avoid conflicts.

Isolation Levels

Example: Setting the Isolation Level

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

START TRANSACTION;

SELECT * FROM orders;

COMMIT;

3. Example: Using Transactions in SQL

Placing an Order with Inventory Update


START TRANSACTION;

-- Step 1: Insert the order
INSERT INTO orders (order_id, product_id, quantity)
VALUES (101, 1, 2);

-- Step 2: Update inventory
UPDATE inventory SET stock = stock - 2 WHERE product_id = 1;

COMMIT;

Simulating a Rollback


START TRANSACTION;

INSERT INTO orders (order_id, product_id, quantity)
VALUES (103, 3, 5);

-- Cause an error (invalid product_id)
UPDATE inventory SET stock = stock - 5 WHERE product_id = 999;

ROLLBACK;

Common Mistakes to Avoid

What’s Next?

Next, explore Stored Procedures and Functions to encapsulate complex logic.

Interactive Practice

Try these challenges in your SQL editor:

Stored Procedures and Functions

Stored procedures and user-defined functions (UDFs) are essential for modularizing SQL code, improving performance, and enhancing security.

What Are Stored Procedures and Functions?

1. Creating Stored Procedures

Example: Retrieve All Customers


CREATE PROCEDURE GetAllCustomers()
BEGIN
SELECT * FROM customers;
END;

Calling the Procedure

CALL GetAllCustomers();

2. Parameters in Stored Procedures

Example: Retrieve Customers by City


CREATE PROCEDURE GetCustomersByCity(IN city_name VARCHAR(100))
BEGIN
SELECT * FROM customers WHERE city = city_name;
END;

Calling the Procedure

CALL GetCustomersByCity('New York');

3. Creating User-Defined Functions (UDFs)

Example: Calculate Total Order Price


CREATE FUNCTION CalculateOrderTotal(order_id INT)
RETURNS DECIMAL(10, 2)
BEGIN
DECLARE total DECIMAL(10, 2);
SELECT SUM(price * quantity) INTO total
FROM order_items
WHERE order_id = order_id;
RETURN total;
END;

Using the Function


SELECT order_id, CalculateOrderTotal(order_id) AS total_price
FROM orders;

4. Differences Between Procedures and Functions

Feature Stored Procedures User-Defined Functions (UDFs)
Return Value Can return multiple values or none. Must return a single value or a table.
Usage Called using CALL. Used in SQL statements like SELECT.
Parameters Can have IN, OUT, and INOUT parameters. Only supports IN parameters.
SQL Statements Can execute multiple SQL statements. Typically used for calculations.
Performance Better for complex operations. Better for reusable calculations.

Common Mistakes to Avoid

What’s Next?

Next, explore Triggers, which automatically execute in response to INSERT, UPDATE, or DELETE events.

Interactive Practice

Try these challenges in your SQL editor:

Triggers: Automating Database Actions

A trigger is a special type of stored procedure that automatically executes in response to certain events on a table, such as INSERT, UPDATE, or DELETE. Triggers help enforce business rules, maintain data integrity, and automate actions.

Why Use Triggers?

1. Creating and Dropping Triggers

Example: Log New Orders


CREATE TRIGGER log_new_order
AFTER INSERT
ON orders
FOR EACH ROW
BEGIN
INSERT INTO order_audit (order_id, action, action_time)
VALUES (NEW.order_id, 'INSERT', NOW());
END;

Example: Prevent Cancelling Shipped Orders


CREATE TRIGGER prevent_cancellation
BEFORE UPDATE
ON orders
FOR EACH ROW
BEGIN
IF OLD.order_status = 'Shipped' AND NEW.order_status = 'Cancelled' THEN
SIGNAL SQLSTATE '45000' 
SET MESSAGE_TEXT = 'Cannot cancel a shipped order';
END IF;
END;

Dropping a Trigger


DROP TRIGGER log_new_order;

2. BEFORE and AFTER Triggers

BEFORE Triggers

Executed before the triggering event, useful for validation.

Example: Validate Order Date

CREATE TRIGGER validate_order_date
BEFORE INSERT
ON orders
FOR EACH ROW
BEGIN
IF NEW.order_date < CURDATE() THEN
SIGNAL SQLSTATE '45000' 
SET MESSAGE_TEXT = 'Order date cannot be in the past';
END IF;
END;

AFTER Triggers

Executed after the triggering event, useful for logging and updates.

Example: Update Inventory After an Order

CREATE TRIGGER update_inventory
AFTER INSERT
ON order_items
FOR EACH ROW
BEGIN
UPDATE inventory
SET stock = stock - NEW.quantity
WHERE product_id = NEW.product_id;
END;

3. Use Cases for Triggers

1. Auditing Changes

Track updates to sensitive data.


CREATE TRIGGER log_employee_updates
AFTER UPDATE
ON employees
FOR EACH ROW
BEGIN
INSERT INTO employee_audit (employee_id, old_salary, new_salary, action_time)
VALUES (OLD.employee_id, OLD.salary, NEW.salary, NOW());
END;

2. Enforcing Business Rules

Prevent deletion of customers with active orders.


CREATE TRIGGER prevent_customer_deletion
BEFORE DELETE
ON customers
FOR EACH ROW
BEGIN
IF (SELECT COUNT(*) FROM orders WHERE customer_id = OLD.customer_id) > 0 THEN
SIGNAL SQLSTATE '45000' 
SET MESSAGE_TEXT = 'Cannot delete customer with active orders';
END IF;
END;

3. Maintaining Data Integrity

Automatically update customer sales records.


CREATE TRIGGER update_total_sales
AFTER INSERT
ON orders
FOR EACH ROW
BEGIN
UPDATE customers
SET total_sales = total_sales + NEW.order_amount
WHERE customer_id = NEW.customer_id;
END;

Common Mistakes to Avoid

What’s Next?

Next, explore Transactions and Concurrency Control to manage data consistency in multi-user environments.

Interactive Practice

Try these challenges in your SQL editor:

Performance Optimization: Making Your Queries Faster

As your database grows, queries can slow down, affecting performance and user experience. Optimizing queries ensures fast and efficient data retrieval. Key focus areas include:

1. Query Execution Plans

What Is a Query Execution Plan?

A query execution plan is a roadmap that shows how a database retrieves data, including:

Viewing the Execution Plan

EXPLAIN SELECT * FROM customers WHERE city = 'New York';

The output includes:

2. Index Optimization

What Are Indexes?

Indexes help the database locate rows quickly, improving query performance.

Types of Indexes

Example: Creating an Index


CREATE INDEX idx_city ON customers (city);
SELECT * FROM customers WHERE city = 'New York';

When to Use Indexes

When NOT to Use Indexes

3. Avoiding Full Table Scans

What Is a Full Table Scan?

A full table scan occurs when the database reads all rows to find the required data, slowing down queries.

How to Avoid Full Table Scans

Example: Using an Index Instead of a Full Table Scan


-- Full Table Scan (Slow)
SELECT * FROM customers WHERE city = 'New York';

-- Index Scan (Optimized)
CREATE INDEX idx_city ON customers (city);
SELECT * FROM customers WHERE city = 'New York';

4. Using EXPLAIN for Query Analysis

What Is EXPLAIN?

The EXPLAIN command shows how the database executes a query, helping to identify slow operations.

Example: Analyzing a Query


EXPLAIN SELECT c.customer_name, o.order_date
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE c.city = 'New York';

Key EXPLAIN Output

EXPLAIN ANALYZE (PostgreSQL)

PostgreSQL allows you to see both the execution plan and runtime performance using:

EXPLAIN ANALYZE SELECT * FROM customers WHERE city = 'New York';

Common Mistakes to Avoid

What’s Next?

Next, explore Transactions and Concurrency Control to manage multiple user operations and data integrity.

Interactive Practice

Try these challenges in your SQL editor:

Backup and Recovery: Protecting Your Data

Backup and recovery ensure that your database is protected from data loss due to accidental deletions, hardware failures, or cyberattacks. A solid backup strategy allows for quick restoration and disaster recovery.

Why Are Backup and Recovery Important?

1. Database Backup Strategies

Types of Backups

Full Backup Example (MySQL)

mysqldump -u username -p database_name > backup_file.sql

Incremental Backup (PostgreSQL)

Enable Write-Ahead Logging (WAL) to track database changes.


wal_level = replica
archive_mode = on
archive_command = 'cp %p /path/to/wal_archive/%f'

Perform a full backup:

pg_basebackup -U username -D /path/to/backup -Ft -z

Differential Backup (SQL Server)


BACKUP DATABASE database_name
TO DISK = '/path/to/backup.bak'
WITH DIFFERENTIAL;

2. Point-in-Time Recovery (PITR)

What Is PITR?

PITR allows restoring a database to a specific moment in time, helping recover from accidental deletions or corruption.

PITR in PostgreSQL

pg_restore -U username -d database_name /path/to/backup.tar

Apply WAL logs to restore a specific time:

pg_wal_restore -U username -d database_name /path/to/wal_archive --target-time '2023-10-01 12:00:00'

PITR in MySQL

Enable binary logging in my.cnf:

log_bin = /var/log/mysql/mysql-bin.log

Restore full backup:

mysql -u username -p database_name < backup_file.sql

Apply binary logs to restore a specific time:

mysqlbinlog --stop-datetime="2023-10-01 12:00:00" /var/log/mysql/mysql-bin.log | mysql -u username -p

3. Exporting and Importing Data

Exporting Data

Export data for backup or migration.

MySQL Export to CSV


SELECT * INTO OUTFILE '/path/to/customers.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM customers;

PostgreSQL Export to CSV

COPY customers TO '/path/to/customers.csv' WITH CSV HEADER;

Importing Data

Load data back into the database.

MySQL Import from CSV


LOAD DATA INFILE '/path/to/customers.csv'
INTO TABLE customers
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

PostgreSQL Import from CSV

COPY customers FROM '/path/to/customers.csv' WITH CSV HEADER;

Common Mistakes to Avoid

What’s Next?

Next, explore Database Monitoring and Maintenance to keep your database running efficiently over time.

Interactive Practice

Try these challenges in your SQL editor:

Advanced Database Design: Building Efficient Databases

Why Is Database Design Important?

A well-designed database is crucial for performance, data integrity, and scalability. It helps:

1. Normalization

What Is Normalization?

Normalization is the process of organizing data to reduce redundancy and improve integrity. It involves splitting tables and defining relationships between them.

Levels of Normalization

First Normal Form (1NF):

Ensures each column contains atomic values and each row is unique.

Before 1NF:
OrderID Products
1 Apple, Banana
2 Orange, Mango
After 1NF:
OrderID Product
1 Apple
1 Banana
2 Orange
Second Normal Form (2NF):

Ensures all non-key columns depend on the entire primary key.

Before 2NF:
OrderID Product Price
1 Apple 1.00
1 Banana 0.50
After 2NF:
Orders Table:
OrderID Product
1 Apple
1 Banana
Products Table:
Product Price
Apple 1.00
Banana 0.50

Third Normal Form (3NF):

Ensures non-key columns are independent of each other.

Before 3NF:
OrderID Customer CustomerPhone
1 John 123-456-7890
After 3NF:
Orders Table:
OrderID CustomerID
1 1
Customers Table:
CustomerID Customer CustomerPhone
1 John 123-456-7890

2. Denormalization

What Is Denormalization?

Denormalization introduces redundancy to improve read performance. It’s useful in:

3. Entity-Relationship Diagrams (ERDs)

An ERD visually represents database relationships. It shows:

4. Schema Design Best Practices

What’s Next?

Now that you understand database design, let’s move to Database Monitoring & Maintenance to keep systems running smoothly.

Working with Big Data: Scaling Your Database

Why Is Big Data Important?

As datasets grow, traditional databases can struggle to handle the volume, velocity, and variety of data. Big data techniques help you:

1. Partitioning Tables

What Is Partitioning?

Partitioning splits a large table into smaller, more manageable pieces called partitions. Each partition stores a subset of the data based on a specific criterion (e.g., date, region).

Types of Partitioning

Example: Range Partitioning

Partition a sales table by year:


CREATE TABLE sales (
sale_id INT,
sale_date DATE,
amount DECIMAL(10, 2)
)
PARTITION BY RANGE (YEAR(sale_date)) (
PARTITION p0 VALUES LESS THAN (2020),
PARTITION p1 VALUES LESS THAN (2021),
PARTITION p2 VALUES LESS THAN (2022),
PARTITION p3 VALUES LESS THAN (2023)
);

Example: List Partitioning

Partition a customers table by region:


CREATE TABLE customers (
customer_id INT,
name VARCHAR(100),
region VARCHAR(50)
)
PARTITION BY LIST (region) (
PARTITION p_north VALUES IN ('NY', 'NJ', 'PA'),
PARTITION p_south VALUES IN ('TX', 'FL', 'GA'),
PARTITION p_west VALUES IN ('CA', 'WA', 'OR')
);

2. Sharding

What Is Sharding?

Sharding splits a database into smaller, independent pieces called shards. Each shard is stored on a separate server, allowing distributed load balancing.

Example: Sharding by User ID

Shard a users table across 3 servers based on user_id:

3. Clustering

What Is Clustering?

Clustering groups related data together on disk to improve query performance. It’s often used with partitioning to optimize retrieval.

Example: Clustered Index

Create a clustered index on the orders table:


CREATE CLUSTERED INDEX idx_order_date
ON orders (order_date);

4. Using SQL with Big Data Tools

Apache Hive

Apache Hive is a data warehouse tool that allows you to query large datasets using SQL-like syntax.


CREATE TABLE sales (
sale_id INT,
sale_date STRING,
amount DOUBLE
)
STORED AS ORC;

SELECT * FROM sales WHERE sale_date = '2023-10-01';

Spark SQL

Spark SQL is a module in Apache Spark for querying structured data.


from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("example").getOrCreate()

# Load data
df = spark.read.csv("sales.csv", header=True, inferSchema=True)

# Create a temporary view
df.createOrReplaceTempView("sales")

# Run a query
result = spark.sql("SELECT * FROM sales WHERE amount > 100")
result.show()

Common Mistakes to Avoid

What’s Next?

Now that you know how to work with big data, let’s explore Database Monitoring and Maintenance to ensure smooth performance.

Interactive Practice

Try these challenges in your SQL environment:

Partition a logs table by log_date (range partitioning):


CREATE TABLE logs (
log_id INT,
log_date DATE,
message TEXT
)
PARTITION BY RANGE (log_date) (
PARTITION p0 VALUES LESS THAN ('2023-01-01'),
PARTITION p1 VALUES LESS THAN ('2023-07-01'),
PARTITION p2 VALUES LESS THAN ('2024-01-01')
);

Shard a users table across 4 servers based on user_id:


-- Shard 1: user_id % 4 = 0
-- Shard 2: user_id % 4 = 1
-- Shard 3: user_id % 4 = 2
-- Shard 4: user_id % 4 = 3

Create a clustered index on the orders table:


CREATE CLUSTERED INDEX idx_order_date
ON orders (order_date);

Key Takeaways

SQL for Data Analysis: Unlocking Insights

Why Use SQL for Data Analysis?

SQL is a powerful tool for analyzing large datasets because:

1. Pivoting Data

What Is Pivoting?

Pivoting transforms rows into columns, making it easier to analyze tabular data. This is useful for creating summary tables and reports.

Example: Pivoting Sales Data

Suppose you have a sales table:

sale_id product month amount
1 Apple Jan 100
2 Banana Jan 150
3 Apple Feb 200
4 Banana Feb 250
Pivoted View:
product Jan Feb
Apple 100 200
Banana 150 250
Using CASE Statements for Pivoting

SELECT product,
SUM(CASE WHEN month = 'Jan' THEN amount ELSE 0 END) AS Jan,
SUM(CASE WHEN month = 'Feb' THEN amount ELSE 0 END) AS Feb
FROM sales
GROUP BY product;
Using PIVOT (SQL Server)

SELECT product, [Jan], [Feb]
FROM sales
PIVOT (
SUM(amount)
FOR month IN ([Jan], [Feb])
) AS pvt;

2. Time Series Analysis

What Is Time Series Analysis?

Time series analysis involves analyzing data points collected over time to identify trends and patterns.

Example: Monthly Sales Trends

SELECT DATE_TRUNC('month', sale_date) AS month,
SUM(amount) AS total_sales
FROM sales
GROUP BY DATE_TRUNC('month', sale_date)
ORDER BY month;
Example: Rolling Averages

Calculate a 3-month rolling average of sales:


SELECT month,
AVG(total_sales) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS rolling_avg
FROM (
SELECT DATE_TRUNC('month', sale_date) AS month,
    SUM(amount) AS total_sales
FROM sales
GROUP BY DATE_TRUNC('month', sale_date)
) AS monthly_sales;

3. Advanced Aggregations

What Are Advanced Aggregations?

Advanced aggregations go beyond simple SUM and AVG to provide deeper insights.

Example: Window Functions

Calculate cumulative sales for each product:


SELECT product,
sale_date,
amount,
SUM(amount) OVER (PARTITION BY product ORDER BY sale_date) AS cumulative_sales
FROM sales;
Example: Grouping Sets

Aggregate sales by product and month:


SELECT product,
DATE_TRUNC('month', sale_date) AS month,
SUM(amount) AS total_sales
FROM sales
GROUP BY GROUPING SETS ((product), (month), (product, month));
Example: CUBE

Generate subtotals and grand totals for sales by product and month:


SELECT product,
DATE_TRUNC('month', sale_date) AS month,
SUM(amount) AS total_sales
FROM sales
GROUP BY CUBE (product, month);

Common Mistakes to Avoid

What’s Next?

Now that you know how to use SQL for data analysis, let’s explore Database Monitoring and Maintenance to ensure your database runs efficiently over time.

Interactive Practice

Try these challenges in your SQL environment:

Pivot the following sales table to show sales by product and month:

SELECT product,
SUM(CASE WHEN month = 'Jan' THEN amount ELSE 0 END) AS Jan,
SUM(CASE WHEN month = 'Feb' THEN amount ELSE 0 END) AS Feb
FROM sales
GROUP BY product;
Calculate the cumulative sales for each product:

SELECT product,
sale_date,
amount,
SUM(amount) OVER (PARTITION BY product ORDER BY sale_date) AS cumulative_sales
FROM sales;

Key Takeaways

Common SQL Interview Questions

1. Basic SQL Questions

These questions test your fundamental SQL knowledge, including query writing and key concepts.

1.1 What is the Difference Between WHERE and HAVING?


-- WHERE: Filter rows before grouping
SELECT department, AVG(salary)
FROM employees
WHERE salary > 50000
GROUP BY department;

-- HAVING: Filter groups after grouping
SELECT department, AVG(salary)
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000;

1.2 What is a JOIN? Explain the Types of Joins.

A JOIN combines rows from two or more tables based on a related column.


-- INNER JOIN Example
SELECT employees.name, departments.department_name
FROM employees
INNER JOIN departments ON employees.department_id = departments.department_id;

1.3 What is the Difference Between DELETE and TRUNCATE?


-- DELETE Example
DELETE FROM employees WHERE department_id = 10;

-- TRUNCATE Example
TRUNCATE TABLE employees;

2. Tricky SQL Problems

These questions test your ability to solve complex problems using subqueries, window functions, and joins.

2.1 Find the Second Highest Salary

Write a query to find the second highest salary in the employees table.


SELECT MAX(salary) AS second_highest_salary
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

2.2 Find Duplicate Emails

Write a query to find duplicate emails in the users table.


SELECT email, COUNT(*)
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

2.3 Rank Employees by Salary

Write a query to rank employees by salary within each department.


SELECT name, department_id, salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank
FROM employees;

3. Real-World SQL Scenarios

These questions test how well you can apply SQL to practical business situations.

3.1 Analyze Sales Trends

Calculate monthly sales trends from a sales table.


SELECT DATE_TRUNC('month', sale_date) AS month,
SUM(amount) AS total_sales
FROM sales
GROUP BY DATE_TRUNC('month', sale_date)
ORDER BY month;

3.2 Identify Inactive Users

Find users who haven’t logged in for more than 30 days.


SELECT user_id
FROM users
WHERE last_login_date < NOW() - INTERVAL '30 days';

3.3 Calculate Customer Lifetime Value (CLV)

Calculate the total amount spent by each customer.


SELECT customer_id, SUM(amount) AS lifetime_value
FROM orders
GROUP BY customer_id;

Common Mistakes to Avoid

What’s Next?

Now that you’ve practiced SQL interview questions, let’s explore Database Monitoring and Maintenance to ensure optimal database performance.

Interactive Practice

Try these SQL challenges in your database:

Find the second highest salary in the employees table:


SELECT MAX(salary) AS second_highest_salary
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

Find duplicate emails in the users table:


SELECT email, COUNT(*)
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

Calculate the monthly sales trend from the sales table:


SELECT DATE_TRUNC('month', sale_date) AS month,
SUM(amount) AS total_sales
FROM sales
GROUP BY DATE_TRUNC('month', sale_date)
ORDER BY month;

Key Takeaways