DBMS Interview Questions and Answers (2026)

Preparing for a DBMS interview? This comprehensive guide covers the most frequently asked Database Management System interview questions and answers for freshers, intermediate, and experienced candidates in 2026.

From core concepts like normalization, ACID properties, and indexing to advanced topics like query optimization, transactions, and concurrency control — this resource is designed to help you ace technical rounds at top tech companies, product startups, and IT firms.

What You'll Learn

Fundamentals: Core DBMS concepts, keys, schemas, and ER diagrams
SQL Mastery: JOINs, subqueries, aggregations, and advanced queries
Normalization: 1NF, 2NF, 3NF, BCNF, and denormalization
Transactions: ACID properties, concurrency control, and deadlocks
Performance: Indexing, query optimization, and tuning
Modern Concepts: NoSQL, CAP theorem, sharding, and replication

Why DBMS is Important for Placements

Having a strong grip on DBMS is one of the keys to cracking technical interviews and securing dream placements in software development, data engineering, and IT roles. Here's why it should be a priority:

High Demand — Most companies from startups to MNCs rely on databases, making DBMS expertise a must-have skill
Core Interview Topic — DBMS questions appear in nearly every backend, full-stack, and data engineering interview
Real-World Application — Knowledge of SQL, normalization, and transactions directly applies to building scalable, production-grade systems
Versatility — DBMS skills are valued across roles: backend development, database administration, data engineering, and analytics
Systems Thinking — Understanding DBMS deepens your grasp of data structures, storage engines, and how applications manage state at scale

Basic DBMS Concepts

1. What is DBMS?

A Database Management System (DBMS) is software that enables users to create, manage, and interact with databases. It acts as an interface between the end user and the database, handling storage, retrieval, data integrity, security, and concurrent access. Examples include MySQL, PostgreSQL, Oracle, and MongoDB.

2. Explain the difference between DBMS and RDBMS.

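DBMS is the general term for any software that manages stored data; it does not imply a particular data model. Early hierarchical, network, and flat-file systems stored records without tables or enforced relationships. An RDBMS (Relational DBMS) is a DBMS that stores data in structured tables with rows and columns and enforces relationships between tables using keys.

An RDBMS is built on Codd's relational model, is queried with SQL, and typically provides ACID transactions. Examples: MySQL, PostgreSQL (RDBMS) vs. early dBASE or simple flat-file managers (non-relational DBMS).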

3. What are the different types of keys in DBMS?

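Primary Key: The candidate key chosen as the main unique identifier for each row; cannot be NULL
Foreign Key: A column (or set of columns) that references a primary or unique key, usually in another table
Candidate Key: A minimal set of columns that uniquely identifies a row; a table may have several, and one is chosen as the primary key
Super Key: Any set of columns that uniquely identifies a row, including sets with extra, non-essential columns
Composite Key: A key made up of two or more columns
Surrogate Key: A system-generated unique identifier with no business meaning (for example, an auto-increment ID)
Unique Key: Enforces uniqueness but, unlike a primary key, can allow NULLs; how many NULLs are permitted varies by database (SQL Server allows one, while PostgreSQL and MySQL allow many)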
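
As a rough illustration of how several of these keys are declared and enforced, here is a minimal sketch using Python's built-in sqlite3 module. The tables and columns (departments, employees, project_members) are hypothetical and not part of any schema in this guide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE departments (
    dept_id INTEGER PRIMARY KEY,          -- surrogate primary key
    name    TEXT NOT NULL UNIQUE          -- unique key (also a candidate key)
);
CREATE TABLE employees (
    emp_id  INTEGER PRIMARY KEY,
    email   TEXT UNIQUE,
    dept_id INTEGER NOT NULL REFERENCES departments(dept_id)  -- foreign key
);
CREATE TABLE project_members (
    project_id INTEGER,
    emp_id     INTEGER REFERENCES employees(emp_id),
    PRIMARY KEY (project_id, emp_id)      -- composite primary key
);
""")

conn.execute("INSERT INTO departments VALUES (1, 'Engineering')")
conn.execute("INSERT INTO employees VALUES (10, 'ana@example.com', 1)")

def violates_constraint(sql):
    """Return True if the statement is rejected by a key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Same primary key value twice, and a foreign key pointing at no department
duplicate_pk = violates_constraint("INSERT INTO departments VALUES (1, 'Sales')")
missing_parent = violates_constraint(
    "INSERT INTO employees VALUES (11, 'bo@example.com', 99)"
)
print(duplicate_pk, missing_parent)
```

Both inserts are rejected: the first reuses an existing primary key value, and the second references a department that does not exist.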

4. What is normalization, and why is it used?

Normalization is the process of organizing a database to reduce data redundancy and improve data integrity. It involves decomposing tables into smaller, well-structured ones following a series of normal forms (1NF, 2NF, 3NF, BCNF).

It ensures that each piece of data is stored in only one place, making updates, deletions, and insertions efficient and consistent.

5. What is 1NF, 2NF, and 3NF?

First Normal Form (1NF): Requires that a table has no repeating groups or arrays, each column contains atomic (indivisible) values, and each column holds values of a single type.

Second Normal Form (2NF): Builds on 1NF and additionally requires that every non-key attribute be fully functionally dependent on the entire primary key — not just part of it. Eliminates partial dependencies.

Third Normal Form (3NF): Builds on 2NF and requires that no non-key attribute is transitively dependent on the primary key through another non-key attribute. Non-key columns must depend only on the primary key.
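
To make the transitive-dependency rule concrete, the sketch below (Python with sqlite3, hypothetical table names) starts from a table where dept_name depends on dept_id rather than on the key, then decomposes it so that each department name is stored once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Not in 3NF: dept_name depends on dept_id, not directly on the key emp_id
CREATE TABLE employees_flat (
    emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER, dept_name TEXT
);
INSERT INTO employees_flat VALUES
    (1, 'Ana', 10, 'Engineering'),
    (2, 'Bo',  10, 'Engineering');

-- 3NF decomposition: the department name lives in exactly one row
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employees (
    emp_id INTEGER PRIMARY KEY, name TEXT,
    dept_id INTEGER REFERENCES departments(dept_id)
);
INSERT INTO departments SELECT DISTINCT dept_id, dept_name FROM employees_flat;
INSERT INTO employees SELECT emp_id, name, dept_id FROM employees_flat;
""")

# Renaming the department is now a single-row update
conn.execute("UPDATE departments SET dept_name = 'Platform' WHERE dept_id = 10")

rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employees e JOIN departments d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()
print(rows)
```

In the flat table the same rename would have to touch every employee row in that department, and missing one would leave the data inconsistent.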

6. What is BCNF in DBMS?

Boyce-Codd Normal Form (BCNF) is a stricter version of 3NF. A table is in BCNF if, for every non-trivial functional dependency A → B, A is a super key. BCNF handles anomalies that 3NF misses in cases with multiple overlapping candidate keys.

7. What is the E-R model in DBMS?

The Entity-Relationship (E-R) model is a high-level conceptual data model used to design databases visually. It represents data as entities (objects), attributes (properties of entities), and relationships (associations between entities).

E-R diagrams use rectangles for entities, ellipses for attributes, and diamonds for relationships. An E-R diagram serves as a blueprint before the design is translated into actual database tables.

SQL Queries

8. Explain the difference between DDL, DML, and DCL.

DDL (Data Definition Language) defines and modifies database structure: CREATE, ALTER, DROP, TRUNCATE.

DML (Data Manipulation Language) operates on data within tables: SELECT, INSERT, UPDATE, DELETE. Some references classify SELECT separately as DQL (Data Query Language).

DCL (Data Control Language) manages access permissions: GRANT and REVOKE.

A fourth category, TCL (Transaction Control Language), handles transactions: COMMIT, ROLLBACK, SAVEPOINT.

9. What is the difference between DELETE and TRUNCATE?

Aspect	DELETE	TRUNCATE
Operation	Removes rows one at a time	Removes all rows at once
WHERE clause	Supports selective deletion	Cannot use WHERE
Rollback	Can be rolled back	Depends on the database: transactional in PostgreSQL and SQL Server, implicit commit in MySQL and Oracle
Triggers	Fires DELETE triggers	Does not fire DELETE triggers
Speed	Slower (logs each deleted row)	Much faster (minimal logging)

10. What is the difference between WHERE and HAVING?

WHERE filters rows before any grouping or aggregation occurs — it operates on individual rows and cannot use aggregate functions.

HAVING filters groups after GROUP BY aggregation — it can use aggregate functions like COUNT, SUM, AVG.

SELECT dept, COUNT(*)
FROM employees
WHERE active = 1
GROUP BY dept
HAVING COUNT(*) > 5

WHERE filters active employees first, then HAVING filters departments with more than 5 of them.
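
The same query shape can be tried end to end with Python's sqlite3 module. This is a small sketch with made-up sample rows; the HAVING threshold is lowered from 5 to 1 only so that the tiny dataset returns a result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, active INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ana", "eng", 1), ("Bo", "eng", 1), ("Cy", "eng", 0),
     ("Di", "ops", 1), ("Ed", "ops", 0)],
)

# WHERE drops inactive rows first; HAVING then drops groups that are too small
result = conn.execute("""
    SELECT dept, COUNT(*)
    FROM employees
    WHERE active = 1
    GROUP BY dept
    HAVING COUNT(*) > 1
""").fetchall()
print(result)
```

Only eng survives: it has two active employees, while ops has one active employee and is filtered out by HAVING.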

11. What is a JOIN? Explain different types of JOINs.

A JOIN combines rows from two or more tables based on a related column:

INNER JOIN — Returns only matching rows from both tables
LEFT JOIN — Returns all rows from left table plus matching rows from right (NULLs for non-matches)
RIGHT JOIN — Mirror of LEFT JOIN
FULL OUTER JOIN — Returns all rows from both tables, with NULLs where no match exists
CROSS JOIN — Returns Cartesian product of both tables
SELF JOIN — Joins a table with itself
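
A short sketch of how INNER, LEFT, and CROSS JOIN differ on the same data, using Python's sqlite3 module. The tables and rows are hypothetical; the employee Bo deliberately has no department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
INSERT INTO departments VALUES (1, 'eng'), (2, 'ops');
INSERT INTO employees VALUES (10, 'Ana', 1), (11, 'Bo', NULL);
""")

# INNER JOIN keeps only employees that have a matching department
inner = conn.execute("""
    SELECT e.name, d.name FROM employees e
    INNER JOIN departments d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()

# LEFT JOIN keeps every employee and fills the department with NULL (None)
left = conn.execute("""
    SELECT e.name, d.name FROM employees e
    LEFT JOIN departments d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()

# CROSS JOIN pairs every employee with every department: 2 x 2 rows
cross_count = conn.execute(
    "SELECT COUNT(*) FROM employees CROSS JOIN departments"
).fetchone()[0]

print(inner, left, cross_count)
```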

12. What is a view in SQL?

A view is a virtual table based on the result of a SQL query. It does not store data itself but presents data from one or more underlying tables.

CREATE VIEW active_employees AS
SELECT id, name, dept
FROM employees
WHERE active = 1;

-- Query it like a regular table
SELECT * FROM active_employees;

Views simplify complex queries, provide security by exposing only specific columns/rows, and can preserve backward compatibility when the underlying tables are restructured.
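
Because a view stores no data of its own, changes to the underlying table show up the next time the view is queried. The sketch below demonstrates this with Python's sqlite3 module and two made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, active INTEGER);
INSERT INTO employees VALUES (1, 'Ana', 'eng', 1), (2, 'Bo', 'ops', 0);

CREATE VIEW active_employees AS
SELECT id, name, dept
FROM employees
WHERE active = 1;
""")

before = conn.execute("SELECT name FROM active_employees ORDER BY id").fetchall()

# Update the base table; the view definition is unchanged
conn.execute("UPDATE employees SET active = 1 WHERE id = 2")

after = conn.execute("SELECT name FROM active_employees ORDER BY id").fetchall()
print(before, after)
```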

13. What is an index?

An index is a data structure that speeds up data retrieval at the cost of extra storage and slower writes. Types include:

Clustered Index — Physically reorders table rows to match the index (one per table)
Non-Clustered Index — Stores separate structure pointing to row locations (multiple allowed)
Unique Index — Enforces uniqueness
Composite Index — Covers multiple columns
Full-Text Index — Optimizes text searches
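
One way to see an index at work is to compare the query plan before and after creating it. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the users table and index name are hypothetical, and the exact plan wording varies between SQLite versions and other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")

def query_plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

lookup = "SELECT name FROM users WHERE email = 'ana@example.com'"

plan_before = query_plan(lookup)   # full table scan: no index on email yet
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup)    # index search on idx_users_email

print(plan_before)
print(plan_after)
```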

Transactions and Concurrency

14. What is a transaction?

A transaction is a logical unit of work comprising one or more SQL operations that must all succeed or all fail together. If any operation within the transaction fails, the entire transaction is rolled back to its initial state.

Example: Transferring money between bank accounts requires debiting one account and crediting another — both must succeed or neither should.
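
The bank transfer can be sketched with Python's sqlite3 module, where using the connection as a context manager commits on success and rolls back if an exception escapes. The accounts table, the CHECK constraint, and the transfer helper are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move money between accounts; True if committed, False if rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()

ok = transfer(1, 2, 30)        # both updates succeed and are committed
failed = transfer(1, 2, 500)   # credit is applied, debit violates CHECK, both undone
print(ok, failed, balances())
```

In the failed transfer the credit to account 2 had already been applied when the debit was rejected, yet the rollback leaves both balances exactly as they were after the first transfer.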

15. Explain ACID properties.

ACID guarantees reliable transaction processing:

Atomicity — A transaction is all-or-nothing; either all operations complete or none do
Consistency — A transaction brings the database from one valid state to another, respecting all integrity constraints
Isolation — Concurrent transactions execute as if they were sequential; intermediate states are not visible to others
Durability — Once committed, a transaction's changes are permanent even in the event of a system crash

16. What is a deadlock?

A deadlock occurs when two or more transactions are each waiting for a resource held by another, creating a circular dependency where none can proceed.

Example: Transaction A holds a lock on Table 1 and waits for Table 2; Transaction B holds a lock on Table 2 and waits for Table 1. The DBMS detects this cycle, selects one transaction as a victim, rolls it back, and allows the others to proceed.
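
Cycle detection of this kind is commonly described in terms of a wait-for graph, where an edge from A to B means A is waiting for a lock that B holds. The following is a simplified sketch of the idea in Python, not the algorithm of any particular DBMS; find_deadlock and its input format are hypothetical.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a cycle in the wait-for graph, or None.

    waits_for maps each transaction to the transactions whose locks it awaits.
    """
    visiting, done = [], set()

    def visit(txn):
        if txn in done:
            return None
        if txn in visiting:                       # back edge: a cycle exists
            return visiting[visiting.index(txn):]
        visiting.append(txn)
        for blocker in waits_for.get(txn, ()):
            cycle = visit(blocker)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in waits_for:
        cycle = visit(txn)
        if cycle:
            return cycle
    return None

# A waits for B (which holds Table 2); B waits for A (which holds Table 1)
deadlocked = find_deadlock({"A": ["B"], "B": ["A"]})
no_deadlock = find_deadlock({"A": ["B"], "B": []})
print(deadlocked, no_deadlock)
```

A real system would then pick one transaction from the returned cycle as the victim, often the one that is cheapest to roll back.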

17. What is concurrency control?

Concurrency control is the mechanism by which a DBMS manages simultaneous access to the database by multiple transactions while ensuring data consistency. Without it, problems like dirty reads, lost updates, and phantom reads can occur.

Techniques include locking protocols (shared/exclusive locks), timestamp ordering, and multiversion concurrency control (MVCC).

18. What is a dirty read in DBMS?

A dirty read occurs when a transaction reads data that has been modified by another transaction that has not yet committed. If the other transaction later rolls back, the first transaction has read data that never officially existed.

This violates isolation and is prevented by using the READ COMMITTED isolation level or higher.
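
The sketch below shows the absence of a dirty read, using two separate connections to a temporary SQLite file through Python's sqlite3 module. It relies on SQLite's default behavior, in which one connection never sees another connection's uncommitted changes; other databases expose this through configurable isolation levels. The file name and table are made up for the example.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bank.db")

writer = sqlite3.connect(path)
writer.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
writer.execute("INSERT INTO accounts VALUES (1, 100)")
writer.commit()

reader = sqlite3.connect(path)

# The writer changes the balance inside a transaction but does not commit
writer.execute("UPDATE accounts SET balance = 0 WHERE id = 1")

# The reader still sees the last committed value, not the pending 0
seen_during = reader.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]

writer.rollback()  # the uncommitted change is discarded

seen_after = reader.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]

writer.close()
reader.close()
print(seen_during, seen_after)
```

Had the reader been allowed to see the value 0, it would have acted on a balance that was never committed.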

Advanced DBMS Concepts

19. What is database partitioning?

Database partitioning divides a large table into smaller, more manageable pieces called partitions, while still appearing as a single logical table to queries.

Types include: Range partitioning (by date ranges), List partitioning (by specific values), Hash partitioning (by hash of a key), and Composite partitioning (combination). Partitioning improves query performance and simplifies maintenance.

20. What is database sharding?

Sharding is a horizontal scaling technique where data is distributed across multiple separate database instances (shards), each holding a subset of the data. Unlike partitioning (which stays within one database), sharding splits data across different servers.

A shard key determines which shard stores a given record. Sharding enables massive scale but adds complexity in cross-shard queries and distributed transactions.
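
As a minimal sketch of shard-key routing, the Python snippet below hashes the key and takes the result modulo the number of shards. The shard names and the shard_for helper are hypothetical, and hash-modulo routing is only one possible strategy (range-based and directory-based routing are also common).

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]  # hypothetical shard names

def shard_for(shard_key: str) -> str:
    """Map a shard key to a shard using a stable hash.

    hashlib is used instead of Python's built-in hash(), which is salted
    per process and would route the same key differently after a restart.
    """
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

placement = {user: shard_for(user) for user in ["user:1", "user:2", "user:3"]}
print(placement)
```

A drawback of plain modulo routing is that changing the number of shards remaps most keys, which is why production systems often use consistent hashing or a lookup directory instead.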

21. Explain CAP theorem.

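The CAP theorem states that a distributed database system can guarantee at most two of three properties simultaneously. Because network partitions cannot be ruled out in practice, the real trade-off arises during a partition, when the system must give up either consistency or availability: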

Consistency — Every read receives the most recent write or an error
Availability — Every request receives a response, though it may not be the most recent
Partition Tolerance — The system continues operating despite network failures between nodes

Real-world databases choose between CP (consistent and partition-tolerant, e.g., HBase) and AP (available and partition-tolerant, e.g., Cassandra).

22. What is NoSQL?

NoSQL (Not Only SQL) refers to databases that do not use the traditional relational table model. They are designed for specific use cases requiring high scalability, flexible schemas, or special data models.

Types include:

Document stores — MongoDB (JSON documents)
Key-Value stores — Redis (simple key-value pairs)
Column-family stores — Cassandra (wide-column model)
Graph databases — Neo4j (nodes and edges)

23. Explain the difference between SQL and NoSQL.

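Aspect	SQL	NoSQL
Structure	Relational, fixed schema	Non-relational, flexible schema
Scalability	Primarily vertical; horizontal scaling possible via replication and sharding	Designed for horizontal scaling
Transactions	ACID compliant	Often eventual consistency; some systems (e.g., MongoDB) also offer ACID transactions
Use case	Structured business data	Semi-structured or unstructured, high-velocity data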

24. What is database replication?

Database replication is the process of copying and maintaining database data across multiple servers (replicas) in real time or near real time.

Master-slave replication (also called primary-replica replication) sends all writes to the master and distributes reads across the slaves. Multi-master replication allows writes on multiple nodes. Replication improves read performance, provides high availability (failover), and enables geographic distribution.

25. What is Denormalization?

Denormalization is the deliberate introduction of redundancy into a database schema by merging tables or adding redundant columns to improve read performance.

While normalization reduces redundancy for data integrity, denormalization trades storage and write complexity for faster reads. It is commonly used in data warehouses and high-read systems.

Performance Tuning

26. How do you optimize a SQL query?

Query optimization strategies:

Use indexes on columns in WHERE, JOIN, and ORDER BY clauses
Avoid SELECT * — fetch only needed columns
Rewrite correlated subqueries as JOINs where possible
Consider EXISTS instead of IN for large subqueries (many modern optimizers treat the two the same, so verify with the query plan)
Avoid functions on indexed columns in WHERE clauses
Use EXPLAIN to identify bottlenecks like full table scans
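
The advice about functions on indexed columns can be checked with a query plan. This sketch uses SQLite's EXPLAIN QUERY PLAN via Python's sqlite3 module; the users table is hypothetical, and other databases print plans in a different format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")

def query_plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Comparing the bare column lets the planner search the index
direct = query_plan("SELECT name FROM users WHERE email = 'ana@example.com'")

# Wrapping the column in a function hides it from the index, forcing a scan
wrapped = query_plan("SELECT name FROM users WHERE lower(email) = 'ana@example.com'")

print(direct)
print(wrapped)
```

Where the function is genuinely needed, options include storing the value already normalized or, in databases that support it, indexing the expression itself.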

27. What is the difference between a clustered and a non-clustered index?

A clustered index determines the physical order of data on disk — the table rows are stored sorted by the clustered index key. Each table can have only one clustered index (typically the primary key).

A non-clustered index is a separate structure that stores index entries with pointers to the actual row locations. A table can have many non-clustered indexes.

28. What is database caching?

Database caching stores frequently accessed query results or computed data in fast-access memory (RAM) to avoid repeated expensive database operations.

Types include: Buffer pool (DBMS's internal cache), Query result cache, and Application-level cache (using Redis or Memcached). Caching dramatically reduces database load and latency.
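
Application-level caching is often implemented with the cache-aside pattern: read from the cache, fall back to the database on a miss, and invalidate on write. The sketch below uses a plain Python dictionary as a stand-in for Redis or Memcached; the products table and helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO products VALUES (1, 'Keyboard')")

cache = {}                      # stands in for Redis or Memcached
stats = {"db_queries": 0}       # counts how often the database is actually hit

def get_product_name(product_id):
    """Cache-aside read: check the cache first, fall back to the database."""
    if product_id in cache:
        return cache[product_id]
    stats["db_queries"] += 1
    row = conn.execute(
        "SELECT name FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    name = row[0] if row else None
    cache[product_id] = name
    return name

def rename_product(product_id, new_name):
    """Write to the database, then invalidate the cached entry."""
    conn.execute("UPDATE products SET name = ? WHERE id = ?", (new_name, product_id))
    cache.pop(product_id, None)

first = get_product_name(1)     # cache miss, hits the database
second = get_product_name(1)    # served from the cache
rename_product(1, "Mechanical Keyboard")
third = get_product_name(1)     # miss again after invalidation
print(first, second, third, stats["db_queries"])
```

Three reads cost only two database queries here; the invalidation step is what keeps the cache from serving the stale name.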

Security

29. What is SQL injection?

SQL injection is a critical security vulnerability where an attacker inserts malicious SQL code into user input fields, causing the database to execute unintended commands.

Example: If a login query is "SELECT * FROM users WHERE name = '" + userInput + "'", an attacker can enter ' OR '1'='1 to bypass authentication.

30. How do you prevent SQL injection?

The primary defense is parameterized queries (prepared statements) — never concatenate user input directly into SQL strings.

-- Safe approach
SELECT * FROM users WHERE email = ?
-- email passed as parameter

Additional measures: use an ORM, validate and sanitize all input, apply the principle of least privilege to database accounts, and use stored procedures (provided they do not build dynamic SQL from their arguments).
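
To see the difference in practice, the sketch below runs the attack string from question 29 against both a concatenated query and a parameterized one, using Python's sqlite3 module. The users table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ana", "ana@example.com"), ("bo", "bo@example.com")],
)

user_input = "' OR '1'='1"

# Vulnerable: the input is spliced into the SQL text and changes the query's logic
unsafe_sql = "SELECT * FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the value is passed separately, so it is only ever compared as data
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query returns every user because its WHERE clause becomes always true, while the parameterized query looks for a user literally named ' OR '1'='1 and finds none.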

Data Warehousing

31. What is a data warehouse?

A data warehouse is a centralized repository designed for analytical reporting and business intelligence, storing large volumes of historical data from multiple source systems.

Unlike OLTP databases optimized for fast transactional reads/writes, data warehouses are optimized for complex analytical queries (OLAP) across large datasets. Examples: Amazon Redshift, Google BigQuery, Snowflake.

32. What is OLAP and OLTP?

Aspect	OLTP	OLAP
Purpose	Transactional operations	Analytical queries
Operations	INSERT, UPDATE, DELETE	Complex SELECT queries
Data	Current, detailed	Historical, aggregated
Schema	Highly normalized	Denormalized star schema, or snowflake schema with normalized dimensions

33. What is a star schema?

A star schema is a dimensional data warehouse schema where a central fact table is surrounded by multiple dimension tables, forming a star shape.

The fact table contains measures and foreign keys to each dimension. Dimension tables are denormalized. Example: a sales_fact table connected to customer_dim, product_dim, date_dim, and store_dim.
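
A reduced version of that example can be sketched with Python's sqlite3 module. Only two of the dimensions are included to keep it short, and the columns and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE date_dim    (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE sales_fact (
    product_id INTEGER REFERENCES product_dim(product_id),
    date_id    INTEGER REFERENCES date_dim(date_id),
    amount     INTEGER                      -- the measure
);
INSERT INTO product_dim VALUES (1, 'books'), (2, 'games');
INSERT INTO date_dim VALUES (20250101, 2025), (20260101, 2026);
INSERT INTO sales_fact VALUES (1, 20250101, 10), (1, 20260101, 20), (2, 20260101, 5);
""")

# Typical analytical query: join the fact table to its dimensions, then aggregate
revenue = conn.execute("""
    SELECT d.year, p.category, SUM(f.amount)
    FROM sales_fact f
    JOIN product_dim p ON p.product_id = f.product_id
    JOIN date_dim d    ON d.date_id = f.date_id
    GROUP BY d.year, p.category
    ORDER BY d.year, p.category
""").fetchall()
print(revenue)
```

Every dimension is one join away from the fact table, which is what keeps analytical queries on a star schema simple to write.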

Master DBMS Through Practice!

This guide covers essential database management concepts from fundamentals to advanced topics. The key to success is hands-on SQL practice, understanding the "why" behind normalization and transactions, and building real database-driven applications.

Best of luck with your DBMS interviews! Focus on understanding core concepts deeply, practice SQL queries daily, design database schemas for real-world problems, and stay curious about database internals. Remember that DBMS knowledge is foundational for building scalable applications.