Back to all posts

SQL Interview Q&A

Here are the questions along with their corresponding answers extracted from the provided context: What is SQL? SQL stands for Structured Query Language. I…

Here are the questions along with their corresponding answers extracted from the provided context:

  1. What is SQL?
    SQL stands for Structured Query Language. It is a programming language used for managing and manipulating relational databases.
  2. What is a database?
    A database is an organized collection of data stored and accessed electronically. It provides a way to store, organize, and retrieve large amounts of data efficiently.
  3. What is a primary key?
    A primary key is a column or combination of columns that uniquely identifies each row in a table. It enforces the entity integrity rule in a relational database. The entity integrity rule is a fundamental concept in relational databases that ensures each row in a table is uniquely identifiable. Here's what it means:
    • Uniqueness: The primary key ensures that no two rows in the table have the same value for the primary key column(s). This guarantees that every record is unique.
    • Non-nullability: The primary key cannot contain NULL values because a NULL does not uniquely identify a record. Every row must have a valid, unique value for the primary key column(s).
  4. What is a foreign key?
    A foreign key is a column or combination of columns that establishes a link between data in two tables. It ensures referential integrity by enforcing relationships between tables.
    • If you have an Orders table with a CustomerID foreign key column, and it references the CustomerID primary key in the Customers table, the database will enforce the rule that every CustomerID in the Orders table must exist in the Customers table.
    • You can define rules (e.g., ON DELETE CASCADE or ON UPDATE CASCADE) so that changes in the parent table are automatically reflected in the child table.
  5. What is the difference between a primary key and a unique key?
    • Primary Key
    • Used to uniquely identify each row in a table.
    • Ensures uniqueness.
    • Cannot contain NULL values.
    • Only one primary key is allowed in a table.
    • Automatically creates a clustered index (in most databases).
    • Unique Key
    • Ensures that a column or combination of columns has unique values.
    • Ensures uniqueness.
    • allow null values.
    • A table can have multiple unique keys.
    • Creates a non-clustered index by default (in most databases).
  6. What is normalization?
    Normalization is the process of organizing data in a database to minimize redundancy and dependency. It involves breaking down a table into smaller tables and establishing relationships between them.
  7. What are the different types of normalization?
    The different types of normalization are:
    • First Normal Form (1NF)
    • Second Normal Form (2NF)
    • Third Normal Form (3NF)
    • Boyce-Codd Normal Form (BCNF)
    • Fourth Normal Form (4NF)
    • Fifth Normal Form (5NF) or Project-Join Normal Form (PJNF)
  8. What is a join in SQL?
    A join is an operation used to combine rows from two or more tables based on related columns. It allows you to retrieve data from multiple tables simultaneously.
  9. What is the difference between DELETE and TRUNCATE in SQL?
    The DELETE statement is used to remove specific rows from a table based on a condition. It can be rolled back and generates individual delete operations for each row. TRUNCATE, on the other hand, is used to remove all rows from a table. It cannot be rolled back, and it is faster than DELETE as it deallocates the data pages instead of logging individual row deletions.
  10. What is the difference between UNION and UNION ALL?
    UNION removes duplicate rows from the combined result set, whereas UNION ALL includes all rows, including duplicates.
  11. What is the difference between the HAVING clause and the WHERE clause?
    The WHERE clause is used to filter rows based on a condition before the data is grouped or aggregated. It operates on individual rows. The HAVING clause is used to filter grouped rows based on a condition after the data is grouped or aggregated using the GROUP BY clause.
  12. What is a transaction in SQL?
    A transaction in SQL is a group of operations that are treated as a single unit. Either all the operations in a transaction are successfully executed (committed) or none of them are executed (rolled back). Transactions are useful to ensure data consistency and integrity. Key Features of Transactions:
    • Atomicity: All statements in the transaction succeed or fail as a unit.
    • Consistency: The database remains in a valid state before and after the transaction.
    • Isolation: Transactions are executed independently of one another.
    • Durability: Once committed, changes made by the transaction are permanent.
    • How Transactions Ensure Consistency
      • Without transactions, if the first operation succeeds but the second fails, your database could end up in an inconsistent state. By grouping operations into a transaction, you ensure that the database is either completely updated or not updated at all.
    • Using Transactions in SQL
      • Start the transaction: BEGIN TRANSACTION or START TRANSACTION.
      • Perform operations (e.g., INSERT, UPDATE, DELETE).
      • Commit the transaction: COMMIT.
      • Rollback if there’s an error: ROLLBACK.
  13. What is the difference between a clustered and a non-clustered index?
    A clustered index determines the physical order of data in a table and can be created on only one column. A table can have only one clustered index. A non-clustered index does not affect the physical order of data in a table and can have multiple non-clustered indexes.
  14. What is a deadlock?
    A deadlock occurs when two or more transactions are waiting for each other to release resources, resulting in a circular dependency, causing none of the transactions to proceed.
  15. What is the difference between a database and a schema?
    Difference Between a Database and a Schema
    Database:
    It's the main container for all the data and objects (like tables, views, and procedures).
    Think of it as a big folder where everything is stored.
    Example: A company's database might store all its information.
    Schema:
    It's a sub-container inside the database that organizes and groups objects.
    It helps in categorizing data and defining ownership.
    Example: Within a company's database, you might have a schema for "HR" data and another for "Finance" data .

    Simple Analogy:
    Database: Like a library that holds books.
    Schema: Like sections within the library (e.g., Fiction, Science, History), where each section organizes related books.

    Key Points:
    A database can have multiple schemas.
    Schemas are used to organize data logically and control access permissions.
  16. What is the difference between a temporary table and a table variable?
    A temporary table exists only for the duration of a session or a transaction and can be explicitly dropped. A table variable is a variable that can store a table-like structure in memory and has a limited scope within a batch, stored procedure, or function.
  17. What is the purpose of the GROUP BY clause?
    The GROUP BY clause is used to group rows based on one or more columns in a table, typically used with aggregate functions to perform calculations on grouped data.
  18. What is the difference between CHAR and VARCHAR data types?
    CHAR is a fixed-length string data type, while VARCHAR is a variable-length string data type.
    • Use CHAR when all values are of fixed length (e.g., 5-character codes). Pads with spaces to match the defined length.
    • Use VARCHAR when the values can vary in length to save storage space. No padding; stores exactly what is provided.
  19. What is a stored procedure?
    A stored procedure is a precompiled collection of one or more SQL statements and optional procedural code that is stored in the database and can be executed as a single unit. It allows users to encapsulate complex logic, reuse code, and improve performance by reducing repetitive query processing.
    • Use Cases:
      • Performing CRUD (Create, Read, Update, Delete) operations.
      • Complex business logic involving multiple SQL statements.
      • Generating reports with aggregated data.
      • Batch processing tasks.
    • Advantages:
      • Improved Performance: Reduces query parsing and execution time.
      • Code Reusability: Simplifies development and maintenance.
      • Security: Restricts direct access to tables by granting permissions on procedures.
      • Reduced Network Traffic: Only the procedure call is sent to the database, not the actual SQL statements.
  20. What is a subquery?
    A subquery is a query nested inside another query, used to retrieve data based on the result of an inner query.
  21. What is a view?
    A view is a virtual table based on the result of an SQL statement, allowing users to retrieve and manipulate data as if it were a table.
  22. What is the difference between a cross join and an inner join?
    A cross join (Cartesian product) returns the combination of all rows from two or more tables, while an inner join returns only the matching rows based on a join condition.
  23. What is the purpose of the COMMIT statement?
    The COMMIT statement is used to save changes made in a transaction permanently, ending the transaction and making the changes visible to other users.
  24. What is the purpose of the ROLLBACK statement?
    The ROLLBACK statement is used to undo changes made in a transaction, reverting the database to its previous state before the transaction started.
  25. What is the purpose of the NULL value in SQL?
    NULL represents the absence of a value or unknown value, requiring special handling in SQL queries.
  26. What is a correlated subquery?
    A correlated subquery is a subquery that refers to a column from the outer query and executes once for each row processed by the outer query.
  27. What is the purpose of the DISTINCT keyword?
    The DISTINCT keyword is used to retrieve unique values from a column or combination of columns in a SELECT statement.
  28. What is the difference between the IN and EXISTS operators?
    The IN operator checks for a value within a set of values or the result of a subquery, while the EXISTS operator checks for the existence of rows returned by a subquery.
  29. What is the purpose of the TRIGGER statement?
    The TRIGGER statement is used to associate a set of SQL statements with a specific event in the database, executed automatically when the event occurs.
  30. What is the difference between a unique constraint and a unique index?
    A unique constraint ensures the uniqueness of values in one or more columns, while a unique index enforces uniqueness and also improves query performance.
  31. What is the purpose of the TOP or LIMIT clause?
    The TOP (in SQL Server) or LIMIT (in MySQL) clause is used to limit the number of rows returned by a query, often used with an ORDER BY clause.
  32. What is the difference between the UNION and JOIN operators?
    UNION combines the result sets of two or more SELECT statements vertically, while JOIN combines columns from two or more tables horizontally based on a join condition.
  33. What is a data warehouse?
    A data warehouse is a large, centralized repository that stores and manages data from various sources, designed for efficient reporting, analysis, and business intelligence purposes.
  34. What is the difference between a primary key and a candidate key?
    A primary key is a chosen candidate key that uniquely identifies a row in a table, while a candidate key is a set of one or more columns that could potentially become the primary key.
  35. What is the purpose of the GRANT statement?
    The GRANT statement is used to grant specific permissions or privileges to users or roles in a database.
  36. What is a correlated update?
    A correlated update is an update statement that refers to a column from the same table in a subquery, updating values based on the result of the subquery for each row.
  37. What is the purpose of the CASE statement?
    The CASE statement is used to perform conditional logic in SQL queries, allowing you to return different values based on specified conditions.
  38. What is the purpose of the COALESCE function?
    The COALESCE function returns the first non-null expression from a list of expressions, often used to handle null values effectively.
  39. What is the purpose of the ROW_NUMBER() function?
    The ROW_NUMBER() function assigns a unique incremental number to each row in the result set, commonly used for pagination or ranking purposes.
  40. What is the difference between a natural join and an inner join?
    A natural join is an inner join that matches rows based on columns with the same name in the joined tables, automatically determined by the database.
  41. What is the purpose of the CASCADE DELETE constraint?
    The CASCADE DELETE constraint is used to automatically delete related rows in child tables when a row in the parent table is deleted.
  42. What is the purpose of the ALL and ANY keywords in SQL?
  43. What is the difference between the EXISTS and NOT EXISTS operators?
    The EXISTS operator returns true if a subquery returns any rows, while the NOT EXISTS operator returns true if a subquery returns no rows.
  44. What is the purpose of the CROSS APPLY operator?
    The CROSS APPLY operator is used to invoke a table-valued function for each row of a table expression, returning the combined result set.
  45. What is a self-join?
    A self-join is a join operation where a table is joined with itself, useful for comparing rows within the same table based on related columns.
  46. What is an ALIAS command?
    The ALIAS command in SQL is the name that can be given to any table or a column, which can be referred to in the WHERE clause to identify a particular table or a column.
  47. Why are SQL functions used?
    SQL functions are used for the following purposes:
    • To perform some calculations on the data
    • To modify individual data items
    • To manipulate the output
    • To format dates and numbers
    • To convert the data types

Key Points on MS SQL Server History:

  1. Initial Development:
    • MS SQL Server was originally a collaboration between Microsoft, Sybase, and Ashton-Tate in 1989.
    • The first version, SQL Server 1.0, was released for OS/2.
  2. Separation from Sybase:
    • In 1994, Microsoft ended its partnership with Sybase and took full control of SQL Server's development.
    • Sybase retained the rights for its own database systems, while Microsoft developed MS SQL Server independently.
  3. Major Releases and Features:
    • SQL Server 6.0 (1995): Microsoft's first independent version after parting ways with Sybase.
    • SQL Server 7.0 (1998): Marked a complete rewrite of the database engine; introduced OLAP services.
    • SQL Server 2000: Introduced XML support, indexed views, and more.
    • SQL Server 2005: A major release with features like SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), and SQL Server Analysis Services (SSAS).
    • SQL Server 2008: Added features like data compression and policy-based management.
    • SQL Server 2012: Brought AlwaysOn availability groups and columnstore indexes.
    • SQL Server 2016: Introduced real-time operational analytics and support for JSON.
    • SQL Server 2017: Added support for Linux, making it cross-platform.
    • SQL Server 2019: Introduced Big Data Clusters and enhanced intelligence over all data types.
  4. Cloud Integration:
    • With Azure SQL Database, Microsoft shifted focus to cloud-based solutions, allowing seamless database management and scalability.
  5. Current Trends:
    • Integration with AI and machine learning.
    • Strong emphasis on data security, performance tuning, and hybrid cloud environments.

Keep building your data skillset

Explore more SQL, Python, analytics, and engineering tutorials.