Back to all posts

Normalization in Databases

Normalization in a database is the process of organizing the data to reduce redundancy and improve data integrity. The main goal of normalization is to bre…

  • Normalization in a database is the process of organizing the data to reduce redundancy and improve data integrity.
  • The main goal of normalization is to break down larger tables into smaller, more manageable ones, ensuring that data is stored logically and efficiently.
  • This minimizes duplication and eliminates anomalies during data insertion, update, or deletion.

There are generally six types of normalization forms in database design, each with its own rules and objectives for organizing data. Here’s an overview:

First Normal Form (1NF):

  • Ensures that the table has no repeating groups and that each column contains only atomic (indivisible) values.
  • Each record must be unique.

Second Normal Form (2NF):

  • Builds on 1NF.
  • Ensures that all non-primary key attributes are fully dependent on the entire primary key.
  • Eliminates partial dependency (i.e., dependency on part of a composite primary key).
    Problems with the Current Table:
    • The table is in 1NF because it has atomic values and unique rows, but it suffers from partial dependency.
    • StudentName depends only on StudentID, not on CourseID.
    • CourseName and Instructor depend only on CourseID, not on StudentID.

    Third Normal Form (3NF):

    • Builds on 2NF.
    • Ensures that there are no transitive dependencies, meaning non-key attributes should not depend on other non-key attributes.
    • All non-key attributes must be directly dependent on the primary key.

      Boyce-Codd Normal Form (BCNF):

      • A stronger version of 3NF.
      • Ensures that every determinant (i.e., an attribute that can determine other attributes) is a candidate key, avoiding anomalies in complex relationships.

      Fourth Normal Form (4NF):

      • Ensures that there are no multi-valued dependencies in a table.
      • A relation should not have more than one independent multi-valued dependency.

        Fifth Normal Form (5NF) or Project-Join Normal Form (PJNF):

        Fifth Normal Form (5NF), also known as Project-Join Normal Form (PJNF), is the highest level of database normalization. It addresses complex relationships between tables and ensures that the data is completely free of redundancy by eliminating join dependencies.

          There are even more advanced normal forms like Sixth Normal Form (6NF) and Domain-Key Normal Form (DKNF), which deal with specific edge cases and are rarely used in typical database applications.

          Keep building your data skillset

          Explore more SQL, Python, analytics, and engineering tutorials.