Post

Dbs101_flippedclass_5

Topic: Normalization

From what I have learned, Data normalization is a process used in database design to organize databsase structure efficiently and to reduce data redunancy(repetition). It involves breaking down a database into smaller, logical units and establishing relationships between them and the main aim is to minimize data repetition and dependence therefore, improving data integrity.

Types of Normal form

During my cousrse of diving through normalization I found out that there are different types of normal form

Alt text

First Nomal form (1NF)

There are basic principle or rule for the database to be in a First normal form:

  1. The database should have single valued attributes or columns

  2. All the values should be of the same domain.

  3. All the columns in the table should have unique names

  4. The order in which data is stored should not matter

Alt text

let me show you a example, in the following table as said in the above rules each column should contain only single value but in this case there are more one values in the subject column.

1
![Alt text](../image/1NF.png)

So to fix this we just need to break the values in the singl values as shown in the table below.

Alt text


Second normal form (2NF)

This form also has some basic principle to be followed for the data to be in 2NF. The principle are;

  1. First and fore most the table should be in 1NF.

  2. There should not be any Partail Dependency.

Before we dive in let me tell you what Partail Dependency is, It an attribute in a table that depends on only a part of the primary key but not the whole key.

for example, In the table given below we have partial dependency and here is how,

Emplyoo-Task

Employee_IDTask_NoEmployee_NameTask_Name
C0134MonaApp Development
CO258GenineUX/UI Designing

the non-primary keys are Employee_Name and Task_Name. Employee_Name can be determined by Employee_ID and Task_Name and can be determined by Task_No. Thus the table(Employee-Task) relation would violate the 2NF in normalization and is considered a bad database design, So we decompose the table to remove the Partail Dependence

Employee_Info

Employee_IDTask_NoEmployee_Name
C0134Mona
CO258Genine

Task_Info

Task_NoTask_Name
34App Development
58UX/UI Designing

Thus, the relation happens to be in the second normal form in the case of Database Normalization.


Third Normal form (3NF)

Same as 1NF and 2NF, 3NF also has some rule to be followed and they are;

  1. the table should be in Second normal form.

  2. And the table should not have Transitive Dependency.

And again before I say anything further let me tell you what Trasitive Dependency is, “Transitive Dependency refers to some non-primary key other than the primary key that depends on another non-primary akey that is dependent entirely on the primary key.”

Example,

The table below is not in 3NF because it includes a transitive dependency,

Show_IDTelecast_IDTelecast_TypeCD_Cost ($)
F08S09Thriller50
F03S05Romantic30

Since, Show_ID is related to Tecast_ID AND Telecast_ID is related to Telecast_Type therfore, there exist a Trasitive Dependency as one non-primary key is related to another non-primary key.

So to solve this we need spilt the table to remove the voilation of 3NF.

Show

Show_IDTelecast_IDCD_Cost ($)
F08S0950
F03S0530

Telecast

Telecast_IDTelecast_Type
S09Thriller
S05Romantic

So now the above table is now on 3NF of normalization.


Boyce-codd normal form (BCNF)

Its is a extention of third noraml form and is also known as 3.5 normal form. and like any other normal form it also has some conditions to be followed;

  1. The table shoul be Third noraml form
  2. And, for a dependency A–>B, A, can not be non-primary attribute, if B, is a primary attritube.

If a non-primary attribute determines a primary attritube than it does not satisfy the BCNF. Inorder to do it we need to decompose table.

Below in the table is an example how we can table into BCNF:

Student_IDSubjectProfessor
101JavaP.Java
101C++P.Cpp
102JavaP.Java2
103C#P.Chash

In the table above the primary key are Student_ID + Subject, because using these two we can find all the columns in the table. One professor teaches only one subject, but one subject may have two different professors. So we see there is dependency there where the subject is depending on the professor name. And since the subject being the primary attribute, and professor being the non-primary attribute which is not allowed in BCNF.

To fix this we decompose the table into two tables “Student” and “Professor” table,

Student

Student_IDp_ID
1011
1012
1023
1034

Professor

P_IDSubjectProfessor
1JavaP.Java
2C++P.Cpp
3JavaP.Java2
4C#P.Chash

Now the table will satify BCNF.


Fourth normal form (4NF)

For a table to satisfy 4NF the following condition should be full filled;

1
2
3
1. The table should be in BCNF

2. The table should not have Multi-valued Dependency.

Example; The table below shows the college enrolment table with columns s_id, course and hobby.

  • s_ID being the primary key
  • course is multi-valued attribute
  • hobby is the attribute that does not depend on the course.
s_IDcoursehobby
1sciencefootball
1mathvollyball
2EVScricket
2historybaseball

Now to make the table into 4NF we need to decompose the table into two.

R1(student-couser) table

s_IDcourse
1science
1math
2EVS
2history

R2(student-hobby) table

s_IDhobby
1fotball
1vollyball
2cricket
2baseball

Now the tables are in 4NF, there are no multi-valued dependencies, and all functionally dependind on the primary key.

This post is licensed under CC BY 4.0 by the author.