Database Normalization

Database Normalization


A badly designed relational schema has a number of redundancies and therefore it is less efficient. These relation schemas have improper redundancies that lead to a number of problems as Redundancy, insertion anomalies, update anomalies and delete anomalies.
Normalization is a database design technique where database is analyzed to organize the tables in a manner that reduces redundancy and Data dependency. Practically, it is not possible to make redundancy ZERO even in BCNF but it can be reduced with higher normal form.

Normal Forms

Basically, there are 4 most important normal forms that are:

First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)

There is also a relationship in different normal forms. The degree of normalization goes on increasing from 1 NF to BCNF. A relation present in 1 NF is less normalized than a relation in BCNF. Also, if a relation is present in higher normal form, of course it will be in lower normal forms. For example, if a table is in BCNF then of course, it is in 1 NF , 2NF and 3 NF also.
Redundancy in Normal forms
1NF>2NF>3NF>BCNF

First Normal Form
First normal form states that all the attributes in a table must have atomic values and there should be no repeating groups i.e each cell should have only 1 value.
Consider the following example of table DEPARTMENT

DEPARTMENT

DNAME DNUMBER DLOCATION
Research 1 Boston
Marketing 2 {Boston, NewYork,Sugarland}
Administration 3 {Sugarland,Houston}

{Violates 1 NF}

This relation violates 1 NF because there are multiple values for DLOCATION in last two tuples.

DNAME DNUMBER DLOCATION
Research 1 Boston
Marketing 2 Boston
Marketing 2 NewYork
Marketing 2 Sugarland
Administration 3 Sugarland
Administration 3 Houston

{ 1NF}

Second Normal Form(2NF)
For a relation schema to be in second normal form (2NF), every non-prime attribute A in R must be fully functionally dependent on the primary key.
Prime attributes are the attributes that are part of any candidate key and rest of attributes are non-prime attributes.
Note: A relation Schema having non-composite key is always in 2 NF because there can be no partial dependency.

Third Normal Form (3NF)
For a relation schema R to be in third normal form (3NF), two conditions are needed to be satisfied : 1) must be in 2NF and no non-prime attribute A in R is transitively dependent on the primary key.
In simple terms, if there is a functional dependency X-> Y then either X should be super key or Y should be a Prime attribute.
Practically a relational Schema is considered normalized if it is in 3NF because most of the 3NF relations are free of any type of anomalies. Also, Lossless and dependency-preserving decomposition into 3NF is always possible

Boyce-Codd Normal Form (BCNF)
BCNF is stricter than 3NF. In BCNF, if there is a functional dependency X-> Y then X must be a super key.
Note: Any relation with two attributes is always in BCNF.

GATE-2005 (IT)

A table has fields F1, F2, F3, F4, and F5, with the following functional dependencies:
F1->F3
F2->F4
(F1,F2)->F5
We need to find the normal form for this table.
Sol: Normal form for a table is the highest normal form for which conditions are satisfied in the table.

First we need to find out the keys as not given. For finding keys, you can switch to blog- “Determining keys using functional dependencies”. It is very easy to find here that key is composite (F1,F2).


Now start from 1 NF. As nothing is given about atomic values we will consider it is in 1NF.


Now check for 2NF. Since key for this table is composite, it becomes mandatory to check for full dependencies. Form first two dependencies, it is easy to find out that 2NF is violated. F3 and F4 are non-prime attributes and they are partial dependent on key i.e dependent on a part of key. Therefore the given table is not in 2 NF.
Table is in 1 NF.


GATE-2004 The relation schema Student_Performance (name, courseNo, rollNo, grade) has the following FDs:
name,courseNo->grade
rollNo,courseNo->grade
name->rollNo
rollNo->name
The highest normal form of this relation scheme is
(a) 2NF (b) 3NF (c) BCNF (d) 4NF

Solution:

Find keys
Here two composite keys are : (name,courseNo) and (rollNo,courseNo)


Check for 2NF. There is only 1 non-prime attribute that is grade. Grade is fully dependent on all the keys, so in 2 NF.


Check for 3 NF
First consider FD1 : name,courseNo->grade here L.H.S has key so satisfy 3NF.
Consider FD2 : rollNo,courseNo->grade here also L.H.S has key so satisfy 3NF.
Consider FD3 : name->rollNo here R.H.S has prime attribute so satisfy 3NF.
Consider FD4 : rollNo->name here also R.H.S has prime attribute so satisfy 3NF.


Check for BCNF
FD1 and FD2 satisfy BCNF also because both having key in L.H.S. FD3 and FD4 not satisfying BCNF condition because no key on L.H.S


Therefore given relation schema is in 3NF

No comments:

Post a Comment