Normalization of Data

Normalization technique logically groups the data over a number of tables, which in turn removes the duplicate information available in the database. The result of normalized tables contain simple data items and relations between tables.

Example for a Purchase Order information for a Company

SUPER COMPUTERS ORDER NUMBER : 1811

CUSTOMER NUMBER : 1400
CUSTOMER ADDRESS : ABC COMPUTERS, 67822, AIRPORT FREE WAY, IRVING, TX - 75039
ORDER DATE : 10/10/1999
DELIVERY DATE : 10/11/1999
ITEM ID ITEM DESCRIPTION PRICE QTY TOTAL
1001 17 inch monitor 200.00 5 1000.00
1002 Box in the Store( 500 MHz, 124 RAM, 56KB, 8 GB 17 inch monitor ) 999.99 10 9999.90
1003 Dell Computer every thing you wanted 1666 2 3332.00
1004 Geteway for Kids 1000 3 3000.00
Total amount for the Purchase Order 17,331.90

So according to the above sample Purchase order we need a table in the following structure to save the data.

PURCHASE_ORDER ( ORDER_NO, CUST_NO, CUST_ADDRESS, ORDER_DATE, DEL_DATE, ITEM_ID, ITEM_DESC, PRICE, QTY, TOTAL )

First Normal Form
Remove any repeating group of attribute types and rewrite them as a new table.

So if we take close look at the above Purchase Order,ITEM information is repeatedly shown,in order to avoid this,we create another table called ITEMS.We can do this by splitting the table and create another table.

PURCHASE_ORDER ( ORDER_NO, CUST_NO, CUST_ADDRESS, ORDER_DATE, DEL_DATE, TOTAL)

ORDER_ITEMS ( ORDER_NO, ITEM_ID, ITEM_DESC, PRICE, QTY)

Second Normal Form
Every non-identtifying attribute type must be fully functionally dependent on the identifier.

PURCHASE_ORDER(ORDER_NO, CUST_NO, CUST_ADDRESS, ORDER_DATE, DEL_DATE, TOTAL)

ORDER_ITEMS( ORDER_NO, ITEM_ID, QTY)

ITEMS( ITEM_ID, ITEM_DESC, PRICE)

Third Normal Form
Remove any dependencies between non-identifying attributes.

PURCHASE_ORDER(ORDER_NO, CUST_NO, ORDER_DATE, DEL_DATE, TOTAL)

ORDER_ITEMS(ORDER_NO, ITEM_ID, QTY)

ITEMS( ITEM_ID, ITEM_DESC, PRICE)

CUSTOMER (CUST_NO, CUST_ADDRESS)

Once you have normalized the tables you end up having all the tables independent and can be related to one another using foreign keys.

ADVANTAGES:

1.Storage space is saved by removing Redudant Data

2.Transactions are faster

DISADVANTAGES:

1.SQL is slower because it involves more tables