Peck Tech Designs - Database Integrity

Click below to see
some references:

Introduction to Data Integrity

A good database designer will go to great lengths to protect the quality of the information contained in a database. This is called data integrity. Following are some of the most important tools and methods available to achieve this goal:

Required fields

All important information is "required". For example, one of our clients has a very complicated order with a lot of information, and frequently the manufacturing people needed to talk to the sales person who entered the order to get questions answered. We made "sales person" a required field - the sales person couldn't "save" the order in the database until they enter their initials. If someone tries to save the order without entering their initials, they get the following message:

Rules

Let's say that a company has an inventory of trailers, and they store the trailer's width and length. Rules should be set up establishing the allowed minimum and maximum widths and lengths of the trailers to help avoid typos. If someone types a width or length that doesn't meet the rules, it's undoubtedly an error, and the system won't let them finish "saving" the new unit to the database until the error is corrected. If they accidentally type "77" instead of "7" for the width, for example, they'd get the following message:

Code tables

A very powerful capability of modern relational database is the ability to store coded information in its own table and then make sure that all information of that type entered into the database is one of the allowed codes.

For example, let's say that you're trying to evaluate the effectiveness of various methods of advertising. For each new account, it's important to store the source of new business - but only certain categories are meaningful. For example:

Without a code table, users are likely to type in any of the following for a call from a yellow page ad:

Phone company ad
Yellow page ad
Telephone book

When the same thing is entered more than one way, it's difficult to get an accurate count of how many customers came from this source. Using today's database programs, it's very easy to set up the data entry screen so users can only select from the allowed options, as stored in the code table:

BEFORE selecting an option from the list:	AFTER selecting an option from the list:

If you get a new advertising source, you just have to add a new code to the code table and it automatically appears on every screen and report that has that code on it.

Referential Integrity

(Please read the Introduction to Relational Databases first, before trying to read this section).

Let's say you have two important tables, "Customers" and "Orders." The Orders table stores the CustomerID. The database should be set up to require that only valid Customer ID's can be entered into the Orders table (otherwise, you wouldn't know which customer placed a certain order.) If that's done and someone puts a non-existent CustomerID into an order, they would see the following message:

You can't create an order for a non-existent CustomerID:

Similarly, the database should be set up so that if an order exists for a certain CustomerID, you can't delete that customer or the same problem would result - you would no longer know which customer placed that order:

You can't delete a customer if they have any orders:

Normalized design

(Please read the Introduction to Relational Databases first, before trying to read this section).

One of the most important ways to protect the quality of the data stored in a database is to make sure that the database design is "normalized"; e.g., to make sure that no information is repeated more than once.

It's important to eliminate duplicate data because the more times information is repeated:

the greater the chances an error will be made,
the less the chances it will be noticed, and the
more difficult it is to fix it in all the places it occurs.

A good database design should use all these techniques to avoid costly errors.