In the Relational Model, the most important kind of key is the candidate key. A candidate key is a set of columns whose values must be unique for all rows. No element of a candidate key may be null, since nulls cannot be compared for equality (which is the basis of uniqueness). Every table must have at least one candidate key, in order to avoid duplicate rows, which should be avoided --- as Edgar Codd observed, “Saying something more than once doesn’t make it any more true.”
In SQL, there is no CANDIDATE KEY constraint – only PRIMARY KEY for the candidate key chosen as the most useful identifier, and UNIQUE for others. Note that PRIMARY KEY implies NOT NULL for all attributes, but UNIQUE does not, so you should add NOT NULL constraints for those.
SQL allows at most one PRIMARY KEY constraint per table, and while it will permit a table with no primary key, you should never do this in practice (other than for temporary tables for importing data that need to be cleaned up first).
Note that in general, a key is a set of columns, and it is not unusual for a key to have more than one (these are known as composite keys). When declaring a composite key, use the usual SQL list syntax, like so:
primary key (Student_ID, Paper_Code, Year, Semester)
Note that adding more columns to a key makes it “less unique”, i.e. it undermines its strictness. Therefore, use only the minimal set of attributes required to assure uniqueness. Using existing columns to form a composite primary key is generally preferable to creating a new surrogate key, since surrogate key values have no intrinsic meaning.
Using the ERD as a guide, identify the primary key for each of the following tables:
• Scientist:
• Sample:
• Site:
Now modify your CREATE TABLE statements to implement these, and test that they are correctly enforced.
Make sure they are named table-level constraints.