diff --git a/UsedCars/UsedCars.xml b/UsedCars/UsedCars.xml index 7dee08f..97a949b 100644 --- a/UsedCars/UsedCars.xml +++ b/UsedCars/UsedCars.xml @@ -93,10 +93,6 @@ Firstname VARCHAR2(50) NOT NULL, Lastname VARCHAR2(50) NOT NULL, Address VARCHAR2(150) NOT NULL, - -- Use VARCHAR2 for phone numbers in order to retain leading zeros. - -- Format: leading 0 plus 1-digit area code plus up to 9 digits, e.g., - -- 02187654321; longer if we allow punctuation to be included. - -- (http://en.wikipedia.org/wiki/Telephone_numbers_in_New_Zealand) Phone VARCHAR2(11) NOT NULL, Date_Hired DATE DEFAULT SYSDATE NOT NULL, @@ -109,7 +105,11 @@ ); -

This solution uses the discrete approach for transforming subtypes (i.e., one relation for the supertype, plus one relation for each of the subtypes). The reason for this is that the three Staff subtypes each have their own attributes and it’s easier to add more subtypes if necessary. You could probably also make a reasonable argument for the additive approach (one relation for each subtype, including the supertype attributes in each) if you assume that the subtypes are exclusive, i.e., a staff member can’t be two different types at once (this seems unlikely in practice, however). The integrated approach is inappropriate for this scenario because it would force us to make all the subtype attributes optional (allow nulls), where currently they are all mandatory (no nulls).

+

This schema uses what’s known as the “discrete approach” for subtypes (i.e., one relation for the supertype, plus one relation for each of the subtypes). The reason for this is that the three Staff subtypes each have their own attributes and it’s easier to add more subtypes if necessary.

+ +

A surprising number of people seemed to be very confused about the primary keys of the subtype tables, despite the very clear statement that they “use the same primary key as Staff” (i.e., Staff_ID). Some people didn’t even specify primary keys for the subtype tables at all, or worse, specified primary keys for some (usually Sales) and not others! Another related issue was introducing something like a Salesrep_ID (or Service_ID, or whatever) in addition to the Staff_ID. While Salesrep_ID on its own is generally OK (we generally accepted this as a solution), the combination of both Salesrep_ID and Staff_ID is pointless, as they’ll both be unique and therefore might as well be the same value anyway.

+ +

Phone numbers should always be stored as text rather than numbers, to allow for things like punctuation and leading zeroes (which would vanish if the column used a numeric data type). Wikipedia’s entry on the New Zealand telephone numbering system reveals that numbers are at least 11 digits long if we ignore punctuation, so this is the minimum length that we accepted.

There are several different ways to calculate the age: using INTERVAL types as shown above, using the MONTHS_BETWEEN function, or simply doing your own date arithmetic calculation based on the number of days between the two dates. However, when doing the latter, you need to take into account two things:

@@ -121,7 +121,9 @@ -

If you work on the basis that “one year is 365 days”, then your age calculation could be off by up to a week! (Incidentally, using something like TO_CHAR(date, 'YYYY') is even worse; if Date_of_Birth is December 31 and Date_Hired is January 1, the calculation will be off by a whole year!) The built-in date functions already take these factors into account, so it's much safer to use them than to implement your own solution.

+

If you work on the basis that “one year is 365 days”, then your age calculation could be off by up to a week! (Incidentally, using something like TO_CHAR( <date>, 'YYYY') is even worse; if Date_of_Birth is December 31 and Date_Hired is January 1, the calculation will be off by a whole year!) The built-in date functions already take these factors into account, so it's much safer to use them than to implement your own solution.

+ +

A few people stated that the age constraint “can’t be implemented in Oracle”. This is clearly incorrect, as can be seen above. We can only assume that they got this mixed up with the inability in to use SYSDATE in CHECK constraints (because the result of the function isn’t deterministic). However, the age constraint doesn’t need to use SYSDATE, as it just directly compares two fixed date values.

@@ -180,9 +182,11 @@ ); -

It makes sense to include ON DELETE CASCADE on the foreign keys from subtypes to Staff, as subtype instances can’t exist independently of the corresponding supertype instance.

+

While not explicitly specified, it makes sense to include ON DELETE CASCADE on the foreign keys from subtypes to Staff, as subtype instances can’t exist independently of the corresponding supertype instance. We awarded bonus marks for implementing an appropriate CASCADE.

-

We also accepted Service_Staff as a table name, on the theory that this is a reasonable transformation for the purposes of clarity.

+

A few people seemed to be confused by the explanation of why the upper bound for Total_Hours was 4500, and added on some random number of hours to cater for overtime. This was, however, already included in the original upper bound of 4500, which should have been fairly clear from the calculation (40 52 is only 2080).

+ +

Disappointingly, only two people attempted to implement the Service_Total_Hours_Quarters constraint (both achieved a working solution). There is at least one other way—possibly more—that this can be implemented using ’s mathematical functions (left as an exercise for the reader).

@@ -221,7 +225,7 @@ CREATE TABLE Other ( Staff_ID NUMBER(4), - Salary NUMBER(8,2) NOT NULL + Salary NUMBER(8,2) NOT NULL CONSTRAINT Other_Salary_Min CHECK ( Salary >= 28080 ), -- @@ -231,7 +235,7 @@ ); -

Similar to Service_Staff above. We also accepted Other_Staff as a table name.

+

Similar to Service above.

@@ -288,8 +292,6 @@ Commission_Rate NUMBER(3,2) NOT NULL CONSTRAINT Sales_Valid_Commission_Rate CHECK ( Commission_Rate BETWEEN 0.00 AND 0.30 ), - -- We can't make Gross_Earnings a computed column, because it requires data - -- from other tables. Gross_Earnings NUMBER(8,2) NOT NULL CONSTRAINT Sales_Valid_Gross_Earnings CHECK ( Gross_Earnings >= 0.00 ), @@ -304,7 +306,13 @@ ); -

Similar to Service_Staff above. We also accepted Other_Staff as a table name. Note that the paragraph above about Gross_Earnings describes the process of how it is calculated, not a constraint on its possible values, so implementing this as a constraint is incorrect.

+

Similar to Service above. Note that the paragraph above about Gross_Earnings describes the process of how it is calculated, not a constraint on its possible values, so implementing this as a constraint is incorrect (although this is probably impossible in practice without using triggers anyway). Ideally, we’d make this a computed column, but ’s virtual columns unfortunately can’t access data from other tables.

+ +

Bonus marks were awarded for applying a sensible default to On_Commission, such as “N” or “false”. Another thing to watch out for when implementing columns like this is not to allow mixed case values (e.g., “T”, “t”, “F”, “f”). This unnecessarily complicates any logic that needs to compare these values, as you need to include an extra comparison for every permutation of the value. This can lead to subtle and difficult to detect logic bugs. You might argue that permitting mixed case is more flexible, but the obvious counter to this is that you can always permit mixed case input in the user interface, then convert everything to upper or lower case before storing it in the database.

+ +

One interesting aspect of this table is the overlapping constraints. Not only do we need to cater for the moderately complex interplay between On_Commission and Commission_Rate, we also need to ensure that Commission_Rate only accepts values between 0.00 and 0.30. Quite a few people implemented the complex constraint correctly, but forgot to enforce the maximum commission rate! You could merge the test for the upper bound into the Sales_Check_Commission constraint by changing the last comparison to ( Commission_Rate BETWEEN 0.00 AND 0.30 ), but having them as separate constraints makes it easier to figure out why an invalid value was rejected. The argument is essentially that the upper bound condition is completely independent of the other condition, and should therefore be checked independently and return a separate “exception”. You could even reasonably argue that the condition of Sales_Valid_Commission_Rate should be changed to ( Commission_Rate < 0.30 ), to avoid redundant checking of the lower bound.

+ +

The data type of Commission_Rate caught a few people out. Remember that the precision of a NUMBER is the total number of significant digits, not the number of digits before the decimal point. A column declared as NUMBER(8,2) has eight significant digits, two of which appear after the decimal point. The column can therefore only store values in the range -999999.99 to 999999.99, even before any constraints are applied. Commission_Rate therefore has to be at least NUMBER(2,2) in order to store the specified values. A few people declared it as NUMBER(1,2), which effectively means that the column can only physically store values between 0.00 and 0.09! Anything larger (e.g., 0.10) requires at least two significant digits.

@@ -398,7 +406,9 @@ ); -

Many people assigned the Comments a size of 500 characters or less, which is far too small for a general comments column. If we assume an average word length of five characters plus one for the inter-word space, then 500 characters can only hold about 83 words! 100 characters would be only 16 words! We accepted 500 characters as a bare minimum; anything below that lost marks. A better choice would be to use either the maximum allowable VARCHAR2 size of 4000, or just use a CLOB, which for most practical purposes is effectively unlimited in size. A similar argument applies to the Details columns in Purchase and Sale.

+

A few people assigned the Comments a size of less than 500 characters, which is far too small for a general comments column. If we assume an average word length of five characters plus one for the inter-word space, then 500 characters can only hold about 83 words. 100 characters would be only 16 words! We accepted 500 characters as a bare minimum; anything below that lost marks. A better choice would be to use either the maximum allowable VARCHAR2 size of 4000, or just use a CLOB, which for most practical purposes is effectively unlimited in size. A similar argument applies to the Details columns in Purchase and Sale.

+ +

We awarded bonus marks for checking the format of Email, either using LIKE or regular expressions. Note that a UNIQUE constraint on Email probably doesn’t make sense, as we’re not using it as a username or similar. In real life, it’s not unusual for two or more people to share the same email address.

@@ -507,7 +517,6 @@ Flat_Rate NUMBER(4) NOT NULL CONSTRAINT Car_Valid_Flat_Rate CHECK ( Flat_Rate > 0 ), -- - -- Not specified, but makes sense. Bonus marks! CONSTRAINT Car_Valid_Service_Date CHECK ( Last_Serviced >= First_Registered ) -- @@ -515,12 +524,14 @@ ); -

The statement of VIN as “17 characters” was obviously a little too subtle for some people, who implemented it as NUMBER(17) rather than CHAR(17). VINs contain both numbers and letters, and the format is well-documented if you weren’t sure (e.g., look up “VIN” on Wikipedia).

- -

There was a small typo in the original specification, which said that the value of Odometer should be “0.0–999,999.0”, when it should actually have been 0.0–999,999.9 (this has been corrected above). We accepted either when marking.

+

The statement of VIN as “17 characters” was obviously a little too subtle for some people, who implemented it as NUMBER(17) rather than CHAR(17). VINs contain both numbers and letters, and the format is well-documented if you weren’t sure (e.g., search for “VIN” on Wikipedia). We had hoped that someone might try to validate the format of the VIN, but sadly no-one did, so no bonus marks there .

There is little point in storing Year as a DATE, as we won’t be using anything but the year part anyway. Storing it as a DATE means that we would need to do unnecessary extra work to extract the year part from the date. If you don’t need something, don’t store it, especially if it makes things more complicated!

+

We’d also hoped to award some bonus marks for checking that Last_Serviced was no earlier than First_Registered. This wasn’t explicitly specified, but makes a lot of sense to check. But again, no-one attempted this .

+ +

Note that while Price is specified as ≥ 0, Flat_Rate is specified as > 0. A common error was to have either both >, or both ≥.

+ @@ -578,8 +589,6 @@ ); -

At this point we also need to add an associative entity to resolve the many-to-many relationship between Car and Feature:

-

CarFeatureVIN, FeatureCode (foreign keys: VIN Car, FeatureCode Feature)

@@ -672,19 +681,19 @@ CONSTRAINT Warranty_PK PRIMARY KEY ( W_Code ) ); -INSERT INTO Warranty ( W_Code, Duration, Distance, Description ) +INSERT INTO Warranty ( W_Code, Max_Age, Max_KM, Duration, Distance, Notes ) VALUES ( 'A', 4, 50000, 3, 5000, 'Category A motor vehicle' ); -INSERT INTO Warranty ( W_Code, Duration, Distance, Description ) +INSERT INTO Warranty ( W_Code, Max_Age, Max_KM, Duration, Distance, Notes ) VALUES ( 'B', 6, 75000, 2, 3000, 'Category B motor vehicle' ); -INSERT INTO Warranty ( W_Code, Duration, Distance, Description ) +INSERT INTO Warranty ( W_Code, Max_Age, Max_KM, Duration, Distance, Notes ) VALUES ( 'C', 8, 100000, 1, 1500, 'Category C motor vehicle' ); -INSERT INTO Warranty ( W_Code, Duration, Distance, Description ) +INSERT INTO Warranty ( W_Code, Max_Age, Max_KM, Duration, Distance, Notes ) VALUES ( 'D', NULL, NULL, 0, 0, 'Category D motor vehicle' ); -

This is a lookup table that is intended to define the types of warranties allowed for vehicles. If the regulations change, it’s a simple matter to change the contents of the table to reflect this, which means that the table is effectively acting as a configurable constraint on warranty type. It would therefore be counterproductive to add constraints to the columns of this table based on the values specified in the Appendix, as the whole point of a lookup table is that the contents of the table effectively define a constraint. The values in the Appendix should therefore be used only to determine the data to be inserted into the table.

+

This is a lookup table that is intended to define the types of warranties allowed for vehicles. If the regulations change, it’s a simple matter to change the contents of the table to reflect this, which means that the table is effectively acting as a configurable constraint on warranty type. It’s therefore counterproductive to add constraints to the columns of this table based on the values specified in the Appendix, as the whole point of a lookup table is that the contents of the table effectively define a constraint. The values in the Appendix should therefore be used only to determine the data to be inserted into the table. (Incidentally, most people didn’t insert any data into this table despite being told to do so above. A couple of people inserted essentially random data, which scored zero marks.)

-

On a similar note, there is little point in trying to enforce constraints between the specification of the warranty type and the First_Registered column in Car. There is no dedicated column in the table to store this information, so it could only be stored in Description, and it would probably be almost impossible in practice to extract anything useful from this column. It could be possible, however, to use a trigger to check that a car is assigned the correct warranty category for it’s odometer reading when it’s sold.

+

It might be possible to use a trigger to check that a car is assigned the correct warranty category for it’s odometer and age reading when it’s sold.

@@ -761,7 +770,9 @@ ); -

Note the absence of ON DELETE CASCADE on the foreign keys in this table. If we attempt to delete a salesrep, it makes little sense to delete all their associated purchases, especially if they are completed. This would cause a serious accounting problem for the company. A better solution might be to have a “dummy” salesrep that can have “orphaned” purchases assigned to it.

+

Note the absence of ON DELETE CASCADE on the foreign keys in this table. If we attempt to delete a salesrep, it makes little sense to delete all their associated purchases, especially if they are completed. This would cause a serious accounting problem for the company. A better solution might be to have a “dummy” salesrep that can have “orphaned” purchases assigned to it. Even better, add a Status column to the Staff table, and use this to record that the salesrep no longer works for the company, rather than deleting them.

+ +

Quite a few people had both Salesrep_ID and Staff_ID columns in this table (and similar for Sale). This again doesn’t really make sense, for the same reasons as the Staff subtype tables. We suspect this is an example of people tweaking their code until the schema checker stopped complaining, without stopping to think about why the checker was complaining.

@@ -844,7 +855,7 @@ ); -

Similar to Purchase above. While the foreign key for trade-ins could go in either table (or even both), it really makes most sense to place it here, since trade-ins only occur as part of a sale transaction.

+

Similar to Purchase above. While the foreign key for trade-ins could technically go in either Purchase or Sale (or even both), it really makes most sense to place it here, since trade-ins only ever occur as part of a sale transaction.