\documentclass[12pt]{article}

\usepackage{palatino}
\usepackage[dcucite]{harvard}
\usepackage{graphicx}
\usepackage{url}
% \usepackage{amsmath}
\usepackage{amssymb}
\usepackage{varioref}
\usepackage{pstricks}
\usepackage{pst-node}
\usepackage{subfigure}
\usepackage[normalem]{ulem}
\usepackage{a4wide}

\title{Modifications to Smith's method for deriving normalised relations from a functional dependency diagram}
\author{Nigel Stanger\thanks{Address correspondence to: N. Stanger, Department of Information Science, University of Otago, P.O. Box 56, Dunedin, New Zealand. Fax: +64-3-479-8311. Email: nstanger@infoscience.otago.ac.nz}}
\date{December 1999}

% single-valued dependency, i.e., A -> B
\newcommand{\svd}[2]{\emph{#1} $\rightarrow$ \emph{#2}}

% multivalued dependency, i.e., A ->> B
\newcommand{\mvd}[2]{\emph{#1} $\twoheadrightarrow$ \emph{#2}}

\newcommand{\shortpage}{\enlargethispage{-\baselineskip}}
\newcommand{\longpage}{\enlargethispage*{\baselineskip}}

\begin{document}

\urlstyle{same}

\maketitle

\begin{abstract}
Smith's method \cite{Smit:HC:1985} is a formal technique for deriving a set of normalised relations from a functional dependency diagram (FDD). Smith's original rules for deriving these relations are incomplete, as they do not fully address the issue of determining the foreign key links between relations. In addition, one of the rules for deriving foreign keys can produce incorrect results, while the other rule is difficult to automate. This paper describes solutions to these issues.
\end{abstract}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Introduction}
\label{Sec:SmithsMethod:Introduction}

A \emph{functional dependency diagram} (FDD) is a means of graphically modelling the dependencies within a collection of attributes \cite[pp.~294--295]{Date:CJ:1995:IDS}. \emph{Smith's method} \cite{Smit:HC:1985} is a formal technique for deriving a set of normalised relations from an FDD. As part of this process, Smith defines two rules for deriving foreign keys (referred to by the author as the \emph{target bubble rule} and the \emph{domain flag rule} respectively). There are three major issues with these rules:
\begin{enumerate}
	\item the target bubble rule does not always produce all possible foreign keys, even in relatively simple FDDs;
	\item the target bubble rule can in certain situations produce `foreign keys' that violate the relational definition of a foreign key; and
	\item the domain flag rule is difficult to automate, because information that is important to the correct derivation of foreign keys cannot be expressed using Smith's original FDD notation.
\end{enumerate}
This paper describes new rules for deriving foreign keys from an FDD. In addition, the author proposes some minor modifications to Smith's original FDD notation to facilitate the process of deriving foreign keys.

Smith's original method is summarised in Section~\ref{Sec:SmithsMethod:Overview}. The issues with deriving foreign keys are then described in more detail in Section~\ref{Sec:SmithsMethod:FKProblem}. In Section~\ref{Sec:SmithsMethod:Solution}, modifications to Smith's FDD notation and two new foreign key derivation rules are proposed to address these issues. An example of the new rules in use is presented in Section~\ref{Sec:SmithsMethod:Examples}, and the paper is concluded in Section~\ref{Sec:Conclusions}. It is assumed that readers are familiar with the concepts and terminology of relational dependency theory \cite{Arms:WW:1974,Beer:C:1977,Date:CJ:1995:IDS}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Overview of Smith's method}
\label{Sec:SmithsMethod:Overview}

As previously stated, a functional dependency diagram (FDD) is a graphical representation of the functional dependencies within a collection of attributes. Smith derives FDDs from a set of plain English \emph{dependency-list statements}, such as that shown in Figure~\ref{Fig.SmithsMethod.DependencyListStatement}.

%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htb]
	\hrule\medskip
	\begin{quote}
		\small \sffamily
		Anticipated design engineering work is organized into JOB\_NO engineering job numbers. Each JOB\_NO has one TYPE\_JOB (i.e., `1' = Basic Release, `2' = Sustaining,~\ldots), one RESP\_ENGR responsible engineer (entered as an employee number), and one DUE\_DATE planned due date.
	\end{quote}
	\medskip\hrule
	\caption[Example of a dependency-list statement]%
		{Example of a dependency-list 
		 statement~\protect\cite[Figure~1]{Smit:HC:1985}}
	\label{Fig.SmithsMethod.DependencyListStatement}
\end{figure}

%%%%%%%%%%%%%%%%%%%%

%================================================================================

\subsection{Smith's FDD notation}
\label{Sec:SmithsMethod:Overview:Notation}

In Smith's FDD notation, attributes (Smith refers to these as `fields') are placed within \emph{bubbles}. Multiple attributes may be placed within the same bubble to simplify the diagram (see Figure~\ref{Fig.SmithsMethod.AttributesBubbles}).

%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htb]
	\centering
	\psset{linewidth=0.5pt}
	\hrule\medskip
	\hfill
	\subfigure[]%
	{	\begin{minipage}{2cm}
			\centering
			\scriptsize
			\sffamily
			\ovalnode{B1}{NAME}
		\end{minipage}
	}
	\hfill
	\subfigure[]%
	{	\begin{minipage}{3cm}
			\centering
			\scriptsize
			\sffamily
			\ovalnode{B2}{\shortstack{NAME \\ \\ ADDRESS \\ \\ PHONE}}
		\end{minipage}
	}
	\hfill
	\subfigure[]%
	{	\begin{minipage}{5.5cm}
			\centering
			\scriptsize
			\sffamily
			\ovalnode{B3}{NAME+ADDRESS+PHONE}
		\end{minipage}
	}
	\hfill
	\medskip\hrule
	\caption[Attributes and bubbles]%
		{(a) A bubble that contains a single attribute;
		 (b), (c) bubbles that contain multiple attributes}
	\label{Fig.SmithsMethod.AttributesBubbles}
\end{figure}
 
%%%%%%%%%%%%%%%%%%%%

A functional or \emph{single-valued} dependency \svd{A}{B} is represented by a single-headed arrow between the corresponding bubbles. The bubble at the start of the arrow is called a \emph{prime-key bubble}, as shown in Figure~\ref{Fig.SmithsMethod.Dependencies.Single}. The bubble at the end of the arrow is called a \emph{target bubble}.

\longpage
A multivalued dependency \mvd{C}{D} is represented by a double-headed arrow between the corresponding bubbles. If the bubble at the end of the arrow is a prime-key bubble, then the bubble at the start of the arrow is called an \emph{uplink-key bubble}; otherwise it is a prime-key bubble. If the bubble at the end of the arrow is neither a prime-key bubble nor an uplink-key bubble, then it is known as an \emph{end-key bubble}, as shown in Figure~\ref{Fig.SmithsMethod.Dependencies.Multi}.

%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htb]
	\centering
	\psset{linewidth=0.5pt,arrowscale=1.5}
	\hrule\medskip
	\subfigure[\protect\label{Fig.SmithsMethod.Dependencies.Single}]%
	{	\scriptsize
		\sffamily
		\begin{psmatrix}
			\ovalnode{SRCS}{CUST\_NO}\nput{90}{SRCS}{\textit{prime key}}	&	\ovalnode{DSTS}{NAME}\nput{90}{DSTS}{\textit{target}}	\\
		\end{psmatrix}
		\ncline{->}{SRCS}{DSTS}
	}
	\hspace{2cm}
	\subfigure[\protect\label{Fig.SmithsMethod.Dependencies.Multi}]%
	{	\scriptsize
		\sffamily
		\begin{psmatrix}
			\ovalnode{SRCP}{CUST\_NO}	&	\ovalnode{DSTP}{ORDER\_NO}	\\
			\ovalnode{SRCU}{CUST\_NO}	&	\ovalnode{DSTU}{ORDER\_NO}	&	\rnode{T}{}	\\
		\end{psmatrix}
		\nput{90}{SRCP}{\textit{prime key}}
		\nput{90}{DSTP}{\textit{end key}}
		\nput{90}{SRCU}{\textit{uplink key}}
		\nput{90}{DSTU}{\textit{prime key}}
		\ncline{->>}{SRCP}{DSTP}
		\ncline{->>}{SRCU}{DSTU}
		\ncline[linestyle=dashed]{->}{DSTU}{T}
	}
	\medskip\hrule
	\caption[Single- and multivalued dependencies]%
		{(a) Single- and (b) multivalued dependencies}
	\label{Fig.SmithsMethod.Dependencies}
\end{figure}

%%%%%%%%%%%%%%%%%%%%

Attributes may be placed within more than one bubble. `Multibubbles' are often used to show the linkage of a chain of uplink-key, prime-key and end-key bubbles, as shown in Figure~\ref{Fig.SmithsMethod.MultiFlags.Bubbles}. Each bubble is independent of the others.

\emph{Domain flags} are used to tag attributes which belong to the same domain. For example, \textsf{EMP\_NO} and \textsf{DEPT\_MGR} both belong to the domain `employee number', as shown in Figure~\ref{Fig.SmithsMethod.MultiFlags.Flags}.
 
%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htb]
	\centering
	\psset{linewidth=0.5pt,rowsep=0.5cm,colsep=0.75cm,arrowscale=1.5}
	\hrule\medskip
	\subfigure[\protect\label{Fig.SmithsMethod.MultiFlags.Bubbles}]%
	{	\centering
		\scriptsize
		\sffamily
		\begin{psmatrix}
			\rnode{MB1D}{}	&	\ovalnode{MB1}{\ovalnode{MB2}{\ovalnode{MB3}{EMP\_NO}}}	&		\rnode{MB3D}{}	\\
							&	\rnode{MB2D}{}	&	\\
		\end{psmatrix}
		\ncline{->}{MB1D}{MB1}
		\ncline[border=2pt]{->>}{MB2}{MB2D}
		\ncline[border=2pt]{->}{MB3}{MB3D}
	} % \subfigure
	\subfigure[\protect\label{Fig.SmithsMethod.MultiFlags.Flags}]%
	{	\centering
		\scriptsize
		\sffamily
		\setlength{\unitlength}{1mm}
		\begin{picture}(125,25)
			\put(0,0)%
			{	\makebox(125,25)%
				{
			 		\begin{psmatrix}
			 			\ovalnode{ENOB}{\rnode{ENO}{EMP\_NO}}		&	\ovalnode{DNOT}{\ovalnode{DNOP}{DEPT\_NO}}		&	 		\ovalnode{DEPT}{DEPT\_NAME + \rnode{DMGR}{DEPT\_MGR}}	\\
			 		\end{psmatrix}
			 		\nput{45}{DMGR}{\pstribox{1}}
			 		\nput{135}{ENO}{\pstribox{1}}
			 		\nput{-90}{DEPT}{\pstribox{1}Employee number}
			 		\ncline{->}{ENOB}{DNOT}
			 		\ncline[border=2pt]{->}{DNOP}{DEPT}
			 	} % \makebox
			 } % \put
	 	\end{picture}
	} % \subfigure
	\medskip\hrule
	\caption[Multiple bubbles and domain flags]%
	        {(a) Multiple bubbles and (b) domain flags}
	\label{Fig.SmithsMethod.MultiFlags}
\end{figure}

%%%%%%%%%%%%%%%%%%%%

%================================================================================

\subsection{Smith's method for deriving a set of relations from an FDD}
\label{Sec:SmithsMethod:Overview:DerivingRelations}

%--------------------------------------------------------------------------------

\subsubsection{Single-valued dependencies composed into relations}

The attributes of a prime-key bubble, of all its target bubbles, and of all associated uplink-key bubbles (if any) become the attributes of a single relation. The primary key of this relation comprises the concatenation of all attributes within the prime-key bubble plus all attributes within associated uplink-key bubbles. An example is shown in Figure~\vref{Fig.SmithsMethod.SVD}.
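
As an illustrative restatement of this rule (the set notation is not Smith's, and the symbols $P$, $U_i$, $T_j$ and $\mathit{attr}$ are introduced here purely for exposition), let $P$ be a prime-key bubble with associated uplink-key bubbles $U_1, \ldots, U_m$ and target bubbles $T_1, \ldots, T_n$, and let $\mathit{attr}(X)$ denote the set of attributes within bubble $X$. The derived relation $R$ might then be sketched as:
\[
	\mathit{PK}(R) = \mathit{attr}(P) \cup \bigcup_{i=1}^{m} \mathit{attr}(U_i),
	\qquad
	\mathit{attr}(R) = \mathit{PK}(R) \cup \bigcup_{j=1}^{n} \mathit{attr}(T_j).
\]
Read against Figure~\ref{Fig.SmithsMethod.SVD}, where the prime-key bubble contains $G$, the uplink-key bubbles contain $B$ and $D$, and the target bubbles contain $H + I$ and $J$, this gives $\mathit{PK}(\mathrm{T1}) = \{B, D, G\}$ and $\mathit{attr}(\mathrm{T1}) = \{B, D, G, H, I, J\}$, which agrees with the table drawn in the figure.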

\newcommand{\Widen}[1]{\hspace{1mm}#1\hspace{1mm}}

%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htbp]
	\centering
	\setlength{\unitlength}{1mm}
	\psset{arrows=->>,arrowscale=2,linewidth=0.5pt,labelsep=2pt}
	\newcommand{\Attr}[2][0.8]{\makebox(#1,0.4){#2}}
	\hrule\medskip
	\begin{pspicture}(12,9.9)
		\sffamily \scriptsize
		% background blob
		\rput[bl](0.7,1.6){\includegraphics{SVDBlob}}
		% nodes
		\rput(1.5,8.8){\ovalnode{A}{\Attr{A}}}
		\rput(4.8,8.8){\ovalnode{Bo}{\ovalnode{Bi}{\Attr{B}}}}
		\rput(8.9,8.8){\ovalnode{C}{\Attr{C}}}
		\rput(3.4,6.9){\ovalnode{D}{\Attr{D}}}
		\rput(7.2,7.2){\ovalnode{E}{\Attr{E}}}
		\rput(0.8,6.3){\ovalnode{F}{\Attr{F}}}
		\rput(5.6,5){\ovalnode{G}{\Attr{G}}}
		\rput(1.8,3.6){\ovalnode{HI}{\Attr[1]{H + I}}}
		\rput(7.5,3.1){\ovalnode{J}{\Attr{J}}}
		\rput(9.9,6.8){\ovalnode{Ko}{\ovalnode{Ki}{\Attr{K}}}}
		\rput(10.8,4.4){\ovalnode{LM}{	\begin{psmatrix}[rowsep=8mm]
											\ovalnode{L}{\Attr{L}}	\\
											M
										\end{psmatrix}}}
		% connectors
		\ncarc[border=2pt]{A}{Bi}
		\ncarc[border=2pt]{->}{Bi}{C}
		\ncarc{<<-}{D}{Bo}
		\ncarc{D}{E}
		\ncarc{<<-}{F}{D}
		\ncarc{D}{G}
		\ncarc{<-}{HI}{G}\naput[nrot=:U]{\shortstack{single-valued \\ dependence}}
		\ncarc{->}{G}{J}\naput[nrot=:U]{\shortstack{single-valued \\ dependence}}
		\ncarc{Bo}{Ko}
		\ncarc[border=2pt]{->}{J}{Ki}
		\ncarc[border=2pt]{Ko}{L}
		% node labels
		\nput{-80}{Bo}{uplink key}
		\nput{-100}{D}{uplink key}
		\nput{-100}{G}{prime key}
		\nput{-90}{HI}{target}
		\nput{-90}{J}{target}
		\nput{0}{HI}{\pstribox{3}}
		% table
		\rput(1.55,1.6){TABLE T1}
		\multips(1.3,0){6}{\psframe(0.9,0.5)(2.2,1.4)}
		\rput(1.55,0.95){\uline{\Widen{B}}}
		\rput(2.85,0.95){\uline{\Widen{D}}}
		\rput(4.15,0.95){\uline{\Widen{G}}}
		\rput(5.45,0.95){H}
		\rput(6.75,0.95){\uwave{\Widen{I}}}
		\rput(8.05,0.95){\uwave{\Widen{J}}}
		% key labels
		\pnode(0.9,0.2){PKAL}
		\pnode(4.8,0.2){PKAR}
		\rput(2.85,0.2){\Rnode{PK}{PK}}
		\rput(6.75,0.2){FK}
		\rput(8.05,0.2){FK}
		\ncline[nodesepA=1mm]{->}{PK}{PKAL}
		\ncline[nodesepA=1mm]{->}{PK}{PKAR}
	\end{pspicture}
	\medskip\hrule
	\caption[Deriving a relation from a single-valued dependency]%
		{Deriving a relation from a single-valued 
		 dependency~\protect\cite[p.~830]{Smit:HC:1985}}
	\label{Fig.SmithsMethod.SVD}
\end{figure}

%%%%%%%%%%%%%%%%%%%%

Attributes within a target bubble become foreign keys of the derived relation if they also function as a key bubble of any sort (referred to by the author as the \emph{target bubble rule}), or are tagged with a domain flag (the \emph{domain flag rule}). These rules shall be revisited in Section~\ref{Sec:SmithsMethod:FKProblem}.

%--------------------------------------------------------------------------------

\subsubsection{End-key dependencies composed into relations}

All attributes of an end-key bubble, its prime-key bubble and all associated uplink-key bubbles become the primary key of a single relation. An example is shown in Figure~\ref{Fig.SmithsMethod.Endkey}.
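
This rule can be sketched in the same informal style (again, the notation is an assumption made here for exposition rather than Smith's own). Let $E$ be an end-key bubble, $P$ the bubble at the start of its multivalued dependency, $U_1, \ldots, U_m$ the associated uplink-key bubbles, and $\mathit{attr}(X)$ the set of attributes within bubble $X$. The derived relation $R$ is then all key:
\[
	\mathit{attr}(R) = \mathit{PK}(R) = \mathit{attr}(E) \cup \mathit{attr}(P) \cup \bigcup_{i=1}^{m} \mathit{attr}(U_i).
\]
In Figure~\ref{Fig.SmithsMethod.Endkey}, the end-key bubble contains $F$ and the chain of key bubbles leading to it contains $B$ and $D$, so $\mathit{attr}(\mathrm{T4}) = \mathit{PK}(\mathrm{T4}) = \{B, D, F\}$, matching the table drawn in the figure.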

%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htbp]
	\centering
	\setlength{\unitlength}{1mm}
	\psset{arrows=->>,arrowscale=2,linewidth=0.5pt,labelsep=2pt}
	\newcommand{\Attr}[2][0.8]{\makebox(#1,0.4){#2}}
	\hrule\medskip
	\begin{pspicture}(-0.6,1.1)(12,9.9)
		\sffamily \scriptsize
		% background blob
		\rput[bl](-0.6,1.6){\includegraphics{EndKeyBlob}}
		% nodes
		\rput(1.5,8.8){\ovalnode{A}{\Attr{A}}}
		\rput(4.8,8.8){\ovalnode{Bo}{\ovalnode{Bi}{\Attr{B}}}}
		\rput(8.9,8.8){\ovalnode{C}{\Attr{C}}}
		\rput(3.4,6.9){\ovalnode{D}{\Attr{D}}}
		\rput(7.2,7.2){\ovalnode{E}{\Attr{E}}}
		\rput(0.8,6.3){\ovalnode{F}{\Attr{F}}}
		\rput(5.6,5){\ovalnode{G}{\Attr{G}}}
		\rput(1.8,3.6){\ovalnode{HI}{\Attr[1]{H + I}}}
		\rput(7.5,3.1){\ovalnode{J}{\Attr{J}}}
		\rput(9.9,6.8){\ovalnode{Ko}{\ovalnode{Ki}{\Attr{K}}}}
		\rput(10.8,4.4){\ovalnode{LM}{	\begin{psmatrix}[rowsep=8mm]
											\ovalnode{L}{\Attr{L}}	\\
											M
										\end{psmatrix}}}
		% connectors
		\ncarc[border=2pt]{A}{Bi}
		\ncarc[border=2pt]{->}{Bi}{C}
		\ncarc{<<-}{D}{Bo}
		\ncarc{D}{E}
		\ncarc{<<-}{F}{D}\naput[nrot=:U]{\shortstack{end-key \\ dependence}}
		\ncarc{D}{G}
		\ncarc{<-}{HI}{G}
		\ncarc{->}{G}{J}
		\ncarc{Bo}{Ko}
		\ncarc[border=2pt]{->}{J}{Ki}
		\ncarc[border=2pt]{Ko}{L}
		% node labels
		\nput{-80}{Bo}{uplink key}
		\nput{-100}{D}{uplink key}
		\nput{-90}{F}{end key}
		\nput{0}{HI}{\pstribox{3}}
		% table
		\rput(4.45,2.7){TABLE T4}
		\multips(1.3,0){3}{\psframe(3.8,1.6)(5.1,2.5)}
		\rput(4.45,2.05){\uline{\Widen{B}}}
		\rput(5.75,2.05){\uline{\Widen{D}}}
		\rput(7.05,2.05){\uline{\Widen{F}}}
		% key labels
		\pnode(3.8,1.3){PKAL}
		\pnode(7.7,1.3){PKAR}
		\rput(5.75,1.3){\Rnode{PK}{PK}}
		\ncline[nodesepA=1mm]{->}{PK}{PKAL}
		\ncline[nodesepA=1mm]{->}{PK}{PKAR}
	\end{pspicture}
	\medskip\hrule
	\caption[Deriving a relation from an end-key dependency]%
		{Deriving a relation from an end-key 
		 dependency~\protect\cite[p.~831]{Smit:HC:1985}}
	\label{Fig.SmithsMethod.Endkey}
\end{figure}

%%%%%%%%%%%%%%%%%%%%

%--------------------------------------------------------------------------------

\subsubsection{Isolated bubbles composed into relations}

An \emph{isolated bubble} is one that has no arrows pointing either to or from it. All attributes within an isolated bubble become the primary key of a single relation.

%--------------------------------------------------------------------------------

\subsubsection{Practicable diagrams}

Smith defines a \emph{practicable dependency diagram} as one that does not produce a relation with a primary key comprising three or more attributes. To correct an impracticable FDD, the diagram is modified by adding \emph{surrogate keys} \cite[pp.~368--369]{Date:CJ:1995:IDS} to break the offending relation(s) into two or more sub-relations (see Figure~\ref{Fig.SmithsMethod.Practicable}).

%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htbp]
	\centering
	\setlength{\unitlength}{1mm}
	\psset{linewidth=0.5pt,arrows=->>,arrowscale=1.5,labelsep=2pt}
	\newcommand{\Attr}[2][0.6]{\makebox(#1,0.3){#2}}
	\hrule\medskip
	\subfigure[Not practicable]%
	{	\begin{pspicture}(0,0.4)(6.9,5)
			\sffamily \scriptsize
			% background blob
			\rput[bl](0,1.2){\includegraphics{ImpracticableBlob}}
			% nodes
			\rput(0.8,4.1){\ovalnode{M1}{\Attr{M}}}
			\rput(2.8,4.3){\ovalnode{N1}{\Attr{N}}}
			\rput(4.9,4){\ovalnode{O1}{\Attr{O}}}
			\rput(3.3,3.1){\ovalnode{PQ1}{\Attr[0.8]{P + Q}}}
			\rput(4.8,2.7){\ovalnode{R1}{\Attr{R}}}
			\rput(6.1,3.1){\ovalnode{S1}{\Attr{S}}}
			% connectors
			\ncarc{M1}{N1}
			\ncarc{N1}{O1}
			\ncarc{O1}{PQ1}
			\ncarc{->}{PQ1}{R1}
			\ncarc{->}{PQ1}{S1}
			% table
			\rput(1.65,1.3){T10}
			\multips(0.7,0){7}{\psframe(1.3,0.5)(2,1.1)}
			\rput(1.65,0.8){\uline{\Widen{M}}}
			\rput(2.35,0.8){\uline{\Widen{N}}}
			\rput(3.05,0.8){\uline{\Widen{O}}}
			\rput(3.75,0.8){\uline{\Widen{P}}}
			\rput(4.45,0.8){\uline{\Widen{Q}}}
			\rput(5.15,0.8){\Widen{R}}
			\rput(5.85,0.8){\Widen{S}}
		\end{pspicture}
	} % subfigure
	\\
	\subfigure[Practicable]%
	{	\begin{pspicture}(0,0.4)(9,5)
			\sffamily \scriptsize
			% background blob
			\rput[bl](0,1.15){\includegraphics{PracticableBlobs}}
			% nodes
			\rput(0.8,4.1){\ovalnode{M2}{\Attr{M}}}
			\rput(2.8,4.3){\ovalnode{N2}{\Attr{N}}}
			\rput(4.9,4){\ovalnode{O2}{\Attr{O}}}
			\rput(3.3,3.1){\ovalnode[linestyle=dashed]{SS1}{\Attr{S1}}}
			\rput(5.5,2.7){\ovalnode{PQ2}{\Attr[0.8]{P + Q}}}
			\rput(6.9,2.2){\ovalnode{R2}{\Attr{R}}}
			\rput(8.1,2.7){\ovalnode{S2}{\Attr{S}}}
			% connectors
			\ncarc{M2}{N2}
			\ncarc{N2}{O2}
			\ncarc{->}{O2}{SS1}
			\ncarc{SS1}{PQ2}
			\ncarc{->}{PQ2}{R2}
			\ncarc{->}{PQ2}{S2}
			% table
			\rput(0.65,1.3){T11}
			\multips(0.7,0){4}{\psframe(0.3,0.5)(1,1.1)}
			\rput(0.65,0.8){\uline{\Widen{M}}}
			\rput(1.35,0.8){\uline{\Widen{N}}}
			\rput(2.05,0.8){\uline{\Widen{O}}}
			\rput(2.75,0.8){\uwave{\Widen{S1}}}
			\rput(4.25,1.3){T12}
			\multips(0.7,0){5}{\psframe(3.9,0.5)(4.6,1.1)}
			\rput(4.25,0.8){\uline{\Widen{S1}}}
			\rput(4.95,0.8){\uline{\Widen{P}}}
			\rput(5.65,0.8){\uline{\Widen{Q}}}
			\rput(6.35,0.8){\Widen{R}}
			\rput(7.05,0.8){\Widen{S}}
			% node labels
			\nput{-100}{SS1}{\shortstack{surrogate \\ key}}
		\end{pspicture}
	} % subfigure
	\medskip\hrule
	\caption[Correcting an impracticable FDD]%
		{Correcting an impracticable 
		 FDD~\protect\cite[p.~832]{Smit:HC:1985}}
	\label{Fig.SmithsMethod.Practicable}
\end{figure}

%%%%%%%%%%%%%%%%%%%%

%--------------------------------------------------------------------------------

\subsubsection{Additional guidelines}

Smith also stated several additional guidelines for designing FDDs that generally produce a `better' design \cite[pp.~831--832]{Smit:HC:1985}. These guidelines do not have any impact on the foreign key issues and are therefore not described here.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Issues with deriving foreign keys}
\label{Sec:SmithsMethod:FKProblem}

\subsection{Non-derivable foreign keys}

There are some valid foreign keys that cannot be derived using the target bubble rule. Consider Smith's first example, shown in Figure~\ref{Fig.SmithsMethod.UnderivableFKs}; it is clear that CLASS + SECTION in relation R222 is a foreign key to CLASS + SECTION in relation R212. This cannot be derived using the target bubble rule, however, because the bubble containing CLASS and SECTION is not a target bubble. Similarly, STUDENT in R11a is a foreign key to STUDENT in R121, but cannot be derived from the diagram using the target bubble rule because neither of the two STUDENT bubbles is a target bubble.
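
One way to make these two missing links precise is to write them as inclusion dependencies. This notation comes from general relational theory rather than from Smith's paper, and $R[X]$ is used here simply to mean the projection of relation $R$ on the attributes $X$:
\[
	\mathrm{R222}[\mathsf{CLASS}, \mathsf{SECTION}] \subseteq \mathrm{R212}[\mathsf{CLASS}, \mathsf{SECTION}]
\]
\[
	\mathrm{R11a}[\mathsf{STUDENT}] \subseteq \mathrm{R121}[\mathsf{STUDENT}]
\]
In both cases the right-hand side is the primary key of the referenced relation as drawn in Figure~\ref{Fig.SmithsMethod.UnderivableFKs}, so each link satisfies the usual definition of a foreign key even though the target bubble rule does not produce it.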

%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htbp]
	\centering
	\setlength{\unitlength}{1mm}
	\psset{arrows=->>,arrowscale=2,linewidth=0.5pt,labelsep=2pt,unit=7.5mm}
	\newcommand{\Attr}[2][0.8]{\makebox(#1,0.4){#2}}
	\hrule\medskip
	\begin{pspicture}(20.9,11.9)
		\sffamily \tiny
		% nodes
		\rput(1,5.3){\ovalnode{R}{\Attr{ROOM}}}
		\rput(1.7,8.5){\ovalnode{D}{\Attr{DAY}}}
		\rput(2.8,2.9){\ovalnode{MY}{\Attr[1.5]{MAJOR + YEAR}}}
		\rput(4.1,6.5){\ovalnode{STo}{\ovalnode{STi}{\Attr[1]{STUDENT}}}}
		\rput(5.7,4.2){\ovalnode[linestyle=dashed]{SK}{\Attr[1.3]{SUROG\_KEY}}}
		\rput(6.6,1.9){\ovalnode{E}{\Attr[1.3]{EXAM\_SCORE}}}
		\rput(8.1,5.5){\ovalnode{RS}{\Attr[1.5]{RANK + SALARY}}}
		\rput(7.7,7.5){\ovalnode{I}{\Attr[1.3]{INSTRUCTOR}}}
		\rput(7.5,9.7){\ovalnode{T}{\Attr{TEXT}}}
		\rput(4.6,9.9){\ovalnode{CS}{	\begin{psmatrix}[rowsep=1cm]
											\ovalnode{C}{\Attr{CLASS}}	\\
											SECTION
										\end{psmatrix}}}
		% connectors
		\ncarc{<-}{R}{D}
		\ncarc{<-}{MY}{STo}
		\ncline[border=2pt]{CS}{STi}
		\ncarc[border=2pt]{->}{STi}{SK}
		\ncarc{SK}{E}
		\ncarc{->}{I}{RS}
		\ncarc{<<-}{D}{CS}
		\ncarc{->}{CS}{I}
		\ncarc[border=2pt]{C}{T}
		%
		\psset{arrows=-}
		% tables
		\rput(11.4,11.2){R221}
		\psframe(10.9,10.2)(12.4,11)
		\psframe(12.4,10.2)(13.7,11)
		\rput(11.65,10.55){\uline{CLASS}}
		\rput(13.05,10.55){\uline{TEXT}}
		\psline(11.65,10.2)(11.65,9.3)
		%
		\rput(12.2,9.5){R212}
		\psframe(10.9,8.5)(12.4,9.3)
		\psframe(12.4,8.5)(14.2,9.3)
		\psframe(14.2,8.5)(16.6,9.3)
		\rput(11.65,8.85){\uline{CLASS}}
		\rput(13.3,8.85){\uline{SECTION}}
		\rput(15.4,8.8){\uwave{INSTRUCTOR}}
		\psline(11.65,8.5)(11.65,6)
		\psline(13.3,8.5)(13.3,6)
		\psline(15.4,8.5)(15.4,7.6)
		%
		\rput(14.9,7.8){R211}
		\psframe(14.2,6.8)(16.6,7.6)
		\psframe(16.6,6.8)(17.9,7.6)
		\psframe(17.9,6.8)(19.5,7.6)
		\rput(15.4,7.15){\uline{INSTRUCTOR}}
		\rput(17.25,7.2){RANK}
		\rput(18.7,7.2){SALARY}
		%
		\rput(12.2,6.2){R11a}
		\psframe(10.9,5.2)(12.4,6)
		\psframe(12.4,5.2)(14.2,6)
		\psframe(14.2,5.2)(16,6)
		\psframe(16,5.2)(18.2,6)
		\rput(11.65,5.55){\uline{CLASS}}
		\rput(13.3,5.55){\uline{SECTION}}
		\rput(15.1,5.55){\uline{STUDENT}}
		\rput(17.1,5.5){\uwave{SUROG\_KEY}}
		\psline(11.65,5.2)(11.65,1.1)
		\psline(13.3,5.2)(13.3,1.1)
		\psline(15.1,5.2)(15.1,2.7)
		\psline(17.1,5.2)(17.1,4.3)
		%
		\rput(14.6,2.9){R121}
		\psframe(14.2,1.9)(16,2.7)
		\psframe(16,1.9)(17.6,2.7)
		\psframe(17.6,1.9)(18.8,2.7)
		\rput(15.1,2.25){\uline{STUDENT}}
		\rput(16.8,2.3){MAJOR}
		\rput(18.2,2.3){YEAR}
		%
		\rput(16.6,4.5){R11b}
		\psframe(16,3.5)(18.2,4.3)
		\psframe(18.2,3.5)(20.6,4.3)
		\rput(17.1,3.85){\uline{SUROG\_KEY}}
		\rput(19.4,3.85){\uline{EXAM\_SCORE}}
		%
		\rput(12.2,1.3){R222}
		\psframe(10.9,0.3)(12.4,1.1)
		\psframe(12.4,0.3)(14.2,1.1)
		\psframe(14.2,0.3)(15.3,1.1)
		\psframe(15.3,0.3)(16.6,1.1)
		\rput(11.65,0.65){\uline{CLASS}}
		\rput(13.3,0.65){\uline{SECTION}}
		\rput(14.75,0.65){\uline{DAY}}
		\rput(15.95,0.7){ROOM}
	\end{pspicture}
	\medskip\hrule
	\caption[Foreign keys that cannot be derived using the existing rules]%
		{Foreign keys that cannot be derived using the existing 
		 rules~\protect\cite[Figure~3]{Smit:HC:1985}}
	\label{Fig.SmithsMethod.UnderivableFKs}
\end{figure}

%%%%%%%%%%%%%%%%%%%%

\subsection{Invalid foreign keys}

The target bubble rule can sometimes produce invalid foreign keys. The target bubble rule states that attributes within a target bubble become foreign keys of the resultant relation if they also function as a key bubble. Applying this rule to Smith's second example \cite[Figures~4--7]{Smit:HC:1985} results in several attributes being identified as foreign keys when they are not, such as those highlighted in Figure~\vref{Fig.SmithsMethod.InvalidFKs}.

%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htbp]
	\centering
	\setlength{\unitlength}{1mm}
	\psset{linewidth=0.5pt,labelsep=2pt,unit=6.5mm}
	\newcommand{\Attr}[2][0.8]{\makebox(#1,0.4){#2}}
	\hrule\medskip
	\begin{pspicture}(23.8,13.4)
		\sffamily \tiny
		% tables
		\rput(1.5,12.8){PROJ}
		\psframe(0.2,11.8)(1.7,12.6)
		\psframe(1.7,11.8)(3.9,12.6)
		\rput(0.95,12.15){\uline{PROJ\_NO}}
		\rput(2.8,12.2){PROJ\_TITLE}
		\psline(0.95,11.8)(0.95,11)
		\psline(0.95,12.6)(0.95,13.2)(5.9,13.2)(5.9,12.6)
		%
		\rput(6.5,12.8){BUDG}
		\psframe(5.1,11.8)(6.7,12.6)
		\psframe(6.7,11.8)(7.8,12.6)
		\psframe(7.8,11.8)(9.2,12.6)
		\psframe(9.2,11.8)(11.4,12.6)
		\psframe(11.4,11.8)(13.4,12.6)
		\psframe(13.4,11.8)(15.7,12.6)
		\rput(5.9,12.15){\uline{PROJ\_NO}}
		\rput(7.25,12.15){\uline{YEAR}}
		\rput(8.5,12.1){\uwave{PI\_NO}}
		\rput(10.3,12.2){LABOR\_BUDG}
		\rput(12.4,12.2){MATL\_BUDG}
		\rput(14.55,12.2){OTHER\_BUDG}
		\psline(8.5,11.8)(8.5,11)
		\psline(7.25,11.8)(7.25,11.4)(2.3,11.4)(2.3,11)
		%
		\rput(1.3,11.2){AI}
		\psframe(0.2,10.2)(1.7,11)
		\psframe(1.7,10.2)(2.9,11)
		\psframe(2.9,10.2)(4.1,11)
		\rput(0.95,10.55){\uline{PROJ\_NO}}
		\rput(2.3,10.55){\uline{YEAR}}
		\rput(3.5,10.55){\uline{AI\_NO}}
		\psline(0.95,10.2)(0.95,9.4)
		\psline(3.5,10.2)(3.5,9.8)(8.2,9.8)(8.2,10.2)
		%
		\rput(9,11.2){EMP}
		\psframe(7.8,10.2)(9.2,11)
		\psframe(9.2,10.2)(11.2,11)
		\psframe(11.2,10.2)(12.9,11)
		\rput(8.5,10.55){\uline{EMP\_NO}}
		\rput(10.2,10.55){\uline{EMP\_NAME}}
		\rput(12.05,10.5){\uwave{DEPT\_NO}}
		\psline(9,10.2)(9,9.8)
		\pscircle(9,9.55){0.25}
		\rput(9,9.55){2}
		%
		\rput(12.6,9.6){DEPT}
		\psframe(11.2,8.6)(12.9,9.4)
		\psframe(12.9,8.6)(14.9,9.4)
		\psframe(14.9,8.6)(16.6,9.4)
		\rput(12.05,8.95){\uline{DEPT\_NO}}
		\rput(13.9,9){DEPT\_NAME}
		\rput(15.75,9){COST\_HR}
		%
		\rput(1.7,9.6){CONFG}
		\psframe(0.2,8.6)(1.7,9.4)
		\psframe(1.7,8.6)(3.5,9.4)
		\psframe(3.5,8.6)(5.8,9.4)
		\rput(0.95,8.95){\uline{PROJ\_NO}}
		\rput(2.6,8.95){\uline{AC\_CONFG}}
		\rput(4.65,9){CONFG\_DESC}
		\psline(0.95,8.6)(0.95,7.8)
		\psline(2.6,8.6)(2.6,7.8)
		%
		\rput(1.7,8){FGEOM}
		\psframe(0.2,7)(1.7,7.8)
		\psframe(1.7,7)(3.5,7.8)
		\psframe(3.5,7)(5.6,7.8)
		\psframe(5.6,7)(7.8,7.8)
		\psframe(7.8,7)(9.7,7.8)
		\psframe(9.7,7)(11.8,7.8)
		\psframe(11.8,7)(14.3,7.8)
		\psframe(14.3,7)(16.1,7.8)
		\psframe(16.1,7)(18.1,7.8)
		\psframe(18.1,7)(20.2,7.8)
		\psframe(20.2,7)(22.4,7.8)
		\rput(0.95,7.35){\uline{PROJ\_NO}}
		\rput(2.6,7.35){\uline{AC\_CONFG}}
		\rput(4.55,7.35){\uline{FUS\_CONFG}}
		\rput(6.7,7.4){FCONF\_DESC}
		\rput(8.75,7.4){FUS\_DIAM}
		\rput(10.75,7.4){FUS\_LENGTH}
		\rput(13.05,7.4){BOATAIL\_DIAM}
		\rput(15.2,7.4){FUS\_WET}
		\rput(17.1,7.4){FUS\_XSECT}
		\rput(19.15,7.4){FUS\_UPSWP}
		\rput(21.3,7.4){PLAN\_SIDE}
		\psline(0.95,7)(0.95,5.8)
		\psline(2.6,7)(2.6,5.8)
		\psline(4.55,7)(4.55,6.8)
		\psline(22.4,7.8)(22.8,7.8)(22.6,7.6)(22.7,7.3)(22.6,7)(22.4,7)(22.4,7.8)
		\pscircle(4.55,6.55){0.25}
		\rput(4.55,6.55){1}
		%
		\psframe(17.8,6)(19.8,6.8)
		\psframe(19.8,6)(21.7,6.8)
		\psframe(21.7,6)(23.6,6.8)
		\rput(18.8,6.4){FUS\_XTRAN}
		\rput(20.75,6.4){FUS\_INTF}
		\rput(22.65,6.4){FUS\_FORM}
		\psline(17.8,6.8)(17.6,6.8)(17.4,6.6)(17.5,6.3)(17.4,6)(17.8,6)(17.8,6.8)
		%
		\rput(1.6,6){CASE}
		\psframe(0.2,5)(1.7,5.8)
		\psframe(1.7,5)(3.5,5.8)
		\psframe(3.5,5)(5.2,5.8)
		\psframe(5.2,5)(6.9,5.8)
		\psframe(6.9,5)(8.6,5.8)
		\psframe(8.6,5)(10.2,5.8)
		\psframe(10.2,5)(12.2,5.8)
		\psframe(12.2,5)(13.8,5.8)
		\psframe(13.8,5)(14.8,5.8)
		\psframe(14.8,5)(17,5.8)
		\psframe(17,5)(19.1,5.8)
		\rput(0.95,5.35){\uline{PROJ\_NO}}
		\rput(2.6,5.35){\uline{AC\_CONFG}}
		\rput(4.35,5.35){\uline{CASE\_NO}}
		\rput(6.05,5.3){\uwave{SUROG\_1}}
		\rput(7.75,5.3){\uwave{LIFT\_ED}}
		\rput(9.4,5.3){\uwave{MACH\_ED}}
		\rput(11.2,5.4){DATE\_RUN}
		\rput(13,5.3){\uwave{EXE\_EMP}}
		\rput(14.3,5.4){ALT}
		\rput(15.9,5.4){RWING\_AREA}
		\rput(18.05,5.4){RWING\_MAC}
		\psline(6.05,5)(6.05,4.6)(1.05,4.6)(1.05,2.6)
		\psline(7.75,5)(6.25,4.2)
		\psline(9.4,5)(10.6,4.2)
		\psline(13,5.8)(13,6.2)
		\pscircle(13,6.45){0.25}
		\rput(13,6.45){2}
		%
		\rput(7.6,4.4){LIFT}
		\psframe(5.4,3.4)(7.1,4.2)
		\psframe(7.1,3.4)(9.1,4.2)
		\rput(6.25,3.75){\uline{LIFT\_ED}}
		\rput(8.1,3.75){\uline{LIFT\_COEF}}
		\psline(8.1,3.4)(9.4,2.6)
		%
		\rput(11.2,4.4){MACHT}
		\psframe(9.8,3.4)(11.4,4.2)
		\psframe(11.4,3.4)(12.6,4.2)
		\rput(10.6,3.75){\uline{MACH\_ED}}
		\rput(12,3.75){\uline{MACH}}
		\psline(12,3.4)(11,2.6)
		%
		\rput(1.6,2.8){FLOC}
		\psframe(0.2,1.8)(1.9,2.6)
		\psframe(1.9,1.8)(3.5,2.6)
		\psframe(3.5,1.8)(5.5,2.6)
		\psframe(5.5,1.8)(7.6,2.6)
		\rput(1.05,2.15){\uline{SUROG\_1}}
		\rput(2.7,2.15){\uline{FUS\_LOC}}
		\rput(4.5,2.1){\uwave{FUS\_CONFG}}
		\rput(6.55,2.2){FLOC\_DESC}
		\psline(1.05,1.8)(1.05,1)
		\psline(4.4,1.8)(2.8,1)
		\psline(4.4,2.6)(4.4,3)
		\pscircle(4.4,3.25){0.25}
		\rput(4.4,3.25){1}
		%
		\rput(10.2,2.8){LIFMACH}
		\psframe(8.4,1.8)(10.4,2.6)
		\psframe(10.4,1.8)(11.6,2.6)
		\psframe(11.6,1.8)(13.3,2.6)
		\rput(9.4,2.15){\uline{LIFT\_COEF}}
		\rput(11,2.15){\uline{MACH}}
		\rput(12.45,2.1){\uwave{SUROG\_2}}
		\psline(12.45,1.8)(12.45,1.4)(4.65,1.4)(4.65,1)
		%
		\rput(1.7,1.2){FDRAG}
		\psframe(0.2,0.2)(1.9,1)
		\psframe(1.9,0.2)(3.9,1)
		\psframe(3.9,0.2)(5.6,1)
		\psframe(5.6,0.2)(7.6,1)
		\psframe(7.6,0.2)(9.9,1)
		\psframe(9.9,0.2)(12,1)
		\psframe(12,0.2)(14.3,1)
		\psframe(14.3,0.2)(16.2,1)
		\psframe(16.2,0.2)(18.2,1)
		\psframe(18.2,0.2)(20.5,1)
		\psframe(20.5,0.2)(22.8,1)
		\rput(1.05,0.55){\uline{SUROG\_1}}
		\rput(2.9,0.55){\uline{FUS\_CONFG}}
		\rput(4.75,0.55){\uline{SUROG\_2}}
		\rput(6.6,0.6){FORM\_FCTR}
		\rput(8.75,0.6){BDREV\_DRAG}
		\rput(10.95,0.6){BASE\_DRAG}
		\rput(13.15,0.6){UPSWP\_DRAG}
		\rput(15.25,0.6){MIN\_DRAG}
		\rput(17.2,0.6){INTF\_DRAG}
		\rput(19.35,0.6){CMPNT\_DRAG}
		\rput(21.65,0.6){COMPR\_DRAG}
		%
		\psset{linewidth=2pt,linecolor=lightgray}
		\psframe(4.9,4.8)(10.5,6)
		\psframe(3.3,1.6)(5.8,2.8)
		\psframe(11.4,1.6)(13.6,2.8)
	\end{pspicture}
	\medskip\hrule
	\caption[Derivation of invalid foreign keys]%
		{Derivation of invalid foreign 
		 keys~\protect\cite[Figure~7]{Smit:HC:1985}}
	\label{Fig.SmithsMethod.InvalidFKs}
\end{figure}

%%%%%%%%%%%%%%%%%%%%

A foreign key is a set of attributes that act as a link between associated relations. A foreign key links to a candidate key of some relation in the database; typically this candidate key is also the primary key of the relation being linked to \cite[p.~117]{Date:CJ:1995:IDS}. \emph{Referential integrity} states that the value of a foreign key must be identical to a key value in the linked relation, or it must be null \cite{Date:CJ:1995:IDS,Elma:R:1994}. In Figure~\ref{Fig.SmithsMethod.InvalidFKs} the highlighted `foreign keys' do not reference candidate keys; rather they are referencing only part of the primary key of the referenced relation.
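
Stated a little more formally (a sketch in notation introduced here, not taken from the cited texts), let $F$ be a foreign key of relation $R_1$ that references a candidate key $K$ of relation $R_2$, and let $t[X]$ denote the value of tuple $t$ on the attributes $X$. Referential integrity then requires:
\[
	\forall t \in R_1 : \; t[F] \mbox{ is null} \;\lor\; \exists u \in R_2 : \; u[K] = t[F].
\]
The highlighted attributes fail the precondition of this statement rather than the constraint itself. For example, \textsf{LIFT\_ED} in CASE would have to reference LIFT, but the primary key of LIFT as drawn in Figure~\ref{Fig.SmithsMethod.InvalidFKs} is $\{\mathsf{LIFT\_ED}, \mathsf{LIFT\_COEF}\}$. Assuming that this primary key is minimal, \textsf{LIFT\_ED} alone is a proper subset of it and so cannot be a candidate key of LIFT.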

This issue arises because the term `prime-key bubble' can be applied to bubbles at the start of both single- and multivalued dependencies. Applying the target bubble rule to a target bubble that is also the prime-key bubble of a multivalued dependency produces the type of `foreign keys' shown in Figure~\ref{Fig.SmithsMethod.InvalidFKs}. The only effective way to address this issue is to alter Smith's bubble terminology (see Section~\ref{Sec:SmithsMethod:Solution}).

\subsection{Automation of the domain flag rule}

Smith's method was an interesting candidate for the author's research into automated translations among different data modelling representations \cite{Stan:N:1997:ECIS97,Stan:N:1999:PhD}. However, the domain flag rule is difficult to automate in its current form. The domain flag rule states that attributes within a target bubble become foreign keys of that relation if they are tagged with a domain flag. When translating a collection of domain flags into relational form, it is necessary to know which of the tagged attributes is the `target' attribute for the purposes of generating the correct foreign key references (that is, the attribute that the foreign keys will reference). Smith's FDD notation cannot identify the `target' attribute of a domain flag, so automation of this rule is problematic.
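
The difficulty can be illustrated with Figure~\ref{Fig.SmithsMethod.MultiFlags.Flags}. The relation names EMP and DEPT used below are assumed purely for illustration, since that figure does not name the derived relations, and $\mathit{dom}$ and the projection notation $R[X]$ are likewise introduced here. The domain flag records only the symmetric statement
\[
	\mathit{dom}(\mathsf{EMP\_NO}) = \mathit{dom}(\mathsf{DEPT\_MGR}),
\]
whereas a foreign key is directed. Taken on its own, the flag is therefore compatible with either of the following readings:
\[
	\mathrm{DEPT}[\mathsf{DEPT\_MGR}] \subseteq \mathrm{EMP}[\mathsf{EMP\_NO}]
	\qquad \mbox{or} \qquad
	\mathrm{EMP}[\mathsf{EMP\_NO}] \subseteq \mathrm{DEPT}[\mathsf{DEPT\_MGR}].
\]
The first reading is presumably the intended one, but nothing in the notation records that choice, and it is exactly this information that an automated translation would need.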

Smith gave no explanation of how to properly treat domain flags, yet in his examples, domain flags are translated correctly. This is possibly because Smith expected the process to be carried out manually and used the dependency-list statements to resolve ambiguities. This is, however, not particularly amenable to automation.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Proposed modifications}
\label{Sec:SmithsMethod:Solution}

Smith's target bubble rule needs to be replaced. To facilitate this change, the author has modified Smith's original terminology for bubble types. Smith uses the term \emph{prime key} to denote the start bubble of a single-valued dependency, and sometimes the start bubble of a multivalued dependency. The term `prime' implies that the prime key attributes are sole determinants of the target attributes. Thus it seems rather counter-intuitive that a prime-key bubble can be part of an uplink key chain, as it is no longer the sole determinant. The following bubble terminology is therefore proposed:
\begin{description}
	\item[Single-key bubble:] the start bubble of a single-valued dependency.
	\item[Target bubble:] the end bubble of a single-valued dependency.
	\item[Multi-key bubble:] the start bubble of a multivalued dependency.
	\item[End-key bubble:] the end bubble of a multivalued dependency.
	\item[Isolated bubble:] a bubble with no attached dependencies (identical to Smith's definition).
\end{description}

Bubbles may only be of one type. Smith did not enforce this restriction (for example, target bubbles could also be prime-key bubbles), although he did state that multiple bubbles could be used to clarify such situations. Since the new terminology requires every bubble to be of a single type, multiple bubbles become essential.

Having made these changes, it is now possible to replace Smith's target bubble rule with the following:
\begin{description}
	\item[Key bubble rule:] Let $B$ be a bubble of any type, and $R_{B}$ be the derived relation to which this bubble contributes. If the attributes of $B$ form the entire contents of a single-key bubble $S$ ($S \neq B$, contributing to a derived relation $R_{S}$), then the attributes contained by $S$ become a foreign key of $R_{B}$ that refers to $R_{S}$.
\end{description}
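
The key bubble rule lends itself to a direct implementation. The following Python fragment is a sketch under stated assumptions rather than a definitive implementation: the \texttt{Bubble} class, the function \texttt{key\_bubble\_rule} and the encoding of the small FDD fragment are all illustrative choices that do not appear in Smith's method. The rule is applied literally, so no further conditions (such as $R_{B} \neq R_{S}$) are imposed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bubble:
    kind: str         # "single-key", "target", "multi-key", "end-key", "isolated"
    attrs: frozenset  # attributes contained by the bubble
    relation: str     # derived relation to which the bubble contributes


def key_bubble_rule(bubbles):
    """Return (R_B, attributes, R_S) triples, one per derived foreign key."""
    foreign_keys = set()
    for b in bubbles:
        for s in bubbles:
            # S must be a single-key bubble, distinct from B, whose entire
            # contents are the attributes of B.
            if s is not b and s.kind == "single-key" and b.attrs == s.attrs:
                foreign_keys.add((b.relation, tuple(sorted(b.attrs)), s.relation))
    return foreign_keys


# Assumed encoding of a small FDD fragment (illustrative only).
bubbles = [
    Bubble("single-key", frozenset({"student_id"}), "Student"),
    Bubble("single-key", frozenset({"assign_id"}), "Assignment"),
    Bubble("target", frozenset({"student_id"}), "Assignment"),
    Bubble("multi-key", frozenset({"assign_id"}), "Adjustment"),
    Bubble("end-key", frozenset({"adjustment_no"}), "Adjustment"),
]

for fk in sorted(key_bubble_rule(bubbles)):
    print(fk)
```

In this fragment the multi-key bubble yields a foreign key from Adjustment to Assignment even though it is not a target bubble.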

% \shortpage
Smith's domain flag rule could remain unchanged, but it cannot be fully automated in its current form. Consequently, the author has introduced the notation shown in Figure~\ref{Fig.SmithsMethod.DFSource} to indicate that the tagged attribute is the `target' attribute for that domain flag. The domain flag rule can now be redefined as:
\begin{description}
	\item[New domain flag rule:] Let $B$ be a bubble of any type containing an attribute $A$ that is tagged with a domain flag, and let $R_{B}$ be the derived relation to which this bubble contributes. The domain flag is `targeted' on another attribute $D$ that is the sole attribute contained by a  single-key bubble $S$ ($S \neq B$, derived relation $R_{S}$). Attribute $A$ becomes a foreign key of $R_{B}$ that refers to attribute $D$ of $R_{S}$.
\end{description}
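
With the `target' attribute identified, the new domain flag rule can be applied without manual intervention. The following Python fragment is a minimal sketch, assuming that each flagged attribute is recorded as a tag carrying the flag label and a marker for the `target' notation. The names \texttt{Bubble}, \texttt{Tag} and \texttt{new\_domain\_flag\_rule} are hypothetical, as is the self-referencing example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bubble:
    kind: str         # bubble type, for example "single-key" or "target"
    attrs: frozenset  # attributes contained by the bubble
    relation: str     # derived relation to which the bubble contributes


@dataclass(frozen=True)
class Tag:
    flag: int         # label of the domain flag
    attribute: str    # attribute carrying the flag
    bubble: Bubble    # bubble containing that attribute
    is_target: bool   # True if drawn with the "target" notation


def new_domain_flag_rule(tags):
    """Return (R_B, A, R_S, D) tuples, one per derived foreign key."""
    targets = {}
    for tag in tags:
        if tag.is_target:
            s = tag.bubble
            # D must be the sole attribute of a single-key bubble S.
            if s.kind != "single-key" or s.attrs != frozenset({tag.attribute}):
                raise ValueError(f"invalid target for flag {tag.flag}")
            targets[tag.flag] = tag
    foreign_keys = set()
    for tag in tags:
        if tag.is_target:
            continue
        target = targets.get(tag.flag)
        if target is None:
            raise ValueError(f"flag {tag.flag} has no target attribute")
        if tag.bubble is target.bubble:  # the rule requires S != B
            continue
        foreign_keys.add((tag.bubble.relation, tag.attribute,
                          target.bubble.relation, target.attribute))
    return foreign_keys


# Assumed encoding of a self-referencing attribute (illustrative only).
question_key = Bubble("single-key", frozenset({"question_id"}), "Question")
parent = Bubble("target", frozenset({"parent_question"}), "Question")
tags = [
    Tag(1, "question_id", question_key, True),
    Tag(1, "parent_question", parent, False),
]
print(new_domain_flag_rule(tags))
```

A flag with no `target' attribute raises an error in this sketch, which corresponds to the ambiguity that prevents automation of the original rule.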

%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htb]
	\centering
	\hrule\medskip
	\psset{linewidth=0.5pt,unit=1cm}
	\begin{pspicture}(5,1)
		\rput[bl](0,0){\pstribox[doubleline=true]{\textsf{1}}}
		\rput[bl](1,0){\psovalbox{\textsf{EMP\_NO}}}
	\end{pspicture}
	\medskip\hrule
	\caption{`Target' attribute domain flag notation}
	\label{Fig.SmithsMethod.DFSource}
\end{figure}

%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Example}
\label{Sec:SmithsMethod:Examples}

It was originally planned to use Smith's examples to illustrate the new rules in action. However, upon closer examination, it was discovered that neither of the two examples is particularly useful. The first example (that of a University database) is not complete enough to illustrate the new rules. By contrast, the second example (a drag prediction database) is far too complex, and has so many complicated dependencies among attributes that it is arguable whether a relational implementation is the best solution.

Instead, an example devised by the author will be used. Consider a database that stores assessment marks for a course. An entity-relationship diagram representing this database is shown in Figure~\vref{Fig.MarksERDNorm}.

%%%%%%%%%%%%%%%%%%%%

\begin{figure}[htbp]
	\centering
	\hrule\medskip
	\includegraphics[scale=0.75]{MarksERDNorm}
	\medskip\hrule
	\caption{E-R description of the assessment marks viewpoint (normalised)}
	\label{Fig.MarksERDNorm}
\end{figure}

%%%%%%%%%%%%%%%%%%%%

The final result for the course is determined by the results of a collection of assessment elements (such as practical exercises and examinations), each of which comprises a collection of questions. Each question may or may not comprise a collection of sub-questions.

Students individually complete several assessment elements during the course, submitting each as an assignment that is marked by a single staff member. As with assessment elements, an assignment comprises a collection of answers (corresponding to questions), which in turn comprise a collection of sub-answers.

The total mark for an assignment is broken down into a collection of marks for each individual answer. Each answer is marked according to a marking schedule that specifies a set of marking criteria and the marks allocation for each criterion. Marks may be adjusted at a later date for reasons of illness or technical difficulties.

Figure~\vref{Fig.MarksFDD} shows a functional dependency diagram for the example, based on the following set of dependencies:
\begin{itemize}
	\item \svd{student\_id}{name, password}
	\item \svd{staff\_id}{name, password}
	\item \svd{element\_id}{name, total\_mark, percent, due\_date, late\_penalty}
	\item \mvd{element\_id}{question\_id}
	\item \svd{question\_id}{number, marks, guidelines, parent\_question (question\_id)}
	\item \svd{assign\_id}{date\_submitted, date\_marked, raw\_mark, comments, student\_id, \\ staff\_id, element\_id}
	\item \mvd{assign\_id}{answer\_id}
	\item \svd{answer\_id}{mark, comments, question\_id, parent\_answer (answer\_id)}
	\item \svd{assign\_id, adjustment\_no}{reason, amount}
	\item \svd{answer\_id, criterion\_name}{mark, comments}
\end{itemize}
%%%%%%%%%%%%%%%%%%%%
\begin{figure}[tb]
	\centering
	\hrule\medskip
	\includegraphics[scale=0.75]{MarksFDD}
	\medskip\hrule
	\caption{Functional dependency description of the assessment marks 
	         viewpoint}
	\label{Fig.MarksFDD}
\end{figure}
%%%%%%%%%%%%%%%%%%%%
The last two functional dependencies include the \emph{embedded} multivalued dependencies \cite[p.~341]{Date:CJ:1995:IDS} \mvd{assign\_id}{adjustment\_no} and \mvd{answer\_id}{criterion\_name} respectively. The relations corresponding to this set of dependencies are in at least fourth normal form.

Applying Smith's original foreign key rules to the FDD shown in Figure~\ref{Fig.MarksFDD}, the following set of relations can be derived (primary keys are \uline{underlined}):
\begin{enumerate}
	\item Staff(\uline{staff\_id}, name, password)
	\item Student(\uline{student\_id}, name, password)
	\item Element(\uline{element\_id}, name, total\_mark, percent, due\_date, late\_penalty)
	\item Assignment(\uline{assign\_id}, element\_id, student\_id, staff\_id, date\_submitted, \\ date\_marked, raw\_mark, comments)	\\
	      \emph{element\_id} is a foreign key to Element (target bubble rule)	\\
	      \emph{student\_id} is a foreign key to Student (target bubble rule)	\\
	      \emph{staff\_id} is a foreign key to Staff (target bubble rule)
	\item Adjustment(\uline{assign\_id, adjustment\_no}, reason, amount)	\\
	      \emph{assign\_id} should be a foreign key to Assignment, but this cannot be derived because none of the bubbles containing \emph{assign\_id} are target bubbles.
	\item Question(\uline{question\_id}, number, marks, guidelines, parent\_question)	\\
	      \emph{parent\_question} is a foreign key to Question (domain flag rule)
	\item Answer(\uline{answer\_id}, question\_id, mark, comments, parent\_answer)	\\
	      \emph{question\_id} is a foreign key to Question (target bubble rule)	\\
	      \emph{parent\_answer} is a foreign key to Answer (domain flag rule)
	\item Criterion(\uline{answer\_id, criterion\_name}, mark, comments)	\\
	      \emph{answer\_id} should be a foreign key to Answer, but this cannot be derived because none of the bubbles containing \emph{answer\_id} are target bubbles.
	\item Assign\_Answer(\uline{assign\_id, answer\_id})	\\
	      \emph{assign\_id} should be a foreign key to Assignment and \emph{answer\_id} should be a foreign key to Answer, but these cannot be derived because neither of the bubbles involved is a target bubble.
	\item Element\_Question(\uline{element\_id, question\_id})	\\
	      \emph{element\_id} should be a foreign key to Element and \emph{question\_id} should be a foreign key to Question, but these cannot be derived because neither of the bubbles involved is a target bubble.
\end{enumerate}

Using the new rules defined in Section~\ref{Sec:SmithsMethod:Solution}, the following set of relations can be derived:
\begin{enumerate}
	\item Staff(\uline{staff\_id}, name, password)
	\item Student(\uline{student\_id}, name, password)
	\item Element(\uline{element\_id}, name, total\_mark, percent, due\_date, late\_penalty)
	\item Assignment(\uline{assign\_id}, element\_id, student\_id, staff\_id, date\_submitted, \\ date\_marked, raw\_mark, comments)	\\
	      \emph{element\_id} is a foreign key to Element (key bubble rule)	\\
	      \emph{student\_id} is a foreign key to Student (key bubble rule)	\\
	      \emph{staff\_id} is a foreign key to Staff (key bubble rule)
	\item Adjustment(\uline{assign\_id, adjustment\_no}, reason, amount)	\\
	      \emph{assign\_id} is a foreign key to Assignment (key bubble rule)
	\item Question(\uline{question\_id}, number, marks, guidelines, parent\_question)	\\
	      \emph{parent\_question} is a foreign key to Question (new domain flag rule)
	\item Answer(\uline{answer\_id}, question\_id, mark, comments, parent\_answer)	\\
	      \emph{question\_id} is a foreign key to Question (key bubble rule)	\\
	      \emph{parent\_answer} is a foreign key to Answer (new domain flag rule)
	\item Criterion(\uline{answer\_id, criterion\_name}, mark, comments)	\\
	      \emph{answer\_id} is a foreign key to Answer (key bubble rule)
	\item Assign\_Answer(\uline{assign\_id, answer\_id})	\\
	      \emph{assign\_id} is a foreign key to Assignment (key bubble rule)	\\
	      \emph{answer\_id} is a foreign key to Answer (key bubble rule)
	\item Element\_Question(\uline{element\_id, question\_id})	\\
	      \emph{element\_id} is a foreign key to Element (key bubble rule)	\\
	      \emph{question\_id} is a foreign key to Question (key bubble rule)
\end{enumerate}

It can be seen from this example that the key bubble rule has allowed the derivation of six foreign keys that could not be identified using the original target bubble rule (in relations Adjustment, Criterion, Assign\_Answer and Element\_Question). The new domain flag rule has not produced any additional foreign keys, but this is to be expected as this rule was intended primarily to support the automation of Smith's method \cite{Stan:N:1997:ECIS97,Stan:N:1997:DP9708,Stan:N:1997:APSEC97}.
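
The count of six additional foreign keys can be confirmed by comparing the two lists of derived relations directly. The following Python fragment is a simple tally in which the foreign keys are transcribed by hand from the two lists above. The tuple layout (referencing relation, attribute, referenced relation) is an assumption made for illustration.

```python
# Foreign keys derived with Smith's original rules.
original = {
    ("Assignment", "element_id", "Element"),
    ("Assignment", "student_id", "Student"),
    ("Assignment", "staff_id", "Staff"),
    ("Question", "parent_question", "Question"),
    ("Answer", "question_id", "Question"),
    ("Answer", "parent_answer", "Answer"),
}

# Foreign keys derived with the key bubble and new domain flag rules.
new = {
    ("Assignment", "element_id", "Element"),
    ("Assignment", "student_id", "Student"),
    ("Assignment", "staff_id", "Staff"),
    ("Adjustment", "assign_id", "Assignment"),
    ("Question", "parent_question", "Question"),
    ("Answer", "question_id", "Question"),
    ("Answer", "parent_answer", "Answer"),
    ("Criterion", "answer_id", "Answer"),
    ("Assign_Answer", "assign_id", "Assignment"),
    ("Assign_Answer", "answer_id", "Answer"),
    ("Element_Question", "element_id", "Element"),
    ("Element_Question", "question_id", "Question"),
}

gained = new - original
relations = sorted({relation for relation, _, _ in gained})
print(len(gained))
print(relations)
```

Every foreign key found by the original rules is retained, and the additional keys fall only in the four relations whose key bubbles are not target bubbles.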

\section{Conclusion}
\label{Sec:Conclusions}

Smith's method is a technique that allows the derivation of normalised relations from a functional dependency diagram. Smith's original rules for deriving foreign keys were difficult to automate, failed to produce some foreign keys and could also produce invalid foreign keys under certain conditions. In this paper, new rules that address these issues were defined to replace Smith's original rules. These new rules allow the derivation of all foreign keys, and do not produce invalid foreign keys. To facilitate these rule changes, some modifications were also made to Smith's bubble terminology. These changes have resulted in a robust method for deriving relations from functional dependency diagrams that can be easily automated.

\bibliographystyle{dcu}
\bibliography{Baxter:Research:Biblio}

\end{document}