diff --git a/APCCM2017_Stanger.tex b/APCCM2017_Stanger.tex index 3d4f427..18c5641 100644 --- a/APCCM2017_Stanger.tex +++ b/APCCM2017_Stanger.tex @@ -238,7 +238,7 @@ View updating can be considered a special case of schema translation: we effectively have one schema comprising a set of views and another schema comprising a set of base relations, and need to translate updates from one schema to the other.\footnote{These schemas may be actually separate, or more likely may be sub-schemas of a larger database schema, with effective separation achieved via the access privilege mechanism.} It therefore makes sense that formalisms developed to characterise schema translations would also be applicable to view updates. We therefore propose to use Hull's notion of \emph{relative information capacity} \cite{Hull.R-1986a-Relative} as a means to formally characterise the information equivalence of the schemas that make up a given view configuration. -In particular, we develop an extension of Miller's \emph{schema intension graph} (SIG) formalism, which was originally designed to facilitate the comparison of relative information capacity across schemas. We first extend the SIG formalism with relevant relational concepts such as relation and tuple types, key constraints, and functional dependencies. We then show a proof of concept of how this extended formalism can be used to model the information equivalence of a view configuration, by applying it to Date's example of disjoint restriction views \cite{Date.C-2013a-View}. Our results are consistent with Date's operational characterisation of information equivalence for this example. +In particular, we develop an extension of Miller's \emph{schema intension graph} (SIG) formalism, which was originally designed to facilitate the comparison of relative information capacity across schemas. We first extend the SIG formalism with relevant relational concepts such as relation types, tuple types, and key constraints. We then show a proof of concept of how this extended formalism can be used to model the information equivalence of a view configuration, by applying it to Date's example of disjoint restriction views \cite{Date.C-2013a-View}. Our results are consistent with Date's operational characterisation of information equivalence for this example. In Section~\ref{sec-view-update-problem}, we briefly review the view update problem and relevant research, and in Section~\ref{sec-relations-types} we introduce some core concepts that are used throughout the remainder of the paper. In Section~\ref{sec-info-capacity} we discuss relative information capacity and the SIG formalism, then explain in Section~\ref{sec-relational-sigs} how to extend the SIG formalism to more precisely model a relational schema. In Section~\ref{sec-characterising} we analyse Date's restriction view example \cite{Date.C-2013a-View} in detail, and characterise the information equivalence of its various sub-schemas. We discuss conclusions and future work in Section~\ref{sec-conclusion}. @@ -249,7 +249,7 @@ \section{The view update problem} \label{sec-view-update-problem} -\noindent The view update problem was first discussed (in the relational context) by Codd in 1974 \cite{Codd.E-1974a-Recent}. When an update operation is applied to a view, that update needs to be translated into corresponding updates on the base relations from which the view is derived \cite{Keller.A-1985a-Algorithms}. Sometimes this translation is clear, but sometimes there are multiple such translations that all produce the desired view update, but produce different---possibly incompatible---changes in the base relations. Worse, there may be translations that have side effects (e.g., for some views based on joins), which produce a different result in the view than in an equivalent base relation that could be updated directly \cite{Keller.A-1985a-Algorithms}. The problem then is how to choose the most appropriate translation. As yet there has been no solution that completely and automatically resolves this ambiguity. +\noindent The view update problem was first discussed (in the relational context) by Codd in 1974 \cite{Codd.E-1974a-Recent}. When an update operation is applied to a view, that update needs to be translated into corresponding updates on the base relations from which the view is derived \cite{Keller.A-1985a-Algorithms}. Sometimes this translation is clear, but sometimes there are multiple such translations that all produce the desired view update, but produce different---possibly incompatible---changes in the base relations. Worse, there may be translations that have side effects (e.g., for some views based on joins), which produce a different result in the view than if the updates were applied directly to an equivalent base relation \cite{Keller.A-1985a-Algorithms}. The problem then is how to choose the most appropriate translation. As yet there has been no solution that completely and automatically resolves this ambiguity. There has been considerable research into this problem over the last four decades, and numerous different approaches have been suggested. We shall now briefly review some key developments in the relational context. @@ -322,11 +322,13 @@ \end{equation} where \(k\) is some finite upper bound. Thus, the type of an attribute \(A_{i}\) (denoted \(\Type{A_{i}}\)) is by definition the set of all possible values of \(A_{i}\). -Consider a relvar \(R\) with heading \(\{A_{1}, A_{2}, \dotsc, A_{n}\}\). Following from equation~\ref{eqn-type}, the (relation) type of \(R\), denoted here as \(\Type{R}\), is the set of all possible relation values with \(R\)'s heading. Similarly, the (tuple) type of \(R\)'s tuples, denoted here as \TT{R}, is the set of all possible tuple values with \(R\)'s heading, which equates to the Cartesian product of \(R\)'s attribute types, i.e.: +Consider a relvar \(R\) with heading \(\{A_{1}, A_{2}, \dotsc, A_{n}\}\). Following from equation~\ref{eqn-type}, the (relation) type of \(R\), which we denote here as \(\Type{R}\), is the set of all possible relation values with \(R\)'s heading. Similarly, the (tuple) type of \(R\)'s tuples, which we denote here as \TT{R}, is the set of all possible tuple values with \(R\)'s heading, which equates to the Cartesian product of \(R\)'s attribute types, i.e.: \begin{equation}\label{eqn-tuple-type} \TT{R} = \Type{A_{1}} \times \Type{A_{2}} \times\dotsb\times \Type{A_{n}}\text{.} \end{equation} -By definition any given value of \(R\) must comprise a set (possibly empty) of tuples \(\{t_{1}, t_{2}, \dotsc, t_{m}\} \subseteq \TT{R}\). \(\Type{R}\) therefore represents every possible \(k\)-combination of the elements of \TT{R}, for \(0 \le k \le \left|\TT{R}\right|\). +(This construction can be applied to any tuple type, including projections of other tuple types.) + +By definition any given value of a relvar \(R\) must comprise a set (possibly empty) of tuples \(\{t_{1}, t_{2}, \dotsc, t_{m}\} \subseteq \TT{R}\). \(\Type{R}\) therefore represents every possible \(k\)-combination of the elements of \TT{R}, for \(0 \le k \le \left|\TT{R}\right|\). For example, Date's usual suppliers-and-parts schema \cite{Date.C-2013a-View} includes a relvar \(S\) with heading \(\{\Sno, \Sname, \Status, \City\}\), which contains details of suppliers. From equation~\ref{eqn-tuple-type}, the tuples of \(S\) have type \(\TT{S} = \Type{\Sno} \times \TSSC\). @@ -352,7 +354,7 @@ Relative information capacity provides a means to identify the differences between schema instances, and can therefore be used as an aid to schema integration and translation \cite{Miller.R-1994a-PhD}. As noted earlier, view updating is a special case of this, so it follows that information capacity can also be used to characterise the information content of views compared to the base relvar(s) from which they are derived. -Miller et al.\ \cite{Miller.R-1993a-Information} defined relative information capacity in terms of two types of mapping. An \emph{information (capacity) preserving mapping} from schema \(\SC{1}\) to schema \(\SC{2}\) maps every element of \(\SC{1}\) to an element of \(\SC{2}\), but only some elements of \(\SC{2}\) to elements of \(\SC{1}\). \(\SC{2}\) has an information capacity greater than \(\SC{1}\), and is said to \emph{dominate} \(\SC{1}\), denoted \Dominates{\SC{2}}{\SC{1}}. An \emph{equivalence preserving mapping} from \(\SC{1}\) to \(\SC{3}\) maps every element of \(\SC{1}\) to an element \(\SC{3}\) and vice versa (i.e., both the mapping and its inverse are information preserving). The information capacities of \(\SC{1}\) and \(\SC{3}\) are identical, and the schemas are said to be \emph{equivalent}, denoted \Equivalent{\SC{1}}{\SC{3}}. +Miller et al.\ \cite{Miller.R-1993a-Information} defined relative information capacity in terms of two types of mapping. An \emph{information (capacity) preserving mapping} from schema \(\SC{1}\) to schema \(\SC{2}\) maps every element of \(\SC{1}\) to an element of \(\SC{2}\), but only some elements of \(\SC{2}\) to elements of \(\SC{1}\). \(\SC{2}\) has an information capacity greater than \(\SC{1}\), and is said to \emph{dominate} \(\SC{1}\), denoted \Dominates{\SC{2}}{\SC{1}}. An \emph{equivalence preserving mapping} from \(\SC{1}\) to \(\SC{3}\) maps every element of \(\SC{1}\) to an element of \(\SC{3}\) and vice versa (i.e., both the mapping and its inverse are information preserving). The information capacity of \(\SC{1}\) is identical to \(\SC{3}\), and the schemas are said to be \emph{equivalent}, denoted \Equivalent{\SC{1}}{\SC{3}}. (Note that Miller et al.\ \cite{Miller.R-1994a-Schema} use stricter definitions of equivalence and dominance than those used here; strictly speaking, the discussion of equivalence and dominance in this section should properly refer to \emph{absolute} dominance and equivalence, whereas the following discussion on SIG transformations should properly refer to \emph{internal} dominance and equivalence \cite{Miller.R-1994a-Schema}. The distinction, however, is not crucial to the discussion in this paper.) @@ -375,11 +377,11 @@ \end{quotation} A \emph{bijection} is a binary relation that has all four properties \cite{Miller.R-1994a-SIG}. -The use of the term ``relation'' here, while mathematically correct, is potentially confusing in any discussion that also involves the Relational Model. From now on, we shall therefore refer to the binary relations represented by SIG edges as ``associations''. +The use of the term ``binary relation'' here, while mathematically correct, is potentially confusing in any discussion that also involves the Relational Model. From now on, we shall therefore refer to the binary relations represented by SIG edges as ``associations''. -A selection edge \(e : A \LabelledEdge{\sigedge{00}}{\RelRestrict} B\) indicates that the elements of \(B\) comprise a subset or specialisation of the elements of \(A\). If \(B = A\), we have what we term a \emph{trivial} selection. Every element of B must appear in A, and vice versa, and every element of \(A\) is associated with exactly one element of \(B\), and vice versa. Trivial selection edges are therefore bijective, i.e., \(A \TrivialSelection B\). If \(B \subset A\), we have what we term a \emph{non-trivial} selection. Every element of \(B\) must appear in \(A\), but not vice versa. The mapping remains one-to-one, as before. From the perspective of the supertype (\(A\) in this example), non-trivial selection edges are therefore surjective, functional, and injective, i.e., \(A \SelectionEdge B\). +A selection edge \(e : A \LabelledEdge{\sigedge{00}}{\RelRestrict} B\) indicates that the elements of \(B\) comprise a subset or specialisation of the elements of \(A\). If \(B = A\), we have what we term a \emph{trivial} selection. Every element of B must appear in A, and vice versa, and every element of \(A\) is associated with exactly one element of \(B\), and vice versa. Trivial selection edges are therefore bijective, i.e., \(A \TrivialSelection B\). If \(B \subset A\), we have what we term a \emph{non-trivial} selection. Every element of \(B\) must appear in \(A\), but not vice versa. The mapping remains one-to-one, as before. From the perspective of the superset \(A\), non-trivial selection edges are therefore surjective, functional, and injective, i.e., \(A \SelectionEdge B\). -A projection edge \(e : A \LabelledEdge{\sigedge{00}}{\RelProject} B\), indicates that the elements of \(B\) are a projection of \(A\), where \(A\) is a constructed node of the form \(C \times D \times \dotsb\). A bijective projection edge, i.e., \(A \TrivialProjection B\), indicates that \(B\) is unique with respect to \(A\). For non-unique projections, every element of \(A\) must have a corresponding element in \(B\), and vice versa. Every element of \(A\) is also associated with at most one element of \(B\), but not vice versa. Non-unique projection edges are therefore total, surjective, and functional, i.e., \(A \ProjectionEdge B\). +A projection edge \(e : A \LabelledEdge{\sigedge{00}}{\RelProject} B\), indicates that the elements of \(B\) are a projection of \(A\), where \(A\) is a constructed node of the form \(C \times D \times \dotsb\) (i.e., a tuple type). A bijective projection edge, i.e., \(A \TrivialProjection B\), indicates that \(B\) is unique with respect to \(A\). For non-unique projections, every element of \(A\) must have a corresponding element in \(B\), and vice versa. Every element of \(A\) is also associated with at most one element of \(B\), but not vice versa. Non-unique projection edges are therefore total, surjective, and functional, i.e., \(A \ProjectionEdge B\). %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -563,9 +565,9 @@ The key constraint on \Sno\ implies that the functional dependency \(\{\Sno\} \rightarrow \{\Sname, \Status, \City\}\) holds in \(S\). From equation \ref{eqn-tuple-type}, we can infer the type of \(\{\Sname, \Status, \City\}\) as \(\TSSC\). The association is functional and not injective by definition. Every supplier number (when used in a value of \(S\)) must be associated with an element of \(\TSSC\), and vice versa. This implies that the association is both total and surjective. Indeed, this will be true for every edge that represents a key constraint \cite{Miller.R-1994a-SIG}. We introduce the edge label \(\mathit{key}\) to indicate that an edge represents a key constraint, which gives us \(\Type{\Sno} \KeyEdge \TSSC\) in Figure~\ref{fig-sig-s-alone}. -Both \(\Type{\Sno}\) and \(\TSSC\) are projections of \(\TT{S}\), so we can add (non-unique) projection edges \(\TT{S} \ProjectionEdge \Type{\Sname} \times \Type{\Status} \times T_{\City}\) and \(\TT{S} \ProjectionEdge \Type{\Sno}\). Similarly, \(\Type{\Sname}\), \(\Type{\Status}\), and \(\Type{\City}\) are all projections of \(\TSSC\), so we can add further projection edges to these nodes, as shown in Figure~\ref{fig-sig-s-alone}. +Both \(\Type{\Sno}\) and \(\TSSC\) are projections of \(\TT{S}\), so we can add a non-unique projection edge \(\TT{S} \ProjectionEdge \Type{\Sname} \times \Type{\Status} \times T_{\City}\) and a unique projection edge \(\TT{S} \TrivialProjection \Type{\Sno}\). Similarly, \(\Type{\Sname}\), \(\Type{\Status}\), and \(\Type{\City}\) are all projections of \(\TSSC\), so we can add further non-unique projection edges to these nodes, as shown in Figure~\ref{fig-sig-s-alone}. -Looking at Figure~\ref{fig-sig-s-alone}, it could be argued that the \(T_{\dots}\) type notation is redundant and unnecessarily complicates the diagram. Why not just use, e.g., \(S\) instead of \(\Type{S}\), \(\Sno\) instead of \(\Type{\Sno}\), and \(\Sno \times \Sname \times \Status \times \City\) instead of \(\TT{S}\)? It is important to remember, however, that SIGs like that shown in Figure~\ref{fig-sig-s-alone} refer to multiple elements of quite different kinds, including relations, tuples, and various combinations of attributes. Discarding the type notation might be reasonable in informal situations, but could lead to confusion between the different kinds of elements represented. The explicit type notation protects the semantics of the various elements of the schema and makes it clear what is being considered. +Looking at Figure~\ref{fig-sig-s-alone}, it could be argued that the \(T_{\dots}\) type notation is redundant and unnecessarily complicates the diagram. Why not just use, e.g., \(S\) instead of \(\Type{S}\), \(\Sno\) instead of \(\Type{\Sno}\), and \(\Sno \times \Sname \times \Status \times \City\) instead of \(\TT{S}\)? It is important to remember, however, that SIGs like that shown in Figure~\ref{fig-sig-s-alone} refer to multiple elements of quite different kinds, including relations, tuples, and various combinations of attributes. Discarding the type notation might be reasonable in informal situations, but could lead to confusion between the different kinds of elements represented. The explicit type notation protects the semantics of the various elements of the schema and makes it clear what is being represented. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -578,7 +580,7 @@ Views and the base relvars they are derived from normally exist within the same ``base'' database schema (although this is not strictly required), which we shall refer to as \(\SC{0}\). However, an end user may be granted access only to the views, and thus not be aware of the existence of the base relvar(s), or vice versa. This effectively partitions the schema into logical sub-schemas \(\SC{1}, \SC{2}, \dotsc, \SC{n}\). -There are explicit and implicit constraints in \(\SC{0}\) that should also apply to the sub-schemas \cite{Date.C-2013a-View}. Ignoring these constraints in a sub-schema \(\SC{k}\) will affect its information capacity, and could negatively impact the information equivalence of view updates involving \(\SC{k}\). In the next section we will show how such constraints can be propagated into sub-schemas by rewriting them in terms of the sub-schema alone, which enables us to explicitly incorporate into the sub-schema the type information implicit in \(\SC{0}\) that would otherwise be omitted. We will then discuss the construction of SIGs for Date's restriction view example (Section~\ref{sec-date-sigs}), and show how to transform and compare these (Section~\ref{sec-transforming}). +There are explicit and implicit constraints in \(\SC{0}\) that should also apply to the sub-schemas \cite{Date.C-2013a-View}. Ignoring these constraints in a sub-schema \(\SC{k}\) will affect its information capacity, and could negatively impact the information equivalence of view updates involving \(\SC{k}\). In the next section we will show how such constraints can be propagated into sub-schemas by rewriting them in terms of the sub-schema alone, which enables us to explicitly incorporate into the sub-schema semantic information implicit in \(\SC{0}\) that would otherwise be lost. We will then discuss the construction of SIGs for Date's restriction view example (Section~\ref{sec-date-sigs}), and show how to transform and compare these (Section~\ref{sec-transforming}). %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -689,7 +691,7 @@ \subsubsection{Schema \(\SC{1}\)} \label{sec-constraints-s-i} -\noindent A user granted access only to \(\SC{1}\) can see only relvar \(S\). Constraint \Constraint{\ref{constraint-key}}(a) can be propagated directly from \(\SC{0}\) into \(\SC{1}\) because it is already expressed in terms of \(S\) only. Constraints \Constraint{\ref{constraint-implied-types-london}}--\Constraint{\ref{constraint-implied-types-TSSNL}} are not expressed in terms of any named relvars in \(\SC{0}\), and thus can also be propagated directly from \(\SC{0}\) into \(\SC{1}\). Constraints \Constraint{\ref{constraint-key}}(b), \Constraint{\ref{constraint-key}}(c), and \Constraint{\ref{constraint-union}}--\Constraint{\ref{constraint-tuple-types}} cannot be propagated directly from \(\SC{0}\) to \(\SC{1}\), but can all be rewritten in terms of \(S\) only by substituting the definitions of \(\LS\) (equation~\ref{eqn-ls}) and \(\NLS\) (equation~\ref{eqn-nls}), as appropriate. To simplify expressions in the following discussion, we shall use \(\LSsub\) to represent \(\RelRestrict_{\City = \mathit{'London'}}S\) (\ref{eqn-ls}) and \(\NLSsub\) to represent \(\RelRestrict_{\City \ne \mathit{'London'}}S\) (\ref{eqn-nls}). (This also clearly distinguishes the substitutions from the \(\LS\) and \(\NLS\) relvars in \(\SC{0}\), \(\SC{2}\), and \(\SC{3}\).) A constraint that has been rewritten so that it can be propagated from \(\SC{0}\) into \(\SC{1}\) is denoted by, e.g., \Constraint[\SC{1}]{\ref{constraint-union}}. Consequently we can derive the following set of rewritten constraints for \(\SC{1}\): +\noindent A user granted access only to \(\SC{1}\) can see only relvar \(S\). Constraint \Constraint{\ref{constraint-key}}(a) can be propagated directly from \(\SC{0}\) into \(\SC{1}\) because it is already expressed in terms of \(S\) only. Constraints \Constraint{\ref{constraint-implied-types-london}}--\Constraint{\ref{constraint-implied-types-TSSNL}} are not expressed in terms of any named relvars in \(\SC{0}\), and thus can also be propagated directly from \(\SC{0}\) into \(\SC{1}\). Constraints \Constraint{\ref{constraint-key}}(b), \Constraint{\ref{constraint-key}}(c), and \Constraint{\ref{constraint-union}}--\Constraint{\ref{constraint-tuple-types}} cannot be propagated directly from \(\SC{0}\) into \(\SC{1}\), but can all be rewritten in terms of \(S\) only by substituting the definitions of \(\LS\) (equation~\ref{eqn-ls}) and \(\NLS\) (equation~\ref{eqn-nls}), as appropriate. To simplify expressions in the following discussion, we shall use \(\LSsub\) to represent \(\RelRestrict_{\City = \mathit{'London'}}S\) (\ref{eqn-ls}) and \(\NLSsub\) to represent \(\RelRestrict_{\City \ne \mathit{'London'}}S\) (\ref{eqn-nls}). (This also clearly distinguishes the substitutions from the \(\LS\) and \(\NLS\) relvars in \(\SC{0}\), \(\SC{2}\), and \(\SC{3}\).) A constraint that has been rewritten so that it can be propagated from \(\SC{0}\) into \(\SC{1}\) is denoted by, e.g., \Constraint[\SC{1}]{\ref{constraint-union}}. Consequently we can derive the following set of rewritten constraints for \(\SC{1}\): \begin{ConstraintList}[\SC{1}] \item \Sno\ is a key for each of \(\LSsub\) and \(\NLSsub\), therefore the functional dependency \(\{\Sno\} \rightarrow \{\Sname, \Status, \City\}\) also holds in each of (b) \(\LSsub\); and (c) \(\NLSsub\). [The fact that we are dealing with two disjoint subsets of \(\City\) does not change this.] @@ -1003,7 +1005,7 @@ \item[Duplicate nodes] in \(\SC{3}\) (producing \(\SC{3}'\)) that correspond to supertypes in \(\SC{1}\), in order to create nodes corresponding to the extra nodes that \(\SC{1}\) has over \(\SC{3}\). The goal is to produce a node structure similar to that of \(\SC{1}\). The duplicated nodes will be connected to their original nodes by trivial selection edges. These transformations (if applicable) will not alter the information capacity of \(\SC{3}'\) (i.e., \(\Equivalent{\SC{3}}{\SC{3}'}\)). - \item[Copy and move edges] across around the SIG to produce an edge structure similar to that of \(\SC{1}\). The annotations of each moved or copied edge will be determined by the path it is composed with, as discussed in Section~\ref{sec-sig-compare}. These transformations (if applicable) will \emph{increase} the information capacity of \(\SC{3}'\) (i.e., \(\Dominates{\SC{3}'}{\SC{3}}\)). + \item[Copy and move edges] around the SIG in order to produce an edge structure similar to that of \(\SC{1}\). The annotations of each moved or copied edge will be determined by the path it is composed with, as discussed in Section~\ref{sec-sig-compare}. These transformations (if applicable) will \emph{increase} the information capacity of \(\SC{3}'\) (i.e., \(\Dominates{\SC{3}'}{\SC{3}}\)). \end{description} @@ -1012,12 +1014,12 @@ \item[Delete duplicate nodes] in \(\SC{3}'\) that are no longer needed, especially those that do not exist in \(\SC{1}\). These transformations (if applicable) will not alter the information capacity of \(\SC{3}'\) (\(\equiv\)). - \item[Remove totality annotations] from trivial selection edges in \(\SC{3}'\) to change them into non-trivial selection edges. This decouples subtypes and supertypes to ensure that they represent different domains. These transformations (if applicable) will \emph{increase} the information capacity of \(\SC{3}'\) (\(\preceq\)). + \item[Remove totality annotations] from trivial selection edges in \(\SC{3}'\) to change them into non-trivial selection edges. This decouples duplicate nodes into subtypes and supertypes, ensuring that they represent different domains. These transformations (if applicable) will \emph{increase} the information capacity of \(\SC{3}'\) (\(\preceq\)). \end{description} For example, we might duplicate some nodes, remove some totality annotations, move or copy some edges, delete some no longer needed nodes, move or copy some more edges, then remove some more totality annotations. -Applying this process to \(\SC{3}\), we first duplicate node \(\Type{\LS}\) to \(\Type{S}\) and node \(\TT{\LS}\) to \(\TT{S}\). To obtain a node structure similar to that of \(\SC{1}\), we also need to further duplicate \(\Type{S}\) to \(\Type{\NLS}\) and \(\TT{S}\) to \(\TT{\NLS}\). The result of these duplications is shown in Figure~\ref{fig-transform-all-duplicates}. The transformed SIG now has essentially the same node structure as that for \(\SC{1}\). +Applying this process to \(\SC{3}\), we first duplicate node \(\Type{\LS}\) to \(\Type{S}\) and node \(\TT{\LS}\) to \(\TT{S}\). To obtain a node structure similar to that of \(\SC{1}\), we also need to further duplicate \(\Type{S}\) to \(\Type{\NLS}\) and \(\TT{S}\) to \(\TT{\NLS}\). The result of these duplications is shown in Figure~\ref{fig-transform-all-duplicates}. The transformed SIG now has the same node structure as that for \(\SC{1}\). %%%%%%%%%%%%%%%%%%%% @@ -1086,9 +1088,9 @@ The next phase is to copy and move edges. To produce the edge \(\Type{S} \SurTotal \TT{S}\), we can first \emph{copy} \(\Type{\LS} \SurTotal \TT{\LS}\) across the path \(\TT{S} \TrivialSelection \TT{\LS}\) (giving \(\Type{\LS} \SurTotal \TT{S}\)), then \emph{move} it across \(\Type{S} \TrivialSelection \Type{\LS}\). We can carry out a similar series of transformations to copy and move \(\Type{S} \SurTotal \TT{S}\) to \(\Type{\NLS} \SurTotal \TT{\NLS}\). -Next, we copy and move the projection edge \(\TT{\LS} \TrivialProjection \Type{\Sno}\) to produce the edges \(\TT{S} \TrivialProjection \Type{\Sno}\) and \(\TT{\NLS} \TrivialProjection \Type{\Sno}\). Similarly, we copy and move \(\TT{\LS} \ProjectionEdge \TSSL\) to produce \(\TT{S} \LabelledEdge{\sigedge{12}}{\RelProject} \TSSC\) and \(\TT{\NLS} \LabelledEdge{\sigedge{02}}{\RelProject} \TSSNL\). Note that the first edge in the latter pair of transformations loses the totality annotation, and the second edge the surjectivity annotation, as a result of composition with the pre-existing non-bijective selection edges running vertically down the centre of the SIG. +Next, we copy and move the unique projection edge \(\TT{\LS} \TrivialProjection \Type{\Sno}\) to produce the edges \(\TT{S} \TrivialProjection \Type{\Sno}\) and \(\TT{\NLS} \TrivialProjection \Type{\Sno}\). Similarly, we copy and move \(\TT{\LS} \ProjectionEdge \TSSL\) to produce \(\TT{S} \LabelledEdge{\sigedge{12}}{\RelProject} \TSSC\) and \(\TT{\NLS} \LabelledEdge{\sigedge{02}}{\RelProject} \TSSNL\). Note that the first edge in the latter pair of transformations loses its totality annotation, and the second edge its surjectivity annotation, as a result of composition with the pre-existing non-bijective selection edges between the constructed tuple types running vertically down the centre of the SIG. -Finally, we carry out similar copy and move transformations to copy the key edge \(\Type{\Sno} \KeyEdge \TSSC\) to \(\Type{\Sno} \LabelledEdge{\sigedge{03}}{\mathit{key}} \TSSL\), and \(\Type{\Sno} \LabelledEdge{\sigedge{03}}{\mathit{key}} \TSSNL\). Both new edges lose the totality annotation for the same reason as the projection edges above. The result of all these edge transformations is shown in Figure~\ref{fig-transform-edge-moves}. +Finally, we carry out similar copy and move transformations to copy the key edge \(\Type{\Sno} \KeyEdge \TSSC\) to \(\Type{\Sno} \LabelledEdge{\sigedge{03}}{\mathit{key}} \TSSL\), and \(\Type{\Sno} \LabelledEdge{\sigedge{03}}{\mathit{key}} \TSSNL\). Both new edges lose their totality annotations for the same reason as the projection edges above. The result of all these edge transformations is shown in Figure~\ref{fig-transform-edge-moves}. %%%%%%%%%%%%%%%%%%%% @@ -1170,7 +1172,7 @@ %%%%%%%%%%%%%%%%%%%% -If we compare Figure~\ref{fig-sig-s} with Figure~\ref{fig-transform-edge-moves}, we can see that the transformed SIG for \(\SC{3}'\) now has the same topology as the SIG for \(\SC{1}\). We still need to remove the totality annotations from the newly created trivial selection edges on the left, but even when this is done, we do not achieve the desired isomorphism, as the edges \(\TT{S} \LabelledEdge{\sigedge{12}}{\RelProject} \TSSC\), \(\TT{\NLS} \LabelledEdge{\sigedge{02}}{\RelProject} \TSSNL\), \(\Type{\Sno} \LabelledEdge{\sigedge{03}}{\mathit{key}} \TSSL\), and \(\Type{\Sno} \LabelledEdge{\sigedge{03}}{\mathit{key}} \TSSNL\) all have different annotations from the corresponding edges in the SIG for \(\SC{1}\). This clearly shows that the information capacities of \(\SC{1}\) and \(\SC{3}\) are incomparable, and that view updates based on this pair of schemas will be problematic. +If we compare Figure~\ref{fig-sig-s} with Figure~\ref{fig-transform-edge-moves}, we can see that the transformed SIG for \(\SC{3}'\) now has the same topology as the SIG for \(\SC{1}\). We still need to remove the totality annotations from the newly created trivial selection edges between the relation and tuple types on the left, but even when this is done, we do not achieve the desired isomorphism, as the edges \(\TT{S} \LabelledEdge{\sigedge{12}}{\RelProject} \TSSC\), \(\TT{\NLS} \LabelledEdge{\sigedge{02}}{\RelProject} \TSSNL\), \(\Type{\Sno} \LabelledEdge{\sigedge{03}}{\mathit{key}} \TSSL\), and \(\Type{\Sno} \LabelledEdge{\sigedge{03}}{\mathit{key}} \TSSNL\) all have different annotations from the corresponding edges in the SIG for \(\SC{1}\). This clearly shows that the information capacities of \(\SC{1}\) and \(\SC{3}\) are incomparable, and that view updates based on this pair of schemas will be problematic. If we were to create a fourth sub-schema \(\SC{4} = \{\NLS\}\) and compare this with \(\SC{1}\), the result would be similar to that for \(\SC{3}\). @@ -1183,17 +1185,17 @@ \noindent In this paper we have shown how to use the concept of relative information capacity and an extension of the schema intension graph (SIG) formalism to more formally model Date's notion of information equivalence with respect to relational view updates. Our analysis of Date's restriction view example \cite{Date.C-2013a-View} using these techniques is consistent with his findings, so the approach shows promise. The information capacities of \(\SC{1} = \{S\}\) and \(\SC{2} = \{\LS,\NLS\}\) are equivalent, so it should be possible to effectively propagate view updates between \(\SC{1}\) and \(\SC{2}\), regardless of the configuration of views and base relvars. Conversely, the information capacity of \(\SC{3} = \{\LS\}\) is not comparable with that of \(\SC{1}\), implying that there are updates that cannot be propagated between \(\SC{1}\) and \(\SC{3}\). -Our analysis shows the importance of clearly identifying all explicit and implicit constraints that can be propagated from the base schema \(\SC{0}\) to its sub-schemas. Omission of constraints could lead to a different SIG structure that might not be comparable, or might lead to an erroneous conclusion about a schema's information capacity relative to another. +Our analysis reinforces the importance of explicitly identifying all constraints that can be propagated from the base schema \(\SC{0}\) to its sub-schemas. Omission of constraints could lead to a different SIG structure that might lead to an erroneous conclusion about a schema's information capacity relative to another. -There are several avenues for future work. First, the proof of concept discussed in this paper only deals with disjoint restriction views, which are arguably the simplest case of view updating. Further work needs to be done to characterise other types of view configuration, such as projection views, join views, and union views. This will enable us to test and refine the application of SIGs in this context. +There are several avenues for future work. First, the proof of concept discussed in this paper only deals with disjoint restriction views, which are arguably the simplest case of view updating. Further work needs to be done to characterise other types of view configuration, such as projection views, union views, and in particular, join views. This will enable us to test and refine the application of SIGs in this context. The discussion in this paper focused only on the relational context, but relative information capacity and the SIG formalism are data model agnostic. It would therefore be interesting to investigate the application of these techniques to non-relational view updating contexts (e.g., XML). -It is unclear at present whether the omission of inclusion dependencies such as foreign keys is significant. If such dependencies are significant, the question then is how best to represent them in the SIG formalism. +It is unclear at present whether the omission of inclusion dependencies such as foreign keys is significant. If such dependencies are significant, the question then is how best to represent them in the SIG formalism. Adding such constraints will clearly change the structure of a SIG, but we suspect that this will not change the relationship between the relative information capacities of different schemas. % It is also unclear at present whether copying or moving an edge across a bijective path should necessarily increase information capacity, as no annotations are changed in such a transformation. It may be that the structural change to the SIG implies an increase in information capacity regardless. Further investigation is required. -Finally, it is interesting to note that with the equivalent schemas \(\SC{1}\) and \(\SC{2}\), we were able to propagate all of the constraints from the base schema \(\SC{0}\) (Section~\ref{sec-constraints-s-ii}), whereas with the incomparable schemas \(\SC{1}\) and \(\SC{3}\), only a subset of the constraints could be propagated (Section~\ref{sec-constraints-s-iii}). Intuitively, it seems reasonable to conjecture that the degree to which constraints can be propagated from \(\SC{0}\) to its sub-schemas could be an indicator of information capacity equivalence among the sub-schemas, but this needs to be further investigated. +Finally, it is interesting to note that with the equivalent schemas \(\SC{1}\) and \(\SC{2}\), we were able to propagate all of the constraints from the base schema \(\SC{0}\) (Section~\ref{sec-constraints-s-ii}), whereas with the incomparable schemas \(\SC{1}\) and \(\SC{3}\), only a subset of the constraints could be propagated (Section~\ref{sec-constraints-s-iii}). Intuitively, it seems reasonable to conjecture that the degree to which constraints can be propagated from \(\SC{0}\) to its sub-schemas could be a direct indicator of information equivalence among the sub-schemas, but this needs to be further investigated. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%