Added results of database complexity metrics.
1 parent 7f4b457 commit 299149a927a783c731aba9251627405c6924d31b
Nigel Stanger authored on 3 Aug 2017
Showing 2 changed files
Koli_2017/Koli_2017_Stanger.tex
Second, the observed effect may be due to the imposition of minimum requirements from 2013 onwards. Since any schema that met the minimum requirements would score at least 50\%, we could reasonably expect this to have a positive effect on the mean. If we compare with the period where minimum requirements were not enforced, we do find a significant increase in the mean, from 72.4\% to 75.6\% (\(p \approx 0.047\)). We argue, however, that this effect would be minimal in the absence of a convenient mechanism for students to check conformance with the minimum requirements. Indeed, for 2013--2014 when student mode was available, the mean (81.7\%) was significantly higher than for 2015--2016 (70.2\%) when student mode was not available (\(p \approx 10^{-9}\)). As further confirmation, there was no significant difference between the means for 2009--2012 and 2015--2016.
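
As an illustrative sketch of how such a comparison of means can be tested (assuming Welch's two-sample \(t\)-test on per-student grade lists; the grade values below are placeholders, not the actual course data):

\begin{verbatim}
# Minimal sketch: compare mean grades between two periods.
# Assumes Welch's two-sample t-test; the grade lists are
# placeholders, not the actual course data.
from scipy import stats

grades_2013_14 = [82.5, 79.0, 91.0, 68.5, 88.0]  # placeholder grades
grades_2015_16 = [71.0, 65.5, 74.0, 62.0, 69.5]  # placeholder grades

t_stat, p_value = stats.ttest_ind(grades_2013_14, grades_2015_16,
                                  equal_var=False)  # Welch's variant
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
\end{verbatim}

Welch's variant is chosen here only because it does not assume equal variances between the two cohorts; the comparisons reported above may have used a different test.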
 
Third, the switch to second semester in 2012--2013 could have negatively impacted students' performance by increasing the length of time between their exposure to basic data management concepts in first year and their entry into the second-year database course. In effect, they had longer to forget relevant material learned in first year. If so, we could reasonably expect grades in second semester offerings of the course to be lower. However, grades for second semester offerings (2012--2013, mean 76.9\%) were significantly \emph{higher} (\(p \approx 0.015\)) than those for first semester offerings (2009--2011 and 2014--2016, mean 72.9\%). This should not be surprising, given that 2013 (a second semester offering) had the highest grades overall. This effectively rules out the semester change as a factor.
 
Fourth, perhaps the years with higher grades used less complex---and therefore easier---scenarios. To test this, we computed the following database complexity metrics for each of the four scenarios used: database complexity index (DCI) \cite{Sinha.B-2014a-Estimation}; referential degree (RD), depth of referential tree (DRT), and number of attributes (NA) \cite{Calero.C-2001a-Database,Piattini.M-2001a-Table}; database complexity (DC) \cite{Pavlic.M-2008a-Database}; and ``Software Metric Analyzer for Relational Database Systems'' (SMARtS)
\cite{Jamil.B-2010a-SMARtS}. The results are shown in \cref{tab-metrics}. All but the DRT metric showed that the ``BDL'', ``used cars'', and ``student records'' scenarios were of comparable complexity, while the ``postgrad'' scenario was less complex. It therefore seems unlikely that scenario complexity is a factor in the performance differences. It is also interesting to note that the ``used cars'' scenario was used in 2014 and 2015, and yet the 2015 grades were significantly \emph{lower} than those for 2014. The only clear difference here is that our system was not used in 2015.
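
As a simplified illustration of how the counting-based metrics are derived, the following sketch computes only RD (the total number of foreign keys) and NA (the total number of attributes) over a hypothetical toy schema; the full metric definitions follow the cited sources:

\begin{verbatim}
# Simplified sketch of two counting-based schema metrics:
#   RD = referential degree (total number of foreign keys)
#   NA = total number of attributes
# The schema below is a toy example, not one of the actual scenarios.
schema = {
    "student":   {"attributes": ["id", "name", "dob"],
                  "foreign_keys": []},
    "paper":     {"attributes": ["code", "title", "points"],
                  "foreign_keys": []},
    "enrolment": {"attributes": ["student_id", "paper_code"],
                  "foreign_keys": ["student_id", "paper_code"]},
}

rd = sum(len(t["foreign_keys"]) for t in schema.values())
na = sum(len(t["attributes"]) for t in schema.values())
print(f"RD = {rd}, NA = {na}")  # RD = 2, NA = 8
\end{verbatim}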
 
 
\begin{table}
\begin{tabular}{lrrrrrr}
\toprule
Scenario & DCI & RD & NA & DRT & DC & SMARtS \\
\midrule
``postgrad'' & 277 & 9 & 32 & 7 & 37 & 28.75 \\
``student records'' & 367 & 12 & 43 & 9 & 61 & 38.25 \\
``BDL'' & 370 & 11 & 50 & 8 & 59 & 40.75 \\
``used cars'' & 380 & 13 & 46 & 7 & 53 & 36.50 \\
\bottomrule
\end{tabular}
\caption{Database complexity metrics for the scenarios used over the period 2009--2016.}
\label{tab-metrics}
\end{table}
 
 
Fifth, class size could be a factor. We might plausibly expect a smaller class to have a more collegial atmosphere that promotes better learning. However, if we look at the sizes of the classes in \cref{tab-data}, we can see no discernible pattern between class size and performance. Indeed, both the best (2013) and worst (2012, 2015) performances came from classes of similar size (75, 77, and 71, respectively).
 
Sixth, it could be that better performance occurred in years where the students were simply more capable in general. We obtained GPA data for the students enrolled in each year and computed the median as an indication of the general capability of the class. Looking at \cref{tab-data}, we can immediately see that the year with the best results (2013) was also the year with the second-lowest median GPA (3.2). Contrast this with the poor performance in 2012, where the median GPA was 3.4. Indeed, in both years that student mode was available, the median GPA was lower than in many other years, yet performance was better than in years with higher median GPAs. This argues against the idea that we simply had a class full of very capable students in the better-performing years.
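
As a minimal sketch of this cohort-capability check (the per-year GPA lists below are hypothetical placeholders; \texttt{statistics.median} is from the Python standard library):

\begin{verbatim}
# Sketch: per-cohort median GPA as a rough indicator of capability.
# The GPA lists are placeholders, not the actual enrolment data.
from statistics import median

gpas_by_year = {
    2012: [3.1, 3.4, 3.7, 3.5, 3.2],
    2013: [2.9, 3.2, 3.3, 3.4, 3.0],
}

for year, gpas in sorted(gpas_by_year.items()):
    print(year, round(median(gpas), 1))
\end{verbatim}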
Koli_2017/db_metrics.xlsx (binary file; no preview available)