nigel.stanger / Publications
• Added example of text output.
DP_2017
1 parent 6a25075, commit ba06acc06acc9cb69b15841850977b00b1904aba
Nigel Stanger authored on 21 Jul 2017
Showing 2 changed files
Koli_2017/Koli_2017_Stanger.tex
Koli_2017/notes.txt
Koli_2017/Koli_2017_Stanger.tex
\documentclass[sigconf, authordraft]{acmart}
\usepackage{tcolorbox}

% \title{(Mis)using unit testing to semi-automatically grade SQL schemas}
\title{Semi-automated grading of SQL schemas \\ by (mis)use of database unit testing}

\author{Nigel Stanger}
\orcid{orcid.org/0000-0003-3450-7443}
\affiliation{
  \institution{University of Otago}
  \department{Department of Information Science}
  \city{Dunedin}
  \country{New Zealand}
}
\email{nigel.stanger@otago.ac.nz}

\begin{document}

\begin{abstract}
abstract
\end{abstract}

\maketitle

\cite{Bhangdiya.A-2015a-XDa-TA,Chandra.B-2015a-Data,Chandra.B-2016a-Partial,Dekeyser.S-2007a-Computer,Kearns.R-1997a-A-teaching,Prior.J-2004a-Backwash,Russell.G-2005a-Online,Gong.A-2015a-CS-121-Automation,Farre.C-2008a-SVTe,Dietrich.S-1997a-WinRDBI,Binnig.C-2008a-Multi-RQP,Chays.D-2008a-Query-based,Marcozzi.M-2012a-Test,Haller.K-2010a-Test,Vatanawood.W-2004a-Formal,Lukovic.I-2003a-Proceedings,Bench-Capon.T-1998a-Report,Spivey.J-1989a-An-introduction,Choppella.V-2006a-Constructing,Ambler.S-2006a-Database}

\section{Introduction}

Any introductory database course needs to cover several core concepts, including what a database is, what a logical data model is, and how to create and interact with a database. Typically such courses focus on the Relational Model and its embodiment in SQL database management systems (DBMSs). This is partly because the Relational Model provides a sound theoretical framework for discussing key database concepts [cite], and partly because SQL DBMSs are still widely used. The shadow of SQL is so strong that even non-relational systems have adopted some form of SQL-like language in order to leverage existing knowledge (e.g., OQL \cite{Cattell.R-2000a-ODMG3}, HiveQL \cite{Apache-2017a-Hive}, and CQL \cite{Apache-2017a-CQL}).

Courses that teach SQL usually include one or more assessments that test students' SQL skills. These test students' ability to create a database using SQL data definition (DDL) statements, and to interact with the database using SQL data manipulation (DML) statements. Manually grading such code can be a slow, tedious, and potentially error-prone process. Automating the grading process enables faster turnaround times and greater consistency [cite]. If the grading can be done in real time, the grading tool could become part of a larger, interactive SQL learning environment \cite{Kenny.C-2005a-Automated,Kleiner.C-2013a-Automated,Mitrovic.A-1998a-Learning,Russell.G-2004a-Improving,Sadiq.S-2004a-SQLator}. While there have been many prior efforts to automatically grade SQL DML (see Section~\ref{sec-literature}), we have been unable to find any similar systems for automatically grading SQL DDL.

In our department, we offered two typical introductory papers on database systems. INFO 212 was offered from 1997(?) to 2011, and was a dedicated semester-long course (13 weeks). It was replaced in 2012 by INFO 214, which included 6\(\frac{1}{2}\) weeks of core database material (the remainder of the paper covered data communications and networking); INFO 214 was discontinued at the end of 2016. Over the years that these two papers were offered, we tried several different approaches to formulating and grading SQL DDL assessments. The three most significant were:

\begin{enumerate}

\item Allow students to choose and code their own scenario. It could be argued that this boosts student interest in the assessment, as students can work on a problem domain that interests them.
It does, however, mean that every student's submission is different, which makes the grading process harder.

\item Assign a standard scenario, but leave some elements under-specified. This improves the grading experience, but there is still the possibility of variation among student submissions, as students may interpret the under-specified elements in different ways. This is particularly problematic to automate if they choose different names for tables and columns, or implement a different structure.

\item Provide a detailed specification of a standard scenario. This can be presented as the detailed output from the requirements analysis phase. Students are told that they need to adhere closely to the specification, as other developers will be independently using the same specification to implement end-user applications. Students still have some room to alter things, but such changes cannot affect the view of the database seen by clients. This approach tests both the students' ability to write SQL DDL, and their ability to interpret and correctly convert a written database specification into a corresponding SQL schema.

\end{enumerate}

The third approach was used from 2009 until 2016 (?dates), and is what inspired the work discussed in this paper. It is also the most amenable to automation, as much of the assessment specification is fixed in advance, leaving less room for deviation.

Prior approaches to grading SQL DDL have focused on the \texttt{CREATE TABLE} syntax, but we have taken a different approach, in which we verify that the implemented schema conforms to the behaviour expected from the original specification. If the student achieves this, then by definition the DDL syntax must be correct (weakness: we do not consider coding style). This enables us to focus less on the specifics of the syntax and more on whether students have implemented the requirements correctly.

The requirements specification for the assessment is tightly defined, which means it can be readily codified in machine-readable form. Rather than attempt to parse and check the \texttt{CREATE TABLE} statements directly, we instead issue queries against the schema's metadata (catalog), and compare the results of these queries against the machine-readable version of the specification. The process then effectively becomes one of unit testing the schema against the original requirements. In our implementation, we used the PHPUnit database unit testing framework to carry out this process, albeit in a somewhat unorthodox way (see Section~\ref{sec-architecture}).

% original schema is codified in machine-readable form
% rather than attempt to parse CREATE TABLE statements, simply execute the DDL code to generate the database schema in the target DBMS, then run queries against the schema's metadata
% use a database unit testing framework (PHPUnit) to automate

\section{Prior work}
\label{sec-literature}

\section{Architecture}
\label{sec-architecture}

% grey 184 184 184 (72%)
% green 46 161 31
% red 146 23 29
% grey text 177 177 177 (69%)
\tcbset{boxsep=0pt,boxrule=0pt,arc=0pt,left=0pt,right=0pt,top=0.5pt,bottom=0.5pt}
\definecolor{bg grey}{rgb}{0.72,0.72,0.72}

\begin{table}
\ttfamily\scriptsize
\begin{tabbing}
0123\=\kill
------------------------------------------------------------ \\
\tcbox[colback=bg grey]{NOTE: Checking structure of table Product.} \\
TEST: [[ Product ]] \\
\> + OK \\
+++ PASSED: Table Product exists. \\
TEST: [[ Product.Product\_code ]] \\
\> + OK \\
... \\
+++ PASSED: Table Product contains all the expected columns. \\
TEST: [[ Product.Product\_code: data type is NUMBER | INTEGER ]] \\
\> + OK \\
... \\
+++ PASSED: All columns of table Product have data types compatible with the \\
specification. \\
TEST: [[ Product.Product\_code precision and scale = 8 (with scale 0) ]] \\
\> + OK \\
... \\
+++ PASSED: All columns of table Product have lengths compatible with the \\
specification. \\
TEST: [[ Product.Product\_code nullability should be N ]] \\
\> + OK \\
... \\
+++ PASSED: All columns of table Product have the expected nullability. \\
TEST: [[ Product PK ]] \\
\> + OK \\
+++ PASSED: Primary key of table Product exists. \\
TEST: [[ Product PK: Product\_code ]] \\
\> + OK \\
+++ PASSED: Primary key of table Product includes (only) the expected \\
columns. \\
TEST: [[ Product check constraint PRODUCT\_STOCK\_INVALID ]] \\
\> + OK \\
... \\
+++ PASSED: All constraints of table Product that should be are explicitly \\
named. \\
NOTE: Testing constraints of table Product. \\
TEST: [[ Product.Stock\_count accepts “0” ]] \\
\> + OK \\
TEST: [[ Product.Stock\_count accepts “99999” ]] \\
\> + OK \\
TEST: [[ Product.Restock\_level accepts “0” ]] \\
\> - FAILED! Column Product.Restock\_level won’t accept legal value 0. \\
Failed asserting that false is true. \\
TEST: [[ Product.Restock\_level accepts “99999” ]] \\
\> + OK \\
TEST: [[ Product.Minimum\_level accepts “0” ]] \\
\> - FAILED! Column Product.Minimum\_level won’t accept legal value 0. \\
Failed asserting that false is true. \\
TEST: [[ Product.Minimum\_level accepts “653” ]] \\
\> + OK \\
TEST: [[ Product.List\_price accepts “0” ]] \\
\> + OK \\
TEST: [[ Product.List\_price accepts “99999.99” ]] \\
\> + OK \\
--- FAILED: 2 of 8 legal values tested were rejected by a CHECK constraint.
\end{tabbing}
\caption{Example of grading output for table Product}
\end{table}

\section{Evaluation}
\label{sec-evaluation}

\section{Conclusions \& future work}
\label{sec-conclusion}

\newpage
\bibliographystyle{ACM-Reference-Format}
\bibliography{Koli_2017_Stanger}

\end{document}
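The introduction states that the requirements specification is codified in machine-readable form, but the actual encoding used by the grading tool is not included in this commit. The following is therefore only a minimal sketch of one possible encoding as a PHP array; the variable name $spec, the key names, and any column details beyond those visible in the example output are invented for illustration.

<?php
// Hypothetical machine-readable fragment of the assessment specification.
// Only details visible in the example output (Product_code's type, precision,
// scale, and nullability, plus the legal boundary values tested) are taken
// from the paper; the structure itself is illustrative.
$spec = [
    'PRODUCT' => [
        'primary_key' => ['PRODUCT_CODE'],
        'columns' => [
            'PRODUCT_CODE' => [
                'types'     => ['NUMBER', 'INTEGER'],  // acceptable data types
                'precision' => 8,
                'scale'     => 0,
                'nullable'  => false,
            ],
            'STOCK_COUNT' => [
                'types'        => ['NUMBER', 'INTEGER'],
                'legal_values' => [0, 99999],  // values a CHECK constraint must accept
            ],
            // ... remaining columns elided ...
        ],
    ],
];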
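Rather than parsing CREATE TABLE statements, the schema is checked by querying the catalog. As a rough sketch of what such a PHPUnit test might look like (this is not the actual test code from the grading tool), the example below assumes an Oracle back end (hence the USER_TAB_COLUMNS view), the PDO OCI driver, and placeholder connection details; the class and method names are invented.

<?php
use PHPUnit\Framework\TestCase;

// Illustrative sketch only: verifies one column of the student's Product table
// by querying Oracle's USER_TAB_COLUMNS catalog view rather than parsing the
// student's CREATE TABLE statement.
class ProductStructureTest extends TestCase
{
    private static $pdo;

    public static function setUpBeforeClass(): void
    {
        // Placeholder connection to the student's schema.
        self::$pdo = new PDO('oci:dbname=//localhost/XE', 'student', 'password');
        self::$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    }

    public function testProductCodeColumn(): void
    {
        $stmt = self::$pdo->prepare(
            "SELECT data_type, data_precision, data_scale, nullable
               FROM user_tab_columns
              WHERE table_name = 'PRODUCT' AND column_name = 'PRODUCT_CODE'"
        );
        $stmt->execute();
        $col = $stmt->fetch(PDO::FETCH_ASSOC);

        $this->assertNotFalse($col, 'Table Product should contain column Product_code.');
        $this->assertContains($col['DATA_TYPE'], ['NUMBER', 'INTEGER'],
            'Product_code should have a numeric data type.');
        $this->assertEquals(8, $col['DATA_PRECISION'], 'Product_code should have precision 8.');
        $this->assertEquals(0, (int) $col['DATA_SCALE'], 'Product_code should have scale 0.');
        $this->assertSame('N', $col['NULLABLE'], 'Product_code should not be nullable.');
    }
}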
Koli_2017/notes.txt
There are numerous prior systems for automatic grading of student queries, but these are all focused on the DML side of things.

• esql (Kearns et al., 2008) supports CREATE TABLE only as a pass-through to set up or modify a schema. It uses its own internal DBMS.
• aSQLg (Kleiner et al., 2013) only supports SELECT. Works with any back end DBMS. Uses the DBMS as a syntax checker.
• (Kenny & Pahl, 2005) queries only.
• ActiveSQL (Cumming & Russell, 2005; Russell & Cumming, 2005) only supports SELECT.
• SQLify (Dekeyser et al., 2007) only supports SELECT.
• XData (Bhangdiya et al., 2015; Chandra et al., 2015; Chandra et al., 2016) only supports SELECT.
• SQL-Tutor (Mitrovic, 1998) is an intelligent tutoring system that only supports SELECT.
• SQLator (Sadiq et al., 2004) only supports SELECT.
• AsseSQL (Prior & Lister, 2004) only supports SELECT.
• RDBI (Dietrich et al., 1997) supports relational algebra, domain and tuple relational calculus, and SQL (SELECT only). Uses its own internal DBMS.
• “CS 121 Automation Tool” (anjoola, cs12x-automate, GitHub, last updated 2015) is an automated marking system for SQL that appears to be customisable for all kinds of statements? However, the code that tests CREATE TABLE “[s]imply executes the CREATE TABLE statements to make sure they have the correct syntax”; it doesn't check against any form of specification.

On the pure testing side of things:

• SVTe (Farré et al., 2008) tests the “correctness” of a schema, but focuses mainly on consistency of constraints.
• Ambler (n.d., http://www.agiledata.org/essays/databaseTesting.html) talks purely about testing database functionality.

There appears to be very little work on automated grading of students’ schema definitions, and no-one has used unit testing as a framework for doing so. What use of unit testing there is focuses more on testing database /applications/ and on effective generation of test cases:

• (Binnig et al., 2013) Generating test databases (fixtures) for OLTP applications.
• (Chays & Shahid, 2008) Generating test inputs for database applications based on analysing the queries used.
• (Marcozzi et al., 2012) Generating test inputs for database applications using a formal language (ImperDB).

No-one appears to have tried to automatically test whether an extant schema is consistent with its specification. This is almost more of a formal methods thing? That work is more focused on /generating/ a conforming schema from the specification (e.g., Vatanawood & Rivepiboon, 2004; Luković et al., 2003), rather than checking that an already existing schema conforms, and on schema /transformations/ and evolution (e.g., Bench-Capon et al., 1998).

------------------------------------------------------------
NOTE: Checking structure of table Product.
TEST: [[ Product ]]
    + OK
+++ PASSED: Table Product exists.
TEST: [[ Product.Product_code ]]
    + OK
...
+++ PASSED: Table Product contains all the expected columns.
TEST: [[ Product.Product_code: data type is NUMBER | INTEGER ]]
    + OK
...
+++ PASSED: All columns of table Product have data types compatible with the specification.
TEST: [[ Product.Product_code precision and scale = 8 (with scale 0) ]]
    + OK
...
+++ PASSED: All columns of table Product have lengths compatible with the specification.
TEST: [[ Product.Product_code nullability should be N ]]
    + OK
...
+++ PASSED: All columns of table Product have the expected nullability.
TEST: [[ Product PK ]]
    + OK
+++ PASSED: Primary key of table Product exists.
TEST: [[ Product PK: Product_code ]]
    + OK
+++ PASSED: Primary key of table Product includes (only) the expected columns.
TEST: [[ Product check constraint PRODUCT_STOCK_INVALID ]]
    + OK
...
+++ PASSED: All constraints of table Product that should be are explicitly named.
NOTE: Testing constraints of table Product.
TEST: [[ Product.Stock_count accepts “0” ]]
    + OK
TEST: [[ Product.Stock_count accepts “99999” ]]
    + OK
TEST: [[ Product.Restock_level accepts “0” ]]
    - FAILED! Column Product.Restock_level won’t accept legal value 0 [-0.5].
Failed asserting that false is true.
TEST: [[ Product.Restock_level accepts “99999” ]]
    + OK
TEST: [[ Product.Minimum_level accepts “0” ]]
    - FAILED! Column Product.Minimum_level won’t accept legal value 0 [-0.5].
Failed asserting that false is true.
TEST: [[ Product.Minimum_level accepts “653” ]]
    + OK
TEST: [[ Product.List_price accepts “0” ]]
    + OK
TEST: [[ Product.List_price accepts “99999.99” ]]
    + OK
--- FAILED: 2 of 8 legal values tested were rejected by a CHECK constraint.
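The second half of the transcript checks CHECK constraints by attempting to insert legal boundary values and expecting them to be accepted (the “Failed asserting that false is true” lines are standard PHPUnit assertion messages). Below is a minimal sketch of one such test, assuming PDO with exceptions enabled; the column list and the placeholder values for the other columns are invented, and this is not the actual test code behind the transcript.

<?php
use PHPUnit\Framework\TestCase;

// Illustrative sketch only: a CHECK constraint is exercised by inserting a
// legal boundary value. If the DBMS rejects it, the student's constraint is
// too strict. Table and column names follow the example transcript.
class ProductConstraintTest extends TestCase
{
    private static $pdo;

    public static function setUpBeforeClass(): void
    {
        // Placeholder connection to the student's schema.
        self::$pdo = new PDO('oci:dbname=//localhost/XE', 'student', 'password');
        self::$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    }

    public function testStockCountAcceptsZero(): void
    {
        self::$pdo->beginTransaction();
        try {
            $stmt = self::$pdo->prepare(
                "INSERT INTO Product (Product_code, Stock_count, Restock_level,
                                      Minimum_level, List_price)
                 VALUES (1, :value, 10, 5, 9.99)"
                // Any other mandatory columns would also need placeholder values.
            );
            $accepted = $stmt->execute([':value' => 0]);
        } catch (PDOException $e) {
            $accepted = false;   // rejected, most likely by a CHECK constraint
        } finally {
            self::$pdo->rollBack();   // never leave test rows behind
        }

        $this->assertTrue($accepted, 'Column Product.Stock_count should accept legal value 0.');
    }
}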