diff --git a/tiddlers/content/labs/05/_Labs_05_Code injection.md b/tiddlers/content/labs/05/_Labs_05_Code injection.md deleted file mode 100644 index b898ea8..0000000 --- a/tiddlers/content/labs/05/_Labs_05_Code injection.md +++ /dev/null @@ -1,20 +0,0 @@ -Once again, xkcd has a pertinent cartoon for our lab topic: - -[![https://xkcd.com/327/](https://imgs.xkcd.com/comics/exploits_of_a_mom.png)](https://xkcd.com/327/) -[Source: https://xkcd.com/327/] - -By way of explanation, a carefully crafted piece of text, when treated as data by an unsuspecting computer system, can cause malicious code to be run. The payload code is *injected* into the system within a container that the system expects to be ordinary, harmless data. Often, the injection will make use of special characters such as text delimiters, comment markers, and statement terminators (these will be discussed in more detail). In this case, an SQL `DROP TABLE` statement is embedded within the data intended to contain just the name of a student in the database; if this code is run, the `Student` table (and all its contents) will be deleted! - -A general rule should be: never trust user data. There are a number of mitigations to code injection, but all stem from that general principle. - - -## Mitigation - -A number of strategies can be used against injection attacks. Many of the following are generally applicable, though some are most relevant for the common case of SQL injection attacks against a back-end database: - -* Validate client-side input to make sure it looks reasonable before attempting to run it. -* Use specialised user accounts with minimal privileges for different application functionality, e.g. a user for viewing product details should not be able to update anything. -* Use prepared statements rather than concatenation. -* Use database constraints (integrity rules) as an extra guard against invalid data, e.g. prohibit anything resembling HTML or JavaScript inside normal table data. 
Some database designs even prohibit single quotes in user data. -* Enable software "safety switches" in the database and/or application layer, e.g. to disallow unreasonably large input or multiple statements separated by semicolons. -* Ensure that user passwords are strong, properly salted, peppered and hashed in their stored form, so that even if they are compromised, they will be of little use to the attacker. diff --git a/tiddlers/content/labs/05/_Labs_05_Code injection.md.meta b/tiddlers/content/labs/05/_Labs_05_Code injection.md.meta deleted file mode 100644 index e7e6d2b..0000000 --- a/tiddlers/content/labs/05/_Labs_05_Code injection.md.meta +++ /dev/null @@ -1,4 +0,0 @@ -section: 2 -tags: lab lab05 hidden -title: $:/Labs/05/Code injection -type: text/x-markdown \ No newline at end of file diff --git a/tiddlers/content/labs/05/_Labs_05_Introduction.md b/tiddlers/content/labs/05/_Labs_05_Introduction.md deleted file mode 100644 index bd6f40d..0000000 --- a/tiddlers/content/labs/05/_Labs_05_Introduction.md +++ /dev/null @@ -1,9 +0,0 @@ -This week's lab will cover the following main topics: - -* Language features that create injection vulnerabilities -* Injection attacks involving the database language SQL -* SQL injection safeguards -* JavaScript injection vulnerabilities -* Command shell injection attacks - -We will also introduce the Assignment 1 tasks and how to run the virtual machine image provided on Blackboard. We recommend making a start on the assignment work during this lab session (remember that you can pair up with another student for the assignment if you wish). 
\ No newline at end of file diff --git a/tiddlers/content/labs/05/_Labs_05_Introduction.md.meta b/tiddlers/content/labs/05/_Labs_05_Introduction.md.meta deleted file mode 100644 index 5553e81..0000000 --- a/tiddlers/content/labs/05/_Labs_05_Introduction.md.meta +++ /dev/null @@ -1,4 +0,0 @@ -section: 1 -tags: lab lab05 hidden -title: $:/Labs/05/Introduction -type: text/x-markdown \ No newline at end of file diff --git a/tiddlers/content/labs/05/_Labs_05_JavaScript Injection.md b/tiddlers/content/labs/05/_Labs_05_JavaScript Injection.md deleted file mode 100644 index cbee924..0000000 --- a/tiddlers/content/labs/05/_Labs_05_JavaScript Injection.md +++ /dev/null @@ -1 +0,0 @@ -If a client-side attacker can inject JavaScript code into database data, it may end up being activated in a user's browser. We will demonstrate a couple of ways of carrying out this sort of attack, and show how the database contents map to what is displayed in the user's browser. \ No newline at end of file diff --git a/tiddlers/content/labs/05/_Labs_05_JavaScript Injection.md.meta b/tiddlers/content/labs/05/_Labs_05_JavaScript Injection.md.meta deleted file mode 100644 index d0370f1..0000000 --- a/tiddlers/content/labs/05/_Labs_05_JavaScript Injection.md.meta +++ /dev/null @@ -1,4 +0,0 @@ -section: 6 -tags: lab lab05 hidden -title: $:/Labs/05/JavaScript Injection -type: text/x-markdown \ No newline at end of file diff --git a/tiddlers/content/labs/05/_Labs_05_Lab 5_ Injection Attacks.tid b/tiddlers/content/labs/05/_Labs_05_Lab 5_ Injection Attacks.tid deleted file mode 100644 index 4b304f3..0000000 --- a/tiddlers/content/labs/05/_Labs_05_Lab 5_ Injection Attacks.tid +++ /dev/null @@ -1,11 +0,0 @@ -tags: lab toc lab05 hidden -title: $:/Labs/05/Lab 5: Injection Attacks -type: text/vnd.tiddlywiki - -
Click the <> button below to open all of the sections for this lab.
- -!! Contents -<$set name="path" value="/Labs/05/"> -<$macrocall $name="contents-tree" path=<> /> -
<$macrocall $name="openByPath" path=<> />
- diff --git a/tiddlers/content/labs/05/_Labs_05_Linguistics of Programming.md b/tiddlers/content/labs/05/_Labs_05_Linguistics of Programming.md deleted file mode 100644 index e415c9b..0000000 --- a/tiddlers/content/labs/05/_Labs_05_Linguistics of Programming.md +++ /dev/null @@ -1,123 +0,0 @@ -Programming languages are designed to allow humans to instruct computers to perform automated tasks. The rules of *syntax* and *semantics* for the language give the programmer and the language implementation (e.g. compiler or interpreter) a common understanding of what code is valid and what it means, respectively. Unfortunately, a number of useful programming language features can give rise to injection vulnerabilities. We will provide a brief overview here, especially for the benefit of students with no background in programming. - -It is useful to be aware that most computer languages will scan program text in left-to-right order. Syntax errors can be reported immediately, although other errors (run-time errors) may only be detected when the (syntactically-valid) code is attempted. - -Take for example the following hypothetical code: - -``` -# Print a greeting to the user: -name = read(); -greeting = "Hello, $name"; -print(greeting); -``` - -The first line is a comment, to assist human readers in understanding the purpose or function of the code, and to be ignored by the compiler/interpreter. - -The remaining lines show three statements to be run in succession. First, user input is read into a variable called "name". Then, the variable is combined with a greeting and stored in another variable named "greeting". Lastly, the contents of the "greeting" variable are printed for display. - -Various special characters ("#", ";", "#", "(", ")") are used here to mark or *delimit* particular features of the code. These will be discussed in more detail in the following subsections. 
- - -# Delimiters - -In general terms, delimiters are special characters in a programming language that serve to denote boundaries within code. Some common types of boundary include: - -* Statement/command/instruction boundaries -* Character literals (strings) -* Comments or other code to be ignored -* Lists and list elements -* Line/field boundaries within data files such as CSV (comma-separated values) - -Character literals (strings) are one of the most common types of delimited code. In many languages, these are delimited using pairs of quotation marks, with the text between being treated verbatim by the programming language, much like quoted speech in written English. As the language compiler/interpreter is reading the input, after it encounters a string delimiter, it will treat the following text as literal character data, looking for the ending or closing delimeter. If it fails to find one when the input ends, it will raise a syntax error. - -In some languages, single- (') and double- (") quotation marks have different functions. For example, in C and related languages, double-quotes delimit strings, and single-quotes delimit individual character values. In other languages (such as the Linux command shell), single-quoted strings are treated strictly literally, while double-quoted strings may be subject to further processing such as the substitution of values for variables. - -Other language elements may have delimiters too, such as blocks of code within "{" and "}", lists of function parameters/arguments within "(" and ")", true/false conditions within "(" and ")", and vector/array/list indexes within "[" and "]". The use of asymmetric pairs of related characters helps with the readability. - -While paired delimiters are often used to mark regions of a certain type, single-character delimiters are also common. For example, statements in many languages are terminated by semicolons (";"). - -A similar lexical element in some languages is the *separator*. 
These occur *between* language elements of a certain type, as opposed to after them as with statement terminators. An example is the comma used in SQL to separate items in a list, as in: - -```sql -insert into Student (Student_ID, Name, Address) values (123, 'MORRIS, Horace', '99 Some St'); -``` - - -## Escape characters - -Some programming languages provide special "escape" characters, which serve to change the meaning of the character(s) following. A common example is the backslash ("\\"), which in many languages can be used to denote special characters such as tabs ("\t") and line-breaks ("\n") without having to use those characters literally ("\\\\" can be used to express a single literal backslash). - -Escape characters can also be used to indicate that a variable or other expression should be evaluated rather than simply being treated literally. The "$" sign is often used for this purpose, although ":", "@" and "%" are also encountered. - - -## Statement terminators - -Computer programs are often expressed as collections of *statements*, where each statement represents a discrete instruction or command being given. - -In the hypothetical example above, semicolon characters (";") are used to mark the end of a statement, much like a fullstop (".") in written English. In fact, some computer languages do use fullstops for this purpose, though semicolons (C, C++, Java, C#, Pascal) are more common in modern languages. In other languages (e.g. Python, Tcl), statements may be terminated simply by the end of a line (ASCII Carriage Return and/or Line Feed characters). - -In most programming languages, a *sequential flow of control* is the default, meaning that statements are run one after the other, in order. In most languages, changing the order will change the behaviour of the program. - - -## Comments - -Program code can be dense, complex, and at times cryptic. 
The design rationale and development process that resulted in the code will often not be apparent from the final code. For this reason, most programming languages allow *comments* to be inserted into the code for the benefit of other programmers. The computer will ignore the comments, either treating them as a "no-op", skipping over them, or removing them from further processing. - -Comments can often be written in single-line or multi-line (block) form. Single-line comments take effect until the end of the line (i.e. until the next line break character), and may be introduced by delimiter characters such as "#" (Python, Tcl, Unix shell), "//" (C and friends), and "--" (SQL, Lua). - -Multi-line comment syntax often uses asymmetric delimiters, as in C-like languages ("/*" marks the start, and "*/" marks the end.) - -Restrictions may exist on nesting comments within other comments (e.g. a single-line comment within a block comment). - - -# Wildcards - -Another class of special character is used when performing an inexact match on a string or filename. Instead of having to specify an exhaustive list of filenames, a wildcard character (such as "*") can be used instead (much like a "wild card" in certain card games, or the blank letter tiles in Scrabble). The wildcard can function either as a generic placeholder, or it can be expanded by the system to a list of matching names. - -For example, the following shell command will remove all the files within the current working directory: - -```bash -rm * -``` - -Some languages distinguish between single-character and multi-character wildcards. For example, SQL's `LIKE` expression uses "_" to match any single character and "%" to match any string of any length (including the empty string). Similarly, wildcards in the style of MS-DOS support "?" for single character matches and "*" for any string. 
- -For even more powerful string pattern matching, the [regular expression](https://en.wikipedia.org/wiki/Regular_expression) language goes far beyond the capabilities of basic wildcards. - - -# String Concatenation - -String concatenation is the process of combining multiple character string values into one, e.g. "spider" + "web" = "spiderweb". - -Again, the syntax varies, but some common ways to perform concatenation in various languages are: - -* The `+` operator (Java, Python) (usually overloaded with numeric addition) -* The `||` operator (SQL) (not to be confused with the logical OR operation is C-like languages) -* The `&` operator (Visual BASIC, Ada) -* A function such as `CONCAT()` - -Some programs generate code to be run, often in a different language. One of the most common examples of this is the use of embedded SQL database access code within a host language such as Java. The code may be parameterised: for example, a search string might be substituted into an SQL statement in order to perform the desired search. - -The naive way to combine the parameter value would be to concatenate a variable's value with the literal text of the search query. Suppose we were dealing with a search query of the following form: - -```sql -select Name, Price from Product where Name = 'calculator'; -``` - -To allow searching for any product, we would need to generalise the product name, i.e. the "calculator" string, using a variable in place of the literal string. - -Often these sorts of queries will also make use of wildcard pattern matching, e.g. - -```sql -select Name, Price from Product where Name like '%chocolate%'; -``` - -When embedded in a host program (in Java, in this case), the SQL command itself might have to be expressed as a character string: - -```java -String sql = "select Name, Price from Product where Name = '" + productName + "'"; -``` - -Note also the tricky quoting: SQL uses single-quotes to delimit strings, and Java uses double-quotes. 
Three values are being concatenated: the main part of the query, the product name variable, and a literal single-quote to terminate the SQL string. - -This approach is highly vulnerable to injection attacks, as we will see. \ No newline at end of file diff --git a/tiddlers/content/labs/05/_Labs_05_Linguistics of Programming.md.meta b/tiddlers/content/labs/05/_Labs_05_Linguistics of Programming.md.meta deleted file mode 100644 index 5681fa4..0000000 --- a/tiddlers/content/labs/05/_Labs_05_Linguistics of Programming.md.meta +++ /dev/null @@ -1,4 +0,0 @@ -section: 3 -tags: lab lab05 hidden -title: $:/Labs/05/Linguistics of Programming -type: text/x-markdown \ No newline at end of file diff --git a/tiddlers/content/labs/05/_Labs_05_SQL Injection.md b/tiddlers/content/labs/05/_Labs_05_SQL Injection.md deleted file mode 100644 index 4e1e353..0000000 --- a/tiddlers/content/labs/05/_Labs_05_SQL Injection.md +++ /dev/null @@ -1,3 +0,0 @@ -You can download the [SQL injection demo code](https://blackboard.otago.ac.nz/bbcswebdav/pid-2711936-dt-content-rid-17263610_1/xid-17263610_1) from Blackboard, unzip it into a folder under you home folder, and use `gradle build` in the terminal to prepare the system. You will need to copy the resulting Web archive (`.war`) file into your Tomcat folder (from the earlier lab work on HTTP). - -We will step you through the process of crafting some "nasty" strings to enter into the system to conduct SQL injection attacks on the back-end database. 
\ No newline at end of file diff --git a/tiddlers/content/labs/05/_Labs_05_SQL Injection.md.meta b/tiddlers/content/labs/05/_Labs_05_SQL Injection.md.meta deleted file mode 100644 index 48f34e6..0000000 --- a/tiddlers/content/labs/05/_Labs_05_SQL Injection.md.meta +++ /dev/null @@ -1,4 +0,0 @@ -section: 5 -tags: lab lab05 hidden -title: $:/Labs/05/SQL Injection -type: text/x-markdown \ No newline at end of file diff --git a/tiddlers/content/labs/05/_Labs_05_SQL crash course.md b/tiddlers/content/labs/05/_Labs_05_SQL crash course.md deleted file mode 100644 index 0f62ae5..0000000 --- a/tiddlers/content/labs/05/_Labs_05_SQL crash course.md +++ /dev/null @@ -1,47 +0,0 @@ -Some of you may not have encountered the database language SQL before, so this section will introduce the basics. SQL databases are collections of named tables, each with a uniform column structure. Each column has a name and a type (e.g. text, numeric, date). Rows in a table have a value for each column (although SQL does provide for `NULL` markers for missing or unknown data). Each table should have a *primary key*: a set of columns whose values must, in combination, be unique across all rows in the table. They may also have *foreign keys*, which are sets of columns whose values must *reference* or correspond to values in another set of columns (usually the primary key columns in another table). For example, a table for paper enrolments would have a foreign key referencing the student, and another foreign key referencing the paper. A database design can be depicted using an Entity/Relationship Diagram (ERD). - -SQL statements are divided into a number of categories, primarily the data definition language (DDL) for creating and managing database objects such as tables, and the data manipulation language (DML) for querying and modifying user data. 
- -The most common DDL statement is `CREATE TABLE`, which creates a new table in the database with the specified structure: - -```sql -create table Student ( - Student_ID number, - Surname varchar, - First_Names varchar, - Birth_Date date, - - constraint Student_PK primary key (Student_ID) -); -``` - -Data can then be added to the table using the `INSERT` statement: - -```sql -insert into Student (Student_ID, Surname, First_Names) values (123, 'Jones', 'Jenny'); -``` - -or modified using the `UPDATE` statement: - -```sql -update Student set First_Names = 'Jennifer' where Student_ID = 123; -``` - -or removed using the `DELETE` statement: - -```sql -delete from Student where Student_ID = 123; -``` - -Data can be retrieved (queried) using the `SELECT` statement: - -```sql -select Student_ID, First_Names || ' ' || Surname -from Student -where Student_ID = 123 -order by Surname, First_Names; -``` - -## Exercise - -Adapt this code (make up a few more rows of data) to set up a small student database using the [http://sqlfiddle.com](SQL Fiddle) site. Set the database engine (top left) to SQLite (WebSQL). Note that you will need to combine the DDL code and run it in the left-hand Schema Pane. You can then try running some queries in the Query Pane on the right. 
\ No newline at end of file diff --git a/tiddlers/content/labs/05/_Labs_05_SQL crash course.md.meta b/tiddlers/content/labs/05/_Labs_05_SQL crash course.md.meta deleted file mode 100644 index 72338f9..0000000 --- a/tiddlers/content/labs/05/_Labs_05_SQL crash course.md.meta +++ /dev/null @@ -1,4 +0,0 @@ -section: 4 -tags: lab lab05 hidden -title: $:/Labs/05/SQL crash course -type: text/x-markdown \ No newline at end of file diff --git a/tiddlers/content/labs/06/_Labs_05_Code injection.md b/tiddlers/content/labs/06/_Labs_05_Code injection.md new file mode 100644 index 0000000..b898ea8 --- /dev/null +++ b/tiddlers/content/labs/06/_Labs_05_Code injection.md @@ -0,0 +1,20 @@ +Once again, xkcd has a pertinent cartoon for our lab topic: + +[![https://xkcd.com/327/](https://imgs.xkcd.com/comics/exploits_of_a_mom.png)](https://xkcd.com/327/) +[Source: https://xkcd.com/327/] + +By way of explanation, a carefully crafted piece of text, when treated as data by an unsuspecting computer system, can cause malicious code to be run. The payload code is *injected* into the system within a container that the system expects to be ordinary, harmless data. Often, the injection will make use of special characters such as text delimiters, comment markers, and statement terminators (these will be discussed in more detail). In this case, an SQL `DROP TABLE` statement is embedded within the data intended to contain just the name of a student in the database; if this code is run, the `Student` table (and all its contents) will be deleted! + +A general rule should be: never trust user data. There are a number of mitigations to code injection, but all stem from that general principle. + + +## Mitigation + +A number of strategies can be used against injection attacks. 
Many of the following are generally applicable, though some are most relevant for the common case of SQL injection attacks against a back-end database: + +* Validate client-side input to make sure it looks reasonable before attempting to run it. +* Use specialised user accounts with minimal privileges for different application functionality, e.g. a user for viewing product details should not be able to update anything. +* Use prepared statements rather than concatenation. +* Use database constraints (integrity rules) as an extra guard against invalid data, e.g. prohibit anything resembling HTML or JavaScript inside normal table data. Some database designs even prohibit single quotes in user data. +* Enable software "safety switches" in the database and/or application layer, e.g. to disallow unreasonably large input or multiple statements separated by semicolons. +* Ensure that user passwords are strong, properly salted, peppered and hashed in their stored form, so that even if they are compromised, they will be of little use to the attacker. 
diff --git a/tiddlers/content/labs/06/_Labs_05_Code injection.md.meta b/tiddlers/content/labs/06/_Labs_05_Code injection.md.meta new file mode 100644 index 0000000..d827ad2 --- /dev/null +++ b/tiddlers/content/labs/06/_Labs_05_Code injection.md.meta @@ -0,0 +1,4 @@ +section: 2 +tags: lab lab06 hidden +title: $:/Labs/06/Code injection +type: text/x-markdown diff --git a/tiddlers/content/labs/06/_Labs_05_Introduction.md b/tiddlers/content/labs/06/_Labs_05_Introduction.md new file mode 100644 index 0000000..bd6f40d --- /dev/null +++ b/tiddlers/content/labs/06/_Labs_05_Introduction.md @@ -0,0 +1,9 @@ +This week's lab will cover the following main topics: + +* Language features that create injection vulnerabilities +* Injection attacks involving the database language SQL +* SQL injection safeguards +* JavaScript injection vulnerabilities +* Command shell injection attacks + +We will also introduce the Assignment 1 tasks and how to run the virtual machine image provided on Blackboard. We recommend making a start on the assignment work during this lab session (remember that you can pair up with another student for the assignment if you wish). \ No newline at end of file diff --git a/tiddlers/content/labs/06/_Labs_05_Introduction.md.meta b/tiddlers/content/labs/06/_Labs_05_Introduction.md.meta new file mode 100644 index 0000000..f715f4f --- /dev/null +++ b/tiddlers/content/labs/06/_Labs_05_Introduction.md.meta @@ -0,0 +1,4 @@ +section: 1 +tags: lab lab06 hidden +title: $:/Labs/06/Introduction +type: text/x-markdown diff --git a/tiddlers/content/labs/06/_Labs_05_JavaScript Injection.md b/tiddlers/content/labs/06/_Labs_05_JavaScript Injection.md new file mode 100644 index 0000000..cbee924 --- /dev/null +++ b/tiddlers/content/labs/06/_Labs_05_JavaScript Injection.md @@ -0,0 +1 @@ +If a client-side attacker can inject JavaScript code into database data, it may end up being activated in a user's browser. 
We will demonstrate a couple of ways of carrying out this sort of attack, and show how the database contents map to what is displayed in the user's browser. \ No newline at end of file diff --git a/tiddlers/content/labs/06/_Labs_05_JavaScript Injection.md.meta b/tiddlers/content/labs/06/_Labs_05_JavaScript Injection.md.meta new file mode 100644 index 0000000..0edb0c7 --- /dev/null +++ b/tiddlers/content/labs/06/_Labs_05_JavaScript Injection.md.meta @@ -0,0 +1,4 @@ +section: 6 +tags: lab lab06 hidden +title: $:/Labs/06/JavaScript Injection +type: text/x-markdown diff --git a/tiddlers/content/labs/06/_Labs_05_Lab 5_ Injection Attacks.tid b/tiddlers/content/labs/06/_Labs_05_Lab 5_ Injection Attacks.tid new file mode 100644 index 0000000..14bce53 --- /dev/null +++ b/tiddlers/content/labs/06/_Labs_05_Lab 5_ Injection Attacks.tid @@ -0,0 +1,11 @@ +tags: lab toc lab06 hidden +title: $:/Labs/06/Lab 6: Injection Attacks +type: text/vnd.tiddlywiki + +
Click the <> button below to open all of the sections for this lab.
+ +!! Contents +<$set name="path" value="/Labs/06/"> +<$macrocall $name="contents-tree" path=<> /> +
<$macrocall $name="openByPath" path=<> />
+ diff --git a/tiddlers/content/labs/06/_Labs_05_Linguistics of Programming.md b/tiddlers/content/labs/06/_Labs_05_Linguistics of Programming.md new file mode 100644 index 0000000..e415c9b --- /dev/null +++ b/tiddlers/content/labs/06/_Labs_05_Linguistics of Programming.md @@ -0,0 +1,123 @@ +Programming languages are designed to allow humans to instruct computers to perform automated tasks. The rules of *syntax* and *semantics* for the language give the programmer and the language implementation (e.g. compiler or interpreter) a common understanding of what code is valid and what it means, respectively. Unfortunately, a number of useful programming language features can give rise to injection vulnerabilities. We will provide a brief overview here, especially for the benefit of students with no background in programming. + +It is useful to be aware that most computer languages will scan program text in left-to-right order. Syntax errors can be reported immediately, although other errors (run-time errors) may only be detected when the (syntactically-valid) code is attempted. + +Take for example the following hypothetical code: + +``` +# Print a greeting to the user: +name = read(); +greeting = "Hello, $name"; +print(greeting); +``` + +The first line is a comment, to assist human readers in understanding the purpose or function of the code, and to be ignored by the compiler/interpreter. + +The remaining lines show three statements to be run in succession. First, user input is read into a variable called "name". Then, the variable is combined with a greeting and stored in another variable named "greeting". Lastly, the contents of the "greeting" variable are printed for display. + +Various special characters ("#", ";", "#", "(", ")") are used here to mark or *delimit* particular features of the code. These will be discussed in more detail in the following subsections. 
+ + +# Delimiters + +In general terms, delimiters are special characters in a programming language that serve to denote boundaries within code. Some common types of boundary include: + +* Statement/command/instruction boundaries +* Character literals (strings) +* Comments or other code to be ignored +* Lists and list elements +* Line/field boundaries within data files such as CSV (comma-separated values) + +Character literals (strings) are one of the most common types of delimited code. In many languages, these are delimited using pairs of quotation marks, with the text between being treated verbatim by the programming language, much like quoted speech in written English. As the language compiler/interpreter is reading the input, after it encounters a string delimiter, it will treat the following text as literal character data, looking for the ending or closing delimiter. If it fails to find one when the input ends, it will raise a syntax error. + +In some languages, single- (') and double- (") quotation marks have different functions. For example, in C and related languages, double-quotes delimit strings, and single-quotes delimit individual character values. In other languages (such as the Linux command shell), single-quoted strings are treated strictly literally, while double-quoted strings may be subject to further processing such as the substitution of values for variables. + +Other language elements may have delimiters too, such as blocks of code within "{" and "}", lists of function parameters/arguments within "(" and ")", true/false conditions within "(" and ")", and vector/array/list indexes within "[" and "]". The use of asymmetric pairs of related characters helps with the readability. + +While paired delimiters are often used to mark regions of a certain type, single-character delimiters are also common. For example, statements in many languages are terminated by semicolons (";"). + +A similar lexical element in some languages is the *separator*. 
These occur *between* language elements of a certain type, as opposed to after them as with statement terminators. An example is the comma used in SQL to separate items in a list, as in: + +```sql +insert into Student (Student_ID, Name, Address) values (123, 'MORRIS, Horace', '99 Some St'); +``` + + +## Escape characters + +Some programming languages provide special "escape" characters, which serve to change the meaning of the character(s) following. A common example is the backslash ("\\"), which in many languages can be used to denote special characters such as tabs ("\t") and line-breaks ("\n") without having to use those characters literally ("\\\\" can be used to express a single literal backslash). + +Escape characters can also be used to indicate that a variable or other expression should be evaluated rather than simply being treated literally. The "$" sign is often used for this purpose, although ":", "@" and "%" are also encountered. + + +## Statement terminators + +Computer programs are often expressed as collections of *statements*, where each statement represents a discrete instruction or command being given. + +In the hypothetical example above, semicolon characters (";") are used to mark the end of a statement, much like a fullstop (".") in written English. In fact, some computer languages do use fullstops for this purpose, though semicolons (C, C++, Java, C#, Pascal) are more common in modern languages. In other languages (e.g. Python, Tcl), statements may be terminated simply by the end of a line (ASCII Carriage Return and/or Line Feed characters). + +In most programming languages, a *sequential flow of control* is the default, meaning that statements are run one after the other, in order. In most languages, changing the order will change the behaviour of the program. + + +## Comments + +Program code can be dense, complex, and at times cryptic. 
The design rationale and development process that resulted in the code will often not be apparent from the final code. For this reason, most programming languages allow *comments* to be inserted into the code for the benefit of other programmers. The computer will ignore the comments, either treating them as a "no-op", skipping over them, or removing them from further processing. + +Comments can often be written in single-line or multi-line (block) form. Single-line comments take effect until the end of the line (i.e. until the next line break character), and may be introduced by delimiter characters such as "#" (Python, Tcl, Unix shell), "//" (C and friends), and "--" (SQL, Lua). + +Multi-line comment syntax often uses asymmetric delimiters, as in C-like languages ("/*" marks the start, and "*/" marks the end.) + +Restrictions may exist on nesting comments within other comments (e.g. a single-line comment within a block comment). + + +# Wildcards + +Another class of special character is used when performing an inexact match on a string or filename. Instead of having to specify an exhaustive list of filenames, a wildcard character (such as "*") can be used instead (much like a "wild card" in certain card games, or the blank letter tiles in Scrabble). The wildcard can function either as a generic placeholder, or it can be expanded by the system to a list of matching names. + +For example, the following shell command will remove all the files within the current working directory: + +```bash +rm * +``` + +Some languages distinguish between single-character and multi-character wildcards. For example, SQL's `LIKE` expression uses "_" to match any single character and "%" to match any string of any length (including the empty string). Similarly, wildcards in the style of MS-DOS support "?" for single character matches and "*" for any string. 
+
+For even more powerful string pattern matching, the [regular expression](https://en.wikipedia.org/wiki/Regular_expression) language goes far beyond the capabilities of basic wildcards.
+
+
+# String Concatenation
+
+String concatenation is the process of combining multiple character string values into one, e.g. "spider" + "web" = "spiderweb".
+
+Again, the syntax varies, but some common ways to perform concatenation in various languages are:
+
+* The `+` operator (Java, Python) (usually overloaded with numeric addition)
+* The `||` operator (SQL) (not to be confused with the logical OR operation in C-like languages)
+* The `&` operator (Visual Basic, Ada)
+* A function such as `CONCAT()`
+
+Some programs generate code to be run, often in a different language. One of the most common examples of this is the use of embedded SQL database access code within a host language such as Java. The code may be parameterised: for example, a search string might be substituted into an SQL statement in order to perform the desired search.
+
+The naive way to combine the parameter value would be to concatenate a variable's value with the literal text of the search query. Suppose we were dealing with a search query of the following form:
+
+```sql
+select Name, Price from Product where Name = 'calculator';
+```
+
+To allow searching for any product, we would need to generalise the product name, i.e. the "calculator" string, using a variable in place of the literal string.
+
+Often these sorts of queries will also make use of wildcard pattern matching, e.g.
+
+```sql
+select Name, Price from Product where Name like '%chocolate%';
+```
+
+When embedded in a host program (in Java, in this case), the SQL command itself might have to be expressed as a character string:
+
+```java
+String sql = "select Name, Price from Product where Name = '" + productName + "'";
+```
+
+Note also the tricky quoting: SQL uses single quotes to delimit strings, and Java uses double quotes.
Three values are being concatenated: the main part of the query, the product name variable, and a literal single quote to terminate the SQL string.
+
+This approach is highly vulnerable to injection attacks, as we will see.
\ No newline at end of file
diff --git a/tiddlers/content/labs/06/_Labs_05_Linguistics of Programming.md.meta b/tiddlers/content/labs/06/_Labs_05_Linguistics of Programming.md.meta
new file mode 100644
index 0000000..7132154
--- /dev/null
+++ b/tiddlers/content/labs/06/_Labs_05_Linguistics of Programming.md.meta
@@ -0,0 +1,4 @@
+section: 3
+tags: lab lab06 hidden
+title: $:/Labs/06/Linguistics of Programming
+type: text/x-markdown
diff --git a/tiddlers/content/labs/06/_Labs_05_SQL Injection.md b/tiddlers/content/labs/06/_Labs_05_SQL Injection.md
new file mode 100644
index 0000000..4e1e353
--- /dev/null
+++ b/tiddlers/content/labs/06/_Labs_05_SQL Injection.md
@@ -0,0 +1,3 @@
+You can download the [SQL injection demo code](https://blackboard.otago.ac.nz/bbcswebdav/pid-2711936-dt-content-rid-17263610_1/xid-17263610_1) from Blackboard, unzip it into a folder under your home folder, and use `gradle build` in the terminal to prepare the system. You will need to copy the resulting Web archive (`.war`) file into your Tomcat folder (from the earlier lab work on HTTP).
+
+We will step you through the process of crafting some "nasty" strings to enter into the system to conduct SQL injection attacks on the back-end database.
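To give a feel for what such a string looks like, here is a minimal sketch that reuses the `Product` search query built by concatenation. It is not the lab's demo code: it assumes Python's standard `sqlite3` module with an in-memory database, and the table contents, the two function names and the `nasty` value are all invented for illustration.

```python
import sqlite3

# Hypothetical product table, standing in for the back-end database.
conn = sqlite3.connect(":memory:")
conn.execute("create table Product (Name text, Price real)")
conn.executemany(
    "insert into Product (Name, Price) values (?, ?)",
    [("calculator", 19.95), ("chocolate", 3.50)],
)


def search_by_concatenation(product_name):
    # Vulnerable: the input becomes part of the SQL text itself.
    sql = "select Name, Price from Product where Name = '" + product_name + "'"
    return conn.execute(sql).fetchall()


def search_with_placeholder(product_name):
    # Safer: the input is passed separately and is only ever treated as data.
    sql = "select Name, Price from Product where Name = ?"
    return conn.execute(sql, (product_name,)).fetchall()


# The embedded single quotes close the SQL string early and add a condition
# that is always true, so the WHERE clause no longer filters anything.
nasty = "x' or '1'='1"

leaked = search_by_concatenation(nasty)
safe = search_with_placeholder(nasty)

print(len(leaked), "rows returned by the concatenated query")
print(len(safe), "rows returned by the parameterised query")
```

With concatenation, the statement that actually runs ends in `where Name = 'x' or '1'='1'`, which is true for every row. With a placeholder, the whole string is compared against `Name` as an ordinary value, so it matches nothing.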
\ No newline at end of file
diff --git a/tiddlers/content/labs/06/_Labs_05_SQL Injection.md.meta b/tiddlers/content/labs/06/_Labs_05_SQL Injection.md.meta
new file mode 100644
index 0000000..9480beb
--- /dev/null
+++ b/tiddlers/content/labs/06/_Labs_05_SQL Injection.md.meta
@@ -0,0 +1,4 @@
+section: 5
+tags: lab lab06 hidden
+title: $:/Labs/06/SQL Injection
+type: text/x-markdown
diff --git a/tiddlers/content/labs/06/_Labs_05_SQL crash course.md b/tiddlers/content/labs/06/_Labs_05_SQL crash course.md
new file mode 100644
index 0000000..0f62ae5
--- /dev/null
+++ b/tiddlers/content/labs/06/_Labs_05_SQL crash course.md
@@ -0,0 +1,47 @@
+Some of you may not have encountered the database language SQL before, so this section will introduce the basics. SQL databases are collections of named tables, each with a uniform column structure. Each column has a name and a type (e.g. text, numeric, date). Rows in a table have a value for each column (although SQL does provide for `NULL` markers for missing or unknown data). Each table should have a *primary key*: a set of columns whose values must, in combination, be unique across all rows in the table. Tables may also have *foreign keys*, which are sets of columns whose values must *reference* or correspond to values in another set of columns (usually the primary key columns in another table). For example, a table for paper enrolments would have a foreign key referencing the student, and another foreign key referencing the paper. A database design can be depicted using an Entity/Relationship Diagram (ERD).
+
+SQL statements are divided into a number of categories, primarily the data definition language (DDL) for creating and managing database objects such as tables, and the data manipulation language (DML) for querying and modifying user data.
+
+The most common DDL statement is `CREATE TABLE`, which creates a new table in the database with the specified structure:
+
+```sql
+create table Student (
+    Student_ID number,
+    Surname varchar,
+    First_Names varchar,
+    Birth_Date date,
+
+    constraint Student_PK primary key (Student_ID)
+);
+```
+
+Data can then be added to the table using the `INSERT` statement:
+
+```sql
+insert into Student (Student_ID, Surname, First_Names) values (123, 'Jones', 'Jenny');
+```
+
+or modified using the `UPDATE` statement:
+
+```sql
+update Student set First_Names = 'Jennifer' where Student_ID = 123;
+```
+
+or removed using the `DELETE` statement:
+
+```sql
+delete from Student where Student_ID = 123;
+```
+
+Data can be retrieved (queried) using the `SELECT` statement:
+
+```sql
+select Student_ID, First_Names || ' ' || Surname
+from Student
+where Student_ID = 123
+order by Surname, First_Names;
+```
+
+## Exercise
+
+Adapt this code (make up a few more rows of data) to set up a small student database using the [SQL Fiddle](http://sqlfiddle.com) site. Set the database engine (top left) to SQLite (WebSQL). Note that you will need to combine the DDL code and run it in the left-hand Schema Pane. You can then try running some queries in the Query Pane on the right.
\ No newline at end of file
diff --git a/tiddlers/content/labs/06/_Labs_05_SQL crash course.md.meta b/tiddlers/content/labs/06/_Labs_05_SQL crash course.md.meta
new file mode 100644
index 0000000..03ac54f
--- /dev/null
+++ b/tiddlers/content/labs/06/_Labs_05_SQL crash course.md.meta
@@ -0,0 +1,4 @@
+section: 4
+tags: lab lab06 hidden
+title: $:/Labs/06/SQL crash course
+type: text/x-markdown