Newer
Older
labs / tiddlers / content / labs / lab04 / _Labs_04_Insert Data.md

Before you can do anything else, you need to get some data into your tables. This is done using SQL’s INSERT statement, which has the following general form:

insert into <table name> [(<list of column names>)]
values (<list of value expressions>);

The list of column names after the table name is optional, but it’s generally good practice to include it. Why is this good practice? (Hint: Think about what could go wrong if you didn’t include the column names.)

Can you think of other reasons why you might want to explicitly list the column names? (Hint: Look at the columns in the Sample and Site tables.)


Why does it say “list of value expressions” rather than just “list of values” above? What are the advantages of this?


Scientist table

First, you’ll load some data into the Scientist table. For example, the following will insert a new Scientist row:

insert into Scientist (Scientist_Num, Surname, Other_Names, Email, Mobile_Phone)
values ('12345', 'Smith', 'Jane', 'jsmith@wet.co.nz', '+64 22 0191 1468');

Open DBeaver, connect to your PostgreSQL schema, and run the above INSERT statement on your Scientist table.

You may need to modify the column names in the INSERT statement to match your table. You may also need to modify the values to be compatible with the data types in your table, e.g., the values above may be longer than what you’ve allowed for, or you may need to remove the quotes from values that you stored as numbers rather than text (e.g., Scientist_Num). Ask for advice if you’re unsure.

If all goes well, you’ll see “1 row inserted.” appear in the Script Output panel at the bottom of the window.
What happens if you try to insert the same row again, and why? (Hint: Just run the INSERT statement again. Make sure you fully read the error message that appears in the Script Output panel.)


Change the data in the INSERT statement to correct the reported issue and try running it again. Did it work this time? If not, why not?


Continue to correct the data in the INSERT statement until you’re able to insert a second row into the Scientist table. Check that the data are actually there by running the statement

select * from Scientist;.

You should see two rows appear in the Query Result panel at the bottom of the window.
Invent some more Scientist data and insert them into the table. Five to ten rows will be plenty. Try to make the data at least somewhat realistic, e.g., don’t just use the same name or mobile number for all rows. Make sure you save a copy of your INSERT statements for future reference (either copy and paste them into a text document, or save the SQL Worksheet contents to a file).

Site table

You had to invent your own data for the Scientist table, but you already have data for Site table, in the first eight columns of the water quality CSV file from Lab 02. Open it in Excel and scroll down until you find some data in the CatchmentArea, CatchmentHeight, and Altitude columns. Create a few INSERT statements by copying and pasting the corresponding values from the CSV file. Include some rows that have values for CatchmentArea, CatchmentHeight, and Altitude, and some that don’t. For those that don’t, use NULL (without quotes) instead of a value. Make sure that each row is for a different region. Here are a couple of examples to get you started:

insert into Site (Site_ID, Region, Description, Latitude, Longitude,
                  Catchment_Area, Catchment_Height, Altitude)
values ('TK3', 'Tekapo', 'Opuha at Skipton Br.', -44.07888926, 170.9797435,
        458, 1020, 238);

insert into Site (Site_ID, Region, Description, Latitude, Longitude,
                  Catchment_Area, Catchment_Height, Altitude)
values ('AX1', 'Alexandra', 'Clutha at Luggate Br.', -44.72911152, 169.2805516,
        null, null, null);

Writing individual INSERT statements will clearly get quite tedious if there are a lot of rows to insert. For large amounts of data, you could use something to generate INSERT statements for you, but the best option is to use a bulk data loading tool that can directly import formats like CSV.

Is there a way to get Excel to generate INSERT statements for you, which you can then paste into SQL Developer?


Does DBeaver provide any way to import data in bulk from a disk file?


There are about 20 sites listed in the CSV data. Load the data for all of these into the Site table using one of the three methods mentioned above (manually-written INSERTs, generated INSERTs, or bulk import).

Sample table

Finally, you need to load the relevant CSV data into the Sample table. Here are some tips:

• The relevant columns in the CSV data are SiteID and RecordedOn. The latter contains a date/time value. To get this into your database, you’ll need to convert the value into a DATE using the TO_DATE function, e.g.,: to_date('6/01/2015 8:45', 'DD/MM/YYYY HH:MI').

• The CSV data specifies the site ID for each sample, but not the scientist. You’ll need to assign this yourself, drawing on the Scientist data you created.

• You’ll need to generate a sample ID value. You could just make this up yourself, but better options include using Excel to generate the IDs, or using a sequence (if the sample ID is stored as a NUMBER).

• There are no comments for any of the samples in the CSV data. For now, just leave the Comments column null. (Hint: If a column is allowed to be null, you can omit it from the list of column names in the INSERT statement, and the database will automatically insert a null into that column if it doesn’t have a default value.)

Why did we ask you to load data into the Scientist and Site tables before loading data into the Sample table? What would have happened if you tried to insert into Sample first?