# Taking Steps

School is for making relations

## Today's DB Objectives:

• Basics of Relational Algebra for Databases
• Practice the SELECT statement like a boss
• Learn how to setup and use MySQL
• MySQL vs. SQLite
• Connect to MySQL using PHP and Node
• Hear about the next project

### Lesson 1: Relations for Databases

So far we have a notion of databases as a bunch of tables which you can see as collections, with their rows seen as objects, and their columns seen as attributes. That is object-oriented thinking, and it works just fine. But in math we have a better model of they actually work.

The relational model. You might remember, from some distant math class, the notion of a binary relation. It was a pretty generic way of relating two things, and we do it daily: $$(a = b, y = f(x), n \leq m)$$ are all examples of binary relations. A binary relation is a set of ordered pairs, like $$\{(0,0), (1,1), (2,0), (0,2), (-1, 1), (1,-1), \cdots \}$$ is the set of pairs which define equality modulo 3.

Formally a relation is any subset of $$S_1 \times \cdots S_k$$ for any sets $$S_i$$. So vectors are relations, tuples are relations, and all of our database tables are relations. Our rows are instances of the relations defined by the table. In databases our sets are limited to "atomic" sets, i.e. things which are pretty simple as data. For instance: values in SQLite databases are limited to Integers, Reals, Text, Blobs (a blob of value stored as given), and the Null value.

Let's see this in action:


CREATE TABLE relation (v1 integer, v2 integer, v3 integer);
INSERT INTO relation VALUES (0, 1, 2)
INSERT INTO relation VALUES (3, 4, 5)
INSERT INTO relation VALUES (6, 7, 8)
INSERT INTO relation VALUES (9, 10, 11)



Mathematically that is saying, let $$\mathcal{R} := \{(0,1,2), (3,4,5), (6,7,8), (9,10,11) \}$$. Now you can see most SQL operations as (multi-)set operations and vice-versa. As a quick for instance: $$\mathcal{R} \times \mathcal{R}$$ is the cartesian product of the set $$\mathcal{R}$$ and itself. It has 16 elements, one for each pair of triples in the original set. We can get that relation in SQL by issuing the command: SELECT * FROM relation r1, relation r2;.

There are a set of common relational operations/functions that transform relations into different relations:

• Select $$\sigma$$ (this is a filter on the elements in the relation)
• Project $$\Pi$$ (this takes away entries in tuples)
• Union $$\cup$$ (this is the union of two relations defined on the same sets)
• Set Difference $$-$$ (this removes values in one relation from another)
• Cartesian Product $$\times$$ (this makes a new super-relation by combining every element in one relation with every element in the other)
• Intersection $$\cap$$ (the elements common to two relations)
• Natural Join $$\bowtie$$ (it's a cartesian product that combines identical values from the same set)

These each of a natural SQL counterpart. $$sigma$$ is the WHERE clause, $$Pi$$ is the SELECT clause, $$\times, \bowtie$$ are the FROM clause (just comma separate or NATURAL JOIN), while $$\cup , \cap , -$$ are the select combiners UNION, INTERSECT, and EXCEPT.

This also forms a new mental model of how results are selected by SQL statements. Every SQL statement creates a new relation, which acts like a table. So when you get your results back it is always as an array of rows, so to speak. Another observation is that SELECT results are created (logically, not implementation-wise) in the following order:

1. Use the FROM statement to create a master relation
2. Use the WHERE statement to toss out elements from that relation
3. Use the SELECT statement to decide how to display those elements

To practice this new viewpoint: fire up a DB (maybe chinook, maybe scrabble, maybe your posts). Now imagine each table as a set of tuples. Combine the tuples of two small sets using $$\times$$ (select * from table1, table2;) and using $$\bowtie$$ (select * from table1 natural join table2;). How are the results different? If you have an error you might need to give the tables aliases table1 t1. Now try UNION or INTERSECT using two different WHERE clauses and try to predict your outcome.

### Lesson 2: SELECT 2.0

Now that we can see tables and select queries as relations, let's really explore the cool tricks we can pull off and then start a scavenger hunt.

The set-up, our book uses a university database which you can grab from db-book.com. Grab the DDL file which defines the tables, and then grab the file for creating small relations which populates those tables. You can create a db and run the commands in that db with the following command-line pattern:

sqlite3 the.db < code.sql 
This "pipes" in the file as a set of commands that gets executed in the new db.

So create a db, or download my version and access it with an sqlite terminal (if you have one locally then great, otherwise head up to cloud9). For each task write an SQL statement that solves the question.

First task: find the names of all students that have at least 1 computer science class.

Second task: find name and id of all students that have never taken a class earlier than Fall 2009.

Third task: find the department which has the lowest maximum salary of all departments.

All instructors that are paid less than the lowest paid computer science professor.


select name from instructor where salary < (
select min(salary) from instructor where dept_name = 'Comp. Sci.');


All courses taught in both Fall 2009 and in Spring 2010:


select course_id from section as S where semester='Fall' and
year=2009 and exists (select *
from section as T where
semester = 'Spring' and year=2010 and S.course_id = T.course_id);



Which department pays out the most in total salary, and how much?


select dept_name, max(tot_salary) from (
select dept_name, sum(salary) as tot_salary
from instructor group by dept_name) dept_total;



### Lessons 3-5: MySQL 101

##### Step 1: Install MySQL and PHPMyAdmin

I want you to create an instance of MySQL on a cloud9 environment. Create a table, insert a row. I'm up past my bedtime so I'm going to give you docs and a promise of help in class:


mysql-ctl install
mysql-ctl cli


Ask me to show you my antiques DB.

##### Step 2: Create the university database in MySQL

OK, so getting the same University DB to work in MySQL took a little bit of massaging:

In general MySQL is a process which is served on port 3306, you connect to it as a user (with a password). The process will be managing many databases. The general connection command is like this:


mysql -h localhost -u myname -p mydb


On cloud9 I had a bizarre socket problem which was fixed by connecting using a fully specified localhost and port:


mysql -h 127.0.0.1 -P 3306 -u root -p


###### Step: Create a Database

Show databases; to see the current databases.

Create database dbnamehere; to create a new database.

use dbnamehere; to switch to that database directly.

###### Create a User

 CREATE USER 'testuser'@'localhost' IDENTIFIED BY 'test'; 

 GRANT ALL PRIVILEGES ON *.* TO 'testuser'@'localhost'; 

 FLUSH PRIVILEGES; 

###### Create University Tables

I had to remove the foreign key constraints from the above DDL.sql, here is today's MySQL friendly DDL

Run it like this:

mysql -h 127.0.0.1 -P 3306 -u testuser -p university < DDL.sql 

###### Populate University Tables

Populate it like this:

mysql -h 127.0.0.1 -P 3306 -u testuser -p university < smallRelationsInsertFile.sql 

NEXT TASK: now create a query which calculates each student's GPA.

### Comparing MySQL to SQLite

What makes MySQL different than SQLite?

A great starter guide to MySQL, note the differences

Connect to your mysql database using PHP and NodeJS? I don't have time to give sample code, let's try it together if we have time in class.

### Last Topic: Intro to the next project

I want you to build a six degrees (of Kevin Bacon) engine. Use IMDB's data, an ECE workspace, MySQL, and some ingenuity. I require that you find connections between two actors/actresses respecting some restraints: the app should allow specifying acceptable decades of movies, a list of possible directors, a list of possible screen writers, and/or popularity of the stars.

##### Collaboration and project expectations

I've created a Piazza site, if you did not get an email to join it then let me know and I'll get you in there. You can ask questions through that and other students can answer too.

Here are some guidelines for making a GitHub repo out of any folder, which might help with sharing code or saving cloud9 code.

Here are my project submission guidelines, follow them for all projects including the chinook data project.