ACID trips for architects

Today's DB Objectives:

Using Transactions and ACID properties
Try to break a DB
Some emergency db fixing techniques
Run an ORM IRL
More practice designing relational database architectures

ACID Transactions

OK, so we're starting to think about building big applications with databases. Let's say you want to make a business tool allowing two parties to edit and sign a contract. Party A updates the contract and simultaneously party B makes their own update to the same paragraph. Party A signs what they think is their version of the contract. What do we do?

There are two layers to this problem. The top layer is that your application logic needs to be aware that a signature is being made on an unread update. The bottom layer is that the database needs to manage two concurrent updates and avoid exploding.

Transactions are a way of thinking about interactions with databases. They represent small but logically complete units of interaction with a database. A user can force a database to start and commit a transaction and sometimes a db will do so under the hood for you. A transaction is said to pass the ACID test if it obeys the principles of ATOMICITY, CONSISTENCY, ISOLATION, DURABILITY.

Roughly ACID means that either the whole transaction happens or none of it happens, the state of the database will be valid after the transaction, concurrent transactions must result in the same state as equivalent serialized transactions, and that the results of a transaction will persist even if bad stuff happens.

TASK 1: Try to break an SQLite database (grab one of the PHP programs, connect and execute many updates from many tabs, interact from the command line during this, etc.)

TASK 2: Use BEGIN TRANSACTION and COMMIT to wrap up some SQL commands.

TASK 3: Begin a transaction, do some stuff (observe the journal) then kill the session. Now head back and see what the state of the database is.

Now let's explore some slides from the book on the topic (I think the first half is good enough to use here) Powerpoint or PDF.

Some Useful Emergency Procedures

It's tough to break a well made database. I've done more times that I would like. If you are in the SQLite session you can use .backup filename to backup the database to filename. If you need a bash procedure this has worked for me:


echo .dump | sqlite3 databasename > commands.sql
sqlite3 -init commands.sql backupdbname

The MySQL version looks like this:


mysqldump -h hostname -u username -pPASSWORD databasename > backup.sql
mysql -h hostname -u username -pPASSWORD databasename < backup.sql

TASK Go ahead and try it out, take a db, dump it into a file then start a new db from that file. Take a look at the file of commands and compare the two databases. The notes from Class 3 might help with the MySQL setup.

ORMs round 2

So at the end of the last class I made a case for using a data-access layer in your data-driven applications. I'm going to assume that very few people went home and built an app with an ORM so I'm going to walk you through setting up and using a PHP ORM. I figure if we really hello world something now then you'll be ten times more likely to use something like this in your future.

One more thing. Most of the headache in a new project happens in the hello world. Almost every major project (more than a week to code) I've worked on has had a slow setup process then sped along once the environment was solid. That means a lot of the setup is code that you get going once and kinda forget about it. Particularly if you can use a boilerplate setup from project to project. So this process now might be more valuable to you in the future than you think.

For this project I'm going to take the top google search library when you search php orm, propel. Take a second to look at their sandbox. Some of you might prefer to do a Node (I say we try Bookshelf with Um Zug) or Python version (here is a comparison list of Python ORMs) of this task, but it's late and I doubt I'll have time to make a complete walkthrough for other environments. While this might seem more painful than most installs, this is something which can help you grow a project from classroom-size to enterprise-size. Working on huge projects always takes some sacrifice.

The Setup

Create a new Cloud9 PHP environment
Follow the Via Composer steps at this link
export PATH=$PATH:/home/ubuntu/workspace/vendor/bin (this just puts the propel binary on your path for speed during this process)
wget http://db.prof.ninja/class7/uni.db (to load our small uni.db)
propel database:reverse sqlite:uni.db (this reverse engineers the parameters of that database)
propel init (the answers that matter, sqlite, and yes from existing)
mv generated-reversed-database/schema.xml .
propel build (this will create a lot of useful PHP classes, we could have made this cleaner by specifying a place to put them all)
propel convert-conf (this creates a php config file from our settings)

Node Setup

Fireup a node workspace
Install Knex, Bookshelf
node for an interactive terminal
Connect
Profit

Using it

Now we are ready to interact with the database using an Active Record style. To pull this off we'll need to autoload classes that we want to use (without loading all classes). So any PHP code that wants full access to this stuff can use the following three lines:


$autoloader = require 'vendor/autoload.php';
$autoloader->add('', __DIR__);
require './generated-conf/config.php';

TASKS: Use interactive mode for PHP by typing php -a

The ethos of this library is that every table is now a class with methods setColumnName(input), save(), delete(), toJSON(), toArray(), and more. Also for each table there is a TableQuery() which can chain filters and create objects from the tables. For instance our uni.db has a student table with a name column. We could get an array of "S" students by doing $students = StudentQuery::create()->filterByName('S%')->find()

Task A: A student failed CS-101, let's kick them out of school. (Using active record methods. Check out Embedding a Related Query on this page)

Task B: Create a new Music course.

Task C: Update Mozart's salary (pay him more if you think he deserves it).

Practice Architectures Part 2

I'm going to give you 4 squirrel games that I have sketched. Which one is most suited to active record? Which ones need the most active interaction with a database? Which ones need the least? How would you design the architecture as the backend engineer on these projects?

Trivia Wars Using an ELO/TrueSkill system, players challenge each other to a play-by-email / asynchronous trivia battle. When a user is ready to take a turn they get three minutes to string together correct answers to increasingly difficult trivia questions. When they get one wrong or their time is up they attack (hurl a salvo of nuts at...) the other player with a strength based on the total difficulty of their answers.
Winter is Coming Single player against an AI. Nut theives are after the nuts that you bury for winter. You can roam around burying them, or even build robot-helpers to help you collect nuts from trees. The enemies try to destroy the trees they find and dig up your nuts. You can build towers that first napalm at the enemies within a certain radius. Each round is another wave/summer.
World of Squirrelcraft Massively multi-player online RPG. You control a squirrel warrior with inventory, stats, experience, armor, a history of exploits, etc. At any given moment many squirrels are online battling each other and accomplishing missions in this huge forrest. Multiple servers are active to help latency.
Squirrel-ciety Class-based Education. A teacher sets up lessons and quizzes. When a student joins they join one of several empires (red, blue, green, etc.) as a squirrel baron. The empire is trying to grow their cities and build monuments. Whenever a student finishes a lesson they unlock a new worker (artist, builder, soldier, banker, etc). They put their workforce to work on projects for their empire (build this statue). Whenever such a project is finished the workers can now be reassigned. Over time the structures fall apart and quizzes must be solved to repair them.

Task write down your first shot at a working architecture for each project. If the three layers are client-side, server-side, and database which layers get which tasks?