Day 4 - Models
TRANSCRIPT
Assignment 1 Review
Questions? Comments? Concerns? Too much? Too little? Anything we
should have covered more?
Pillars of Relational Databases
• ACID Compliance
• Atomicity
– All or nothing, multipart transactions
• Consistency
– Data written must be valid according to all rules
• Isolation
– If transactions are executed concurrently, the result is the same as if they had been executed serially
• Durability
– Once committed, the data is stored safely even in the event of a power failure or crash
How is Atomicity achieved?
BEGIN
INSERT INTO ….
UPDATE ….
DELETE …
ALTER TABLE …
DROP TABLE …
COMMIT / ROLLBACK
Transactions
BEGIN a transaction
Execute multiple statements against the database
COMMIT to finalize them and process all of them
ROLLBACK to cancel all of them
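The all-or-nothing behavior can be sketched in plain Ruby. This is a toy model, not how a real database implements transactions (databases use a write-ahead log rather than a full copy), but it shows the COMMIT/ROLLBACK contract: either every statement takes effect, or none do.

```ruby
# Toy illustration of all-or-nothing semantics: apply multi-step
# changes to a working copy, and only swap it in on success.
class ToyTable
  attr_reader :rows

  def initialize
    @rows = {}
  end

  # Yields a working copy; commits it only if the block finishes
  # without raising. Otherwise all changes are discarded.
  def transaction
    working = Marshal.load(Marshal.dump(@rows)) # deep copy
    yield working
    @rows = working # COMMIT: swap in the finished copy
  rescue => e
    raise e # ROLLBACK: @rows was never touched, nothing to undo
  end
end

accounts = ToyTable.new
accounts.transaction { |t| t[:alice] = 100; t[:bob] = 50 }

begin
  accounts.transaction do |t|
    t[:alice] -= 70
    raise "network failure" # the second step never happens
    t[:bob] += 70
  end
rescue RuntimeError
end

accounts.rows # => {alice: 100, bob: 50} — the partial debit was rolled back
```

Without the transaction wrapper, the failed transfer would have left Alice debited and Bob never credited — exactly the partial-write state ACID atomicity rules out.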
How is Consistency achieved?
Schemas
• Data types and sizes
• Allowed columns
• Defaults
• Null allowance
Unique indexes ensure that duplicate values are not allowed
Foreign keys validate ids and can remove dependent data in a cascading fashion
Constraints provide programmatic rules to check and verify column values
Triggers execute behavior based on observed changes in OTHER data in the system
Constraints
• Table Schemas
• Unique Indexes
• Foreign Keys
• Column Constraints
• Triggers
How is Isolation achieved?
Locking
• Table locking
– ALTER TABLE
• Row locking
– This row is being updated, so you can’t touch it
• Field locking
– Only this field of this row is being updated
Atomic Updates
• Instead of writing a value computed in the application
– my_counter = (3 + 1)
• let the database compute it
– my_counter = my_counter + 1
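The lost-update problem behind `my_counter = (3 + 1)` can be reproduced with plain Ruby threads standing in for concurrent database clients (a toy sketch, not SQL: the `sleep` just forces the unlucky interleaving).

```ruby
# Each worker reads the counter, computes read + 1 in application
# code, then writes it back. The sleep forces all reads to happen
# before any write, so updates are lost.
counter = 0
threads = 10.times.map do
  Thread.new do
    read = counter      # read the current value (like SELECT)
    sleep 0.05          # other workers sneak in here
    counter = read + 1  # write back a stale result (like SET x = 4)
  end
end
threads.each(&:join)
racy_total = counter    # far less than 10

# The atomic form ("my_counter = my_counter + 1") is one indivisible
# operation at the database; a Mutex plays that role here.
counter = 0
lock = Mutex.new
threads = 10.times.map do
  Thread.new { lock.synchronize { counter += 1 } }
end
threads.each(&:join)
counter # => 10
```

This is why `Model.update_all("my_counter = my_counter + 1")`-style atomic updates are safer than read-modify-write in application code: the database serializes the increment itself.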
MVCC
• While an update is happening on a row, you see the previous version rather than waiting for it to unlock
Locking
Atomic Updates
Multi-Version Concurrency Control
How is Durability achieved?
Committed data must be flushed to the hard drive (e.g., to a write-ahead log) before COMMIT returns
Also, synchronous replication ensures a backup copy is always available
This is pretty simple…
Race Conditions
Web Server 1
SELECT * FROM users WHERE username = 'new_dude'
INSERT INTO users (username) VALUES ('new_dude')
Web Server 2
SELECT * FROM users WHERE username = 'new_dude'
INSERT INTO users (username) VALUES ('new_dude')
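The two-web-server race above is a check-then-act bug: both servers SELECT, both find nothing, both INSERT. A toy reproduction in plain Ruby (an Array playing the users table, threads playing the servers; the `sleep` forces both checks to happen before either insert):

```ruby
# Both "servers" check for 'new_dude', find nothing, and both insert.
users = []
servers = 2.times.map do
  Thread.new do
    unless users.include?("new_dude") # SELECT finds nothing...
      sleep 0.2                       # ...on BOTH servers, then
      users << "new_dude"             # BOTH perform the INSERT
    end
  end
end
servers.each(&:join)
racy_count = users.count("new_dude") # => 2 — duplicate the app check missed

# A unique index makes check-and-insert a single atomic operation at
# the database; a Mutex around both steps plays that role here.
users = []
lock = Mutex.new
servers = 2.times.map do
  Thread.new do
    lock.synchronize do
      users << "new_dude" unless users.include?("new_dude")
    end
  end
end
servers.each(&:join)
users.count("new_dude") # => 1
```

This is why an application-level uniqueness check alone (e.g., `validates_uniqueness_of`) cannot guarantee uniqueness: only a unique index makes the database reject the second INSERT.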
Network Failure / Outage
Web Server
UPDATE my_table SET some_field = 'some data'
UPDATE related_data…FAILED
Database
WRITE UPDATE
Network Failure / Power Outage / Crash
So let’s talk about ActiveRecord
Callbacks
• :after_initialize
• :after_find
• :after_touch
• :before_validation
• :after_validation
• :before_save
• :around_save
• :after_save
• :before_create
• :around_create
• :after_create
• :before_update
• :around_update
• :after_update
• :before_destroy
• :around_destroy
• :after_destroy
• :after_commit
• :after_rollback
Hugely Convenient
• Usually reliable
• Great for 3rd party hooks
• Totally dependent on application code
• Can be bypassed easily though
Methods that bypass callbacks:
– decrement
– decrement_counter
– delete
– delete_all
– increment
– increment_counter
– toggle
– touch
– update_column
– update_columns
– update_all
– update_counters
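Why do those methods bypass callbacks? They write directly instead of going through the save path that runs the callback chain. A toy model (a sketch only — real ActiveRecord callbacks are far more involved, but `update_column` really does skip them):

```ruby
# save runs the callback chain; update_column writes the attribute
# directly, so anything a callback maintains silently goes stale.
class ToyModel
  attr_reader :attributes, :audit_log

  def initialize
    @attributes = {}
    @audit_log = []
  end

  def save(attrs)
    @audit_log << "before_save"   # callback fires
    @attributes.merge!(attrs)
    @audit_log << "after_save"    # callback fires
  end

  def update_column(name, value)
    @attributes[name] = value     # direct write: no callbacks fire
  end
end

record = ToyModel.new
record.save(name: "new_dude")
record.update_column(:name, "renamed")
record.audit_log # => ["before_save", "after_save"] — the rename left no trace
```

This is the "totally dependent on application code" tradeoff: anything enforced only in callbacks disappears the moment code (or another system) writes around them, whereas a database constraint always applies.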
The Tradeoff?
• How long does it take to let the database handle it?
– Does Rails make letting the database do it easy?
• Increment/Decrement – YES
• Unique Indexes – YES
• Custom data constraints – NO
• Stored Procedures – NO
• Database Triggers – NO
– How hard is it if I do it in the database?
• Stored Procedure – HARD
• Prebuilt Constraints or Triggers – NOT HARD (just run a migration)
• Are other systems accessing this database directly?
– Not yet, but how much time will it cost you if none EVER can?
– Internal API vs connecting to the DB
• How critical is this data if it gets messed up?
– Is money involved? Legal?
– Can the information be corrected if it gets out of sync? (Recount, Cache)
• How much time will I spend fixing this data if it gets messed up?
– How long will it take to find where the problem is and fix it VS making sure I never have to?
• What is the time cost of NOT doing this in the database?
– Does it take about the same amount of time?
Gems to make the database easier
• pg_search
– Comprehensive full text search, similarity, sound-alike
– Polymorphic multisearch
– Fix broken indexes
• textacular
– Simple text search
• squirm_rails
– PG stored procedures
• postgres_ext
– Common Table Expressions
– Network address datatypes
• postgresql_cursor
– Use a DB cursor to return a large dataset
• carrierwave-postgresql
– Store large files IN the database
• partitioned
– PG database partitioning for Rails
• octopus
– Multiple database connections for master-slave / sharding scenarios
• schema_plus
– Foreign keys
– Enum
– Indexes with conditions / expressions
– View creation from migrations
– Error types for database errors
• schema_associations
– Automatically define model relationships based on schema
• schema_validations
– Automatically define model validations based on DB rules
• activerecord-postgis-adapter
– Upgrade database
– Datatypes and migrations
• activerecord-postgis-earthdistance
– Calculate lat/lon distance for queries
So, let’s do it
• Go to your project and open a rails console
– Type `rails console`
– A fully loaded Rails application that you can code against
Build a query in parts
• Add conditions
• Update your query in multiple steps
• Change things with overrides
– http://guides.rubyonrails.org/active_record_querying.html#overriding-conditions
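The reason queries can be built in parts is that each `where`/`order` call returns a new relation holding the accumulated conditions, and nothing runs until the results are needed. A plain-Ruby sketch of that idea (`ToyRelation` is a made-up stand-in for `ActiveRecord::Relation`, backed by an in-memory array rather than SQL):

```ruby
# Each call returns a NEW relation with the extra condition; the
# "query" is only evaluated when to_a is called.
class ToyRelation
  def initialize(rows, filters: [], sort_key: nil)
    @rows, @filters, @sort_key = rows, filters, sort_key
  end

  # Add a condition; leaves the receiver untouched
  def where(&predicate)
    ToyRelation.new(@rows, filters: @filters + [predicate], sort_key: @sort_key)
  end

  def order(key)
    ToyRelation.new(@rows, filters: @filters, sort_key: key)
  end

  # Only here is the query actually executed
  def to_a
    result = @rows.select { |r| @filters.all? { |f| f.call(r) } }
    @sort_key ? result.sort_by { |r| r[@sort_key] } : result
  end
end

users = [
  { name: "ann", admin: true,  age: 41 },
  { name: "bob", admin: false, age: 25 },
  { name: "cal", admin: true,  age: 33 },
]

# Build the query in parts, adding conditions step by step
query = ToyRelation.new(users)
query = query.where { |u| u[:admin] }
query = query.order(:age)

query.to_a.map { |u| u[:name] } # => ["cal", "ann"]
```

Because each step returns a fresh relation, conditions can be added in branches (e.g., only when a filter param is present) before anything touches the database — the same pattern the Rails guide's overriding-conditions section builds on.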
Transactions and Locking
• http://guides.rubyonrails.org/active_record_querying.html#locking-records-for-update
PostgreSQL Custom Datatypes
• Excellent blog post that covers uuid, hstore, and array
• http://yousefourabi.com/blog/2014/03/rails-postgresql/
uuid
hstore
Array