Advanced PHP and MySQL: Some Adventures and Experiments
DIG 4104c – Spring 2013
J. M. Moshell
-3 - -3 -
Midterm Exam Results
Cumulative grades not available – not all presentations
(finish those today)
BUT – most projects & presentations 85% to 95%, "orbiting
around A/B", so the Midterm, Final & Project 3 are 75% of
overall score.
MTX: 225-250: Like an A. 200-224: Like a B. 150-199: C-ish
-4 - -4 -
The Rest of the Semester (by popular request):
PHP and MySQL
SOAP and Web Services
Evaluating Web Services: Classroom Feedback Systems
Commercial Payment Systems & E-Commerce
Security Adventures and PCI
-6 - -6 -
Context: Registration Systems Lab
PHP custom-coded registration system
MySQL database (one per conference)
Uses several credit card gateways (client owned) as well as RSL's own authorize.net gateway
-7 - -7 -
25 to 30 conferences/year
We charge $9 to $14 per registrant
We had 26 conferences in 2012
Employees:
Carole Mann, President
David Mann, IT Manager
Mandy Mann, Conference Manager
+2 ladies and one part time professor/designer
-8 - -8 -
Context: Registration Systems Lab
Specialized feature: multiple gateways for one (complex) conference.
Problem: Hackers are growing more sophisticated
PCI (Payment Card Industry) compliance – getting tougher
-9 - -9 -
System Architecture
[Figure: system architecture diagram. Columns: conf. code, core code, gateways. Labels: ICSE13, MJAA13, PLDI13, core011, icse13, mjaa13, mjmembers, ieee, rsl, mjaa, ... etc. Connection label: cURL.]
-12 - -12 -
Today's problem: Insider Attack
Assume a hacker wants to capture our clients' credit card info.
Assume they're already inside our system, can modify code.
(We consider how to keep 'em out ... later)
What can we do to stop these bandits?
Idea 1: Don't keep cc info in the database.
- This is a basic rule for PCI* compliance -
* PCI: Payment Card Industry (the standard is maintained by the PCI Security Standards Council)
-13 - -13 -
Today's problem: Insider Attack
Idea 2: Develop a system to detect any changes to your code.
A kind of 'burglar alarm'.
Design constraints:
* must run whenever the code runs, to prevent use when contaminated.
* must not impair the system's functionality:
== no noticeable loss of speed
== no frequent interruptions of service
-14 - -14 -
Developing a Burglar Alarm
Attacking the Burglar Alarm Idea
1) What if bandito replaces 100% of your code?
* Must have a periodic external 'audit' to detect this ploy.
* Unless this audit runs frequently, SOME data will be lost.
2) What if the bandito scans your code and deactivates the alarm?
* Don't make this easy for them.
-15 - -15 -
Developing a Burglar Alarm
Some axioms of computer security:
1) Nothing is going to work ALL the time. You need layers.
2) Humans are the weakest point in the system. Automate it!
3) Security by Obscurity is a weak basis for a design. But you
must start somewhere.
== >> ACT NOW << ==
-16 - -16 -
Digital Signatures
Why don't we just make a duplicate copy of the software, and compare for modifications?
[Diagram: ICSE13 compared (=?) with its copy ICSE13a; core011 compared (=?) with its copy core011a]
-17 - -17 -
Easy solutions that don't work
1) Comparing a hundred files over & over ... inefficient
2) The bandit could simply modify BOTH copies
-19 - -19 -
What about a signature?
?? Can we design a unique shadow of some kind
which is (a) fast to compute, (b) unique, and (c) informative?
Fast: Something built into PHP's system, not a PHP loop
across 100+ files, 150,000 lines of code.
Unique: every different code-set has a different shadow.
Informative: if shadow1 != shadow2,
what does that tell us?
-20 - -20 -
Comes the HASH CODE:
A hash code is produced by an algorithm.
Input: a body of data (e. g. a text file.)
Output: a big integer or string.
Properties:
1) Same input tomorrow yields same output.
2) Different inputs are very unlikely to yield same output.
3) Process is not reversible.
-21 - -21 -
Really dumb HASH CODE:
Take in all the characters, convert to numbers,
add 'em up. Throw away high order digits.
This is a text for which we want the hash code.
84
104
105
etc...
---------
453664 now the 4 digit hash is 3664.
Change any text letter ... hashcode (probably) changes.
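The "really dumb" hash above can be sketched in a few lines of PHP. This is only an illustration of the idea on the slide; the function name dumb_hash is made up for this example and is not part of the course code.

```php
<?php
// Toy checksum from the slide: add up the character codes, keep the low 4 digits.
// dumb_hash is a hypothetical name used only for this illustration.
function dumb_hash(string $text): int
{
    $sum = 0;
    for ($i = 0; $i < strlen($text); $i++) {
        $sum += ord($text[$i]);   // 'T' -> 84, 'h' -> 104, 'i' -> 105, ...
    }
    return $sum % 10000;          // throw away the high-order digits
}

echo dumb_hash("This is a text for which we want the hash code."), "\n";

// Weakness: reordering the letters leaves the sum unchanged.
var_dump(dumb_hash("stop") === dumb_hash("post"));   // bool(true)
```

The last line shows why this is "dumb": two different inputs with the same letters collide, so property 2 (different inputs rarely give the same output) fails badly.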
-22 - -22 -
Really smart HASH CODE:
sha1 is a hash algorithm built into PHP
- widely used for cryptographic purposes
- used for creating unique keys in git
- input: any file of up to 2^64 bits (a LARGE number)
- it's quite fast, because it's widely used & needed
- Produces something like this:
4b5437055d8adaeb9b47c7dfda18f400907cc146
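A minimal sketch of the two built-in calls involved: sha1() hashes a string and sha1_file() hashes a file's contents. The wrapper name file_signature is hypothetical, chosen only for this example.

```php
<?php
// sha1() takes a string and returns 40 hexadecimal characters.
echo sha1("abc"), "\n";   // a9993e364706816aba3e25717850c26c9cd0d89d

// file_signature is a hypothetical helper name, not from the course code.
function file_signature(string $path): ?string
{
    if (!is_file($path)) {
        return null;                 // nothing to sign
    }
    $sha = sha1_file($path);         // hashes the file without a PHP-level loop
    return $sha === false ? null : $sha;
}
```

Calling file_signature on the same unchanged file tomorrow gives the same 40 characters (property 1); changing one byte of the file is expected to give a completely different string (property 2).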
-23 - -23 -
The architectural concept of the Alarm:
First line of defense: self-checking against a stored signature.
(Hidden, somewhere in our file hierarchy)
[Diagram: ICSE13 and core011 each run a self-check against hidden signature files (ICSE13a signature, core011a signature)]
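One way the first line of defense could look in code, as a sketch only: compute one signature over a set of code files and compare it with a stored value. The names codeset_signature and self_check, and the idea of a one-line signature file, are assumptions for illustration; the production version is not shown in these slides.

```php
<?php
// Hypothetical helper: one signature for a whole set of code files.
function codeset_signature(array $paths): string
{
    sort($paths);                                  // same order every run
    $combined = '';
    foreach ($paths as $p) {
        $combined .= $p . ':' . sha1_file($p) . "\n";
    }
    return sha1($combined);
}

// Hypothetical self-check: true only if the code still matches the stored signature.
function self_check(array $paths, string $signatureFile): bool
{
    if (!is_file($signatureFile)) {
        return false;                              // missing signature counts as an alarm
    }
    $stored = trim(file_get_contents($signatureFile));
    return hash_equals($stored, codeset_signature($paths));
}
```

A caller would run self_check at the top of a request and refuse to continue when it returns false. This version only says that something changed, not which file.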
-24 - -24 -
The architectural concept of the Alarm:
Second line of defense: periodic audit checks
against signatures on a DIFFERENT computer
[Diagram: a local audit agent checks ICSE13 and core011; a remote audit manager compares against remote signature files (ICSE13a signature, core011a signature)]
-27 - -27 -
Focus on the first line of defense:
How would you attack this system? And why that's hard:
1) Find the hidden signature files
* our system has 11 GB in 17,000 files
* filenames not known
2) Find the self-check code in ICSE13 or in core011
* our system has lots of places to look (and what are you looking for?)
-28 - -28 -
Here's a partial list of the code modules:
and it's not going to be called "security scan" ... !
-29 - -29 -
A common tactic: Trigger an error message
and then search the code base for that error message.
* Defenses:
1) generate your error messages from a database
2) scramble the source code so it's unsearchable.
* But remember ... Security by Obscurity is a weak defense!
-30 - -30 -
Another common tactic: run an image of the code
Bandit copies our code and runs it in his own WAMP environment.
He looks for file accesses, and for error messages when files are not found.
* Remedy: use file_exists to check for files,
only write to files already found.
* Turn off error messages, so no squawks if files not found.
* A VERY GOOD hacker will get you anyway, by hacking
PHP itself. But maybe we're too much trouble ....
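The file_exists remedy above might be sketched like this. The helper name quiet_append is hypothetical; the point is only that the code never creates a missing file and never prints a complaint about one.

```php
<?php
// Keep PHP from displaying errors to whoever is running the code.
ini_set('display_errors', '0');

// quiet_append is a hypothetical helper name for this illustration.
function quiet_append(string $path, string $line): bool
{
    // Only write to files that are already there; stay silent otherwise.
    if (!file_exists($path) || !is_file($path) || !is_writable($path)) {
        return false;
    }
    return @file_put_contents($path, $line . "\n", FILE_APPEND | LOCK_EX) !== false;
}
```

Run on a copied code image where the hidden file is absent, this returns false quietly, so the missing file leaves no error message to search for.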
-31 - -31 -
Designing our Burglar Alarm
Criterion 3:
Informative: if shadow1 != shadow2,
what does that tell us?
We want our signature to not only holler BURGLAR!
but to tell us which "room" he's in,
so that we can examine the attack.
-33 - -33 -
An idea: An XML signature
directory1
file1
directory2
file2
file3
file4
<rsl>
<dir>
<name>directory1</name>
<file>
<name>file1</name>
<sha>3f4eaa7843...</sha>
</file><file>
<name>file2</name>
<sha>a7844afed...</sha>
</file>
</dir>
etc
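A sketch of how XML in this shape could be produced with PHP's SimpleXML extension. The function name build_signature_xml and the input array layout (directory name => file name => sha) are assumptions for illustration, not the course's xemit code.

```php
<?php
// Hypothetical builder: $tree maps directory name => [file name => sha string].
function build_signature_xml(array $tree): string
{
    $root = new SimpleXMLElement('<rsl/>');
    foreach ($tree as $dirName => $files) {
        $dir = $root->addChild('dir');
        $dir->name = $dirName;            // property assignment escapes special characters
        foreach ($files as $fileName => $sha) {
            $file = $dir->addChild('file');
            $file->name = $fileName;
            $file->sha = $sha;
        }
    }
    return $root->asXML();
}

echo build_signature_xml([
    'directory1' => ['file1' => sha1('one'), 'file2' => sha1('two')],
]);
```

Each file gets its own sha element, which is what later lets a comparison name the file that changed.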
-34 - -34 -
Compare two signatures.
Where sha
don't match,
retrieve the filename
and report it.
<rsl>
<dir>
<name>directory1</name>
<file>
<name>file1</name>
<sha>3f4eaa7843...</sha>
</file><file>
<name>file2</name>
<sha>a7844afed...</sha>
</file>
</dir>
etc
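A possible comparator, sketched with SimpleXML rather than the string scanning used in the course prototype. The function names signature_map and changed_files are hypothetical, and the sketch assumes both signatures are complete, well-formed XML in the layout shown above.

```php
<?php
// Hypothetical helper: flatten a signature into "dir/file" => sha.
function signature_map(string $xml): array
{
    $map = [];
    $root = new SimpleXMLElement($xml);
    foreach ($root->dir as $dir) {
        foreach ($dir->file as $file) {
            $map[(string)$dir->name . '/' . (string)$file->name] = (string)$file->sha;
        }
    }
    return $map;
}

// Hypothetical comparator: report which "rooms" differ between two scans.
function changed_files(string $oldXml, string $newXml): array
{
    $old = signature_map($oldXml);
    $new = signature_map($newXml);
    $report = [];
    foreach ($new as $path => $sha) {
        if (!array_key_exists($path, $old)) {
            $report[$path] = 'added';
        } elseif ($old[$path] !== $sha) {
            $report[$path] = 'changed';
        }
    }
    foreach ($old as $path => $sha) {
        if (!array_key_exists($path, $new)) {
            $report[$path] = 'missing';
        }
    }
    ksort($report);
    return $report;
}
```

An empty result means no mismatch; otherwise each key names a file and says whether it was changed, added, or removed.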
-35 - -35 -
So now I know what my tasks are.
1) read the directory structure
2) construct an XML representation, with sha for each file
3) construct a comparator that can report file with difference
4) build Level 1 (self-test) into a conference
(both for core and for conference-specific code)
5) build Level 2 (auditor test) into moma, across all conferences
-36 - -36 -
Step 1: Prototype directory reading
prototype code hmm1.php
Key PHP functions: YOUR job: understand, investigate or ASK!
You need to know WHAT it does, and WHY I used it.
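$d = dir($path);                         // open a Directory handle on $path
$entry = $d->read();                     // next entry name, or false when done
file_exists($filepath);                  // true if the file or directory exists
$fs = filesize($filepath);               // size in bytes
$fstuff = implode('', file($filepath));  // whole file as one string
$fsha = sha1($fstuff);                   // 40-character hex signature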
-37 - -37 -
Step 1: Prototype directory reading
prototype code hmm1.php
Key programming techniques:
1) Show your results in detail (with <table>)
to make it easier to diagnose and debug
2) Recursion: dirget CALLS ITSELF!
3) Limiting recursion. Why do we exclude path '.' ?
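A minimal sketch of a recursive directory reader built from the functions listed for Step 1. It is not hmm1.php; the name dirget_sketch and the nested-array return value are assumptions for illustration, and sha1_file stands in for the implode/file/sha1 combination.

```php
<?php
// Hypothetical recursive scan: returns [entry => sha] for files and
// [entry => nested array] for subdirectories.
function dirget_sketch(string $path): array
{
    $result = [];
    $d = dir($path);
    if ($d === false) {
        return $result;                       // unreadable directory: report nothing
    }
    while (($entry = $d->read()) !== false) {
        // '.' is this directory and '..' is its parent; following either
        // one would make the recursion loop or climb out of the tree.
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $full = $path . DIRECTORY_SEPARATOR . $entry;
        if (is_dir($full)) {
            $result[$entry] = dirget_sketch($full);   // the function calls itself
        } else {
            $result[$entry] = sha1_file($full);
        }
    }
    $d->close();
    ksort($result);                           // stable order between scans
    return $result;
}
```

Dumping the result with print_r, or into a table as suggested above, makes it easy to see what each level of the recursion found.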
-39 - -39 -
PRACTICE PROBLEM #1
Note: There will be several Practice Problems through this
lecture. If you want an A on the final exam,
WORK MOST OR ALL OF THEM.
If you want to not get a demerit for next Monday's lecture,
WORK AT LEAST ONE OF THEM.
Your entire team can work the same one, as long as you
can demonstrate and explain your results.
-40 - -40 -
PRACTICE PROBLEM #1
Take the demo program hmm1.php and modify
it so that it simply prints out a nice looking, hierarchical listing
of the contents of the directory to which it is pointed.
example:
test1
file1.php
file2.php
test2
file3.php
file4.php
-41 - -41 -
Step 2: XML
Prototype hmmXML2.php
Goal: create an XML text file that stores the results of
the directory traverse from prototype 1.
Method: Find a working XML example, and "steal" elements
of it.
The example function 'xemit' is my "resource mine".
-42 - -42 -
Step 2: XML
Prototype hmmXML2.php
examine 'xemit'. Note how it wants an xml string as a 'seed'.
I discover that the example's XML string seed setup requires
a specific syntax (left over from VERY EARLY PHP.)
Analyze hmmXML2.php.
Identify the key new commands.
-43 - -43 -
PRACTICE PROBLEM #2
"Retrograde" Example hmmXML2.php
That is, make it write the file 'xout.html' from the
movie example, rather than from the directory system.
Note: at this point I'm using an old function 'textsaver'
that was designed to write out arrays of text. But I have
only one 'line' of text (i. e. one string variable) and so
I put it into an array cell, $text[0].
-44 - -44 -
Step 3: Read a stored file & compare
Skipping forward to prototype hmmXML6.php:
Read MAIN to see what's happening:
1. Load a file named xdata.xml (the previous scan.)
2. store this text in $xtext1.
3. Do the dirget magic to create new $xtext2.
4. Write this as the NEW xdata.xml file
5. Now we scan for a mismatch, using substr.
if no mismatch, print "no mismatch found"
else try to find the <file> tag and say WHERE!
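Step 5 might be sketched as follows: walk both strings with substr until they differ, then back up to the nearest <file> tag. This is an illustration under assumptions, not the code in hmmXML6.php; first_mismatch and enclosing_file_block are hypothetical names.

```php
<?php
// Hypothetical: index of the first differing character, or -1 if identical.
function first_mismatch(string $a, string $b): int
{
    $n = min(strlen($a), strlen($b));
    for ($i = 0; $i < $n; $i++) {
        if (substr($a, $i, 1) !== substr($b, $i, 1)) {
            return $i;
        }
    }
    return strlen($a) === strlen($b) ? -1 : $n;   // one string is a prefix of the other
}

// Hypothetical: the <file>...</file> block that contains position $pos.
function enclosing_file_block(string $xml, int $pos): ?string
{
    $start = strrpos(substr($xml, 0, $pos + 1), '<file>');   // last <file> at or before $pos
    if ($start === false) {
        return null;
    }
    $end = strpos($xml, '</file>', $start);
    if ($end === false) {
        return null;
    }
    return substr($xml, $start, $end - $start + strlen('</file>'));
}
```

If first_mismatch returns -1, print "no mismatch found"; otherwise the block returned by enclosing_file_block contains the name of the file that changed. Note that this approach reports only the first difference.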
-45 - -45 -
Step 4: Production Code
I have replaced critical information with xyz in the
fourth ("production") version, as it's embedded in live commercial code.
Demonstrate with localhost:icse13
control=xyz; then modify regsystem.php, try xyzcheck
then try control=regtest
examine the function 'unspooger'.
Discuss how vulnerable this code REALLY is ...
Dreamweaver can seek out the word 'correct' in <1 second.
-46 -
Part 2: MySQL Extended Example
In 3134 we do 'toy' problems with small tables.
In RSL we have real-world databases (complex, but small).
Table structure:
-47 -
Part 2: MySQL Extended Example
In 3134 we do 'toy' problems with small tables.
Objectives of the system:
1) Flexibility: each conference has different data needs, but
we DO NOT want a unique database structure for each.
2) Historical record: We need to know all additions, deletions,
errors and corrections. This is accounting for big bucks.
So – we analyzed Drupal's table structure and stole (much of) it.
-48 -
Part 2: MySQL Extended Example
users:
attendee number, login ID, password (encrypted),
salt
(We'll discuss 'salt' in the security lecture.)
transactions:
transaction ID, attendee number, date, time, worker
So a given user can have any number of transactions
Identified by 'tid' (transaction ID) an integer.
-50 -
Part 2: MySQL Extended Example
A transaction tracks 4 kinds of information:
transtrings: Any data not financial, e. g. names, addresses.
tid, fieldname, fieldvalue (up to 50 characters)
trantexts: like transtrings but can have text of ANY size
tid, fieldname, fieldvalue (any size)
tranumbers: how many of something the person buys
tid, fieldname, value, attendee type, paywhen, annotation
tranmoney: payments, refunds, balances due
tid, fieldname, amount, payclass, ..when, .. etc
-51 -
Part 2: MySQL Extended Example
transtrings: Any data not financial, e. g. names, addresses.
tid, fieldname, fieldvalue (up to 50 characters)
Example: Attendee 15001 – Joe Bloe
transaction: attnum=15001 tid = 1 date = 3 Jan 13
transtrings:
tid   fieldname   fieldvalue
1     lastname    Bloe
1     firstname   Joe
1     address1    345 River Street
etc ... you can store INFINITE detail with simple structure.
-52 -
Part 2: MySQL Extended Example
transtrings: Later ... change his address
Example: Attendee 15001 – Joe Bloe
transaction: attnum=15001 tid = 1 date = 3 Jan 13
transaction: attnum=15001 tid = 192 date = 5 Mar 13
transtrings:
tid   fieldname   fieldvalue
1     lastname    Bloe
1     firstname   Joe
1     address1    345 River Street
192   address1    678 Elm Street
etc ... demonstrate history report.
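Because old rows are never overwritten, the current value of a field has to be derived from the history. A sketch in plain PHP, using the Joe Bloe rows above as in-memory arrays: the names current_values and field_history are hypothetical, and the sketch assumes a larger tid means a later transaction (the real system may order by date instead).

```php
<?php
// Hypothetical: latest value per fieldname, assuming higher tid = newer.
function current_values(array $rows): array
{
    usort($rows, function ($a, $b) {
        return $a['tid'] <=> $b['tid'];
    });
    $current = [];
    foreach ($rows as $row) {
        $current[$row['fieldname']] = $row['fieldvalue'];   // later rows overwrite earlier ones
    }
    return $current;
}

// Hypothetical: every value a field has ever had, oldest first.
function field_history(array $rows, string $fieldname): array
{
    $hits = array_values(array_filter($rows, function ($r) use ($fieldname) {
        return $r['fieldname'] === $fieldname;
    }));
    usort($hits, function ($a, $b) {
        return $a['tid'] <=> $b['tid'];
    });
    return array_column($hits, 'fieldvalue');
}

$rows = [
    ['tid' => 1,   'fieldname' => 'lastname',  'fieldvalue' => 'Bloe'],
    ['tid' => 1,   'fieldname' => 'firstname', 'fieldvalue' => 'Joe'],
    ['tid' => 1,   'fieldname' => 'address1',  'fieldvalue' => '345 River Street'],
    ['tid' => 192, 'fieldname' => 'address1',  'fieldvalue' => '678 Elm Street'],
];
print_r(current_values($rows));
print_r(field_history($rows, 'address1'));
```

The same structure serves both needs: the current address for a badge, and the full trail of changes for the accountants.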
-53 -
Focus briefly on tranmoney:
tranmoney: payments, refunds, balances due
tid, fieldname, amount, payclass, paywhen, paytype,
cctransactionid, ccapprovalcode,ccexpdate, annotation
'fieldname': like moneybaldue, moneypayment
'payclass': which of the gateways was used
'paywhen': early, late, onsite
'paytype': visa, mc, american express, cash, check, etc.
'cctransactionid etc': codes to track back to the gateway
'annotation': worker can explain unusual situations here
-54 -
Part 2: MySQL Extended Example
This structure lends itself to producing complex reports ... e. g.
-55 -
Part 2: MySQL Extended Example
This structure requires complex queries, e. g.
$q="SELECT fieldname, sum(fieldvalue),tranumbers.tid, transactions.attnum
FROM tranumbers,transactions,batchlinks
WHERE tranumbers.fieldname LIKE 't%'
AND tranumbers.tid=transactions.tid
AND transactions.attnum=batchlinks.attnum
AND batchlinks.batchname='$thisbatch'
GROUP BY fieldname";
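The query above is a double-quoted PHP string, so PHP substitutes the value of $thisbatch before MySQL ever sees the text. A small sketch of just that mechanism; the helper name batch_filter_clause and the batch name 'icse13-early' are made up for illustration.

```php
<?php
// Hypothetical helper isolating one line of the query.
function batch_filter_clause(string $thisbatch): string
{
    // Double quotes: PHP replaces $thisbatch with its value. The single
    // quotes are ordinary characters that end up as SQL string delimiters.
    return "AND batchlinks.batchname='$thisbatch'";
}

echo batch_filter_clause('icse13-early'), "\n";
// AND batchlinks.batchname='icse13-early'

// A value that contains a quote changes the SQL text itself:
echo batch_filter_clause("x' OR '1'='1"), "\n";
// AND batchlinks.batchname='x' OR '1'='1'
```

The second call shows why values pasted into a query must come from a trusted source or be escaped; prepared statements with placeholders (available in both PDO and mysqli) are the usual way to avoid the problem.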
-57 -
Let's analyze this query:
$q="SELECT paytype, paywhen, date, SUM(amount), COUNT(amount)
FROM tranmoney, transactions, batchlinks
WHERE fieldname='moneypayment$showpayclass'
AND tranmoney.tid=transactions.tid
AND transactions.attnum=batchlinks.attnum
AND batchlinks.batchname='$thisbatch'
AND tranmoney.amount>0
GROUP BY paytype, paywhen, date";
Typical questions: Why is a variable like $thisbatch in the middle
of the query $q?
Why is the term 'fieldname' not specified as 'tranmoney.fieldname'?
What is the effect of GROUP? Relation to COUNT?
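To think about the GROUP question, it can help to do by hand what the database does. The sketch below imitates GROUP BY paytype, paywhen, date with SUM(amount) and COUNT(amount) over a few made-up rows in plain PHP; group_payments and the sample data are hypothetical, and the joins from the real query are left out.

```php
<?php
// Hypothetical emulation of: WHERE amount > 0 GROUP BY paytype, paywhen, date
function group_payments(array $rows): array
{
    $groups = [];
    foreach ($rows as $r) {
        if ($r['amount'] <= 0) {
            continue;                                   // the WHERE amount>0 filter
        }
        $key = $r['paytype'] . '|' . $r['paywhen'] . '|' . $r['date'];
        if (!isset($groups[$key])) {
            $groups[$key] = [
                'paytype' => $r['paytype'],
                'paywhen' => $r['paywhen'],
                'date'    => $r['date'],
                'sum'     => 0,                         // SUM(amount)
                'count'   => 0,                         // COUNT(amount)
            ];
        }
        $groups[$key]['sum'] += $r['amount'];
        $groups[$key]['count']++;
    }
    return array_values($groups);                       // one output row per distinct key
}

$rows = [
    ['paytype' => 'visa', 'paywhen' => 'early', 'date' => '2013-01-03', 'amount' => 100],
    ['paytype' => 'visa', 'paywhen' => 'early', 'date' => '2013-01-03', 'amount' => 50],
    ['paytype' => 'mc',   'paywhen' => 'late',  'date' => '2013-03-05', 'amount' => 75],
    ['paytype' => 'visa', 'paywhen' => 'early', 'date' => '2013-01-03', 'amount' => -20],
];
print_r(group_payments($rows));
```

Four input rows collapse into two output rows, and the count on each output row says how many input rows were folded into it.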
-58 -
How to gain experience with MySQL?
I don't want to make up a toy problem for you
or pull out a chunk of my working code
(like I have done with the 'security' activity)
So ... the best way to gain more experience with MySQL
is to attack a problem that you need to solve
and take advantage of Adam and me, while you have us!
-59 -
FOR NEXT WEEK:
1) The programming mini-projects set out above
2) Read
http://en.wikipedia.org/wiki/SOAP
http://www.w3schools.com/webservices/ws_intro.asp
http://www.w3schools.com/soap/soap_intro.asp
3) Make up an imaginary SOAP service to provide some
simple information (such as your weight, today)
4) Compose an example message to your imaginary
service, to request the information it can provide.