opensky infrastructure

OpenSky Jonathan H. Wage Tuesday, May 8, 12

Upload: jonathan-wage

Post on 15-Jan-2015


Category:

Technology




TRANSCRIPT

Page 1: OpenSky Infrastructure

OpenSky

Jonathan H. Wage

Tuesday, May 8, 12

Page 2: OpenSky Infrastructure

Who am I?

• My name is Jonathan H. Wage
• Director of Technology at OpenSky.com
• Started the OpenSky Nashville office
• Open Source Software evangelist
• Long-time Symfony and Doctrine core contributor

Page 3: OpenSky Infrastructure

What is OpenSky?

• Social discovery shopping website
• Select your own team of people
• Experts, influencers and tastemakers from the fields of:
  – fashion
  – food
  – healthy living
  – home
  – kids
• They'll select the best products out there, just for you.

Page 4: OpenSky Infrastructure

1 Year of Business

• 1 year of business on April 1st 2012
  – 1.5 million users
  – 100 plus high profile curators
    • Martha Stewart
    • Alicia Silverstone
    • The Judds
    • Bobby Flay
    • Cake Boss
  – 10 million connections
  – 500k in revenue a week

Page 5: OpenSky Infrastructure

Offices

• Headquarters in Manhattan
• Satellite offices across the United States
  – Nashville
  – New Hampshire
  – Portland
  – San Francisco

Page 6: OpenSky Infrastructure

Technology Overview

• PHP (DEVO)
• Symfony2 (Framework)
• Doctrine2 (Database Persistence Libraries)
  – Object Relational Mapper + MySQL
  – Object Document Mapper + MongoDB
• Java (OSIS)
  – Mule (Framework)
  – HornetQ (Message Queue)
• MongoDB
• MySQL
• Memcached
• Varnish

Page 7: OpenSky Infrastructure

Technology Overview

– Apache
– Nginx
– Puppet (server configuration)
– Fabric (deploys)
– GitHub (source control)
– Jenkins
– pr-nightmare
  » Ruby bot that integrates GitHub pull requests with Jenkins.
– JIRA
– Nagios
– Statsd
– Graphite

Page 8: OpenSky Infrastructure

Basic System Structure

[Diagram. Components shown: Request, nginx, varnish/load balancer; DEVO web nodes in Group 1 (web1, web3, web5) and Group 2 (web2, web4, web6), each with a local hornetq; a VIP hornetq used for failover; the hornetq cluster and OSIS; MongoDB (one primary, two secondaries); MySQL (one master replicating to two slaves).]

Page 9: OpenSky Infrastructure

Databases

• MongoDB
• MySQL

Page 10: OpenSky Infrastructure

MongoDB

• What do we store in MongoDB?
  – Non-transactional CMS-type data
    • Products
    • Catalog
    • Categories
    • Follows
    • Offers
    • CMS Site Data
    • Other misc. non-mission-critical data

Page 11: OpenSky Infrastructure

MySQL

• What do we store in MySQL?
  – Important transactional data
    • Orders
    • Inventory
    • Stock items

Page 12: OpenSky Infrastructure

HornetQ

• Cluster of HornetQ nodes
  – HornetQ runs on each web node
  – DEVO sends messages to HornetQ
  – OSIS consumes messages from the HornetQ cluster and performs actions on the messages
    • interact with third party APIs
    • chunk the work and send multiple messages to other queues
    • upload images to S3

Page 13: OpenSky Infrastructure

HornetQ Failover

• If the local HornetQ on a web node is not available, the node fails over to a VIP
• Protects us from losing messages if we have an issue with a local HornetQ node.
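The failover rule above can be sketched in a few lines of PHP. This is only an illustration under stated assumptions, not OpenSky's client: `QueueClient`, `InMemoryClient` and `FailoverClient` are hypothetical names, and the in-memory clients stand in for the local HornetQ node and the VIP.

```php
<?php

// Hypothetical interface: anything that can accept a message.
interface QueueClient
{
    public function send(array $message): void;
}

// Stand-in for a HornetQ endpoint; $available simulates node health.
class InMemoryClient implements QueueClient
{
    public $sent = [];
    private $available;

    public function __construct(bool $available)
    {
        $this->available = $available;
    }

    public function send(array $message): void
    {
        if (!$this->available) {
            throw new RuntimeException('queue node unavailable');
        }
        $this->sent[] = $message;
    }
}

// Tries the local node first and falls back to the VIP on failure.
class FailoverClient implements QueueClient
{
    private $local;
    private $vip;

    public function __construct(QueueClient $local, QueueClient $vip)
    {
        $this->local = $local;
        $this->vip = $vip;
    }

    public function send(array $message): void
    {
        try {
            $this->local->send($message);
        } catch (RuntimeException $e) {
            // Local node is down: hand the message to the VIP instead.
            $this->vip->send($message);
        }
    }
}

$local = new InMemoryClient(false); // simulate a dead local node
$vip = new InMemoryClient(true);
$client = new FailoverClient($local, $vip);
$client->send(['event' => 'seller.follow']);
```

With the local node marked unavailable, the message ends up in the VIP stand-in rather than being lost.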

Page 14: OpenSky Infrastructure

Example

• Image uploads in DEVO admin
  – User uploads an image
  – Image is stored in MongoDB GridFS temporarily
  – Send a message to OSIS about the image
  – OSIS downloads the image and sends it to Amazon
  – When done, OSIS posts back to DEVO to update the database with the new URL
  – Image is served from CloudFront

Page 15: OpenSky Infrastructure

Example

• At OpenSky we listen to the seller.follow event and perform other actions
  – forward the event to OSIS
    • send e-mail for the follow
    • notify the Sailthru API of the user following the seller
  – log the follow to other databases, like an activity feed for the whole site
  – rules engines: when actions like “follow” are performed, we compare the action to a database of rules and act based on what the rule requires

Page 16: OpenSky Infrastructure

Why sit behind HornetQ?

• Ability to retry things when they fail
• Keep heavy and long running operations out of the scope of the web request
• Imagine your mail service goes down while users are registering. They will still get the join e-mail; it will just be delayed, since we don’t send the mail directly from DEVO. Instead we simply forward the user.create event to OSIS
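A rough sketch of the retry idea, assuming a consumer that requeues a message whenever its handler throws. `RetryingQueue` is a hypothetical in-memory stand-in for a durable queue, and the mail handler and e-mail address are made up for the example; the real system uses HornetQ and OSIS.

```php
<?php

// Hypothetical in-memory stand-in for a durable queue such as HornetQ.
class RetryingQueue
{
    private $messages = [];
    private $maxAttempts;

    public function __construct(int $maxAttempts)
    {
        $this->maxAttempts = $maxAttempts;
    }

    public function publish(array $payload): void
    {
        $this->messages[] = ['payload' => $payload, 'attempts' => 0];
    }

    // Runs the handler over every queued message. A failed message is
    // requeued until it has been attempted $maxAttempts times.
    // Returns the payloads that were given up on.
    public function drain(callable $handler): array
    {
        $dead = [];
        while ($this->messages) {
            $message = array_shift($this->messages);
            $message['attempts']++;
            try {
                $handler($message['payload']);
            } catch (Exception $e) {
                if ($message['attempts'] < $this->maxAttempts) {
                    $this->messages[] = $message; // retry later
                } else {
                    $dead[] = $message['payload'];
                }
            }
        }
        return $dead;
    }
}

// Simulated mail service that is down for the first two calls.
$calls = 0;
$delivered = [];
$sendJoinEmail = function (array $payload) use (&$calls, &$delivered) {
    $calls++;
    if ($calls <= 2) {
        throw new RuntimeException('mail service down');
    }
    $delivered[] = $payload['email'];
};

$queue = new RetryingQueue(5);
$queue->publish(['event' => 'user.create', 'email' => 'new.user@example.com']);
$dead = $queue->drain($sendJoinEmail);
```

The web request only has to publish the message; the join e-mail goes out once the mail service recovers.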

Page 17: OpenSky Infrastructure

DEVO

• Main components that make up DEVO
  – Symfony2
  – Doctrine2 ORM
  – Doctrine MongoDB ODM

Page 18: OpenSky Infrastructure

Domain Model

• Plain old PHP objects
  – MongoDB Documents
  – ORM Entities

    /** @ODM\Document(...) */
    class Seller
    {
        /** @ODM\Id */
        protected $id;

        // ...
    }

    /** @ODM\Document(...) */
    class User
    {
        /** @ODM\Id */
        protected $id;

        // ...
    }

    class SellerFollow
    {
        // ...

        /** @ODM\ReferenceOne(targetDocument="User") */
        protected $user;

        /** @ODM\ReferenceOne(targetDocument="Seller") */
        protected $seller;
    }

    /** @ODM\Document(...) */
    class Product
    {
        /** @ODM\Id */
        protected $id;

        // ...
    }

    /** @ORM\Entity(...) */
    class Order
    {
        /** @ORM\Id */
        protected $id;

        /** @ODM\ObjectId */
        protected $productId;

        /**
         * @Gedmo\ReferenceOne(
         *   type="document",
         *   targetDocument="Product",
         *   identifier="productId"
         * )
         */
        protected $product;

        // ...
    }

Page 19: OpenSky Infrastructure

Separate model and persistence

– Easier to test
– Don’t need a connection or mock connection in order to test the model, since it is just POPOs (plain old PHP objects)
– More flexible and portable. Make your model a dependency with a submodule
  • share across applications that are split up
  • more controlled change environment for the model and database
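To illustrate the testing point, here is a minimal sketch assuming simplified stand-ins for `Seller`, `User` and `SellerFollow`. The constructors, getters and sample values are assumptions made for the example; the mapping annotations are left out because plain objects do not need them to run.

```php
<?php

// Simplified stand-ins for the model classes (hypothetical fields).
class Seller
{
    private $slug;

    public function __construct(string $slug)
    {
        $this->slug = $slug;
    }

    public function getSlug(): string
    {
        return $this->slug;
    }
}

class User
{
    private $email;

    public function __construct(string $email)
    {
        $this->email = $email;
    }

    public function getEmail(): string
    {
        return $this->email;
    }
}

class SellerFollow
{
    private $seller;
    private $user;

    public function __construct(Seller $seller, User $user)
    {
        $this->seller = $seller;
        $this->user = $user;
    }

    public function getSeller(): Seller
    {
        return $this->seller;
    }

    public function getUser(): User
    {
        return $this->user;
    }
}

// No document manager, database connection or mock is involved here.
$follow = new SellerFollow(
    new Seller('marthastewart'),
    new User('user@example.com')
);
```

Because the objects carry no persistence logic, a unit test can build and inspect them directly.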

Page 20: OpenSky Infrastructure

Working with model

    $user = $dm->getRepository('User')
        ->createQueryBuilder()
        ->field('email')->equals('[email protected]')
        ->getQuery()
        ->getSingleResult();

    $seller = $dm->getRepository('Seller')
        ->createQueryBuilder()
        ->field('slug')->equals('marthastewart')
        ->getQuery()
        ->getSingleResult();

    $sellerFollow = new SellerFollow($seller, $user);

Page 21: OpenSky Infrastructure

Thin Controllers

• Keep controllers thin and delegate work to PHP libraries with clean and intuitive APIs
• A controller action in DEVO looks something like this:

    class FollowController
    {
        // ...

        public function follow($sellerSlug)
        {
            // $seller = $this->findSellerBySlug($sellerSlug);
            // $user = $this->getLoggedInUser();
            $this->followManager->follow($seller, $user);
        }

        // ...
    }

Page 22: OpenSky Infrastructure

Decoupled Code

• Functionality is abstracted away in libraries that are used in controllers.

    class FollowManager
    {
        // ...

        public function follow(Seller $seller, User $user)
        {
            // ...
        }
    }

Page 23: OpenSky Infrastructure

Decoupled Code

• Decoupled code is
  – better suited to unit testing
  – easier to understand
  – easier to maintain
  – easier to evolve and add features to
  – likely to have a longer life expectancy

Page 24: OpenSky Infrastructure

DEVO Events

• In DEVO we use events heavily for managing the execution of our own app code but also for communicating between systems

    <service id="app.listener.name" class="App\Listener\SellerFollowListener">
        <tag name="kernel.event_listener" event="seller.follow" method="onSellerFollow" />
    </service>

    class FollowManager
    {
        // ...

        public function follow(Seller $seller, User $user)
        {
            // ...

            $this->dispatcher->notify(new Event($seller, 'seller.follow', array(
                'user' => $user
            )));
        }
    }

Page 25: OpenSky Infrastructure

Listening to events

• Now we can listen to seller.follow and perform other actions when it happens.
  – Create SellerFollowListener::onSellerFollow()

    class SellerFollowListener
    {
        /**
         * Listens to 'seller.follow'
         */
        public function onSellerFollow(EventInterface $event)
        {
            $seller = $event->getSubject();
            $user = $event['user'];
            // do something
        }
    }
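For readers who have not used an event dispatcher, the mechanics can be sketched without any framework. `SimpleEvent` and `SimpleDispatcher` below are hypothetical, stripped-down stand-ins and not the Symfony classes DEVO uses; strings replace the Seller and User objects to keep the example short.

```php
<?php

// Minimal event object: a subject, a name and extra parameters.
class SimpleEvent
{
    private $subject;
    private $name;
    private $parameters;

    public function __construct($subject, string $name, array $parameters = [])
    {
        $this->subject = $subject;
        $this->name = $name;
        $this->parameters = $parameters;
    }

    public function getSubject()
    {
        return $this->subject;
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function get(string $key)
    {
        return $this->parameters[$key];
    }
}

// Minimal dispatcher: maps event names to lists of callables.
class SimpleDispatcher
{
    private $listeners = [];

    public function connect(string $name, callable $listener): void
    {
        $this->listeners[$name][] = $listener;
    }

    public function notify(SimpleEvent $event): void
    {
        // Only listeners registered for this event name are called.
        foreach ($this->listeners[$event->getName()] ?? [] as $listener) {
            $listener($event);
        }
    }
}

$log = [];
$dispatcher = new SimpleDispatcher();
$dispatcher->connect('seller.follow', function (SimpleEvent $event) use (&$log) {
    $log[] = sprintf('%s followed %s', $event->get('user'), $event->getSubject());
});

$dispatcher->notify(new SimpleEvent('marthastewart', 'seller.follow', ['user' => 'jwage']));
$dispatcher->notify(new SimpleEvent('marthastewart', 'seller.unfollow', ['user' => 'jwage']));
```

Only the `seller.follow` notification reaches the listener; the `seller.unfollow` one has no listener registered and is ignored.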

Page 26: OpenSky Infrastructure

Forwarding events to OSIS

• We also have a mechanism set up in DEVO to forward certain events to OSIS
• Configure EventForwarder to forward the seller.follow and seller.unfollow events

    <parameter key="memoryqueue.queue.seller">jms.queue.opensky.seller</parameter>

    <service id="follow.event_forwarder" class="EventForwarder" scope="container">
        <tag name="kernel.event_listener" event="seller.follow" method="forward" />
        <tag name="kernel.event_listener" event="seller.unfollow" method="forward" />
        <argument>%memoryqueue.queue.seller%</argument>
        <argument type="service" id="serializers.sellerFollower" />
        <argument type="service" id="memoryqueue.client" />
    </service>

Page 27: OpenSky Infrastructure

The EventForwarder

    class EventForwarder
    {
        protected $client;
        protected $queueName;
        protected $serializer;
        protected $logger;

        // The logger is optional: forward() checks for it before logging.
        public function __construct($queueName, AbstractSerializer $serializer, ClientInterface $client, LoggerInterface $logger = null)
        {
            $this->serializer = $serializer;
            $this->queueName = $queueName;
            $this->client = $client;
            $this->logger = $logger;
        }

        public function forward(Event $event)
        {
            $headers = array(
                BasicMessage::EVENT_NAME => $event->getName(),
                BasicMessage::HOSTNAME => php_uname('n'),
            );

            // Optional delayed delivery: absolute time in milliseconds.
            if ($event->has('delay')) {
                $headers['_HQ_SCHED_DELIVERY'] = (time() + $event->get('delay')) * 1000;
            }

            $parameters = $this->serializer->toArray($event->getSubject());

            $message = new BasicMessage();
            $message->setHeaders($headers);
            $message->setQueueName($this->queueName);
            $message->setParameters($parameters);

            if ($this->logger) {
                $this->logger->info(sprintf('Forwarding "%s" event to "%s"', $event->getName(), $this->queueName));
                $this->logger->debug('Message parameters: '.print_r($parameters, true));
            }

            $this->client->send($message);
        }
    }
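The delay handling in `forward()` is easy to get wrong by a factor of 1000, so here is the arithmetic in isolation. This is a sketch: `buildHeaders()` and the `eventName` key are hypothetical, the clock is passed in so the result can be checked, and the timestamp is an arbitrary fixed value. Only `_HQ_SCHED_DELIVERY` and the `(time() + delay) * 1000` formula come from the slide.

```php
<?php

// Builds headers with the same delay formula as forward(), but with
// the current time injected instead of calling time().
function buildHeaders(string $eventName, ?int $delaySeconds, int $now): array
{
    $headers = ['eventName' => $eventName]; // hypothetical header key

    if ($delaySeconds !== null) {
        // Absolute delivery time: seconds since the epoch, converted
        // to milliseconds.
        $headers['_HQ_SCHED_DELIVERY'] = ($now + $delaySeconds) * 1000;
    }

    return $headers;
}

$now = 1336492800; // arbitrary fixed Unix timestamp, in seconds
$delayed = buildHeaders('seller.follow', 300, $now);
$immediate = buildHeaders('seller.follow', null, $now);
```

A five minute delay therefore becomes an absolute timestamp 300,000 milliseconds after `$now`, and an event without a delay gets no scheduling header at all.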

Page 28: OpenSky Infrastructure


Development Lifecycle


Page 29: OpenSky Infrastructure

JIRA

• Manage product/development requests and workflow
  – Managing releases
  – What QA needs to test in each release
  – What a developer should be working on

Page 30: OpenSky Infrastructure

GitHub

• Pull requests
  – Code review/comments
  – Integration with Jenkins for continuous integration
• In-house GitHub
  – Keep sensitive information safe and in our control
    • passwords mainly
  – Ability to deploy when GitHub has issues
• git flow and project branches

Page 31: OpenSky Infrastructure

pr-nightmare

• Robot written in Ruby by Justin Hileman (@bobthecow)
  – Monitors pull requests on GitHub
  – Runs a Jenkins build for each pull request when it is first created and each time it is changed, then comments on the pull request with success or failure
  – Keeps our build always stable
  – pr-nightmare runs on a beast of a build server, and tests run in groups, so you get feedback fast

Page 32: OpenSky Infrastructure

fabric

• One click deploys
  – Makes deploying trivial
  – Get new functionality out into the wild fast
  – Hotfix issues quickly
• Example commands

    $ fab staging proxy.depp
    $ fab staging cron.stop
    $ fab staging ref:release/3.5.1 deploy

Page 33: OpenSky Infrastructure

No downtime deploys

• Web nodes split in two groups
  – group1
    • web1
    • web3
    • web5
  – group2
    • web2
    • web4
    • web6

Page 34: OpenSky Infrastructure

Deploy to one group at a time

• Start the deploy, build everything and distribute it to the web nodes but don’t make it live

    $ fab prod ref:v3.5.0 deploy.start

• Pull group1 from the load balancer so it is not receiving any traffic

    $ fab prod proxy.not_group1

• Finish deploy on the out nodes (group1)

    $ fab prod:out ref:v3.5.0 deploy.finish

• Test group1 and make sure everything is stable. Flip the groups in the load balancer so group1 with the new version starts getting traffic and group2 stops getting traffic

    $ fab prod proxy.flip

• Finish deploy on the out nodes (group2)

    $ fab prod:out ref:v3.5.0 deploy.finish

• Make all nodes live

    $ fab prod proxy.all
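The point of this sequence is that at least one group serves traffic at every step. A small simulation makes that checkable. This is a hypothetical model written for illustration, not the fabric tasks themselves: `DeploySimulation`, its method names and the starting version `v3.4.0` are all assumptions.

```php
<?php

// Hypothetical model of the load balancer and the two web groups.
class DeploySimulation
{
    public $live = ['group1' => true, 'group2' => true];
    public $version = ['group1' => 'v3.4.0', 'group2' => 'v3.4.0'];
    public $history = [];

    // Models proxy.not_group1: stop sending traffic to one group.
    public function pullOut(string $group): void
    {
        $this->live[$group] = false;
        $this->record();
    }

    // Models deploy.finish on prod:out: only the groups that are
    // currently out of the load balancer switch to the new version.
    public function finishOnOutNodes(string $ref): void
    {
        foreach ($this->live as $group => $isLive) {
            if (!$isLive) {
                $this->version[$group] = $ref;
            }
        }
        $this->record();
    }

    // Models proxy.flip: live groups go out, out groups go live.
    public function flip(): void
    {
        foreach ($this->live as $group => $isLive) {
            $this->live[$group] = !$isLive;
        }
        $this->record();
    }

    // Models proxy.all: every group receives traffic again.
    public function allLive(): void
    {
        foreach (array_keys($this->live) as $group) {
            $this->live[$group] = true;
        }
        $this->record();
    }

    // Records how many groups are serving traffic after each step.
    private function record(): void
    {
        $this->history[] = count(array_filter($this->live));
    }
}

$sim = new DeploySimulation();
$sim->pullOut('group1');
$sim->finishOnOutNodes('v3.5.0');
$sim->flip();
$sim->finishOnOutNodes('v3.5.0');
$sim->allLive();
```

In this model the number of live groups never drops to zero, and both groups end on the new version.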

Page 35: OpenSky Infrastructure

Depped

• When a deploy requires downtime we “depp” the site. Basically, we show a page with pictures of Johnny Depp.
• Depped:
  – To be put under the spell of Johnny Depp's charming and beautiful disposition.
• Depp the site with fabric and no nodes will receive traffic

    $ fab prod proxy.depp

Page 36: OpenSky Infrastructure

Database Migrations

• Deploys often require a migration to the database
  – Add new tables
  – Add new fields
  – Migrate some data
  – Rename fields
  – Remove deprecated data
  – Anything else you can imagine
• Try to make migrations backwards compatible to avoid downtime
• Eventual migrations
  • Migrate data on read and migrate when updated.
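An eventual migration can be sketched as a function applied on the read path. The example is hypothetical: the `name` to `firstName`/`lastName` change and the in-memory array standing in for a collection are assumptions, chosen only to show the pattern.

```php
<?php

// Upgrades an old-shape document to the new shape; documents that
// already have the new shape are returned untouched.
function migrateUserOnRead(array $doc): array
{
    if (isset($doc['name']) && !isset($doc['firstName'])) {
        $parts = explode(' ', $doc['name'], 2);
        $doc['firstName'] = $parts[0];
        $doc['lastName'] = $parts[1] ?? '';
        unset($doc['name']);
    }

    return $doc;
}

// Tiny in-memory "collection" standing in for the database.
$collection = [
    1 => ['name' => 'Jonathan Wage'],                       // old shape
    2 => ['firstName' => 'Martha', 'lastName' => 'Stewart'], // new shape
];

// Read path: the document is upgraded in memory.
$user = migrateUserOnRead($collection[1]);

// Write path: the next update persists the new shape.
$collection[1] = $user;
```

Old and new documents can coexist while the code is live, so no bulk migration or downtime is needed for this kind of change.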

Page 37: OpenSky Infrastructure

Database Migrations

• Doctrine Migrations library allows database changes to be managed with PHP code in GitHub and deployed with fabric
• Generate a new migration in DEVO

    $ ./app/console doctrine:migrations:generate

    class Version20120330114559 extends AbstractMigration
    {
        public function up(Schema $schema)
        {
        }

        public function down(Schema $schema)
        {
        }
    }

Page 38: OpenSky Infrastructure

Database Migrations

• Add SQL in the up() and down() methods.
• down() allows you to reverse migrations

    class Version20120330114559 extends AbstractMigration
    {
        public function up(Schema $schema)
        {
            $this->addSql('ALTER TABLE stock_items ADD forceSoldout TINYINT(1) NOT NULL DEFAULT 0');
        }

        public function down(Schema $schema)
        {
            $this->addSql('ALTER TABLE stock_items DROP COLUMN forceSoldout');
        }
    }

Page 39: OpenSky Infrastructure

Database Migrations

• Deploy migrations from the console

    $ ./app/console doctrine:migrations:migrate

• Deploying with fabric executes migrations if any new ones are available

Page 40: OpenSky Infrastructure

Database Migrations

• Our migrations live in a standalone git repository
• Linked to DEVO with a submodule
• Allows managed database changes to be deployed independently of a full fabric deploy, which requires pulling a group out of the load balancer.

Page 41: OpenSky Infrastructure

Use third party services

• Don’t reinvent the wheel outside of your core competency
  – Sailthru - transactional and marketing emails
  – Braintree - credit card processing
  – Vendornet - supplier drop-ship system
  – Fulfillment Works - managed warehouse
  – Kissmetrics & Google Analytics - analytics
  – Chartbeat - real time statistics

Page 42: OpenSky Infrastructure

Social Integration

• Facebook comments
  – Utilize the Facebook comments system instead of rolling our own
  – Integrate with our data model via FB.api for local comments storage
• Facebook timeline
  – Post OpenSky actions/activity to users' Facebook timelines
• Facebook sharing
• Pinterest sharing
• Twitter sharing

Page 43: OpenSky Infrastructure

Reporting/Data Warehouse

[Diagram. Sources shown: MongoDB, MySQL, Fulfillment Works, Braintree, Vendornet. They feed a MySQL Data Warehouse through Replication and ETL. Warehouse databases shown: flight_deck, jetstream_mongo, opensky_devo, atmosphere. Labels shown: mongo data, mysql slave, rollups/aggregates, views/procs.]

Page 44: OpenSky Infrastructure

flight_deck

• DEVO and other applications only need access to flight_deck
  – Set of stored procedures that run on cron, updating stats, aggregates, rollups, etc.
  – MySQL views to expose the data needed for dashboards, reports and other reporting user interfaces.

Page 45: OpenSky Infrastructure

Internal communications

– IRC
  • Day to day, most real time communication between teams is done on IRC
– Jabber
– GitHub Pull Request Comments
  • All code review is done in pull request comments
– Mumble
  • Push to talk voice chat used for fire fighting and deploys
– E-Mail lists

Page 46: OpenSky Infrastructure

Mumble

• http://www.mumble.com
  – Hosted mumble servers that are very affordable

Page 47: OpenSky Infrastructure

Jonathan H. Wage
http://twitter.com/jwage
http://github.com/jwage

Questions?

We’re hiring! [email protected]

PHP and Java Engineers
System Administrators
Frontend Engineers (Javascript, jQuery, HTML, CSS)
Quality Assurance (Selenium, Maven)
Application DBA (MySQL, MongoDB)