TREVOR JOYNSON
fighting the good fight against entropy.

Software Engineer & DevOps Hybrid
github, linkedin, youtube, career@trevor.joynson.io, 330-353-8738

OBJECTIVE

Multi-disciplined Software Engineer specializing in DevOps, with 12+ years of experience across a wide range of environments from health care to government, looking to fully utilize and grow my skill set doing what I love while working with a fantastic set of human beings. I am happiest with a healthy balance of software and devops engineering on my plate. Let’s do great things.

SKILL SET

I’ve done my share of design, implementation, and scaling of both the backend and the infrastructure for a number of different projects, varying widely in scale from personal toy to many millions of unique visitors per month. Well versed in best practices for redundancy, performance, concurrency, security, and practical application/infrastructure design/implementation. Adept at understanding and troubleshooting large, complex, and/or unknown systems. That all said, I would love to go deeper, truly.

Linux*, Python 2/3*/Cython*, Go, Shell Zsh*/Bash4, SecOps*, Networking, C/C++ | django*, flask*, sanic*, aiohttp*, mux*, celery*, asyncio*, ROS*

Databases PostgreSQL* | Redis* | Elastic* | MariaDB/MySQL

Clouds/PaaS/IaaS Kubernetes* | AWS* | Mesos/Marathon/Chronos | GCE/Google* | Dokku | Deis | SoftLayer | Baremetal*

Containers Docker*, LXD* – Used heavily since the year each was released. Before those, I used LXC, and before that, linux-vserver.

Orchestration SaltStack*, Terraform*, Puppet, Ansible, CloudFormation

Service Discovery Etcd* | Consul* | ZK | DNS*

Networking Linux* | Cisco IOS/PIX | JunOS | Infiniband* (SDR → FDR) | Multi-tenant | Advanced Routing* | Redundant Architectures* | QoS*

Storage ZFS* | SCST*/LIO/DRBD | iSCSI/SRP*/iSER* | GlusterFS* | Ceph*

CI/CD Jenkins | Drone | GitLab | buildbot* | fab | SaltStack*

Various nginx* | squid* | varnish* | haproxy* | sensu* | grafana* | kibana* | memcached | fluentd* | aptly* | devpi* | REST*

Eligible for Top Secret clearance, and that’s all I can say about that.

(favorites have been asterisked*)

Come on then, I will swear to study so to know the thing I am forbid to know –Shakespeare via Berowne


2018 – Now | OsaroAI | Software Engineer (DevOps)

• Implemented numerous concurrency fixes for our complex applications, covering shared-memory handling and lock-less implementations. One example: a blazing fast lock-less implementation of Python events backed by Sys-V shared memory that could be reused for shared resources across concurrent Python processes.

• Packaged all of our applications along with numerous [previously] unpackaged compiled libraries, simplifying installation and usage. Osaro went from having no packaging at all to having private packages for development as well as debs for deployment to customers. Because N deployments had to be updated independently, I opted for a mixture of a private PyPI and S3-hosted deb repositories per deployment (using some scripts I wrote to automate aptly). Previously, package inter-dependencies were handled via a combination of git submodules and shell scripts that were not cross-platform compatible.

• Created a template repository with examples of how to set up a test suite and packaging for new projects, including debianization, setuptools, tox, pytest, a Dockerfile, and docker-compose, among others. Brought Docker and docker-compose into proper development and production usage for all projects, and streamlined them for easy reuse. I came up with a method to run tensorflow-gpu even on CPU-only machines, creating docker images that transparently ran on both GPU and CPU nodes without any changes.

• Implemented a dataset uploader and viewer for our uploaded training datasets. The viewer spawned a k8s job on demand to extract the data for viewing, and utilized some more modern paradigms for the frontend, e.g. asynchronous preloading for usability’s sake.

• Worked to increase the reliability of our depth cameras by attaching Intel RealSense depth cameras to embedded boxes, which let us run more cameras than the USB bandwidth of a single node could reliably support and reset them from software, as they had a tendency to just crash hard. This involved creating a custom low-latency, performance-optimized Linux kernel patchset/build for our HLC boxes with hand-picked fixes for USB3 bandwidth and our depth camera models; integrating and fixing bugs in the software that controls USB hubs to switch power on USB ports; PXE booting tiny boards supporting USB3; and in-software live 802.3af (PoE) power control of nodes via Cisco switches.

• Debugged numerous segfaults with gdb and traced them to a patch that was accepted into Python 2.x but never carried over to Python 3.x, relating to memory management of compiled extensions during interpreter shutdown. If you want my analysis notes just ask!

• Deployed SaltStack and reduced our entire 40-page HLC installation and setup procedure to a single command. Also automated Ceph cluster setup, Kubernetes deployment to Paperspace, WireGuard VPN services, SSH key management, etc.

• Implemented 12-factor configuration that optionally integrated with Django’s settings and automatically generated click command-line parameters, reducing boilerplate.

• Cut test run times on multiple projects roughly 5x on average, from ~25 minutes to 5 minutes. This was done by parallelizing the tests within each run; distributing tests across GPUs and CPUs in a balanced fashion according to which nodes were not in use; setting up a custom CI server using buildbot and implementing features in buildbot such as k8s and GPU support; making our test suite GPU-aware (many of our tensorflow tests were math-heavy and sped up considerably on GPUs); optimizing the test suite itself; and optimizing how we handled caching across runs.

• Maintained AWS infrastructure; created and deployed Helm charts for applications. An example is available at osaroai/localshop.
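
The 12-factor configuration bullet above can be sketched roughly as follows. This is a minimal illustration, not the original code: it swaps in the standard library's argparse for click so it runs without third-party packages, and the `Settings` fields, the `APP_` prefix, and the `build_parser`/`load_settings` helpers are all hypothetical names.

```python
import argparse
import os
from dataclasses import dataclass, fields


@dataclass
class Settings:
    """Hypothetical settings; each field becomes an env var and a CLI flag."""
    debug: bool = False
    workers: int = 4
    database_url: str = "postgres://localhost/app"


def _parse_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


def build_parser(prefix="APP_", environ=None) -> argparse.ArgumentParser:
    """Generate one CLI option per field, defaulting to the env var if set."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    for f in fields(Settings):
        kind = type(f.default)  # infer the converter from the default value
        convert = _parse_bool if kind is bool else kind
        env_name = prefix + f.name.upper()
        raw = environ.get(env_name)
        parser.add_argument(
            "--" + f.name.replace("_", "-"),
            type=convert,
            default=convert(raw) if raw is not None else f.default,
            help="env: " + env_name,
        )
    return parser


def load_settings(argv=None, environ=None) -> Settings:
    """Precedence: CLI flag, then environment variable, then field default."""
    args = build_parser(environ=environ).parse_args(argv)
    return Settings(**vars(args))
```

With this shape, adding a field to `Settings` is the only step needed to expose a new option: `load_settings(["--workers", "8"])` overrides `APP_WORKERS`, which in turn overrides the default of 4.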

PROJECTS (Feel free to check my pinned repositories on GitHub as well as the projects playlist on my YouTube!)

• doxy Automagic socks/https/http proxy, DNS server, and browser extension for safe *and* convenient local containerized development.

• mainline Python DI framework. I wrote this for work originally, so there's pretty docs.

• ledbot Social LED matrix control. Plays media urls from Slack, API, and MQTT.

• beerbot Future beer delivery robot using inference via CUDA on a TX2 to recognize fridges, beer selection, and people. Currently it can drive around autonomously, builds a constantly updated map of its environment using lidar and a depth camera, and most recently can recognize fridges and people. Primarily a learning exercise.

• boilerplate Boilerplate eradication for an immutable world. I’m somewhat grudgingly adding this here, as it was originally just a group of scripts I had as an idea years ago, but it has proven immensely useful in building containerized applications in a clean way.

• uninhibited Dead simple and easily extensible a?sync event handling in Python via callback or class dispatch.

• salty-whales Layered Salt states to build containers.

• pysanity Emulates sanity for other people's dirty ass non-pep compliant code via only the dirtiest of means

• http-parser Extremely fast Cython wrapper for Joyent/Node/nginx’s C HTTP parser.


2016 – 2018 | Disqus, Inc | Senior DevOps Engineer

• Brought timings for all creation/rate-limited endpoints (e.g. post comment, logins) down about 50% (1.5s → 0.7s) by rewriting the code controlling our caching layer. A bug in the old code caused it to leak connections into C land. The rewrite fixed that and also dramatically improved our test suite, cutting a few minutes off each run!

• Pushed for a k8s deployment since it’s so wonderful, and finally got my chance with a new project. Created a nice Helm chart for Airflow components and hooked it into CI by rewriting our aging CI build process in a much simpler fashion. This sped up all builds by another few minutes by allowing caching to be shared among our build slaves, equating to a 5x speedup on non-trivial diffs. Other benefits I squeezed in: the end state is either all deployed or none, with a staged slow rollout in between, where tests have to show each component service node as online and ready for traffic or the deploy is cancelled.

• During a time of less than optimal cash flow, pushed for and implemented my share of an idea that ended up saving us 30k/month (switching CDN providers). We couldn’t easily switch in full, so I had another idea: as a temporary hack for a large instant gain, put one CDN in front of the other. It still works to this day and, along with other improvements we accomplished, helped save us from extinction.

• Reimplemented our email sending layer to track bounces and ARF spam reports from providers, even in the case of redacted content, using a method similar to what I did with my emagnifigance project at LocSol. The end result is that our IPs are no longer marked as spam on lists. I’ve outlined further stages of this project to assist delivery of important messages.

• We’d been running out of space on our primary postgres datastore nodes and their read slaves for a long time. I implemented ZFS across them, one node at a time, using the experience I gained from creating ZFS SANs at LocSol to build something that could easily withstand our IO workload. The new nodes far overperformed in both IOPS and IO bandwidth while saving a solid 35% of space on our 16TB of SSDs per node from transparent compression alone. They run like champs.

• Reimplemented a sizable portion of our SaltStack infrastructure that predated me and was being [mis]used as if it were Puppet, which was causing us pain. Wrote states as required while we migrated machines/roles from Puppet to Salt, along with long-overdue distro upgrades.

• Containerized our applications, both new and old, in a clean and sane fashion, based upon my docker boilerplate repository. This includes fully transparent ways of using containerized applications (ask me about this, I’m rather excited about it still!)

2014–2016 | Vertical Knowledge (VK) | DevOps Lead → Senior Software Engineer

• Worked closely with government agencies as a subcontractor on a few classified projects.

• Led quite a few movements in the company: 12-factor application development, automation, configuration management, cloud orchestration (previously they were hand-baking AMIs), continuous integration and deployment, Docker in development for all new projects, and eventually Docker in production using Mesos/Marathon, ECS, Rancher, and Kubernetes according to each project’s needs. Implemented SaltStack for orchestration and configuration management.

• Led the evolution of security policies as well as the implementation of best practices in areas such as CA management, GPG, SSL everywhere, multi-factor, centralized logging, CI/CD across many projects, etc.

• Developed a distributed, scalable web proxy to handle our specific needs for proxying traffic from our spiders, in Python 3.5 using Tornado. Utilized multi-processing heavily to work around the GIL’s nasty existence, as well as asynchronous logging through a dedicated thread, since Python logging is quite a hog at scale. Performed analytics and took actions based upon the results, such as rate limiting and detecting registered banned responses/status codes from sites. When a ban was detected, it would back off of that node and site exponentially, then ramp back up. The request was then replayed (transparent to the client) to a different download proxy according to registered limits based on rules. As a part of this project, I wrote Cython bindings for node’s C HTTP parser library for performance. A very early version is on my GitHub: akatrevorjay/http-parser

• Distributed requests among a large array of down line proxies across the globe, according to registered rules.
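
The ban back-off described above could look something like this sketch. It is not the original proxy code: the `BanBackoff` class, its parameter values, and the choice to forgive one strike per successful request are assumptions made for illustration.

```python
import time
from collections import defaultdict


class BanBackoff:
    """Tracks a per-(proxy node, site) delay that doubles on each detected ban."""

    def __init__(self, base=1.0, cap=300.0, clock=time.monotonic):
        self.base = base    # first delay in seconds (assumed value)
        self.cap = cap      # upper bound on the delay (assumed value)
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self._strikes = defaultdict(int)  # outstanding bans per (node, site)
        self._blocked_until = {}          # earliest next use per (node, site)

    def record_ban(self, node, site):
        """Register a ban and return the delay now imposed on this pair."""
        key = (node, site)
        self._strikes[key] += 1
        delay = min(self.cap, self.base * 2 ** (self._strikes[key] - 1))
        self._blocked_until[key] = self.clock() + delay
        return delay

    def record_success(self, node, site):
        """Ramp back up gradually: forgive one strike per successful request."""
        key = (node, site)
        if self._strikes[key] > 0:
            self._strikes[key] -= 1

    def available(self, node, site):
        return self.clock() >= self._blocked_until.get((node, site), float("-inf"))

    def pick(self, nodes, site):
        """Return the first node not currently backed off for this site, if any."""
        for node in nodes:
            if self.available(node, site):
                return node
        return None
```

Keying the state on the (node, site) pair means a ban from one site only sidelines that node for that site; the same node stays usable elsewhere, and `pick` is where a replayed request would be routed to a different node.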

EDUCATION

2004–2006 | University of Akron | Computer Science (via post-secondary)

2009–2010 | DeVry University | Electrical Engineering (Left to start LocSol)


2010–2014 | Localhost Solutions (LocSol) | Co-Founder

• Built the company from the ground up; as tech lead, designed and implemented the entire infrastructure of a hosted services company providing virtual servers, web/email hosting, VDI (Xen/View/SPICE), and managed VOIP solutions.

• Deployed nearly every device and piece of software used: Juniper, HP C7000 Blade Enclosures, Cisco, OpenStack, KVM, VMware, and Ceph, to start.

• Deployed Salt configuration automation to put an end to manual configuration, automating as much as possible.

• Deployed Infiniband networking on the backend for that low-latency RDMA goodness.

• Wrote mass communication API in Python for use by other projects; utilized MongoDB for database backend. Used to send templated emails and SMS messages to a large list of destinations. Provided to clients strictly not for spam, but for applications and websites as a service.

• Designed and built an HA storage backend using SRP/iSCSI → DRBD → ZFS. This ended up being packaged up and sold to clients. I wrote a Python management interface for it, utilizing ZeroMQ for networking, with beacon-based cluster autodiscovery, a transactional shared key-value store for configuration, MongoDB for logging and analytics, and Cubism for metrics display. This management interface handled cluster management, provided a CLI, provided the HA portion of the Active/Passive cluster via heartbeats and health checks, and offered log access and a metrics view.
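
As a rough illustration of the heartbeat-and-health-check failover mentioned above, the passive side of an active/passive pair might be modeled like this. The `PassiveNode` class, its timeout value, and the promotion rule are hypothetical and not taken from the original management interface; transport (ZeroMQ in the original) is left out entirely.

```python
import time


class PassiveNode:
    """Passive half of an active/passive pair; promotes itself when the peer goes quiet."""

    def __init__(self, timeout=3.0, health_check=lambda: True, clock=time.monotonic):
        self.timeout = timeout            # seconds of silence before failover (assumed)
        self.health_check = health_check  # local readiness check supplied by the caller
        self.clock = clock                # injectable so the logic can be tested
        self.last_heartbeat = clock()
        self.role = "passive"

    def on_heartbeat(self):
        """Call whenever a heartbeat arrives from the active peer."""
        self.last_heartbeat = self.clock()

    def tick(self):
        """Call periodically; returns the role after evaluating the peer's state."""
        silent_for = self.clock() - self.last_heartbeat
        if self.role == "passive" and silent_for > self.timeout and self.health_check():
            self.role = "active"
        return self.role
```

The health check gates promotion so that a node which is itself unhealthy never takes over. A production cluster would also need some form of fencing so that a partitioned but still-running peer cannot keep writing after the takeover.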

2010–2011 | DroidMod | Co-Founder

• Our team (of four) was the first to discover a novel root method and deliver it either free via manual installation, or for a small donation ($5) on the Play store, at the time called Market. We made $17,000 in a single month from donations alone! Our app used that newly found power to install our recovery image to flash as well as our AOSP fork.

• Created our own recovery modifications for the device, and got past automated updates and bootloader security by simply making the bootloader, and therefore [temporarily] the kernel, believe that the original mtd partitions still existed. Worked around the limited recovery space by using a unionfs with a ramdisk loaded from the sdcard to facilitate charging in recovery.

• Brought up our own Gerrit instance to manage changes to AOSP, used that to field pull requests for review. Automated CI testing per commit.

2007–2012 | CTMS | Technician → Lead Consultant → R&D Engineer

• Completely revamped their Linux web/spam filtering appliance and became the lead developer of the product. Implemented configuration management, AD authentication, and SSL MITM content filtering, and developed centralized update/configuration/analytics management. Modified DansGuardian and C-ICAP to allow for templated blocked pages and integration into the appliance. Moved all custom components into git-managed Debian packages that, upon commit, were sent to a build server and then published to the associated APT repositories for package management. Wrote an iptables configuration system that worked around common gotchas in integrating the appliance into new networks. Automated installation from PXE boot to ready-to-use appliance. Trained the team on how to use and manage the appliance.

• Designed and implemented numerous virtual environments, utilizing KVM and/or VMware.

• Designed and implemented numerous AD deployments, some of which were 200+ PCs. Not one for tedious tasks, I wrote a system which migrated computers and their associated user accounts/data/passwords to a new Active Directory installation, from a workgroup install, or from another domain, near-automatically. What was previously a long, drawn out process, turned into a quick simple execution that actually did more than what the manual process did to begin with, which minimized the amount of time and technical knowledge needed to do a full domain migration. This allowed them to free up the higher level staff for other things.

• Wrote a system that automated deploying all required software on client PCs. Upon first boot up after being joined to the network, all required updates were installed and verified, all software was configured using pushed configuration, most of which didn't have built-in support for such necessities.

The test of the machine is the satisfaction it gives you. There isn't any other test. If the machine produces tranquility it's right. If it disturbs you it's wrong until either the machine or your mind is changed. – Robert M. Pirsig