![Page 1: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/1.jpg)
the Constructocat by Jason Costello - https://github.com/jsncostello
gitDigger
Creating useful wordlists from GitHub
By: WiK & Mubix
![Page 2: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/2.jpg)
We suck at “picturing” things, so in order for this presentation to be successful you must all strip down to your underwear
CENSORED
![Page 3: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/3.jpg)
The Researcher – WiK
@jaimefilson
![Page 4: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/4.jpg)
We weren't the first to go digging…
![Page 5: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/5.jpg)
Link to blog post
![Page 6: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/6.jpg)
Only problem is you need to find a service that is “friendly” to “research”
![Page 7: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/7.jpg)
02:09 < pasv> http://www.mavitunasecurity.com/blog/svn-digger-better-lists-for-forced-browsing/
02:11 <@WiK> nice find
02:11 < pasv> i wish i had thought of it
02:16 < mubix> Thats awesome
02:16 <@WiK> ive done similar stuff, now i have a font collection that 10gb of unique fonts
02:17 < mubix> wish they would add bitbucket, and github to their searches
02:19 <@WiK> ive looked at scrapin github.. theres no real good way to do it
![Page 8: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/8.jpg)
The 30 minute Solution
I CAN DO THIS!
![Page 9: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/9.jpg)
but there is no “all repos” list…
![Page 10: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/10.jpg)
Usernames & their repositories
import os
import urllib
import urllib2
import sqlite3
Enter Python WGET
![Page 11: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/11.jpg)
Got it… now what?
![Page 12: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/12.jpg)
Wordlists
Repositories
Lots of manual review and headaches
But finally resulted in
os.walk()
sort
grep
awk
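The os.walk() plus sort/grep/awk pipeline on this slide can be approximated in a few lines of Python 3. This is a minimal sketch, not the authors' script: the function names `count_names` and `format_wordlist` are hypothetical, and skipping the `.git` directory is an assumption made so that every clone does not contribute the same metadata names.

```python
import os
from collections import Counter


def count_names(root):
    """Walk `root` and count how often each directory and file name occurs."""
    dir_counts = Counter()
    file_counts = Counter()
    for _path, dirnames, filenames in os.walk(root):
        # Assumption: prune git's own metadata directory in place so that
        # os.walk() does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        dir_counts.update(dirnames)
        file_counts.update(filenames)
    return dir_counts, file_counts


def format_wordlist(counts):
    """Render counts as 'count name' lines, most common first."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ["{:,} {}".format(n, name) for name, n in ordered]
```

Calling `count_names("/path/to/clones")` on a directory of cloned repositories returns two counters that `format_wordlist` turns into "count name" lines.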
![Page 13: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/13.jpg)
Only the “TOP” repositories
17 hours
of manual kung-fu to process into wordlists
![Page 14: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/14.jpg)
Betterwalk.walk() vs os.walk()
Link to project
![Page 15: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/15.jpg)
Initial Thoughts
The Good News
I got the wordlists that I wanted and they were useful
The Bad News
Only the “TOP” repositories
Sqlite3 transactions were slow
17 hours of manual labor sucks
My HDD was now full
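A common reason for slow sqlite3 inserts is committing after every row, since each commit forces a write to disk. The talk later solves this by switching databases, so the sketch below is not what the authors did; it only illustrates the usual mitigation of batching rows into one transaction. The function name, the `batch_size` parameter, and the two-column `projects` table are assumptions for illustration.

```python
import sqlite3


def store_projects(conn, rows, batch_size=1000):
    """Insert (name, project) rows, committing once per batch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS projects (name TEXT, project TEXT)"
    )
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            # `with conn` wraps the batch in a single transaction and
            # commits when the block exits without an exception.
            with conn:
                conn.executemany("INSERT INTO projects VALUES (?, ?)", batch)
            batch = []
    if batch:
        # Flush whatever is left over after the last full batch.
        with conn:
            conn.executemany("INSERT INTO projects VALUES (?, ?)", batch)
```

Whether this would have been fast enough at the scale described in the talk is not something the slides let us judge.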
![Page 16: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/16.jpg)
Let's Get Serious
(well, kinda...)
![Page 17: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/17.jpg)
FIRST PROBLEM: STORAGE
![Page 18: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/18.jpg)
Storage Options
Pros
● Cheap ($99 USD a year)
● Built-in “indexing”
Cons
● Windows Only
● Crashes OFTEN
● Encryption == SLOW

Pros
● Central, local, fast storage
Cons
● Expensive
Remember… I’m already at 3 TB…
![Page 19: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/19.jpg)
Solution
![Page 20: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/20.jpg)
SECOND PROBLEM: PYTHON WGET
![Page 21: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/21.jpg)
GITHUB API
![Page 22: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/22.jpg)
![Page 23: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/23.jpg)
THIRD PROBLEM: SQLITE SUCKS
![Page 24: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/24.jpg)
Solution
![Page 25: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/25.jpg)
PUTTING IT ALL TOGETHER
![Page 26: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/26.jpg)
Upgrades
● Added 2 modes
  ● Downloader
  ● Processor
● Added Threading
● Replaced sqlite3 with mysql

Upgrades
● Password Table
  ● count
  ● name
● Username Table
  ● count
  ● name
● Email Table
  ● count
  ● name
● Projects Table
  ● name
  ● project
  ● processed
  ● grepped
● Directories Table
● Files Table
● Last Seen ID Table

Added
● Script to add items to database
![Page 27: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/27.jpg)
Mode: Downloader
Repositories
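The GitHub REST API exposes a "list public repositories" endpoint that pages by repository id through a `since` parameter, which fits the "Last Seen ID Table" in the upgraded schema. The Python 3 sketch below shows how a downloader could resume from a stored id. It is an illustration, not the gitDigger code: `fetch_page`, `iter_repositories`, `max_pages`, and the User-Agent string are hypothetical names, and unauthenticated requests are rate limited.

```python
import json
from urllib.request import Request, urlopen

API_URL = "https://api.github.com/repositories"


def fetch_page(since):
    """Fetch one page of public repositories with an id greater than `since`."""
    req = Request(
        "{}?since={}".format(API_URL, since),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "gitdigger-sketch",  # hypothetical client name
        },
    )
    with urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def iter_repositories(last_seen_id=0, fetch=fetch_page, max_pages=1):
    """Yield (id, full_name) pairs, resuming after `last_seen_id`."""
    for _ in range(max_pages):
        page = fetch(last_seen_id)
        if not page:
            return  # an empty page means there is nothing newer
        for repo in page:
            yield repo["id"], repo["full_name"]
        # The highest id on this page is the cursor for the next request.
        last_seen_id = page[-1]["id"]
```

Passing `fetch` as a parameter keeps the paging logic testable without network access. A real downloader would persist the last yielded id after each page so that a crash does not restart the crawl from zero.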
![Page 28: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/28.jpg)
Repositories
Mode: Processor
Wordlists
Manual Cleanup
grep/egrep
.sh script
add2database.py
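The grep/egrep step in the processor can be pictured as a regular expression that pulls quoted values out of username and password assignments and counts them. The pattern below is an assumption about what such a grep might look for, not the expression used by the authors, and `harvest` is a hypothetical name. It misses unquoted values and will produce false positives, which is consistent with the manual cleanup step on the slide.

```python
import re
from collections import Counter

# Matches e.g.  password = "secret"   :username => 'bob'   db_password: "x"
ASSIGNMENT = re.compile(
    r'(?<![A-Za-z])(username|user|password|passwd)\s*(?:=>|=|:)\s*'
    r'["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def harvest(lines):
    """Count quoted values assigned to username-like and password-like keys."""
    usernames, passwords = Counter(), Counter()
    for line in lines:
        for key, value in ASSIGNMENT.findall(line):
            if key.lower().startswith("pass"):
                passwords[value] += 1
            else:
                usernames[value] += 1
    return usernames, passwords
```

The two counters correspond to the count and name columns of the username and password tables.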
![Page 29: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/29.jpg)
Updated Results
The Good News
I'm now getting ALL public repositories
Generating wordlists now takes automated minutes instead of manual hours
I'm able to store the data over multiple USB HDDs
The Bad News
Carving out data such as usernames, passwords, and emails still requires some manual work which takes up a bunch of time
Huge amount of storage needed
An estimated 30TB uncompressed
![Page 30: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/30.jpg)
CUE BIG DATA DRINKING GAME
You can buy our product for the low low price of 19.95 per MB, maxing at 1 TB, each additional TB will cost one child or goat. Prices and participation may vary, see your BIG DATA representative at the door for a list of vendors who want to take your money.
![Page 31: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/31.jpg)
![Page 32: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/32.jpg)
all_dirs.txt
751,991 info
686,812 logs
645,023 lib
555,954 src
490,724 test

all_files.txt
846,524 README
683,848 index.html
408,574 ChangeLog
307,197 README.txt
132,053 license.txt

usernames.txt
166,997 username
75,794 bob
72,360 users
59,595 admin
45,522 user
38,024 name
29,799 rails
25,853 sa
22,981 root
21,293 test

passwords.txt
358,949 password
118,287 foobar
75,567 test
53,238 secret
35,842 user
Link to wordlists
![Page 33: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/33.jpg)
![Page 35: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/35.jpg)
![Page 36: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/36.jpg)
The obvious stuff...
● Wordlists for "forced" browsing as with the SVN digger project
● Small default passwords list
● static_salts.txt: static salts found within github projects.
● #22 file "Exception.php"
● #323 is "file.php"
● #4819 is password.txt (wtf?)
OF 19,260,460 UNIQUE FILES
![Page 37: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/37.jpg)
Burp
![Page 38: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/38.jpg)
The obvious stuff...
● #370848 ssh1_auth_keys
● #185308 ntlmsso_magic.php
keeps going... too much fun.. but how real world is this stuff?
![Page 39: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/39.jpg)
coulda woulda shoulda.. SO WHAT!
![Page 40: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/40.jpg)
coulda woulda shoulda.. SO WHAT!
![Page 41: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/41.jpg)
The not so obvious
● Starting to parse every file from the git revision history (thought you removed that default password did ya?)
● Mass static code analysis for vulnerabilities
● One of the top directories is ".svn", another is ".settings" ;-)
● Parsing .gitignore of production targets
● Verify directories w/ HTTP 403 on .empty_directory and .DS_Store files
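One reading of the ".gitignore of production targets" idea is that ignored entries often name files that exist on a deployed server but not in the repository, such as local configuration. The sketch below turns a .gitignore into a list of literal candidate paths. The function name `candidate_paths` is hypothetical, and dropping glob patterns and negations is a simplifying assumption; the slides do not say how the authors intended to do this.

```python
def candidate_paths(gitignore_text):
    """Return literal paths named in a .gitignore, in order, without duplicates."""
    paths = []
    for raw in gitignore_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and negated ("!") rules.
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        # Assumption: glob patterns do not name a single path, so drop them.
        if any(ch in line for ch in "*?["):
            continue
        path = line.strip("/")
        if path and path not in paths:
            paths.append(path)
    return paths
```

Each returned path can then be appended to a target's base URL and requested, in the same way as an entry from the file and directory wordlists.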
![Page 42: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/42.jpg)
The not so obvious
● Run OCR on all image files
● Using list of .txt files for intelligence gathering
● Grep out ALL email addresses
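The "grep out ALL email addresses" item could look roughly like the following. The regular expression is a deliberately loose assumption, not a full address grammar, and `count_emails` is a hypothetical name. Note that SSH remotes such as `git@github.com` also match and would need filtering afterwards.

```python
import re
from collections import Counter

# Loose pattern: local part, "@", one or more domain labels, alphabetic TLD.
EMAIL = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def count_emails(lines):
    """Count case-insensitive occurrences of email-like strings."""
    counts = Counter()
    for line in lines:
        counts.update(addr.lower() for addr in EMAIL.findall(line))
    return counts
```

Lowercasing before counting keeps `BOB@Example.com` and `bob@example.com` in one bucket, matching the count and name layout of the email table.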
![Page 43: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/43.jpg)
STOP you're just giving them ideas
![Page 44: gitDigger - DEF CON · Starting to parse every file from the git revision history (thought you removed that default password did ya?) Mass static code analysis for vulnerabilities](https://reader030.vdocuments.mx/reader030/viewer/2022011908/5f61e0559b068735eb177d26/html5/thumbnails/44.jpg)