Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy Photo: Le Monde en Vidéo

Upload: tim-pettersen

Post on 12-Feb-2017


TRANSCRIPT

Page 1: Tracking huge files with Git LFS (GlueCon 2016)


Page 2: Tracking huge files with Git LFS (GlueCon 2016)


Page 3: Tracking huge files with Git LFS (GlueCon 2016)

Git LFS!

Git LOB!

Page 4: Tracking huge files with Git LFS (GlueCon 2016)

ok cool

Git LFS!

Page 5: Tracking huge files with Git LFS (GlueCon 2016)

TIM PETTERSEN • SENIOR DEVELOPER • ATLASSIAN • @KANNONBOY

Tracking huge files with Git LFS

Page 6: Tracking huge files with Git LFS (GlueCon 2016)

Agenda

The problem with big files

Git LFS

Converting your repo

Tips for teams

Page 7: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

data model

Page 8: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

master

feature/JIRA-123

Page 9: Tracking huge files with Git LFS (GlueCon 2016)

$ git cat-file -p 98ca9

tree 434bb..
parent bab1e..
author Tim P <kannonboy@…> 1455209277 -0800
committer Tim P <kannonboy@…> 1455209277 -0800

My life is my commit message.

[Diagram: commits 98ca9.., bab1e.., fad3d..]

Page 10: Tracking huge files with Git LFS (GlueCon 2016)

$ git cat-file -p 434bb

filemode  type  SHA-1     name
100644    blob  ace23..   .gitignore
100644    blob  dbdbd..   README.md
040000    tree  a0bc3..   app
040000    tree  33d33..   config
100755    blob  b1de7..   deploy-prod.sh
100755    blob  7011e..   deploy-staging.sh

[Diagram: commits 98ca9.., bab1e.., fad3d..; tree 434bb..]
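The commit and tree objects on these two slides can be reproduced in a throwaway repository. This is a minimal sketch, assuming only that `git` is installed; the file names, the identity, and the scratch directory are invented for the demo and are not from the talk.

```shell
# Create a scratch repository (all names here are illustrative).
repo=$(mktemp -d)
git init -q "$repo"
printf 'hello\n' > "$repo/README.md"
mkdir "$repo/app"
printf 'placeholder\n' > "$repo/app/main.txt"
git -C "$repo" add .
git -C "$repo" -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "My life is my commit message."

# A commit object: tree, parent(s) if any, author, committer, message.
commit_text=$(git -C "$repo" cat-file -p HEAD)
printf '%s\n' "$commit_text"

# The tree it points to: one "<mode> <type> <sha> <name>" line per entry.
tree_text=$(git -C "$repo" cat-file -p 'HEAD^{tree}')
printf '%s\n' "$tree_text"
```

Because this is a root commit there is no `parent` line; a second commit in the same repository would show one.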

Page 11: Tracking huge files with Git LFS (GlueCon 2016)

[Diagram: branch master; commits 98ca9.., bab1e.., fad3d..; tree 434bb..; objects ace23.., dbdbd.., a0bc3.., 33d33.., b1de7.., 7011e.., each drawn with binary content (1010101).]

Page 12: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

98ca9..

bab1e..

fad3d..

Page 13: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

98ca9..

bab1e..

fad3d..

Page 14: Tracking huge files with Git LFS (GlueCon 2016)

[Diagram: objects dabad.., 98ca9.., bab1e.., fad3d.., 86753.., 434bb..; size labels 50 MB, 100 MB, 150 MB.]
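One way to read the 50 / 100 / 150 figures: every committed version of a large file stays in the object database, so history keeps growing even though the work tree only holds the latest copy. The sketch below illustrates that under stated assumptions: it uses KB as a stand-in for the slide's MB, random data as the "big file", and `git cat-file --batch-all-objects` (Git 2.6 or later).

```shell
# Scratch repo; sizes are in KB as stand-ins for the slide's MB figures.
repo=$(mktemp -d)
git init -q "$repo"
for size in 50 100 150; do
    head -c "$((size * 1024))" /dev/urandom > "$repo/big.bin"
    git -C "$repo" add big.bin
    git -C "$repo" -c user.name="Demo" -c user.email="demo@example.com" \
        commit -q -m "big.bin is now ${size} KB"
done

# Only the latest version is in the work tree...
worktree_bytes=$(wc -c < "$repo/big.bin" | tr -d ' ')

# ...but every version is still stored as a blob in the object database.
history_bytes=$(git -C "$repo" cat-file --batch-all-objects --batch-check \
    | awk '$2 == "blob" { total += $3 } END { print total }')

echo "work tree: $worktree_bytes bytes, history: $history_bytes bytes"
```

The blob sizes reported here are uncompressed; already-compressed media such as video typically does not shrink much when Git packs it.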

Page 15: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

Page 16: Tracking huge files with Git LFS (GlueCon 2016)

Git LFS (Large File Storage)

Page 17: Tracking huge files with Git LFS (GlueCon 2016)

[Diagram: objects dabad.., 98ca9.., bab1e.., fad3d.., 86753.., 434bb..; Git host; LFS store.]

Page 18: Tracking huge files with Git LFS (GlueCon 2016)

$ git push

[Diagram: objects dabad.., 98ca9.., bab1e.., fad3d.., 86753.., 434bb..; Git host; LFS store.]

Page 19: Tracking huge files with Git LFS (GlueCon 2016)

$ git pull

[Diagram: objects dabad.., 98ca9.., bab1e.., fad3d.., 86753.., 434bb..; Git host; LFS store.]

Page 20: Tracking huge files with Git LFS (GlueCon 2016)

[Diagram: objects dabad.., 98ca9.., bab1e.., fad3d.., 86753.., 434bb.. ☞ objects 4749d.., bdd12.., 778aa..]

$ git cat-file -p 4749d

version https://git-lfs.github.com/spec/v1
oid sha256:325ddfb…
size 29342295
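The three-line pointer shown on this slide is simple enough to build by hand, which makes its fields concrete. This is only an illustration of the format, not how `git-lfs` itself is implemented; the demo file content is invented, and `sha256sum` (GNU coreutils) is assumed to be available.

```shell
# Hand-build a pointer in the v1 format for a small demo file.
# Assumes GNU coreutils `sha256sum`; `shasum -a 256` is a common substitute on macOS.
workdir=$(mktemp -d)
printf 'pretend this is a huge video\n' > "$workdir/massive_video.mp4"

oid=$(sha256sum "$workdir/massive_video.mp4" | cut -d ' ' -f 1)
size=$(wc -c < "$workdir/massive_video.mp4" | tr -d ' ')

# version line first, then oid and size, one key/value pair per line.
pointer=$(printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' \
    "$oid" "$size")
printf '%s\n' "$pointer"
```

The `oid` is the SHA-256 of the real content and `size` is its length in bytes, so the pointer stays tiny no matter how large the tracked file is.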

Page 21: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

$ brew install git-lfs

$ git lfs install

Page 22: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

$ cat ~/.gitconfig

[filter "lfs"]
    clean = git-lfs clean %f
    smudge = git-lfs smudge %f
    required = true

Page 23: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

$ git lfs track "*.mp4"

$ cat .gitattributes

*.mp4 filter=lfs diff=lfs merge=lfs -text
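To check which paths a `.gitattributes` line like this one actually routes through the LFS filter, plain `git check-attr` is enough. The sketch below writes the attribute line by hand in a scratch repository (so it runs without `git-lfs` installed); the file names are illustrative.

```shell
repo=$(mktemp -d)
git init -q "$repo"
# The same line that `git lfs track "*.mp4"` produced on the slide.
printf '*.mp4 filter=lfs diff=lfs merge=lfs -text\n' > "$repo/.gitattributes"

# check-attr reports attributes for a path; the file does not need to exist.
video_attr=$(git -C "$repo" check-attr filter -- massive_video.mp4)
readme_attr=$(git -C "$repo" check-attr filter -- README.md)
echo "$video_attr"
echo "$readme_attr"
```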

Page 24: Tracking huge files with Git LFS (GlueCon 2016)

$ git add

[Diagram, dev machine: work tree massive_video.mp4 -> clean filter (git-lfs clean) -> index massive_video.mp4. The pointer is stored in .git/objects; the real content is stored in .git/lfs/objects.]

Page 25: Tracking huge files with Git LFS (GlueCon 2016)

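$ git checkout

[Diagram, dev machine: commit tree massive_video.mp4 (pointer in .git/objects) -> smudge filter (git-lfs smudge) -> work tree massive_video.mp4. The real content comes from .git/lfs/objects, or from the LFS store if it is not cached locally.]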
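The clean and smudge steps on these two slides are ordinary Git filter hooks, so the mechanism can be imitated with a toy filter. The sketch below is emphatically not git-lfs: the filter name `toy`, the `toy-oid` pointer format, and the `store` directory are all invented, and `sha256sum` is assumed to exist. It only shows that Git stores what `clean` prints and writes what `smudge` prints.

```shell
# Toy stand-in for git-lfs: NOT the real implementation, just the same filter hooks.
work=$(mktemp -d)
mkdir "$work/store"          # plays the role of .git/lfs/objects
git init -q "$work/repo"

# clean: stash the real content under its SHA-256, hand Git a one-line pointer.
cat > "$work/clean.sh" <<EOF
#!/bin/sh
tmp=\$(mktemp)
cat > "\$tmp"
oid=\$(sha256sum "\$tmp" | cut -d ' ' -f 1)
mv "\$tmp" "$work/store/\$oid"
printf 'toy-oid %s\n' "\$oid"
EOF

# smudge: read the pointer, emit the real content back into the work tree.
cat > "$work/smudge.sh" <<EOF
#!/bin/sh
read -r _ oid
cat "$work/store/\$oid"
EOF

git -C "$work/repo" config filter.toy.clean "sh $work/clean.sh"
git -C "$work/repo" config filter.toy.smudge "sh $work/smudge.sh"
printf '*.mp4 filter=toy\n' > "$work/repo/.gitattributes"

printf 'pretend this is a huge video\n' > "$work/repo/massive_video.mp4"
git -C "$work/repo" add .gitattributes massive_video.mp4

# The index holds only the pointer...
staged=$(git -C "$work/repo" cat-file -p :massive_video.mp4)
echo "staged blob: $staged"

# ...and checking the file out again runs smudge to restore the content.
rm "$work/repo/massive_video.mp4"
git -C "$work/repo" checkout -- massive_video.mp4
restored=$(cat "$work/repo/massive_video.mp4")
echo "restored: $restored"
```

Real git-lfs adds what this toy lacks: a standard pointer format, a remote store, and the pre-push hook that uploads content before the pointers are pushed.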

Page 26: Tracking huge files with Git LFS (GlueCon 2016)

git push / pull

[Diagram: .git/objects <-> hosted repo; .git/lfs/objects <-> LFS store.]

Page 27: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

$ ls .git/hooks/

commit-msg.sample post-update.sample pre-commit.sample pre-push ...

Page 28: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

$ git push

Git LFS: (12 of 13 files, 1 skipped) 168.75 MB / 180.87 MB, 12.12 skipped

Counting objects: 22, done. ...

Page 29: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

$ git pull

remote: Counting objects: 3, done.
...
Downloading massive_video.mp4 (38.79 MB)
...
1 file changed, 2 insertions(+)

Page 30: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

$ git clone ssh://git@bitbucke..

Cloning into 'big_repo'
...
Downloading massive_video.mp4 (38.79 MB)
...
Checking out files: 100% (13/13), done.

Page 31: Tracking huge files with Git LFS (GlueCon 2016)

Git server with embedded LFS store

Auth: SSH / HTTP for Git, HTTP only for LFS

Page 32: Tracking huge files with Git LFS (GlueCon 2016)

Git server with a separate LFS store (e.g. S3)

Auth: SSH / HTTP for Git, HTTP only for LFS

Page 33: Tracking huge files with Git LFS (GlueCon 2016)

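Participants: Dev, LFS-aware Git server, LFS store

$ git clone https://..

LFS-aware Git server -> Dev: repo pack files
Dev: smudge filter
Dev -> LFS-aware Git server: "I need object cafebabe"
LFS-aware Git server -> Dev: "cafebabe is over there"
Dev -> LFS store: GET …/cafebabe
LFS store -> Dev: object cafebabe…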

Page 34: Tracking huge files with Git LFS (GlueCon 2016)

"I need object cafebabe"

POST .../repo.git/info/lfs/objects/batch

{"objects": [
    { "oid": "cafebabe...", "size": 40689401 },
    ...
  ],
  "operation": "download"}
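The request body on this slide can be assembled with nothing more than `printf`. The sketch below builds a single-object body and prints it. The helper name `batch_body` and the placeholder oid (the SHA-256 of empty input) are assumptions for illustration, and the `curl` call is left as a comment because the server URL is hypothetical.

```shell
# Build a Batch API request body for one object.
batch_body() {
    # $1 = operation ("download" or "upload"), $2 = oid, $3 = size in bytes
    printf '{"operation":"%s","objects":[{"oid":"%s","size":%s}]}' "$1" "$2" "$3"
}

# Placeholder oid: the SHA-256 of zero bytes of input, so the size is 0.
oid="e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
body=$(batch_body download "$oid" 0)
printf '%s\n' "$body"

# Sending it would look roughly like this (LFS_URL is a hypothetical variable
# holding something like https://example.com/team/repo.git/info/lfs):
#   curl -s -X POST "$LFS_URL/objects/batch" \
#        -H 'Accept: application/vnd.git-lfs+json' \
#        -H 'Content-Type: application/vnd.git-lfs+json' \
#        -d "$body"
```

A real client would list every object it needs in one `objects` array, which is what makes the endpoint a batch API.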

Page 35: Tracking huge files with Git LFS (GlueCon 2016)

"cafebabe is over there"

200 OK

{"objects": [
    {"oid": "cafebabe…",
     "size": 40689401,
     "actions": {
       "download": {
         "href": "https://…/lfs/cafebabe…",
         "header": {
           "Authorization": "JWT eyJ0eXA…"
         }
       }
     }
    }
    ...

Page 36: Tracking huge files with Git LFS (GlueCon 2016)

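Participants: Dev, LFS-aware Git server, LFS store

$ git push

Dev -> LFS-aware Git server: "I want to upload cafebabe"
LFS-aware Git server -> Dev: "please upload it over there"
Dev -> LFS store: POST …/cafebabe
Dev -> LFS-aware Git server: "I uploaded cafebabe"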

Page 37: Tracking huge files with Git LFS (GlueCon 2016)

200 OK

{"objects": [
    ...
    "actions": {
      "upload": {
        "href": "https://…/lfs/cafebabe…",
        …
      },
      "verify": {
        "href": "https://…/lfs/callback",
        …
      }
    }
    ...

upload: "go upload it over there"
verify: "I uploaded cafebabe"

Page 38: Tracking huge files with Git LFS (GlueCon 2016)

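Participants: Dev, LFS-aware Git server, LFS store

$ git push ssh://…

Dev -> LFS-aware Git server: "where is the LFS API?"
LFS-aware Git server -> Dev: "the LFS API is over there"
Dev -> LFS-aware Git server: "I want to upload cafebabe"
LFS-aware Git server -> Dev: "please upload it over there"
Dev -> LFS store: POST …/cafebabe
Dev -> LFS-aware Git server: "I uploaded cafebabe"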

Page 39: Tracking huge files with Git LFS (GlueCon 2016)

"where is the LFS API?"

$ ssh git@bitbucket git-lfs-authenticate \
    project/repo.git download

"the LFS API is over there"

{
  "href": "https://…/lfs/objects/batch",
  "header": {
    "Authorization": "JWT eyJ0eXA..."
  }
}

Page 40: Tracking huge files with Git LFS (GlueCon 2016)

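Participants: Dev, LFS-aware Git server, LFS store

$ git clone https://..

LFS-aware Git server -> Dev: repo data
Dev: smudge filter (happens once per file checked out)
Dev -> LFS-aware Git server: POST /info/lfs/objects/batch
LFS-aware Git server -> Dev: LFS objects hypermedia
Dev -> LFS store: GET …/<objectSHA>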

Page 41: Tracking huge files with Git LFS (GlueCon 2016)

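Participants: Dev, LFS-aware Git server, LFS store

$ git lfs clone https://..     (subtle difference!)

LFS-aware Git server -> Dev: repo data
Dev -> LFS-aware Git server: POST /info/lfs/objects/batch
LFS-aware Git server -> Dev: LFS objects hypermedia
Dev -> LFS store: GET …/<objectSHA> (batched)
Dev: smudge filter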

Page 42: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

Converting to Git LFS

Page 43: Tracking huge files with Git LFS (GlueCon 2016)

[Diagram: objects 434bb.., fad3d.., 98ca9.., 41222.., dabad.. (100 MB), ace34.. (150 MB)]

☞ 150 MB!?!?

Page 44: Tracking huge files with Git LFS (GlueCon 2016)

git filter-branch

$ git filter-branch --force --index-filter \
    'git rm --cached --ignore-unmatch big_video.mp4' \
    --prune-empty --tag-name-filter cat -- --all

DON’T DO THIS!

Page 45: Tracking huge files with Git LFS (GlueCon 2016)

$ git filter-branch --prune-empty --tree-filter '
    git config -f .gitconfig lfs.url "https://bitbucket.example.com/team/repo.git"
    git lfs track "*.mp4"
    git add .gitattributes .gitconfig

    for file in $(git ls-files | xargs git check-attr filter | grep "filter: lfs" | sed -r "s/(.*): filter: lfs/\1/"); do
        git rm -f --cached ${file}
        git add ${file}
    done' --tag-name-filter cat -- --all

DON’T DO THIS EITHER!

Page 46: Tracking huge files with Git LFS (GlueCon 2016)

BFG Repo-Cleaner by @rtyley

Page 47: Tracking huge files with Git LFS (GlueCon 2016)

BFG Repo-Cleaner by @rtyley

10-720x faster than filter-branch

built to kill history

Git LFS support

Page 48: Tracking huge files with Git LFS (GlueCon 2016)


Page 49: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

$ brew install bfg

$ bfg --convert-to-git-lfs '*.{zip,mp4}' --no-blob-protection

Page 50: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

Repofactoring

@kannonboy

Page 51: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

Identifying large objects

github.com/bloomberg/repofactor

by @hashpling

Page 52: Tracking huge files with Git LFS (GlueCon 2016)

Identifying large objects

$ generate-larger-than 50000

a295ef4… 102437 95372
2cc7063… 152171 140443

blob SHA size on disk average blob size

Page 53: Tracking huge files with Git LFS (GlueCon 2016)

Identifying large objects

$ generate-larger-than 50000 \
    | add-file-info

a295ef4… 102437 95372, PNG, 1148 x 482
2cc7063… 152171 140443, PNG, 1101 x 800

Page 54: Tracking huge files with Git LFS (GlueCon 2016)

Identifying large objects

$ generate-larger-than 50000 \
    | add-file-info \
    | sort -k3nr

2cc7063… 152171 140443, PNG, 1101 x 800
a295ef4… 102437 95372, PNG, 1148 x 482

order by average blob size

Page 55: Tracking huge files with Git LFS (GlueCon 2016)

Identifying large objects

$ generate-larger-than 50000 \
    | add-file-info \
    | sort -k3nr \
    > big-stuff.txt

$ report-on-large-objects big-stuff.txt

logo-hdpi.png 2cc7063… 152171 140443, PNG…
logo-mdpi.png a295ef4… 102437 95372, PNG…

(first column: paths)
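When the repofactor scripts are not at hand, a rough equivalent of "list blobs above a size threshold" can be put together from stock Git plumbing. The sketch below is a plain-git approximation, not the repofactor implementation: it reports uncompressed blob sizes only, and the demo repository, file names, and 50000-byte threshold are illustrative.

```shell
# Demo repo with one large and one small file (names and sizes are made up).
repo=$(mktemp -d)
git init -q "$repo"
head -c 60000 /dev/urandom > "$repo/logo-hdpi.png"
printf 'small\n' > "$repo/README.md"
git -C "$repo" add .
git -C "$repo" -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "add files"

# List every object reachable from any ref with its type, size and path,
# keep blobs of at least 50000 bytes, largest first.
large_blobs=$(git -C "$repo" rev-list --objects --all \
    | git -C "$repo" cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
    | awk '$1 == "blob" && $3 >= 50000 { print $2, $3, $4 }' \
    | sort -k2nr)
printf '%s\n' "$large_blobs"
```

Each output line is `<blob SHA> <size in bytes> <path>`, which is enough to pick glob patterns for the conversion step.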

Page 56: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

$ bfg --convert-to-git-lfs 'logo-*.png' --no-blob-protection

Identifying large objects

Page 57: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

Enable in Bitbucket

Page 58: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

Page 59: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

Tips for teams

Page 60: Tracking huge files with Git LFS (GlueCon 2016)

Beware merge conflicts

Page 61: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

…meanwhile in

Page 62: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

…meanwhile in

Page 63: Tracking huge files with Git LFS (GlueCon 2016)

Teamwork

Page 64: Tracking huge files with Git LFS (GlueCon 2016)

$ git lfs fetch --recent

$ git config lfs.fetchrecentalways "true"

[Diagram: branches master, feature0, feature1; cutoff at T-minus 7 days.]

Page 65: Tracking huge files with Git LFS (GlueCon 2016)

$ git lfs fetch --recent

$ git config lfs.fetchrecentalways "true"

lfs.fetchrecentrefsdays (default = 7)

lfs.fetchrecentremoterefs

lfs.fetchrecentcommitsdays (default = 0)

Page 66: Tracking huge files with Git LFS (GlueCon 2016)

$ git lfs prune

lfs.pruneoffsetdays (default = 3)

lfs.pruneverifyremotealways

Page 67: Tracking huge files with Git LFS (GlueCon 2016)

Fetch the bare necessities (image © Disney)

Page 68: Tracking huge files with Git LFS (GlueCon 2016)

# for a build that just runs the unit tests
$ git lfs fetch --exclude "Assets/**"
$ git config lfs.fetchexclude "Assets/**"

# for an audio engineer
$ git lfs fetch --include "Assets/Audio/**"
$ git config lfs.fetchinclude "Assets/Audio/**"

Page 69: Tracking huge files with Git LFS (GlueCon 2016)

IDEs & GUI tools

Page 70: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

Page 71: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

1. Install EGit Team Provider version 4.2+
2. Make sure git-lfs is on your path

Page 72: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

Page 73: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

Page 74: Tracking huge files with Git LFS (GlueCon 2016)

@kannonboy

SourceTree

Page 75: Tracking huge files with Git LFS (GlueCon 2016)

Looking for more?

docs: git-lfs.github.com

source: github.com/github/git-lfs

Bitbucket Server: atlassian.com/bitbucket

Bitbucket Cloud: coming soon!

Follow me (@kannonboy) for occasional Git, Bitbucket & JIRA trivia