
Upload: atlassian

Post on 19-Mar-2017


Tracking huge files with Git LFS

LARS SCHNEIDER • GIT SOLUTIONS LEAD • AUTODESK • @KIT3BUS

STEVE SMITH • DEVOPS ADVOCATE • ATLASSIAN • @TARKASTEVE

Agenda

• The problem with big files
• The Git data model
• The Git LFS model
• Migration
• Git LFS personas

The Git data model

$ git init
$ tree .git/objects
.git/objects
├── info
└── pack

2 directories

$ touch some-file.txt
$ git add some-file.txt
$ tree .git/objects
.git/objects
├── e6
│   └── 9de29bb2d1d6434b8b29ae775ad8c2e48c5391
├── info
└── pack

3 directories, 1 file

The object file is zlib compressed and named by its SHA-1.
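The object name on the slide can be reproduced by hand. The sketch below assumes a Linux-style environment with `sha1sum` available; the function name `git_blob_id` is made up for illustration. It hashes the header `blob <size>` plus a NUL byte, followed by the file content, which is how Git derives a blob ID.

```shell
#!/bin/sh
# Compute the Git blob ID of a file without calling git.
# git_blob_id is a hypothetical helper name; sha1sum is assumed to be installed.
git_blob_id() {
    # Byte count of the file; tr strips the padding some wc versions add.
    size=$(wc -c < "$1" | tr -d ' ')
    # Git hashes "blob <size>\0" followed by the raw content.
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}
```

For an empty file such as `some-file.txt` this prints `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, which matches the `e6/9de29b...` path in the tree listing.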


[Diagram: branch master; commits 98ca9.., bab1e.., fad3d..]

$ cat .git/refs/heads/master
fad3dd41d0cf3d1b6aa2d8ad0549ab2fcb1575d1

“Directed Acyclic Graph”

$ git cat-file -p 98ca9
tree 434bb..
parent bab1e..
author Tim P <kannonboy@…> 1455209277 -0800
committer Tim P <kannonboy@…> 1455209277 -0800

My life is my commit message.

$ git cat-file -p 434bb
100644 blob ace23..    .gitignore
100644 blob dbdbd..    README.md
040000 tree a0bc3..    app
040000 tree 33d33..    config
100755 blob b1de7..    deploy-prod.sh
100755 blob 7011e..    deploy-staging.sh

Columns: file mode, type, SHA-1, file name.

[Diagram, repeated across several slides: branch master; commits 98ca9.., bab1e.., fad3d..; tree 434bb..; size labels 50mb, 100mb, 150mb]

Git LFS (Large File Storage)

[Diagram: a Git host and an LFS store; objects dabad.., 98ca9.., bab1e.., fad3d.., 86753.., 434bb..]

$ git push

$ git pull

$ git checkout bab1e

[Diagram: the same Git host and LFS store, now also showing objects 4749d.., bdd12.., 778aa..; HEAD is marked]

$ git cat-file -p 4749d
version https://git-lfs.github.com/spec/v1
oid sha256:325ddfb…
size 29342295
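A pointer like the one above is plain text with one `key value` pair per line, so it is easy to inspect with standard tools. The sketch below is illustrative only: the helper names `pointer_field` and `is_lfs_pointer` are invented here, and the sample pointer is a placeholder (its oid is the SHA-256 of empty input, paired with size 0), not the object from the slide.

```shell
#!/bin/sh
# Print one field ("version", "oid" or "size") of an LFS pointer read on stdin.
# Hypothetical helper; each pointer line has the form "<key> <value>".
pointer_field() {
    awk -v key="$1" '$1 == key { print $2 }'
}

# Succeed if stdin starts with the LFS spec version line.
is_lfs_pointer() {
    [ "$(head -n 1)" = "version https://git-lfs.github.com/spec/v1" ]
}

# Placeholder pointer for illustration: oid is the SHA-256 of empty input.
pointer='version https://git-lfs.github.com/spec/v1
oid sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
size 0'

printf '%s\n' "$pointer" | pointer_field size
```

In a real repository the input could come from `git cat-file -p <blob>` instead of the placeholder variable.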


$ git add
Work tree (massive_video.mp4) -> Clean filter (git-lfs clean) -> Index (massive_video.mp4)
The pointer goes to .git/objects, the file content to .git/lfs/objects.

$ git checkout
Commit tree (massive_video.mp4 in .git/objects) -> Smudge filter (git-lfs smudge) -> Work tree (massive_video.mp4)
The content comes from .git/lfs/objects or the LFS store.

$ git push / pull
.git/objects <-> Hosted repo
.git/lfs/objects <-> LFS store
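The clean and smudge steps can be mimicked with a toy model. This is not the git-lfs implementation, only a sketch under stated assumptions: `toy_clean`, `toy_smudge`, `toy_object_path` and the `LFS_DIR` variable are invented names, `sha256sum` is assumed to be installed, and the two-level directory layout under the object store is an assumption about how the local store is organized.

```shell
#!/bin/sh
# Toy model of the clean/smudge round trip (NOT the real git-lfs code).
LFS_DIR="${LFS_DIR:-.git/lfs/objects}"

# Map an oid to a path in the local store (assumed layout: ab/cd/<oid>).
toy_object_path() {
    printf '%s/%s/%s/%s\n' "$LFS_DIR" \
        "$(printf '%s' "$1" | cut -c1-2)" \
        "$(printf '%s' "$1" | cut -c3-4)" "$1"
}

# "Clean": read real content on stdin, store it, print a pointer on stdout.
toy_clean() {
    tmp=$(mktemp)
    cat > "$tmp"
    oid=$(sha256sum "$tmp" | cut -d' ' -f1)
    size=$(wc -c < "$tmp" | tr -d ' ')
    dest=$(toy_object_path "$oid")
    mkdir -p "$(dirname "$dest")"
    mv "$tmp" "$dest"
    printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size"
}

# "Smudge": read a pointer on stdin, print the stored content on stdout.
toy_smudge() {
    oid=$(sed -n 's/^oid sha256://p')
    cat "$(toy_object_path "$oid")"
}
```

Piping content through `toy_clean` and then `toy_smudge` returns the original bytes, which is the property the real filters rely on: Git only ever sees the small pointer, while the work tree sees the full file.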

$ brew install git-lfs

$ git lfs install

$ cat ~/.gitconfig

[filter "lfs"]
    clean = git-lfs clean %f
    smudge = git-lfs smudge %f
    required = true

$ git lfs track "*.mp4"

$ cat .gitattributes

*.mp4 filter=lfs diff=lfs merge=lfs -text
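The `.gitattributes` line is ordinary text, so the effect of `git lfs track` can be approximated by hand. The sketch below uses a hypothetical helper, `track_pattern`, that appends the same attribute line shown above unless it is already present. The real command does more (for example, it handles special characters in patterns), so treat this as an illustration of the file format only.

```shell
#!/bin/sh
# Append an LFS tracking rule for a pattern to an attributes file,
# skipping the write if the exact line already exists.
# track_pattern is a hypothetical helper, not part of git-lfs.
track_pattern() {
    pattern=$1
    file=${2:-.gitattributes}
    line="$pattern filter=lfs diff=lfs merge=lfs -text"
    touch "$file"
    # -x: whole-line match, -F: fixed string, so "*" is not a wildcard here.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Calling `track_pattern '*.mp4'` twice leaves a single `*.mp4` rule in `.gitattributes`.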

@kit3bus

Lars Schneider, Autodesk Inc.

Git and Git LFS contributor

Technical Lead for Git at Autodesk

Who are we?

• Best known for AutoCAD
  2D and 3D computer-aided design
• 33 years in business
• 4000 engineers, hundreds of products, terabytes of code and asset data

Architecture, Engineering and Construction

Image by Dave Tyner, Autodesk Plant 3D - P&ID

Manufacturing

Media and Entertainment

3D Printing

"Future of Making Things"

Image courtesy of Local Motors Inc.

What do we use Git LFS for?

• Integration Test Data (3D Models, ...)
• Auxiliary Data (Documentation, Images, Videos, ...)
• Build Artifacts (not recommended)

Git LFS Usage: What have we learned?

• Developer
• Migrator
• Administrator

Migrator

Migration Process

1. Identify an engineer with deep code knowledge
2. Create a "demo" migration on Git migration server
3. Iterate on "demo" migration until repo and CI are OK
4. Ask broader team to "play" with the "demo" migration
5. Perform migration on Git production server

How to migrate?

git-svn / git-p4 / git-tfs ... (git-p4 is highlighted on the final slide)

+

git filter-branch / git-lfs-migrate

Git LFS Migration Gotchas

Discard large file history

[Timeline: 1998, 2007, 2016; code, code, code +++]

Avoid "orphaned" LFS files after history rewrite

[Diagram: LFS pointer in the Git repo, file in LFS storage]

Developer (includes designer, tester, ...)

Teach why "Large" files are a problem!

All history is local. Good for source files.

Problem for large files.

What is a "problematic" file?

Files that do not compress well...

... and change frequently. (Mon, Tue, Wed)

Rule of thumb: Files smaller than 500 KB are OK.
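One way to apply the 500 KB rule of thumb is to scan a work tree for files above the threshold. The sketch below assumes a POSIX `find`; the function name `list_large_files` and its defaults are chosen here for illustration. File size alone does not make a file problematic, so the output is only a list of candidates to check for compressibility and change frequency.

```shell
#!/bin/sh
# List regular files larger than a threshold (in KiB), skipping .git internals.
# list_large_files is a hypothetical helper; defaults: current dir, 500 KiB.
list_large_files() {
    dir=${1:-.}
    limit_kb=${2:-500}
    find "$dir" -type f ! -path '*/.git/*' -size +"${limit_kb}"k
}
```

Running `list_large_files . 500` from the repository root prints one path per line.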

How to track Git LFS files?

git lfs track "*.png"
git lfs track "*.lfs.*"     e.g. /images/elephant.lfs.png
git lfs track "/big/*"      e.g. /big/elephant.png
git lfs track "/xxl.png"

Rule of thumb: Fewer than 1000 files in LFS are OK.

Up to 70x speed improvement pending!

Git LFS Gotchas

Case sensitive:    git lfs track "*.png"

Case insensitive:  git lfs track "*.[pP][nN][gG]"
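Writing the bracketed pattern by hand gets tedious for longer extensions. The following sketch generates it from a plain extension; `ci_pattern` is a hypothetical helper name, and it only handles a single extension made of ASCII letters and digits.

```shell
#!/bin/sh
# Build a case-insensitive glob for a file extension,
# e.g. "png" becomes "*.[pP][nN][gG]". Hypothetical helper.
ci_pattern() {
    ext=$1
    out='*.'
    while [ -n "$ext" ]; do
        rest=${ext#?}          # everything after the first character
        c=${ext%"$rest"}       # the first character itself
        lower=$(printf '%s' "$c" | tr '[:upper:]' '[:lower:]')
        upper=$(printf '%s' "$c" | tr '[:lower:]' '[:upper:]')
        if [ "$lower" = "$upper" ]; then
            out="$out$c"       # digits and the like have no case variants
        else
            out="$out[$lower$upper]"
        fi
        ext=$rest
    done
    printf '%s\n' "$out"
}
```

The result can be passed straight to the tracking command, for example `git lfs track "$(ci_pattern png)"`.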

Git LFS Gotchas

No line ending conversions on LFS files!

Git LFS Tips & Tricks

Use the latest Git / Git LFS version!

Set up your Git credential helper (or use SSH)!

Watch out for the "administrator" shell!

git lfs clone <URL>

Use Git 2.9+ if your submodules contain Git LFS files.

Use Git Sparse Checkout if you have too many LFS files!

Administrator

How to make sure Git LFS is used properly?

• Configure Git LFS on all platforms! (Enterprise Config for Git: https://git.io/vi1F4)
• "What happens in Git, stays in Git." Rewriting history can cause a lot of trouble!
• Configure file size limit on Git server!
• Configure file size limit with local Git pre-commit hooks!
• Use code reviews and limit write access to shared branches! At least initially.
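The local pre-commit size limit mentioned above could look roughly like this. It is a sketch, not a hook from the talk: `check_sizes` is a hypothetical helper that reads `<bytes> <path>` lines and fails if any size exceeds the limit, and the 512000-byte limit is an arbitrary example. The commented wiring measures the staged blob with `git cat-file -s`, so a file already tracked by LFS is measured as its small pointer rather than its full content.

```shell
#!/bin/sh
# Read "<bytes> <path>" lines on stdin; fail if any size exceeds the limit.
# check_sizes is a hypothetical helper for a pre-commit hook.
check_sizes() {
    limit=$1
    status=0
    while read -r size path; do
        if [ "$size" -gt "$limit" ]; then
            echo "rejected: $path is $size bytes (limit $limit); track it with Git LFS" >&2
            status=1
        fi
    done
    return "$status"
}

# Possible wiring inside .git/hooks/pre-commit (assumes paths without newlines):
#   git diff --cached --name-only --diff-filter=ACM |
#     while IFS= read -r f; do
#       printf '%s %s\n' "$(git cat-file -s ":$f")" "$f"
#     done |
#     check_sizes 512000
```

A non-zero exit status from the hook aborts the commit. A client-side hook can be bypassed, which is why the slides pair it with a size limit on the Git server.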

Takeaways

• Git LFS works

• Use the latest Git/Git LFS version

• Use `git lfs clone`

• Track problematic files in Git LFS

• Reject problematic files in Git

• Keep an eye on # of tracked files