deep dark-side of git: how git works internally

Deep Dark-side Of Git How Git Works Internally SeongJae Park <[email protected]>

Upload: seongjae-park

Post on 27-Aug-2014


DESCRIPTION

Describes how Git works internally, using small and perfect plumbing commands

TRANSCRIPT

Page 1: Deep dark-side of git: How git works internally

Deep Dark-side Of Git: How Git Works Internally

SeongJae Park <[email protected]>

Page 2: Deep dark-side of git: How git works internally

Nice To Meet You

SeongJae Park

[email protected]

Page 3: Deep dark-side of git: How git works internally

Git

DVCS (Distributed Version Control System)

Made By Linus Torvalds To Manage Linux


Page 4: Deep dark-side of git: How git works internally

Git

Many Projects Use Git Because It’s Awesome

Page 5: Deep dark-side of git: How git works internally

Git: Learning Curve

Some People Say It Is Hard To Learn

Page 6: Deep dark-side of git: How git works internally

This Time, We Will...

See How Git Works From Scratch

Just For Fun... Or To Befriend Git

Forget About The Complicated Commands This Time

Page 7: Deep dark-side of git: How git works internally

In Short,

Git Is A Content-Addressable File System

Blob, Tree, Commit, Reference. That’s It =3


Page 8: Deep dark-side of git: How git works internally

Git: Unsung Heroes Behind

● Git Looks Graceful Owing To The Plumbing Commands It Is Built From

○ The Wounded Feet Are What We Are Interested In

Page 9: Deep dark-side of git: How git works internally

Why VCS?

Usual Life Of File

FileA ver 0, FileB ver 0

FileA ver 0, FileB ver 1

FileA ver 1, FileB ver 1

FileA ver 1, FileB ver 2

Page 14: Deep dark-side of git: How git works internally

We Need Version Control System

A VCS Would...

● Record Every Change Safely

● Be Able To Check Out Any Version

● Make History Easy To Read

Page 15: Deep dark-side of git: How git works internally

Brute VCS: File System

Rename / Back Up Every File Whenever A Change Is Made

$ ls
foo.c
foo_20140111.c
foo_final.c
foo_realfinal.c
foo_planb.c
foo_finalfinal.c

Page 18: Deep dark-side of git: How git works internally

GIT vs FileSystem

● GIT: Content-Addressable FileSystem

● Key Is The SHA-1 Hash Of The Object (Header + Content), Value Is The Content

○ The Same Content Is Never Saved Twice

Page 21: Deep dark-side of git: How git works internally

Save / Load 'test content'

$ mkdir olaf; cd olaf; git init
Initialized empty Git repository in olaf/.git/
$ echo 'test content' | git hash-object -w --stdin
d670460b4b4aece5915caf5c68d12f560a9fe3e4
$ find .git/objects/ -type f
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4
$ git cat-file -p d67046
test content
$ git cat-file -t d67046
blob

Page 24: Deep dark-side of git: How git works internally

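What hash-object Does

import hashlib, os, zlib

content = b"test content\n"              # echo appends the trailing newline
header = b"blob %d\0" % len(content)      # object type, content size, NUL byte
store = header + content
sha1 = hashlib.sha1(store).hexdigest()    # the key is the hash of header + content
directory = ".git/objects/" + sha1[0:2] + "/"
filename = sha1[2:]
os.makedirs(directory, exist_ok=True)
# Save zlib-compressed header + content at the sha1 path
with open(directory + filename, "wb") as f:
    f.write(zlib.compress(store))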
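Reading an object back is the reverse of the steps above. The sketch below is only an illustration, not Git's own code: `encode_object` and `decode_object` are hypothetical helper names, and nothing is written to disk. It mimics what `git cat-file -t` and `git cat-file -p` report for a loose object.

```python
import hashlib
import zlib
from typing import Tuple


def encode_object(obj_type: str, content: bytes) -> Tuple[str, bytes]:
    """Return (object id, zlib-compressed bytes) for a loose object."""
    store = f"{obj_type} {len(content)}".encode() + b"\0" + content
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


def decode_object(compressed: bytes) -> Tuple[str, bytes]:
    """Recover (type, content) from the compressed bytes of a loose object."""
    store = zlib.decompress(compressed)
    header, _, content = store.partition(b"\0")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content


object_id, raw = encode_object("blob", b"test content\n")
print(object_id)           # d670460b4b4aece5915caf5c68d12f560a9fe3e4
print(decode_object(raw))  # ('blob', b'test content\n')
```

The printed ID matches the one `git hash-object` returned for 'test content', because the trailing newline added by `echo` is part of the hashed bytes.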

Page 27: Deep dark-side of git: How git works internally

Version Control Using Hash Value

$ echo "eyes, mouth" > head.txt
$ git hash-object -w head.txt
a134fc2477395ee1a59664a0b660085edde63d04
$ echo "eyes, nose, mouth" > head.txt
$ git hash-object -w head.txt
6546481b73fb62d0c627812e17e355d43d6ed30e
$ git cat-file -p a134f > head.txt
$ cat head.txt
eyes, mouth
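The same idea can be sketched as a toy content-addressable store. This is a rough in-memory model under simplifying assumptions (no compression, no files on disk); the `ObjectStore` class and its method names are made up for illustration and are not part of Git.

```python
import hashlib


class ObjectStore:
    """Toy in-memory content-addressable store (hypothetical, not Git's code)."""

    def __init__(self) -> None:
        self._objects = {}  # maps hex SHA-1 -> header + content

    def put(self, content: bytes) -> str:
        store = b"blob %d\0" % len(content) + content
        key = hashlib.sha1(store).hexdigest()
        self._objects.setdefault(key, store)  # identical content is kept only once
        return key

    def get(self, key: str) -> bytes:
        # Drop the "blob <size>\0" header and return the content
        return self._objects[key].split(b"\0", 1)[1]

    def __len__(self) -> int:
        return len(self._objects)


db = ObjectStore()
v1 = db.put(b"eyes, mouth\n")
v2 = db.put(b"eyes, nose, mouth\n")
again = db.put(b"eyes, mouth\n")  # same bytes, same key, no new object

working_copy = db.get(v1)         # "check out" the first version by its hash
print(len(db), working_copy)
```

Storing the first version twice leaves two objects in the store, which is the "same content is never saved twice" property in miniature.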

Page 30: Deep dark-side of git: How git works internally

Version Control Using Hash Value

● Pros:

○ Light Volume

● Cons:

○ Only Content, No Title

○ Hard To Manage Multiple Files

○ Hard To Manage History

○ Hard To Remember Hash Values

Page 33: Deep dark-side of git: How git works internally

tree Object

Points To Other Objects (By Hash), Each With A Name

“A Root tree Object Is A Snapshot”

tree
    blob a113f2 main.c
    blob b8934  olaf.c
    tree c9240  include
        blob d9b13 true_love.h

Page 35: Deep dark-side of git: How git works internally

tree object

$ mkdir favorites; echo 'fantastic' > favorites/warm_hug
$ git update-index --add head.txt favorites/warm_hug
$ git write-tree
567167268c5c71bb647ca728bdb25f388d027f57
$ git cat-file -p 56716
040000 tree 799cf15c89acb88d76321b7b1529c8a9888fb9e2    favorites
100644 blob a134fc2477395ee1a59664a0b660085edde63d04    head.txt
$ git cat-file -p 799cf
100644 blob 7cc07dcddbcf92487065d4c12011e8a12f62a1bd    warm_hug
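A tree object can be sketched the same way as a blob. The code below is an approximation under stated assumptions, not Git's implementation: `build_tree` and `sort_key` are hypothetical names, each entry is stored as "mode name", a NUL byte, and the 20 raw bytes of the object ID, and directories are stored with mode 40000 even though `cat-file -p` prints 040000.

```python
import hashlib
from typing import List, Tuple

Entry = Tuple[str, str, str]  # (mode, name, hex object id)


def sort_key(entry: Entry) -> bytes:
    mode, name, _ = entry
    # Directory names are compared as if they ended with "/"
    return (name + "/" if mode == "40000" else name).encode()


def build_tree(entries: List[Entry]) -> Tuple[str, bytes]:
    """Return (tree id, uncompressed object bytes) for the given entries."""
    body = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    store = b"tree %d\0" % len(body) + body
    return hashlib.sha1(store).hexdigest(), store


favorites_id, favorites_obj = build_tree(
    [("100644", "warm_hug", "7cc07dcddbcf92487065d4c12011e8a12f62a1bd")]
)
root_id, root_obj = build_tree(
    [
        ("100644", "head.txt", "a134fc2477395ee1a59664a0b660085edde63d04"),
        ("40000", "favorites", favorites_id),
    ]
)
print(favorites_id, root_id)
```

The two printed IDs can be compared with the ones `git write-tree` and `git cat-file -p` show above. Because the root tree only stores the ID of the favorites tree, changing head.txt later produces a new root tree while the favorites subtree is reused as-is.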

Page 37: Deep dark-side of git: How git works internally

Internal Data Structure

tree
    blob a134f head.txt
    tree 799cf favorites
        blob 7cc07 warm_hug

Page 39: Deep dark-side of git: How git works internally

Version Control Using tree Object

$ echo "eyes, nose, mouth" > head.txt
$ git update-index --add head.txt
$ git write-tree
e4885f26f1d82b59c42bf1ed207fec4f60655c35
$ git cat-file -p e4885
040000 tree 799cf15c89acb88d76321b7b1529c8a9888fb9e2    favorites
100644 blob 6546481b73fb62d0c627812e17e355d43d6ed30e    head.txt
$ git cat-file -p 65464
eyes, nose, mouth

Page 41: Deep dark-side of git: How git works internally

Internal Data Structure

tree (56716)
    blob a134f head.txt
    tree 799cf favorites
        blob 7cc07 warm_hug

tree (e4885)
    blob 65464 head.txt
    tree 799cf favorites (the same object as in the first tree)

Page 43: Deep dark-side of git: How git works internally

Version Control Using tree Object

● Pros:

○ Light Volume

○ Managing Multiple Files Space Efficiently

● Cons:

○ Only Content, No Title

○ Hard To Manage Multiple Files

○ Hard To Manage History

○ Hard To Remember Hash Values

Page 44: Deep dark-side of git: How git works internally

commit Object

Describes Who / When / Why A Change Was Made

Points To A tree Object, Along With The Information Above

Page 47: Deep dark-side of git: How git works internally

commit Object

$ echo '1st commit' | git commit-tree 56716
d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c
$ git cat-file -p d075c
tree 567167268c5c71bb647ca728bdb25f388d027f57
author SeongJae Park <s**@gmail.com> 1401359546 +0900
committer SeongJae Park <s**@gmail.com> 1401359546 +0900

1st commit

Who: The author / committer Name And E-mail. When: The Timestamp. Why: The Commit Message.

Page 49: Deep dark-side of git: How git works internally

Version Control Using commit Object

$ echo '2nd commit' | git commit-tree e4885 -p d075c
a9cd7374ce4951ab93aac75d78a45a245e27f414
$ git cat-file -p a9cd7
tree e4885f26f1d82b59c42bf1ed207fec4f60655c35
parent d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c
author SeongJae Park <s**@gmail.com> 1401360590 +0900
committer SeongJae Park <s**@gmail.com> 1401360590 +0900

2nd commit
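A commit object is plain text wrapped in the same "type size" header as blobs and trees. The sketch below assumes that layout; `build_commit` is a hypothetical helper, and the identity string is a placeholder because the real e-mail address is masked in the slides, so the resulting IDs will not match the ones shown.

```python
import hashlib
from typing import List, Tuple


def build_commit(tree: str, parents: List[str], author: str,
                 committer: str, message: str) -> Tuple[str, bytes]:
    """Return (commit id, uncompressed object bytes)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]   # no parent line for a root commit
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    store = b"commit %d\0" % len(body) + body
    return hashlib.sha1(store).hexdigest(), store


# Placeholder identity (assumption): name <e-mail> unix-timestamp timezone
who = "Example Author <author@example.com> 1401360590 +0900"

first_id, first_obj = build_commit(
    "567167268c5c71bb647ca728bdb25f388d027f57", [], who, who, "1st commit")
second_id, second_obj = build_commit(
    "e4885f26f1d82b59c42bf1ed207fec4f60655c35", [first_id], who, who, "2nd commit")

print(second_obj.split(b"\0", 1)[1].decode())
```

Since the parent's ID is part of the hashed text, every commit ID depends on the whole history before it.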

Page 50: Deep dark-side of git: How git works internally

Internal Data Structure

That’s Why People Say, “A Commit Is A Snapshot”

commit (1st) -> tree
    blob a134f head.txt
    tree 799cf favorites
        blob 7cc07 warm_hug

commit (2nd) -> tree, parent -> commit (1st)
    blob 65464 head.txt
    tree 799cf favorites

Page 52: Deep dark-side of git: How git works internally

Version Control Using commit Object

● Pros:

○ Light Volume

○ Managing Multiple Files Space Efficiently

○ Easy To Know Who / When / Why Made A Change

● Cons:

○ Only Content, No Title

○ Hard To Manage Multiple Files

○ Hard To Manage History

○ Hard To Remember Hash Values

Page 53: Deep dark-side of git: How git works internally

Git References

File Storing SHA-1 Value

Resides In .git/refs/

Page 56: Deep dark-side of git: How git works internally

Git References Using echo

$ echo "d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c" > .git/refs/heads/first
$ git log --pretty=oneline first
d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c 1st commit
$ find .git/refs/heads -type f
.git/refs/heads/first
.git/refs/heads/master

Page 59: Deep dark-side of git: How git works internally

Git References Using update-ref

$ git update-ref refs/heads/master a9cd7
$ git log --pretty=oneline master
a9cd7374ce4951ab93aac75d78a45a245e27f414 2nd commit
d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c 1st commit
$ find .git/refs/heads -type f
.git/refs/heads/first
.git/refs/heads/master
$ cat .git/refs/heads/master
a9cd7374ce4951ab93aac75d78a45a245e27f414
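Since a loose reference is just a small file holding a commit ID, reading and writing one takes a few lines. This is a minimal sketch with hypothetical helpers `write_ref` and `read_ref`; it works in a scratch directory rather than a real repository, and it ignores things a real repository may also use, such as a packed-refs file.

```python
import os
import tempfile


def write_ref(git_dir: str, ref: str, commit_id: str) -> None:
    """Store a commit id in a loose ref file such as refs/heads/master."""
    path = os.path.join(git_dir, *ref.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(commit_id + "\n")


def read_ref(git_dir: str, ref: str) -> str:
    """Return the commit id stored in a loose ref file."""
    with open(os.path.join(git_dir, *ref.split("/"))) as f:
        return f.read().strip()


# Scratch directory standing in for .git (assumption: not a real repository)
git_dir = os.path.join(tempfile.mkdtemp(), ".git")
write_ref(git_dir, "refs/heads/first", "d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c")
write_ref(git_dir, "refs/heads/master", "a9cd7374ce4951ab93aac75d78a45a245e27f414")

branches = sorted(os.listdir(os.path.join(git_dir, "refs", "heads")))
print(branches, read_ref(git_dir, "refs/heads/master"))
```

A branch is therefore cheap: creating one writes a single short file and copies no objects.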

Page 61: Deep dark-side of git: How git works internally

Internal Data Structure

refs/heads/first -> commit (1st) -> tree
    blob a134f head.txt
    tree 799cf favorites
        blob 7cc07 warm_hug

refs/heads/master -> commit (2nd) -> tree, parent -> commit (1st)
    blob 65464 head.txt
    tree 799cf favorites

Page 63: Deep dark-side of git: How git works internally

Version Control Using Reference

● Pros:

○ Light Volume

○ Managing Multiple Files Space Efficiently

○ Easy To Know Who / When / Why Made A Change

○ Easy To Point A Snapshot

● Cons:

○ Only Content, No Title

○ Hard To Manage Multiple Files

○ Hard To Manage History

○ Hard To Remember Hash Values

Page 66: Deep dark-side of git: How git works internally

How Does Git Know The Current Commit?

Answer: HEAD

HEAD Points To A Reference Using The ref Format (Not A SHA-1)

$ cat .git/HEAD
ref: refs/heads/master

Page 69: Deep dark-side of git: How git works internally

HEAD

$ cat .git/HEAD
ref: refs/heads/master
$ git branch
  first
* master
$ git symbolic-ref HEAD refs/heads/first
$ cat .git/HEAD
ref: refs/heads/first
$ git branch
* first
  master
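Resolving the current commit is a two-step file lookup: read .git/HEAD, then read the reference it names. The sketch below assumes loose ref files only and uses a scratch directory instead of a real repository; `resolve_head` and `set_symbolic_head` are hypothetical helpers that roughly mirror what `git symbolic-ref` reads and writes.

```python
import os
import tempfile
from typing import Optional, Tuple


def resolve_head(git_dir: str) -> Tuple[Optional[str], str]:
    """Follow HEAD to a commit id; returns (ref name or None, commit id)."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    if not head.startswith("ref: "):
        return None, head                 # detached HEAD holds the id directly
    ref = head[len("ref: "):]
    with open(os.path.join(git_dir, *ref.split("/"))) as f:
        return ref, f.read().strip()


def set_symbolic_head(git_dir: str, ref: str) -> None:
    """Point HEAD at another reference, using the 'ref: <name>' format."""
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write("ref: " + ref + "\n")


# Scratch layout standing in for .git (assumption: not a real repository)
git_dir = os.path.join(tempfile.mkdtemp(), ".git")
os.makedirs(os.path.join(git_dir, "refs", "heads"))
for name, commit_id in [("first", "d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c"),
                        ("master", "a9cd7374ce4951ab93aac75d78a45a245e27f414")]:
    with open(os.path.join(git_dir, "refs", "heads", name), "w") as f:
        f.write(commit_id + "\n")

set_symbolic_head(git_dir, "refs/heads/master")
on_master = resolve_head(git_dir)
set_symbolic_head(git_dir, "refs/heads/first")
on_first = resolve_head(git_dir)
print(on_master, on_first)
```

Switching the current branch only rewrites HEAD. Neither reference file nor any object changes.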

Page 71: Deep dark-side of git: How git works internally

Internal Data Structure

.git/HEAD -> refs/heads/first -> commit (1st) -> tree
    blob a134f head.txt
    tree 799cf favorites
        blob 7cc07 warm_hug

refs/heads/master -> commit (2nd) -> tree, parent -> commit (1st)
    blob 65464 head.txt
    tree 799cf favorites

Page 72: Deep dark-side of git: How git works internally

One More Thing

Cloned. Now Fetch Or Pull?

Page 73: Deep dark-side of git: How git works internally

Fetch / Pull

Fetch Or Pull To Get Latest Code?

Page 74: Deep dark-side of git: How git works internally

Fetch

● Just Fetches The Remote Repository’s Objects And References Into Git’s Internal Storage

● If You Need The Changes In Your Working Directory,

○ Manually Merge Them Using git-merge, Or

○ Check Them Out

Page 75: Deep dark-side of git: How git works internally

Fetch

Refspec Describes Source / Destination

$ grep -A 2 '\[remote' .git/config
[remote "origin"]
    url = git://127.0.0.1/git/olaf.git
    fetch = +refs/heads/*:refs/remotes/origin/*

Source: refs/heads/*    Destination: refs/remotes/origin/*
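How a refspec maps a branch on the remote to a remote-tracking reference can be sketched as a small string transformation. `map_ref` is a hypothetical helper, and it covers only the single-wildcard form used in the config above, not every refspec feature.

```python
from typing import Optional


def map_ref(refspec: str, remote_ref: str) -> Optional[str]:
    """Map a ref name on the remote to its local destination, or None if no match."""
    # A leading "+" asks for the update even when it is not a fast-forward
    spec = refspec[1:] if refspec.startswith("+") else refspec
    src, dst = spec.split(":", 1)
    if "*" not in src:
        return dst if remote_ref == src else None
    prefix, suffix = src.split("*", 1)
    if len(remote_ref) < len(prefix) + len(suffix):
        return None
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return dst.replace("*", matched, 1)


spec = "+refs/heads/*:refs/remotes/origin/*"
print(map_ref(spec, "refs/heads/master"))   # refs/remotes/origin/master
print(map_ref(spec, "refs/tags/v1.0"))      # None
```

With this refspec, fetching never touches refs/heads/ locally. The remote's branches land under refs/remotes/origin/ instead.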

Page 76: Deep dark-side of git: How git works internally

Fetch

url = git://10.0.0.1/git/olaf.git
fetch = +refs/heads/*:refs/remotes/origin/*

Before Fetch

Remote: git://10.0.0.1/git/olaf.git
    .git/HEAD, refs/heads/master
    commit (2nd) -> tree, parent -> commit (1st)
        blob 65464 head.txt
        tree 799cf favorites
    commit (1st) -> tree
        blob a134f head.txt
        tree 799cf favorites
            blob 7cc07 warm_hug

Local: file:///home/sjpark/olaf
    .git/HEAD, refs/heads/master
    commit (1st) -> tree
        blob a134f head.txt
        tree 799cf favorites
            blob 7cc07 warm_hug

Page 77: Deep dark-side of git: How git works internally

Fetch

url = git://10.0.0.1/git/olaf.git
fetch = +refs/heads/*:refs/remotes/origin/*

After Fetch

Remote: git://10.0.0.1/git/olaf.git
    .git/HEAD, refs/heads/master
    commit (2nd) -> tree, parent -> commit (1st)
        blob 65464 head.txt
        tree 799cf favorites
    commit (1st) -> tree
        blob a134f head.txt
        tree 799cf favorites
            blob 7cc07 warm_hug

Local: file:///home/sjpark/olaf
    .git/HEAD, refs/heads/master, refs/remotes/origin/master
    commit (2nd) -> tree, parent -> commit (1st)
        blob 65464 head.txt
        tree 799cf favorites
    commit (1st) -> tree
        blob a134f head.txt
        tree 799cf favorites
            blob 7cc07 warm_hug

Page 78: Deep dark-side of git: How git works internally

git merge origin/master

[Diagram: The Repository Before And After The Merge. Both Panels Show The Same Two commits With Their trees And blobs, refs/remotes/origin/master, refs/heads/first, And .git/HEAD.]

Page 79: Deep dark-side of git: How git works internally

Pull

Pull Is Just Fetch Followed By Merge

A Merge Conflict May Occur…

Pull Is Sufficient For A Simple Project
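The fetch-then-merge idea can be modelled with plain dictionaries. This is a toy sketch, not how Git is implemented: object transfer is left out, commit IDs are shortened, all function names are hypothetical, and only the fast-forward case is handled. When the histories have diverged, it just reports that a real merge (and possibly a conflict) would be needed.

```python
def is_ancestor(parents, ancestor, commit):
    """True if `ancestor` is reachable from `commit` through parent links."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return False


def fetch(remote_refs, local_refs):
    """Copy remote branches to remote-tracking refs; local branches stay untouched."""
    prefix = "refs/heads/"
    for name, commit in remote_refs.items():
        if name.startswith(prefix):
            local_refs["refs/remotes/origin/" + name[len(prefix):]] = commit


def merge_fast_forward(parents, local_refs, branch, other):
    """Advance `branch` to `other` if possible; False means a true merge is needed."""
    ours, theirs = local_refs[branch], local_refs[other]
    if not is_ancestor(parents, ours, theirs):
        return False
    local_refs[branch] = theirs
    return True


def pull(parents, remote_refs, local_refs, branch):
    fetch(remote_refs, local_refs)
    tracking = "refs/remotes/origin/" + branch[len("refs/heads/"):]
    return merge_fast_forward(parents, local_refs, branch, tracking)


parents = {"a9cd7": ["d075c"], "d075c": []}      # child -> list of parents
remote_refs = {"refs/heads/master": "a9cd7"}
local_refs = {"refs/heads/master": "d075c"}

fast_forwarded = pull(parents, remote_refs, local_refs, "refs/heads/master")
print(fast_forwarded, local_refs)
```

Running `fetch` alone would have left refs/heads/master at the first commit, which is the difference between fetch and pull in this model.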

Page 80: Deep dark-side of git: How git works internally

In Short,

Git Is A Content-Addressable File System

Blob, Tree, Commit, Reference. That’s It =3


Page 81: Deep dark-side of git: How git works internally

Thank you :)


Page 84: Deep dark-side of git: How git works internally

This work by SeongJae Park is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported

License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.