deep dark-side of git: how git works internally

Post on 27-Aug-2014


DESCRIPTION

Describes how Git works internally, using small, focused plumbing commands

TRANSCRIPT

Deep Dark-side Of Git: How Git Works Internally

SeongJae Park <sj38.park@gmail.com>

Nice To Meet You

SeongJae Park

sj38.park@gmail.com

Git

DVCS (Distributed Version Control System)

Made By Linus Torvalds To Manage Linux


Many Projects Use Git Because It’s Awesome

Git: Learning Curve

Some People Say It Is Hard To Learn

This Time, We Will...

See How Git Works From Scratch

Just For Fun... Or To Make Friends With Git

Forget About The Complicated Commands This Time

In Short,

Git Is A Content-Addressable File System

Blob, Tree, Commit, Reference. That’s It =3


Git: Unsung Heroes Behind

● Git Looks Graceful Thanks To The Plumbing Commands It Is Built From

○ The Wounded Feet Are What We Are Interested In

Why VCS?

Usual Life Of File

FileA ver 0, FileB ver 0

FileA ver 0, FileB ver 1

FileA ver 1, FileB ver 1

FileA ver 1, FileB ver 2

We Need A Version Control System

A VCS Would...

● Record Every Change Safely

● Be Able To Check Out Any Version

● Make History Easy To Read

Brute VCS: File System

Rename / Back Up Every File Whenever A Change Is Made

$ ls

foo.c

foo_20140111.c

foo_final.c

foo_realfinal.c

foo_planb.c

foo_finalfinal.c

GIT vs FileSystem

● GIT: Content-Addressable FileSystem

● Key Is The SHA-1 Hash Of The Object (Header + Content), Value Is The Content

○ The Same Content Is Never Saved Twice

Save / Load 'test content'

$ mkdir olaf; cd olaf; git init

Initialized empty Git repository in olaf/.git/

$ echo 'test content' | git hash-object -w --stdin

d670460b4b4aece5915caf5c68d12f560a9fe3e4

$ find .git/objects/ -type f

.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4

$ git cat-file -p d67046

test content

$ git cat-file -t d67046

blob

What hash-object Does

content = "test content\n"    # echo appends the newline

header = "blob %d\0" % length_of(content)

store = header + content

sha1 = sha1_of(store)

dir = ".git/objects/" + sha1[0:2] + "/"

filename = sha1[2:]

write(dir + filename, zlib_deflate(store))

# Save the zlib-compressed header + content at the sha1 path
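The pseudocode above can be sketched in runnable Python. This is a minimal illustration, not Git's own implementation: the function name `hash_object` and the `objects_dir` parameter are assumptions made for the example, and it writes into a temporary directory rather than a real `.git/objects/`. Hashing `test content` plus the newline that `echo` appends should reproduce the ID shown on the earlier slide.

```python
import hashlib
import os
import tempfile
import zlib
from typing import Optional


def hash_object(content: bytes, obj_type: str = "blob",
                objects_dir: Optional[str] = None) -> str:
    """Compute an object ID; if objects_dir is given, also store the object."""
    header = f"{obj_type} {len(content)}\0".encode("ascii")
    store = header + content
    sha1 = hashlib.sha1(store).hexdigest()
    if objects_dir is not None:
        directory = os.path.join(objects_dir, sha1[:2])
        path = os.path.join(directory, sha1[2:])
        # The path is derived from the content, so identical content
        # maps to the same file and is written only once.
        if not os.path.exists(path):
            os.makedirs(directory, exist_ok=True)
            with open(path, "wb") as f:
                f.write(zlib.compress(store))
    return sha1


# Hypothetical stand-in for .git/objects/
with tempfile.TemporaryDirectory() as objects_dir:
    first = hash_object(b"test content\n", objects_dir=objects_dir)
    second = hash_object(b"test content\n", objects_dir=objects_dir)
    stored = [name for _, _, files in os.walk(objects_dir) for name in files]
    print(first, len(stored))
```

Saving the same bytes twice leaves a single file on disk, which is the "never saved twice" property of a content-addressable store.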

Version Control Using Hash Value

$ echo "eyes, mouth" > head.txt

$ git hash-object -w head.txt

a134fc2477395ee1a59664a0b660085edde63d04

$ echo "eyes, nose, mouth" > head.txt

$ git hash-object -w head.txt

6546481b73fb62d0c627812e17e355d43d6ed30e

$ git cat-file -p a134f > head.txt

$ cat head.txt

eyes, mouth
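Restoring an old version with cat-file amounts to finding the object file by its hash, inflating it, and stripping the header. The sketch below illustrates that idea under stated assumptions: `write_object` and `read_object` are hypothetical helper names, only loose (unpacked) objects are handled, and a temporary directory stands in for `.git/objects/`.

```python
import hashlib
import os
import tempfile
import zlib
from typing import Tuple


def write_object(objects_dir: str, obj_type: str, content: bytes) -> str:
    """Store header + content, zlib-compressed, under its SHA-1 path."""
    store = f"{obj_type} {len(content)}\0".encode("ascii") + content
    sha1 = hashlib.sha1(store).hexdigest()
    os.makedirs(os.path.join(objects_dir, sha1[:2]), exist_ok=True)
    with open(os.path.join(objects_dir, sha1[:2], sha1[2:]), "wb") as f:
        f.write(zlib.compress(store))
    return sha1


def read_object(objects_dir: str, prefix: str) -> Tuple[str, bytes]:
    """Look up a loose object by a (possibly abbreviated) ID."""
    directory = os.path.join(objects_dir, prefix[:2])
    matches = [n for n in os.listdir(directory) if n.startswith(prefix[2:])]
    if len(matches) != 1:
        raise KeyError(f"{prefix}: not found or ambiguous")
    with open(os.path.join(directory, matches[0]), "rb") as f:
        store = zlib.decompress(f.read())
    header, _, content = store.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("corrupt object: size mismatch")
    return obj_type, content


with tempfile.TemporaryDirectory() as objects_dir:
    old = write_object(objects_dir, "blob", b"eyes, mouth\n")
    new = write_object(objects_dir, "blob", b"eyes, nose, mouth\n")
    # Like `git cat-file -t` and `git cat-file -p` with an abbreviated hash
    obj_type, restored = read_object(objects_dir, old[:5])
    print(obj_type, restored.decode(), end="")
```

Both versions stay in the store side by side; which one is "current" is entirely up to whoever remembers the hashes.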

Version Control Using Hash Value

● Pros:

○ Light Volume

● Cons:

○ Only Content, No File Name

○ Hard To Manage Multiple Files

○ Hard To Manage History

○ Hard To Remember Hash Values

tree Object

Points To Other Objects (By Hash) With A Name

“A Root tree Object Is A Snapshot”

tree (root)
    blob a113f2 main.c
    blob b8934 olaf.c
    tree c9240 include
        blob d9b13 true_love.h

tree Object

$ mkdir favorites; echo 'fantastic' > favorites/warm_hug

$ git update-index --add head.txt favorites/warm_hug

$ git write-tree

567167268c5c71bb647ca728bdb25f388d027f57

$ git cat-file -p 56716

040000 tree 799cf15c89acb88d76321b7b1529c8a9888fb9e2    favorites

100644 blob a134fc2477395ee1a59664a0b660085edde63d04    head.txt

$ git cat-file -p 799cf

100644 blob 7cc07dcddbcf92487065d4c12011e8a12f62a1bd    warm_hug
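What write-tree stores can be sketched as follows. Each entry of a tree object is the mode, a space, the name, a NUL byte, and the 20-byte binary SHA-1 of the target; directories use mode `40000` in the stored form (cat-file prints it as `040000`). The helper names `object_id` and `tree_body` are assumptions for this example, and the printed IDs are simply whatever this sketch computes, not values checked against the slides.

```python
import hashlib
from typing import List, Tuple

Entry = Tuple[str, str, str]  # (mode, name, 40-hex object ID)


def object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>\\0' + body, as for every Git object."""
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: List[Entry]) -> bytes:
    """Serialize tree entries in Git's sorted order."""
    def sort_key(entry: Entry) -> bytes:
        mode, name, _ = entry
        # Entries are ordered by name; directories compare as if
        # their name ended in "/".
        return (name + "/" if mode == "40000" else name).encode()

    body = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    return body


blob_id = object_id("blob", b"fantastic\n")
favorites_id = object_id("tree", tree_body([("100644", "warm_hug", blob_id)]))
head_id = object_id("blob", b"eyes, mouth\n")
root_entries = [
    ("100644", "head.txt", head_id),
    ("40000", "favorites", favorites_id),
]
root_id = object_id("tree", tree_body(root_entries))
print(favorites_id, root_id)
```

Because a tree refers to its children by hash, changing one file changes the hash of every tree above it, while untouched subtrees keep their IDs and are shared between snapshots.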

Internal Data Structure

tree 56716
    blob a134f head.txt
    tree 799cf favorites
        blob 7cc07 warm_hug

Version Control Using tree Object

$ echo "eyes, nose, mouth" > head.txt

$ git update-index --add head.txt

$ git write-tree

e4885f26f1d82b59c42bf1ed207fec4f60655c35

$ git cat-file -p e4885

040000 tree 799cf15c89acb88d76321b7b1529c8a9888fb9e2    favorites

100644 blob 6546481b73fb62d0c627812e17e355d43d6ed30e    head.txt

$ git cat-file -p 65464

eyes, nose, mouth

Internal Data Structure

tree 56716
    blob a134f head.txt
    tree 799cf favorites
        blob 7cc07 warm_hug

tree e4885
    blob 65464 head.txt
    tree 799cf favorites (The Same Object As Above)

Version Control Using tree Object

● Pros:

○ Light Volume

○ Manages Multiple Files Space-Efficiently

● Cons:

○ Only Content, No Title

○ Hard To Manage Multiple Files

○ Hard To Manage History

○ Hard To Remember Hash Values

commit Object

Describes Who / When / Why The Change Was Made

Points To A tree Object, Together With The Information Above

commit Object

$ echo '1st commit' | git commit-tree 56716

d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c

$ git cat-file -p d075c

tree 567167268c5c71bb647ca728bdb25f388d027f57

author SeongJae Park <s**@gmail.com> 1401359546 +0900

committer SeongJae Park <s**@gmail.com> 1401359546 +0900

1st commit

Who, When: The author And committer Lines

Why: The Commit Message

Version Control Using commit Object

$ echo '2nd commit' | git commit-tree e4885 -p d075c

a9cd7374ce4951ab93aac75d78a45a245e27f414

$ git cat-file -p a9cd7

tree e4885f26f1d82b59c42bf1ed207fec4f60655c35

parent d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c

author SeongJae Park <s**@gmail.com> 1401360590 +0900

committer SeongJae Park <s**@gmail.com> 1401360590 +0900

2nd commit
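A commit object is plain text: a tree line, zero or more parent lines, author and committer lines, a blank line, and the message. The sketch below builds two such bodies and hashes them like any other object. The names `commit_body` and `object_id` are assumptions for this example, and the identity string is a hypothetical placeholder, so the resulting IDs are not expected to match the ones on the slides.

```python
import hashlib
from typing import Sequence


def object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>\\0' + body."""
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: Sequence[str], author: str,
                committer: str, message: str) -> bytes:
    """Lay out the text that `git cat-file -p <commit>` prints."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}"]
    return ("\n".join(lines) + "\n\n" + message).encode("utf-8")


# Hypothetical identity and timestamp, for illustration only
WHO_WHEN = "A U Thor <author@example.com> 1401359546 +0900"
TREE_1 = "567167268c5c71bb647ca728bdb25f388d027f57"
TREE_2 = "e4885f26f1d82b59c42bf1ed207fec4f60655c35"

first_body = commit_body(TREE_1, [], WHO_WHEN, WHO_WHEN, "1st commit\n")
first_id = object_id("commit", first_body)

# The second commit records the first one as its parent.
second_body = commit_body(TREE_2, [first_id], WHO_WHEN, WHO_WHEN, "2nd commit\n")
second_id = object_id("commit", second_body)
print(second_body.decode(), end="")
```

Since the parent's ID is part of the hashed text, a commit's ID depends on its entire history, which is what makes the chain of commits tamper-evident.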

Internal Data Structure

That’s Why People Say, “A Commit Is A Snapshot”

commit a9cd7 (2nd commit)
    parent -> commit d075c
    tree e4885
        blob 65464 head.txt
        tree 799cf favorites

commit d075c (1st commit)
    tree 56716
        blob a134f head.txt
        tree 799cf favorites
            blob 7cc07 warm_hug

Version Control Using commit Object

● Pros:

○ Light Volume

○ Manages Multiple Files Space-Efficiently

○ Easy To Know Who / When / Why Made A Change

● Cons:

○ Only Content, No Title

○ Hard To Manage Multiple Files

○ Hard To Manage History

○ Hard To Remember Hash Values

Git References

A Reference Is A File Storing A SHA-1 Value

References Reside In .git/refs/

Git References Using echo

$ echo "d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c" > .git/refs/heads/first

$ git log --pretty=oneline first

d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c 1st commit

Git References Using update-ref

$ git update-ref refs/heads/master a9cd7

$ git log --pretty=oneline master

a9cd7374ce4951ab93aac75d78a45a245e27f414 2nd commit

d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c 1st commit

$ find .git/refs/heads -type f

.git/refs/heads/first

.git/refs/heads/master

$ cat .git/refs/heads/master

a9cd7374ce4951ab93aac75d78a45a245e27f414
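Since a loose reference is just a small file holding a 40-character hash, the two slides above can be sketched in a few lines. This is an illustration under stated assumptions: `update_ref`, `read_ref`, and `list_heads` are hypothetical helper names, branch names are assumed to be flat (no `/`), packed refs are ignored, and unlike the real `git update-ref` the sketch accepts only full hashes. A temporary directory stands in for `.git/`.

```python
import os
import tempfile
from typing import Dict

FIRST = "d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c"
SECOND = "a9cd7374ce4951ab93aac75d78a45a245e27f414"


def update_ref(git_dir: str, ref: str, sha1: str) -> None:
    """Point a reference at a commit by writing the hash into its file."""
    if len(sha1) != 40 or any(c not in "0123456789abcdef" for c in sha1):
        raise ValueError("this sketch only accepts a full 40-hex SHA-1")
    path = os.path.join(git_dir, *ref.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(sha1 + "\n")


def read_ref(git_dir: str, ref: str) -> str:
    """Return the hash stored in a reference file."""
    with open(os.path.join(git_dir, *ref.split("/"))) as f:
        return f.read().strip()


def list_heads(git_dir: str) -> Dict[str, str]:
    """Map each branch name under refs/heads to the hash it stores."""
    heads_dir = os.path.join(git_dir, "refs", "heads")
    return {name: read_ref(git_dir, f"refs/heads/{name}")
            for name in sorted(os.listdir(heads_dir))}


# Hypothetical stand-in for .git/
with tempfile.TemporaryDirectory() as git_dir:
    update_ref(git_dir, "refs/heads/first", FIRST)
    update_ref(git_dir, "refs/heads/master", SECOND)
    heads = list_heads(git_dir)
    print(heads)
```

Moving a branch is therefore nothing more than overwriting one short file; no objects are copied or rewritten.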

Internal Data Structure

refs/heads/master -> commit a9cd7
refs/heads/first -> commit d075c

commit a9cd7 (2nd commit)
    parent -> commit d075c
    tree e4885
        blob 65464 head.txt
        tree 799cf favorites

commit d075c (1st commit)
    tree 56716
        blob a134f head.txt
        tree 799cf favorites
            blob 7cc07 warm_hug

Version Control Using Reference

● Pros:

○ Light Volume

○ Manages Multiple Files Space-Efficiently

○ Easy To Know Who / When / Why Made A Change

○ Easy To Point To A Snapshot

● Cons:

○ Only Content, No Title

○ Hard To Manage Multiple Files

○ Hard To Manage History

○ Hard To Remember Hash Values

How Does Git Know The Current Commit?

Answer: HEAD

HEAD Points To A Reference Using The ref Format (Not A SHA-1)

$ cat .git/HEAD

ref: refs/heads/master

HEAD

$ cat .git/HEAD

ref: refs/heads/master

$ git branch

  first

* master

$ git symbolic-ref HEAD refs/heads/first

$ cat .git/HEAD

ref: refs/heads/first

$ git branch

* first

  master
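Finding the current commit is a two-step lookup: read `.git/HEAD`, and if it holds `ref: <name>`, read the file that name points at. The sketch below illustrates this with hypothetical helpers `set_symbolic_ref` and `resolve_head`; it assumes loose refs only and uses a temporary directory in place of `.git/`. A HEAD that contains a bare hash (a detached HEAD) is handled as the other case.

```python
import os
import tempfile
from typing import Optional, Tuple

FIRST = "d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c"
SECOND = "a9cd7374ce4951ab93aac75d78a45a245e27f414"


def set_symbolic_ref(git_dir: str, target: str) -> None:
    """Make HEAD point at another reference, like `git symbolic-ref HEAD`."""
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write(f"ref: {target}\n")


def resolve_head(git_dir: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (ref name or None if detached, commit ID or None if unborn)."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        value = f.read().strip()
    if not value.startswith("ref: "):
        return None, value  # detached HEAD: the file holds a hash directly
    ref = value[len("ref: "):]
    path = os.path.join(git_dir, *ref.split("/"))
    if not os.path.exists(path):
        return ref, None  # branch named by HEAD has no commit yet
    with open(path) as f:
        return ref, f.read().strip()


# Hypothetical stand-in for .git/
with tempfile.TemporaryDirectory() as git_dir:
    heads = os.path.join(git_dir, "refs", "heads")
    os.makedirs(heads)
    for name, sha1 in (("first", FIRST), ("master", SECOND)):
        with open(os.path.join(heads, name), "w") as f:
            f.write(sha1 + "\n")

    set_symbolic_ref(git_dir, "refs/heads/master")
    on_master = resolve_head(git_dir)
    set_symbolic_ref(git_dir, "refs/heads/first")
    on_first = resolve_head(git_dir)
    print(on_master, on_first)
```

Switching the current branch this way changes only HEAD; the branch files and the object store are left untouched.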

Internal Data Structure

.git/HEAD -> refs/heads/first
refs/heads/master -> commit a9cd7
refs/heads/first -> commit d075c

commit a9cd7 (2nd commit)
    parent -> commit d075c
    tree e4885
        blob 65464 head.txt
        tree 799cf favorites

commit d075c (1st commit)
    tree 56716
        blob a134f head.txt
        tree 799cf favorites
            blob 7cc07 warm_hug

One More Thing

Cloned. Now Fetch Or Pull?

Fetch / Pull

Fetch Or Pull To Get The Latest Code?

Fetch

● Only Downloads The Remote Repository’s Objects And References Into Git’s Internal Storage

● If You Need The Changes In Your Working Directory,

○ Merge Them Manually Using git-merge, Or

○ Check Them Out

Fetch

A Refspec Describes Source / Destination

$ grep -A 2 '\[remote' .git/config

[remote "origin"]

url = git://127.0.0.1/git/olaf.git

fetch = +refs/heads/*:refs/remotes/origin/*

Source: refs/heads/* (On The Remote), Destination: refs/remotes/origin/* (Local)
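How a refspec maps a remote reference name to a local one can be sketched as below. The `Refspec` type and the `parse_refspec` and `map_ref` helpers are hypothetical names for this example, and only the simple single-`*` form from the config above is handled. The leading `+` tells Git to update the destination even when the update is not a fast-forward.

```python
from typing import NamedTuple, Optional


class Refspec(NamedTuple):
    force: bool  # True when the spec starts with "+"
    src: str     # pattern matched against the remote's ref names
    dst: str     # where the matching ref is stored locally


def parse_refspec(spec: str) -> Refspec:
    """Split '[+]<src>:<dst>' into its parts."""
    force = spec.startswith("+")
    src, _, dst = (spec[1:] if force else spec).partition(":")
    return Refspec(force, src, dst)


def map_ref(refspec: Refspec, remote_ref: str) -> Optional[str]:
    """Return the local ref name for remote_ref, or None if it does not match."""
    if "*" not in refspec.src:
        return refspec.dst if remote_ref == refspec.src else None
    prefix, suffix = refspec.src.split("*", 1)
    if len(remote_ref) < len(prefix) + len(suffix):
        return None
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return refspec.dst.replace("*", matched, 1)


spec = parse_refspec("+refs/heads/*:refs/remotes/origin/*")
print(map_ref(spec, "refs/heads/master"))
print(map_ref(spec, "refs/tags/v1.0"))
```

With this mapping, a fetch never touches the local `refs/heads/` branches; it only records where the remote's branches are under `refs/remotes/origin/`.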

Fetch

url = git://10.0.0.1/git/olaf.git

fetch = +refs/heads/*:refs/remotes/origin/*

Before Fetch

Remote: git://10.0.0.1/git/olaf.git
    .git/HEAD -> refs/heads/master -> commit (2nd)
    commit (2nd)
        parent -> commit (1st)
        tree
            blob 65464 head.txt
            tree 799cf favorites
    commit (1st)
        tree
            blob a134f head.txt
            tree 799cf favorites
                blob 7cc07 warm_hug

Local: file:///home/sjpark/olaf
    .git/HEAD -> refs/heads/master -> commit (1st)
    commit (1st)
        tree
            blob a134f head.txt
            tree 799cf favorites
                blob 7cc07 warm_hug

After Fetch

Remote: git://10.0.0.1/git/olaf.git (Unchanged)

Local: file:///home/sjpark/olaf
    refs/remotes/origin/master -> commit (2nd)
    .git/HEAD -> refs/heads/master -> commit (1st)
    commit (2nd)
        parent -> commit (1st)
        tree
            blob 65464 head.txt
            tree 799cf favorites
    commit (1st)
        tree
            blob a134f head.txt
            tree 799cf favorites
                blob 7cc07 warm_hug

git merge origin/master

Before Merge
    refs/remotes/origin/master -> commit (2nd)
    .git/HEAD -> refs/heads/first -> commit (1st)

After Merge
    refs/remotes/origin/master -> commit (2nd)
    .git/HEAD -> refs/heads/first -> commit (2nd)

Objects (The Same In Both)
    commit (2nd)
        parent -> commit (1st)
        tree
            blob 65464 head.txt
            tree 799cf favorites
    commit (1st)
        tree
            blob a134f head.txt
            tree 799cf favorites
                blob 7cc07 warm_hug

Pull

Pull Is Just A Fetch Followed By A Merge

A Merge Conflict May Occur…

Pull Is Sufficient For Simple Projects

In Short,

Git Is A Content-Addressable File System

Blob, Tree, Commit, Reference. That’s It =3


Thank you :)


This work by SeongJae Park is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.
