Git init

Init repository

```
[ryao@macpro-gn07 deep-inside-git]> git init
Initialized empty Git repository in /Users/ryao/Workspaces/deep-inside-git/.git/
```
.git

```
[ryao@macpro-gn07 deep-inside-git]> tree .git/
.git/
├── HEAD
├── config
├── description
├── hooks
│   ├── applypatch-msg.sample
│   ├── commit-msg.sample
│   ├── post-update.sample
│   ├── pre-applypatch.sample
│   ├── pre-commit.sample
│   ├── pre-push.sample
│   ├── pre-rebase.sample
│   ├── pre-receive.sample
│   ├── prepare-commit-msg.sample
│   └── update.sample
├── info
│   └── exclude
├── objects
│   ├── info
│   └── pack
└── refs
    ├── heads
    └── tags

8 directories, 14 files
```
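This is the structure of the .git directory. The main entries are:
HEAD: a pointer to the currently checked-out branch or commit
hooks: sample scripts that git can run on certain events, such as before a commit or before a push
logs: the reflog, a history of where HEAD and each branch have pointed; it is not in the listing above because nothing has been recorded yet
objects: where commit information and committed file contents are stored
refs: where branch and tag pointers are stored, under refs/heads and refs/tags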
HEAD content

```
[ryao@macpro-gn07 deep-inside-git]> cat .git/HEAD
ref: refs/heads/master
[ryao@macpro-gn07 deep-inside-git]> ll .git/refs/heads/master
ls: .git/refs/heads/master: No such file or directory
```
As we don’t have any commit yet, the file that HEAD refers to does not exist yet.
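The lookup above can be sketched in Python. This is only an illustration of the idea, not how git implements it: `resolve_head` is a hypothetical helper that reads HEAD, follows a `ref:` line, and returns `None` when the referenced branch file is missing. It ignores packed refs, and the repository layout is simulated in a temporary directory.

```python
import os
import tempfile


def resolve_head(git_dir):
    """Return the commit hash HEAD points to, or None if the branch has no commit.

    Hypothetical helper for illustration; it does not look at packed-refs.
    """
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    if not head.startswith("ref: "):
        return head  # detached HEAD: the file holds a hash directly
    ref_path = os.path.join(git_dir, head[len("ref: "):])
    if not os.path.exists(ref_path):
        return None  # no commit yet, so the branch file is missing
    with open(ref_path) as f:
        return f.read().strip()


# Simulate the layout of a freshly initialized repository.
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "refs", "heads"))
with open(os.path.join(demo_dir, "HEAD"), "w") as f:
    f.write("ref: refs/heads/master\n")

before_commit = resolve_head(demo_dir)
print(before_commit)
```

Once a commit exists and refs/heads/master has been written, the same function returns the hash stored in that file.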
hash-object

```
[ryao@macpro-gn07 deep-inside-git]> echo 'Rugal' | git hash-object --stdin
2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
[ryao@macpro-gn07 deep-inside-git]> echo 'Rugal' > README.md
[ryao@macpro-gn07 deep-inside-git]> cat README.md | git hash-object --stdin
2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
```
To start our tutorial, let me introduce the hash-object tool. This is what git uses to compute the hash of an object. The output above shows that the hash of a file is determined by its content alone, not by its name: git hashes the content prefixed with a short header that gives the object type and size.
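To make the computation concrete, here is a minimal Python sketch of what hash-object does for a blob: SHA-1 over the header `blob <size>\0` followed by the raw bytes. The function name `git_blob_hash` is made up for this example. Note that echo appends a newline, so the content is 6 bytes.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """SHA-1 of the blob header plus the raw content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# echo 'Rugal' writes the five letters plus a trailing newline.
rugal_hash = git_blob_hash(b"Rugal\n")
print(rugal_hash)
```

The printed value should match the hash that git hash-object reported above.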
Git add

Updated objects folder

```
[ryao@macpro-gn07 deep-inside-git]> git add -A && git status
On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

        new file:   README.md

[ryao@macpro-gn07 deep-inside-git]> tree .git/objects
.git/objects/
├── 2f
│   └── b811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
├── info
└── pack
```
Simply adding the new file to the staging area creates a new object in the .git/objects folder. Notice that the folder name followed by the file name is 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a, exactly the hash we calculated above. So once a file is staged, its content is stored as an object in .git/objects: the first two characters of the hash become the folder name, and the remaining 38 become the file name.
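The mapping from hash to path is simple enough to write down. The helper below is a hypothetical sketch (the name `loose_object_path` is not part of git) that splits a hash the same way the listing above shows.

```python
def loose_object_path(git_dir: str, object_hash: str) -> str:
    """Build the path of a loose object: 2-character folder, 38-character file."""
    return "/".join([git_dir, "objects", object_hash[:2], object_hash[2:]])


readme_path = loose_object_path(".git", "2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a")
print(readme_path)
```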
Object content

```
[ryao@macpro-gn07 deep-inside-git]> cat .git/objects/2f/b811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
xKOR0*MOV
[ryao@macpro-gn07 deep-inside-git]> git cat-file -s 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
6
[ryao@macpro-gn07 deep-inside-git]> git cat-file -t 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
blob
[ryao@macpro-gn07 deep-inside-git]> git cat-file -p 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
Rugal
```
Here comes another interesting tool, cat-file. The object file on disk is zlib-compressed, which is why plain cat prints gibberish. With cat-file we can inspect an object's size (-s), type (-t), and content (-p). We will keep using it to look at objects later.
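As a rough illustration of what cat-file has to do for a loose object, the sketch below decompresses the bytes with zlib and splits off the `type size\0` header. To stay self-contained it compresses the expected bytes in memory instead of reading the repository; on a real repository `raw` would come from reading the object file in binary mode. The name `parse_loose_object` is an assumption for this example.

```python
import zlib


def parse_loose_object(raw: bytes):
    """Decompress a loose object and split it into (type, size, content)."""
    data = zlib.decompress(raw)
    header, _, content = data.partition(b"\0")
    obj_type, size = header.split(b" ")
    return obj_type.decode("ascii"), int(size), content


# Stand-in for the compressed bytes stored under .git/objects/2f/.
raw = zlib.compress(b"blob 6\0Rugal\n")
obj_type, size, content = parse_loose_object(raw)
print(obj_type, size, content)
```

The three values correspond to what cat-file -t, -s, and -p printed above.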
Git commit

Commit

In this section we will commit our first file. Let’s watch what changes.

```
[ryao@macpro-gn07 deep-inside-git]> git commit -m"Initial commit"
[master (root-commit) f73a6ae] Initial commit
 1 file changed, 1 insertion(+)
 create mode 100644 README.md
[ryao@macpro-gn07 deep-inside-git]> cat .git/refs/heads/master
f73a6ae93b095b899fbdb3b2485f5829b9f460cf
```
After the first commit, we get a file named master under .git/refs/heads. Its content is the hash of the commit.
Updated objects folder

```
[ryao@macpro-gn07 deep-inside-git]> tree .git/objects
.git/objects
├── 2f
│   └── b811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
├── 36
│   └── 2032c56bddec6ad5b639e16eeb594f92886516
├── f7
│   └── 3a6ae93b095b899fbdb3b2485f5829b9f460cf
├── info
└── pack
```
Git commit objects

Now let’s inspect the commit object.

```
[ryao@macpro-gn07 deep-inside-git]> git cat-file -t f73a6ae93b095b899fbdb3b2485f5829b9f460cf
commit
[ryao@macpro-gn07 deep-inside-git]> git cat-file -p f73a6ae93b095b899fbdb3b2485f5829b9f460cf
tree 362032c56bddec6ad5b639e16eeb594f92886516
author Rugal Bernstein <ryao@peakcontact.com> 1498230956 -0400
committer Rugal Bernstein <ryao@peakcontact.com> 1498230956 -0400

Initial commit
```
This object is of commit type, which means it contains information about a commit, including:
Hash of the tree
Author information
Committer information
Commit message
Commit tree

```
[ryao@macpro-gn07 deep-inside-git]> git cat-file -t 362032c56bddec6ad5b639e16eeb594f92886516
tree
[ryao@macpro-gn07 deep-inside-git]> git cat-file -p 362032c56bddec6ad5b639e16eeb594f92886516
100644 blob 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a    README.md
```
This object is of tree type. It lists the files in the snapshot that the commit records. Each entry contains:
File mode (permissions)
Object type
Object hash
File name
Notice that the file name and the file content are stored separately, which means file content can be reused. We will see this in a later section.
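The separation of names and content shows up clearly if we rebuild the tree object by hand. The sketch below assumes the usual binary layout of a tree entry (mode, space, name, NUL byte, then the 20 raw bytes of the blob hash) and only handles regular files in a single directory. `git_tree_hash` is a name invented for this example.

```python
import hashlib


def git_tree_hash(entries):
    """Hash a flat tree given (mode, name, blob_hash_hex) tuples.

    Simplified sketch: no subdirectories, entries sorted by file name.
    """
    body = b""
    for mode, name, blob_hex in sorted(entries, key=lambda e: e[1].encode()):
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(blob_hex)
    header = b"tree " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


readme_blob = "2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a"
tree_hash = git_tree_hash([("100644", "README.md", readme_blob)])
print(tree_hash)
```

If the assumed layout is right, the result equals the tree hash recorded in the first commit. The entry holds only a name and a pointer to the blob, so a second name can point to the same blob without storing the content again.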
Commit object content

```
[ryao@macpro-gn07 deep-inside-git]> git cat-file -t 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
blob
[ryao@macpro-gn07 deep-inside-git]> git cat-file -p 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
Rugal
```

This object is of blob type. It contains the original file content.
Second Commit

Now let’s make another commit that adds a file with different content and a different name from README.md.

```
[ryao@macpro-gn07 deep-inside-git]> echo 'Bernstein' > INSTALL.md && git add -A
[ryao@macpro-gn07 deep-inside-git]> cat INSTALL.md | git hash-object --stdin
56757e169d62beeb6371e7f5d3bd6bd507edd2f6
[ryao@macpro-gn07 deep-inside-git]> tree .git/objects/
.git/objects/
├── 2f
│   └── b811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
├── 36
│   └── 2032c56bddec6ad5b639e16eeb594f92886516
├── 56
│   └── 757e169d62beeb6371e7f5d3bd6bd507edd2f6
├── f7
│   └── 3a6ae93b095b899fbdb3b2485f5829b9f460cf
├── info
└── pack

6 directories, 4 files
[ryao@macpro-gn07 deep-inside-git]> git commit -m"Add INSTALL.md file"
[master 800d7b9] Add INSTALL.md file
 1 file changed, 1 insertion(+)
 create mode 100644 INSTALL.md
[ryao@macpro-gn07 deep-inside-git]> git lg
* 800d7b9 - (HEAD -> master) Add INSTALL.md file (68 seconds ago) <Rugal Bernstein>
* f73a6ae - Initial commit (13 minutes ago) <Rugal Bernstein>
```
Now we have 3 more git objects than after the first commit: a blob for INSTALL.md, a new tree, and a new commit.
```
[ryao@macpro-gn07 deep-inside-git]> tree .git/objects/
.git/objects/
├── 2f
│   └── b811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
├── 36
│   └── 2032c56bddec6ad5b639e16eeb594f92886516
├── 56
│   └── 757e169d62beeb6371e7f5d3bd6bd507edd2f6
├── 80
│   └── 0d7b9974fd1d4ad26791dfcf4bb0478c51c4da
├── c6
│   └── 919ff8ab7ff578ebb6995121d501aa645d0797
├── f7
│   └── 3a6ae93b095b899fbdb3b2485f5829b9f460cf
├── info
└── pack

8 directories, 6 files
```
And master now points to the latest commit.

```
[ryao@macpro-gn07 deep-inside-git]> cat .git/refs/heads/master
800d7b9974fd1d4ad26791dfcf4bb0478c51c4da
```
Git commit objects

```
[ryao@macpro-gn07 deep-inside-git]> git cat-file -p 800d7b9974fd1d4ad26791dfcf4bb0478c51c4da
tree c6919ff8ab7ff578ebb6995121d501aa645d0797
parent f73a6ae93b095b899fbdb3b2485f5829b9f460cf
author Rugal Bernstein <ryao@peakcontact.com> 1498231664 -0400
committer Rugal Bernstein <ryao@peakcontact.com> 1498231664 -0400

Add INSTALL.md file
```
This object has the same kinds of data as before, plus a parent field that indicates which commit it follows.
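Because a commit object is plain text, its fields are easy to pull apart. The following sketch parses the text printed by cat-file -p above. `parse_commit` is a hypothetical helper, and it keeps parents in a list on the assumption that a merge commit would carry more than one parent line.

```python
def parse_commit(text: str):
    """Split a commit object's text into header fields and the message."""
    head, _, message = text.partition("\n\n")
    fields = {"parent": []}
    for line in head.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            fields["parent"].append(value)  # a root commit has none
        else:
            fields[key] = value
    fields["message"] = message.strip()
    return fields


second_commit = parse_commit(
    "tree c6919ff8ab7ff578ebb6995121d501aa645d0797\n"
    "parent f73a6ae93b095b899fbdb3b2485f5829b9f460cf\n"
    "author Rugal Bernstein <ryao@peakcontact.com> 1498231664 -0400\n"
    "committer Rugal Bernstein <ryao@peakcontact.com> 1498231664 -0400\n"
    "\n"
    "Add INSTALL.md file\n"
)
print(second_commit["parent"])
```

Following the parent hash from commit to commit is how the history shown by git log is reconstructed.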
Commit tree

```
[ryao@macpro-gn07 deep-inside-git]> git cat-file -p c6919ff8ab7ff578ebb6995121d501aa645d0797
100644 blob 56757e169d62beeb6371e7f5d3bd6bd507edd2f6    INSTALL.md
100644 blob 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a    README.md
```

We have one more file in the tree.
Commit object content

```
[ryao@macpro-gn07 deep-inside-git]> git cat-file -p 56757e169d62beeb6371e7f5d3bd6bd507edd2f6
Bernstein
[ryao@macpro-gn07 deep-inside-git]> git cat-file -p 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a
Rugal
```
Override

Finally, let’s make another commit in which we overwrite INSTALL.md with the same content as README.md.

```
[ryao@macpro-gn07 deep-inside-git]> echo 'Rugal' > INSTALL.md && git commit -am"Override content of INSTALL.md"
[master 95ab7c4] Override content of INSTALL.md
 1 file changed, 1 insertion(+), 1 deletion(-)
[ryao@macpro-gn07 deep-inside-git]> git lg
* 95ab7c4 - (HEAD -> master) Override content of INSTALL.md (7 seconds ago) <Rugal Bernstein>
* 800d7b9 - Add INSTALL.md file (15 minutes ago) <Rugal Bernstein>
* f73a6ae - Initial commit (27 minutes ago) <Rugal Bernstein>
```
Git object content

```
[ryao@macpro-gn07 deep-inside-git]> git cat-file -p 95ab7c4
tree 2d5cfaf5513d9dc6876124ff683241bb5c61e0ae
parent 800d7b9974fd1d4ad26791dfcf4bb0478c51c4da
author Rugal Bernstein <ryao@peakcontact.com> 1498232582 -0400
committer Rugal Bernstein <ryao@peakcontact.com> 1498232582 -0400

Override content of INSTALL.md
```

Not much difference from the previous commit.
Commit tree

```
[ryao@macpro-gn07 deep-inside-git]> git cat-file -p 2d5cfaf5513d9dc6876124ff683241bb5c61e0ae
100644 blob 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a    INSTALL.md
100644 blob 2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a    README.md
```
Something interesting shows up in this tree: both files point to the same blob hash. This means git stores identical content only once and reuses the object, which reduces repository size.
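The reuse falls out of content addressing: if objects are keyed by the hash of their content, two files with the same bytes map to the same key. A toy model in Python, where the dictionary `store` stands in for .git/objects and `git_blob_hash` is an illustrative helper rather than a git API:

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """SHA-1 of the blob header plus the raw content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Working tree after the third commit: two names, one content.
files = {"README.md": b"Rugal\n", "INSTALL.md": b"Rugal\n"}

# Keyed by hash, identical content collapses into a single entry.
store = {}
for name, content in files.items():
    store[git_blob_hash(content)] = content

print(len(files), "files,", len(store), "stored blob")
```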
Git GC

Here is one more interesting tool. We can use gc to compress loose objects into a single pack file with an accompanying index.

```
[ryao@macpro-gn07 deep-inside-git]> git gc
Counting objects: 8, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (5/5), done.
Writing objects: 100% (8/8), done.
Total 8 (delta 0), reused 8 (delta 0)
```
```
[ryao@macpro-gn07 deep-inside-git]> tree .git/objects/
.git/objects/
├── info
│   └── packs
└── pack
    ├── pack-1ba47dfdc2c98b428c17082e4ee16e8c111c42ac.idx
    └── pack-1ba47dfdc2c98b428c17082e4ee16e8c111c42ac.pack

2 directories, 3 files
```
Git verify-pack

After compression, we can still list the objects stored in the pack by using verify-pack. For each object it prints the hash, the type, the size, the size inside the pack, and the offset in the pack file.

```
[ryao@macpro-gn07 deep-inside-git]> git verify-pack -v .git/objects/pack/pack-1ba47dfdc2c98b428c17082e4ee16e8c111c42ac.idx
95ab7c4946063d036a84f677081271a8106407ac commit 255 178 12
800d7b9974fd1d4ad26791dfcf4bb0478c51c4da commit 244 172 190
f73a6ae93b095b899fbdb3b2485f5829b9f460cf commit 191 131 362
2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a blob   6 15 493
2d5cfaf5513d9dc6876124ff683241bb5c61e0ae tree   75 61 508
c6919ff8ab7ff578ebb6995121d501aa645d0797 tree   75 82 569
56757e169d62beeb6371e7f5d3bd6bd507edd2f6 blob   10 19 651
362032c56bddec6ad5b639e16eeb594f92886516 tree   37 48 670
non delta: 8 objects
.git/objects/pack/pack-1ba47dfdc2c98b428c17082e4ee16e8c111c42ac.pack: ok
```
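The verify-pack listing is regular enough to summarize with a few lines of Python. This sketch assumes every object line has exactly five columns, which holds for the non-delta objects shown here; delta objects carry extra columns and are skipped. `parse_verify_pack` is a hypothetical helper name.

```python
from collections import Counter


def parse_verify_pack(output: str):
    """Parse the five-column object lines of `git verify-pack -v` output."""
    objects = []
    for line in output.splitlines():
        parts = line.split()
        # Columns: hash, type, size, size in pack, offset in pack.
        if len(parts) == 5 and len(parts[0]) == 40:
            sha, obj_type, size, packed_size, offset = parts
            objects.append((sha, obj_type, int(size), int(packed_size), int(offset)))
    return objects


sample = """\
95ab7c4946063d036a84f677081271a8106407ac commit 255 178 12
800d7b9974fd1d4ad26791dfcf4bb0478c51c4da commit 244 172 190
f73a6ae93b095b899fbdb3b2485f5829b9f460cf commit 191 131 362
2fb811a4ca96b3d0ac9b4fb8aa3d96e6a809509a blob   6 15 493
2d5cfaf5513d9dc6876124ff683241bb5c61e0ae tree   75 61 508
c6919ff8ab7ff578ebb6995121d501aa645d0797 tree   75 82 569
56757e169d62beeb6371e7f5d3bd6bd507edd2f6 blob   10 19 651
362032c56bddec6ad5b639e16eeb594f92886516 tree   37 48 670
non delta: 8 objects
"""

packed = parse_verify_pack(sample)
by_type = Counter(obj_type for _, obj_type, _, _, _ in packed)
print(dict(by_type))
```

The counts line up with the walkthrough: three commits, three trees, and only two blobs, because README.md and INSTALL.md share one blob in the last commit.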