But how many of us know what goes on under the hood?

For example, what happens when we use git commit? What is stored between commits? Is it just a diff between the current and previous commit? If so, how is the diff encoded? Or is an entire snapshot of the repo stored each time? What really happens when we use git init?

Many people who use git don’t know the answers to the questions above. But does it really matter?

First, as professionals, we should strive to understand the tools we use, especially if we use them all the time - like git.

But even more acutely, I've found that understanding how git actually works is useful in many scenarios - whether it’s resolving merge conflicts, looking to conduct an interesting rebase, or even just when something goes slightly wrong.

You’ll benefit from this post if you’re experienced enough with git to feel comfortable with commands such as git pull, git push, git add or git commit. Still, we will start with an overview to make sure we are on the same page regarding the mechanisms of git, and specifically, the terms used throughout this post.

I also uploaded a YouTube series covering this post - you are welcome to watch it here.

We will get a rare understanding of what goes on under the hood of what we do almost daily.

We will start by covering objects - blobs, trees, and commits. We will then briefly discuss branches and how they are implemented. We will dive into the working directory, staging area and repository.

And we will make sure we understand how these terms relate to the git commands we know and use to create a new repository.

Next, we will create a repository from scratch - without using git init, git add, or git commit. This will allow us to deepen our understanding of what is happening under the hood when we work with git.

We will also create new branches, switch branches, and create additional commits - all without using git branch or git checkout.

By the end of this post, you will feel like you understand git. Are you up for it?

Git Objects - blob, tree and commit

It is very useful to think about git as maintaining a file system, and specifically - snapshots of that system in time.

A file system begins with a root directory (in UNIX-based systems, /), which usually contains other directories (for example, /usr or /bin). These directories contain other directories, and/or files (for example, /usr/1.txt).

In git, the contents of files are stored in objects called blobs, binary large objects. The difference between blobs and files is that files also contain meta-data. For example, a file “remembers” when it was created, so if you move that file into another directory, its creation time remains the same.

Blobs, on the other hand, are just contents - binary streams of data. A blob doesn’t register its creation date, its name, or anything but its contents.

Every blob in git is identified by its SHA-1 hash. SHA-1 hashes consist of 20 bytes, usually represented by 40 characters in hexadecimal form. Throughout this post we will sometimes show just the first characters of that hash.

In git, the equivalent of a directory is a tree. A tree is basically a directory listing, referring to blobs as well as other trees. Trees are identified by their SHA-1 hashes as well. Referring to these objects, either blobs or other trees, happens via the SHA-1 hash of the objects. Note that the tree CAFE7 refers to the blob F92A0 as pic.png. In another tree, that same blob may have another name. The diagram above is equivalent to a file system with a root directory that has one file at /test.js, and a directory named /docs with two files: /docs/pic.png and /docs/1.txt.

Now it’s time to take a snapshot of that file system - and store all the files that existed at that time, along with their contents. A commit object includes a pointer to the main tree (the root directory), as well as other meta-data such as the committer, a commit message and the commit time. In most cases, a commit also has one or more parent commits - the previous snapshot(s). Of course, commit objects are also identified by their SHA-1 hashes. These are the hashes we are used to seeing when we use git log.

Every commit holds the entire snapshot, not just diffs from the previous commit(s). How can that work? Doesn’t that mean that we have to store a lot of data every commit? Let’s examine what happens if we change the contents of a file. Say that we edit 1.txt, and add an exclamation mark - that is, we changed the content from HELLO WORLD to HELLO WORLD!.
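To make the idea of objects identified by their SHA-1 hashes a bit more concrete, here is a minimal Python sketch of how such identifiers can be computed for blobs and trees. It is an illustration, not git's actual code. The helper names (git_object_id, blob_id, tree_id) are made up for this example, the bytes used for pic.png are a stand-in, and 1.txt is assumed to hold exactly HELLO WORLD with no trailing newline. The tree helper sorts entries by plain name, which is a simplification that is good enough for a directory containing only files.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Hash an object the way git does: '<type> <size>', a NUL byte, then the body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """A blob is only contents: no file name and no creation date go into the hash."""
    return git_object_id("blob", content)


def tree_id(entries) -> str:
    """entries: iterable of (mode, name, hex_id) tuples, e.g. ("100644", "1.txt", "...").

    Simplification: entries are sorted by plain name, which is enough
    for a directory that contains only files.
    """
    body = b""
    for mode, name, hex_id in sorted(entries, key=lambda entry: entry[1]):
        # Each entry is '<mode> <name>', a NUL byte, then the raw 20-byte SHA-1.
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_id)
    return git_object_id("tree", body)


# Assumed contents of 1.txt before and after the edit (no trailing newline).
old = blob_id(b"HELLO WORLD")
new = blob_id(b"HELLO WORLD!")

# Hypothetical stand-in for the real image bytes of pic.png.
pic = blob_id(b"stand-in bytes for pic.png")

# The /docs tree before and after the edit: only the 1.txt entry differs.
old_docs = tree_id([("100644", "pic.png", pic), ("100644", "1.txt", old)])
new_docs = tree_id([("100644", "pic.png", pic), ("100644", "1.txt", new)])

print("1.txt blob before:", old[:5], "after:", new[:5])
print("docs tree before: ", old_docs[:5], "after:", new_docs[:5])
print("pic.png blob in both trees:", pic[:5])
```

Running the sketch shows that changing a single character in 1.txt yields a different blob identifier, and therefore a different identifier for the tree that lists it, while the identifier of pic.png stays the same in both trees.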