Edit: I read "What does FETCH_HEAD in Git mean?" before asking this question.
Sorry for the original inaccurate question.
My question is: how does fetch really work? Does fetch discard my current log?
This is my situation: my teammates and I share one repository that has only one branch, so we have to fetch before pushing anything.
We typically do it this way:
git status
git add .
git commit -m message1
git fetch origin
git reset head
git status
git add .
git commit -m message
git push
But after the reset, it seems that my previous commit (with message1) is gone.
Is this normal or is there anything wrong?
How can I access my local history?
They are synced but my local history is gone.
Old stuff, forget it: I have been learning the Git CLI recently.
Someone told me to type "git fetch head" to keep track of the remote branch.
But I wonder what this does. Does this command override my local log?
And what is the difference between "git fetch" and "git fetch head"?
Best Answer
git fetch itself is really quite simple. The complicated parts come before and after.
The first thing to know here is that Git stores commits. In fact, this is essentially what Git is about: it manages a collection of commits. This collection rarely shrinks: for the most part, the only thing you ever do with this collection of commits is add new commits.
Commits, the index, and the work-tree
Each commit has several pieces of information, such as the author's name and email address and a time-stamp. Each commit also saves a complete snapshot of all the files you told it to: these are the files stored in your index (also known as your staging area) at the time you ran git commit. This is also true of commits you obtain from someone else: they save the files that were in the other user's index at the time the other user ran git commit.
Note that each Git repository has just the one index, at least initially. This index is linked with the one work-tree. In newer Git versions, you can use git worktree add to add additional work-trees; each new work-tree comes with one new index/staging-area. The point of this index is to act as an intermediate file-holder, situated between "the current commit" (aka HEAD) and the work-tree. Initially, the HEAD commit and the index normally match: they contain the same versions of all the committed files. Git copies the files from HEAD into the index, and then from the index into the work-tree.
It's easy to see the work-tree: it has your files in their ordinary format, where you can view and edit them with all the regular tools on your computer. If you write Java or Python code, or HTML for a web server, the work-tree files are usable by the compiler or interpreter or web-server. The files stored in the index, and stored in each Git commit, do not have this form and are not usable by the compilers, interpreters, web-servers, and so on.
One other thing to remember about commits is that once a file is in a commit, it cannot be changed. No part of any commit can ever change. A commit is therefore permanent, or at least permanent unless it is removed (which can be done but is difficult and usually undesirable). What is in the index and work-tree, however, can be modified at any time. This is why they exist: the index is almost a "modifiable commit" (except that it's not saved until you run git commit), and the work-tree keeps the files in the form that the rest of the computer can use.[1]
[1] It's not necessary to have both the index and the work-tree. The VCS could treat the work-tree as the "modifiable commit". This is what Mercurial does; this is why Mercurial does not need an index. This is arguably a better design, but it's not the way Git works, so when using Git, you have an index. The presence of the index is a large part of what makes Git so fast: without it, Mercurial has to be extra-clever, and is still not as fast as Git.
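As a rough sketch of the three places a file lives (HEAD, the index, and the work-tree), the following throwaway script changes a file and compares the copies before and after git add. The temporary directory, the file name file.txt and the placeholder identity are assumptions made up for the illustration.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository, an assumption for this sketch
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git config commit.gpgsign false

echo "version 1" > file.txt
git add file.txt
git commit -q -m "first commit"   # HEAD, index and work-tree now all match

echo "version 2" > file.txt       # change only the work-tree copy
unstaged=$(git diff --name-only)            # work-tree vs index
staged=$(git diff --cached --name-only)     # index vs HEAD
echo "before add: unstaged='$unstaged' staged='$staged'"

git add file.txt                  # copy the work-tree version into the index
unstaged_after=$(git diff --name-only)
staged_after=$(git diff --cached --name-only)
echo "after add:  unstaged='$unstaged_after' staged='$staged_after'"
```

Before git add, the difference shows up between the work-tree and the index; afterwards it shows up between the index and HEAD, which is what the next git commit would snapshot.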
Commits remember their parent; new commits are children
When you make a new commit by running git commit, Git takes the index contents and makes a permanent snapshot of everything that is in it right at that point. (This is why you must git add files: you copy them from your work-tree, where you have changed them, back into your index, so that they are ready to be "photographed" for the new snapshot.) Git also collects a commit message, and of course uses your name and email address and the current time, to make the new commit.
But Git also stores, in the new commit, the hash ID of the current commit. We say that the new commit "points back to" the current commit. Consider, for instance, this simple three-commit repository:
    A <-B <-C   <-- master (HEAD)
Here we say that the branch name master "points to" the third commit, which I have labeled C, rather than using one of Git's incomprehensible hash IDs like b06d364.... (The name HEAD refers to the branch name, master. This is how Git can turn the string HEAD into the correct hash ID: Git follows HEAD to master, then reads the hash ID out of master.) It's commit C itself that "points to" (retains the hash ID of) commit B, though; and commit B points to commit A. (Since commit A is the very first commit ever, there is no earlier commit for it to point to, so it doesn't point anywhere at all, which makes it a bit special. This is called a root commit.)
To make a new commit, Git packages up the index into a snapshot, saves that with your name and email address and so on, and includes the hash ID of commit C, to make a new commit with a new hash ID. We will use D instead of the new hash ID since we don't know what the new hash ID will be:
    A <-B <-C <-D
Note how D points to C. Now that D exists, Git alters the hash ID stored under the name master, to store D's hash ID instead of C's. The name stored in HEAD itself does not change at all: it's still master. So now we have this:
    A <-B <-C <-D   <-- master (HEAD)
You can see from this diagram how Git works: given a name, like master, Git simply follows the arrow to find the latest commit. That commit has a backwards arrow to its earlier or parent commit, which has another backwards arrow to its own parent, and so on, throughout all its ancestors leading back to the root commit.
Note that while children remember their parents, the parent commits do not remember their children. This is because no part of any commit can ever change: Git literally can't add the children to the parent, and it does not even try. Git must always work backwards, from newer to older. The commit arrows all automatically point backwards, so normally I do not even draw them:
    A--B--C--D   <-- master (HEAD)
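To see the backwards links for yourself, here is a small sketch that builds four commits in a scratch repository and asks Git what HEAD names and which commit each one points back to. The labels A through D stand in for real hash IDs, and the temporary directory and placeholder identity are assumptions of the example.

```shell
set -e
repo=$(mktemp -d)                          # scratch repository for the sketch
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git config commit.gpgsign false

for label in A B C; do
  echo "$label" > file.txt
  git add file.txt
  git commit -q -m "$label"
done

head_name=$(git symbolic-ref HEAD)         # HEAD holds a branch name, not a hash
c_hash=$(git rev-parse master)             # master currently names commit C

echo D > file.txt
git add file.txt
git commit -q -m D                         # master now names the new commit D

d_parent=$(git rev-parse master^)          # D stores the hash ID of its parent
# rev-list --parents prints the commit followed by its parents;
# the root commit A has none, so only one word comes back.
root_words=$(git rev-list --parents -n 1 master~3 | wc -w)

git --no-pager log --format='%h %s  (parent: %p)' master
```

The log walks from D back to A, newest first, which is the only direction Git can walk.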
Distributed repositories: what git fetch does
When we use git fetch, we have two different Gits, with different (but related) repositories. Suppose we have two Git repositories, on two different computers, that both start out with those same three commits:
    A--B--C   <-- master (HEAD)
Because they start out with the exact same commits, these three commits also have the same hash IDs. This part is very clever and is the reason the hash IDs are the way they are: the hash ID is a checksum[2] of the contents of the commit, so that any two commits that are exactly identical always have the same hash ID.
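The claim that identical commits get identical hash IDs can be checked with a sketch like this one. It makes the "same" commit in two unrelated scratch repositories by pinning everything that goes into a commit: the file contents, the message, the author and committer, and both time-stamps. The helper name make_commit, the identity and the fixed time-stamp are all made up for the example.

```shell
set -e

# Hypothetical helper: create a fresh repository containing one commit whose
# inputs are all fixed, then print that commit's hash ID.
make_commit() {
  dir=$(mktemp -d)
  cd "$dir"
  git init -q
  echo "same content" > file.txt
  git add file.txt
  GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com" \
  GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com" \
  GIT_AUTHOR_DATE="1577836800 +0000" GIT_COMMITTER_DATE="1577836800 +0000" \
    git -c commit.gpgsign=false commit -q -m "$1"
  git rev-parse HEAD
}

hash_one=$(make_commit "A")        # each call runs in its own subshell
hash_two=$(make_commit "A")        # a different repository, same inputs
hash_other=$(make_commit "B")      # only the message differs

echo "first:  $hash_one"
echo "second: $hash_two"
echo "other:  $hash_other"
```

The first two hash IDs come out equal even though the repositories never talked to each other, while changing any input, here just the message, gives a different ID.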
Now, you, in your Git and your repository, have added a new commit D. Meanwhile they (whoever they are) may have added their own new commits. We'll use different letters since their commits will necessarily have different hashes. We'll also look at this mostly from your (Harry's) point of view; we'll call them "Sally". We'll add one more thing to our picture of your repository: it now looks like this:
    A--B--C   <-- sally/master
           \
            D   <-- master (HEAD)
Now let's assume that Sally made two commits. In her repository, she now has this:
    A--B--C--E--F   <-- master (HEAD)
or perhaps (if she fetches from you, but has not yet run git fetch):
When you run git fetch, you connect your Git to Sally's Git, and ask her if she has any new commits added to her master since commit C. She does: she has her new commits E and F. So your Git gets those commits from her, along with everything needed to complete the snapshots for those commits. Your Git then adds those commits to your repository, so that you now have this:
            E--F   <-- sally/master
           /
    A--B--C--D   <-- master (HEAD)
As you can see, what git fetch did for you was to collect all of her new commits and add them to your repository.
In order to remember where her master is, now that you have talked with her Git, your Git copies her master to your sally/master. Your own master, and your own HEAD, do not change at all. Only these "memory of another Git repository" names, which Git calls remote-tracking branch names, change.
[2] This hash is a cryptographic hash, in part so that it's difficult to fool Git, and in part because cryptographic hashes naturally behave well for Git's purposes. The current hash uses SHA-1, which was secure but has seen brute-force attacks and is now being abandoned for cryptography. Git will likely move to SHA2-256 or SHA3-256 or some other larger hash. There will be a transition period with some unpleasantness. :-)
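Here is a sketch of that fetch, using a shared bare repository named origin in place of talking to Sally's Git directly, so the remote-tracking name is origin/master rather than sally/master. The directory names and the two helper functions are assumptions for the illustration. It records where master and origin/master point before and after git fetch.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: give the current repository a placeholder identity.
setup_identity() {
  git config user.name "Example User"
  git config user.email "user@example.com"
  git config commit.gpgsign false
}

# Hypothetical helper: commit a one-line file named after the commit label.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

# Shared starting history A--B--C, published as a bare repository.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
setup_identity
commit A; commit B; commit C
cd ..
git clone -q --bare seed origin.git
git clone -q origin.git harry
git clone -q origin.git sally

# Sally adds E and F and publishes them.
(cd sally && setup_identity && commit E && commit F && git push -q origin master)

# You (Harry) add D locally, then fetch.
cd harry
setup_identity
commit D
master_before=$(git rev-parse master)
tracking_before=$(git rev-parse origin/master)

git fetch -q origin

master_after=$(git rev-parse master)
tracking_after=$(git rev-parse origin/master)
new_commits=$(git rev-list --count master..origin/master)
git --no-pager log --oneline --graph --all
```

Only origin/master moves. Your master still names D, and your commit D is still there, alongside the two fetched commits.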
You should now merge or rebase; git reset is generally wrong
Note that after you have fetched from Sally, it is your repository, and only your repository, that has all the work from both of you. Sally still does not have your new commit D.
This is still true even if instead of "Sally", your other Git is called origin. Now that you have both master and origin/master, you must do something to connect your new commit D with their latest commit F:
    A--B--C--D   <-- master (HEAD)
           \
            E--F   <-- origin/master
(I moved D on top for graph-drawing reasons, but this is the same graph as before.)
Your main two choices here are to use git merge or git rebase. (There are other ways to do this but these are the two to learn.)
Merge is actually simpler, as git rebase does something that involves the verb form of merging, to merge. What git merge does is to run the verb form of merging, and then commit the result as a new commit that is called a merge commit or simply "a merge", which is the noun form of merging. We can draw the new merge commit G this way:
    A--B--C--D---G   <-- master (HEAD)
           \    /
            E--F   <-- origin/master
Unlike a regular commit, a merge commit has two parents.[3] It connects back to both of the two earlier commits that were used to make the merge. This makes it possible to push your new commit G to origin: G takes with it your D, but also connects back to their F, so their Git is OK with this new update.
This merge is the same kind of merge you get from merging two branches. And in fact, you did merge two branches here: you merged your master with Sally's (or origin's) master.
Using git rebase is usually easy, but what it does is more complicated. Instead of merging your commit D with their commit F to make a new merge commit G, what git rebase does is to copy each of your commits so that the new copies, which are new and different commits, come after the latest commit on your upstream.
Here, your upstream is origin/master, and the commits that you have that they don't are just your one commit D. So git rebase makes a copy of D, which I will call D', placing the copy after their commit F, so that D''s parent is F. The intermediate graph looks like this:[5]
    A--B--C--D   <-- master
           \
            E--F   <-- origin/master
                \
                 D'   <-- HEAD
The copying process uses the same merging code that git merge uses to do the verb form, to merge, of your changes from commit D.[4] Once the copy is done, however, the rebase code sees that there are no more commits to copy, so it then changes your master branch to point to the final copied commit D':
    A--B--C--D   [abandoned]
           \
            E--F   <-- origin/master
                \
                 D'   <-- master (HEAD)
This abandons the original commit D.[6] This means we can stop drawing it too, so now we get:
    A--B--C--E--F   <-- origin/master
                 \
                  D'   <-- master (HEAD)
It's now easy to git push your new commit D' back to origin.
[3] In Git (but not Mercurial), a merge commit can have more than two parents. This doesn't do anything you cannot do by repeated merging, so it's mainly for showing off. :-)
[4] Technically, the merge base commit, at least for this case, is commit C and the two tip commits are D and F, so in this case it's literally exactly the same. If you rebase more than one commit, it gets a little more complicated, but in principle it's still straightforward.
[5] This intermediate state, where HEAD is detached from master, is usually invisible. You see it only if something goes wrong during the verb-form-of-merge, so that Git stops and has to get help from you to finish the merge operation. When that does occur, though (when there is a merge conflict during rebasing), it's important to know that Git is in this "detached HEAD" state, but as long as the rebase completes on its own, you don't have to care about this so much.
[6] The original commit chain is retained temporarily through Git's reflogs and via the name ORIG_HEAD. The ORIG_HEAD value gets overwritten by the next operation that makes a "big change", and the reflog entry eventually expires, typically after 30 days for this entry. After that, a git gc will really remove the original commit chain.
The git pull command just runs git fetch and then a second command
Note that after git fetch, you usually have to run a second Git command, either git merge or git rebase.
If you know in advance that you will, for certain, immediately use one of those two commands, you can use git pull, which runs git fetch and then runs one of those two commands. You pick which second command to run by setting pull.rebase or supplying --rebase as a command-line option.
Until you are quite familiar with how git merge and git rebase work, however, I suggest not using git pull, because sometimes git merge and git rebase fail to complete on their own. In this case, you must know how to deal with this failure. You must know which command you actually ran. If you run the command yourself, you will know which command you ran, and where to look for help if necessary. If you run git pull, you may not even know which second command you ran!
Besides this, sometimes you might want to look before you run the second command. How many commits did git fetch bring in? How much work will it be to do a merge vs a rebase? Is merge better than rebase right now, or is rebase better than merge? To answer any of these questions, you must separate the git fetch step from the second command. If you use git pull, you must decide in advance which command to run, before you even know which one is the one to use.
In short, only use git pull after you're familiar with the way the two parts of it (git fetch, and the second command you choose) really work.
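To make the two-step approach concrete, here is a sketch of running git fetch yourself, looking at what came in, and then trying both second commands. It uses scratch repositories under a temporary directory; the helper functions, the directory names and the extra branch via-merge (used only so that merge and rebase can both be shown from the same starting point) are assumptions for the illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: give the current repository a placeholder identity.
setup_identity() {
  git config user.name "Example User"
  git config user.email "user@example.com"
  git config commit.gpgsign false
}

# Hypothetical helper: commit a one-line file named after the commit label.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

# Shared history A--B--C in a bare "origin", cloned by Harry and Sally.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
setup_identity
commit A; commit B; commit C
cd ..
git clone -q --bare seed origin.git
git clone -q origin.git harry
git clone -q origin.git sally
(cd sally && setup_identity && commit E && commit F && git push -q origin master)

cd harry
setup_identity
commit D

# Step 1: fetch, then look before choosing the second command.
git fetch -q origin
incoming=$(git rev-list --count master..origin/master)   # their E and F
outgoing=$(git rev-list --count origin/master..master)   # your D
their_tip=$(git rev-parse origin/master)

# Step 2, option 1: merge, tried on a side branch so master stays untouched.
git branch via-merge master
git checkout -q via-merge
git merge -q -m G origin/master
# The merge commit plus its two parents gives three words.
merge_words=$(git rev-list --parents -n 1 via-merge | wc -w)

# Step 2, option 2: rebase master, copying D to D' on top of F.
git checkout -q master
original_d=$(git rev-parse master)
git rebase -q origin/master
rebased_parent=$(git rev-parse master^)

git push -q origin master          # D' sits on top of F, so the push is accepted
git --no-pager log --oneline --graph --all
```

Either result could be pushed. The sketch pushes the rebased master and leaves via-merge as a local branch for comparison. No git reset is involved at any point, so the work in D is never dropped from the branch.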