Comment by deathanatos

7 years ago

There's three "versions" of the code at play, if you will:

* the most recent commit

* the staged files

* the working directory

`git commit`, as you might already know, takes "the staged files" and turns them into a commit, making that the latest commit. `git add` adds a snapshot of a file in your working directory to the staged files. The important bit here is that the copy of the file in the staging area is separate from the file in your working directory. So, if after `git add`ing a file you make more changes, you will need to `git add` those subsequent changes too if you wish to commit them. `git status` will tell you this:

  » git status
  On branch master
  Changes to be committed:
    (use "git reset HEAD <file>..." to unstage)
  
  	modified:   foo.txt
  
  Changes not staged for commit:
    (use "git add <file>..." to update what will be committed)
    (use "git checkout -- <file>..." to discard changes in working directory)
  
  	modified:   foo.txt

"Changes to be committed" is the staged files. "Changes not staged" is stuff that has been modified, but not `git add`'d. You can see here that I've changed foo.txt after git adding it; if I want those changes, I need to git add it again.

I can look at the diffs, too:

  # diff between the last commit, and the staged files
  # (i.e., what will be committed)
  git diff --staged
  # diff between the staged files and the working directory
  # (unstaged changes)
  git diff
  # diff of all changes since the last commit:
  # (stage+working dir, essentially)
  git diff HEAD

That should be all the various combinations.
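To see the three diffs side by side, here is a small sketch you can run in a throwaway directory. The repository, the file name and the `added` helper are made up for illustration; only the `git diff` invocations come from the list above.

```shell
# Throwaway repository; every name in here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

printf 'a\n' > foo.txt
git add foo.txt
git commit -q -m "Initial"

printf 'b\n' >> foo.txt
git add foo.txt            # "b" is now staged
printf 'c\n' >> foo.txt    # "c" exists only in the working directory

# Hypothetical helper: keep only the added lines of a diff
# (skipping the "+++" header) and squash them onto one line.
added() { grep '^+[^+]' | tr -d '+\n'; }

staged=$(git diff --staged | added)   # last commit vs. stage
unstaged=$(git diff | added)          # stage vs. working directory
both=$(git diff HEAD | added)         # last commit vs. working directory

echo "staged=$staged unstaged=$unstaged both=$both"
# prints: staged=b unstaged=c both=bc
```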

I find that a lot of newcomers find the staging area weird, and usually ask some variant of "why would I not want to commit all of the files I've changed?" The staging area, used effectively, can really help you break out things into logical chunks of changes that can then be adequately described with a message in a commit. This can help others later: if your change is a bug fix, and someone wants to cherry-pick it to production, they might not want your new feature, or your lint fixes: they want a minimally risky fix. To that end, the stage/working dir separation acts as a sieve of sorts, letting the stuff that's "ready to go" get filtered out into a commit.

I want to mention the extremely useful `git add -p`: this command will interactively prompt you, hunk by hunk, for whether or not you want to stage that hunk. It will even let you edit the hunks prior to staging. So, for example, if I run across a spelling error, or a minor bug, I can very quickly add it (and just it) to the stage w/ `git add -p`, and then commit it, even if there are other modifications, even in the same file.
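A rough sketch of that workflow, scripted so it can be replayed. Normally you'd answer the prompts by hand; here the answers are piped in on stdin, which works as long as the hunks come up in the expected order. The file contents and commit messages are invented for the example.

```shell
# Throwaway repository; every name in here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

# Twelve lines, so the two edits below end up in separate hunks.
seq 1 12 > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Two unrelated edits in the same file: a small fix on line 1,
# and unfinished work on line 12.
sed -e '1s/.*/one/' -e '12s/.*/WIP/' notes.txt > notes.tmp && mv notes.tmp notes.txt

# "y" stages the first hunk, "n" leaves the second one alone.
printf 'y\nn\n' | git add -p notes.txt > /dev/null
git commit -q -m "Fix line 1"

committed=$(git show HEAD:notes.txt | head -n 1)   # the fix is in the commit
pending=$(git diff | grep -c '^+WIP')              # the WIP edit is still unstaged

echo "committed=$committed pending=$pending"
# prints: committed=one pending=1
```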

> There's three "versions" of the code at play, if you will:

> * the most recent commit

> * the staged files

> * the working directory

This is weird. The staging area is like a commit but not a commit. They're changes that git is aware of and has a record of but not quite a permanent record.

Why not just make it a commit? You can always keep editing commits, or throw them out, or whatever. That's what I do with Mercurial. I write a commit and I keep adding stuff to it if I think it needs more stuff (or less stuff).

Gregory Szorc has a more extensive analysis of the situation in the first subsection here: https://gregoryszorc.com/blog/2017/12/11/high-level-problems...
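For comparison, the closest git analogue I know of to that "keep adding to the same commit" habit is `git commit --amend`. This is a sketch in git terms, not the Mercurial commands described above, and the file name and message are made up. The `-a` flag stages modifications to tracked files at commit time, so there is no separate `git add` step.

```shell
# Throwaway repository; every name in here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "first" > foo.txt
git add foo.txt
git commit -q -m "Work in progress"

# More work on the same change: fold it into the existing commit
# instead of creating a second one.
echo "second" >> foo.txt
git commit -q -a --amend --no-edit

count=$(git rev-list --count HEAD)   # still a single commit
content=$(git show HEAD:foo.txt)     # which now holds both lines

echo "commits=$count"
# prints: commits=1
```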

  • My best guess is that the commit metadata (particularly, message) is missing. You could always have it be "(uncommitted, staged changes)" though, and that's probably descriptive enough. (I agree with you on the whole: having the staged data be a commit makes things conceptually much simpler.)

    My other guess is that the "index" (the other name for the staging area) is also used for conflicts during merges & rebases, and that somehow plays into the problem of making it an actual commit. (But again, this comes across more as an excuse than a reason: I don't see any viable reason why the staged changes can't still be an actual commit, and the merge conflict data just stored as a separate structure.)

    That, or the person who added it just didn't think of it, or couldn't do it due to backwards compatibility.