Wednesday, August 3, 2011

Git for dummies

OK, the title of this post is a lie... git is decidedly NOT for dummies. Git is more for really smart, mentally gifted people OR for people who are working on exceedingly complex software projects that have complicated merging and revision history requirements. Neither of these groups should be dummies or you will have a serious problem being effective. For the context of this tutorial, a "dummy" is defined as somebody who knows and understands how to use SVN or CVS in a team environment.

So if you're still reading, I will walk you through the simplest git workflow I have found that works and doesn't cause too many complications. For the sake of simplicity, we're going to assume you're working on a project hosted at github that already exists. I've created a public repo if you'd like to follow along at home. Assuming you already have git installed, you should be able to "clone" the repository to your local machine by issuing the following command:

git clone git://github.com/mikemainguy/mainguy_blog.git

edit: I used the wrong url originally.
At this point you should now have a subdirectory called mainguy_blog with a "README" file inside it.

Assuming that everybody is working on a single branch of development, the workflow is pretty simple.
  • edit files (vi README)
  • add files to staging area (git add README)
  • commit changes (git commit README)
  • pull changes from remote (git pull)
  • push changes to remote (git push)

One thing you will note is that the number of steps is a bit different than you might be used to with svn or cvs. In particular, git seems to have added some steps. With SVN, the workflow would typically be:
  • edit files (vi README)
  • update from remote (svn update)
  • commit changes to remote (svn commit)

With Git, we've added the notion of a local staging area AND a local repository. This will really confuse dummies like me at first and I cannot emphasize enough that you need to think about the implications of this. I guarantee you THINK you get it, but the practical implication of not grokking it is that you will likely do tremendously stupid things for a period of days once you get into a larger team and/or someone starts to try doing some fancy merging.
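
To make the "staging area AND local repository" idea concrete, here is a rough sketch that follows one edit through each place it can live. It is only an illustration under some assumptions of mine: a throwaway bare repository in a temp directory stands in for github, and the branch name (main), the user name, and the file contents are all made up for the demo.

```shell
set -e

# Assumption: a local bare repo in a temp dir plays the role of github.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"
git -C "$sandbox/remote.git" symbolic-ref HEAD refs/heads/main

# Clone it (the "empty repository" warning is silenced) and set a demo identity.
git clone -q "$sandbox/remote.git" "$sandbox/work" 2>/dev/null
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

# Seed the remote with a first version of README.
echo "first line" > README
git add README
git commit -q -m "Add README"
git push -q -u origin main

# 1. edit: the change exists only in the working directory
echo "second line" >> README
unstaged=$(git diff --name-only)

# 2. add: the change is now in the staging area
git add README
staged=$(git diff --cached --name-only)

# 3. commit: the change is in the LOCAL repository, not yet on the remote
git commit -q -m "Add second line"
unpushed=$(git rev-list --count origin/main..HEAD)

# 4. push: the remote finally has it
git push -q
remaining=$(git rev-list --count origin/main..HEAD)

echo "unstaged=$unstaged staged=$staged unpushed=$unpushed remaining=$remaining"
# prints: unstaged=README staged=README unpushed=1 remaining=0
```

The part that tends to surprise SVN users is step 3: after the commit, nobody else can see your change until the push in step 4.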

So now, we're going to walk through a "normal" multiuser scenario.
  1. User 1 edits README and adds it, and commits to their local repository
  2. User 2 edits the same file in a different place, adds it and commits to their local repository
  3. User 1 pushes their change to github
  4. User 2 tries to push their changes to github, but they discover that User 1 has already pushed their changes.
  5. User 2 pulls User 1's changes from github
  6. Git automerges the file because there are no conflicts
  7. User 2 pushes their changes to github
When we look at the github history we see something interesting... there is an additional commit that was added at the end to indicate git/User2 merged some files. Aside from the extra workflow steps, this is an additional point of confusion for quite a few newcomers.
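
If you want to see that extra merge commit without bothering a real team, the seven steps above can be sketched with two local clones. Again, this is a sandbox of my own invention, not the real repo: the bare repository, the user1/user2 directories, the branch name main, and the five-line README are all assumptions. One more caveat: I believe recent git versions refuse a plain "git pull" on diverged branches until you say how to reconcile them, so the sketch passes --no-rebase to ask for the merge behavior described here (and --no-edit to skip the merge message editor).

```shell
set -e

# Assumption: a local bare repo in a temp dir plays the role of github.
sandbox=$(mktemp -d)
remote="$sandbox/remote.git"
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# User 1 seeds the repository with a README.
git clone -q "$remote" "$sandbox/user1" 2>/dev/null
cd "$sandbox/user1"
git symbolic-ref HEAD refs/heads/main
git config user.name "User 1"
git config user.email "user1@example.com"
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > README
git add README
git commit -q -m "Initial README"
git push -q -u origin main

# User 2 clones the same repository.
git clone -q "$remote" "$sandbox/user2"
git -C "$sandbox/user2" config user.name "User 2"
git -C "$sandbox/user2" config user.email "user2@example.com"

# Step 1: User 1 edits the first line and commits locally.
printf 'line 1 (user 1)\nline 2\nline 3\nline 4\nline 5\n' > README
git add README
git commit -q -m "User 1 edits the top"

# Step 2: User 2 edits the last line and commits locally.
cd "$sandbox/user2"
printf 'line 1\nline 2\nline 3\nline 4\nline 5 (user 2)\n' > README
git add README
git commit -q -m "User 2 edits the bottom"

# Step 3: User 1 pushes first.
git -C "$sandbox/user1" push -q

# Step 4: User 2's push is rejected because the remote has moved on.
if git push -q 2>/dev/null; then rejected=no; else rejected=yes; fi

# Steps 5 and 6: User 2 pulls, and git automerges the two edits.
git pull -q --no-rebase --no-edit

# Step 7: User 2 pushes the result, merge commit included.
git push -q

merges=$(git rev-list --count --merges HEAD)
echo "rejected=$rejected merge_commits=$merges"
git log --oneline --graph
```

The graph printed at the end should show the two edits side by side, joined by a commit that neither user wrote by hand. That is the "additional commit" in the history.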

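In short, here is a workflow that makes git behave the way a dummy like me would expect. Pulling with --rebase replays your local commits on top of whatever is already on the remote, so the history stays in a straight line without the extra merge commit: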

vi README
git add README
git commit README
git pull --rebase


Now here is where things can get tricky. In the SVN world, if you have merge conflicts, you fix them and move along, committing the results when you've "fixed" things. With git, on the other hand, you need to "add" the fixed files back in and continue the rebase. So, if you have no conflicts, you're actually done at this point, but if you have a merge conflict, you need to do the following steps:


vi README
git add README
git rebase --continue


Once this is finished, push your changes back to the remote repo:


git push
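
For the curious, here is a sketch of that whole rebase workflow, conflict included, run in a sandbox. As before, the local bare repo standing in for github, the user1/user2 clones, the branch name main, and the README contents are my own assumptions. The printf that writes the resolved file stands in for fixing the conflict by hand in vi, and GIT_EDITOR=true is only there so "git rebase --continue" doesn't stop to open an editor for the commit message.

```shell
set -e

# Assumption: a local bare repo in a temp dir plays the role of github.
sandbox=$(mktemp -d)
remote="$sandbox/remote.git"
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# User 1 seeds the repository.
git clone -q "$remote" "$sandbox/user1" 2>/dev/null
cd "$sandbox/user1"
git symbolic-ref HEAD refs/heads/main
git config user.name "User 1"
git config user.email "user1@example.com"
echo "greeting: hello" > README
git add README
git commit -q -m "Initial README"
git push -q -u origin main

# User 2 clones the same repository.
git clone -q "$remote" "$sandbox/user2"
git -C "$sandbox/user2" config user.name "User 2"
git -C "$sandbox/user2" config user.email "user2@example.com"

# User 1 changes the line and pushes.
echo "greeting: hello from user 1" > README
git add README
git commit -q -m "User 1 changes the greeting"
git push -q

# User 2 changes the SAME line and commits locally.
cd "$sandbox/user2"
echo "greeting: hello from user 2" > README
git add README
git commit -q -m "User 2 changes the greeting"

# git pull --rebase stops partway because the two edits conflict.
if git pull -q --rebase >/dev/null 2>&1; then conflict=no; else conflict=yes; fi

# Fix the file (this stands in for "vi README"), ADD it, and continue.
# Note: no "git commit" here, only add.
echo "greeting: hello from both users" > README
git add README
GIT_EDITOR=true git rebase --continue

# Push the rebased history.
git push -q

merges=$(git rev-list --count --merges HEAD)
echo "conflict=$conflict merge_commits=$merges"
git log --oneline
```

If things go the way I expect, the log at the end is three commits in a straight line with no merge commit, which is much closer to what an SVN user is used to seeing.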


Additional warning

When rebasing, if you get a conflict, do NOT commit the files, only ADD them. If you commit the files you will condemn yourself to a hurtful place where your commit history shows conflicts with things that you didn't even edit.

I think git is a wonderful tool, but it has a much steeper learning curve than its simpler and kinder cousins svn and cvs. While my perspective is skewed by years of using SVN and CVS, I think it is pretty safe to say that these tools have millions of users and I am not the only person to go through the pain of "figuring out" git. The addition of remote/local and a staging area seems to be a common point of confusion for newcomers who've arrived at git from the SVN/CVS world.

4 comments:

Unknown said...

This looks like a great tutorial -- just what I need, but... when I try issuing the first command: "git clone git@github.com:mmainguy/mainguy_blog.git"
I get this error:
"Cloning into mainguy_blog...
Permission denied (publickey).
fatal: The remote end hung up unexpectedly"

Do you know why this might fail? I've tried from two different Linux boxes and have gotten this same response both times.

Mike Mainguy said...

Ahhh, whoops, forgot to post the anonymous git url. it should be git://github.com/mikemainguy/mainguy_blog.git .

I've updated in the post to reflect the proper path.

Anonymous said...

okay, I cloned your mainguy_blog.git. Then made changes to the README, Staged and committed the changes. Now I have the original branch and my changes in a branch. Not sure what to do next. Should I compare the two branches before pushing to the original ? I can't find the commands to do that in VS.
Thanks, -Nancy

Mike Mainguy said...

VS? The easiest thing to do is simply push... though I'd be curious if you can push to an anonymous clone. Could you try a push and tell me what happens?