Looking at history and differences

Overview

Teaching: 30 min
Exercises: 5 min
Questions
  • How can I see what changed between commits?

  • How do I go back to a previous version of my project?

Objectives
  • Be able to view history of changes to a repository

  • Be able to view differences between commits

  • Be able to recover a previous version of your project

  • Understand how and when to use tags to label commits

Looking at differences

We should reference some previous work in the introduction section. Make the required changes to paper.md and refs.txt, save both files, but do not commit the changes yet. We can review the changes that we made using git diff:

$ nano paper.md		# Cite previous studies in introduction
$ nano refs.txt		# Add the reference to the database
$ git diff		# View changes

This shows the changes we have made but not yet staged. Because nothing is staged at this point, that is the same as the difference between our working files and the latest commit in the repository.

Looking at differences between commits is one of the most common activities. The git diff command itself has a number of useful options.
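As a rough illustration of two of those options, the sketch below builds a throwaway repository in a temporary directory (so the paper repository is untouched) and compares git diff --stat, plain git diff, and git diff --staged. The file contents, the placeholder identity and the variable names are assumptions made for this example only.

```shell
set -e
# Throwaway repository in a temporary directory (hypothetical demo, not the 'paper' repo)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Your Name"                 # placeholder identity, local to this repo
git config user.email "your.name@example.org"

printf 'Title\n' > paper.md
git add paper.md
git commit -q -m "Add title"

printf 'Introduction\n' >> paper.md   # an edit that is not yet staged

git diff --stat                # summary: which files changed and by how many lines
git add paper.md               # stage the edit
unstaged=$(git diff)           # nothing is left unstaged, so this is empty
staged=$(git diff --staged)    # staged changes compared with the last commit
printf '%s\n' "$staged"
```

Once a change has been staged, plain git diff no longer shows it, which is why git diff --staged is useful just before committing.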

Configure a visual diff tool

There are many GUI-based tools available for looking at differences and editing files, which can be easier to work with. For example:

  • Diffmerge (Free, cross-platform)
  • WinMerge (open source, Windows only)

To view differences with a GUI instead of using the command-line diff tool, first configure git to use your chosen diff tool:

$ git config --global diff.tool diffmerge    # Set diffmerge as your visual diff tool
$ git config --global difftool.prompt false  # Suppress confirmation before launching GUI

Note that these config steps are slightly different for Windows.

Then to use the GUI, use the following command instead of git diff:

$ git difftool
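If we would rather try a diff tool in a single repository before changing our global settings, one possible approach is to leave out --global and then read the setting back. This is only a sketch: it uses a throwaway repository, assumes diffmerge is the tool we want, and does not launch the GUI.

```shell
set -e
# Throwaway repository (hypothetical demo directory)
demo=$(mktemp -d)
cd "$demo"
git init -q

git config diff.tool diffmerge        # no --global: applies to this repository only
git config difftool.prompt false      # do not ask for confirmation before launching
tool=$(git config --get diff.tool)    # read the setting back to check it
echo "diff tool for this repository: $tool"
```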

Now commit the change we made by adding the second reference:

$ git add paper.md refs.txt
$ git commit			# "Cite previous work in introduction"

Looking at our history

To see the history of changes that we made to our repository (the most recent changes will be displayed at the top):

$ git log
commit 8bf67f3862828ec51b3fdad00c5805de934563aa
Author: Your Name <your.name@manchester.ac.uk>
Date:	Mon Jun 26 10:22:39 2017 +0100

    Cite PCASP paper


commit 4dd7f5c948fdc11814041927e2c419283f5fe84c
Author: Your Name <your.name@manchester.ac.uk>
Date:	Mon Jun 26 10:21:48 2017 +0100

    Write introduction

commit c38d2243df9ad41eec57678841d462af93a2d4a5
Author: Your Name <your.name@manchester.ac.uk>
Date:	Mon Jun 26 10:14:30 2017 +0100

    Add author and title

For each commit, the output shows (on separate lines) the commit identifier, the author, the date, and the commit message.

Git automatically assigns an identifier (e.g. 4dd7f5) to each commit made to the repository — we refer to this as COMMITID in the code blocks below. In order to see the changes made between any earlier commit and our current version, we can use git diff followed by the commit identifier of the earlier commit:

$ git diff COMMITID		# View differences between current version and COMMITID

And, to see changes between two commits:

$ git diff OLDER_COMMITID NEWER_COMMITID
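The following sketch shows the two-commit form in a throwaway repository. Rather than copying identifiers by hand, it captures abbreviated identifiers with git rev-parse --short, and it also shows that relative names such as HEAD~1 (the commit before HEAD) give the same comparison. The file contents and variable names are made up for this example.

```shell
set -e
# Throwaway repository (hypothetical demo directory)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Your Name"                 # placeholder identity, local to this repo
git config user.email "your.name@example.org"

printf 'Title\n' > paper.md
git add paper.md
git commit -q -m "Add title"
older=$(git rev-parse --short HEAD)    # abbreviated identifier of the first commit

printf 'Introduction\n' >> paper.md
git commit -q -a -m "Write introduction"
newer=$(git rev-parse --short HEAD)    # abbreviated identifier of the second commit

git diff "$older" "$newer"             # changes going from the older to the newer commit
by_id=$(git diff "$older" "$newer")
by_offset=$(git diff HEAD~1 HEAD)      # the same comparison using relative names
```

The order of the two identifiers matters: swapping them shows the same lines as removals instead of additions.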

Where to create a Git repository?

Avoid creating a Git repository within another Git repository. Nesting repositories in this way causes the ‘outer’ repository to track the contents of the ‘inner’ repository - things will get confusing!

Exercise: “bio” Repository

  • Create a new Git repository on your computer called “bio”
  • Be sure not to create your new repo within the ‘paper’ repo (see above)
  • Write a three-line biography for yourself in a file called me.txt
  • Commit your changes
  • Modify one line, add a fourth line, then save the file
  • Display the differences between the updated file and the original

You may wish to use the faded example below as a guide:

cd ..                # Navigate out of the paper directory
                     # Avoid creating a repo within a repo - confusion will arise!
mkdir ___            # Create a new directory called 'bio'
cd ___               # Navigate into the new directory
git ____             # Initialise a new repository
_____ me.txt         # Create a file and write your biography
git ___ me.txt       # Add your biography file to the staging area
git ______           # Commit your staged changes
_____ me.txt         # Edit your file
git ____ me.txt      # Display differences between your modified file and the last committed version

Solution

cd ..                # Navigate out of the paper directory
                     # Avoid creating a repo within a repo - confusion will arise!
mkdir bio            # Create a new directory
cd bio               # Navigate into the new directory
git init             # Initialise a new repository
nano me.txt          # Create a file and write your biography
git add me.txt       # Add your biography file to the staging area
git commit           # Commit your staged changes
nano me.txt          # Edit your file
git diff me.txt      # Display differences between your modified file and the last committed version

The HEAD and master pointers

Let’s take another look at the output from git log. This time we’ll use the --decorate option to display the pointers (your Git setup might already display them by default).

$ git log --decorate
commit 8bf67f3862828ec51b3fdad00c5805de934563aa (HEAD -> master)
Author: Your Name <your.name@manchester.ac.uk>
Date:	Mon Jun 26 10:22:39 2017 +0100

    Cite PCASP paper


commit 4dd7f5c948fdc11814041927e2c419283f5fe84c
Author: Your Name <your.name@manchester.ac.uk>
Date:	Mon Jun 26 10:21:48 2017 +0100

    Write introduction

commit c38d2243df9ad41eec57678841d462af93a2d4a5
Author: Your Name <your.name@manchester.ac.uk>
Date:	Mon Jun 26 10:14:30 2017 +0100

    Add author and title

You’ll see there are two pointers, HEAD and master, which both label the most recent commit. The arrow in HEAD -> master indicates that HEAD is currently attached to the master branch.
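One way to check this without reading the log is to ask Git which commit each name resolves to. The sketch below does so in a throwaway repository; git symbolic-ref and git rev-parse are not covered in this lesson, and the variable names are assumptions for the example. The branch name is looked up rather than hard-coded, since the default may not be master on every installation.

```shell
set -e
# Throwaway repository (hypothetical demo directory)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Your Name"                 # placeholder identity, local to this repo
git config user.email "your.name@example.org"

printf 'Title\n' > paper.md
git add paper.md
git commit -q -m "Add title"

branch=$(git symbolic-ref --short HEAD)   # branch that HEAD is attached to
head_id=$(git rev-parse HEAD)             # commit that HEAD resolves to
branch_id=$(git rev-parse "$branch")      # commit at the tip of that branch
echo "HEAD resolves to:      $head_id"
echo "$branch resolves to:   $branch_id"
```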

Going back in time with git

We can use commit identifiers to set our working directory back to how it was at any commit. Doing so means the HEAD pointer no longer points to the tip of a branch. This state is known as a detached HEAD, and is intended for inspecting old versions and for experiments we are happy to discard.

Checking out a previous commit - detached head

As we’ll find out in episode 6, the switch command is used to switch between branches, but if we want to switch to a commit instead of a named branch, we’ll need to use switch with the -d (detach) option.

Let’s go back to the very first commit we made:

$ git log
$ git switch -d INITIAL_COMMITID

We will get something like this:

HEAD is now at 8bd9133 Add title and author

And if we run

$ git status

we get a confirmation that we have a detached HEAD:

HEAD detached at 8bd9133
nothing to commit, working tree clean

If we look at paper.md we’ll see it’s our very first version. And if we look at our directory,

$ ls
paper.md

then we see that our refs.txt file is gone. But, rest easy, while it’s gone from our working directory, it’s still in our repository. We can jump back to the latest commit by doing:

$ git switch master

And refs.txt will be there once more,

$ ls
paper.md refs.txt

So we can get any version of our files from any point in time. In other words, we can restore our working directory to the state it was in at any commit we have made.
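If we only want to look at an old version of a single file, it is not always necessary to detach HEAD. As a hedged alternative that goes slightly beyond this lesson, git show COMMITID:FILE prints a file as it was at that commit, and git ls-tree lists the files recorded in a commit. The sketch below uses a throwaway repository with made-up contents.

```shell
set -e
# Throwaway repository (hypothetical demo directory)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Your Name"                 # placeholder identity, local to this repo
git config user.email "your.name@example.org"

printf 'Title\n' > paper.md
git add paper.md
git commit -q -m "Add title and author"
first=$(git rev-parse HEAD)               # identifier of the very first commit

printf 'Introduction\n' >> paper.md
printf 'Reference 1\n' > refs.txt
git add paper.md refs.txt
git commit -q -m "Write introduction"

old_paper=$(git show "$first:paper.md")          # paper.md as it was at the first commit
old_files=$(git ls-tree --name-only "$first")    # files recorded in the first commit
echo "$old_paper"
echo "$old_files"
```

The working directory is left as it is: refs.txt is still present and HEAD stays attached to the branch.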

Visualising your own repository as a graph

If we use git log with a couple of options, we can display the history as a graph, and decorate those commits corresponding to Git references (e.g. HEAD, master):

$ git log --graph --decorate --oneline
* 6a48241 (HEAD -> master) Cite previous work in introduction
* ed26351 Cite PCASP paper
* 7446b1d Write introduction
* 4f572d5 Add title and author

Notice how HEAD and master point to the same commit. Now switch to the previous commit and display the graph again. This time we add the --all option so that the whole history is shown, rather than just the commits leading up to the one we have checked out.

$ git switch -d HEAD~		# This syntax refers to the commit before HEAD
$ git log --graph --decorate --oneline --all
* 6a48241 (master) Cite previous work in introduction
* ed26351 (HEAD) Cite PCASP paper
* 7446b1d Write introduction
* 4f572d5 Add title and author

Notice how HEAD no longer points to the same commit as master. Let’s return to the current version of the project by switching to master again.

$ git switch master
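To see what --all adds, the sketch below counts the commits that git log lists with and without it while HEAD is detached one commit behind the branch tip. It runs in a throwaway repository, and the file contents and variable names are assumptions for this example.

```shell
set -e
# Throwaway repository (hypothetical demo directory)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Your Name"                 # placeholder identity, local to this repo
git config user.email "your.name@example.org"

# Make three commits, one per message
for msg in "Add title and author" "Write introduction" "Cite PCASP paper"; do
    printf '%s\n' "$msg" >> paper.md
    git add paper.md
    git commit -q -m "$msg"
done
branch=$(git symbolic-ref --short HEAD)   # remember which branch we started on

git switch -d HEAD~                       # detach HEAD at the commit before the tip
reachable=$(git log --oneline | wc -l | tr -d ' ')         # commits leading up to HEAD
everything=$(git log --oneline --all | wc -l | tr -d ' ')  # commits reachable from any branch or tag
git log --graph --decorate --oneline --all
echo "without --all: $reachable, with --all: $everything"

git switch "$branch"                      # return to the tip of the branch
```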

Using tags as nicknames for commit identifiers

Commit identifiers are long and cryptic. Git allows us to create tags, which act as easy-to-remember nicknames for commit identifiers.

For example, to tag the commit we currently have checked out:

$ git tag PAPER_STUB

We can list tags by doing:

$ git tag

Let’s explain to the reader why this research is important:

$ nano paper.md	# Give context for research
$ git add paper.md
$ git commit -m "Explain motivation for research" paper.md

We can switch back to our previous version using our tag instead of a commit identifier.

$ git switch -d PAPER_STUB

And return to the latest commit,

$ git switch master
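The sketch below illustrates, in a throwaway repository, that a tag keeps pointing at the commit it was created on even after new commits are made. It also shows an annotated tag (git tag -a with -m), which stores its own message. The tag name SUBMITTED, the file contents and the variable names are hypothetical and chosen only for this example.

```shell
set -e
# Throwaway repository (hypothetical demo directory)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Your Name"                 # placeholder identity, local to this repo
git config user.email "your.name@example.org"

printf 'Title\n' > paper.md
git add paper.md
git commit -q -m "Add title and author"
git tag PAPER_STUB                        # lightweight tag on the current commit
tagged=$(git rev-parse PAPER_STUB)        # commit the tag points to

printf 'Motivation\n' >> paper.md
git commit -q -a -m "Explain motivation for research"

# Annotated tag with its own message (SUBMITTED is a made-up name)
git tag -a SUBMITTED -m "Version submitted for internal review"
git tag                                   # lists both tags

still=$(git rev-parse PAPER_STUB)             # unchanged: tags do not move with new commits
latest=$(git rev-parse 'SUBMITTED^{commit}')  # commit that the annotated tag refers to
```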

Top tip: tag significant events

When do you tag? Well, whenever you might want to get back to the exact version you’ve been working on. For a paper, this might be a version that has been submitted for internal review, or has been submitted to a conference. For code, this might be when it’s been submitted for review, or has been released.

Key Points

  • git log shows the commit history

  • git diff displays differences between commits

  • git switch -d recovers previous states of the repo

  • HEAD points to the commit you have checked out

  • master points to the tip of the master branch

  • git tag allows commits to be given a descriptive label

  • git difftool shows changes using your configured diff GUI