Version control with git

Denis Schluppeck, 2019-03-01

Why version control?

Lots of good reasons - but the main ones1 are:

  • a complete history of changes (which means you can undo)
  • branches (you can try new stuff out without breaking things)
  • you can trace who did what when, tag versions of your manuscript / code
    • submitted, published
    • v1.0, feature-release


1 see e.g. "What is version control"

Imagine a typical project (code / notes)

my alt text
How material changes over time...

Why git?

There are many version control systems (VCS). But git comes with some advantages:

  • it's distributed (full version history in your local copy)
  • corollary: you can work with it anywhere ✈️ or 🚋 (no need for network connection)
  • it's widely used1


1 see e.g. "Wikipedia / git"

git does snapshots

  • think of this as snapshots
  • what's the state of each file now?

my alt text

How are things tagged?

fingerprint

2 [CC BY-SA 3.0](https://creativecommons.org/licenses/by-sa/3.0)

How are things tagged?

  • each file has a unique fingerprint (shasum)
  • if the file changes, the fingerprint changes, too!
  • sha = secure hash algorithm
  • sha turns text/data into a 40 digit hexadecimal number


hexadecimal numbers?

0...9  10 11 12 13 14   15  # decimal
0...9   a  b  c  d  e    f  # hexadecimal

0 1 10 11 100 101 ...  111  # binary

shasum of a file

shasum Introduction.md
# b5acbb35abd2511a4c05e48ef58f8990f139793a  Introduction.md

tiny change, e.g. add a space?! and calculate SHA again:


shasum Introduction.md
# 502bbcb5ab4f0d8127396675dd7d17d7d8b55b0a  Introduction.md

... completely different.

How are things tagged (2)?

A similar trick works for a list of directory contents (the "tree")

➡️ tree hash


.
├── analysis
├── stimulusCode
│   └── stims
│       ├── houses
│       ├── normal
│       ├── objects
│       └── scrambled
└── unix-intro

How are things tagged (3)? - commit

  • information about files (aka blobs), their relationship to each other (the tree), the previous state (parent) and a message make up a commit

$ git cat-file -p HEAD

tree 80fc45cae348efbdbbb652642cf4c22e1ddaaf80
parent b2b3a018fa2569bc5aa54b0b744145f6758bcba7
author Denis Schluppeck <denis.schluppeck@gmail.com> 1517238320 +0000
committer Denis Schluppeck <denis.schluppeck@gmail.com> 1517238320 +0000

fixes http to https

Workflow

my alt text
Files

Let's try it

  • make a directory, cd into it
  • initialize repo
mkdir test && cd test
git init
  • make a text file test.txt
  • write something into it and save it

Let's try it (2)

  • add to staging area
  • ... and try to commit with a message (-m)
git add test.txt
git commit -m 'my first commit'

Warnings?

  • you'll see some warning messages
  • for (only this first time), set up your user.name and user.email
git config --global user.name "First Last"    # your name
git config --global user.email "me@gmail.com" #  your email
  • This info is stored on your machine in a little file, which you can inspect
more ~/.gitconfig

Now complete the commit

git status # read what's there

git commit -m 'my first commit'

git status # read what's there NOW

If you want this on github

Currently the repository is local to the machine you are working on, if you want to share with your friends and colleagues on github.com, follow instructions at:

https://help.github.com/en/articles/adding-an-existing-project-to-github-using-the-command-line

Notes

man shasum  # or
info shasum

page_number: true