ng-data-club

Version control with git

Denis Schluppeck, 2025-01-14




Why version control? (on your own)

Lots of good reasons, but the main ones¹ are:



1 see e.g. “What is version control”


Why version control? (with friends)

If you are collaborating on code / docs, in addition:

+ github.com


Imagine a typical project (code / notes)

How material changes over time...

Why git?

There are many version control systems (VCS). But git comes with some advantages:



1 see e.g. “Wikipedia / git”


git does snapshots


How are things tagged?

fingerprint

2 CC BY-SA 3.0


How are things tagged?


hexadecimal numbers?

0...9  10 11 12 13 14   15  # decimal
0...9   a  b  c  d  e    f  # hexadecimal

0 1 10 11 100 101 ...  111  # binary
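To get a feel for the notation, the shell's `printf` can convert between decimal and hexadecimal. This is only a small sketch; the variable names are made up for illustration.

```shell
# decimal -> hexadecimal
hex_of_255=$(printf '%x' 255)
echo "255 in hex: $hex_of_255"

# hexadecimal -> decimal (the 0x prefix marks a hex number)
dec_of_ff=$(printf '%d' 0xff)
echo "0xff in decimal: $dec_of_ff"

# each hex digit stands for 4 bits, so a 40-digit SHA-1 is 160 bits
bits=$((40 * 4))
echo "bits in a 40-digit hex string: $bits"
```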

shasum of a file

shasum Introduction.md
# b5acbb35abd2511a4c05e48ef58f8990f139793a  Introduction.md


Make a tiny change (e.g. add a space) and calculate the SHA again:


shasum Introduction.md
# 502bbcb5ab4f0d8127396675dd7d17d7d8b55b0a  Introduction.md

… completely different.


git nitty-gritty – for data club 😀

The SHA is actually computed over a short header followed by the file contents:

"blob " + <size in bytes> + "\0" + <the file contents>

You can try this out with:

echo 'Hello, World!' | git hash-object --stdin
# leads to
8ab686eafeb1f44702738c8b0f24f2567c36da6d

Note: the filename doesn’t contribute to the SHA of the file / blob, which means renaming files is cheap (the contents are not stored a second time).


How are things tagged (2)?

A similar trick works for a list of directory contents (the “tree”)

→ tree hash


.
├── analysis
├── stimulusCode
│   └── stims
│       ├── houses
│       ├── normal
│       ├── objects
│       └── scrambled
└── unix-intro
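A tree object can be inspected in any repository. The sketch below builds a throwaway repo; the file `README.txt` and the `Example` identity are placeholders, not part of the project shown above.

```shell
repo=$(mktemp -d)
cd "$repo" && git init -q

# placeholder file inside one of the folders from the listing
mkdir -p stimulusCode/stims
echo 'placeholder' > stimulusCode/stims/README.txt
git add .
git -c user.name='Example' -c user.email='example@example.com' \
    commit -q -m 'add stims'

# the top-level tree: one line per entry (mode, type, hash, name)
entries=$(git cat-file -p 'HEAD^{tree}')
echo "$entries"

# the hash of the tree object itself, and its type
tree_hash=$(git rev-parse 'HEAD^{tree}')
tree_type=$(git cat-file -t "$tree_hash")
echo "$tree_hash is a $tree_type"
```

Each subfolder shows up as a `tree` entry with its own hash, so the scheme nests all the way down.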

How are things tagged (3)? - commit


$ git cat-file -p HEAD

tree 80fc45cae348efbdbbb652642cf4c22e1ddaaf80
parent b2b3a018fa2569bc5aa54b0b744145f6758bcba7
author Denis Schluppeck <denis.schluppeck@gmail.com> 1517238320 +0000
committer Denis Schluppeck <denis.schluppeck@gmail.com> 1517238320 +0000

fixes http to https
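The pieces of a commit object can also be pulled out one at a time. This is a sketch in a throwaway repo; the file `links.txt`, its contents, and the `Example` identity are invented for illustration.

```shell
repo=$(mktemp -d)
cd "$repo" && git init -q
g() { git -c user.name='Example' -c user.email='example@example.com' "$@"; }

echo 'http://example.org' > links.txt
g add links.txt
g commit -q -m 'adds link'
echo 'https://example.org' > links.txt
g commit -q -a -m 'fixes http to https'

obj_type=$(git cat-file -t HEAD)                   # type of the object HEAD points to
tree_line=$(git cat-file -p HEAD | sed -n '1p')    # "tree <hash>"
parent_line=$(git cat-file -p HEAD | sed -n '2p')  # "parent <hash>"
echo "$obj_type"
echo "$tree_line"
echo "$parent_line"
```

So a commit ties together one tree (the snapshot), its parent commit(s), and the author / committer details plus the message.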



Workflow

Files

3 scenarios to get us all thinking

  1. you (on your own), several different computers

  2. you, a couple of collaborators, plus code that changes a lot

  3. you want to share materials with lots of people (details change: maybe once a year, maybe more often…)




![width:700px](/ng-data-club/presentations/2025-01-14-version-control-zero-to-hero/images/scenario-01-b.png)
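One plausible way to play through scenario 1 without a network connection is to let a local bare repository stand in for the shared copy (e.g. on github.com). Everything here is an assumption made for the sketch: the folder names `shared.git`, `desktop` and `laptop`, the file `notes.txt`, and the `Example` identity.

```shell
work=$(mktemp -d) && cd "$work"
g() { git -c user.name='Example' -c user.email='example@example.com' "$@"; }

# stand-in for the shared copy; make sure its default branch is main
git init -q --bare shared.git
git --git-dir=shared.git symbolic-ref HEAD refs/heads/main

# computer 1: clone, commit, push
git clone -q shared.git desktop 2>/dev/null
git -C desktop symbolic-ref HEAD refs/heads/main
echo 'first draft' > desktop/notes.txt
g -C desktop add notes.txt
g -C desktop commit -q -m 'first draft'
g -C desktop push -q origin main

# computer 2: clone and get the same history
git clone -q shared.git laptop

# computer 1 again: more work, push again
echo 'second thoughts' >> desktop/notes.txt
g -C desktop commit -q -a -m 'second thoughts'
g -C desktop push -q origin main

# computer 2: pull to catch up
g -C laptop pull -q --ff-only

desktop_head=$(git -C desktop rev-parse HEAD)
laptop_head=$(git -C laptop rev-parse HEAD)
echo "desktop: $desktop_head"
echo "laptop:  $laptop_head"
```

With a real hosted repository, only the URL passed to `git clone` changes; the push / pull rhythm stays the same.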





Branches - trying out new ideas

![width:600px](../images/scenario-02-b.png)

# they should make a new branch for the idea (and switch to it)
git checkout -b whacky-idea-branch

# work on there, git add / commit / push...
git checkout main
git merge whacky-idea-branch # when ready ;)
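The branch-and-merge cycle can be tried end to end in a throwaway repository. This is a sketch; the file names and the `Example` identity are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/main   # make sure the first branch is called main
g() { git -c user.name='Example' -c user.email='example@example.com' "$@"; }

echo 'stable analysis' > analysis.txt
g add analysis.txt
g commit -q -m 'stable version'

# try the idea on its own branch
git checkout -q -b whacky-idea-branch
echo 'whacky idea' > idea.txt
g add idea.txt
g commit -q -m 'try a whacky idea'

# back on main, the new file is not there yet
git checkout -q main
files_before_merge=$(ls)

# bring the idea into main when ready
g merge -q whacky-idea-branch
files_after_merge=$(ls)
echo "before merge: $files_before_merge"
echo "after merge:  $files_after_merge"
```

If the idea does not work out, main is untouched and the branch can simply be left alone or deleted.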





Pages - sharing via static www hosting

![width:600px](/ng-data-club/presentations/2025-01-14-version-control-zero-to-hero/images/scenario-03-b.png)


Examples

Local repo: Let’s try it

mkdir test && cd test
git init

Let’s try it (2)

echo 'some text' > test.txt   # create a file to track (or use your editor)
git add test.txt
git commit -m 'my first commit'

Warnings?

git config --global user.name "First Last"    # your name
git config --global user.email "me@gmail.com" #  your email
more ~/.gitconfig

Now complete the commit

git status # read what's there

git commit -m 'my first commit'

git status # read what's there NOW
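To see how the state of a file moves through the steps above, the short form of `git status` is handy. This is a sketch in a scratch directory; the identity is passed with `-c` only so that it runs without touching your global config.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
echo 'hello' > test.txt

status_untracked=$(git status --short)   # "?? test.txt": git sees it but does not track it
git add test.txt
status_staged=$(git status --short)      # "A  test.txt": staged for the next commit
git -c user.name='Example' -c user.email='example@example.com' \
    commit -q -m 'my first commit'
status_clean=$(git status --short)       # empty: nothing left to commit

echo "untracked: $status_untracked"
echo "staged:    $status_staged"
echo "clean:     '$status_clean'"
git log --oneline
```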

If you want this on github

Currently the repository is local to the machine you are working on. If you want to share it with your friends and colleagues on github.com, follow the instructions at:

https://help.github.com/en/articles/adding-an-existing-project-to-github-using-the-command-line
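In outline, connecting a local repository to a hosted one means registering a remote and pushing to it. The sketch below only registers the remote; the URL is a placeholder (`USER` and `REPO` are not real names), and the linked instructions remain the authoritative steps.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q

# placeholder URL: replace USER and REPO with your own account and repository
git remote add origin https://github.com/USER/REPO.git

remote_url=$(git remote get-url origin)
git remote -v

# once the repository exists on github.com, the first push would be:
#   git push -u origin main
```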


Notes