Version control for beginners

Categories: Git, version control

Published: February 4, 2026

About this session

This is an introduction to version control using Git. In this session, we’ll:

  • do a no/low-tech introduction to the general idea of version control using Microsoft Word
  • then talk about how Git’s view of version control differs from Word
  • we’ll then do a more technical practical using Git. Because git

Version control in two minutes

Version control refers to the use of software to track changes to the code base of a project. Word’s track changes is a simple version control system, because it allows you to a) see how a document has developed, and b) control that development by e.g. accepting or rejecting changes. Version control is extremely important for code-driven projects. As code grows, managing changes becomes more complicated - particularly when several developers collaborate on a project. Version control aims to overcome these problems by providing tools to assist developers in managing their code base. This helps to assure the quality of the finished code by making sure that all and only the working code is passed on to users.

[Word demo of track changes]

Why do version control

  • version control is consistent with best practice software development techniques.
  • industry standard for analysis in other sectors
  • follows leading stakeholders, especially PHS, NHS Digital
  • necessary for reproducible analytics pipelines
  • mandate for open code
  • solves complex problems of collaboration as the scope of projects grows

Practical

In this section, we’ll use Git to demonstrate how version control can be used to manage code- and text-based projects. This uses posit.cloud, which is mainly meant for writing R code. It includes several version control tools, including Git. We use this platform because it makes providing infrastructure to users across different organisations really easy. You definitely don’t need to know anything about R to follow this session.

Setup

  • in posit.cloud, create a new RStudio project
  • give it a name
  • create an R script in that project
  • write a simple script or line of text ("hello world" or 2 + 2 if you’re not an R person)
  • save your R script in your project root

Starting Git

  • Tools > Version Control > Project Setup
  • pick Git, and confirm when asked “do you want to initialize a new git repository for this project”
  • Repository = the container where Git stores version control information about this project
  • our repo contains:
    • .git subdirectory, which contains no user-serviceable components
    • .gitignore, which is important to our story

Important bit of Posit-specific fiddling

In the terminal, please run the following two lines of code:

  • git config --global user.email "you@example.com"
  • git config --global user.name "Your Name"

Git repositories record the name and email of the person making changes. This is one way of setting that information, but your own platform may well use a different method. You should only need to do that once per Git setup on a computer.

Adding files to our repo

  • before Git can version control files, they need to be added to the repo
  • Git pane: should contain three files with yellow status blobs
    • the Git pane gives a simple GUI for interacting with our repo
  • select all, and tick the Staged checkboxes, to add them to the repo
    • Add/Stage = Git is keeping an eye on these files

Committing files to our repo

  • then click Commit to save the current state of the files
    • write a commit message, traditionally “first commit”
    • Commit = Git has added the current state of your files to the repo, as the commit summary should show you
  • by default, Git adds those files to a branch, called master
    • decolonising point: this branch is more properly called main, which is the default on GitHub and in many newer Git setups
    • we’ll talk about what branches are, and why they’re important, later
    • you will want to rename this branch as main, otherwise GitHub will be a pain
    • git branch -m main in the terminal will do it (might need a cheeky refresh in the Git pane)
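
For anyone curious, the Git pane steps above (initialise, stage, commit, rename the branch) can be sketched in the terminal. This is a rough equivalent rather than exactly what RStudio runs behind the scenes. The scratch directory, the file name hello.R, and the placeholder name and email are all assumptions made so the sketch can run on its own.

```shell
# Work in a throwaway directory so nothing real is touched (assumption for this sketch).
repo="$(mktemp -d)"
cd "$repo"

git init -q                               # create the repository (the .git folder appears here)
git config user.name "Your Name"          # placeholder identity, set for this repo only
git config user.email "you@example.com"

echo 'print("hello world")' > hello.R     # hypothetical script name
git add hello.R                           # stage the file
git commit -q -m "first commit"           # save the staged state to the repo
git branch -m main                        # rename the current branch to main

git status --short                        # prints nothing once everything is committed
git rev-parse --abbrev-ref HEAD           # prints the name of the current branch
```
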

Change your R script

  • make a change to your script and save
  • you’ll see a blue modified icon appear in the Git pane next to the file name
  • stage and commit the file
    • you’ll see a diff: a summarised view of the differences between the old and new version of the file
    • Git basically thinks in diffs, and the non-user-serviceable files in .git are the records it builds those differences from
  • add a commit message and press Commit

History

  • look at the Git history
  • explore each of your two commits, paying particular attention to the diffs
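
The change, diff, commit, and history steps can also be tried in the terminal. The sketch below is one way of doing it, assuming a scratch directory, a made-up file called sums.R, and a placeholder identity. None of those names come from the session itself.

```shell
# Throwaway repository with one committed file (all names are hypothetical).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"
echo "2 + 2" > sums.R
git add sums.R
git commit -q -m "first commit"

# Change the script, then look at what Git has noticed.
echo "2 + 3" > sums.R
git status --short                  # lists sums.R as modified
git diff                            # shows the old line removed and the new line added

# Stage and commit the change.
git add sums.R
git commit -q -m "change the sum"

# History: one line per commit, then the detail of the latest commit.
git log --oneline
git show --stat HEAD
```
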

.gitignore practical

  • so far, we’ve added individual files to our repo
  • we can also instruct Git to ignore files and folders by editing the .gitignore file that was created when we initialised our repo
  • open .gitignore from the files pane
  • naming files and folders here will cause Git to ignore them
  • see the docs for more information, but roughly
    • add a filename (with its relative path using /) to prevent it being monitored by Git
    • use dir_name/* to ignore everything in a directory
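
As a rough illustration of those two patterns, the sketch below builds a scratch repository, ignores one file and the contents of one folder, and then asks Git what it can still see. The names notes.txt, scratch, and keep.R are made up for the example.

```shell
# Throwaway repository with a few files (all names are hypothetical).
repo="$(mktemp -d)"
cd "$repo"
git init -q
mkdir scratch
touch notes.txt keep.R scratch/a.R scratch/b.R

# Ignore one named file, and everything inside the scratch directory.
printf 'notes.txt\nscratch/*\n' > .gitignore

git status --short                       # only .gitignore and keep.R show up as new
git check-ignore notes.txt scratch/a.R   # prints the paths that Git is ignoring
```
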

Task
  • commit any outstanding changes to your repo
  • create a new R script in the project root directory called “secret.R”
  • create a new folder called “also_secret”, and add a couple of empty R scripts inside it
  • review what happens in the Git pane
  • now edit .gitignore to ignore all the new stuff you’ve just made
  • again, review the Git pane

Branching

If you’re doing something more involved to your file(s), you might consider creating a branch. For example, say you’re planning long-term improvements to the contents of main. You might create a branch - dev, say - to do that development work. That’d free you to potentially build an entirely new piece of work without needing to keep the old one working.

  • click New Branch in the Git pane and name it dev
    • at the start, your new branch will be a copy of the current state of main
  • now re-write your script
    • feel free to make multiple commits on the way!
  • now look again at the Git history: you’ll see coloured blobs telling you which branches changes were made on
  • switch to main, and in the terminal run git merge dev to bring your dev changes in
    • you can delete dev now with git branch -d dev
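
The whole branching round trip can be sketched in the terminal too. This assumes a scratch directory, a made-up file called sums.R, and a placeholder identity. It also creates the branch with git checkout -b instead of the New Branch button.

```shell
# Throwaway repository with one commit on main (all names are hypothetical).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"
echo "2 + 2" > sums.R
git add sums.R
git commit -q -m "first commit"
git branch -m main

git checkout -q -b dev              # create dev as a copy of main, and switch to it
echo "3 * 3" > sums.R               # re-write the script on dev
git add sums.R
git commit -q -m "rewrite the script on dev"

git checkout -q main                # back on main, the script is unchanged
cat sums.R

git merge -q dev                    # bring the dev changes into main
cat sums.R

git branch -d dev                   # dev is merged, so it can be deleted safely
```

Because main has not changed since dev was created, this merge simply moves main forward to the latest dev commit.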

Git bash

You’ll notice that we don’t have all the tools we might ever need in the Git pane in RStudio. Git is mainly intended to be used via the command line in the Terminal.

Git bash toolkit

  • git init = start a new Git repo here
  • git status = what’s currently changed, staged, or untracked
  • git add . = stage everything in the current directory
  • git commit -m "commit message" = commit everything that’s currently staged
  • git checkout -b emma = create a branch named “emma” and switch to it
  • git branch -m steve = rename current branch to “steve”
  • git checkout main = switch back to the main branch

GitHub = a key tool for distributed version control

GitHub is based on Git, but provides a centralised location for repositories. For example, Microsoft host their source code on GitHub, as do PHS, NHS Digital, and many others. As well as hosting the repository, and allowing users to choose to share their code, GitHub provides many other tools for software development. PHS provide an excellent quick introduction to GitHub workflow on their guidance page.

Some organisations use Gitea, which is similar to GitHub, except that it can be self-hosted on private servers. This is useful for sharing confidential code securely with authorised users.

Tools and resources

https://ohshitgit.com/

https://happygitwithr.com/https-pat