Neural nets made ridiculously simple

AI/ML
beginner
Published

September 15, 2025

Previous attendees have said…

  • 8 previous attendees have left feedback
  • 100% would recommend this session to a colleague
  • 100% said that this session was pitched correctly

Three random comments from previous attendees
  • Great introductions to the origin and concept of Neural Networks.
  • Helpful overview
  • A very accessible introduction to a very complicated topic; highly recommended for anyone interested in learning more but unsure where to begin.

Welcome

  • this session is 🌶: for beginners

What’s this session for?

  • neural nets are a core technology for AI/ML systems
  • they’ve been around for decades (and will probably be around for decades to come)
  • they’re also particularly helpful for health & care folk as a way of understanding AI/ML tools in general

What this session won’t do

  • give a general introduction to AI/ML
  • explain how to build a neural net of your very own
  • discuss in any detail the (often formidable) maths of neural nets

Biology: the neurone

  • biological neurones:
    • receive input from upstream neurone(s)
    • process that input in some way
    • generate output(s) in response, and pass them downstream

Biology: activation

Action potential (image, Wikimedia Commons): https://upload.wikimedia.org/wikipedia/commons/thumb/4/4a/Action_potential.svg/778px-Action_potential.svg.png
  • neurones respond to stimuli
    • threshold-y: they only fire once input passes a threshold
    • approximately digital output (on/off)
    • sometimes complex behaviour in how inputs are combined

Biology: networks of neurones

  • neurones are usually found in networks
  • can produce complex and sophisticated behaviours in a robust way
    (image: Leon Benjamin on Flickr)
  • non-obvious relationships between structure and function

Machines: the node

Here’s a simple representation of a node, implemented in code, that we might find in a neural network:

Simple node representation
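
Purely as an illustration (the session's own code example appears below), here is one way such a node could be sketched in R, using a made-up rule that the node fires when the sum of its inputs is positive:

# a toy node: take some inputs, process them, return an output
# (the "fires when the sum of inputs is positive" rule is just an illustrative assumption)
toy_node <- function(inputs) {
  sum(inputs) > 0
}

toy_node(c(0.5, -0.2, 0.1))
[1] TRUE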

Machines: activation functions

Here are some example input:output pairs for our node:

Node input:output pairs
  • there are lots of possible activation functions
  • a simple one: NOT
    • our node outputs TRUE when we input FALSE, and vice versa

This flexibility means that we can build networks of nodes (hence neural networks). Again, a very simple example:

Activation functions can be extremely simple

node <- function(input){
  !input
}

node(TRUE)
[1] FALSE

Machines: networks of nodes

Networks of nodes

Machines: networks of nodes

  • nodes are usually found in networks
  • can produce complex and sophisticated behaviours in a robust way
    (image: Chat GPT nonsense)
  • again, non-obvious relationships between structure and function in artificial neural networks (ANNs)

A user supplies some input. That input is fed into an input node, which processes it and produces three different outputs that are then fed into a second layer of nodes. Further processing happens in this hidden layer, leading to three outputs that are integrated together in a final output node, which processes the outputs of the hidden layer into a single output. A sketch of this structure in code follows.
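
To make this concrete, here is a minimal sketch of that structure in R. The processing rules (multiplying the input, the thresholds) are invented purely for illustration and are not part of the session's own material:

# a toy feed-forward network matching the description above;
# all the processing rules are arbitrary illustrative choices
input_node  <- function(x) c(x, x * 2, x * 3)  # one input in, three outputs out
hidden_node <- function(x) x > 1               # a hidden node fires if its input exceeds 1
output_node <- function(x) sum(x) >= 2         # output fires if at least two hidden nodes fired

network <- function(x) {
  hidden_in  <- input_node(x)                  # input layer
  hidden_out <- sapply(hidden_in, hidden_node) # hidden layer (three nodes)
  output_node(hidden_out)                      # output layer
}

network(0.8)
[1] TRUE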

Several kinds of networks

  • there are lots of ways that neural networks can be arranged
  • our example above = feed-forward
    • all the nodes are connected from left-to-right
  • but more complex architectures - like recurrent neural networks - might have feedback loops and other biological-ish features
  • different numbers of layers
  • lots of different designs have been tried since neural nets were first introduced in the 1950s [@rosenblatt1958]
  • most current ANNs, even fancy ones, are architecturally simple

Why ANNs?

  • ANNs can potentially replicate any input-output transformation
  • we do that by a) increasing complexity and b) allowing them to ‘learn’

Example of hidden layers

How do hidden layers help?

  • all this complexity allows input to be processed differently
  • like neurones, downstream nodes either fire (activate) or do not fire depending on their inputs
  • to understand how that takes place, we need three new ideas (sketched in code after this list):
    • an activation function which determines how a node fires in response to its input
    • a set of input weights which set the strength of each input. Higher-weighted inputs are more likely to activate the node
    • bias = the strength of a node’s output. Higher bias = higher chance of our node’s output firing the downstream node
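
Here is a minimal sketch of a single node built from those three ideas. The weights, bias, and threshold are arbitrary illustrative values, and (as is usual) the bias is added to the weighted sum before the activation function is applied:

# a toy node: weighted sum of inputs, plus a bias, passed through an activation function
weighted_node <- function(inputs,
                          weights = c(0.5, -0.3, 0.8),  # illustrative weights
                          bias = 0.1) {                 # illustrative bias
  activation <- function(x) x > 0                       # a simple binary activation function
  activation(sum(inputs * weights) + bias)
}

weighted_node(c(1, 1, 1))
[1] TRUE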

Different activation functions

  • binary (true/false)
  • continuous
    • linear
    • non-linear (like sigmoid, ReLU)
  • different activation = different node behaviour = different network functions

Sigmoid activation

ReLU activation
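
Both of these activation functions are tiny to write in R; the plots above are presumably just these functions drawn over a range of inputs:

# sigmoid: squashes any input into the range 0-1
sigmoid <- function(x) 1 / (1 + exp(-x))

# ReLU (rectified linear unit): zero for negative inputs, the input itself otherwise
relu <- function(x) pmax(0, x)

sigmoid(c(-5, 0, 5))
[1] 0.006692851 0.500000000 0.993307149
relu(c(-5, 0, 5))
[1] 0 0 5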

Training in neural networks

  • ANNs can be trained
    • take a dataset
    • split it into training and test parts
      • classify (by hand) the training data
    • then train
      • feed your ANN the training data and evaluate how well it performs
      • modify the ANN based on that evaluation
      • repeat until done/bored/perfect
    • finally, test your model with your unlabelled test data and evaluate
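
A minimal sketch of the splitting step in R, using the built-in iris data purely as a stand-in dataset:

# split a dataset into training and test parts (an illustrative 80/20 split)
set.seed(42)
train_rows <- sample(nrow(iris), size = 0.8 * nrow(iris))

training_data <- iris[train_rows, ]   # used (with labels) for training
test_data     <- iris[-train_rows, ]  # held back for evaluation

nrow(training_data)
[1] 120
nrow(test_data)
[1] 30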

Training is about weights and biases

  • varying weights and biases changes how our network responds to input

  • you might start with random weights/biases, or with everything set to 1, or something more complicated

  • training works by varying weights and biases across the network, and evaluating the overall response across the training data; each full pass through the training data is called an epoch

LeCun initialization: draw the starting weights at random, with their spread scaled by the number of inputs to the node, so nodes with many inputs start with smaller weights.

Xavier initialization: similar to the LeCun method, but scales the spread by both the number of inputs and the number of outputs; it is the default in libraries such as Keras.

He initialization: scales the spread by the number of inputs in a way designed to suit ReLU activation functions, which makes it popular in deep learning.
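
As a rough sketch, the three schemes differ mainly in how they scale the spread of the random starting weights; the formulas below are one common (normal-distribution) formulation, with the layer sizes made up for illustration:

# illustrative starting weights for a node with n_in inputs in a layer with n_out outputs
n_in  <- 784
n_out <- 128

lecun_weights  <- rnorm(n_in, mean = 0, sd = sqrt(1 / n_in))            # scale by inputs
xavier_weights <- rnorm(n_in, mean = 0, sd = sqrt(2 / (n_in + n_out)))  # scale by inputs and outputs
he_weights     <- rnorm(n_in, mean = 0, sd = sqrt(2 / n_in))            # scale by inputs, suits ReLU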

Example uses

Handwriting recognition with MNIST

  • a classic dataset
  • recognizing handwritten numbers = actually-important task
  • 60000 labelled training images
  • 10000 test images
  • each is a 28*28 pixel matrix, with grey values encoded as 0-255

MNIST data example

V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20
0 0 0 0 0 0 0 0 0 0 0
0 0 0 3 18 18 18 126 136 175 26
36 94 154 170 253 253 253 253 253 225 172
253 253 253 253 253 253 253 253 251 93 82
253 253 253 253 253 198 182 247 241 0 0
156 107 253 253 205 11 0 43 154 0 0
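
If you want to get hold of this data in R yourself, one option (and an assumption on my part about where the matrix above came from) is the {dslabs} package:

# one way to load MNIST in R: a list with train/test images and labels
mnist <- dslabs::read_mnist()

dim(mnist$train$images)      # 60000 rows, one 784-pixel (28*28) image per row
length(mnist$train$labels)   # 60000 labels, one per image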

Train for MNIST

  • take your training data
  • put together a neural network (number of nodes, layers, feedback, activation functions)
  • run the training data, and evaluate based on labelling
  • modify your neural network, rinse, and repeat
  • when happy, try the unlabelled test data
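
As a very rough sketch, here is what that loop might look like using the {keras} package (one option among many). The layer sizes, activation functions, and number of epochs are arbitrary illustrative choices, and the mnist object is assumed to be the list loaded earlier:

library(keras)

# scale pixel values to 0-1 and one-hot encode the labels
x_train <- mnist$train$images / 255
y_train <- to_categorical(mnist$train$labels, 10)

# a small feed-forward network: 784 inputs -> 128 hidden nodes (ReLU) -> 10 outputs (softmax)
model <- keras_model_sequential() |>
  layer_dense(units = 128, activation = "relu", input_shape = 784) |>
  layer_dense(units = 10, activation = "softmax")

model |> compile(
  optimizer = "adam",
  loss = "categorical_crossentropy",
  metrics = "accuracy"
)

# each pass through the training data is one epoch; evaluate and adjust as you go
model |> fit(x_train, y_train, epochs = 5, validation_split = 0.2)

# when happy, try the held-back test data
x_test <- mnist$test$images / 255
y_test <- to_categorical(mnist$test$labels, 10)
model |> evaluate(x_test, y_test)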

MNIST examples

  • lots of different examples
  • why do this work in an ANN, rather than in some other tool?
    • generic training strategy - reduces need for domain knowledge
    • hopefully robust outcomes - giving models able to work across contexts
    • scalable

MNIST in R

An aside here for the R enthusiasts - we can plot the handwritten numbers back out of the data using ggplot():

library(dplyr)
library(ggplot2)
library(glue)

# assumes `mnist` is a list with train$images (one 784-pixel image per row)
# and train$labels, e.g. as returned by dslabs::read_mnist()

mnist_plot_dat <- function(df) {
  # matrix to pivoted tibble for plotting
  df |>
    as_tibble() |>
    mutate(rn = row_number()) |>
    tidyr::pivot_longer(!rn) |>
    mutate(name = as.numeric(gsub("V", "", name)))
}

mnist_main_plot <- function(df) {
  # plot one image as a tile map, darker tiles for higher pixel values
  df |>
    ggplot() +
    geom_tile(aes(
      x = rn,
      y = reorder(name, -name),
      fill = value
    )) +
    scale_fill_gradient2(mid = "white", high = "black")
}

mnist_plot <- function(n) {
  # plot the nth training image, titled with its label
  mnist_plot_dat(matrix(mnist$train$images[n, ], 28, 28)) |>
    mnist_main_plot() +
    ggtitle(glue("Label: {mnist$train$labels[n]}")) +
    theme_void() +
    theme(legend.position = "none")
}

gridExtra::grid.arrange(grobs = purrr::map(1:36, mnist_plot), nrow = 6, top = "Some MNIST examples")

X-ray anomaly detection

A surprise while reviewing the content