Scope of the possible with R
Session materials
- all materials
- slides: html / pdf
Welcome
- this session is a non-technical overview designed for service leads
Session outline
- Why R, and why this session?
- R demo - take some data, load, tidy, analyse
- Strengths and weaknesses
  - obvious
  - less obvious
- Alternatives
- Skill development
R
- free and open-source
- multi-platform
- large user base
- prominent in health, industry, biosciences
Why this session?
- R can be confusing
  - it’s code-based, and most of us don’t have much code experience
  - it’s used for some inherently complicated tasks
  - it’s a big product with lots of add-ons and oddities
- But R is probably the best general-purpose toolbox we have for data work at present
  - big user base in health and social care
  - focus on health and care-like applications
  - not that hard to learn
  - extensible and flexible
  - capable of enterprise-y, fancy uses
R demo
- this is about showing what’s possible, and giving you a flavour of how R works
- we won’t explain code in detail during this session
- using live open data
Load that data
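The loading code isn't shown on the slide, so this is only a minimal sketch of what it might look like, not the session's actual code. It assumes the readr package, and it uses three rows of inline CSV text (copied from the preview table, with the already-shortened column names) as a stand-in for the real open data download. In practice you would pass the address of the published CSV file to `read_csv()` instead.

```r
library(readr)

# Inline stand-in for the open data CSV: three rows taken from the preview table.
# In real use, replace I(csv_text) with the URL or file path of the published CSV.
csv_text <- "date,country,hb,loc,type,attend,n_within,n_4
20170416,S92000003,S08000022,H212H,Emergency Department,171,166,5
20190811,S92000003,S08000020,N121H,Emergency Department,287,287,0
20170122,S92000003,S08000031,C418H,Emergency Department,1271,1129,142"

# I() tells read_csv() to treat the string as data rather than a file name
ae_activity <- read_csv(I(csv_text), show_col_types = FALSE)

ae_activity
```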
One small bit of cheating: renaming
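The renaming code isn't shown either. As a rough sketch, assuming dplyr is installed, `rename()` takes `new_name = old_name` pairs. The original column names below (`WeekEnding`, `Board`, `Attendances`) are made up for illustration and are not the real dataset's headers.

```r
library(dplyr)

# Hypothetical raw column names, for illustration only
raw <- data.frame(
  WeekEnding = c(20170416, 20190811),
  Board = c("S08000022", "S08000020"),
  Attendances = c(171, 287)
)

# rename() uses new_name = old_name
ae_activity <- raw |>
  rename(date = WeekEnding, hb = Board, attend = Attendances)

names(ae_activity)
```

Short, consistent names like these make the later steps easier to read.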
Preview
date | country | hb | loc | type | attend | n_within | n_4 | perc_4 | n_8 | perc_8 | n_12 | perc_12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
20170416 | S92000003 | S08000022 | H212H | Emergency Department | 171 | 166 | 5 | 97.1 | 0 | 0.0 | 0 | 0 |
20190811 | S92000003 | S08000020 | N121H | Emergency Department | 287 | 287 | 0 | 100.0 | 0 | 0.0 | 0 | 0 |
20170122 | S92000003 | S08000031 | C418H | Emergency Department | 1271 | 1129 | 142 | 88.8 | 7 | 0.6 | 0 | 0 |
20160814 | S92000003 | S08000019 | V217H | Emergency Department | 1180 | 1080 | 100 | 91.5 | 3 | 0.3 | 0 | 0 |
20180422 | S92000003 | S08000024 | S319H | Emergency Department | 1020 | 973 | 47 | 95.4 | 2 | 0.2 | 0 | 0 |
Removing data
ae_activity <- ae_activity |>
  dplyr::select(!c(country, dplyr::contains("perc_")))  # drop country and the percentage columns
date | hb | loc | type | attend | n_within | n_4 | n_8 | n_12 |
---|---|---|---|---|---|---|---|---|
20180204 | S08000032 | L308H | Emergency Department | 1361 | 1133 | 228 | 43 | 18 |
20230716 | S08000015 | A210H | Emergency Department | 627 | 434 | 193 | 108 | 71 |
20210530 | S08000020 | N411H | Emergency Department | 484 | 410 | 74 | 7 | 2 |
20180401 | S08000029 | F704H | Emergency Department | 1285 | 1230 | 55 | 3 | 0 |
20231015 | S08000022 | H103H | Emergency Department | 185 | 163 | 22 | 6 | 2 |
Tidying data
ae_activity <- ae_activity |>
  dplyr::mutate(date = lubridate::ymd(date))  # turn e.g. 20180204 into a proper date
date | hb | loc | type | attend | n_within | n_4 | n_8 | n_12 |
---|---|---|---|---|---|---|---|---|
2019-03-10 | S08000025 | R103H | Emergency Department | 129 | 120 | 9 | 2 | 0 |
2022-02-13 | S08000015 | A111H | Emergency Department | 1059 | 759 | 300 | 137 | 94 |
2021-10-31 | S08000016 | B120H | Emergency Department | 502 | 384 | 118 | 44 | 25 |
2015-09-20 | S08000022 | H212H | Emergency Department | 180 | 168 | 12 | 0 | 0 |
2020-03-01 | S08000031 | G107H | Emergency Department | 1799 | 1450 | 349 | 18 | 1 |
Subset data
- we’ll take a random selection of 5 health boards to keep things tidy
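The subsetting code isn't shown, so here is one way the random selection could be sketched, assuming dplyr. The toy `ae_activity` below, with eight made-up board codes, stands in for the real data; only the `hb` column name comes from the tables in these slides.

```r
library(dplyr)

# Toy stand-in for ae_activity: 8 made-up health boards, 3 rows each
ae_activity <- data.frame(
  hb = rep(paste0("HB", 1:8), each = 3),
  attend = 101:124
)

set.seed(42)  # fix the seed so the same boards are picked on every run

# pick 5 distinct board codes at random, then keep only their rows
boards <- sample(unique(ae_activity$hb), 5)

ae_activity <- ae_activity |>
  filter(hb %in% boards)
```

Sampling the board codes first, rather than sampling rows, keeps every row for each chosen board.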
date | hb | loc | type | attend | n_within | n_4 | n_8 | n_12 |
---|---|---|---|---|---|---|---|---|
2019-03-31 | S08000022 | C121H | Emergency Department | 166 | 158 | 8 | 1 | 0 |
2023-10-01 | S08000020 | N411H | Emergency Department | 561 | 419 | 142 | 51 | 22 |
2023-05-14 | S08000022 | C121H | Emergency Department | 172 | 152 | 20 | 2 | 0 |
2023-10-22 | S08000020 | N121H | Emergency Department | 239 | 209 | 30 | 1 | 0 |
2015-06-14 | S08000024 | S319H | Emergency Department | 965 | 959 | 6 | 0 | 0 |
Basic plots
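The plots themselves aren't reproduced here. As an illustration of what a basic plot could look like, this sketch assumes the ggplot2 package and uses a small made-up data frame (the values and board names are invented), drawing one line of weekly attendances per health board.

```r
library(ggplot2)

# Made-up weekly attendances for two boards (illustrative values only)
ae_weekly <- data.frame(
  date = rep(as.Date("2023-01-01") + 7 * 0:3, times = 2),
  hb = rep(c("Board A", "Board B"), each = 4),
  attend = c(120, 135, 128, 140, 300, 310, 295, 320)
)

# one line per health board, attendances over time
p <- ggplot(ae_weekly, aes(x = date, y = attend, colour = hb)) +
  geom_line() +
  labs(x = "Week ending", y = "Attendances", colour = "Health board")

p
```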
Joining data
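The join code isn't shown. A minimal sketch, assuming dplyr and a hospital lookup table keyed on the `loc` code, might look like this. The lookup's contents (names and coordinates) are placeholders rather than real values; the column names `HospitalName`, `longitude` and `latitude` are taken from the mapping code later in the demo.

```r
library(dplyr)

# Small stand-in for the activity data
ae_activity <- data.frame(
  loc = c("H212H", "N121H", "H212H"),
  attend = c(171, 287, 180)
)

# Hypothetical lookup: names and coordinates are placeholders, not real values
hospitals <- data.frame(
  loc = c("H212H", "N121H"),
  HospitalName = c("Hospital A", "Hospital B"),
  longitude = c(-4.0, -2.0),
  latitude = c(57.0, 57.5)
)

# keep every activity row and attach the matching hospital details
ae_activity_loc <- ae_activity |>
  left_join(hospitals, by = "loc")
```

A left join keeps all the activity rows even when a location code has no match in the lookup, which makes missing matches easy to spot as `NA` values.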
and again…
Add to a map
ae_activity_loc |>
  leaflet::leaflet() |>
  leaflet::addTiles() |>  # default OpenStreetMap background
  leaflet::addMarkers(~longitude, ~latitude, label = ~HospitalName)  # one marker per row
Then make that map more useful
ae_activity_loc |>
  dplyr::group_by(HospitalName) |>
  dplyr::summarise(  # one row per hospital
    attend = sum(attend),
    n_within = sum(n_within),
    longitude = min(longitude),
    latitude = min(latitude)
  ) |>
  dplyr::mutate(rate = paste(HospitalName, "averages", scales::percent(n_within / attend, accuracy = 0.1))) |>
  leaflet::leaflet() |>
  leaflet::addTiles() |>
  leaflet::addMarkers(~longitude, ~latitude, label = ~rate)
Then add to reports, dashboards…
Strengths
- enormous scope and flexibility
- a force-multiplier for fancier data work
- helps collaboration within teams, between teams, between orgs
- reproducible analytics
- modular approaches to large projects
- decreasing pain curve: the fancier the project, the bigger the payoff
Weaknesses
- harder to learn than competitors
- very patchy expertise across health and social care in Scotland
- complex information governance (IG) landscape
- messy skills development journey