PowerQuery - a practical introduction

Brendan Clarke, NHS Education for Scotland, brendan.clarke2@nhs.scot

17/07/2024

Welcome

  • 🌶 this session is a beginner-level introduction to PowerQuery

  • we’ll get going properly at 14.35

  • you’ll either Excel desktop (>2007) or Power BI desktop

  • you’ll also need to have basic competency in Excel or Power BI

  • if you can’t access the chat, you might need to join our Teams channel: tinyurl.com/kindnetwork

  • you can find session materials at tinyurl.com/kindtrp

The KIND network

  • a social learning space for staff working with knowledge, information, and data across health, social care, and housing in Scotland
  • we offer social support, free training, mentoring, community events, …
  • Teams channel / mailing list

Excel / Power BI training sessions

Session Date Area Level
Lambda formulas in Excel 13:00-13:30 Mon 15th July 2024 Excel 🌶🌶 : intermediate-level
Lookups in Excel 13:00-14:30 Thu 1st August 2024 Excel 🌶🌶 : intermediate-level
Relative, absolute, mixed, structured, and R1C1 references in Excel 15:00-16:00 Thu 8th August 2024 Excel 🌶🌶 : intermediate-level
  • this session is based on our Power BI for Beginners course
  • but PowerQuery isn’t specific to Power BI - it’s brilliant for Excel too

Session outline

  • what is PowerQuery, and why might you use it?
  • getting data into PowerQuery
  • basic data wrangling
  • joins (merges)
  • pivots

What is PowerQuery?

How I think of PowerQuery

  • the right way to load any real data into Excel

Why might you use it?

  • powerful
  • easy, to start with at least…
  • allows you to standardise data loading from different sources
  • allows you to repeat data loading

Getting data into PowerQuery (Excel)

  1. Open Excel and create a new, blank, workbook
  2. Data > Get & Transform Data
    Get & Transform Data
  3. Pick From Web and supply the following URL - https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-07-25/scurvy.csv
  4. From the preview, select Transform data to open PowerQuery

Getting data into PowerQuery (Power BI)

  1. Open Power BI and start a new report
  2. Select Get data > From Web button in the ribbon
  3. Supply the following URL - https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-07-25/scurvy.csv
  4. From the preview, select Transform data to open PowerQuery
    Select Transform data to open PowerQuery
  5. If you’ve already got some data in your report that you’d like to transform, use Tranform data in the ribbon to open PowerQuery:
    Tranform data in the ribbon :::

Preview your Data

make sure to click Transform data
(make sure to click Transform data)

First view of PowerQuery

First view of PowerQuery

  • Queries (= datasets) on the left
  • ribbon menu at the top
  • Query settings (= the changes you’ll make to your data) on the right
  • Excel-ish data preview in the middle

Basic data wrangling

  • right click the study_id column and select Remove
    • the column will disappear
    • but you’ll also gain a step in your Applied Steps field

Basic data wrangling

  • you can edit this (and any) step
    • if you’re code-confident, you can edit the formula to read = Table.RemoveColumns(#"Changed Type",{"study_id", "treatment"}) to also remove the treatment column
    • and you can remove the step using the X next to the step name
  • you should also try renaming and reordering your columns in the same way
  • once you’ve had a play, please remove the steps until you’re left with the data you started with (so delete back to “Changed Type”)

Replacing values

  • we can also work with individual values
  • our values are a number, then an underscore, then a word
  • we can Split column at the underscore
  • we could also replace the odd underscores in the treatment column

Reshaping data

  • we could go column by column here splitting, but an easier approach would be to re-shape the data
  • select the first three columns, and select Unpivot Other Columns
  • then split, change type, and generally tidy up

Add columns

Our treatment and dosing columns are annoying. Let’s make a new, joined, column for them:

  • select Custom Column
    Custom Column
  • you can click the gearwheel next to the step if you’d like to go back and tweak

Close and load / Close and Apply sends your data back to Excel/Power BI

Getting more fancy: a bigger dataset

A challenge

Can you find the number of responses to each question for your local NHS board? You should be able to re-work this data to give something like:

Some hints

  • filtering works like Excel, but is destructive
  • unpivoting, doing something to all the rows, then re-pivoting is a great way to work across lots of columns
  • you can join structurally similar datasets using Append Queries as New
    Append Queries as New
  • you can deduplicate entire datasets by selecting all the columns and using Remove Rows > Remove Duplicates

Feedback and resources

  • please can I ask for some feedback - takes less than a minute, completely anonymous, helps people like you find the right training for them
Session Date Area Level
Lambda formulas in Excel 13:00-13:30 Mon 15th July 2024 Excel 🌶🌶 : intermediate-level
Lookups in Excel 13:00-14:30 Thu 1st August 2024 Excel 🌶🌶 : intermediate-level
Relative, absolute, mixed, structured, and R1C1 references in Excel 15:00-16:00 Thu 8th August 2024 Excel 🌶🌶 : intermediate-level