No feedback found for this session
Intermediate Power BI Session 3
Power BI
intermediate
Intermediate Power BI course
DAX
This is part of our intermediate-level Power BI desktop course. The course structure is as follows:
- session 1: a simple build-a-dashboard session as a warmup, reminder about core Power BI techniques, and sharing some suggestions for good workflow practices
- session 2 (this session): DAX language foundations, including FILTER and CALCULATE
- session 3: core PowerQuery functions to manage sources, join data sets, pivot, and tidy values
- session 4: PowerQuery M language foundations, including a basic overview of syntax, a core function toolkit, advanced mode, and reproducibility
- session 5: a second, more ambitious, build-a-dashboard session to put all that PowerQuery and DAX knowledge into practice
Session aim
- to cover core Power Query (PQ) functions and techniques
Where does this fit in?
- session 1: a simple build-a-dashboard session as a warmup, reminder about core Power BI techniques, and sharing some suggestions for good workflow practices
- session 2: DAX language foundations, including FILTER and CALCULATE
- session 3: core PowerQuery functions to manage sources, join data sets, pivot, and tidy values
- session 4: PowerQuery M language foundations, including a basic overview of syntax, a core function toolkit, advanced mode, and reproducibility
- session 5: a second, more ambitious, build-a-dashboard session to put all that PowerQuery and DAX knowledge into practice
Session outline
- thinking about queries - sources, connections, code folding, duplicating/referencing
- data formats for Power BI - tidy data, and key-value pairs, and interconverting with pivots
- joins and relationships
- normalising and the star schema
Data for this session
Getting started
- open Power BI
- create a blank report
- then click
Tranform data
to open PQ
Loading data from Power Query
- all the Power BI connectors are available in PQ
New query > Web
- supply the link for the NRS vital events data for Q3 2024: https://www.nrscotland.gov.uk/media/arkfqyyr/q3-2024-data.xlsx
Tables and sheets
- we want table 6
(TableQ6_Deaths_by_sex_cause_and_NHS_Board_area
) - please import that from the navigator
Tidying up
- remove the three
All
rows - check the data types
- fix the query name to
cause_death
This data isn’t quite tidy
Shaping your data
- two main approaches found in Power BI:
- Excel-like tidy data
- SQL-like relational data
- tidy data is conceptually easier to build
- joins/merges
- splitting columns
- but really Power BI prefers relational data
- more performant
- easier to manage complex datasets
Tidy data
- largely covered in session one, but a quick reminder about splitting and merging
Key-value pair data
Key | Value |
---|---|
Pneumonia | 144 |
Asthma | 20 |
Influenza 3 | 2 |
Relational data in two minutes
- break your data into multiple tables
- give each item a key
- build relationships between those keys
Take some wide data
NHS Board | AED count | AED model | Manufacturer | Cost |
---|---|---|---|---|
NHS A&A | 3 | AED Plus Fully Automatic Defibrillator | ZOLL | 1305 |
NHS A&A | 4 | AED Plus Fully Automatic Defibrillator | ZOLL | 975 |
NHS A&A | 5 | Philips HeartStart FRx Semi-Automatic Defibrillator with Carry Case | Philips | 1200 |
Data from sja.org.uk
Break into tables
NHS Board | AED count |
---|---|
NHS A&A | 3 |
NHS A&A | 4 |
NHS A&A | 5 |
AED model | Manufacturer | Cost |
---|---|---|
AED Plus Fully Automatic Defibrillator | ZOLL | 1305 |
AED Plus Fully Automatic Defibrillator | ZOLL | 975 |
Philips HeartStart FRx Semi-Automatic Defibrillator with Carry Case | Philips | 1200 |
Add unique keys
NHS Board | AED count | AED key |
---|---|---|
NHS A&A | 3 | ZOLL-1 |
NHS A&A | 4 | ZOLL-2 |
NHS A&A | 5 | PHIL-1 |
AED key | AED model | Manufacturer | Cost |
---|---|---|---|
ZOLL-1 | AED Plus Fully Automatic Defibrillator | ZOLL | 1305 |
ZOLL-2 | AED Plus Fully Automatic Defibrillator | ZOLL | 975 |
PHIL-1 | Philips HeartStart FRx Semi-Automatic Defibrillator with Carry Case | Philips | 1200 |
Star schema
- you might also see Power BI data described as a star schema
image: Wikimedia
Reshaping our data
- we’ll break out the cause of death parts of the data into a new table
- we’ll use the ICD 10 codes as a key
- duplicate query and rename that query to
ICD10
- remove anything other than ICD10 data in
ICD10
(and a note aboutTable.SelectColumns
) - remove duplicates via
Remove Rows > Remove Duplicates
- rename
ICD 10 Summary List
toICD10
- then remove the
Cause of death
column incause_death
, and rename theICD10
column to link with the new table
Duplicating vs referencing queries
- a duplicate query is an independent copy of a query that can be tweaked/extended
- a referenced query takes the output of a query, and uses it as the starting-point for additional work
Reshape cause_death
to key-value
- remove the
Scotland
column - then unpivot everything but
ICD10
andSex
- rename your new Attribute/Value columns to
board
andn
Create national data for benchmarking
- duplicate
cause_death
- name that new query
national
- remove everything but the national data with
= Table.SelectColumns(#"Filtered Rows",{"ICD 10 Summary List", "Scotland", "Sex"})
More data, same source
- go back to source step of the
cause_death
query - find the table data in the row
TableQ2b_Vital_events_by_NHS_Board_area
- right-click the
Table
, and selectAdd as New Query
- rename that new query
population
Make a population table
- remove everything but the first three columns
- remove the
null
row and theSCOTLAND
row - rename
NHS Board area
toboard
- clean and trim the
board
values - unpivot the population columns to
sex
andpopulation
- replace values and change case in the
sex
column to leaveMale
andFemale
values
Additional data sources
- you might like to add in a second data source at this point for practice
- e.g. population estimates, which could be added as a web source
- or second quarter 2024 NRS vital data as an Excel file