The data stack, hands-on
Try SQL, Power BI, Python and Databricks side-by-side — same UK retail scenario, same numbers, four different tools doing four different jobs. Everything runs in your browser; no signup, no setup.
New to data? Start with the beginner's guide →
From till scan to boardroom decision
Imagine you're a national retailer with 400 stores, a website, and a loyalty app. Here's how SQL, Power BI, Python and Databricks fit together to turn every transaction into a business decision.
Data collection
Every till scan, online checkout and loyalty-card tap lands in a central SQL database — one big set of related tables: customers, products, orders, returns.
Ask questions of the database
A data analyst writes SQL (Structured Query Language): short, English-like queries that ask the database questions. SQL is the language relational databases understand, used both to retrieve data and to change it, and it is the first thing you need to learn. The example below shows a typical query.
SELECT
    product_category,
    COUNT(*) AS transactions,
    SUM(price_gbp) AS total_revenue
FROM sales
WHERE region = 'London'
    AND order_date >= '2026-04-01'
    AND order_date < '2026-05-01'
GROUP BY product_category
ORDER BY total_revenue DESC;
Aggregates every London sale in April 2026 by product category, returning total revenue and transaction count per category, sorted highest revenue first.
- SELECT — Choose which columns to return — here: the category name, a count of transactions, and the total revenue.
- COUNT(*) — Aggregate function: counts the number of rows in each group. With GROUP BY product_category, you get one count per category.
- SUM — Aggregate function: adds up all values in a column. SUM(price_gbp) gives the total revenue per category.
- AS — Renames a column in the output. AS transactions / AS total_revenue makes the result table headings readable.
- FROM — Read from the sales table.
- WHERE — Filter rows: London only, dates between 1st April (inclusive) and 1st May (exclusive).
- GROUP BY — Roll up rows that share a category so COUNT and SUM apply per category, not per row.
- ORDER BY — Sort the result — largest revenue first, descending.
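To try the query without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The five London rows come from this page's sample data; the Leeds row and the May row are made up purely to show the WHERE clause filtering them out, and SQLite is only a stand-in for whatever database a real retailer would run.

```python
import sqlite3

# First five rows: the sample London sales shown on this page.
# Last two rows: hypothetical, added so the WHERE clause has something to exclude.
rows = [
    ("2026-04-01", "Electronics", "Headphones", 199.99, "London"),
    ("2026-04-02", "Sports", "Running shoes", 79.99, "London"),
    ("2026-04-02", "Home & Garden", "Lamp", 34.99, "London"),
    ("2026-04-03", "Sports", "Yoga mat", 24.99, "London"),
    ("2026-04-03", "Clothing", "Jeans", 39.99, "London"),
    ("2026-04-03", "Sports", "Football", 14.99, "Leeds"),    # wrong region
    ("2026-05-01", "Books", "Cookbook", 12.99, "London"),    # outside April
]

# An in-memory database disappears when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "order_date TEXT, product_category TEXT, product_name TEXT, "
    "price_gbp REAL, region TEXT)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT
    product_category,
    COUNT(*) AS transactions,
    SUM(price_gbp) AS total_revenue
FROM sales
WHERE region = 'London'
    AND order_date >= '2026-04-01'
    AND order_date < '2026-05-01'
GROUP BY product_category
ORDER BY total_revenue DESC;
"""

result = conn.execute(query).fetchall()
conn.close()

# One line per category, highest revenue first.
for category, transactions, total_revenue in result:
    print(f"{category}: {transactions} transactions, £{total_revenue:.2f}")
```

Dates are stored as ISO-format text (YYYY-MM-DD), so comparing them as strings gives the same result as comparing them as dates.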
Visualise the answer
A BI developer wires those queries into a Power BI dashboard. The marketing director opens one tab on Monday morning and sees bar charts, regional maps and league tables — never the SQL behind them.
- In-store: 45%
- Web: 37%
- Mobile app: 14%
- Click & collect: 4%
Build dashboards like that from scratch
Power BI Core walks you through connecting data, modelling it, and shipping interactive reports, using the same techniques behind the dashboard above. 24 hours · 113 lessons · £99 (was £129).
Move and transform data automatically
When SQL alone can't do the job — joining weather data to ice-cream sales, cleaning a messy supplier feed, emailing 400 store managers a personalised PDF — a data engineer writes Python. The glue between systems.
import pandas as pd
sales = pd.read_csv('sales.csv')
print(sales.head())
Loads a CSV into a pandas DataFrame and shows the top of the table, the quickest way to sanity-check the shape of any new data.
- import pandas as pd — Load the pandas library — Python's standard data-handling toolkit. Aliased as pd by universal convention.
- read_csv — Read a CSV file into a DataFrame (a table with rows + columns).
- head — Return the first 5 rows of the DataFrame. Default count is 5; pass a number for more.
- print — Pretty-print to the terminal / notebook output.
💡 Every pandas pipeline starts with read_csv (or read_sql, read_parquet, read_excel) + a quick .head() to confirm the data loaded correctly.
| order_date | product_category | product_name | price_gbp | region |
|---|---|---|---|---|
| 2026-04-01 | Electronics | Headphones | 199.99 | London |
| 2026-04-02 | Sports | Running shoes | 79.99 | London |
| 2026-04-02 | Home & Garden | Lamp | 34.99 | London |
| 2026-04-03 | Sports | Yoga mat | 24.99 | London |
| 2026-04-03 | Clothing | Jeans | 39.99 | London |
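Those five rows are enough to repeat the London-in-April aggregation in pandas. The sketch below is one possible translation of the SQL query, not the only one; it inlines the rows as CSV text (an assumption, so it runs without a sales.csv file on disk) and the variable names are illustrative.

```python
import io

import pandas as pd

# The five sample rows, inlined as CSV text so no file is needed.
csv_text = """order_date,product_category,product_name,price_gbp,region
2026-04-01,Electronics,Headphones,199.99,London
2026-04-02,Sports,Running shoes,79.99,London
2026-04-02,Home & Garden,Lamp,34.99,London
2026-04-03,Sports,Yoga mat,24.99,London
2026-04-03,Clothing,Jeans,39.99,London
"""

# parse_dates turns the order_date column into real datetimes.
sales = pd.read_csv(io.StringIO(csv_text), parse_dates=["order_date"])

# WHERE: London only, 1 April inclusive to 1 May exclusive.
london_april = sales[
    (sales["region"] == "London")
    & (sales["order_date"] >= pd.Timestamp("2026-04-01"))
    & (sales["order_date"] < pd.Timestamp("2026-05-01"))
]

# GROUP BY + COUNT + SUM, then ORDER BY total_revenue DESC.
summary = (
    london_april.groupby("product_category")
    .agg(
        transactions=("price_gbp", "count"),
        total_revenue=("price_gbp", "sum"),
    )
    .sort_values("total_revenue", ascending=False)
    .reset_index()
)

print(summary)
```

Each pandas step maps onto one SQL clause: the boolean mask plays the role of WHERE, groupby plus agg covers GROUP BY with COUNT and SUM, and sort_values does the job of ORDER BY.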
When the data won't fit on one computer
Some answers need to scan every row of years of history, far too much for one machine. Databricks splits the data into chunks, runs the analysis across many computers in parallel, then stitches their answers back together. That's distributed parallel processing: work that would take a single machine hours can finish in minutes or seconds. It is widely used by banks, NHS trusts, energy companies and large retailers.
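The split, process and stitch idea can be illustrated on a single machine. The sketch below is not the Databricks or Spark API: it uses threads from Python's standard library as a stand-in for separate computers, the rows are toy data, and the function names are made up for the illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Toy (category, price_gbp) rows, repeated to simulate a larger table.
rows = [
    ("Electronics", 199.99),
    ("Sports", 79.99),
    ("Home & Garden", 34.99),
    ("Sports", 24.99),
    ("Clothing", 39.99),
] * 200


def split(data, n_chunks):
    """Cut the data into roughly equal chunks, one per worker."""
    size = -(-len(data) // n_chunks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def revenue_by_category(chunk):
    """Partial answer for one chunk: revenue in pence per category."""
    totals = Counter()
    for category, price_gbp in chunk:
        totals[category] += round(price_gbp * 100)
    return totals


# 1. Split the data.
chunks = split(rows, 4)

# 2. Process every chunk in parallel (threads stand in for machines).
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(revenue_by_category, chunks))

# 3. Stitch the partial answers back together.
combined = sum(partials, Counter())

for category, pence in combined.most_common():
    print(f"{category}: £{pence / 100:,.2f}")
```

Revenue is held as whole pence so that adding partial totals in any order gives exactly the same answer. That works because a sum can be computed per chunk and then added up; an analysis such as a median needs a different merge step.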
| Category | Revenue | Transactions |
|---|---|---|
| Electronics | £227M | 36.0M |
| Sports | £185M | 42.0M |
| Clothing | £160M | 48.0M |
| Home & Garden | £101M | 22.0M |
| Books | £42.0M | 12.0M |
That's the stack, end-to-end
Databricks takes over both the storing (database) and the cleaning (Python) steps when the data outgrows a single database: same shape, bigger scale.
Ready to pick a track?
Ten short questions about your background, hours per week, and what kind of work appeals to you. Comes back with a recommended track, starting level, and a realistic time plan. No signup, no email capture — just a plan.
Got questions instead of an answer? Drop into the weekly Q&A — Thursdays 7pm UK, free.