What is “data” — and is a data career for you?
If you've seen “data analyst”, “data engineer”, or “BI developer” on job ads and wondered what those people actually do, this page is for you. No assumed knowledge, no jargon.
A short read, top to bottom. Click any section to jump straight to it:
- What is data?
- How to identify data
- How data is generated, stored and shared
- Data lifecycle
- A real scenario: a UK retailer
- Who's on a data team
If you come across a term you don't understand, it's explained here: The data glossary →
The short version
Every company collects information — customer records, sales, machine readings, NHS appointments, parcel deliveries. Data professionals turn that information into insights that drive business decisions: they clean it, organise it, create charts and reports that managers can analyse, and build the plumbing that moves data between systems.
You don't need a maths PhD. You don't need to be a coder. The day-to-day skills are closer to careful spreadsheet work with better tools, plus enough programming-like skills to ask the data the right questions.
Roles split into analysts (they turn data into charts and reports), engineers (they build the pipelines that move data around), and BI developers (they build the dashboards that executives use to make informed business decisions).
At its simplest, data is just recorded facts about purchases, people, events, sensors, anything. The job of data professionals is to turn those records into something useful: decisions, charts, alerts, automations.
Two flavours: quantitative and qualitative
Most data falls cleanly into one of two camps. Knowing which one you're looking at shapes the questions you can ask and the charts you can draw.
Quantitative
Things you can count or measure. Split into discrete (whole numbers: 412 stores, 3 children) and continuous (decimals: £8.27 basket value, 36.4°C temperature).
Qualitative
Categories, names, free-text, sentiment. Things you can group but not directly average: ‘Sports’ vs ‘Electronics’, ‘London’ vs ‘Manchester’, the words in a product review.
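To make the difference concrete, here is a minimal Python sketch. The sales rows are made up for illustration; the point is only that numbers can be averaged or summed, while categories can be grouped and counted.

```python
from collections import Counter
from statistics import mean

# Hypothetical sales rows: (category, city, basket_value_gbp, items_in_basket)
sales = [
    ("Sports", "London", 8.27, 3),
    ("Electronics", "Manchester", 120.00, 1),
    ("Sports", "Manchester", 15.50, 2),
    ("Electronics", "London", 64.99, 1),
]

# Quantitative, continuous: averaging makes sense.
average_basket = mean(row[2] for row in sales)

# Quantitative, discrete: whole-number counts can be added up.
total_items = sum(row[3] for row in sales)

# Qualitative: no meaningful average, but values can be grouped and counted.
sales_per_category = Counter(row[0] for row in sales)

print(f"Average basket: £{average_basket:.2f}")
print(f"Total items: {total_items}")
print(f"Sales per category: {dict(sales_per_category)}")
```

Asking for the "average category" would not mean anything, which is why the type of data decides which questions and charts are available.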
You're already surrounded by data
Every app, shop and service you use is collecting it. Once you start looking, you spot it everywhere. Some everyday examples:
Your supermarket loyalty card
Every scan tells the supermarket what you bought, when, where, and at what price. That's why your Clubcard discounts feel uncannily relevant — the data says you buy nappies every two weeks, so they email you a nappy voucher on week three.
Netflix's homepage
The order of rows you see, the thumbnails picked, the autoplay teaser — every choice is driven by data on what people like you have watched and abandoned. Two viewers see two different homepages.
NHS appointment letters
Behind every letter is a database of patient records, GP referrals, hospital capacity, and waiting-list rules. The decision ‘who gets seen Tuesday at 10am’ is a data question.
Your bank's fraud alerts
If a £40 transaction in Manchester is followed by a £400 one in Bangkok ten minutes later, the bank's fraud-detection system flags it instantly. That decision is a data model running in the background.
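As a rough illustration of what "a data model" can mean, here is a toy rule in Python. It is a deliberately simplified sketch, not how any real bank works: the 30-minute threshold, the field names and the function name are all assumptions made up for this example.

```python
from datetime import datetime, timedelta

# Hypothetical threshold, chosen only for illustration.
MAX_GAP = timedelta(minutes=30)

def looks_suspicious(previous, current, max_gap=MAX_GAP):
    """Flag a payment made in a different city too soon after the last one."""
    different_city = previous["city"] != current["city"]
    gap = current["time"] - previous["time"]
    return different_city and gap <= max_gap

manchester = {"amount": 40, "city": "Manchester", "time": datetime(2024, 1, 5, 12, 0)}
bangkok = {"amount": 400, "city": "Bangkok", "time": datetime(2024, 1, 5, 12, 10)}

print(looks_suspicious(manchester, bangkok))
```

Real fraud systems weigh many more signals than city and time, but the shape is the same: recorded facts go in, a decision comes out.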
How data gets to a data person
Data goes on a journey before anyone analyses it. Understanding this lifecycle is half the battle — it's also where you'll spend most of your career, regardless of which tool you specialise in.
- 1. Collected
Apps log clicks. Tills record sales. Forms capture sign-ups. Sensors stream temperatures. APIs pull in weather data. Anything happening, anywhere, can become a row in a table.
- 2. Stored
The most common home is a database: a structured set of tables, like spreadsheets that can talk to each other. Smaller jobs use files (CSV, Excel, JSON). Very large jobs use cloud data lakes (Databricks, Azure, AWS S3).
- 3. Cleaned
Real-world data is messy. Duplicate customers, typo'd postcodes, missing fields, dates in five different formats. Most of a data engineer's day is fixing this so the analysis built on top of it can be trusted.
- 4. Analysed
An analyst writes queries to answer business questions: ‘Which products did we sell more of last quarter than this one?’ The answer comes back as a small table of numbers.
- 5. Shared
That small table of numbers becomes a chart, a dashboard, a slide, or an email alert. Done well, a stakeholder can act on it in seconds without ever seeing the underlying data.
The big idea: By the time anyone makes a decision from data insights, the data has been collected from many sources, cleaned, organised, and presented in a way humans can read. Different data roles specialise in different parts of this chain.
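The five stages can be sketched in a few lines of Python. This is a toy walk-through under stated assumptions: the till records, table names and column names are invented for the example, and an in-memory SQLite database stands in for a real company database.

```python
import sqlite3

# 1. Collected: hypothetical till records, deliberately messy.
collected = [
    ("T001", "  trainers ", 59.99),
    ("T002", "Kettle", 24.50),
    ("T002", "Kettle", 24.50),   # the same sale recorded twice
    ("T003", "TRAINERS", 59.99),
    ("T004", "Kettle", None),    # missing price
]

# 2. Stored: an in-memory SQLite database stands in for a real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_sales (sale_id TEXT, product TEXT, price REAL)")
conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", collected)

# 3. Cleaned: drop duplicates and rows with no price, tidy product names.
conn.execute("CREATE TABLE sales (sale_id TEXT PRIMARY KEY, product TEXT, price REAL)")
raw_rows = conn.execute("SELECT sale_id, product, price FROM raw_sales").fetchall()
seen = set()
for sale_id, product, price in raw_rows:
    if price is None or sale_id in seen:
        continue
    seen.add(sale_id)
    conn.execute(
        "INSERT INTO sales VALUES (?, ?, ?)",
        (sale_id, product.strip().title(), price),
    )

# 4. Analysed: a query answers "how much revenue did each product bring in?"
revenue = conn.execute(
    "SELECT product, SUM(price) FROM sales GROUP BY product ORDER BY product"
).fetchall()

# 5. Shared: the small table of numbers becomes something readable.
report = [f"{product}: £{total:.2f}" for product, total in revenue]
print("\n".join(report))
```

In a real team each numbered comment would be a separate system, often owned by a different role, but the hand-offs follow the same order.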
How fresh does the data need to be?
Two big decisions shape every data pipeline: how often it should run, and how much data it should pull in each time. Different businesses sit in different boxes: a stock-trading firm wants every tick the moment it lands; a retailer's morning revenue report can wait until 6am.
Batch
Run hourly, nightly, weekly. The classic 'overnight job' that has yesterday's report on your desk by 9am. Easier to operate, cheaper to run.
Streaming
Each event flows through the pipeline within seconds. Fraud detection, live dashboards, IoT sensors, trading. More complex; more expensive.
Full load
Simple. Reload everything from scratch on each run. Fine for small tables; impractical once the data grows.
Incremental load
Track a 'last updated' timestamp; pull only rows that changed since then. Most production pipelines work this way.
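The difference between reloading everything and pulling only what changed can be sketched in Python. This is a minimal illustration, not a production pattern: the rows, the `updated_at` field and both function names are assumptions invented for the example.

```python
# Hypothetical source table: each row carries a 'last updated' timestamp.
# ISO-formatted timestamps of the same shape sort correctly as plain text.
source = [
    {"id": 1, "name": "Kettle", "updated_at": "2024-01-01T09:00:00"},
    {"id": 2, "name": "Trainers", "updated_at": "2024-01-02T14:30:00"},
    {"id": 3, "name": "Toaster", "updated_at": "2024-01-03T08:15:00"},
]

def full_load(source_rows):
    """Reload everything from scratch: the target is rebuilt on every run."""
    return {row["id"]: row for row in source_rows}

def incremental_load(source_rows, target, last_run):
    """Pull only rows changed since last_run and merge them into the target."""
    changed = [row for row in source_rows if row["updated_at"] > last_run]
    for row in changed:
        target[row["id"]] = row
    # Remember the newest timestamp seen, ready for the next run.
    new_watermark = max((row["updated_at"] for row in changed), default=last_run)
    return len(changed), new_watermark

# First run: copy all three rows and note the newest timestamp.
target = full_load(source)
watermark = max(row["updated_at"] for row in source)

# Later, one product is renamed and one new product appears in the source.
source[0] = {"id": 1, "name": "Kettle (1.7L)", "updated_at": "2024-01-04T10:00:00"}
source.append({"id": 4, "name": "Blender", "updated_at": "2024-01-05T11:00:00"})

# Second run: only the two changed rows are pulled, not all four.
rows_pulled, watermark = incremental_load(source, target, watermark)
print(f"Rows pulled: {rows_pulled}, target now holds {len(target)} rows")
```

With four rows the saving is trivial; with hundreds of millions it is the difference between a job that finishes in minutes and one that runs all night.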
End to end: from till scan to boardroom decision
Imagine you're a national retailer with 400 stores, a website, and a loyalty app. Here's how SQL, Power BI, Python and Databricks fit together to turn every transaction into a business decision.
End-to-end, in one line
Databricks replaces “Database stores” + “Python cleans” when the data outgrows a single database — same shape, bigger scale.
Try the four tools, hands-on
Run a SQL query, drive a Power BI dashboard, follow a Python pipeline, and scale a Databricks cluster — all against the same UK-retailer scenario, all in your browser, no signup.
Drop into the weekly Q&A
Thursdays at 7pm UK · 30 min · Microsoft Teams · free. Bring anything that's holding you back from starting — group format, no agenda.
How the roles fit together, end-to-end
One person rarely does all of this. Most data teams split the work across roles that hand off as the request travels from a business question to a decision-ready insight. Here's the typical flow.
Roles overlap in real teams — at smaller companies one person may wear two hats; at larger ones each box is a department. The flow is the same.
Data terminology, defined
Migration, ingestion, transformation, validation, governance, cataloguing… every data role uses these words daily. We've collected them — categorised, plain-English — on a dedicated page you can bookmark.
Open the data glossary →
Still not sure which track? That's what the assessment is for.
Ten short questions about your background, hours per week, and what kind of work appeals to you. Comes back with a recommended track, starting level, and a realistic time plan. No signup, no email capture — just a plan.