What Is Data, Really?
Data Is Everywhere in Your Project
Every project you could build — a habit tracker, a portfolio site, a data dashboard — deals with data. But "data" is one of those words that sounds more complicated than it actually is.
Data is just information that's organized so it can be used. That's it. Your name is data. Your daily step count is data. A list of your habits is data. The temperature outside is data. A photo on your phone is data. Whenever information is stored, tracked, or processed — that's data.
For your project, understanding data means answering four questions: What information does my project need? Where does that information come from? How should I organize it? And how do I keep it accurate? This module answers all four.
How This Applies to Your Project — Track 1
Your productivity tool runs on data. Every habit you track, every checkbox you check, every streak you count is data. In this lesson, you'll learn what data actually is and where yours comes from. For your project, the main data source is user input: the habits users add and the completions they log each day.
How This Applies to Your Project — Track 2
Your portfolio or recommendation site is built on content data. Every item in your gallery, every book recommendation, every resource listing — that's structured data. You need to think about how your content is organized: titles, descriptions, categories, images. This lesson teaches you why that structure matters.
How This Applies to Your Project — Track 3
Data is the core of your dashboard. Your entire project exists to collect, organize, and visualize data. Whether it's spending amounts, sports statistics, or environmental measurements — this lesson teaches you how to think about where that data comes from and what makes it useful.
Quick Check: Which of these is data?
Answer: D. All of these are data. Any piece of information that can be stored, tracked, or processed counts. Your project will work with data whether or not you think of it that way.
Where Data Comes From: Four Main Sources
Data in your project will come from one or more of these sources:
1. User Input
The person using your project types, clicks, or selects something. A habit tracker gets data when the user checks off a habit. A portfolio site gets data when the creator adds their projects. This is the most common data source for the projects in this course.
2. APIs (Application Programming Interfaces)
APIs are services that provide data from external sources. A weather API gives you current temperature data. A sports API gives you game scores. A news API gives you headlines. Think of an API as a restaurant kitchen: you send in an order (a request) and get back a dish (data). You don't need to know how the kitchen works.
If "API" sounds intimidating, don't worry: you don't need to understand how APIs work right now. When it's time to use one in your project, you can ask AI: "How do I get weather data from an API? Walk me through it step by step." AI can guide you through the technical details.
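To make the kitchen analogy a little more concrete, here is a minimal sketch of the "dish" an API might hand back. The response text and its field names (`city`, `temperature_c`, `conditions`) are made up for illustration and do not come from any real weather service. Many APIs return data as JSON text like this, which Python's built-in `json` module can turn into a dictionary.

```python
import json

# Hypothetical response text from an imaginary weather API.
# The field names are assumptions, not a real service's format.
response_text = '{"city": "Austin", "temperature_c": 31.5, "conditions": "sunny"}'

# json.loads turns the JSON text into a Python dictionary.
weather = json.loads(response_text)

# Once parsed, each piece of data can be read by its label.
city = weather["city"]
temperature_c = weather["temperature_c"]
summary = f"{city}: {temperature_c} degrees C"
print(summary)
```

A real API would involve sending a request over the internet first. This sketch only shows what you might do with the data once it arrives.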
3. Public Datasets
These are pre-existing collections of data that anyone can use, such as government data, research datasets, and open-source data collections. Track 3 data dashboard projects often use these as their starting point.
4. Generated Data
Data your project creates by processing other data. If your tracker calculates a streak count from daily check-ins, that streak number is generated data. If your dashboard computes averages from raw numbers, those averages are generated.
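As a rough illustration of generated data, the sketch below computes a streak count from a list of check-in dates. The function name `current_streak` and the sample dates are hypothetical, and the rule used here (count consecutive days ending today) is only one possible definition of a streak.

```python
from datetime import date, timedelta

def current_streak(check_ins, today):
    """Count consecutive checked-in days ending on `today`."""
    done = set(check_ins)
    streak = 0
    day = today
    # Walk backwards one day at a time until a day was missed.
    while day in done:
        streak += 1
        day -= timedelta(days=1)
    return streak

# Raw data from user input: the days the habit was checked off.
check_ins = [date(2024, 3, 1), date(2024, 3, 3), date(2024, 3, 4), date(2024, 3, 5)]

# Generated data: the streak is computed, never typed in by the user.
streak = current_streak(check_ins, today=date(2024, 3, 5))
print(streak)  # 3
```

Because the streak is computed from the check-ins, it usually does not need to be stored separately. It can be recalculated whenever the check-ins change.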
Four Data Sources
User Input (keyboard/form), APIs (cloud/server), Public Datasets (document/table), and Generated Data (calculator/formula) all flow into your project.
Structured vs. Unstructured Data
Not all data is created equal. Understanding the difference between structured and unstructured data helps you make smart decisions about how to organize your project.
Structured data fits neatly into rows and columns, like a spreadsheet. Each piece of information has a clear label and a consistent format. Examples: a table of habits (name, status, date), a list of expenses (amount, category, date), a roster of team members (name, role, email).
Unstructured data doesn't fit neatly into a table. It's freeform. Examples: a journal entry, an image, a block of text, a voice recording. It contains information, but that information isn't organized into consistent fields.
For your projects in this course, you'll mostly work with structured data. It's easier to store, search, sort, and display. When you have unstructured data (like a project description or a journal entry), you'll store it as a text field within a structured format.
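Here is one way that idea could look in code: a small sketch in which each habit is a record with the same labeled fields, and the freeform part lives in a `notes` text field. The habit names, dates, and the `notes` field name are illustrative assumptions, not a required format.

```python
# Structured data: every habit record has the same labeled fields.
# The "notes" field holds unstructured, freeform text inside that structure.
habits = [
    {
        "name": "Drink water",
        "status": "done",
        "date": "2024-03-05",
        "notes": "Felt great after the morning run.",
    },
    {
        "name": "Read 20 pages",
        "status": "skipped",
        "date": "2024-03-05",
        "notes": "",
    },
]

# Consistent fields make searching and sorting straightforward.
completed = [habit["name"] for habit in habits if habit["status"] == "done"]
by_name = sorted(habits, key=lambda habit: habit["name"])
print(completed)
```

Searching inside the `notes` text would take more work, because its contents follow no fixed pattern.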
Answer: Structured. Each habit has consistent fields (name, status, streak) that fit perfectly into rows and columns. This is exactly the kind of data that works well in a spreadsheet or simple database.
Why Data Quality Determines Output Quality
There's a principle in computing that's been true since the earliest computers: garbage in, garbage out. If your data is messy, inconsistent, or wrong, everything that depends on that data will be messy, inconsistent, or wrong.
Here's what bad data looks like in practice:
- A spending tracker where some amounts include the dollar sign and some don't ("$15.00" vs. "15" vs. "fifteen dollars")
- A habit list where the same habit is entered three different ways ("Drink water" vs. "drink water" vs. "Water")
- A contacts list with missing emails, duplicate entries, and phone numbers in different formats
None of these is hard to fix individually. But when your project has hundreds of entries, small inconsistencies become big problems: totals and charts come out wrong, searches miss entries, and displays look messy.
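A short sketch shows how quickly these inconsistencies cause trouble. It reuses the sample entries from the list above. The helper name `try_parse` is hypothetical, and a real project might handle bad input differently.

```python
from collections import Counter

# The same habit entered three different ways.
entries = ["Drink water", "drink water", "Water", "Drink water"]

# Counting treats each spelling as a separate habit.
raw_counts = Counter(entries)
print(len(raw_counts))  # 3 "different" habits instead of 1

# Amounts typed in three different formats.
amounts = ["$15.00", "15", "fifteen dollars"]

def try_parse(text):
    """Return the amount as a number, or None if it cannot be read."""
    try:
        return float(text)
    except ValueError:
        return None

# Only the plain "15" can be read as a number; the others are lost.
parsed = [try_parse(amount) for amount in amounts]
print(parsed)  # [None, 15.0, None]
```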
The fix is prevention, not cleanup. Design your data structure so that data goes in clean from the start. That means using dropdowns instead of text fields where possible, consistent formats for dates and numbers, and validation that catches mistakes before they're saved. You'll learn how in Lesson 4.3.
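To give a feel for what validation might look like, here is a minimal sketch that checks an expense before it is saved. The function name `clean_expense`, the category list, and the specific rules are assumptions made for illustration. Your project's rules will depend on your own data.

```python
import math
from datetime import date

# Hypothetical dropdown options: the user picks one instead of typing freely.
ALLOWED_CATEGORIES = ("food", "transport", "fun")

def clean_expense(amount_text, category, date_text):
    """Return a clean expense record, or raise ValueError if input is bad."""
    # Accept "$15.00" or "1,200" by removing the symbol and separators.
    cleaned = amount_text.strip().lstrip("$").replace(",", "")
    amount = float(cleaned)  # raises ValueError for text like "fifteen dollars"
    if not math.isfinite(amount) or amount <= 0:
        raise ValueError("amount must be a positive number")
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    # Expects an ISO date such as 2024-03-05; other formats raise ValueError.
    day = date.fromisoformat(date_text)
    return {"amount": round(amount, 2), "category": category, "date": day.isoformat()}

record = clean_expense("$15.00", "food", "2024-03-05")
print(record)

# A bad entry is caught before it is saved.
try:
    clean_expense("fifteen dollars", "food", "2024-03-05")
except ValueError as error:
    print("Rejected:", error)
```

Every record that passes has the same shape: a number for the amount, a known category, and a date in one format.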
Key Concepts
- Data is information organized so it can be used. Every project deals with data.
- Four data sources: user input (most common), APIs (external services), public datasets, and generated data (computed from other data).
- Structured data fits into rows and columns (like a spreadsheet). Unstructured data is freeform (like text or images). Your projects will primarily use structured data.
- Data quality directly determines project quality. "Garbage in, garbage out."
- Prevention beats cleanup: design your data to go in clean from the start.
Try It: Map Your Project's Data
Identify every piece of data your project will touch.
- List every type of information your project needs (e.g., habit names, check-in dates, streak counts, user preferences).
- For each piece of data, identify the source: user input, API, public dataset, or generated.
- Mark each as structured or unstructured.
- Identify 2–3 potential data quality issues (e.g., "users might enter the same habit name differently") and brainstorm how to prevent them.
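If it helps, your map can itself be written down as structured data. The sketch below is one possible layout with made-up entries for a habit tracker. The field names (`data`, `source`, `structured`) are assumptions, and a spreadsheet or a sheet of paper works just as well.

```python
# The four sources named in this lesson.
VALID_SOURCES = {"user input", "API", "public dataset", "generated"}

# Hypothetical inventory for a habit tracker.
inventory = [
    {"data": "habit name", "source": "user input", "structured": True},
    {"data": "check-in date", "source": "user input", "structured": True},
    {"data": "streak count", "source": "generated", "structured": True},
    {"data": "journal note", "source": "user input", "structured": False},
]

# Check that every entry uses one of the four sources.
unknown = [item["data"] for item in inventory if item["source"] not in VALID_SOURCES]

# Group the data by where it comes from.
by_source = {}
for item in inventory:
    by_source.setdefault(item["source"], []).append(item["data"])

print(by_source)
```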
AI Collaboration Moment
Now use AI to expand and validate your data plan.
Open an AI tool and use this prompt:
What to do with the response:
- Compare AI's suggestions to your original list. Did it catch anything you missed?
- Add any useful data fields to your plan
- Note the potential problems it identified — you'll address these in Lessons 4.3 and 4.4
- Save this conversation — you'll reference it in your Data Planning Template (PDF)
Find this and all other downloadable resources on the Dashboard Resources page.
Check Your Understanding
1. A weather dashboard pulls temperature data from a weather service. What type of data source is this?
Explanation: Weather services provide data through APIs — your project sends a request and gets back current weather data. You don't create or store this data; you fetch it from an external service.
2. A habit tracker calculates a "weekly completion percentage" from daily check-ins. What type of data is the percentage?
Explanation: The percentage is computed from other data (daily check-ins). It doesn't come from the user directly or from an external source — your project generates it by processing existing data.
3. Why is "garbage in, garbage out" relevant for your project?
Explanation: If your data has inconsistencies (different formats, duplicates, errors), your charts, calculations, and displays will all be affected. Clean data in = reliable results out.
4. Which approach best prevents data quality issues?
Explanation: Prevention beats cleanup. Using dropdowns instead of free text, enforcing date formats, and validating input before saving means data enters your system clean, so you don't have to find and fix messy entries later.
Reflect & Write
Write 2–3 sentences: What data does your project deal with? Were there data sources or types you hadn't thought about until this lesson?
Project Checkpoint
Create your data inventory:
- List every piece of data your project uses
- Identify the source of each (user input, API, public dataset, generated)
- Mark each as structured or unstructured
- Flag 2–3 potential data quality risks
This inventory feeds directly into your data model in Lesson 4.2.
Level Up: Coming Next
Lesson 4.2 — Organizing Data: Models and Schemas. You know what data your project needs. Now it's time to organize it — tables, fields, and relationships. Plus an interactive Data Architect activity where you build your data model.