AI Power Progress iA
Open-first learning

Start Here

No single site can turn someone into a “leading contributor in everything.” This page is a curated learning graph for becoming a strong contributor across software, systems, and AI — using open and free resources with official links and licensing notes.

Learning Ladder

Browse in order (Start Here → Core CS → Build Products → AI & ML → Contribute).

Open/Free Backbone

Anchor sources you can use across nearly every discipline. Prefer these when you need trustworthy, zero-cost foundations and references.

Tags: learning, foundation, beginner, learning_platform, math, science, computing, interactive, video

Khan Academy

Khan Academy

Math/science/CS foundations from zero.

License: free (see terms)

Tags: learning, foundation, beginner, textbook_library, chemistry, math, biology, text

LibreTexts

LibreTexts

Large open subject libraries with remixable content.

License: open (license varies by module)

Tags: learning, foundation, beginner, textbook_library, general, text

Open Textbook Library

University of Minnesota

Discover and adopt openly licensed textbooks by subject.

License: open (license varies by textbook)

Tags: learning, foundation, beginner, course_library, humanities, social_science, science, video, text

Open Yale Courses

Yale

Full lecture courses with transcripts, readings, and exams.

License: free (see terms)

Tags: learning, foundation, beginner, course_library, general, text, interactive

OpenLearn

The Open University

Short free courses and learning resources across many domains.

License: free (see terms)

Tags: learning, foundation, beginner, textbook_library, math, science, economics, text

OpenStax

OpenStax

High-quality, peer-reviewed textbooks with permissive licensing per title.

License: open textbooks (license varies by book)

Tags: learning, foundation, beginner, course_library, general, business, computer_science, text

Saylor Academy

Saylor

Self-paced courses (some credit pathways).

License: free (see terms)

Tags: learning, foundation, intermediate, course_library, math, computer_science, engineering, text, video

MIT OpenCourseWare

MIT

University-level depth with syllabi, problem sets, and readings.

License: open (see OCW license)

Tags: learning, foundation, intermediate, book_library, biomed, life_science, health, text

NCBI Bookshelf

NCBI

Biomedical and life-science reference books and documents.

License: free (license varies by book)

Start Here

Zero → basic literacy: learn to code, use the terminal, and develop real workflow habits.

Tags: learning, foundation, beginner, course, computer_science, video, text

CS50x: Introduction to Computer Science

Harvard (CS50)

First serious CS experience with real assignments and mental models.

License: free (see terms)

Prereqs: Khan Academy

Tags: learning, foundation, beginner, tutorial, git, github, interactive

GitHub Skills

GitHub

Hands-on Git/GitHub workflows with guided repositories.

License: free (see docs)

Tags: learning, foundation, beginner, learning_platform, math, science, computing, interactive, video

Khan Academy

Khan Academy

Math/science/CS foundations from zero.

License: free (see terms)

Tags: learning, foundation, beginner, course, linux, shell, text, interactive

Linux Journey

Linux Journey

Linux fundamentals for dev, ops, and cloud.

License: free (see site)

Tags: learning, foundation, beginner, tutorial, databases, sql, interactive

SQLBolt

SQLBolt

Fast interactive SQL basics for backend work.

License: free (see site)

Tags: learning, foundation, beginner, course, developer_tooling, git, shell, video, text

The Missing Semester of Your CS Education

MIT

Terminal, editors, Git, debugging, profiling, and tooling fluency.

License: free (see site)

Tags: learning, foundation, beginner, course, web, javascript, python, interactive

freeCodeCamp

freeCodeCamp

Daily coding reps + project-based certifications.

License: free (see terms)

Core CS

Depth in algorithms, systems, and the mental models that make you effective beyond any framework.

Tags: learning, foundation, beginner, course, computer_science, video, text

CS50x: Introduction to Computer Science

Harvard (CS50)

First serious CS experience with real assignments and mental models.

License: free (see terms)

Prereqs: Khan Academy

Tags: learning, foundation, intermediate, course, discrete_math, computer_science, text, video

MIT 6.042J: Mathematics for Computer Science

MIT OCW

Discrete math, proofs, probability, graphs, and recurrences.

License: open (see OCW license)

Tags: learning, foundation, intermediate, course_library, math, computer_science, engineering, text, video

MIT OpenCourseWare

MIT

University-level depth with syllabi, problem sets, and readings.

License: open (see OCW license)

Tags: learning, foundation, intermediate, course, programming, computer_science, text

UC Berkeley CS 61A

UC Berkeley

A rigorous intro course that builds strong programming fundamentals.

License: free (see site)

Tags: learning, build, intermediate, course, computer_architecture, systems, text, video

Nand2Tetris

Nand2Tetris

Build a computer from first principles: logic gates → OS.

License: free (some paid options)

Tags: learning, build, intermediate, curriculum, computer_science, text

OSSU Computer Science

OSSU

An open self-taught CS spine with strong course sequencing.

License: free/open (see repo)

Prereqs: CS50x: Introduction to Computer Science

Tags: learning, build, intermediate, book, operating_systems, systems, text

OSTEP (Operating Systems: Three Easy Pieces)

UW–Madison

Virtualization, concurrency, and persistence with a systems mindset.

License: free (see site)

Tags: learning, build, intermediate, book, computer_science, abstraction, text

SICP (Structure and Interpretation of Computer Programs)

MIT Press

Mental models for abstraction, interpreters, and computation.

License: free to read online (see site)

Tags: learning, contributor, intermediate, guide, computer_science, systems, text

Teach Yourself Computer Science

teachyourselfcs.com

Gap-filling map after bootcamps/self-teaching.

License: free (see site)

Build Products

Ship real software: web, APIs, databases, deployments, and distributed systems basics.

Tags: learning, foundation, beginner, course, linux, shell, text, interactive

Linux Journey

Linux Journey

Linux fundamentals for dev, ops, and cloud.

License: free (see site)

Tags: learning, foundation, beginner, tutorial, databases, sql, interactive

SQLBolt

SQLBolt

Fast interactive SQL basics for backend work.

License: free (see site)

Tags: learning, foundation, beginner, course, developer_tooling, git, shell, video, text

The Missing Semester of Your CS Education

MIT

Terminal, editors, Git, debugging, profiling, and tooling fluency.

License: free (see site)

Tags: learning, foundation, beginner, course, web, javascript, python, interactive

freeCodeCamp

freeCodeCamp

Daily coding reps + project-based certifications.

License: free (see terms)

Tags: learning, build, beginner, curriculum, web, backend, project_based

The Odin Project

The Odin Project

Project-focused full-stack web development curriculum.

License: free/open (see repo)

Tags: learning, build, beginner, roadmap, career, skills, interactive

roadmap.sh

roadmap.sh

Role-based roadmaps (frontend, backend, DevOps, AI, etc).

License: free (see site)

Tags: learning, build, intermediate, course, web, typescript, node, text

Full Stack Open

University of Helsinki

Modern React/Node/TypeScript/full-stack path.

License: free (see site)

Tags: learning, build, intermediate, tutorial, devops, distributed_systems, interactive

Kubernetes Basics

Kubernetes

First deployment mental model once you're shipping services.

License: open (see site)

Tags: learning, build, intermediate, course, ml_systems, mlops, text

Made With ML

Made With ML

Design, develop, deploy, and iterate ML systems.

License: free (see site)

Tags: learning, contributor, advanced, course, mlops, product, video, text

Full Stack Deep Learning

Full Stack Deep Learning

Building AI-powered products across the full lifecycle.

License: free (see site)

AI & ML

Practical deep learning + ML systems engineering: build, deploy, evaluate, and iterate.

Tags: learning, foundation, intermediate, course_library, math, computer_science, engineering, text, video

MIT OpenCourseWare

MIT

University-level depth with syllabi, problem sets, and readings.

License: open (see OCW license)

Tags: learning, build, beginner, course, llms, agents, nlp, text

Hugging Face Learn

Hugging Face

LLMs + tooling: transformers, agents, evaluation, and more.

License: free (see site)

Tags: learning, build, beginner, docs, deep_learning, text

PyTorch Tutorials

PyTorch

Official beginner-to-advanced training in the PyTorch ecosystem.

License: open (see site)

Tags: learning, build, beginner, roadmap, career, skills, interactive

roadmap.sh

roadmap.sh

Role-based roadmaps (frontend, backend, DevOps, AI, etc).

License: free (see site)

Tags: learning, build, intermediate, book, deep_learning, math, text, interactive

Dive into Deep Learning

d2l.ai

Interactive deep learning book with code + math.

License: open (see site)

Tags: learning, build, intermediate, course, ml_systems, mlops, text

Made With ML

Made With ML

Design, develop, deploy, and iterate ML systems.

License: free (see site)

Tags: learning, build, intermediate, course, deep_learning, video, text

fast.ai

fast.ai

Practical deep learning for coders with a bias toward shipping.

License: free (see site)

Tags: learning, contributor, advanced, book, deep_learning, text

Deep Learning Book

deeplearningbook.org

Classic reference for deep learning fundamentals.

License: free (see site)

Tags: learning, contributor, advanced, course, mlops, product, video, text

Full Stack Deep Learning

Full Stack Deep Learning

Building AI-powered products across the full lifecycle.

License: free (see site)

Tags: learning, contributor, advanced, book, ml_systems, systems, text

MLSysBook

mlsysbook.ai

Machine learning systems engineering depth.

License: free (see site)

Open Datasets

Open data by modality (text/code/image/audio) with licensing notes. Always verify dataset terms for your use case.

Tags: dataset, foundation, beginner, catalog, datasets

Hugging Face Datasets Hub

Hugging Face

Discover, load, and compare open datasets via dataset cards.

License: varies by dataset

Tags: learning, build, beginner, course, llms, agents, nlp, text

Hugging Face Learn

Hugging Face

LLMs + tooling: transformers, agents, evaluation, and more.

License: free (see site)

Tags: dataset, build, intermediate, dataset, computer_vision

COCO

cocodataset.org

Major object detection/segmentation/captioning benchmark dataset.

License: see terms

Tags: dataset, build, intermediate, dataset, speech

LibriSpeech

OpenSLR

~1,000 hours of read English speech.

License: see terms

Tags: dataset, build, intermediate, dataset, speech, tts

LibriTTS

OpenSLR

Multi-speaker TTS corpus derived from LibriSpeech/LibriVox.

License: see terms

Tags: dataset, build, intermediate, dataset, speech

Mozilla Common Voice

Mozilla

Crowdsourced speech dataset/platform.

License: open (see dataset page)

Tags: dataset, build, intermediate, dataset, computer_vision

Open Images V7

Google

Computer vision dataset with boxes, segmentation, and relationships.

License: see terms

Tags: dataset, build, intermediate, corpus, llm_data

Project Gutenberg

Project Gutenberg

Public-domain books (verify country-specific copyright rules).

License: public domain + project terms (read terms)

Tags: dataset, contributor, advanced, corpus, llm_data

Common Crawl

Common Crawl

Broad web crawl data; requires careful filtering + dedup.

License: web-derived (varies) — read terms

Tags: dataset, contributor, advanced, corpus, llm_data

Dolma

Allen Institute for AI (AI2)

Large open corpus spanning web, academic works, code, books, and encyclopedic material.

License: mixed/open (read terms)

Tags: dataset, contributor, advanced, corpus, llm_data

FineWeb

Hugging Face

Cleaned and deduplicated English web data derived from Common Crawl.

License: web-derived (varies) — read dataset card

Tags: dataset, contributor, advanced, corpus, llm_data

FineWeb-2

Hugging Face

Multilingual extension of FineWeb.

License: web-derived (varies) — read dataset card

Tags: dataset, contributor, advanced, corpus, llm_data

FineWeb-Edu

Hugging Face

Educational subset of FineWeb.

License: web-derived (varies) — read dataset card

Tags: dataset, contributor, advanced, index, multimodal

LAION-5B

LAION

Large-scale index of image-text pairs (URLs + alt-text) for training-data discovery and reconstruction.

License: index of URLs + alt-text (read terms)

Tags: dataset, contributor, advanced, dataset, speech

Multilingual LibriSpeech (MLS)

OpenSLR

Large multilingual speech corpus spanning multiple languages.

License: see terms

Tags: dataset, contributor, advanced, catalog, research

OpenAlex

OpenAlex

Research discovery + metadata for filtered downloads and linking.

License: open (see site)

Tags: dataset, contributor, advanced, corpus, biomed

PMC Open Access Subset

NCBI

Reuse-permitted subset of PubMed Central open access content.

License: subset only (reuse-permitted) — verify

Tags: dataset, contributor, advanced, index, multimodal

Re-LAION-5B

LAION

Reproducible iteration of LAION-style indexes with transparent rebuild steps.

License: Apache-2.0 (see source)

Tags: dataset, contributor, advanced, corpus, llm_data

RedPajama-Data-V2

Together Computer

Open LLM dataset built from Common Crawl snapshots with quality signals and dedup metadata.

License: web-derived (varies) — read dataset card

Tags: dataset, contributor, advanced, corpus, llm_data

The Pile

EleutherAI

Large, diverse open-source language-modeling dataset.

License: mixed (read terms)

Tags: dataset, contributor, advanced, corpus, code

The Stack

BigCode

Permissively licensed source code dataset across many languages.

License: mixed (read terms + acknowledgments)

Tags: dataset, contributor, advanced, corpus, code

The Stack v2

BigCode

Large code dataset with billions of files across programming and markup languages.

License: mixed (read terms + acknowledgments)

Tags: dataset, contributor, advanced, corpus, llm_data

Wikimedia Dumps

Wikimedia

Monthly dumps of Wikipedia and other Wikimedia projects.

License: CC BY-SA (varies by project/language)

Tags: dataset, contributor, advanced, corpus, research

arXiv Bulk Data

arXiv

Research corpora + metadata pipelines; use via official bulk data guidance.

License: varies (see arXiv terms)

Contribute

Turn learning into public contribution: first PRs, issue selection, and PR collaboration workflows.

Tags: learning, foundation, beginner, tutorial, open_source, git, text

First Contributions

firstcontributions

Make your first PR with a safe, guided workflow.

License: free (see repo)

Tags: learning, foundation, beginner, tutorial, git, github, interactive

GitHub Skills

GitHub

Hands-on Git/GitHub workflows with guided repositories.

License: free (see docs)

Tags: learning, foundation, beginner, tool, open_source, web

Good First Issue

goodfirstissue.dev

Find beginner-friendly issues by language and repository.

License: free (see site)

Tags: learning, foundation, beginner, guide, open_source, text

Open Source Guides

GitHub

Contributor/maintainer onboarding, best practices, and community norms.

License: free (see site)

Tags: learning, foundation, beginner, course, developer_tooling, git, shell, video, text

The Missing Semester of Your CS Education

MIT

Terminal, editors, Git, debugging, profiling, and tooling fluency.

License: free (see site)

Tags: learning, build, beginner, docs, github, code_review, text

GitHub Docs: Pull Requests

GitHub

Proposing, reviewing, and collaborating on changes.

License: free (see docs)

Project Paths

Build tracks on this site: guided projects, interactive course mode, and weekly plans.

Tags: site, foundation, beginner, page, planning, text

7-Day Plan (on this site)

AI Power Progress iA

A short, structured plan to start shipping progress immediately.

License: site feature

Tags: site, build, beginner, page, learning, interactive

Interactive Course (on this site)

AI Power Progress iA

Plan staged learning tracks and import your workbook package.

License: site feature

Tags: site, build, beginner, page, projects, interactive

Project Builder (on this site)

AI Power Progress iA

Generate project ideas and build tracks from your catalog and goals.

License: site feature

Research & Frontier

Papers, benchmarks, and reproducibility references for staying current without hype.

Tags: learning, contributor, intermediate, catalog, research, benchmarks, web

Papers with Code

paperswithcode.com

Find papers + leaderboards + code implementations by task.

License: free (see site)

Tags: learning, contributor, advanced, benchmark, benchmarks, systems, web

MLPerf

MLCommons

Systems and model benchmarking reference points.

License: free (see site)

Tags: learning, contributor, advanced, book, ml_systems, systems, text

MLSysBook

mlsysbook.ai

Machine learning systems engineering depth.

License: free (see site)

Tags: dataset, contributor, advanced, catalog, research

OpenAlex

OpenAlex

Research discovery + metadata for filtered downloads and linking.

License: open (see site)

Tags: dataset, contributor, advanced, corpus, research

arXiv Bulk Data

arXiv

Research corpora + metadata pipelines; use via official bulk data guidance.

License: varies (see arXiv terms)

Licensing Notes

Data usage

  • Always verify dataset license/terms for your use case (commercial vs research, redistribution, attribution).
  • LAION datasets are indexes of URLs + alt-text, not hosted image files; reconstructing images has separate rights implications.
  • PMC Open Access Subset is only the reuse-permitted part of PMC; not all PMC content is reusable.
  • Stack Exchange dumps have explicit restrictions for LLM training access; verify current terms before using.

Code data

  • The Stack / The Stack v2 have dataset-specific terms and acknowledgments; follow them, and link to them prominently, in any use.
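
Sketch: checking recorded terms against an intended use

The notes above come down to comparing what a dataset's terms allow (commercial vs research, redistribution, attribution) with what you plan to do. A minimal Python sketch of that check follows. It assumes a person has already read the current terms and recorded them by hand; the `DatasetTerms` and `IntendedUse` types, the `blockers` helper, and the `example-corpus` values are all hypothetical illustrations, not part of this site or of any dataset's real license.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetTerms:
    """Terms recorded by hand after reading a dataset's license (hypothetical schema)."""
    name: str
    commercial_use: bool        # do the terms permit commercial use?
    redistribution: bool        # do the terms permit redistribution?
    attribution_required: bool  # do the terms require attribution?
    verified: bool              # has someone read the current terms?


@dataclass(frozen=True)
class IntendedUse:
    commercial: bool
    redistributes: bool
    gives_attribution: bool


def blockers(terms: DatasetTerms, use: IntendedUse) -> list[str]:
    """Return known reasons the use is not cleared; an empty list means none were found."""
    problems = []
    if not terms.verified:
        problems.append("terms not verified")
    if use.commercial and not terms.commercial_use:
        problems.append("commercial use not permitted")
    if use.redistributes and not terms.redistribution:
        problems.append("redistribution not permitted")
    if terms.attribution_required and not use.gives_attribution:
        problems.append("attribution required")
    return problems


# Made-up values for illustration only, not the real terms of any dataset.
example = DatasetTerms(
    "example-corpus",
    commercial_use=False,
    redistribution=True,
    attribution_required=True,
    verified=True,
)
research = IntendedUse(commercial=False, redistributes=False, gives_attribution=True)
product = IntendedUse(commercial=True, redistributes=False, gives_attribution=False)

print(blockers(example, research))  # []
print(blockers(example, product))   # ['commercial use not permitted', 'attribution required']
```

An empty list only means no blocker was recorded; it does not replace reading the dataset's own terms, which may carry conditions this small schema does not model.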