Similar items by topic, tags, and provider (metadata-only).
Dataset: laion.ai
Still an index of URLs/alt-text; reconstructed images have separate rights considerations.
Dataset: huggingface.co
Access: research-only. Video-text dataset.
Dataset: docvqa.org
Access: open (login required). Document VQA datasets.
Dataset: physionet.org
Canonical source for ECG, ICU, waveform, and related biomedical datasets.
Dataset: openneuro.org
Best open hub for MRI, EEG, MEG, and iEEG data.
Dataset: MIT
Excellent lecture notes, exams, and videos across advanced technical topics.
Dataset: dumps.wikimedia.org
Strong encyclopedic backbone for general knowledge and factual style.
Dataset: UCI
Classic and modern ML datasets that are ideal for education, benchmarking, and tabular experiments.
Dataset: Hugging Face
Top open code corpus; huge language coverage.
Dataset: storage.googleapis.com
Large supervised vision dataset with labels, boxes, masks, relations, narratives.
Dataset: NASA
NASA metadata portal for open datasets spanning science, aeronautics, Earth, and exploration.
Dataset: openslr.org
Classic open English speech corpus.