Similar items by topic, tags, and provider (metadata-only).
Dataset: huggingface.co
Access: research-only. Video-text dataset.
Dataset: mmmu-benchmark.github.io
Access: open. Multimodal reasoning benchmark.
Dataset: docvqa.org
Access: open (login). Document VQA datasets.
Dataset: LAION
Web-scale image-text corpus for multimodal research; use with strong filtering and license review.
Dataset: laion.ai
Massive web-scale image-text corpus.
Dataset (Build): physionet.org
Canonical source for ECG, ICU, waveform, and related biomedical datasets.
Dataset (foundation): MIT
Excellent lecture notes, exams, and videos across advanced technical topics.
Dataset: Mozilla
Large multilingual speech dataset project for ASR, speech research, and voice tooling.
Dataset: re3data.org
Access: open. Registry of research data repositories.
Dataset: paperswithcode.com
Access: open. Dataset index across ML tasks.
Dataset: microsoft.github.io
Access: research-only. Large IR/QA dataset.
Dataset: mawi.wide.ad.jp
Access: research-only. Backbone traffic traces.