Similar items by topic, tags, and provider (metadata-only).
repo | GitHub
Excellent multilingual image-text corpus from Wikipedia/Wikimedia.
docs | Foundation | OpenCV
Still the fastest practical path to vision preprocessing, classical CV, and real-world image pipelines.
dataset | Build | Microsoft / Google
High-value public corpora for detection, captions, grounding, and multimodal experimentation.
dataset | Build | Mozilla / OpenSLR
Best public speech base for ASR experimentation and evaluation.
resource | Foundation | PyImageSearch
Project-driven vision learning for detection, OCR, face pipelines, and deployment patterns.
dataset | image-net.org
Still useful for vision classification baselines.
video | OpenCV
Video supplement for image processing and computer vision workflows.
dataset | storage.googleapis.com
Large supervised vision dataset with labels, boxes, masks, relations, and narratives.
dataset | commonvoice.mozilla.org
Best open multilingual voice dataset.
dataset | openslr.org
Classic open English speech corpus.
docs | Build | OpenCV
Reference docs for computer vision building blocks, APIs, and modules.
dataset | openslr.org
Open TTS-oriented speech corpus.