HowTo100M

Access: research-only. Video-text dataset

dataset multimodal-data open-data research video

Resource Metadata

Category: Multimodal Data
Provider: huggingface.co
Type: dataset
Level: unknown
Topic: Multimodal Data
Track: n/a
Section: Open Data Directory
Format: n/a
Status: publishable
Commercial: unknown
Featured: no
Fast start: no
Sequence: n/a
Priority: n/a
Primary source: website_existing
Sources: website_existing
ID: 93d812141a3809cd

Related Resources

Similar items by topic, tags, and provider (metadata-only).

dataset laion.ai

Re-LAION-5B

laion.ai

Still an index of URLs/alt-text; reconstructed images have separate rights considerations.

dataset mmmu-benchmark.github.io

MMMU

mmmu-benchmark.github.io

Access: open. Multimodal reasoning benchmark

dataset docvqa.org

DocVQA

docvqa.org

Access: open (login). Document VQA datasets

dataset Build physionet.org

PhysioNet

physionet.org

Canonical source for ECG, ICU, waveform, and related biomedical datasets.

dataset Mozilla

Common Voice

Mozilla

Large multilingual speech dataset project for ASR, speech research, and voice tooling.

dataset re3data.org

re3data

re3data.org

Access: open. Registry of research data repositories

dataset microsoft.github.io

MS MARCO

microsoft.github.io

Access: research-only. Large IR/QA dataset

dataset image-net.org

ImageNet

image-net.org

Access: research-only. Large image classification

dataset figshare.com

Figshare

figshare.com

Access: open. General research data repository