AI Power Progress iA

Re-LAION-5B

Re-LAION-5B is still an index of image URLs and alt-text, not the images themselves; images reconstructed from those URLs carry separate rights considerations.

Tags: dataset, multimodal-data, open-data, research

Resource Metadata

Category: multimodal
Provider: laion.ai
Type: dataset
Level: unknown
Topic: Multimodal Data
Track: n/a
Section: Open Data Directory
Format: n/a
Status: publishable
Commercial: unknown
Featured: no
Fast start: no
Sequence: n/a
Priority: n/a
Primary source: training_data_stack
Sources: training_data_stack, website_existing
ID: 5e6590a1020bfa75

Related Resources

Similar items by topic, tags, and provider (metadata-only).

HowTo100M
dataset, huggingface.co
Access: research-only. Video-text dataset.

MMMU
dataset, mmmu-benchmark.github.io
Access: open. Multimodal reasoning benchmark.

DocVQA
dataset, docvqa.org
Access: open (login). Document VQA datasets.

LAION-5B
dataset, LAION
Web-scale image-text corpus for multimodal research; use with strong filtering and license review.

PhysioNet
dataset, Build, physionet.org
Canonical source for ECG, ICU, waveform, and related biomedical datasets.

Common Voice
dataset, Mozilla
Large multilingual speech dataset project for ASR, speech research, and voice tooling.

re3data
dataset, re3data.org
Access: open. Registry of research data repositories.

MS MARCO
dataset, microsoft.github.io
Access: research-only. Large IR/QA dataset.