AI Power Progress iA

Open Data Directory

Datasets for training, evaluation, and retrieval-augmented generation (RAG), with visibility into data health and fallback status.

AI Data Stack: training → embeddings → LoRA → RAG

Use these shortcuts to jump to the most common open datasets and resources for building local models and grounded assistants.

Foundation / pretraining

Large corpora and knowledge bases used to train foundation models. Always verify license and dataset terms for your use case.

Instruction / LoRA fine-tuning

High-signal instruction datasets and practical fine-tuning references (LoRA / QLoRA / PEFT).
