MBPP

Small benchmark (Mostly Basic Python Problems: roughly 1,000 crowd-sourced, entry-level tasks) for Python program synthesis and instruction-following code generation.
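As a rough illustration of how a benchmark of this kind is typically scored, the sketch below checks a candidate solution against a list of assert statements. The task shown is made up for illustration and is not a record from the dataset; the field names `text` and `test_list` follow the layout of the published MBPP files, while `passes_all_tests` is a hypothetical helper, not part of any official tooling.

```python
def passes_all_tests(candidate_code: str, test_list: list[str]) -> bool:
    """Return True if the candidate code satisfies every assert statement."""
    namespace: dict = {}
    try:
        # Define the candidate function(s) in an isolated namespace.
        exec(candidate_code, namespace)
        # Each test is a plain assert statement run against that namespace.
        for test in test_list:
            exec(test, namespace)
    except Exception:
        # Syntax errors, runtime errors, and failed asserts all count as a fail.
        return False
    return True


# Hypothetical task written in the MBPP style (not taken from the dataset).
task = {
    "text": "Write a function to return the sum of squares of a list of integers.",
    "test_list": [
        "assert sum_of_squares([1, 2, 3]) == 14",
        "assert sum_of_squares([]) == 0",
        "assert sum_of_squares([-2]) == 4",
    ],
}

good = "def sum_of_squares(nums):\n    return sum(n * n for n in nums)\n"
bad = "def sum_of_squares(nums):\n    return sum(nums)\n"

print(passes_all_tests(good, task["test_list"]))
print(passes_all_tests(bad, task["test_list"]))
```

Because `exec` runs arbitrary code, model-generated solutions should only be evaluated this way inside a sandbox or throwaway container, with a timeout.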

Tags: benchmark, code, dataset, local-ai, mbpp, training-data

Resource Metadata

Category

Code

Provider

Google Research

Type

dataset

Level

unknown

Topic

Local AI / LLM Engineering / RAG

Track

Local AI / LLM Engineering / RAG

Section

Open data

Format

Dataset / benchmark

Status

publishable

Commercial

candidate

Featured

no

Fast start

no

Priority

B

Primary source

mega_open_hub

Sources

mega_open_hub

ID

47a467dcdc1853c7

Related Resources

Similar items by topic, tags, and provider (metadata-only).

dataset

OpenAI GSM8K

OpenAI

Widely used grade-school math reasoning benchmark for evaluation and small-scale instruction data work.

dataset

OpenML Docs

OpenML

Open ecosystem for datasets, tasks, models, and runs that helps standardize ML benchmarking.

dataset

FineWeb2

Hugging Face

Best multilingual extension of the FineWeb pipeline; very broad language coverage.

dataset

FineWeb

Hugging Face

Huge cleaned English web corpus; best raw breadth for LLM pretraining.

dataset

Cosmopedia

Hugging Face

Synthetic textbook/blog/WikiHow-style corpus that helps tutor-like explanations.

dataset

Common Pile v0.1

Hugging Face

Best starting corpus when cleaner licensing matters: 8 TB of public-domain and openly licensed text spanning books, papers, code, encyclopedias, educational materials, and transcripts.

dataset

xP3

Hugging Face

Crosslingual prompt pool across many languages and tasks.

dataset

Vision-Flan

Hugging Face

Good open visual instruction tuning layer after base vision-language pretraining.

dataset

UltraChat

Hugging Face

Good synthetic multi-turn chat augmentation.