The Pile: a large, diverse open-source dataset for language modeling.
base_weights
pile.eleuther.ai
dataset
unknown
general
n/a
publishable
conditional
no
training_data_stack
70b03f23a74a71e3
Tags: dataset