Similar items by topic, tags, and provider (metadata-only).
docs | Build | Qdrant
Simple, strong choice for local and hybrid vector search systems.
video | Hugging Face
Modern LLM engineering, datasets, transformers, and community tutorials.
docs | Build | Qdrant
Quick path to a working semantic search stack with the Qdrant API and local deployment.
dataset | foundation | MIT
Excellent lecture notes, exams, and videos across advanced technical topics.
docs | Zero | Ollama
Fastest path to running modern local models on a workstation.
repo | GitHub
Large collection of NLP tasks with human-readable instructions.
repo | GitHub
One of the best open instruction mixtures; includes FLAN, P3, Super-Natural Instructions, and more.
dataset | Hugging Face
Best multilingual extension of the FineWeb pipeline; very broad language coverage.
dataset | Hugging Face
Huge cleaned English web corpus; best raw breadth for LLM pretraining.
dataset | Hugging Face
Synthetic textbook/blog/WikiHow-style corpus that helps tutor-like explanations.
dataset | Hugging Face
Best starting corpus with cleaner licensing: 8 TB of public-domain and openly licensed text spanning books, papers, code, encyclopedias, educational materials, and transcripts.
docs | Advanced | Hugging Face
Core for LoRA and efficient adaptation on local or limited hardware.