Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin: converts your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP: search across ML/AI best practices and skills
- Kapso: experimentation platform for building AI/ML software autonomously
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Huggingface Datasets Dataset Loading and Exploration
- Workflow:Apache Hudi Docker Demo Setup
- Workflow:Langgenius Dify RAG Pipeline Development
- Workflow:NVIDIA DALI Image Preprocessing Pipeline
- Workflow:ChenghaoMou Text dedup SimHash Deduplication
- Workflow:Kserve Kserve LLM Disaggregated Serving
- Workflow:Haotian liu LLaVA Two Stage Pretraining and Finetuning
- Workflow:MaterializeInc Materialize dbt Integration
- Workflow:VainF Torch Pruning Vision Transformer Pruning
- Workflow:Anthropics Anthropic sdk python Structured Output Extraction
Principles
- Principle:Eric mitchell Direct preference optimization Log Probability Extraction
- Principle:Microsoft BIPIA Black Box Defense Configuration
- Principle:Vllm project Vllm Constrained Sampling Configuration
- Principle:Heibaiying BigData Notes Storm Topology Deployment
- Principle:Kserve Kserve Storage Credentials
- Principle:Deepspeedai DeepSpeed Accelerator Abstraction
- Principle:Apache Flink Data Generation
- Principle:Lucidrains X transformers Aligned Model Evaluation
- Principle:Ollama Ollama ThinkingSupport
- Principle:Mlflow Mlflow Model Version Management
Implementations
- Implementation:Google deepmind Mujoco MJWarp Collision Primitive Core
- Implementation:Googleapis Python genai GenerateContentResponse Access
- Implementation:LLMBook zh LLMBook zh github io DPOTrainer Train
- Implementation:ArroyoSystems Arroyo Polling Http Connector
- Implementation:Ggml org Llama cpp Batch Header
- Implementation:NVIDIA DALI NVML Utilities
- Implementation:Hpcaitech ColossalAI DocumentLoader
- Implementation:Openai Openai agents python Handoff Helper
- Implementation:Hiyouga LLaMA Factory Ploting
- Implementation:Scikit learn Scikit learn PLSRegression
Heuristics
- Heuristic:NVIDIA DALI Warning Deprecated C API V1 Functions
- Heuristic:Isaac sim IsaacGymEnvs Factory Velocity Limits
- Heuristic:Obss Sahi Match Threshold Tuning
- Heuristic:Apache Beam Warning Deprecated Twister2 Runner
- Heuristic:Vespa engine Vespa RPM Zstd Compression Settings
- Heuristic:Wandb Weave Retry And Error Handling
- Heuristic:Kserve Kserve Multinode Replica Calculation
- Heuristic:Neuml Txtai Memory Streaming Optimization
- Heuristic:Kubeflow Kubeflow Issue Routing To Sub Repos
- Heuristic:Princeton nlp Tree of thought llm Functools Partial Model Binding
Environments
- Environment:Pytorch Serve Python PyTorch Runtime
- Environment:Ray project Ray CI Build Matrix Environment
- Environment:Heibaiying BigData Notes Java 8 Maven Environment
- Environment:Openai Openai agents python OpenAI API Credentials
- Environment:Risingwavelabs Risingwave Docker Deployment Environment
- Environment:Predibase Lorax Model Source Credentials
- Environment:Mistralai Client python GCP Deployment Environment
- Environment:Ggml org Llama cpp CMake Build Environment
- Environment:Scikit learn contrib Imbalanced learn Python Scikit learn
- Environment:Apache Flink Java Build Environment