Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin: converts your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP: search ML/AI best practices and skills
- Kapso: experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Haifengl Smile SQL Analytics Pipeline
- Workflow:Apache Flink File Sink Pipeline
- Workflow:Treeverse LakeFS Garbage Collection
- Workflow:Openai Whisper Word Level Timestamps
- Workflow:Marker Inc Korea AutoRAG Data Creation Pipeline
- Workflow:Cleanlab Cleanlab Token Classification Label Quality
- Workflow:Pyro ppl Pyro Bayesian Regression
- Workflow:Microsoft DeepSpeedExamples CIFAR10 Getting Started
- Workflow:Hiyouga LLaMA Factory DPO Preference Alignment
- Workflow:Google deepmind Dm control Control Suite RL Training
Principles
- Principle:Webdriverio Webdriverio Browser Navigation and Interaction
- Principle:Kornia Kornia Edge Detection
- Principle:Sail sg LongSpec VLLM Inference Client
- Principle:Scikit learn contrib Imbalanced learn Dataset Imbalancing
- Principle:Sail sg LongSpec Hydra Configuration
- Principle:Mlfoundations Open flamingo Data Quality Filtering
- Principle:FlowiseAI Flowise User Profile Management
- Principle:Mlc ai Web llm Extension Client Engine
- Principle:Volcengine Verl RLHF Data Preparation
- Principle:Apache Dolphinscheduler Server Lifecycle Management
Implementations
- Implementation:CrewAIInc CrewAI Serply News Search Tool
- Implementation:Elevenlabs Elevenlabs python GetPronunciationDictionaryMetadataResponse
- Implementation:Kubeflow Pipelines Pipeline Spec Protobuf
- Implementation:Sgl project Sglang Json Output Parsing
- Implementation:Microsoft Playwright Tracing Start
- Implementation:Apache Paimon FormatTableScan
- Implementation:Risingwavelabs Risingwave Binding
- Implementation:ArroyoSystems Arroyo Data Fetching
- Implementation:Kornia Kornia SSIM3D Loss
- Implementation:Sktime Pytorch forecasting GridUpdateCallback
Heuristics
- Heuristic:ContextualAI HALOs Batch Size Divisibility
- Heuristic:Helicone Helicone Provider URL Regex Priority
- Heuristic:Triton inference server Server Dynamic Batching Tuning
- Heuristic:Getgauge Taiko Element Actionability Checks
- Heuristic:Mit han lab Llm awq GPU Memory Management Patterns
- Heuristic:Vllm project Vllm KV Cache Block Size Selection
- Heuristic:Huggingface Alignment handbook Liger Kernel Memory
- Heuristic:Openclaw Openclaw Warning Suppression For Known Deprecations
- Heuristic:Hiyouga LLaMA Factory LoRA DDP Configuration
- Heuristic:Speechbrain Speechbrain Nonfinite Loss Handling
Environments
- Environment:Datajuicer Data juicer Ray Cluster Environment
- Environment:NVIDIA NeMo Curator Ray Cluster
- Environment:Microsoft DeepSpeedExamples SuperOffload Runtime
- Environment:Google research Deduplicate text datasets Rust Cargo Build Environment
- Environment:OpenHands OpenHands SaaS Server Environment
- Environment:Hiyouga LLaMA Factory FP8 Training Environment
- Environment:Evidentlyai Evidently Spark Engine Environment
- Environment:Sgl project Sglang GitHub Actions
- Environment:Lakeraai Pint benchmark Python 310 With Transformers
- Environment:Ggml org Ggml CUDA GPU Environment