Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin — converts your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP — search ML/AI best practices and skills
- Kapso — experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Duckdb Duckdb Code Generation Pipeline
- Workflow:Ucbepic Docetl Playground Interactive Development
- Workflow:Apache Airflow Kubernetes Deployment via Helm
- Workflow:Apache Beam Dataflow Streaming Execution
- Workflow:OpenRLHF OpenRLHF Math Reasoning Training
- Workflow:Mlc ai Web llm Web Worker Deployment
- Workflow:Fede1024 Rust rdkafka Mock Cluster Testing
- Workflow:CrewAIInc CrewAI Flow Based Orchestration
- Workflow:NVIDIA TransformerEngine FSDP Distributed Training
- Workflow:PacktPublishing LLM Engineers Handbook Digital Data ETL
Principles
- Principle:Risingwavelabs Risingwave Batch Query Serving
- Principle:Online ml River Online Decision Trees
- Principle:DevExpress Testcafe Runner Creation
- Principle:Openai Whisper Evaluation Benchmarking
- Principle:Langgenius Dify Plugin Discovery
- Principle:SeldonIO Seldon core Pipeline Conditional Routing
- Principle:FlowiseAI Flowise Variable Management
- Principle:Webdriverio Webdriverio Test Execution Orchestration
- Principle:AUTOMATIC1111 Stable diffusion webui VAE decoding
- Principle:NVIDIA DALI GridMask Augmentation
Implementations
- Implementation:Apache Dolphinscheduler BaseDataSourceParamDTO Extension
- Implementation:Huggingface Datatrove JsonlWriter
- Implementation:Diagram of thought Diagram of thought Rigor Protocol Configuration
- Implementation:Ggml org Llama cpp Ggml Backend Load All
- Implementation:Online ml River Tree Splitter TEBST
- Implementation:NVIDIA NeMo Curator Benchmark Runner
- Implementation:Cypress io Cypress DetectFramework Setup
- Implementation:Microsoft DeepSpeedExamples Domino Training
- Implementation:Microsoft Onnxruntime CPU GatherNDGrad
- Implementation:LLMBook zh LLMBook zh github io Get Data DPO
Heuristics
- Heuristic:PacktPublishing LLM Engineers Handbook Temperature Selection By Task
- Heuristic:ARISE Initiative Robomimic Rollout Horizon Selection
- Heuristic:VainF Torch Pruning Pruning Ratio vs Parameter Ratio
- Heuristic:Helicone Helicone Anthropic Cache Double Count Prevention
- Heuristic:VainF Torch Pruning AutoGrad Dependency Graph
- Heuristic:Fede1024 Rust rdkafka Queue Buffering Priority
- Heuristic:Guardrails ai Guardrails RAIL Argument Parsing Security
- Heuristic:Unslothai Unsloth Padding Free Packing
- Heuristic:Iamhankai Forest of Thought Early Stop Majority Vote
- Heuristic:OpenHands OpenHands Clustered Race Condition Prevention
Environments
- Environment:Kubeflow Pipelines Kubernetes Cluster
- Environment:Deepset ai Haystack HuggingFace Model Environment
- Environment:LMCache LMCache Python Runtime
- Environment:Interpretml Interpret Visualization Environment
- Environment:Huggingface Datatrove Slurm Cluster Environment
- Environment:Open compass VLMEvalKit Python Runtime Environment
- Environment:Nautechsystems Nautilus trader Arrow Parquet Serialization
- Environment:Google deepmind Mujoco Python Bindings Environment
- Environment:Hiyouga LLaMA Factory Quantization Dependencies
- Environment:Huggingface Diffusers Attention Backends