Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin: turns your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP: search ML/AI best practices and skills
- Kapso: an experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Mbzuai oryx Awesome LLM Post training Awesome List Curation
- Workflow:Apache Dolphinscheduler Datasource Connection Management
- Workflow:Datahub project Datahub Python Metadata Emission
- Workflow:PacktPublishing LLM Engineers Handbook Feature Engineering
- Workflow:Online ml River Online Clustering
- Workflow:Pyro ppl Pyro MCMC Inference
- Workflow:Pytorch Serve LLM Deployment vLLM
- Workflow:Protectai Modelscan Custom Scanner Plugin
- Workflow:Openai Evals Building a custom eval
- Workflow:Cleanlab Cleanlab Token Classification Label Quality
Principles
- Principle:Microsoft DeepSpeedExamples DeepSpeed CLI Integration
- Principle:AUTOMATIC1111 Stable diffusion webui Conditional Function Dispatch
- Principle:Helicone Helicone Data Export
- Principle:ARISE Initiative Robosuite Manipulation Task Design
- Principle:Puppeteer Puppeteer CDP Session Management
- Principle:Apache Flink Object Reuse
- Principle:Kubeflow Kubeflow Contribution Guidelines
- Principle:Webdriverio Webdriverio Browser Navigation and Interaction
- Principle:Promptfoo Promptfoo External Data Source Integration
- Principle:Protectai Llm guard Input Scanner Factory Pattern
Implementations
- Implementation:Kserve Kserve PD LLMInferenceService Spec
- Implementation:Webdriverio Webdriverio Config Utils
- Implementation:ArroyoSystems Arroyo Pipeline Config Modal
- Implementation:Open compass VLMEvalKit MLVU
- Implementation:PeterL1n BackgroundMattingV2 MattingRefine
- Implementation:Sgl project Sglang Health And Metrics Endpoints
- Implementation:Google deepmind Mujoco MjModel Header
- Implementation:Datajuicer Data juicer DialogSentimentIntensityMapper
- Implementation:Tencent Ncnn Onnx2ncnn
- Implementation:TobikoData Sqlmesh Context Plan
Heuristics
- Heuristic:FlagOpen FlagEmbedding Length Sorted Batching
- Heuristic:Mbzuai oryx Awesome LLM Post training Paper Deduplication Via Dict
- Heuristic:Marker Inc Korea AutoRAG Module Selection Strategies
- Heuristic:Spotify Luigi Marker Table Idempotency
- Heuristic:Lucidrains X transformers MaskGIT Generation Tuning
- Heuristic:Farama Foundation Gymnasium Shared Memory Vector Env Optimization
- Heuristic:Ucbepic Docetl Token Counting And Truncation
- Heuristic:Neuml Txtai Faiss Index Sizing Tip
- Heuristic:Heibaiying BigData Notes HBase Connection Thread Safety Tip
- Heuristic:PacktPublishing LLM Engineers Handbook Chunking Strategy By Content Type
Environments
- Environment:Evidentlyai Evidently SQL Storage Environment
- Environment:Kubeflow Kubeflow Git GitHub Environment
- Environment:Mbzuai oryx Awesome LLM Post training Git CLI
- Environment:Huggingface Diffusers Attention Backends
- Environment:Astronomer Astronomer cosmos Cloud Provider Dependencies
- Environment:Avhz RustQuant Rust Stable
- Environment:Microsoft Autogen LLM Provider API Keys
- Environment:Sgl project Sglang Prometheus
- Environment:Nautechsystems Nautilus trader Arrow Parquet Serialization
- Environment:Huggingface Datasets SQL Dependencies