Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin — converts your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP — search ML/AI best practices and skills
- Kapso — experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Recommenders team Recommenders ALS Spark Recommendation
- Workflow:Cohere ai Cohere python Model Finetuning
- Workflow:Kubeflow Pipelines XGBoost Training Pipeline
- Workflow:Datajuicer Data juicer LLM Powered Data Generation
- Workflow:Truera Trulens Snowflake Observability Pipeline
- Workflow:Helicone Helicone Cost Calculation Pipeline
- Workflow:Confident ai Deepeval End to End LLM Evaluation
- Workflow:FlowiseAI Flowise Evaluation Pipeline
- Workflow:Online ml River Streaming Anomaly Detection
- Workflow:Marker Inc Korea AutoRAG RAG Pipeline Optimization
Principles
- Principle:Pyro ppl Pyro Topic Modeling
- Principle:NVIDIA NeMo Aligner SteerLM Data Preparation
- Principle:Google deepmind Dm control Game Rules Configuration
- Principle:Avhz RustQuant Geometric Brownian Motion
- Principle:Langchain ai Langchain Version Bumping
- Principle:Googleapis Python genai Local Tokenization
- Principle:Heibaiying BigData Notes Flink Stream Transformations
- Principle:ARISE Initiative Robosuite Observable System
- Principle:Confident ai Deepeval Golden Generation from Documents
- Principle:Lm sys FastChat Condensed Rotary Embedding
Implementations
- Implementation:PrefectHQ Prefect DataAnalysis Output Model
- Implementation:Ray project Ray PyTorch Hyperparameter Tuning Tutorial
- Implementation:Astronomer Astronomer cosmos RedshiftUserPasswordProfileMapping
- Implementation:Ollama Ollama WriteManifest
- Implementation:Pyro ppl Pyro AffineBeta
- Implementation:Sgl project Sglang SM100 MLA Tile Scheduler
- Implementation:Deepspeedai DeepSpeed DeepCompile Header
- Implementation:Microsoft Playwright CryptoUtils
- Implementation:OpenRLHF OpenRLHF Interactive Chat
- Implementation:Kubeflow Pipelines XGBoost Cross Format Predict
Heuristics
- Heuristic:Mbzuai oryx Awesome LLM Post training Checkpoint Every 3 Papers
- Heuristic:PeterL1n BackgroundMattingV2 Data Augmentation Strategy
- Heuristic:Astronomer Astronomer cosmos Memory Optimised Imports
- Heuristic:Tencent Ncnn FP16 Precision Selection
- Heuristic:Astronomer Astronomer cosmos Static Parser Hang Workaround
- Heuristic:Huggingface Datatrove FineWeb Filter Pipeline Order
- Heuristic:Langchain ai Langgraph Checkpointer Selection Guide
- Heuristic:Neuml Txtai LLM Context Window Fallback
- Heuristic:Run llama Llama index Worker Count Configuration
- Heuristic:Triton inference server Server Server Default Configuration
Environments
- Environment:Alibaba MNN CPU Build Environment
- Environment:Facebookresearch Habitat lab Python 3 9 Core Dependencies
- Environment:Risingwavelabs Risingwave Java Connector Environment
- Environment:Farama Foundation Gymnasium Python 3 10 Runtime
- Environment:Farama Foundation Gymnasium MuJoCo Physics Backend
- Environment:Tencent Ncnn PyTorch Environment
- Environment:Ggml org Ggml CUDA GPU Environment
- Environment:Tensorflow Serving Python Client Environment
- Environment:Pytorch Serve Distributed Training Environment
- Environment:Datajuicer Data juicer LLM API Credentials Environment