Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin — converts your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP — search best practices and skills for ML/AI
- Kapso — experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Mbzuai oryx Awesome LLM Post training Deep Paper Collection
- Workflow:CarperAI Trlx ILQL Offline Training
- Workflow:Datajuicer Data juicer Distributed Ray Processing
- Workflow:Lance format Lance Vector Search Pipeline
- Workflow:LLMBook zh LLMBook zh github io Data Preprocessing Pipeline
- Workflow:Treeverse LakeFS Data Version Control With Branches
- Workflow:AnswerDotAI RAGatouille In Memory Retrieval
- Workflow:Apache Druid SQL Query Execution
- Workflow:Ggml org Llama cpp Multimodal Inference
- Workflow:LaurentMazare Tch rs MNIST Training
Principles
- Principle:Apache Beam Computation Configuration
- Principle:Huggingface Datatrove Media Writing Framework
- Principle:NVIDIA DALI Operator Schema Registration
- Principle:Scikit learn contrib Imbalanced learn Instance Hardness Thresholding
- Principle:Scikit learn Scikit learn Ensemble Training
- Principle:Ray project Ray Docker Image Building
- Principle:Volcengine Verl SFT Data Preparation
- Principle:Mlflow Mlflow Trace Assessment
- Principle:Lucidrains X transformers Variational Latent Language Modeling
- Principle:Online ml River Non Stationary Stream Loading
Implementations
- Implementation:Tensorflow Serving Caching Manager Test
- Implementation:ArroyoSystems Arroyo Nexmark Operator
- Implementation:Explodinggradients Ragas Loss Classes
- Implementation:Ggml org Ggml Webgpu shader lib
- Implementation:Alibaba ROLL SFT Get Encode Function
- Implementation:InternLM Lmdeploy Core ArrayOps
- Implementation:Open compass VLMEvalKit OlympiadBench Utils
- Implementation:NVIDIA NeMo Curator ArXiv Extractor
- Implementation:Unstructured IO Unstructured Measure Execution Time
- Implementation:Spotify Luigi HadoopJobRunner
Heuristics
- Heuristic:HKUDS AI Trader Linear Retry Backoff
- Heuristic:Gretelai Gretel synthetics Gumbel Softmax NaN Retry
- Heuristic:Bentoml BentoML Worker Count Strategy
- Heuristic:Pytorch Serve Batch Size Tuning
- Heuristic:Kubeflow Pipelines Resource Sizing For Components
- Heuristic:NVIDIA DALI NVJPEG Memory Preallocation
- Heuristic:Apache Paimon Vector Index Configuration Tips
- Heuristic:MarketSquare Robotframework browser MacOS Sonoma Startup Delay
- Heuristic:Huggingface Diffusers Dtype Precision Selection
- Heuristic:Togethercomputer Together python Fine Tuning Parameter Validation
Environments
- Environment:Apache Dolphinscheduler Node Pnpm Runtime
- Environment:Huggingface Datatrove IO Dependencies
- Environment:Datahub project Datahub Python 3 10 Ingestion Environment
- Environment:Spotify Luigi Apache Spark
- Environment:Promptfoo Promptfoo SQLite Database
- Environment:Eventual Inc Daft AI Provider Dependencies
- Environment:Mistralai Client python Python SDK Environment
- Environment:Langfuse Langfuse Node 24 Runtime
- Environment:Helicone Helicone Docker Compose Infrastructure
- Environment:Getgauge Taiko Chromium Browser