Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent and let it build robust AI/ML systems autonomously:
- SuperML plugin — converts your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP — search best practices and skills for ML/AI
- Kapso — experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:PeterL1n BackgroundMattingV2 Video matting inference
- Workflow:Volcengine Verl Multi Turn Tool Use Training
- Workflow:PrefectHQ Prefect Web Scraping Pipeline
- Workflow:Huggingface Datatrove FineWeb Dataset Creation
- Workflow:DataExpert io Data engineer handbook PySpark Job Testing
- Workflow:DataExpert io Data engineer handbook Flink Kafka Streaming Pipeline
- Workflow:Arize ai Phoenix Trace Ingestion Pipeline
- Workflow:Unslothai Unsloth QLoRA SFT Finetuning
- Workflow:Kubeflow Kubeflow Platform Deployment
- Workflow:OpenGVLab InternVL Multi Stage Pretraining
Principles
- Principle:Huggingface Peft Adapter Persistence
- Principle:Scikit learn Scikit learn Ranking Metrics
- Principle:Anthropics Anthropic sdk python Client Initialization
- Principle:Haifengl Smile Nearest Neighbor Query
- Principle:Trailofbits Fickling Pickle Scanner Benchmarking
- Principle:Datajuicer Data juicer Data Processing Execution
- Principle:LaurentMazare Tch rs Frozen Feature Computation
- Principle:Apache Druid Streaming Source Configuration
- Principle:Lance format Lance Approximate Nearest Neighbor Search
- Principle:Trailofbits Fickling Static Safety Analysis
Implementations
- Implementation:Iterative Dvc Repo Du
- Implementation:Openai Openai python Video Create Params
- Implementation:Openclaw Openclaw FlyToml RenderYaml
- Implementation:Iterative Dvc Pyproject Config
- Implementation:Helicone Helicone ModelConfig AuthorMetadata
- Implementation:Huggingface Datasets IterableDataset Take
- Implementation:ContextualAI HALOs LM Eval CLI
- Implementation:Huggingface Trl SFTTrainer Train
- Implementation:ArroyoSystems Arroyo Aws Credential Provider
- Implementation:Lance format Lance NamespaceSchema
Heuristics
- Heuristic:Facebookresearch Audiocraft Codebook Dead Code Expiration
- Heuristic:Huggingface Alignment handbook Global Batch Size Scaling
- Heuristic:Huggingface Datasets Warning Deprecated Pandas Builder
- Heuristic:Predibase Lorax LoRA Kernel Selection By Rank
- Heuristic:AnswerDotAI RAGatouille Collection Size Index Tuning
- Heuristic:Pytorch Serve Ampere Tensor Core Optimization
- Heuristic:Microsoft Autogen Model Context Limiting
- Heuristic:Onnx Onnx Warning Deprecated InlineSelectedFunctions
- Heuristic:Run llama Llama index Chunk Size Optimization
- Heuristic:CrewAIInc CrewAI LLM Provider Message Workarounds
Environments
- Environment:Lm sys FastChat SFT Training Environment
- Environment:Avdvg InjectGuard CUDA GPU
- Environment:Eric mitchell Direct preference optimization PyTorch CUDA
- Environment:Mistralai Client python Realtime Transcription Environment
- Environment:Apache Flink Node Build Environment
- Environment:Rapidsai Cuml Python RAPIDS Stack
- Environment:Langchain ai Langchain Unit Test Network Isolation
- Environment:Cypress io Cypress Linux Display Server
- Environment:Pytorch Serve CUDA GPU Environment
- Environment:MaterializeInc Materialize Buildkite CI Runtime