Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin — converts your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP — search ML/AI best practices and skills
- Kapso — experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Ray project Ray Build and Release Pipeline
- Workflow:InternLM Lmdeploy LLM Offline Batch Inference
- Workflow:Unslothai Unsloth Vision Model Finetuning
- Workflow:Openai Evals Running an eval set
- Workflow:Kubeflow Pipelines XGBoost Training Pipeline
- Workflow:Cohere ai Cohere python Model Finetuning
- Workflow:Facebookresearch Habitat lab HITL Interactive Evaluation
- Workflow:Datahub project Datahub Metadata Ingestion Pipeline
- Workflow:LMCache LMCache P2P KV Cache Sharing
- Workflow:Vespa engine Vespa Logging framework initialization
Principles
- Principle:Duckdb Duckdb Quantile Estimation
- Principle:Huggingface Alignment handbook Odds Ratio Preference Optimization
- Principle:Online ml River Streaming Silhouette
- Principle:DistrictDataLabs Yellowbrick Visualization Rendering
- Principle:Huggingface Diffusers DreamBooth Export
- Principle:Sdv dev SDV Fixed Combinations Constraint
- Principle:Shiyu coder Kronos Autoregressive Token Generation
- Principle:Intel Ipex llm GaLore Gradient Projection
- Principle:Ggml org Llama cpp Computation Graph Building
- Principle:Kubeflow Kubeflow Release Tagging
Implementations
- Implementation:NVIDIA NeMo Curator BucketsToEdgesStage
- Implementation:Lance format Lance NamespaceError
- Implementation:Openai Openai python Eval Update Response
- Implementation:Sdv dev SDV DayZSynthesizer Multi Table
- Implementation:Eventual Inc Daft Read Deltalake
- Implementation:Open compass VLMEvalKit XGenMM
- Implementation:Predibase Lorax Flash Mistral Modeling
- Implementation:Google deepmind Dm control Reference Pose Tracking
- Implementation:Bigscience workshop Petals AutoTokenizer From Pretrained
- Implementation:Elevenlabs Elevenlabs python StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel
Heuristics
- Heuristic:TobikoData Sqlmesh Fork Worker Tuning
- Heuristic:Explodinggradients Ragas LLM Temperature Defaults
- Heuristic:Google deepmind Dm control MJCF Model Composition Gotchas
- Heuristic:Lance format Lance BM25 FTS Configuration
- Heuristic:Apache Hudi Compaction Scheduling Safety
- Heuristic:Nautechsystems Nautilus trader Cache Buffer Interval Tuning
- Heuristic:SqueezeAILab ETS Thread Parallelism Suppression
- Heuristic:Elevenlabs Elevenlabs python Text Chunking Splitter Characters
- Heuristic:PacktPublishing LLM Engineers Handbook Dataset Generation Quality Filters
- Heuristic:Microsoft LoRA LoRA Rank Selection
Environments
- Environment:Mage ai Mage ai Python 3 9 Runtime
- Environment:ClickHouse ClickHouse Systemd Runtime
- Environment:Webdriverio Webdriverio Cloud Service Credentials
- Environment:Haosulab ManiSkill Real Robot LeRobot Deps
- Environment:Google research Deduplicate text datasets Python TFDS Environment
- Environment:Haifengl Smile Java 25 Runtime
- Environment:ThreeSR Awesome Inference Time Scaling Python Runtime Environment
- Environment:Sgl project Sglang Kubernetes
- Environment:DataTalksClub Data engineering zoomcamp Docker PostgreSQL Python Environment
- Environment:Marker Inc Korea AutoRAG API Keys Configuration