Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin — converts your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP — search ML/AI best practices and skills
- Kapso — experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:ARISE Initiative Robosuite Domain Randomization Training
- Workflow:Huggingface Datatrove Dataset Tokenization
- Workflow:Lance format Lance Vector Search Pipeline
- Workflow:Bitsandbytes foundation Bitsandbytes FSDP QLoRA Distributed Training
- Workflow:Fastai Fastbook Image Classification
- Workflow:Mit han lab Llm awq TinyChat LLM Deployment
- Workflow:Marker Inc Korea AutoRAG Data Creation Pipeline
- Workflow:Huggingface Open r1 GRPO Reasoning Training
- Workflow:DataExpert io Data engineer handbook PySpark Iceberg Job Execution
- Workflow:Apache Hudi Docker Demo Setup
Principles
- Principle:Dagster io Dagster Run Configuration
- Principle:LaurentMazare Tch rs Dataset Iteration
- Principle:Openai Openai agents python Tool Execution Loop
- Principle:Cypress io Cypress Framework and Bundler Detection
- Principle:Nautechsystems Nautilus trader Data Subscription
- Principle:TobikoData Sqlmesh Macro And Jinja Translation
- Principle:Obss Sahi Detectron2 Config Export
- Principle:Huggingface Datatrove FineWeb Quality Heuristics
- Principle:Getgauge Taiko Request Blocking
- Principle:Mbzuai oryx Awesome LLM Post training Category Taxonomy Definition
Implementations
- Implementation:Kserve Kserve InferenceGraph Full CRD
- Implementation:Risingwavelabs Risingwave BatchAppendOnlyJDBCSink
- Implementation:Hiyouga LLaMA Factory V1 Data Engine
- Implementation:Alibaba MNN FlexBuffers Header
- Implementation:Spcl Graph of thoughts ChatGPT
- Implementation:FlowiseAI Flowise ViewMessagesDialog
- Implementation:Pola rs Polars Credential Provider Configuration
- Implementation:FlagOpen FlagEmbedding RetroMAE Data
- Implementation:Openai Openai python Eval List Params
- Implementation:Datajuicer Data juicer PipelineDAG
Heuristics
- Heuristic:Cypress io Cypress Xvfb Display Gotcha
- Heuristic:Helicone Helicone Anthropic Cache Double Count Prevention
- Heuristic:Diagram of thought Diagram of thought Typed Records For Auditability
- Heuristic:DevExpress Testcafe Concurrency Factor Limit
- Heuristic:LLMBook zh LLMBook zh github io DPO Beta Hyperparameter
- Heuristic:Trailofbits Fickling Severity Threshold Selection
- Heuristic:Speechbrain Speechbrain Data Augmentation Defaults
- Heuristic:FMInference FlexLLMGen Offloading Percent Tuning
- Heuristic:Anthropics Anthropic sdk python Streaming For Long Requests
- Heuristic:Cypress io Cypress V8 Snapshot Memory
Environments
- Environment:PacktPublishing LLM Engineers Handbook AWS SageMaker GPU Environment
- Environment:Facebookresearch Audiocraft FAD TensorFlow Environment
- Environment:InternLM Lmdeploy Python Dependencies
- Environment:BerriAI Litellm Python Runtime
- Environment:Kubeflow Kubeflow Python KFP SDK Environment
- Environment:Lucidrains X transformers Python Environment
- Environment:Microsoft Onnxruntime CUDA GPU Environment
- Environment:NVIDIA TransformerEngine CUDA Toolkit Requirements
- Environment:Apache Kafka Release Toolchain Environment
- Environment:Intel Ipex llm NPU Cpp Environment