Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin: turns your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP: search ML/AI best practices and skills
- Kapso: an experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Speechbrain Speechbrain Speaker Embedding Training
- Workflow:Huggingface Diffusers LoRA Finetuning
- Workflow:Deepset ai Haystack Extractive QA Pipeline
- Workflow:Apache Dolphinscheduler RPC Service Communication
- Workflow:PacktPublishing LLM Engineers Handbook Feature Engineering
- Workflow:Deepset ai Haystack RAG Evaluation Pipeline
- Workflow:FlowiseAI Flowise Document Store Ingestion
- Workflow:LaurentMazare Tch rs Transfer Learning
- Workflow:Vllm project Vllm OpenAI Compatible Serving
- Workflow:Pola rs Polars Data IO and Format Conversion
Principles
- Principle:Huggingface Datatrove Document Level Statistics
- Principle:Cleanlab Cleanlab Sklearn Compatible PyTorch Classifier
- Principle:Hiyouga LLaMA Factory Mixture of Experts
- Principle:Mlflow Mlflow Run Management
- Principle:Helicone Helicone Provider Communication
- Principle:Microsoft Semantic kernel Plugin Registration
- Principle:Neuml Txtai Workflow Scheduling
- Principle:Langchain ai Langgraph Language Model Configuration
- Principle:Langchain ai Langgraph Chatbot Simulation Evaluation
- Principle:Ucbepic Docetl Pipeline Configuration
Implementations
- Implementation:TobikoData Sqlmesh WebClient OpenAPI Spec
- Implementation:Apache Shardingsphere DatabaseMetaDataChangedListener OnChange
- Implementation:DataExpert io Data engineer handbook Namedtuple CreateDataFrame Pattern
- Implementation:Mlc ai Mlc llm Router Translate request
- Implementation:Microsoft DeepSpeedExamples Write Benchmark Log
- Implementation:Ggml org Llama cpp GGUF Split
- Implementation:NVIDIA TransformerEngine FusedSGD
- Implementation:EvolvingLMMs Lab Lmms eval Evaluation Shell Script
- Implementation:Huggingface Diffusers Modular Auto Docstring
- Implementation:Speechbrain Speechbrain Brain Evaluate With ErrorRateStats
Heuristics
- Heuristic:Huggingface Datatrove FineWeb Filter Pipeline Order
- Heuristic:Snorkel team Snorkel NLP Preprocessor Memoization
- Heuristic:Cleanlab Cleanlab Confident Threshold Heuristic
- Heuristic:Zai org CogVideo Training Hyperparameter Defaults
- Heuristic:Openai Whisper Compression Ratio Threshold
- Heuristic:Hiyouga LLaMA Factory Mixed Precision Training Tips
- Heuristic:Facebookresearch Habitat lab VER Tuning Guidelines
- Heuristic:Astronomer Astronomer cosmos Deprecation Migration Paths
- Heuristic:AnswerDotAI RAGatouille In Memory Reranking Limits
- Heuristic:Kserve Kserve Prefix Cache Consistency
Environments
- Environment:Dagster io Dagster PostgreSQL Storage
- Environment:Huggingface Transformers BitsAndBytes Quantization Env
- Environment:Guardrails ai Guardrails Python 3 10 Runtime
- Environment:Explodinggradients Ragas Google Drive Backend Environment
- Environment:FlowiseAI Flowise Docker Environment
- Environment:Spotify Luigi AWS S3 Storage
- Environment:Apache Dolphinscheduler Node Pnpm Runtime
- Environment:Mistralai Client python Realtime Transcription Environment
- Environment:EvolvingLMMs Lab Lmms eval GPU Compute Environment
- Environment:Mlflow Mlflow OpenAI LLM Integration Environment