Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Implementation:LaurentMazare Tch rs Llama New

From Leeroopedia


Knowledge Sources
Domains NLP, Model_Architecture
Last Updated 2026-02-08 14:00 GMT

Overview

Concrete tool for constructing the LLaMA transformer model architecture in Rust provided by the tch-rs examples.

Description

Llama::new builds the complete LLaMA model: token embedding (wte), N transformer blocks (each with RmsNorm, CausalSelfAttention, MLP), final RmsNorm (ln_f), and output linear head (lm_head with no bias). Each block contains multi-head attention with RoPE and SwiGLU feed-forward. The llama wrapper function additionally precomputes rotary embeddings and wraps the model in a closure that applies temperature scaling and softmax.

Usage

Use to build a LLaMA model for text generation. After construction, load converted safetensors weights via the mmap loading pattern.

Code Reference

Source Location

  • Repository: tch-rs
  • File: examples/llama/main.rs
  • Lines: 244-259 (Llama::new), 261-271 (forward), 288-296 (wrapper)

Signature

impl Llama {
    fn new(vs: nn::Path, config: &Config) -> Self
    fn forward(&self, x: &Tensor, freqs_cis: &Tensor) -> Tensor
}

Import

// Internal to examples/llama/main.rs
// Uses tch::nn for layer construction
use tch::nn;

I/O Contract

Inputs

Name Type Required Description
vs nn::Path Yes VarStore path for parameter registration
config &Config Yes Model config (n_layer, n_head, n_embd, vocab_size)

Outputs

Name Type Description
Llama struct Model with wte (Embedding), blocks (Vec<Block>), ln_f (RmsNorm), lm_head (Linear)

Usage Examples

use tch::nn;

let vs = nn::VarStore::new(tch::Device::Cpu);
let config = Config::config_7b();
let llama = Llama::new(vs.root(), &config);

// After loading weights:
let logits = llama.forward(&token_ids, &freqs_cis);  // [1, 1, 32000]

Related Pages

Implements Principle

Requires Environment

Uses Heuristic

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment