Implementation:Google deepmind Mujoco Engine Util Sparse AVX

From Leeroopedia
Knowledge Sources
Domains: Physics Simulation, SIMD Optimization, Sparse Linear Algebra
Last Updated: 2026-02-15 04:00 GMT

Overview

Header-only AVX (Advanced Vector Extensions) SIMD implementations of performance-critical sparse operations for MuJoCo, providing 4-wide double-precision vectorized computation.

Description

This header provides AVX-optimized implementations of sparse linear algebra operations. It is compiled only when mjUSEPLATFORMSIMD is defined and the compiler predefines the __AVX__ macro, and only in double-precision builds. Each function processes elements in chunks of 4 and finishes with a scalar tail loop for the remaining elements. The key functions are described below.

mju_dotSparse_avx: sparse dot product. It uses 256-bit AVX registers to process 4 doubles in parallel, gathers the indexed elements manually, and reduces the accumulator horizontally with a 128-bit extract and add.

mju_dotSparseX3_avx: batched sparse dot product for 3 vectors at once. The values gathered from vec2 are reused across all three dot products, which is the supernode optimization.

mju_mulMatVecSparse_avx: sparse matrix-vector multiplication with supernode support. Rows are dispatched in blocks of 3 via mju_dotSparseX3_avx.

mju_addToSclScl_avx: computes res = res*scl1 + vec*scl2 using vectorized multiply-add.

mju_compare_avx: integer vector comparison using SSE2 128-bit operations.
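The sparse dot product described above can be sketched as follows. This is a minimal illustration, not the MuJoCo source: the name `dot_sparse_sketch` is hypothetical, `double` stands in for `mjtNum`, and the `#if defined(__AVX__)` guard is added here so the block also compiles (as a plain scalar loop) when AVX is not enabled.

```c
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical sketch of a 4-wide sparse dot product:
// sum over i of vec1[i] * vec2[ind1[i]].
static double dot_sparse_sketch(const double* vec1, const double* vec2,
                                int nnz1, const int* ind1) {
  double res = 0.0;
  int i = 0;

#if defined(__AVX__)
  __m256d sum = _mm256_setzero_pd();
  for (; i + 4 <= nnz1; i += 4) {
    // the sparse values are contiguous, so one unaligned load suffices
    __m256d a = _mm256_loadu_pd(vec1 + i);
    // manual gather from the dense vector;
    // _mm256_set_pd lists lanes from highest to lowest
    __m256d b = _mm256_set_pd(vec2[ind1[i + 3]], vec2[ind1[i + 2]],
                              vec2[ind1[i + 1]], vec2[ind1[i]]);
    sum = _mm256_add_pd(sum, _mm256_mul_pd(a, b));
  }
  // horizontal reduction: add the upper 128-bit half to the lower half,
  // then add the two remaining doubles
  __m128d lo = _mm256_castpd256_pd128(sum);
  __m128d hi = _mm256_extractf128_pd(sum, 1);
  __m128d pair = _mm_add_pd(lo, hi);
  pair = _mm_add_sd(pair, _mm_unpackhi_pd(pair, pair));
  res = _mm_cvtsd_f64(pair);
#endif

  // scalar tail loop (the whole computation when AVX is unavailable)
  for (; i < nnz1; i++) {
    res += vec1[i] * vec2[ind1[i]];
  }
  return res;
}
```

With AVX enabled, the four partial sums are accumulated in separate lanes, so the result can differ from a strictly sequential scalar sum in the last bits for general floating-point inputs.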

Usage

The non-AVX entry points in engine_util_sparse.h and engine_util_sparse.c dispatch to these functions at compile time when the build targets AVX, so sparse operations throughout the solver pipeline are accelerated without any change to calling code.
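One way such compile-time dispatch can look is sketched below. The guard condition is an assumption pieced together from the description (mjUSEPLATFORMSIMD, `__AVX__`, double precision); in particular, using `mjUSESINGLE` to detect single-precision builds is assumed here, and `SKETCH_USE_AVX` and all `sketch_` function names are hypothetical.

```c
// Assumed guard: platform SIMD requested, AVX enabled, double precision.
#if defined(mjUSEPLATFORMSIMD) && defined(__AVX__) && !defined(mjUSESINGLE)
  #define SKETCH_USE_AVX 1
  #include <immintrin.h>
#else
  #define SKETCH_USE_AVX 0
#endif

// portable fallback, always compiled
static double sketch_dot_sparse_scalar(const double* vec1, const double* vec2,
                                       int nnz1, const int* ind1) {
  double res = 0.0;
  for (int i = 0; i < nnz1; i++) {
    res += vec1[i] * vec2[ind1[i]];
  }
  return res;
}

#if SKETCH_USE_AVX
// AVX variant, compiled only when the guard above holds
static double sketch_dot_sparse_simd(const double* vec1, const double* vec2,
                                     int nnz1, const int* ind1) {
  __m256d sum = _mm256_setzero_pd();
  int i = 0;
  for (; i + 4 <= nnz1; i += 4) {
    __m256d b = _mm256_set_pd(vec2[ind1[i + 3]], vec2[ind1[i + 2]],
                              vec2[ind1[i + 1]], vec2[ind1[i]]);
    sum = _mm256_add_pd(sum, _mm256_mul_pd(_mm256_loadu_pd(vec1 + i), b));
  }
  double lane[4];
  _mm256_storeu_pd(lane, sum);
  double res = (lane[0] + lane[2]) + (lane[1] + lane[3]);
  for (; i < nnz1; i++) {
    res += vec1[i] * vec2[ind1[i]];
  }
  return res;
}
#endif

// single entry point: callers never see which variant was selected
static double sketch_dot_sparse(const double* vec1, const double* vec2,
                                int nnz1, const int* ind1) {
#if SKETCH_USE_AVX
  return sketch_dot_sparse_simd(vec1, vec2, nnz1, ind1);
#else
  return sketch_dot_sparse_scalar(vec1, vec2, nnz1, ind1);
#endif
}
```

Because the choice is made by the preprocessor, there is no runtime CPU check: a binary built with the AVX path assumes the target machine supports AVX.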

Code Reference

Source Location

Key Functions

// Sparse dot product with AVX (4-wide double)
static inline
mjtNum mju_dotSparse_avx(const mjtNum* vec1, const mjtNum* vec2,
                         int nnz1, const int* ind1);

// Batched sparse dot product for 3 vectors (supernode)
static inline
void mju_dotSparseX3_avx(mjtNum* res0, mjtNum* res1, mjtNum* res2,
                         const mjtNum* vec10, const mjtNum* vec11,
                         const mjtNum* vec12, const mjtNum* vec2,
                         int nnz1, const int* ind1);

// Sparse matrix-vector multiply with AVX and supernode support
static inline
void mju_mulMatVecSparse_avx(mjtNum* res, const mjtNum* mat, const mjtNum* vec,
                             int nr, const int* rownnz, const int* rowadr,
                             const int* colind, const int* rowsuper);

// Vectorized scaled addition: res = res*scl1 + vec*scl2
static inline
void mju_addToSclScl_avx(mjtNum* res, const mjtNum* vec,
                         mjtNum scl1, mjtNum scl2, int n);

// Integer vector comparison using SSE2
static inline
int mju_compare_avx(const int* vec1, const int* vec2, int n);
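The benefit of the batched signature is easiest to see in scalar form. The sketch below is an illustration of the access pattern only, not the AVX code: three sparse vectors share one index array, so each element of the dense vector is fetched once and used three times. The name `dot_sparse_x3_sketch` is hypothetical and `double` stands in for `mjtNum`.

```c
// Hypothetical scalar sketch of a 3-way batched sparse dot product.
// vec10, vec11, vec12 hold the values of three sparse vectors that
// share the same sparsity pattern ind1; vec2 is dense.
static void dot_sparse_x3_sketch(double* res0, double* res1, double* res2,
                                 const double* vec10, const double* vec11,
                                 const double* vec12, const double* vec2,
                                 int nnz1, const int* ind1) {
  double r0 = 0.0, r1 = 0.0, r2 = 0.0;
  for (int i = 0; i < nnz1; i++) {
    // one indexed read of the dense vector ...
    double v = vec2[ind1[i]];
    // ... reused by all three accumulators
    r0 += vec10[i] * v;
    r1 += vec11[i] * v;
    r2 += vec12[i] * v;
  }
  *res0 = r0;
  *res1 = r1;
  *res2 = r2;
}
```

Three separate dot products would perform the indexed reads of vec2 three times; sharing them is what makes rows with identical sparsity patterns cheaper to process together.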

Import

#include "engine/engine_util_sparse_avx.h"

I/O Contract

Inputs

Name Type Required Description
vec1 mjtNum* Yes Non-zero values of the sparse vector, stored contiguously; entry i belongs at index ind1[i]
vec2 mjtNum* Yes Dense vector, read at the positions listed in ind1 (gather)
nnz1 int Yes Number of non-zero elements in the sparse vector
ind1 int* Yes Indices of the non-zero elements
mat mjtNum* Yes Sparse matrix values in CSR format
rowsuper int* No Supernode sizes for batched row processing
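To make the CSR inputs concrete, here is a scalar reference sketch of a sparse matrix-vector product over the same arrays. It deliberately omits `rowsuper` and the blocks-of-3 batching, so it shows only the data layout, not the optimized path. The name `csr_mul_vec_ref` is hypothetical and `double` stands in for `mjtNum`.

```c
// Hypothetical scalar reference: res = mat * vec for a CSR matrix.
//   rownnz[r]  number of non-zeros in row r
//   rowadr[r]  offset of row r's first non-zero in mat and colind
//   colind[k]  column index of the k-th stored value
static void csr_mul_vec_ref(double* res, const double* mat, const double* vec,
                            int nr, const int* rownnz, const int* rowadr,
                            const int* colind) {
  for (int r = 0; r < nr; r++) {
    const double* row_val = mat + rowadr[r];
    const int* row_ind = colind + rowadr[r];
    double sum = 0.0;
    // each row is a sparse dot product against the dense vector
    for (int k = 0; k < rownnz[r]; k++) {
      sum += row_val[k] * vec[row_ind[k]];
    }
    res[r] = sum;
  }
}
```

For example, the 3x4 matrix with rows (1 0 2 0), (0 0 0 3), (4 5 0 6) is stored as mat = {1, 2, 3, 4, 5, 6}, colind = {0, 2, 3, 0, 1, 3}, rownnz = {2, 1, 3}, rowadr = {0, 2, 3}.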

Outputs

Name Type Description
return value mjtNum Dot product result (for mju_dotSparse_avx)
res0, res1, res2 mjtNum* Three dot product results (for mju_dotSparseX3_avx)
res mjtNum* Result vector for matrix-vector multiply or scaled addition
return value int 1 if the vectors are equal, 0 otherwise (for mju_compare_avx)
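The two remaining output contracts, the in-place scaled addition and the equality test, can be sketched as below. Both function names are hypothetical, `double` stands in for `mjtNum`, and the `__AVX__` and `__SSE2__` guards are added here so the block falls back to scalar loops where those instruction sets are not enabled. The scaled addition uses separate multiply and add instructions, which is an assumption about how "multiply-add" is realized.

```c
#if defined(__AVX__) || defined(__SSE2__)
#include <immintrin.h>
#endif

// Hypothetical sketch: res = res*scl1 + vec*scl2, written in place.
static void add_to_scl_scl_sketch(double* res, const double* vec,
                                  double scl1, double scl2, int n) {
  int i = 0;
#if defined(__AVX__)
  __m256d s1 = _mm256_set1_pd(scl1);
  __m256d s2 = _mm256_set1_pd(scl2);
  for (; i + 4 <= n; i += 4) {
    __m256d r = _mm256_mul_pd(_mm256_loadu_pd(res + i), s1);
    __m256d v = _mm256_mul_pd(_mm256_loadu_pd(vec + i), s2);
    _mm256_storeu_pd(res + i, _mm256_add_pd(r, v));
  }
#endif
  // scalar tail loop
  for (; i < n; i++) {
    res[i] = res[i] * scl1 + vec[i] * scl2;
  }
}

// Hypothetical sketch: return 1 if the int vectors are equal, 0 otherwise.
static int compare_sketch(const int* vec1, const int* vec2, int n) {
  int i = 0;
#if defined(__SSE2__)
  for (; i + 4 <= n; i += 4) {
    __m128i a = _mm_loadu_si128((const __m128i*)(vec1 + i));
    __m128i b = _mm_loadu_si128((const __m128i*)(vec2 + i));
    // the byte mask is 0xFFFF only when all four 32-bit lanes match
    if (_mm_movemask_epi8(_mm_cmpeq_epi32(a, b)) != 0xFFFF) {
      return 0;
    }
  }
#endif
  // scalar tail loop
  for (; i < n; i++) {
    if (vec1[i] != vec2[i]) {
      return 0;
    }
  }
  return 1;
}
```

The comparison returns as soon as a mismatching chunk is found, so unequal vectors that differ early cost very little.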

Related Pages
