Implementation:Alibaba MNN WinogradGenerateCL


Property | Value
Page Type | Implementation
Repository | Alibaba MNN
Source File | backupcode/winogradGenerateCL.cpp (281 lines)
Language | C++
Domains | GPU_Computing, Convolution, Code_Generation
Date | 2026-02-10

Overview

⚠️ DEPRECATED: This tool resides in the backupcode/ directory and has been superseded by pre-generated static OpenCL kernels in source/backend/opencl/execution/cl/. See Alibaba_MNN_Warning_Deprecated_Winograd_Codegen for details.

WinogradGenerateCL is a command-line code generation tool that produces OpenCL kernel source files for Winograd convolution transforms. The tool takes numerical parameters describing the desired transform configuration (unit size, kernel size, and interpolation factor) and emits optimized .cl kernel files suitable for GPU-accelerated inference.

The Winograd algorithm reduces the number of multiplications needed for convolution by transforming inputs and weights into a special domain where the convolution becomes element-wise multiplication. This tool automates the generation of the GPU kernel code that performs the source transform (input tile transformation) and destination transform (output reconstruction) on OpenCL-capable devices.
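To make the multiplication saving concrete, here is a minimal 1D sketch of F(2,3) next to the direct computation. The function names `directF23` and `winogradF23` are illustrative and do not come from the MNN source, and the coefficients are the commonly quoted textbook ones, which may differ from what the tool derives for a given interpolation factor.

```cpp
#include <array>

// Direct 1D correlation: 2 outputs from 4 inputs and a 3-tap filter.
// Uses 6 multiplications.
std::array<float, 2> directF23(const std::array<float, 4>& d,
                               const std::array<float, 3>& g) {
    return {d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]};
}

// Same result via the textbook F(2,3) transforms.
// Only 4 multiplications involve the input data.
std::array<float, 2> winogradF23(const std::array<float, 4>& d,
                                 const std::array<float, 3>& g) {
    // Weight transform (can be precomputed once per filter).
    const float u0 = g[0];
    const float u1 = 0.5f * (g[0] + g[1] + g[2]);
    const float u2 = 0.5f * (g[0] - g[1] + g[2]);
    const float u3 = g[2];

    // Source transform: additions and subtractions only.
    const float v0 = d[0] - d[2];
    const float v1 = d[1] + d[2];
    const float v2 = d[2] - d[1];
    const float v3 = d[1] - d[3];

    // Element-wise multiplication in the transformed domain.
    const float m0 = u0 * v0;
    const float m1 = u1 * v1;
    const float m2 = u2 * v2;
    const float m3 = u3 * v3;

    // Destination transform: reconstruct the two outputs.
    return {m0 + m1 + m2, m1 - m2 - m3};
}
```

The 2D transforms that the generated kernels perform apply the same idea along both axes of a tile.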

The generated kernels embed precomputed transform matrices as literal constants, avoiding the runtime overhead of matrix construction and enabling the OpenCL compiler to apply constant folding and other optimizations.

Code Reference

Primary Entry Point

Function | Location | Description
main(int argc, const char* argv[]) | backupcode/winogradGenerateCL.cpp:L92-281 | Parses CLI arguments, computes Winograd transform matrices using WinogradGenerater, and writes OpenCL kernel source to output files.
_printFloat(ostream&, float) | backupcode/winogradGenerateCL.cpp:L84-90 | Helper that formats floating-point values for embedding in generated kernel source, handling precision and special-case formatting.
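The body of `_printFloat` is not reproduced on this page. As a rough illustration of the kind of formatting such a helper has to do, the sketch below turns a float into a valid OpenCL literal. The name `formatFloatLiteral` and its exact rules (9 significant digits, forced decimal point, `f` suffix) are assumptions for illustration, not the actual implementation.

```cpp
#include <cassert>
#include <cmath>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper, NOT the real _printFloat: formats a finite float as
// an OpenCL single-precision literal such as "0.5f" or "-1.0f".
std::string formatFloatLiteral(float v) {
    assert(std::isfinite(v));  // transform coefficients are assumed finite
    std::ostringstream os;
    os << std::setprecision(9) << v;  // 9 significant digits round-trip a float
    std::string s = os.str();
    // "2f" is not a valid literal, so integral values get an explicit ".0".
    if (s.find_first_of(".e") == std::string::npos) {
        s += ".0";
    }
    return s + "f";
}
```

Without the `f` suffix the literal would be a double, which some OpenCL devices do not support.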

Function Signature

// backupcode/winogradGenerateCL.cpp:L92
int main(int argc, const char* argv[])

Key Includes

#include <MNN/MNNDefine.h>
#include "math/Matrix.hpp"
#include "math/WingoradGenerater.hpp"

I/O Contract

Inputs

Parameter | Type | Description
unit | int (CLI arg 1) | The output tile size m in the Winograd F(m, r) formulation. Determines how many output elements are computed per tile.
kernelSize | int (CLI arg 2) | The convolution kernel size r. Combined with unit, determines the input tile size as unit + kernelSize - 1.
interp | float (CLI arg 3) | Interpolation factor controlling the choice of transform interpolation points. Affects numerical stability and accuracy of the generated transforms.
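The relationship between the three arguments and the input tile size can be sketched as follows. The struct `WinogradConfig` and the functions `parseConfig` and `inputTileSize` are hypothetical names introduced here; the real tool's argument handling and validation may differ.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical container for the three CLI parameters.
struct WinogradConfig {
    int unit;        // output tile size m
    int kernelSize;  // convolution kernel size r
    float interp;    // interpolation factor
};

// Fills `out` from argv[1..3]. Returns false when fewer than three arguments
// follow the program name or when either size is not positive.
bool parseConfig(int argc, const char* argv[], WinogradConfig& out) {
    if (argc < 4) {
        return false;
    }
    out.unit = static_cast<int>(std::strtol(argv[1], nullptr, 10));
    out.kernelSize = static_cast<int>(std::strtol(argv[2], nullptr, 10));
    out.interp = std::strtof(argv[3], nullptr);
    return out.unit > 0 && out.kernelSize > 0;
}

// Input tile edge length: unit + kernelSize - 1.
int inputTileSize(const WinogradConfig& c) {
    assert(c.unit > 0 && c.kernelSize > 0);
    return c.unit + c.kernelSize - 1;
}
```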

Outputs

Output | Format | Description
OpenCL source transform kernel | .cl file | Kernel performing B^T * d * B to transform input tiles into the Winograd domain.
OpenCL destination transform kernel | .cl file | Kernel performing A^T * M * A to reconstruct output tiles from element-wise multiplication results.
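For reference, the two kernels fit into the standard 2D Winograd formulation written out below. The symbols g (filter tile) and G (weight transform matrix) do not appear on this page and follow the usual textbook notation. The concrete matrices are the commonly quoted F(2,3) ones for interpolation points 0, 1 and -1; the matrices the tool emits depend on the interp argument and may differ from these.

```latex
% Full pipeline: d is the input tile, g the filter tile, Y the output tile.
M = \left(G\,g\,G^{T}\right) \odot \left(B^{T}\,d\,B\right),
\qquad
Y = A^{T}\,M\,A

% Commonly quoted F(2,3) matrices (assumed here for illustration):
B^{T} =
\begin{bmatrix}
1 & 0 & -1 & 0 \\
0 & 1 & 1 & 0 \\
0 & -1 & 1 & 0 \\
0 & 1 & 0 & -1
\end{bmatrix},
\quad
G =
\begin{bmatrix}
1 & 0 & 0 \\
\tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{2} \\
\tfrac{1}{2} & -\tfrac{1}{2} & \tfrac{1}{2} \\
0 & 0 & 1
\end{bmatrix},
\quad
A^{T} =
\begin{bmatrix}
1 & 1 & 1 & 0 \\
0 & 1 & -1 & -1
\end{bmatrix}
```

Here \odot denotes element-wise multiplication; the source transform kernel computes the right-hand factor of M and the destination transform kernel computes Y.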

Usage Examples

Basic Invocation

Generate Winograd F(2,3) kernels (2x2 output tile, 3x3 kernel):

./winogradGenerateCL 2 3 0.5

This produces OpenCL kernel files containing the source and destination transforms for a 4x4 input tile (2 + 3 - 1 = 4).

Generating F(4,3) Kernels

For larger output tiles with a 3x3 convolution kernel:

./winogradGenerateCL 4 3 1.0

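This generates transforms operating on 6x6 input tiles (4 + 3 - 1 = 6). In the 2D case, F(4,3) needs 36 element-wise multiplications per tile instead of the 144 required by direct convolution (a 4x reduction, versus 2.25x for F(2,3)), at the cost of more expensive transforms.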
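The trade-off between the two configurations can be quantified with a small helper. The function `multiplicationReduction` is a hypothetical name, and the ratio it returns counts only the element-wise multiplication stage of a 2D tile, ignoring the cost of the transforms themselves.

```cpp
#include <cassert>

// Ratio of multiplications per 2D output tile: direct convolution
// (unit^2 * kernelSize^2) versus the Winograd element-wise stage
// ((unit + kernelSize - 1)^2). Transform cost is not included.
double multiplicationReduction(int unit, int kernelSize) {
    assert(unit > 0 && kernelSize > 0);
    const int tile = unit + kernelSize - 1;
    const double direct =
        static_cast<double>(unit) * unit * kernelSize * kernelSize;
    const double winograd = static_cast<double>(tile) * tile;
    return direct / winograd;
}
```

For example, `multiplicationReduction(2, 3)` evaluates 36 / 16 and `multiplicationReduction(4, 3)` evaluates 144 / 36.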

Internal Workflow

  1. Parse command-line arguments for unit size, kernel size, and interpolation factor.
  2. Instantiate WinogradGenerater with the specified parameters.
  3. Compute the source transform matrix B and destination transform matrix A.
  4. Iterate over matrix elements, using _printFloat to emit each coefficient as an OpenCL literal.
  5. Write the complete kernel source to output .cl files, embedding the transform matrices as inline constants.
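Steps 4 and 5 above can be pictured with the following sketch, which turns each row of a transform matrix into one line of OpenCL source. The function `emitRowStatements`, the variable naming scheme, and the choice to drop zero terms and skip multiplications by 1 are assumptions made for illustration; the text the real tool emits is not shown on this page and may be laid out differently.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical generator: emits one statement per matrix row, for example
// "float d0 = s0 - s2;". Zero coefficients are skipped, and coefficients
// of +1 or -1 are emitted without a multiplication.
std::string emitRowStatements(const std::vector<std::vector<float>>& m,
                              const std::string& src,
                              const std::string& dst) {
    std::ostringstream os;
    os << std::showpoint;  // always print a decimal point, e.g. "0.500000"
    for (std::size_t r = 0; r < m.size(); ++r) {
        os << "float " << dst << r << " =";
        bool first = true;
        for (std::size_t c = 0; c < m[r].size(); ++c) {
            const float v = m[r][c];
            if (v == 0.0f) {
                continue;  // a zero term contributes nothing
            }
            const float mag = v < 0.0f ? -v : v;
            if (first) {
                os << (v < 0.0f ? " -" : " ");
            } else {
                os << (v < 0.0f ? " - " : " + ");
            }
            if (mag != 1.0f) {
                os << mag << "f * ";  // coefficient embedded as a literal
            }
            os << src << c;
            first = false;
        }
        if (first) {
            os << " 0.0f";  // the whole row was zero
        }
        os << ";\n";
    }
    return os.str();
}
```

Because the coefficients are known at generation time, the emitted source contains only the arithmetic that is actually needed for the chosen configuration.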
