Install ONNX Runtime generate() API

Python package installation

Note: install only one of these packages (CPU, DirectML, CUDA) in your environment.
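To check the note above against an existing environment, the following is a minimal sketch that lists which of the three variants are currently installed. It uses only the standard library (`importlib.metadata`). The distribution names are the ones from the pip commands in this section, and the helper name `installed_variants` is hypothetical and not part of the package.

```python
from importlib import metadata

# Distribution names taken from the pip commands in this section.
VARIANTS = (
    "onnxruntime-genai",
    "onnxruntime-genai-directml",
    "onnxruntime-genai-cuda",
)


def installed_variants(names=VARIANTS):
    """Return {distribution name: version} for each variant found in this environment."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This variant is not installed; skip it.
            continue
    return found


found = installed_variants()
if len(found) > 1:
    print("More than one variant is installed; keep only one:", found)
elif found:
    print("Installed variant:", found)
else:
    print("No onnxruntime-genai variant is installed")
```

If more than one variant is reported, uninstalling the extras with `pip uninstall` before continuing is the likely fix.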

CPU

pip install onnxruntime-genai

DirectML

pip install onnxruntime-genai-directml

CUDA

The CUDA variant of onnxruntime-genai requires the CUDA Toolkit to be installed.

The CUDA Toolkit can be downloaded from the CUDA Toolkit Archive.

Ensure that the CUDA_PATH environment variable is set to the location of your CUDA installation.
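A quick way to confirm this before installing is to inspect the variable from Python. The sketch below only checks that `CUDA_PATH` is set and points at an existing directory. It does not verify the CUDA version or the contents of that directory, and the helper name `check_cuda_path` is hypothetical.

```python
import os
from pathlib import Path


def check_cuda_path(env=os.environ):
    """Return (ok, message) describing whether CUDA_PATH points at a directory."""
    value = env.get("CUDA_PATH")
    if not value:
        return False, "CUDA_PATH is not set"
    if not Path(value).is_dir():
        return False, f"CUDA_PATH is set to {value!r}, which is not a directory"
    return True, f"CUDA_PATH = {value}"


ok, message = check_cuda_path()
print(message)
```

Passing a plain dictionary instead of `os.environ` makes it possible to try the check against a candidate path without changing the real environment.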

CUDA 11

pip install onnxruntime-genai-cuda --index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-11/pypi/simple/

CUDA 12

pip install onnxruntime-genai-cuda

NuGet package installation

Note: install only one of these packages (CPU, DirectML, CUDA) in your project.

Prerequisites

ONNX Runtime dependency

ONNX Runtime generate() versions 0.3.0 and earlier bundled the core ONNX Runtime binaries. From version 0.4.0 onward, the packages are separate, which allows a more flexible developer experience.

CPU

dotnet add package Microsoft.ML.OnnxRuntimeGenAI

CUDA

Note: only CUDA 11 is supported for versions 0.3.0 and earlier, and only CUDA 12 is supported for versions 0.4.0 and later.

dotnet add package Microsoft.ML.OnnxRuntimeGenAI.Cuda
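The version rules stated in this section (bundled ONNX Runtime and CUDA 11 up to 0.3.0, separate packages and CUDA 12 from 0.4.0) can be written down as a small lookup. This is an illustrative sketch, not part of any package: the function names are hypothetical, only plain `X.Y.Z` version strings are handled, and any version below 0.4.0 is treated like 0.3.0, which is an assumption because the text does not describe versions between the two.

```python
def parse_version(text):
    """Parse a plain 'X.Y.Z' version string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def nuget_requirements(version):
    """Summarize the rules from this section for a given package version.

    Assumption: every version below 0.4.0 follows the '0.3.0 and earlier' rules.
    """
    separate = parse_version(version) >= (0, 4, 0)
    return {
        # 0.3.0 and earlier shipped the core ONNX Runtime binaries in the package.
        "bundles_onnxruntime": not separate,
        # CUDA 11 only up to 0.3.0, CUDA 12 only from 0.4.0.
        "cuda_major": 12 if separate else 11,
    }


print(nuget_requirements("0.3.0"))
print(nuget_requirements("0.4.0"))
```

Comparing integer tuples rather than raw strings keeps versions such as 0.10.0 ordered after 0.4.0.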

DirectML

dotnet add package Microsoft.ML.OnnxRuntimeGenAI.DirectML