ONNX Runtime generate() C# API

Note: this API is in preview and is subject to change.

Overview

Model class

Constructor

public Model(string modelPath)

Generate method

public Sequences Generate(GeneratorParams generatorParams)

Tokenizer class

Constructor

public Tokenizer(Model model)

Encode method

public Sequences Encode(string str)

Encode batch method

public Sequences EncodeBatch(string[] strings)

Decode method

public string Decode(ReadOnlySpan<int> sequence)

Decode batch method

public string[] DecodeBatch(Sequences sequences)

Create stream method

public TokenizerStream CreateStream()
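
As an illustration of the Tokenizer methods above, the following minimal sketch encodes a prompt and decodes it back to text. It makes several assumptions that this reference does not state: that the types live in the Microsoft.ML.OnnxRuntimeGenAI namespace, that Tokenizer and Sequences are disposable in the same way as Model, and that the model folder is passed as the first command-line argument.

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI; // assumed namespace

// Assumption: the model folder is supplied as the first command-line argument.
string modelPath = args[0];

using var model = new Model(modelPath);
using var tokenizer = new Tokenizer(model);

// Encode a single prompt; the result holds one sequence of token ids.
using Sequences encoded = tokenizer.Encode("Hello, world");
ReadOnlySpan<int> tokenIds = encoded[0];
Console.WriteLine($"Token count: {tokenIds.Length}");

// Decode the token ids back into text.
string roundTrip = tokenizer.Decode(tokenIds);
Console.WriteLine(roundTrip);
```

The decoded text may not match the input exactly if the tokenizer normalizes whitespace or adds special tokens.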

TokenizerStream class

Decode method

public string Decode(int token)

GeneratorParams class

Constructor

public GeneratorParams(Model model)

Set search option (double)

public void SetSearchOption(string searchOption, double value)

Set search option (bool) method

public void SetSearchOption(string searchOption, bool value)

Try graph capture with max batch size

public void TryGraphCaptureWithMaxBatchSize(int maxBatchSize)

Set input ids method

public void SetInputIDs(ReadOnlySpan<int> inputIDs, ulong sequenceLength, ulong batchSize)

Set input sequences method

public void SetInputSequences(Sequences sequences)

Set model inputs

public void SetModelInput(string name, Tensor value)
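
The GeneratorParams methods above can be combined with Model.Generate to run a whole batch in one call. The sketch below is one possible arrangement, not a definitive recipe. The option names "max_length" and "do_sample" and the namespace are assumptions that do not appear in this reference, and the prompts are illustrative only.

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI; // assumed namespace

// Assumption: the model folder is supplied as the first command-line argument.
string modelPath = args[0];

using var model = new Model(modelPath);
using var tokenizer = new Tokenizer(model);

// Encode two illustrative prompts as one batch.
string[] prompts = { "What is ONNX?", "Name three colors." };
using Sequences inputSequences = tokenizer.EncodeBatch(prompts);

using var generatorParams = new GeneratorParams(model);
// Assumed option names; check the search options supported by your model.
generatorParams.SetSearchOption("max_length", 128.0); // double overload
generatorParams.SetSearchOption("do_sample", false);  // bool overload
generatorParams.SetInputSequences(inputSequences);

// Run generation to completion, then decode every output sequence.
using Sequences outputSequences = model.Generate(generatorParams);
string[] completions = tokenizer.DecodeBatch(outputSequences);
for (int i = 0; i < completions.Length; i++)
{
    Console.WriteLine($"[{i}] {completions[i]}");
}
```

Model.Generate returns only after every sequence is finished, so this form suits batch jobs better than interactive output.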

Generator class

Constructor

public Generator(Model model, GeneratorParams generatorParams)

Is done method

public bool IsDone()

Compute logits

public void ComputeLogits()

Generate next token method

public void GenerateNextToken()

Get sequence

public ReadOnlySpan<int> GetSequence(ulong index)
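
The Generator methods above are typically called in a loop, one token per iteration. The following sketch shows one way to stream text as it is produced, using only the signatures listed in this reference. The namespace, the "max_length" option name, and the prompt are assumptions, and because the API is in preview the loop shape may differ in other versions.

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI; // assumed namespace

// Assumption: the model folder is supplied as the first command-line argument.
string modelPath = args[0];

using var model = new Model(modelPath);
using var tokenizer = new Tokenizer(model);
using var tokenizerStream = tokenizer.CreateStream();

// Encode an illustrative prompt and hand it to the generator parameters.
using Sequences promptSequences = tokenizer.Encode("Tell me a short story.");
using var generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 200.0); // assumed option name
generatorParams.SetInputSequences(promptSequences);

using var generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    // Produce exactly one new token per iteration.
    generator.ComputeLogits();
    generator.GenerateNextToken();

    // The newest token is the last element of sequence 0.
    ReadOnlySpan<int> sequence = generator.GetSequence(0);
    int newToken = sequence[sequence.Length - 1];

    // TokenizerStream.Decode turns a single token into its text fragment.
    Console.Write(tokenizerStream.Decode(newToken));
}
Console.WriteLine();
```

Decoding through a TokenizerStream rather than calling Tokenizer.Decode on the whole sequence avoids re-decoding earlier tokens on every step.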

Set active adapter

Sets the active adapter on this Generator instance.

using var model = new Model(modelPath);
using var genParams = new GeneratorParams(model);
using var generator = new Generator(model, genParams);
using var adapters = new Adapters(model);
string adapterName = "...";

generator.SetActiveAdapter(adapters, adapterName);

Parameters

  • adapters: the previously created Adapters object
  • adapterName: the name of the adapter to activate

Return value

void

Exception

Throws an exception on error.

Sequences class

Num sequences member

public ulong NumSequences { get { return _numSequences; } }

[] operator

public ReadOnlySpan<int> this[ulong sequenceIndex]
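
To show how NumSequences and the indexer fit together, this short sketch walks every sequence in a batch. It assumes the Microsoft.ML.OnnxRuntimeGenAI namespace and a model folder passed on the command line. The prompts are illustrative only.

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI; // assumed namespace

// Assumption: the model folder is supplied as the first command-line argument.
string modelPath = args[0];

using var model = new Model(modelPath);
using var tokenizer = new Tokenizer(model);
using Sequences sequences = tokenizer.EncodeBatch(new[] { "first prompt", "second prompt" });

// NumSequences and the indexer both use ulong, so the loop counter does too.
for (ulong i = 0; i < sequences.NumSequences; i++)
{
    ReadOnlySpan<int> tokenIds = sequences[i];
    Console.WriteLine($"Sequence {i}: {tokenIds.Length} tokens");
}
```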

Adapters class

This API is used to load and switch fine-tuned adapters, such as LoRA adapters.

Constructor

Constructs an instance of the Adapters class.

using var model = new Model(modelPath);

using var adapters = new Adapters(model);

Parameters

  • model: a previously constructed Model instance

Load Adapter method

Loads an adapter file from disk.

string adapterPath = "..."; // path to the adapter file on disk
string adapterName = "..."; // identifier used to refer to this adapter

adapters.LoadAdapter(adapterPath, adapterName);

Parameters

  • adapterPath: the path to the adapter file on disk
  • adapterName: a string identifier used to refer to the adapter in subsequent methods

Return value

void

Unload Adapter method

Unloads an adapter file from memory.

adapters.UnloadAdapter(adapterName);

Parameters

  • adapterName: the name of the adapter to unload

Return value

void

Exception

Throws an exception on error.
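
The adapter snippets above can be tied together into a single flow: load an adapter, activate it on a generator, and generate with it. The sketch below is one possible arrangement under stated assumptions. The namespace, the "max_length" option name, the prompt, and the adapter name "my_adapter" are all hypothetical, and the model folder and adapter file are taken from the command line.

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI; // assumed namespace

// Assumptions: args[0] is the model folder, args[1] is the adapter file.
string modelPath = args[0];
string adapterPath = args[1];
string adapterName = "my_adapter"; // hypothetical identifier chosen by the caller

using var model = new Model(modelPath);

// Load the adapter once; it can then be activated by name.
using var adapters = new Adapters(model);
adapters.LoadAdapter(adapterPath, adapterName);

using var tokenizer = new Tokenizer(model);
using Sequences promptSequences = tokenizer.Encode("Summarize: ONNX is an open model format.");

using var generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 128.0); // assumed option name
generatorParams.SetInputSequences(promptSequences);

// Activate the adapter on this generator before producing tokens.
using var generator = new Generator(model, generatorParams);
generator.SetActiveAdapter(adapters, adapterName);

while (!generator.IsDone())
{
    generator.ComputeLogits();
    generator.GenerateNextToken();
}

// Decode the full sequence (prompt plus generated tokens).
Console.WriteLine(tokenizer.Decode(generator.GetSequence(0)));
```

Because the generator is declared after the Adapters object, the using declarations dispose it first, so the adapter is not released while a generator still refers to it.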