llama_chat_apply_template function

@Native<Int32 Function(Pointer<Char>, Pointer<llama_chat_message>, Size, Bool, Pointer<Char>, Int32)>()
int llama_chat_apply_template(
  Pointer<Char> tmpl,
  Pointer<llama_chat_message> chat,
  int n_msg,
  bool add_ass,
  Pointer<Char> buf,
  int length,
)

Applies a chat template to a list of messages. Inspired by Hugging Face's apply_chat_template() in Python.

NOTE: This function does not use a Jinja parser. It only supports a pre-defined list of templates. See more: https://github.com/ggml-org/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template

@param tmpl A Jinja template to use for this chat.
@param chat Pointer to a list of llama_chat_message.
@param n_msg Number of llama_chat_message entries in this chat.
@param add_ass Whether to end the prompt with the token(s) that indicate the start of an assistant message.
@param buf A buffer to hold the formatted output prompt. The recommended allocation size is 2 * (total number of characters of all messages).
@param length The size of the allocated buffer.
@return The total number of bytes of the formatted prompt. If it is larger than the size of the buffer, reallocate the buffer and re-apply the template.
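
The sizing advice above (allocate roughly twice the message length, then reallocate and re-apply if the returned size exceeds the buffer) could be wrapped in a helper along these lines. This is a minimal sketch, not part of the bindings: the `applyChatTemplate` helper, the `package:llama_bindings/llama_bindings.dart` import path, and the use of `package:ffi` for allocation are all assumptions, as is the assumption that `llama_chat_message` exposes `role` and `content` fields of type `Pointer<Char>`. Treating a negative return value as a failure is also an assumption that the description above does not state.

```dart
import 'dart:ffi' as ffi;

import 'package:ffi/ffi.dart';
// Assumption: replace with the library that actually exports these bindings.
import 'package:llama_bindings/llama_bindings.dart';

/// Hypothetical helper: formats [messages] (role, content) with [template],
/// growing the output buffer once if the first call reports a larger size.
String applyChatTemplate(
  String template,
  List<(String, String)> messages, {
  bool addAssistant = true,
}) {
  if (messages.isEmpty) {
    throw ArgumentError.value(messages, 'messages', 'must not be empty');
  }

  final owned = <ffi.Pointer<ffi.NativeType>>[]; // native memory to free
  ffi.Pointer<ffi.Char> buf = ffi.nullptr;
  try {
    final tmpl = template.toNativeUtf8();
    owned.add(tmpl);

    final chat = calloc<llama_chat_message>(messages.length);
    owned.add(chat);

    var totalChars = 0;
    for (var i = 0; i < messages.length; i++) {
      final (role, content) = messages[i];
      final rolePtr = role.toNativeUtf8();
      final contentPtr = content.toNativeUtf8();
      owned
        ..add(rolePtr)
        ..add(contentPtr);
      // Assumed struct fields: role and content as Pointer<Char>.
      chat[i].role = rolePtr.cast<ffi.Char>();
      chat[i].content = contentPtr.cast<ffi.Char>();
      totalChars += role.length + content.length;
    }

    // First attempt with the recommended size of 2 * total characters.
    var capacity = 2 * totalChars + 1;
    buf = calloc<ffi.Uint8>(capacity).cast<ffi.Char>();
    var written = llama_chat_apply_template(
        tmpl.cast<ffi.Char>(), chat, messages.length, addAssistant, buf, capacity);

    // The return value is the full formatted size; retry if it did not fit.
    if (written > capacity) {
      calloc.free(buf);
      capacity = written;
      buf = calloc<ffi.Uint8>(capacity).cast<ffi.Char>();
      written = llama_chat_apply_template(
          tmpl.cast<ffi.Char>(), chat, messages.length, addAssistant, buf, capacity);
    }

    // Assumption: a negative result signals a failure such as an
    // unsupported template.
    if (written < 0) {
      throw StateError('llama_chat_apply_template returned $written');
    }

    // Decode exactly `written` bytes; do not rely on a NUL terminator.
    return buf.cast<Utf8>().toDartString(length: written);
  } finally {
    if (buf != ffi.nullptr) calloc.free(buf);
    for (final p in owned) {
      calloc.free(p);
    }
  }
}
```

A call might look like `applyChatTemplate(tmpl, [('user', 'Hello')])`, where `tmpl` is a template string from the supported list linked above. The initial capacity is only a first guess, because `String.length` counts UTF-16 code units rather than bytes, so the retry path handles any underestimate.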

Implementation

@ffi.Native<
  ffi.Int32 Function(
    ffi.Pointer<ffi.Char>,
    ffi.Pointer<llama_chat_message>,
    ffi.Size,
    ffi.Bool,
    ffi.Pointer<ffi.Char>,
    ffi.Int32,
  )
>()
external int llama_chat_apply_template(
  ffi.Pointer<ffi.Char> tmpl,
  ffi.Pointer<llama_chat_message> chat,
  int n_msg,
  bool add_ass,
  ffi.Pointer<ffi.Char> buf,
  int length,
);