llama_chat_apply_template function
Applies a chat template. Inspired by Hugging Face's apply_chat_template() in Python.
NOTE: This function does not use a Jinja parser. It only supports a pre-defined list of templates. See more: https://github.com/ggml-org/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template
@param tmpl A Jinja template to use for this chat.
@param chat Pointer to a list of multiple llama_chat_message.
@param n_msg Number of llama_chat_message in this chat.
@param add_ass Whether to end the prompt with the token(s) that indicate the start of an assistant message.
@param buf A buffer to hold the output formatted prompt. The recommended alloc size is 2 * (total number of characters of all messages).
@param length The size of the allocated buffer.
@return The total number of bytes of the formatted prompt. If it is larger than the size of the buffer, you may need to re-alloc it and then re-apply the template.
Implementation
@ffi.Native<
  ffi.Int32 Function(
    ffi.Pointer<ffi.Char>,
    ffi.Pointer<llama_chat_message>,
    ffi.Size,
    ffi.Bool,
    ffi.Pointer<ffi.Char>,
    ffi.Int32,
  )
>()
external int llama_chat_apply_template(
  ffi.Pointer<ffi.Char> tmpl,
  ffi.Pointer<llama_chat_message> chat,
  int n_msg,
  bool add_ass,
  ffi.Pointer<ffi.Char> buf,
  int length,
);
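Example

The @return note above implies a call, check, grow, and re-apply pattern. The following is a minimal sketch of that pattern, not a definitive implementation. It assumes `package:ffi` is available for `calloc`, that the bindings are importable from a hypothetical `llama_bindings.dart`, and that a negative return value signals failure (for example, an unsupported template), which this page does not state. `formatChat`, `addAss`, and `initialSize` are illustrative names only.

```dart
import 'dart:convert' show utf8;
import 'dart:ffi' as ffi;

import 'package:ffi/ffi.dart' show calloc;

// Assumption: hypothetical path to the generated bindings that declare
// llama_chat_apply_template and llama_chat_message.
import 'llama_bindings.dart';

/// Formats the [nMsg] messages at [chat] with [tmpl] and returns the prompt.
/// The caller owns [tmpl] and [chat]; this helper only manages the output buffer.
String formatChat(
  ffi.Pointer<ffi.Char> tmpl,
  ffi.Pointer<llama_chat_message> chat,
  int nMsg, {
  bool addAss = true,
  // Hypothetical default. The documentation recommends
  // 2 * (total number of characters of all messages).
  int initialSize = 1024,
}) {
  var size = initialSize;
  var buf = calloc<ffi.Uint8>(size);
  try {
    var n = llama_chat_apply_template(
        tmpl, chat, nMsg, addAss, buf.cast<ffi.Char>(), size);

    if (n > size) {
      // The buffer was too small: grow it to the reported size and re-apply.
      final bigger = calloc<ffi.Uint8>(n);
      calloc.free(buf);
      buf = bigger;
      size = n;
      n = llama_chat_apply_template(
          tmpl, chat, nMsg, addAss, buf.cast<ffi.Char>(), size);
    }

    // Assumption: a negative value means the template could not be applied.
    if (n < 0 || n > size) {
      throw StateError('llama_chat_apply_template failed (returned $n)');
    }

    // The return value is a byte count, so decode exactly n bytes as UTF-8.
    return utf8.decode(buf.asTypedList(n));
  } finally {
    calloc.free(buf);
  }
}
```

The helper decodes by length instead of searching for a terminating NUL, since the return value already gives the number of bytes in the formatted prompt.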