llama_chat_apply_template function - llamadart library

Applies a chat template to a list of messages. Inspired by Hugging Face's apply_chat_template() in Python.

NOTE: This function does not use a Jinja parser. It only supports a pre-defined list of templates. See more: https://github.com/ggml-org/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template

@param tmpl A Jinja template to use for this chat.
@param chat Pointer to a list of multiple llama_chat_message
@param n_msg Number of llama_chat_message in this chat
@param add_ass Whether to end the prompt with the token(s) that indicate the start of an assistant message.
@param buf A buffer to hold the output formatted prompt. The recommended alloc size is 2 * (total number of characters of all messages)
@param length The size of the allocated buffer
@return The total number of bytes of the formatted prompt. If it is larger than the size of the buffer, you may need to re-allocate the buffer and then re-apply the template.
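The re-allocate-and-retry behavior described under @return can be sketched as below. This is a minimal illustration, not part of llamadart: the helper name applyWithRetry and the ApplyFn typedef are hypothetical, the allocator comes from package:ffi, and treating a negative return value as a failure is an assumption about the native side rather than something stated above.

```dart
import 'dart:convert';
import 'dart:ffi' as ffi;

import 'package:ffi/ffi.dart';

/// Hypothetical callback type: writes a formatted prompt into [buf] and
/// returns the number of bytes the full prompt needs.
typedef ApplyFn = int Function(ffi.Pointer<ffi.Char> buf, int length);

/// Hypothetical helper: calls [apply] with a buffer of [initialSize] bytes,
/// grows the buffer once if the reported size does not fit, and returns the
/// prompt decoded as UTF-8.
String applyWithRetry(ApplyFn apply, int initialSize) {
  var size = initialSize;
  var buf = calloc<ffi.Uint8>(size);
  try {
    var needed = apply(buf.cast<ffi.Char>(), size);
    if (needed < 0) {
      // Assumption: a negative result signals that the template failed.
      throw StateError('chat template could not be applied');
    }
    if (needed > size) {
      // The prompt did not fit: allocate the reported size and re-apply.
      final bigger = calloc<ffi.Uint8>(needed);
      calloc.free(buf);
      buf = bigger;
      size = needed;
      needed = apply(buf.cast<ffi.Char>(), size);
      if (needed < 0 || needed > size) {
        throw StateError('chat template could not be applied on retry');
      }
    }
    // utf8.decode copies the bytes, so freeing the buffer afterwards is safe.
    return utf8.decode(buf.asTypedList(needed));
  } finally {
    calloc.free(buf);
  }
}
```

With the binding documented on this page, the callback would wrap the native call, for example (buf, len) => llama_chat_apply_template(tmpl, chat, nMsg, true, buf, len), where tmpl, chat, and nMsg are prepared by the caller, and the initial size would follow the 2 * (total characters) recommendation.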

Implementation

@ffi.Native<
  ffi.Int32 Function(
    ffi.Pointer<ffi.Char>,
    ffi.Pointer<llama_chat_message>,
    ffi.Size,
    ffi.Bool,
    ffi.Pointer<ffi.Char>,
    ffi.Int32,
  )
>()
external int llama_chat_apply_template(
  ffi.Pointer<ffi.Char> tmpl,
  ffi.Pointer<llama_chat_message> chat,
  int n_msg,
  bool add_ass,
  ffi.Pointer<ffi.Char> buf,
  int length,
);