nanochat-german-tokenizer / tokenizer_config.json
feat: add tokenizer config
35fae2a verified
{
  "tokenizer_class": "PreTrainedTokenizerFast",
  "bos_token": "<|bos|>",
  "eos_token": "<|assistant_end|>",
  "pad_token": "<|assistant_end|>",
  "additional_special_tokens": [
    "<|user_start|>",
    "<|user_end|>",
    "<|assistant_start|>",
    "<|python_start|>",
    "<|python_end|>",
    "<|output_start|>",
    "<|output_end|>"
  ],
  "chat_template": "chat_template.jinja",
  "model_input_names": [
    "input_ids",
    "attention_mask"
  ]
}
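
The config above can be sanity-checked without loading any tokenizer library. The following is a minimal sketch that uses only the Python standard library: it parses a copy of the JSON, gathers every distinct special token, and notes that the padding token reuses the end-of-sequence token. The helper name `collect_special_tokens` and the variable `CONFIG_TEXT` are hypothetical and not part of the repository.

```python
import json

# Copy of the tokenizer config shown above, embedded so the sketch is self-contained.
CONFIG_TEXT = """
{
  "tokenizer_class": "PreTrainedTokenizerFast",
  "bos_token": "<|bos|>",
  "eos_token": "<|assistant_end|>",
  "pad_token": "<|assistant_end|>",
  "additional_special_tokens": [
    "<|user_start|>",
    "<|user_end|>",
    "<|assistant_start|>",
    "<|python_start|>",
    "<|python_end|>",
    "<|output_start|>",
    "<|output_end|>"
  ],
  "chat_template": "chat_template.jinja",
  "model_input_names": ["input_ids", "attention_mask"]
}
"""


def collect_special_tokens(config: dict) -> list[str]:
    """Return each distinct special token once, in order of first appearance.

    Hypothetical helper: it reads the bos/eos/pad entries first, then the
    additional_special_tokens list, and skips duplicates.
    """
    tokens: list[str] = []
    for key in ("bos_token", "eos_token", "pad_token"):
        value = config.get(key)
        if value is not None and value not in tokens:
            tokens.append(value)
    for value in config.get("additional_special_tokens", []):
        if value not in tokens:
            tokens.append(value)
    return tokens


config = json.loads(CONFIG_TEXT)
special_tokens = collect_special_tokens(config)

# In this config the pad token is the same string as the eos token.
pad_reuses_eos = config["pad_token"] == config["eos_token"]

print(special_tokens)
print("pad token reuses eos token:", pad_reuses_eos)
```

Because `pad_token` and `eos_token` are the same string here, padding positions cannot be told apart from end-of-sequence positions by token id alone; the `attention_mask` listed under `model_input_names` is what distinguishes them.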