The examined GGUF metadata is missing multiple provenance fields (e.g., author, license, source, training data) while carrying an extensive embedded chat template. In runtimes that render that template to build the system message, the template can strongly influence what the model claims about its own identity.
Practical implication: until a verifiable provenance/signing scheme exists, treat identity and policy claims produced by such a package as untrusted unless independently verified.
Downloads: `critical_kv.json` · `chat_template.txt`
These files are provided for independent review and reproducibility.
The table below lists selected keys relevant to provenance and chat behavior. Values are shown as extracted.
| Key | Value |
|---|---|
| `general.architecture` | gpt-oss |
| `general.name` | Openai_Gpt Oss 20b |
| `general.author` | `<missing>` |
| `general.description` | `<missing>` |
| `general.license` | `<missing>` |
| `general.source` | `<missing>` |
| `general.training_data` | `<missing>` |
| `general.system_prompt` | `<missing>` |
| `general.safety_model` | `<missing>` |
| `general.content_filter` | `<missing>` |
| `tokenizer.chat_template` | `<long_string:15934_bytes>` |
| `gpt-oss.context_length` | 131072 |
| `gpt-oss.block_count` | 24 |
| `gpt-oss.expert_count` | 32 |
| `gpt-oss.expert_used_count` | 4 |
Note: a key being present does not guarantee correctness; without cryptographic verification, KV content can be modified.
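To reproduce the presence check yourself, a short script can report which of these keys a GGUF file actually carries. The sketch below assumes the `gguf` Python package published alongside llama.cpp (`pip install gguf`) and its `GGUFReader.fields` mapping; the reader API has changed between versions, and the file path is hypothetical.

```python
# Sketch: report presence/absence of provenance-relevant GGUF keys.
# Assumes the `gguf` package from llama.cpp (pip install gguf);
# GGUFReader exposes parsed KV pairs via the `fields` mapping.
from gguf import GGUFReader

PROVENANCE_KEYS = [
    "general.author", "general.description", "general.license",
    "general.source", "general.training_data", "general.system_prompt",
    "general.safety_model", "general.content_filter",
    "tokenizer.chat_template",
]

reader = GGUFReader("model.gguf")  # hypothetical path
for key in PROVENANCE_KEYS:
    print(f"{key}: {'present' if key in reader.fields else '<missing>'}")
```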
The excerpt below highlights the portion of the template that sets a default `model_identity`. When a runtime renders this template, that default becomes part of the system prompt unless it is explicitly overridden.
```jinja
{%- endif -%}
{%- endmacro -%}
{#- System Message Construction ============================================ #}
{%- macro build_system_message() -%}
{%- if model_identity is not defined %}
{%- set model_identity = "You are ChatGPT, a large language model trained by OpenAI." %}
{%- endif %}
{{- model_identity + "
" }}
{{- "Knowledge cutoff: 2024-06
" }}
{{- "Current date: " + strftime_now("%Y-%m-%d") + "
" }}
{%- if reasoning_effort is not defined %}
```
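To see this behavior concretely, the fragment below renders a simplified replica of the macro above with plain Jinja2 (not the actual code path of any runtime): with no `model_identity` supplied, the ChatGPT default appears; passing one overrides it. The `strftime_now` helper is reimplemented here because it is provided by the runtime, not by Jinja2 itself.

```python
# Sketch: how the template's default identity behaves under Jinja2.
# The template string is a simplified replica of the macro above,
# not the full extracted chat template.
from datetime import datetime
from jinja2 import Environment

TEMPLATE = """\
{%- if model_identity is not defined %}
{%- set model_identity = "You are ChatGPT, a large language model trained by OpenAI." %}
{%- endif %}
{{- model_identity + "\\n" }}
{{- "Knowledge cutoff: 2024-06\\n" }}
{{- "Current date: " + strftime_now("%Y-%m-%d") + "\\n" }}
"""

env = Environment()
# strftime_now is runtime-provided in real deployments; supply an equivalent.
env.globals["strftime_now"] = lambda fmt: datetime.now().strftime(fmt)
tpl = env.from_string(TEMPLATE)

print(tpl.render())                                    # default ChatGPT identity
print(tpl.render(model_identity="You are LocalLLM."))  # explicitly overridden
```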
In local runtimes, “who the model is” is typically defined by a combination of:

- the package’s metadata keys (e.g., `general.name`), which are unauthenticated;
- the default identity baked into the embedded chat template, as in the excerpt above;
- any overrides the user or host application supplies at inference time.
If provenance fields are missing but the template asserts a well-known identity, non-expert users may reasonably misinterpret the model’s origin or guarantees.
For GGUF-distributed models, a minimal trust baseline should enable users to verify: who produced the package, when it was produced, and what inputs/transformations were applied (base weights, quantization settings, template changes).
A practical approach is detached signatures over the model file hash (e.g., SHA-256) plus a transparency log record, similar to modern software supply-chain practices.
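The sketch below shows the shape of such a scheme: hash the GGUF file, wrap the digest in a small provenance record answering the who/when/what questions above, and produce a detached Ed25519 signature over that record. It uses the `cryptography` package; the record fields, paths, and key handling are illustrative assumptions, and the transparency-log submission (e.g., to Rekor) is omitted.

```python
# Sketch: detached signature over a GGUF file hash plus a provenance record.
# Assumes `pip install cryptography`; field names and paths are illustrative.
import hashlib
import json
from datetime import datetime, timezone
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def sha256_file(path: str) -> str:
    """Stream the file so large model weights don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Provenance record answering: who produced it, when, from what inputs.
record = {
    "producer": "example-packager",                 # hypothetical identity
    "produced_at": datetime.now(timezone.utc).isoformat(),
    "artifact_sha256": sha256_file("model.gguf"),   # hypothetical path
    "base_weights": "openai/gpt-oss-20b",           # claimed input, illustrative
    "quantization": "MXFP4",                        # illustrative setting
    "chat_template_modified": False,
}
payload = json.dumps(record, sort_keys=True).encode()  # stable serialization

key = Ed25519PrivateKey.generate()  # real use: a long-lived, protected key
signature = key.sign(payload)       # detached signature, distributed alongside

# Verifier side: raises InvalidSignature if the record or file hash changed.
key.public_key().verify(signature, payload)
print("record verified; artifact sha256 =", record["artifact_sha256"])
```

Verification then reduces to recomputing the file hash, checking it against the signed record, and checking the signature against the producer’s published key; the transparency log adds an append-only public timestamp so a producer cannot quietly swap records later.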