
Facts: gpt-oss (GGUF) — Provenance fields & embedded Chat Template

Generated: 2025-12-16 11:55:35 · This page documents observable metadata and template content extracted from a GGUF file, without assuming any vendor identity.

Warning (operational)

The examined GGUF metadata is missing several provenance fields (author, license, source, training data), yet it carries an extensive embedded chat template. In runtimes that render this template into the system message, the template text can strongly influence what the model says about its own identity.

Practical implication: until a verifiable provenance/signing scheme exists, treat identity and policy claims produced by such a package as untrusted unless independently verified.

Artifacts

Files: critical_kv.json · chat_template.txt

These files are provided for independent review and reproducibility.

Observed GGUF KV fields (selected)

The table below lists selected keys relevant to provenance and chat behavior. Values are shown as extracted.

Key                         Value
general.architecture        gpt-oss
general.name                Openai_Gpt Oss 20b
general.author              <missing>
general.description         <missing>
general.license             <missing>
general.source              <missing>
general.training_data       <missing>
general.system_prompt       <missing>
general.safety_model        <missing>
general.content_filter      <missing>
tokenizer.chat_template     <long_string:15934_bytes>
gpt-oss.context_length      131072
gpt-oss.block_count         24
gpt-oss.expert_count        32
gpt-oss.expert_used_count   4

Note: the presence of a key does not guarantee that its value is correct. Without cryptographic verification, any KV entry can be modified after the fact, and the change cannot be detected from the file alone.
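As a rough illustration of how the missing fields in the table could be flagged automatically, the sketch below audits a key/value mapping. It assumes the KV dump is a flat mapping in which absent keys are recorded with the placeholder string "<missing>", as in the table; the actual schema of critical_kv.json may differ, and the function name missing_provenance is hypothetical.

```python
# Assumption: the KV dump is a flat mapping of key -> value in which absent
# keys carry the placeholder string "<missing>", as in the table above.
PROVENANCE_KEYS = (
    "general.author",
    "general.description",
    "general.license",
    "general.source",
    "general.training_data",
)
MISSING = "<missing>"


def missing_provenance(kv):
    """Return the provenance keys that are absent, empty, or marked missing."""
    return [key for key in PROVENANCE_KEYS if kv.get(key) in (None, "", MISSING)]


# Values copied from the table above.
observed = {
    "general.architecture": "gpt-oss",
    "general.name": "Openai_Gpt Oss 20b",
    "general.author": MISSING,
    "general.description": MISSING,
    "general.license": MISSING,
    "general.source": MISSING,
    "general.training_data": MISSING,
}

for key in missing_provenance(observed):
    print("missing provenance field:", key)
```

A check like this only reports what the file claims. It says nothing about whether a populated field is truthful.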

Chat Template excerpt (identity injection)

The excerpt below shows the portion of the template that sets a default model_identity. When a runtime renders this template without defining model_identity itself, the default text becomes part of the system message.

ndif -%}
{%- endmacro -%}

{#- System Message Construction ============================================ #}
{%- macro build_system_message() -%}
    {%- if model_identity is not defined %}
        {%- set model_identity = "You are ChatGPT, a large language model trained by OpenAI." %}
    {%- endif %}
    {{- model_identity + "\n" }}
    {{- "Knowledge cutoff: 2024-06\n" }}
    {{- "Current date: " + strftime_now("%Y-%m-%d") + "\n\n" }}
    {%- if reasoning_effort is not defined %}
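
To make the defaulting behavior concrete, here is a Python paraphrase of the visible part of the excerpt. It is not the template itself and does not run Jinja; the function name build_system_preamble and the use of None to stand for "not defined" are assumptions made for illustration.

```python
from datetime import date

# Default string copied verbatim from the excerpt above.
DEFAULT_IDENTITY = "You are ChatGPT, a large language model trained by OpenAI."


def build_system_preamble(model_identity=None, today=None):
    """Python paraphrase of the visible part of build_system_message().

    model_identity=None stands in for "not defined" in the template.
    """
    if model_identity is None:
        model_identity = DEFAULT_IDENTITY
    if today is None:
        today = date.today()
    return (
        model_identity + "\n"
        + "Knowledge cutoff: 2024-06\n"
        + "Current date: " + today.strftime("%Y-%m-%d") + "\n\n"
    )


# No identity supplied: the packaged default is used.
print(build_system_preamble(today=date(2025, 12, 16)))
# Identity supplied by the caller: the packaged default is not used.
print(build_system_preamble("You are a locally hosted assistant.", date(2025, 12, 16)))
```

The second call shows the override path: once the caller supplies an identity, the packaged default no longer appears in the output.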

Why this matters

In local runtimes, “who the model is” is often defined by a combination of:

  • packaged KV metadata (names, descriptions, provenance fields)
  • the embedded chat template (system/developer messages and tool scaffolding)
  • the runtime’s own default prompts and UI labels

If provenance fields are missing but the template asserts a well-known identity, non-expert users may reasonably misinterpret the model’s origin or guarantees.
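
One way to surface this mismatch is to pair the identity asserted by the template with the author recorded in the KV metadata. The sketch below assumes the template sets its default with a statement of the form {%- set model_identity = "..." %}, as in the excerpt shown earlier; other templates may use a different pattern, and the function and field names here are hypothetical.

```python
import re

# Assumption: the default identity is set with a Jinja statement of the form
#   {%- set model_identity = "..." %}
# Templates that build the identity differently will not match.
IDENTITY_RE = re.compile(r'set\s+model_identity\s*=\s*"([^"]*)"')


def identity_report(template_text, kv):
    """Pair the template's default identity with the KV author field."""
    match = IDENTITY_RE.search(template_text)
    asserted = match.group(1) if match else None
    author = kv.get("general.author")
    author_known = author not in (None, "", "<missing>")
    return {
        "template_identity": asserted,
        "kv_author": author if author_known else None,
        # The case described above: an identity claim with no recorded author.
        "unbacked_identity_claim": asserted is not None and not author_known,
    }


template = (
    '{%- set model_identity = '
    '"You are ChatGPT, a large language model trained by OpenAI." %}'
)
report = identity_report(template, {"general.author": "<missing>"})
print(report)
```

A populated author field would clear the flag here, but since KV content is unverified, that would still not establish the package's origin.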

Recommendation: provenance & signing (high-level)

For GGUF-distributed models, a minimal trust baseline should enable users to verify: who produced the package, when it was produced, and what inputs/transformations were applied (base weights, quantization settings, template changes).
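
As a sketch of what such a baseline record might contain, the example below defines a small manifest covering who, when, and what. Every field name and value is hypothetical: GGUF defines no such schema, and the placeholders do not describe the examined file.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProvenanceManifest:
    # All field names are hypothetical; GGUF does not define this schema.
    producer: str         # who produced the package
    created_at: str       # when it was produced (ISO 8601, UTC)
    base_weights: str     # identifier of the input weights
    quantization: str     # quantization settings applied
    template_sha256: str  # hash of the embedded chat template

    def canonical_bytes(self):
        """Serialize deterministically so equal manifests hash identically."""
        text = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return text.encode("utf-8")


template_text = "example chat template text"  # placeholder, not the real template
manifest = ProvenanceManifest(
    producer="example-packager",
    created_at="2025-01-01T00:00:00Z",
    base_weights="example-base-weights",
    quantization="example-quantization",
    template_sha256=hashlib.sha256(template_text.encode("utf-8")).hexdigest(),
)
print(manifest.canonical_bytes().decode("utf-8"))
```

Hashing the template separately means a template change is visible in the manifest even when the weights are untouched.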

A practical approach is detached signatures over the model file hash (e.g., SHA-256) plus a transparency log record, similar to modern software supply-chain practices.
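
The hashing half of that approach can be sketched with the standard library alone. The example below computes a SHA-256 digest in chunks and compares it with a digest that is assumed to come from a trusted channel. Verifying a signature over that digest and querying a transparency log are not shown, since both need tooling beyond the standard library; the function names are hypothetical.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare the local file hash with a digest from a trusted channel."""
    return hmac.compare_digest(sha256_file(path), published_hex.lower())


# Demonstration on a small temporary file standing in for a model file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"stand-in model bytes")
    demo_path = tmp.name

published = hashlib.sha256(b"stand-in model bytes").hexdigest()
ok = matches_published_digest(demo_path, published)
tampered = matches_published_digest(demo_path, "0" * 64)
os.remove(demo_path)
print(ok, tampered)  # True False
```

A matching digest shows only that the file is the one the publisher hashed. Binding that digest to a named producer is the job of the signature and the log record.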