Improve model card: complete release documentation
README.md CHANGED

@@ -3,48 +3,61 @@ language:
 - en
 - it
 pipeline_tag: text-generation
 tags:
 - code
 - instruct
 ---
-Model path: `e:\Pindaro\PINDARO AI CODE`
-##
-- Name: `PINDARO AI CODE`
-- Family: LLaMA-style causal LM
-- Intended role: coding assistant
-- Format support:
-  - Hugging Face (`model.safetensors`)
-  - GGUF F16 (`pindaro-f16.gguf`)
-  - GGUF Q4_K_M (`pindaro-q4_k_m.gguf`)
-## 2. Technical Specs
 - Architecture: `LlamaForCausalLM`
 - `<|noesis|>` (id `32000`)
 - `<|end|>` (id `32001`)
-Configured template:
 {{ bos_token }}{% for message in messages %}<|noesis|>
 {% if message['role'] == 'system' %}### System
 {{ message['content'] }}
@@ -57,61 +70,33 @@ Configured template:
 ### Answer
 ```
 {% endif %}
-```
-##
-- Python `3.11.9`
-- Transformers `4.57.3`
-- Torch `2.10.0+cpu`
-Results:
-- AutoConfig load: PASS
-- AutoTokenizer load: PASS
-- AutoModel load: PASS
-- Chat-template render: PASS
-- Template special-token alignment: PASS
-- Deterministic generation: PASS
-Observed non-blocking warning:
-- Folder name with spaces may trigger a Python module-name warning in some runtimes.
-## 6. Known Issues
-1. Folder-name warning risk
-   - `PINDARO AI CODE` has spaces; some tools warn on module naming.
-2. Attention-mask warning in some calls
-   - As `pad_token` equals `eos_token`, pass `attention_mask` explicitly for stable behavior.
-## 7. Recommended Next Steps
-1. Optional packaging cleanup
-   - Rename folder to a no-space slug (example: `PINDARO_AI_CODE`) when compatible with your deployment scripts.
-2. Add coding eval gate
-   - HumanEval pass@1
-   - MBPP subset
-   - Prompt-format adherence checks
-## 8. Usage Example
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM

 messages = [
     {"role": "system", "content": "You are a coding assistant."},
@@ -124,16 +109,64 @@ inputs = tokenizer.apply_chat_template(
     add_generation_prompt=True,
     return_tensors="pt",
 )
 print(tokenizer.decode(outputs[0], skip_special_tokens=False))
 ```
-##

Full updated README.md:

---
language:
- en
- it
pipeline_tag: text-generation
library_name: transformers
tags:
- llama
- code
- coding-assistant
- gguf
- instruct
- 1b
---

# PINDARO AI CODE

PINDARO AI CODE is the code-specialized release of the Pindaro model family.

## Model At A Glance

- Architecture: `LlamaForCausalLM`
- Model type: `llama`
- Approx. parameters: **~1.1B**
- Precision: `float16`
- Context length: `2048`
- Vocabulary size: `32002`
- Languages: English, Italian
- Primary use: code generation and coding assistance
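
These values can be read back from the shipped `config.json`; a minimal sketch, using the repo id from the quickstarts below:

```python
from transformers import AutoConfig

# Load the shipped config.json and read back the advertised specs.
config = AutoConfig.from_pretrained("RthItalia/PINDARO-AI-CODE")

print(config.model_type)               # expected: "llama"
print(config.max_position_embeddings)  # expected: 2048
print(config.vocab_size)               # expected: 32002
```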

## Included Artifacts

Hugging Face format:
- `model.safetensors`
- `config.json`
- `generation_config.json`
- `tokenizer.json`
- `tokenizer.model`
- `tokenizer_config.json`
- `special_tokens_map.json`
- `added_tokens.json`

GGUF format:
- `pindaro-f16.gguf`
- `pindaro-q4_k_m.gguf`

Release docs:
- `release/RELEASE_MANIFEST.json`
- `release/RELEASE_NOTES.md`
- `release/SHA256SUMS.txt`
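
Individual artifacts can be fetched locally with `huggingface_hub`; a minimal sketch, assuming the repo id used in the quickstarts below:

```python
from huggingface_hub import hf_hub_download

# Fetch a single artifact (here the Q4_K_M GGUF) into the local Hub cache.
gguf_path = hf_hub_download(
    repo_id="RthItalia/PINDARO-AI-CODE",
    filename="pindaro-q4_k_m.gguf",
)
print(gguf_path)
```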

## Prompt Format

Special tokens:
- `<|noesis|>` (id `32000`)
- `<|end|>` (id `32001`)

The configured chat template wraps each message in role sections and appends an opening code fence to the generation prompt (the middle of the template is unchanged and elided here):

````jinja
{{ bos_token }}{% for message in messages %}<|noesis|>
{% if message['role'] == 'system' %}### System
{{ message['content'] }}
…
### Answer
```
{% endif %}
````

Minimal manual prompt example:

````text
<|noesis|>
### Question
Write a Python function add(a, b).
<|end|>
<|noesis|>
### Answer
```
````
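
To check that a runtime sees the documented token ids and template, the template can be rendered without tokenizing; a minimal sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("RthItalia/PINDARO-AI-CODE")

# Special tokens should map to the documented ids.
assert tokenizer.convert_tokens_to_ids("<|noesis|>") == 32000
assert tokenizer.convert_tokens_to_ids("<|end|>") == 32001

# Render the configured chat template as plain text for inspection.
rendered = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Write a Python function add(a, b)."}],
    tokenize=False,
    add_generation_prompt=True,
)
print(rendered)
```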

## Quickstart (Transformers)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "RthItalia/PINDARO-AI-CODE"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
)

messages = [
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "Write a Python function add(a, b)."},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)

attention_mask = torch.ones_like(inputs)
outputs = model.generate(
    inputs,
    attention_mask=attention_mask,
    max_new_tokens=120,
    do_sample=False,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```
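
Because answers close with `<|end|>`, generation can optionally be stopped at that marker; a minimal sketch continuing the snippet above (treating `<|end|>` as an end-of-turn id is an assumption, not something guaranteed by `generation_config.json`):

```python
# Stop generation at the template's end-of-turn marker instead of
# always running to max_new_tokens. Continues the quickstart above.
end_id = tokenizer.convert_tokens_to_ids("<|end|>")
outputs = model.generate(
    inputs,
    attention_mask=attention_mask,
    max_new_tokens=120,
    do_sample=False,
    eos_token_id=end_id,
)
```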

## Quickstart (GGUF / llama.cpp)

Single quotes keep the backticks of the answer-fence prefix from being interpreted by the shell:

````bash
./llama-cli -m pindaro-q4_k_m.gguf -n 120 -p '<|noesis|>
### Question
Write a Python function add(a, b).
<|end|>
<|noesis|>
### Answer
```'
````

## Validation Snapshot

Last internal validation snapshot: **2026-03-02**

- HF smoke tests: PASS
- HF mini-eval coding quality: **1.00**
- GGUF F16 quality gate: PASS
- GGUF Q4_K_M quality gate: PASS
- Release verdict: **publishable: true**

Notes:
- Results are from internal sanity checks, not a full public benchmark suite.

## Known Limitations

- Generated code can be syntactically correct but logically wrong.
- May emit verbose outputs or repeated scaffolding.
- Always run tests and static checks on generated code.
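
As a cheap first gate before any test run, `ast.parse` rejects syntactically broken Python without executing it; a minimal sketch (it says nothing about logical correctness):

```python
import ast

def is_valid_python(snippet: str) -> bool:
    """Syntax-only gate: parses the snippet without executing it."""
    try:
        ast.parse(snippet)
        return True
    except SyntaxError:
        return False

print(is_valid_python("def add(a, b):\n    return a + b"))  # True
print(is_valid_python("def add(a, b) return a + b"))        # False
```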

## Safety

- Do not execute generated code in privileged environments without review.
- Use sandboxing for untrusted snippets.
- Add dependency and secret scanning to deployment workflows.
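
A lightweight precaution, which is not a substitute for a real sandbox, is to run snippets in a separate interpreter with isolated mode and a timeout; a minimal sketch:

```python
import subprocess
import sys

def run_untrusted(path: str, timeout_s: int = 5) -> subprocess.CompletedProcess:
    """Run a snippet in a fresh interpreter with -I (isolated mode) and a
    timeout. This limits accidents only; it is NOT a security boundary.
    Use a container or VM for genuinely untrusted code."""
    return subprocess.run(
        [sys.executable, "-I", path],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
```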

## Artifact Checksums (SHA256)

- `model.safetensors`: `f77c27b8babf9fcab83a7dc68ba58934e8c8c031c9f10b4b73e802d4fbfe0cec`
- `config.json`: `b37c45060f3e2f5f9b91903c9ccb32f3c21076e809954fda6c01d987cd8f25cc`
- `generation_config.json`: `6ff47e725c0ec6d0f1895670de7ee68e61a4f99703f6c8e89aea6ab14ea02dc3`
- `tokenizer.json`: `51433f06369ac3e597dfa23a811215e3511b8f86588a830ded72344b76a193ee`
- `tokenizer.model`: `9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347`
- `tokenizer_config.json`: `a0567c49a117af9af332874cfd333ddd622a09c5e9765131ceee6344cb22a3de`
- `special_tokens_map.json`: `d7805e093432afcde852968cdeba3de08a6fe66e77609f4701decb87fc492f33`
- `added_tokens.json`: `ece349d292e246eac9a9072c1730f023e61567984a828fb0d25dccb14e3b7592`
- `pindaro-f16.gguf`: `bdaaeb6fb712e9a4d952082cf415b05c7d076b33786d39063bbfb3a7e5db2031`
- `pindaro-q4_k_m.gguf`: `5f98cc3454774ed5ed80d71a71adfd0daff760fc9eef0900ddd4f7eda2e20fef`
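
Downloads can be checked against these values (or against `release/SHA256SUMS.txt`); a minimal sketch:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 in chunks to avoid loading it whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "5f98cc3454774ed5ed80d71a71adfd0daff760fc9eef0900ddd4f7eda2e20fef"
assert sha256_of("pindaro-q4_k_m.gguf") == expected
```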