Add model card YAML metadata (omit base_model)
README.md
CHANGED
````diff
@@ -1,4 +1,18 @@
-
+---
+language:
+- en
+library_name: transformers
+license: other
+pipeline_tag: text-generation
+tags:
+- causal-lm
+- llama
+- sharded
+- t5-tokenizer
+# base_model intentionally omitted for a custom model
+---
+
+# HF Export: Banyan 5B Deep (T5 tokenizer)
 
 Contents
 - model-00001-of-00001.safetensors, model.safetensors.index.json
@@ -6,7 +20,7 @@ Contents
 - tokenizer.json, tokenizer_config.json, special_tokens_map.json, spiece.model (custom T5)
 - generation_config.json
 
-Usage
+# Usage
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
@@ -21,8 +35,7 @@ out = model.generate(**enc, max_new_tokens=64, do_sample=True, temperature=0.8)
 print(tok.decode(out[0], skip_special_tokens=False))
 ```
 
-Notes
+# Notes
 - The tokenizer is SentencePiece-based (T5). Do not add EOS at prompt time; use `add_special_tokens=False` when tokenizing prompts for generation.
 - The model config is tailored to vocab_size=32100 and rope_theta=500000.
 - If you prefer multi-shard weights, provide a `model.safetensors.index.json` and re-save.
-
````