LeviDeHaan committed on
Commit 69e40b2 · verified · 1 Parent(s): 2e7f22d

Initial upload of SmolNewsAnalysis-002
MODEL_CARD.md ADDED
@@ -0,0 +1,99 @@
+ ---
+ library_name: transformers
+ license: apache-2.0
+ base_model: HuggingFaceTB/SmolLM2-360M-Instruct
+ pipeline_tag: text-generation
+ model-index:
+ - name: SmolNewsAnalysis-002
+   results:
+   - task:
+       type: text-generation
+       name: Financial news JSON scoring
+     metrics:
+     - name: Train loss
+       type: loss
+       value: 0.0925
+ ---
+
+ # SmolNewsAnalysis-002 — SmolLM2-360M Financial News JSON Analyst
+
+ ## Model Details
+ - **Developer** [Levi De Haan](https://levidehaan.com/)
+ - **Base model** `HuggingFaceTB/SmolLM2-360M-Instruct` (360M-parameter decoder-only transformer).
+ - **Architecture** SmolLM-compatible causal language model with `<|im_start|>` chat formatting.
+ - **Fine-tuning method** LoRA adapters (rank 8, alpha 16, dropout 0) trained with LLaMA-Factory and merged into the base weights.
+ - **License** Apache-2.0 (inherits the base model license). No additional restrictions are applied.
+
+ ## Intended Use
+ - **Primary objective** Convert financial news headlines and summaries into a compact JSON object containing `symbol`, `site`, `source_name`, `sentiment_score`, `sentiment_confidence`, `wow_score`, and `wow_confidence` for ingestion by the Twatter news pipeline (`src/stock_news_processor.py`).
+ - **Input scope** Short- to mid-length finance news briefs gathered from Alpaca/FMP feeds or similar sources.
+ - **Out-of-scope** General-purpose chat, long-form articles beyond the trimmed 1,800-character window, or non-financial domains.
+
+ ## Prompt Template
+ - **System message** automatically injected by `chat_template.jinja` and the `Modelfile`:
+
+ ```text
+ You are a precise financial news analyst. Read the news text and output a compact JSON with fields: symbol, site, source_name, sentiment_score, sentiment_confidence, wow_score, wow_confidence. Output only the JSON without commentary.
+ ```
+
+ - **SmolLM chat framing**:
+
+ ```text
+ <|im_start|>system
+ You are a precise financial news analyst. Read the news text and output a compact JSON with fields: symbol, site, source_name, sentiment_score, sentiment_confidence, wow_score, wow_confidence. Output only the JSON without commentary.<|im_end|>
+ <|im_start|>user
+ <news article title/summary + metadata>
+ <|im_end|>
+ <|im_start|>assistant
+ ```
+
+ - **`[INST]` format** when `TRAINED_MODEL_TYPE="llama"` in `SharedLLMManager`:
+
+ ```text
+ <s>[INST] <<SYS>>
+ You are a precise financial news analyst. Read the news text and output a compact JSON with fields: symbol, site, source_name, sentiment_score, sentiment_confidence, wow_score, wow_confidence. Output only the JSON without commentary.
+ <</SYS>>
+
+ <news article title/summary + metadata> [/INST]
+ ```
+
+ The model is optimized to emit a single JSON object; downstream parsing stops at the first closing brace.
+
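The stop-at-the-first-closing-brace behavior can be sketched as a small helper (the function name is hypothetical; the actual parsing lives in `src/stock_news_processor.py`). Because the schema is flat, the first `}` closes the whole object:

```python
import json

def extract_first_json(text: str) -> dict:
    """Extract the leading JSON object from model output.

    Mirrors the documented behavior: parsing stops at the first
    closing brace. The flat schema has no nested objects, so the
    first '}' closes the entire object.
    """
    start = text.find("{")
    end = text.find("}", start)
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start:end + 1])

raw = '{"symbol": "TSLA", "sentiment_score": 0.62, "wow_score": "Big News"} stray tail'
print(extract_first_json(raw)["symbol"])  # TSLA
```

Any tokens the model emits after the closing brace are simply ignored, which makes the pipeline tolerant of trailing commentary despite the "JSON only" instruction.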
+ ## Training Data
+ - **Dataset** `training_data/news_data/stock_news_training.json` (1,506 deduplicated instruction/response pairs) produced via `extract_stock_json_news_training.py`.
+ - **Composition** Finance news with ticker/site metadata and minified JSON labels.
+
+ ## Training Procedure
+ - **Frameworks** LLaMA-Factory + PEFT (LoRA) on bf16 hardware.
+ - **Key hyperparameters** `learning_rate=5e-5`, `per_device_train_batch_size=2`, `gradient_accumulation_steps=8`, `num_train_epochs=10`, `cutoff_len=2048`, `lora_r=8`, `lora_alpha=16`, `lora_dropout=0`, `lr_scheduler_type=cosine_with_restarts`, `max_grad_norm=1.0`, `warmup_steps=0`.
+ - **Tokens seen** 8,757,824 (`num_input_tokens_seen` in `all_results.json`).
+ - **Final train loss** 0.0925 (see `LLaMA-Factory/saves/SmolLM2-360M-Instruct/lora/financial_news_model_json/all_results.json`).
+
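The hyperparameters above imply the following effective batch size and approximate optimizer-step count (a back-of-the-envelope sketch; it assumes single-device training and that the final partial batch is kept):

```python
# Values taken from the hyperparameters listed above.
per_device_batch = 2
grad_accum = 8
epochs = 10
samples = 1506  # dataset size from the Training Data section

# Gradients accumulate over 8 micro-batches of 2 sequences each.
effective_batch = per_device_batch * grad_accum   # 16 sequences per optimizer step
steps_per_epoch = -(-samples // effective_batch)  # ceiling division
total_steps = steps_per_epoch * epochs            # roughly 950 optimizer steps

print(effective_batch, steps_per_epoch, total_steps)
```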
+ ## Limitations
+ - **Domain** Focused on short-form financial news; additional preprocessing may be required for long articles.
+ - **Ticker detection** Relies on upstream metadata; empty `symbol` fields indicate the source lacked a ticker.
+ - **JSON validity** Typically robust, yet integrating systems should validate responses before use.
+ - **Temporal awareness** Model knowledge reflects historical data snapshots and does not account for real-time events.
+
+ ## Usage Example
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ model_id = "LeviDeHaan/SmolNewsAnalysis-002"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
+
+ prompt = """<|im_start|>system\nYou are a precise financial news analyst...<|im_end|>\n"""
+ prompt += "<|im_start|>user\nTesla shares climb after deliveries beat expectations. Symbol: TSLA Site: bloomberg.com\n<|im_end|>\n<|im_start|>assistant\n"
+
+ inputs = tokenizer(prompt, return_tensors="pt")
+ # temperature only takes effect when sampling is enabled
+ outputs = model.generate(**inputs, max_new_tokens=160, do_sample=True, temperature=0.1)
+ print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
+ ```
+
+ ## Contact
+ - **Maintainer** [Levi De Haan](https://levidehaan.com/)
+ - **Project page** https://levidehaan.com/projects/twatter
+ - **Hugging Face discussions** https://huggingface.co/LeviDeHaan/SmolNewsAnalysis-002/discussions
Modelfile ADDED
@@ -0,0 +1,16 @@
+ # ollama modelfile auto-generated by llamafactory
+
+ FROM .
+
+ TEMPLATE """{{ if .System }}<|im_start|>system
+ {{ .System }}<|im_end|>
+ {{ end }}{{ range .Messages }}{{ if eq .Role "user" }}<|im_start|>user
+ {{ .Content }}<|im_end|>
+ <|im_start|>assistant
+ {{ else if eq .Role "assistant" }}{{ .Content }}<|im_end|>
+ {{ end }}{{ end }}"""
+
+ SYSTEM """You are a precise financial news analyst. Read the news text and output a compact JSON with fields: symbol, site, source_name, sentiment_score, sentiment_confidence, wow_score, wow_confidence. Output only the JSON without commentary."""
+
+ PARAMETER stop "<|im_end|>"
+ PARAMETER num_ctx 4096
README.md ADDED
@@ -0,0 +1,97 @@
+ # SmolLM2-360M Financial News JSON Analyst (`SmolNewsAnalysis-002`)
+
+ - **Hugging Face model card**: https://huggingface.co/LeviDeHaan/SmolNewsAnalysis-002
+ - **Author**: [Levi De Haan](https://levidehaan.com/)
+ - **Project overview**: https://levidehaan.com/projects/twatter
+
+ ## Overview
+ - **Purpose** Fine-tuned SmolLM2-360M-Instruct to summarize Alpaca/FMP financial news into structured sentiment + significance scores consumed by the Twatter news pipeline.
+ - **Base model** `HuggingFaceTB/SmolLM2-360M-Instruct` (360M parameters, Apache-2.0).
+ - **Architecture** Decoder-only transformer compatible with SmolLM chat formatting (`<|im_start|>`/`<|im_end|>` tags).
+ - **Fine-tuning method** LoRA adapters (rank 8, alpha 16, dropout 0) merged into base weights post-training.
+ - **Repository integration** Loaded via `SharedLLMManager.TrainedModelClient` and invoked by `stock_news_processor.py` for Alpaca feed scoring.
+
+ ## What it Predicts
+ - **sentiment_score** Float in `[-1, 1]` summarizing bullish/bearish tone.
+ - **sentiment_confidence** Model confidence in the sentiment score (float in `[0, 1]`).
+ - **wow_score** Market-impact category normalized to `Extremely Bad News`, `Bad News`, `Meh News`, `Regular News`, `Big News`, or `Huge News`.
+ - **wow_confidence** Confidence in the wow_score (float in `[0, 1]`).
+ - **symbol** Canonical ticker symbol extracted from the article payload (may be empty if missing upstream).
+ - **site / source_name** Strings describing the article's origin; passed through from Alpaca metadata for downstream routing.
+
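Downstream consumers can sanity-check a parsed response against these field definitions. A minimal sketch (helper name and exact checks are illustrative, not the pipeline's actual validator):

```python
# Canonical wow_score labels from the list above.
CANONICAL_WOW = {
    "Extremely Bad News", "Bad News", "Meh News",
    "Regular News", "Big News", "Huge News",
}

def validate_scores(obj: dict) -> bool:
    """Check the documented ranges and labels on a parsed response."""
    try:
        return (
            -1.0 <= float(obj["sentiment_score"]) <= 1.0
            and 0.0 <= float(obj["sentiment_confidence"]) <= 1.0
            and 0.0 <= float(obj["wow_confidence"]) <= 1.0
            and obj["wow_score"] in CANONICAL_WOW
            and isinstance(obj.get("symbol", ""), str)
        )
    except (KeyError, TypeError, ValueError):
        return False

sample = {
    "symbol": "TSLA", "site": "bloomberg.com", "source_name": "Bloomberg",
    "sentiment_score": 0.62, "sentiment_confidence": 0.9,
    "wow_score": "Big News", "wow_confidence": 0.8,
}
print(validate_scores(sample))  # True
```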
+ ## Prompt Format
+ - **System message** Injected automatically by `chat_template.jinja` and `Modelfile` when missing:
+
+ ```text
+ You are a precise financial news analyst. Read the news text and output a compact JSON with fields: symbol, site, source_name, sentiment_score, sentiment_confidence, wow_score, wow_confidence. Output only the JSON without commentary.
+ ```
+
+ - **Full chat template** used for SmolLM-style prompts:
+
+ ```text
+ <|im_start|>system
+ You are a precise financial news analyst. Read the news text and output a compact JSON with fields: symbol, site, source_name, sentiment_score, sentiment_confidence, wow_score, wow_confidence. Output only the JSON without commentary.<|im_end|>
+ <|im_start|>user
+ <news article title/summary + metadata>
+ <|im_end|>
+ <|im_start|>assistant
+ ```
+
+ - **Alternative `[INST]` framing** used in `SharedLLMManager.TrainedModelClient.generate()` when `TRAINED_MODEL_TYPE="llama"`:
+
+ ```text
+ <s>[INST] <<SYS>>
+ You are a precise financial news analyst. Read the news text and output a compact JSON with fields: symbol, site, source_name, sentiment_score, sentiment_confidence, wow_score, wow_confidence. Output only the JSON without commentary.
+ <</SYS>>
+
+ <news article title/summary + metadata> [/INST]
+ ```
+
+ **Output** Must be a single JSON object; downstream parsing stops at the first closing brace.
+
+ ## Quick Start
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ model_id = "LeviDeHaan/SmolNewsAnalysis-002"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
+
+ prompt = """<|im_start|>system\nYou are a precise financial news analyst...<|im_end|>\n"""
+ prompt += "<|im_start|>user\nTesla shares climb after deliveries beat expectations. Symbol: TSLA Site: bloomberg.com\n<|im_end|>\n<|im_start|>assistant\n"
+
+ inputs = tokenizer(prompt, return_tensors="pt")
+ # temperature only takes effect when sampling is enabled
+ outputs = model.generate(**inputs, max_new_tokens=160, do_sample=True, temperature=0.1)
+ response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
+ print(response)
+ ```
+
+ ## Training Data
+ - **Source** Aggregated Alpaca/FMP news processed by `stock_news_processor.py` and exported through `extract_stock_json_news_training.py` to `training_data/news_data/stock_news_training.json`.
+ - **Samples** 1,506 deduplicated instruction/response pairs (hash dedupe over title/summary + ticker + site).
+ - **Wow distribution** `Big News` 645, `Regular News` 272, `Bad News` 253, `Huge News` 198, `Meh News` 120, `Extremely Bad News` 16, plus 2 legacy `Bad News (negative but not catastrophic)` entries coerced to canonical values at inference time; the counts sum to the 1,506 total.
+ - **Content** News titles, summaries, symbol/site metadata, and minified JSON outputs describing sentiment and impact.
+
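The hash dedupe mentioned above can be sketched roughly as follows (the key composition follows the description, title/summary + ticker + site; the field names and helper are illustrative assumptions, not the export script's actual code):

```python
import hashlib

def dedupe_news(records):
    """Drop records whose (title+summary, ticker, site) hash was
    already seen, keeping the first occurrence."""
    seen, unique = set(), []
    for rec in records:
        key = "|".join([rec.get("title", ""), rec.get("summary", ""),
                        rec.get("symbol", ""), rec.get("site", "")])
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(rec)
    return unique

rows = [
    {"title": "TSLA beats", "summary": "s", "symbol": "TSLA", "site": "a.com"},
    {"title": "TSLA beats", "summary": "s", "symbol": "TSLA", "site": "a.com"},
    {"title": "AAPL dips", "summary": "s", "symbol": "AAPL", "site": "a.com"},
]
print(len(dedupe_news(rows)))  # 2
```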
+ ## Training Procedure
+ - **Framework** LLaMA-Factory (SmolLM2 template) + PEFT LoRA on bf16 accelerators.
+ - **Hyperparameters** `learning_rate=5e-5`, `per_device_train_batch_size=2`, `gradient_accumulation_steps=8`, `num_train_epochs=10`, `cutoff_len=2048`, `lora_r=8`, `lora_alpha=16`, `lora_dropout=0`, `lr_scheduler_type=cosine_with_restarts`, `max_grad_norm=1.0`, `warmup_steps=0`.
+ - **Tokens seen** 8,757,824 (`num_input_tokens_seen` in `all_results.json`).
+ - **Final train loss** 0.0925.
+ - **Adapters** Merged into base weights before export; no LoRA files required at inference.
+
+ ## Known Limitations
+ - **Domain specificity** Tuned on short news briefs; long-form articles may require additional summarization or truncation (`stock_news_processor.py` trims to 1,800 characters).
+ - **JSON adherence** Strong, but integrating systems should still validate output to guard against malformed fields.
+ - **Ticker coverage** Relies on upstream symbol detection; missing tickers yield blank `symbol` values.
+ - **Wow taxonomy** Responses outside the canonical set default to `Regular News` via analyzer normalization.
+ - **Market latency** The model reflects historical data only; no real-time awareness or price prediction.
+
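The wow-taxonomy fallback described above can be sketched as a small helper (hypothetical name; the real normalization lives in the analyzer). Legacy labels such as `Bad News (negative but not catastrophic)` carry a canonical prefix, so a prefix match coerces them before falling back:

```python
# Canonical labels from the "What it Predicts" section.
CANONICAL_WOW = (
    "Extremely Bad News", "Bad News", "Meh News",
    "Regular News", "Big News", "Huge News",
)

def normalize_wow(label: str) -> str:
    """Map a raw wow_score label onto the canonical set,
    defaulting to 'Regular News' as documented."""
    label = label.strip()
    if label in CANONICAL_WOW:
        return label
    # Legacy labels keep a canonical prefix; coerce if one matches.
    for canonical in CANONICAL_WOW:
        if label.startswith(canonical):
            return canonical
    return "Regular News"

print(normalize_wow("Bad News (negative but not catastrophic)"))  # Bad News
```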
+ ## License
+ - **Model weights** Apache-2.0 (inherited from the base model).
+
+ ## Contact & Support
+ - **Maintainer** [Levi De Haan](https://levidehaan.com/)
+ - **Project page** https://levidehaan.com/projects/twatter
+ - **Hugging Face discussions** https://huggingface.co/LeviDeHaan/SmolNewsAnalysis-002/discussions
chat_template.jinja ADDED
@@ -0,0 +1,9 @@
+ {% for message in messages %}
+ {% if loop.first and messages[0]['role'] != 'system' %}
+ {{ '<|im_start|>system\nYou are a precise financial news analyst. Read the news text and output a compact JSON with fields: symbol, site, source_name, sentiment_score, sentiment_confidence, wow_score, wow_confidence. Output only the JSON without commentary.<|im_end|>\n' }}
+ {% endif %}
+ {{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }}
+ {% endfor %}
+ {% if add_generation_prompt %}
+ {{ '<|im_start|>assistant\n' }}
+ {% endif %}
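Rendered by hand, the template above produces the framing shown in the model card. This stdlib sketch mirrors the Jinja logic (in practice `tokenizer.apply_chat_template` applies this template for you):

```python
# Default system message, copied verbatim from the template above.
DEFAULT_SYSTEM = (
    "You are a precise financial news analyst. Read the news text and output "
    "a compact JSON with fields: symbol, site, source_name, sentiment_score, "
    "sentiment_confidence, wow_score, wow_confidence. "
    "Output only the JSON without commentary."
)

def render_chat(messages, add_generation_prompt=True):
    """Mirror chat_template.jinja: inject the default system message
    when absent, wrap each turn in <|im_start|>/<|im_end|> markers."""
    out = ""
    if not messages or messages[0]["role"] != "system":
        out += f"<|im_start|>system\n{DEFAULT_SYSTEM}<|im_end|>\n"
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        out += "<|im_start|>assistant\n"
    return out

prompt = render_chat([{"role": "user", "content": "TSLA deliveries beat expectations."}])
print(prompt)
```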
config.json ADDED
@@ -0,0 +1,38 @@
+ {
+   "architectures": [
+     "LlamaForCausalLM"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "head_dim": 64,
+   "hidden_act": "silu",
+   "hidden_size": 960,
+   "initializer_range": 0.02,
+   "intermediate_size": 2560,
+   "is_llama_config": true,
+   "max_position_embeddings": 8192,
+   "mlp_bias": false,
+   "model_type": "llama",
+   "num_attention_heads": 15,
+   "num_hidden_layers": 32,
+   "num_key_value_heads": 5,
+   "pad_token_id": 2,
+   "pretraining_tp": 1,
+   "rms_norm_eps": 1e-05,
+   "rope_interleaved": false,
+   "rope_scaling": null,
+   "rope_theta": 100000,
+   "tie_word_embeddings": true,
+   "torch_dtype": "bfloat16",
+   "transformers.js_config": {
+     "kv_cache_dtype": {
+       "fp16": "float16",
+       "q4f16": "float16"
+     }
+   },
+   "transformers_version": "4.52.4",
+   "use_cache": true,
+   "vocab_size": 49152
+ }
generation_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "pad_token_id": 2,
+   "transformers_version": "4.52.4"
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ee452849e070e7995d3f2d3318bb77cc21e2a637243c737641cd6015a204dd39
+ size 723674912
special_tokens_map.json ADDED
@@ -0,0 +1,34 @@
+ {
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>"
+   ],
+   "bos_token": {
+     "content": "<|im_start|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "<|im_end|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<|im_end|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,156 @@
+ {
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<|im_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "<|im_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<repo_name>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "4": {
+       "content": "<reponame>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "5": {
+       "content": "<file_sep>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "6": {
+       "content": "<filename>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "7": {
+       "content": "<gh_stars>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "8": {
+       "content": "<issue_start>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "9": {
+       "content": "<issue_comment>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "10": {
+       "content": "<issue_closed>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "11": {
+       "content": "<jupyter_start>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "12": {
+       "content": "<jupyter_text>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "13": {
+       "content": "<jupyter_code>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "14": {
+       "content": "<jupyter_output>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "15": {
+       "content": "<jupyter_script>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "16": {
+       "content": "<empty_output>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>"
+   ],
+   "bos_token": "<|im_start|>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|im_end|>",
+   "extra_special_tokens": {},
+   "model_max_length": 8192,
+   "pad_token": "<|im_end|>",
+   "padding_side": "left",
+   "split_special_tokens": false,
+   "tokenizer_class": "GPT2Tokenizer",
+   "unk_token": "<|endoftext|>",
+   "vocab_size": 49152
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff