---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-0.6B/blob/main/LICENSE
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-0.6B-Base
tags:
- finance
- nlp
---
<p align="center">
<img src="https://github.com/NosibleAI/nosible-py/blob/main/docs/_static/readme.png?raw=true"/>
</p>

## Changelog
- **v1.2.0:** Multilingual upgrade. Extends coverage to **94 languages** while holding English performance flat, and adds currency / G10-geography coverage.
- **v1.1.0:** English forward-looking model trained on real-world Nosible Search Feeds.
**forward-looking-v1.2-base** is a temporal-orientation classification model. Given a short text snippet, it determines whether the snippet's main event or topic is **forward-looking** (planned, expected, forecast, or scheduled) or **not-forward-looking** (a past event, a current state, or a timeless fact). It is fine-tuned from [**Qwen3-0.6B-Base**](https://huggingface.co/Qwen/Qwen3-0.6B-Base) and reframes the task as instruction following, producing a single label token per input.
This is the **multilingual successor to [forward-looking-v1.1-base](https://huggingface.co/NOSIBLE/forward-looking-v1.1-base)**. v1.1 was trained primarily on English; v1.2 extends the same task to **94 languages** (English plus 93 additional languages) so the model can classify temporal orientation on text as it appears across global news and search feeds.
### What's new in v1.2
- **Multilingual coverage.** The training corpus extends the English [Forward-Looking](https://huggingface.co/datasets/NOSIBLE/forward-looking) data with faithful translations across 93 additional languages, where the forward / not-forward label is preserved through translation (future-tense framing is not flattened into the past, and vice versa).
- **Wider topic coverage.** v1.2 adds currency and G10-geography feeds, so temporal orientation is classified consistently across company, country / region, and currency news.
- **English held flat.** v1.2 is a multilingual extension, not an English re-train. English accuracy and macro-F1 are unchanged within run noise (see below).
- **The multilingual gap narrowed.** On the held-out validation set, the English-vs-multilingual accuracy gap shrinks from **~8.3pp** (v1.1) to **~4.7pp** (v1.2).
### Performance overview
All numbers below are measured on the **live SGLang endpoint** (OpenAI-compatible chat-completions, `enable_thinking=False`, `temperature=0`), scored against the same held-out validation splits for both models. Deltas are in **percentage points (pp)**.
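
For reference, the two metrics reported in this section can be computed as in the minimal sketch below. It assumes macro-F1 is the unweighted mean of the per-class F1 scores over `forward` and `not-forward` (the usual definition; the exact scoring script is not shown in this card). The helper names and the toy labels are illustrative only.

```python
from typing import Sequence

LABELS = ("forward", "not-forward")


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores over the two labels."""
    scores = []
    for label in LABELS:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (illustrative only, not taken from the validation set).
y_true = ["forward", "forward", "not-forward", "not-forward", "not-forward"]
y_pred = ["forward", "not-forward", "not-forward", "not-forward", "not-forward"]

print(f"accuracy = {accuracy(y_true, y_pred):.2%}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.2%}")
```

Macro-F1 weights both classes equally, so it drops faster than accuracy when a model collapses toward one label on a slice.
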
#### Headline
| Slice | n | Metric | v1.1 | v1.2 | Δ |
|-------|---:|--------|-----:|-----:|----:|
| English val | 20,000 | Accuracy | 92.12% | 91.96% | -0.16pp |
| English val | 20,000 | Macro-F1 | 91.91% | 91.73% | -0.18pp |
| **Multilingual val** | 19,155 | Accuracy | 83.83% | **87.22%** | **+3.39pp** |
| **Multilingual val** | 19,155 | Macro-F1 | 83.28% | **86.92%** | **+3.64pp** |
| Currency / geo feeds | 4,012 | Accuracy | 86.86% | 88.56% | +1.70pp |
| Currency / geo feeds | 4,012 | Macro-F1 | 86.18% | 88.20% | +2.02pp |
The small English regression (-0.16pp accuracy) is within per-run noise: v1.2 was a multilingual extension, not an English re-train, and the English baseline was already very strong. Multilingual accuracy improves by **+3.39pp**.
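
The percentage-point arithmetic behind the headline table and the English-vs-multilingual gap can be reproduced with a short sketch. The accuracy figures are copied from the table above; the `delta_pp` helper is a hypothetical name used only for this illustration.

```python
# Headline accuracy figures copied from the table above (percent).
accuracy = {
    "English val": {"v1.1": 92.12, "v1.2": 91.96},
    "Multilingual val": {"v1.1": 83.83, "v1.2": 87.22},
}


def delta_pp(old: float, new: float) -> float:
    """Difference between two percentages, in percentage points."""
    return round(new - old, 2)


# Version-over-version change for each slice.
deltas = {name: delta_pp(v["v1.1"], v["v1.2"]) for name, v in accuracy.items()}

# English-minus-multilingual accuracy gap for each model version.
gaps = {
    version: delta_pp(
        accuracy["Multilingual val"][version], accuracy["English val"][version]
    )
    for version in ("v1.1", "v1.2")
}

print(deltas)  # {'English val': -0.16, 'Multilingual val': 3.39}
print(gaps)    # {'v1.1': 8.29, 'v1.2': 4.74}
```
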
#### Selected languages (largest validation slices)
| Language | n | v1.1 acc | v1.2 acc | Δ acc |
|----------|---:|--------:|--------:|------:|
| Japanese (ja) | 1,500 | 84.47% | 87.80% | +3.33pp |
| Spanish (es) | 1,271 | 87.96% | 90.87% | +2.91pp |
| German (de) | 1,603 | 87.40% | 90.14% | +2.74pp |
| Italian (it) | 669 | 87.44% | 90.13% | +2.69pp |
| Russian (ru) | 1,678 | 88.02% | 90.58% | +2.56pp |
| French (fr) | 1,225 | 89.06% | 91.43% | +2.37pp |
| Portuguese (pt) | 731 | 89.47% | 91.52% | +2.05pp |
| Dutch (nl) | 506 | 88.74% | 90.12% | +1.38pp |
| Chinese (zh) | 1,377 | 89.91% | 90.92% | +1.01pp |
| Polish (pl) | 588 | 82.14% | 87.93% | +5.79pp |
The gains are largest on lower-resource languages. For example, accuracy rises on Tamil (61.95% → 75.22%), Gujarati (74.00% → 83.00%), and Swahili (60.53% → 67.54%), with even larger macro-F1 improvements.
## Strict Usage Requirements
> [!CAUTION]
> 1. **Disable Thinking:** You **must** set `enable_thinking=False` (or disable reasoning tokens).
> 2. **Exact System Prompt:** You **must** use the specific system prompt: `"Classify whether it is forward looking or not forward looking."`
> 3. **Constrain Output:** You **must** restrict generation to the two label tokens the model emits: `forward` and `_forward`. The token `_forward` denotes the **not-forward** class, so map it back after decoding.
>    * **SGLang:** Use `regex="(forward|_forward)"` in the API call.
>    * **vLLM:** Use `guided_choice=["forward", "_forward"]` in the API call.
>    * **llama.cpp / GGUF:** Apply a GBNF grammar or regex to force selection from the two tokens.
>
> Deviating from these requirements will **severely** impact performance and reliability.

> [!NOTE]
> The two classes are **forward** and **not-forward**. Internally the model was trained so that the not-forward class is emitted as the single token `_forward`. Always map `_forward` → `not-forward` (and `forward` → `forward`) after reading the model output.
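
The three requirements above can be bundled into a pair of small helpers. This is a minimal sketch, not the authors' deployment code: `build_request` and `to_label` are hypothetical names, and passing `regex` (SGLang) or `guided_choice` (vLLM) through the OpenAI client's `extra_body` is an assumption that should be checked against the server version you run.

```python
SYSTEM_PROMPT = "Classify whether it is forward looking or not forward looking."
LABEL_MAP = {"forward": "forward", "_forward": "not-forward"}


def build_request(text: str, model_id: str, backend: str = "sglang") -> dict:
    """Keyword arguments for an OpenAI-compatible chat-completions call."""
    # Requirement 1: thinking disabled via the chat template.
    extra_body = {"chat_template_kwargs": {"enable_thinking": False}}
    # Requirement 3: constrain decoding to the two label tokens.
    if backend == "sglang":
        extra_body["regex"] = "(forward|_forward)"  # assumed to be read from extra_body
    elif backend == "vllm":
        extra_body["guided_choice"] = ["forward", "_forward"]  # assumed likewise
    else:
        raise ValueError(f"unsupported backend: {backend!r}")
    return {
        "model": model_id,
        # Requirement 2: the exact system prompt used in training.
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        "temperature": 0,
        "extra_body": extra_body,
    }


def to_label(raw: str) -> str:
    """Map the model-form token to the readable label; fail loudly otherwise."""
    token = raw.strip()
    if token not in LABEL_MAP:
        raise ValueError(f"unexpected model output: {raw!r}")
    return LABEL_MAP[token]
```

With an OpenAI-compatible client, the request would then be sent as `client.chat.completions.create(**build_request(text, model_id))`, and the returned message content passed through `to_label`. Raising on anything other than the two tokens makes an unconstrained or misconfigured endpoint visible instead of silently defaulting to one class.
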
## Quickstart (local GPU)
Since this model was trained as a Causal LM using specific chat templates, you must use `apply_chat_template` with the exact system prompt used during training.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NOSIBLE/forward-looking-v1.2-base"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)

# Multilingual input is supported (94 languages).
text = "El banco central elevará las tasas de interés el próximo mes."

# 1. Structure the prompt exactly as used in training
messages = [
    {"role": "system", "content": "Classify whether it is forward looking or not forward looking."},
    {"role": "user", "content": text},
]

# 2. Apply chat template (thinking MUST be disabled)
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

# 3. Generate the label token (only a single token is expected).
#    Greedy decoding mirrors the temperature=0 setting used for evaluation.
outputs = model.generate(**inputs, max_new_tokens=1, do_sample=False)

# Decode only the newly generated token. The prompt is sliced off by length,
# because skip_special_tokens=True strips the <|im_start|> marker and makes
# splitting the decoded text on it unreliable.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
raw = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

# Map the model-form token back to the human-readable label
label = "not-forward" if raw == "_forward" else "forward"
print(label)
# Expected Output: forward
```
## Deployment
For production we recommend serving with [**SGLang**](https://github.com/sgl-project/sglang) (`sglang>=0.4.6.post1`), which exposes an OpenAI-compatible API endpoint. The model is based on Qwen3-0.6B and can be deployed anywhere Qwen3-0.6B can.
**Launch the server:**
```shell
python3 -m sglang.launch_server --model-path NOSIBLE/forward-looking-v1.2-base --dtype bfloat16 --host 0.0.0.0 --port 8080
```
**Call the endpoint** using the OpenAI-compatible client. Requesting `logprobs` lets you read a calibrated confidence for each label.
```python
import math

from openai import OpenAI

# OpenAI-compatible client pointed at your SGLang server (set base_url to your
# endpoint URL if remote). The request shape mirrors signals_deploy_v12.predict_one.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="EMPTY")
model_id = "NOSIBLE/forward-looking-v1.2-base"

# Multilingual input is supported.
text = "El banco central elevará las tasas de interés el próximo mes."

# The model emits the tokens `forward` / `_forward`; `_forward` == not-forward.
label_map = {"forward": "forward", "_forward": "not-forward"}

messages = [
    {"role": "system", "content": "Classify whether it is forward looking or not forward looking."},
    {"role": "user", "content": text},
]

completion = client.chat.completions.create(
    model=model_id,
    messages=messages,
    temperature=0,
    stream=False,
    logprobs=True,
    top_logprobs=2,
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},  # Must be set to false.
)

# The top-1 token is the predicted label; map the model-form token back to the
# human-readable label. The full top_logprobs slice gives a per-label confidence.
top = completion.choices[0].logprobs.content[0].top_logprobs

print(f"Input: {text}")
print(f"Predicted Label: {label_map.get(top[0].token.strip(), top[0].token.strip())}")
print("--- Label Confidence ---")
for lp in top:
    name = label_map.get(lp.token.strip(), lp.token.strip())
    print(f"Token: {name!r} | Probability: {math.exp(lp.logprob):.2%}")
```
#### Expected Output
```text
Input: El banco central elevará las tasas de interés el próximo mes.
Predicted Label: forward
--- Label Confidence ---
Token: 'forward' | Probability: 99.99%
Token: 'not-forward' | Probability: 0.01%
```
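
If the top-k slice contains a token other than the two labels (possible when decoding is not constrained), the raw probabilities will not sum to 100% across `forward` and `not-forward`. The sketch below renormalises over the two label tokens only. It assumes the log-probabilities have already been pulled out of the response as `(token, logprob)` pairs; `label_confidence` is a hypothetical helper and the example values are made up.

```python
import math

LABEL_MAP = {"forward": "forward", "_forward": "not-forward"}


def label_confidence(top_logprobs: list[tuple[str, float]]) -> dict[str, float]:
    """Renormalise probability mass over the two label tokens only."""
    probs: dict[str, float] = {}
    for token, logprob in top_logprobs:
        label = LABEL_MAP.get(token.strip())
        if label is not None:  # ignore any non-label token in the top-k slice
            probs[label] = probs.get(label, 0.0) + math.exp(logprob)
    total = sum(probs.values())
    if total == 0.0:
        raise ValueError("no label token found in top_logprobs")
    return {label: p / total for label, p in probs.items()}


# Hypothetical pairs, e.g. built as [(lp.token, lp.logprob) for lp in top].
example = [
    ("forward", math.log(0.90)),
    ("_forward", math.log(0.05)),
    ("The", math.log(0.05)),
]
print(label_confidence(example))
```

A label that does not appear in the slice is simply absent from the result, so callers may want to treat a missing key as a probability of zero.
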
**Legal Notice:** This model is a modification of the Qwen3-0.6B model. In compliance with the Apache 2.0 license, we retain all original copyright notices and provide this modification under the same license terms.
## Limitations
* **Parameter Size (0.6B):** As a small language model, it is designed for fast, specific classification and may struggle with highly nuanced or ambiguous text that requires extensive world knowledge.
* **Per-language quality varies.** Accuracy on the highest-resource languages approaches the English baseline; lower-resource languages remain below it despite the large v1.2 improvements.
* **Domain Specificity:** The model is fine-tuned on **financial contexts** and classifies temporal orientation; it is not a general-purpose tense detector.
* **Reporting-verb ambiguity:** Temporal orientation is judged by the main event, not the tense of reporting verbs ("said", "announced"). Very terse or context-free snippets can be ambiguous.
## Disclaimer
* **Not Financial Advice:** The outputs of this model **should not be interpreted as financial advice, investment recommendations, or an endorsement** of any financial instrument or asset.
* **Risk:** Financial markets are inherently volatile and risky. **Never make investment decisions based solely on the output of an AI model.** Always consult with a qualified financial professional.
## Team & Credits
This model was developed and maintained by the following team:
* [**Matthew Dicks**](https://www.linkedin.com/in/matthewdicks98/)
* [**Gareth Warburton**](https://www.linkedin.com/in/garethwarburton/)
* [**Stuart Reid**](https://www.linkedin.com/in/stuartgordonreid/)
## Citation
If you use this model, please cite it as follows:
```bibtex
@misc{nosible2025forwardlookingv12,
author = {NOSIBLE},
title = {Forward Looking v1.2 Base},
year = {2025},
publisher = {Hugging Face},
journal = {Hugging Face Repository},
howpublished = {https://huggingface.co/NOSIBLE/forward-looking-v1.2-base}
}
```
## Full language breakdown
v1.1 was trained on **English only**; v1.2 adds the **93 languages** below (94 total with English). The figures are the training-time evaluation per language and reproduce on the served SGLang endpoint to within ~0.2pp. Deltas are in percentage points (pp), and rows are sorted by validation row count (n), largest first.
| Language | n | v1.1 acc | v1.2 acc | Δ acc | v1.1 F1 | v1.2 F1 | Δ F1 |
|----------|---:|--------:|--------:|------:|-------:|-------:|------:|
| Russian (ru) | 1,677 | 88.19% | 90.64% | +2.45pp | 88.00% | 90.56% | +2.56pp |
| German (de) | 1,603 | 87.34% | 90.27% | +2.93pp | 87.01% | 90.12% | +3.11pp |
| Japanese (ja) | 1,500 | 84.47% | 87.93% | +3.46pp | 83.92% | 87.55% | +3.63pp |
| Chinese (zh) | 1,377 | 89.83% | 90.56% | +0.73pp | 89.68% | 90.47% | +0.79pp |
| Spanish (es) | 1,271 | 87.96% | 90.72% | +2.76pp | 87.75% | 90.55% | +2.80pp |
| French (fr) | 1,225 | 88.82% | 91.59% | +2.77pp | 88.63% | 91.46% | +2.83pp |
| Portuguese (pt) | 731 | 89.74% | 91.52% | +1.78pp | 89.57% | 91.39% | +1.82pp |
| Italian (it) | 669 | 87.59% | 89.69% | +2.10pp | 87.40% | 89.55% | +2.15pp |
| Polish (pl) | 588 | 82.31% | 88.27% | +5.96pp | 81.83% | 88.13% | +6.30pp |
| Dutch (nl) | 506 | 88.74% | 90.32% | +1.58pp | 88.31% | 89.99% | +1.68pp |
| Turkish (tr) | 393 | 82.19% | 86.77% | +4.58pp | 81.75% | 86.54% | +4.79pp |
| Indonesian (id) | 386 | 87.56% | 87.82% | +0.26pp | 86.92% | 87.32% | +0.40pp |
| Vietnamese (vi) | 368 | 85.33% | 89.95% | +4.62pp | 85.20% | 89.91% | +4.71pp |
| Czech (cs) | 332 | 84.94% | 85.84% | +0.90pp | 84.72% | 85.60% | +0.88pp |
| Korean (ko) | 291 | 87.97% | 88.32% | +0.35pp | 87.95% | 88.30% | +0.35pp |
| Arabic (ar) | 289 | 87.54% | 92.39% | +4.85pp | 87.42% | 92.36% | +4.94pp |
| Ukrainian (uk) | 246 | 85.37% | 91.46% | +6.09pp | 84.79% | 91.29% | +6.50pp |
| Swedish (sv) | 228 | 83.77% | 85.96% | +2.19pp | 83.38% | 85.78% | +2.40pp |
| Romanian (ro) | 223 | 83.41% | 85.65% | +2.24pp | 83.18% | 85.47% | +2.29pp |
| Hindi (hi) | 191 | 83.25% | 85.34% | +2.09pp | 83.00% | 85.22% | +2.22pp |
| Greek (el) | 187 | 84.49% | 87.17% | +2.68pp | 83.58% | 86.52% | +2.94pp |
| Thai (th) | 179 | 88.83% | 88.27% | -0.56pp | 88.48% | 87.97% | -0.51pp |
| Hungarian (hu) | 177 | 76.27% | 83.05% | +6.78pp | 75.52% | 82.76% | +7.24pp |
| Danish (da) | 176 | 81.82% | 88.64% | +6.82pp | 81.35% | 88.46% | +7.11pp |
| Malay (ms) | 143 | 85.31% | 88.11% | +2.80pp | 84.73% | 87.82% | +3.09pp |
| Slovak (sk) | 143 | 83.92% | 86.01% | +2.09pp | 83.53% | 85.76% | +2.23pp |
| Bengali (bn) | 142 | 85.21% | 88.73% | +3.52pp | 84.57% | 88.50% | +3.93pp |
| Persian (fa) | 136 | 78.68% | 80.15% | +1.47pp | 77.16% | 80.02% | +2.86pp |
| Finnish (fi) | 136 | 78.68% | 85.29% | +6.61pp | 76.31% | 84.43% | +8.12pp |
| Urdu (ur) | 120 | 82.50% | 87.50% | +5.00pp | 81.83% | 87.18% | +5.35pp |
| Norwegian (no) | 116 | 89.66% | 90.52% | +0.86pp | 89.19% | 90.26% | +1.07pp |
| Swahili (sw) | 114 | 59.65% | 65.79% | +6.14pp | 44.19% | 63.06% | +18.87pp |
| Serbian (sr) | 113 | 86.73% | 91.15% | +4.42pp | 86.20% | 90.99% | +4.79pp |
| Tamil (ta) | 113 | 62.83% | 75.22% | +12.39pp | 61.97% | 74.98% | +13.01pp |
| Hebrew (he) | 110 | 80.91% | 87.27% | +6.36pp | 80.55% | 87.23% | +6.68pp |
| Marathi (mr) | 109 | 72.48% | 75.23% | +2.75pp | 71.61% | 74.93% | +3.32pp |
| Bulgarian (bg) | 107 | 85.05% | 87.85% | +2.80pp | 85.01% | 87.78% | +2.77pp |
| Punjabi (pa) | 105 | 74.29% | 78.10% | +3.81pp | 73.82% | 77.57% | +3.75pp |
| Telugu (te) | 102 | 62.75% | 68.63% | +5.88pp | 58.09% | 67.10% | +9.01pp |
| Hausa (ha) | 101 | 52.48% | 60.40% | +7.92pp | 47.53% | 58.61% | +11.08pp |
| Gujarati (gu) | 100 | 73.00% | 83.00% | +10.00pp | 70.88% | 82.61% | +11.73pp |
| Tagalog (tl) | 99 | 71.72% | 71.72% | +0.00pp | 67.13% | 67.68% | +0.55pp |
| Kannada (kn) | 93 | 64.52% | 68.82% | +4.30pp | 59.75% | 64.01% | +4.26pp |
| Croatian (hr) | 91 | 89.01% | 91.21% | +2.20pp | 88.90% | 91.03% | +2.13pp |
| Azerbaijani (az) | 87 | 85.06% | 83.91% | -1.15pp | 84.39% | 82.70% | -1.69pp |
| Pashto (ps) | 87 | 51.72% | 71.26% | +19.54pp | 37.94% | 70.50% | +32.56pp |
| Uzbek (uz) | 84 | 59.52% | 72.62% | +13.10pp | 55.24% | 72.52% | +17.28pp |
| Burmese (my) | 83 | 55.42% | 81.93% | +26.51pp | 47.64% | 81.76% | +34.12pp |
| Nepali (ne) | 83 | 74.70% | 85.54% | +10.84pp | 74.33% | 85.52% | +11.19pp |
| Malayalam (ml) | 82 | 75.61% | 80.49% | +4.88pp | 68.94% | 77.91% | +8.97pp |
| Odia (or) | 81 | 75.31% | 79.01% | +3.70pp | 72.71% | 76.60% | +3.89pp |
| Kazakh (kk) | 77 | 71.43% | 77.92% | +6.49pp | 69.58% | 76.32% | +6.74pp |
| Somali (so) | 77 | 70.13% | 72.73% | +2.60pp | 45.09% | 65.10% | +20.01pp |
| Amharic (am) | 75 | 53.33% | 56.00% | +2.67pp | 37.22% | 42.83% | +5.61pp |
| Lithuanian (lt) | 73 | 73.97% | 78.08% | +4.11pp | 72.66% | 77.88% | +5.22pp |
| Sindhi (sd) | 73 | 78.08% | 83.56% | +5.48pp | 76.10% | 82.08% | +5.98pp |
| Sinhala (si) | 68 | 66.18% | 63.24% | -2.94pp | 59.85% | 51.91% | -7.94pp |
| Afrikaans (af) | 66 | 87.88% | 83.33% | -4.55pp | 87.59% | 83.14% | -4.45pp |
| Khmer (km) | 66 | 40.91% | 71.21% | +30.30pp | 31.01% | 70.88% | +39.87pp |
| Slovenian (sl) | 66 | 78.79% | 95.45% | +16.66pp | 76.64% | 95.32% | +18.68pp |
| Assamese (as) | 63 | 76.19% | 79.37% | +3.18pp | 74.55% | 77.05% | +2.50pp |
| Armenian (hy) | 56 | 83.93% | 82.14% | -1.79pp | 82.30% | 78.88% | -3.42pp |
| Kyrgyz (ky) | 49 | 63.27% | 75.51% | +12.24pp | 56.76% | 72.16% | +15.40pp |
| Latvian (lv) | 48 | 81.25% | 87.50% | +6.25pp | 80.21% | 87.30% | +7.09pp |
| Mongolian (mn) | 46 | 65.22% | 65.22% | +0.00pp | 54.90% | 54.90% | +0.00pp |
| Lao (lo) | 44 | 81.82% | 90.91% | +9.09pp | 75.76% | 87.88% | +12.12pp |
| Georgian (ka) | 42 | 57.14% | 69.05% | +11.91pp | 51.79% | 67.56% | +15.77pp |
| Sanskrit (sa) | 22 | 72.73% | 68.18% | -4.55pp | 70.54% | 66.45% | -4.09pp |
| Bosnian (bs) | 21 | 85.71% | 80.95% | -4.76pp | 78.79% | 76.67% | -2.12pp |
| Catalan (ca) | 21 | 95.24% | 90.48% | -4.76pp | 95.06% | 90.28% | -4.78pp |
| Irish (ga) | 20 | 65.00% | 65.00% | +0.00pp | 49.82% | 49.82% | +0.00pp |
| Belarusian (be) | 19 | 73.68% | 68.42% | -5.26pp | 63.60% | 59.29% | -4.31pp |
| Malagasy (mg) | 19 | 57.89% | 68.42% | +10.53pp | 51.28% | 66.07% | +14.79pp |
| Breton (br) | 18 | 55.56% | 61.11% | +5.55pp | 44.62% | 57.86% | +13.24pp |
| Welsh (cy) | 18 | 44.44% | 55.56% | +11.12pp | 37.50% | 55.00% | +17.50pp |
| Basque (eu) | 18 | 77.78% | 94.44% | +16.66pp | 60.00% | 92.59% | +32.59pp |
| Latin (la) | 18 | 88.89% | 88.89% | +0.00pp | 71.88% | 71.88% | +0.00pp |
| Macedonian (mk) | 18 | 83.33% | 83.33% | +0.00pp | 83.28% | 83.28% | +0.00pp |
| Oromo (om) | 18 | 55.56% | 55.56% | +0.00pp | 44.62% | 44.62% | +0.00pp |
| Serbo-Croatian (sh) | 18 | 88.89% | 94.44% | +5.55pp | 87.50% | 93.45% | +5.95pp |
| Xhosa (xh) | 18 | 55.56% | 50.00% | -5.56pp | 35.71% | 41.09% | +5.38pp |
| Yiddish (yi) | 18 | 72.22% | 83.33% | +11.11pp | 41.94% | 73.40% | +31.46pp |
| Galician (gl) | 17 | 94.12% | 100.00% | +5.88pp | 94.12% | 100.00% | +5.88pp |
| Icelandic (is) | 17 | 47.06% | 58.82% | +11.76pp | 39.53% | 47.11% | +7.58pp |
| Estonian (et) | 16 | 68.75% | 81.25% | +12.50pp | 67.61% | 81.18% | +13.57pp |
| Western Frisian (fy) | 16 | 68.75% | 68.75% | +0.00pp | 67.61% | 67.61% | +0.00pp |
| Scottish Gaelic (gd) | 16 | 56.25% | 56.25% | +0.00pp | 45.89% | 45.89% | +0.00pp |
| Javanese (jv) | 16 | 75.00% | 81.25% | +6.25pp | 74.60% | 80.57% | +5.97pp |
| Kurdish (ku) | 16 | 68.75% | 75.00% | +6.25pp | 54.29% | 66.67% | +12.38pp |
| Albanian (sq) | 16 | 75.00% | 75.00% | +0.00pp | 74.60% | 73.33% | -1.27pp |
| Sundanese (su) | 16 | 81.25% | 87.50% | +6.25pp | 79.22% | 87.30% | +8.08pp |
| Uyghur (ug) | 16 | 68.75% | 62.50% | -6.25pp | 54.29% | 38.46% | -15.83pp |
| Esperanto (eo) | 15 | 80.00% | 80.00% | +0.00pp | 72.05% | 76.19% | +4.14pp |
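
When reading the smaller rows of this table, it helps to keep the sample size in mind: with n = 18, a single flipped prediction moves accuracy by about 5.56pp, which is exactly the size of several deltas above. The sketch below makes that concrete. It uses a plain binomial standard error under the assumption of independent examples; this is a rough reading aid, not the authors' methodology, and the helper names are hypothetical.

```python
import math


def one_example_pp(n: int) -> float:
    """Accuracy change (in pp) caused by flipping a single prediction."""
    return 100.0 / n


def accuracy_std_error_pp(acc_percent: float, n: int) -> float:
    """Binomial standard error of an accuracy estimate, in pp."""
    p = acc_percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


# (language, n, v1.2 accuracy %) copied from the table above.
rows = [
    ("Russian (ru)", 1677, 90.64),
    ("Tamil (ta)", 113, 75.22),
    ("Xhosa (xh)", 18, 50.00),
]
for name, n, acc in rows:
    step = one_example_pp(n)
    se = accuracy_std_error_pp(acc, n)
    print(f"{name}: 1 example = {step:.2f}pp, std. error approx. {se:.2f}pp")
```

Deltas that are no larger than one or two examples on a slice are best treated as indicative rather than conclusive.
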