Update README.md
README.md
CHANGED
@@ -106,17 +106,20 @@ outputs = model.generate(
 # Evaluation
 ## Per-Character Perplexity
 **What is Perplexity?** Perplexity measures how well a language model predicts text. A model with low perplexity makes accurate predictions consistently, while a high perplexity means the model is frequently "surprised" by unexpected words or patterns. Lower perplexity indicates the model has learned language patterns more effectively. It's less "surprised" by what it encounters because it better understands how the language works.
-**Why Character-Level?** Different language models use different internal vocabularies - some break text into whole words, others into word fragments, and some into individual characters. This makes direct comparison difficult.
-Character-level perplexity creates a standardised comparison by calculating how well each model would theoretically perform if we measured their predictions character-by-character. We're not changing how the models work - instead, we use mathematical conversion to approximate their character-level performance based on their predictions.
-
 Perplexity fairly evaluates how well each model handles:
 - Spelling accuracy across a diverse vocabulary
 - Grammar rules that span multiple words
 - Sentence structure and flow
-- Language-specific patterns (
+- Language-specific patterns (how different languages form plural forms or compound words)
+
+**Why Character-Level?** Different language models use different internal vocabularies - some break text into whole words, others into word fragments, and some into individual characters. This makes direct comparison difficult.
+Character-level perplexity creates a standardised comparison by calculating how well each model would theoretically perform if we measured their predictions character-by-character. We're not changing how the models work - instead, we use mathematical conversion to approximate their character-level performance based on their predictions.
+
 **Why does this Matter?** Models with lower perplexity generally perform better on real-world tasks like text generation, translation, and understanding context. It's a reliable indicator of overall language competency across different applications.
+
 **What data did we use?**
 We use WMT24++ as it is a multilingual, language-parallel evaluation set that none of the models have seen during training. WMT24++ is a composite of texts from news, literature, speech, and social media; thus, it is suitable for foundational model benchmarking.
+
 | Language | TildeOpen-30B | Gemma-2-27B | EuroLLM-9B | ALIA-40B |
 |----------|---------------|-------------|------------|-----------------|
 | Bulgarian | **2.1716** | 2.3541 | 2.3502 | 2.2411 |
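The README text in this diff mentions a "mathematical conversion" to character-level perplexity but does not spell out the formula. One common way to do this, shown here only as a sketch and not necessarily the authors' exact method, is to sum the negative log-likelihood a model assigns to a text and divide by the number of characters instead of the number of tokens. The function names `per_char_perplexity` and `per_token_perplexity` are hypothetical, and counting characters with `len(text)` (Unicode code points) is an assumption.

```python
import math
from typing import Sequence


def per_token_perplexity(token_logprobs: Sequence[float]) -> float:
    """Standard perplexity: exp of the mean negative log-likelihood per token."""
    if not token_logprobs:
        raise ValueError("token_logprobs must be non-empty")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def per_char_perplexity(token_logprobs: Sequence[float], text: str) -> float:
    """Normalise the same total negative log-likelihood by character count.

    token_logprobs: natural-log probabilities the model assigned to each token
    of `text` (assumed to cover the whole text exactly once).
    """
    if not text:
        raise ValueError("text must be non-empty")
    total_nll = -sum(token_logprobs)
    # Assumption: one "character" is one Unicode code point.
    return math.exp(total_nll / len(text))


# Toy illustration with two hypothetical tokenizers over the same 4-character text.
text = "abcd"
model_a = [math.log(0.25)] * 2  # 2 tokens, each predicted with probability 0.25
model_b = [math.log(0.5)] * 4   # 4 tokens, each predicted with probability 0.5

print(per_token_perplexity(model_a), per_token_perplexity(model_b))
print(per_char_perplexity(model_a, text), per_char_perplexity(model_b, text))
```

In this toy case both models assign the same total probability to the text, so their per-character perplexities match even though their per-token perplexities differ. That is the property that makes the character-level figure comparable across tokenizers.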