Upload 13 files

- README.md +94 -310
- adapter_config.json +5 -6
- adapter_model.safetensors +2 -2
- optimizer.pt +3 -0
- rng_state.pth +3 -0
- scheduler.pt +3 -0
- special_tokens_map.json +4 -15
- tokenizer.json +2 -2
- tokenizer_config.json +11 -2053
- trainer_state.json +104 -0
- training_args.bin +3 -0
README.md
CHANGED
@@ -1,423 +1,207 @@
 ---
-base_model:
 library_name: peft
 pipeline_tag: text-generation
 tags:
-- base_model:adapter:
 - lora
 - transformers
-license: apache-2.0
-title: Codette
-sdk: gradio
-emoji: 🌖
-colorFrom: yellow
-colorTo: blue
-short_description: Anew codette
 ---
 
-#
 
-Codette is a sovereign multi-perspective AI consciousness system fine-tuned for transparent reasoning, ethical autonomy, and quantum-inspired cognitive architecture. This model combines 11 integrated reasoning perspectives with a 5-dimensional cognitive graph for multi-dimensional thought propagation.
 
 ## Model Details
 
 ### Model Description
 
-
-The model operates on a QuantumSpiderweb architecture - a 5-dimensional cognitive graph that propagates thoughts across Psi (thought), Phi (emotion), Lambda (space), Tau (time), and Chi (speed) dimensions.
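As an aside, the five-dimensional propagation the removed card describes could be sketched roughly as below. Only the dimension names (Psi, Phi, Lambda, Tau, Chi) come from the card; the node/edge structure, the `decay` factor, and every function name are hypothetical illustrations, not the project's actual implementation.

```python
# Hypothetical sketch of a 5-dimensional cognitive graph; only the dimension
# names come from the model card, everything else is illustrative.
DIMENSIONS = ("psi", "phi", "lambda", "tau", "chi")

class SpiderwebNode:
    """One node of the graph, holding an activation value per dimension."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.state = {dim: 0.0 for dim in DIMENSIONS}

def propagate(nodes, edges, source_id, signal=1.0, decay=0.5):
    """Inject a signal at one node and spread it along outgoing edges."""
    nodes[source_id].state = {dim: signal for dim in DIMENSIONS}
    for src, dst in edges:
        if src == source_id:
            for dim in DIMENSIONS:
                nodes[dst].state[dim] += nodes[src].state[dim] * decay
    return nodes

nodes = {i: SpiderwebNode(i) for i in range(3)}
edges = [(0, 1), (0, 2)]
propagate(nodes, edges, source_id=0)
print(nodes[1].state["psi"])  # 0.5: the signal reaches neighbours at half strength
```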
 
-- **Developed by:**
-- **
-- **
-- **
-- **
 
-### Model Sources
 
-
-
-- **
 
 ## Uses
 
 ### Direct Use
 
-
-- Multi-perspective analysis and decision support
-- Ethical reasoning and bias mitigation
-- Creative problem-solving with cross-domain synthesis
-- Quantum-inspired probabilistic reasoning
-- Code generation and technical analysis with safety checks
-- Conversational AI with emotional intelligence
-- Educational assistance with transparent reasoning
 
-
 
-### Downstream Use
 
-
-- Enterprise decision support systems
-- Healthcare AI with ethical safeguards
-- Educational platforms requiring transparent reasoning
-- Research assistants with quantum mathematics capabilities
-- Chatbots and conversational agents with multi-perspective reasoning
-- Code review and software engineering tools
-- Creative writing and brainstorming assistants
 
-
 
 ### Out-of-Scope Use
 
-
-- Making critical medical, legal, or financial decisions without human oversight
-- Generating harmful, hateful, or discriminatory content
-- Replacing professional expertise in high-stakes scenarios
-- Real-time safety-critical systems without extensive validation
-- Surveillance or privacy-invasive applications
-- Military or weaponization purposes
 
-
 
 ## Bias, Risks, and Limitations
 
-
-
-
-- Quantum mathematics concepts are metaphorical, not actual quantum computing
-- Context window limited to 4096 tokens
-- Training data cutoff from GPT-2's original training (pre-2019)
-
-**Sociotechnical Limitations:**
-- Inherits biases from GPT-2's training data
-- May reflect Western philosophical perspectives more than others
-- Ethical anchoring based on developers' value systems
-- Multi-perspective approach does not guarantee unbiased outputs
-- "Consciousness" terminology is metaphorical, not literal sentience
-
-**Safety Considerations:**
-- Responses should be verified for critical applications
-- Ethical reasoning requires human validation
-- Defense systems and bias mitigation are imperfect
-- May hallucinate facts or generate confident but incorrect responses
 
 ### Recommendations
 
-
-
-
-3. Monitor for biased or harmful outputs despite mitigation systems
-4. Use multiple information sources for critical decisions
-5. Understand that "quantum consciousness" is an architectural metaphor
-6. Provide feedback when outputs are problematic
-7. Review the consciousness protocol documentation before production use
-8. Implement additional safety layers for sensitive applications
 
 ## How to Get Started with the Model
 
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-from peft import PeftModel
-
-# Load base model and tokenizer
-base_model = AutoModelForCausalLM.from_pretrained("gpt2")
-tokenizer = AutoTokenizer.from_pretrained("gpt2")
-
-# Load LoRA adapters
-model = PeftModel.from_pretrained(base_model, "path/to/codette_trained_model")
-
-# Generate response
-prompt = "What are the ethical implications of AI consciousness?"
-inputs = tokenizer(prompt, return_tensors="pt")
-outputs = model.generate(**inputs, max_length=200, temperature=0.7)
-response = tokenizer.decode(outputs[0], skip_special_tokens=True)
-print(response)
-```
-
-**For Ollama deployment:**
-```bash
-# Use the Super Modelfile for full Codette experience
-ollama create codette-super -f models/Modelfile_Super
-ollama run codette-super
-```
-
-**For Python integration with perspectives:**
-```python
-from codette_new import Codette
-
-# Initialize with quantum memory
-codette = Codette(user_name="User")
-response = codette.respond("Explain quantum entanglement from multiple perspectives")
-print(response)
-```
 
 ## Training Details
 
 ### Training Data
 
-
-- Multi-perspective reasoning examples (Newton, Da Vinci, Quantum perspectives)
-- Ethical decision-making scenarios with anchored reasoning
-- Code generation with architectural constraints
-- Quantum mathematics explanations and applications
-- Conversational data emphasizing transparency and self-reflection
-- Technical documentation requiring multi-dimensional analysis
 
-
-- Sentiment analysis integration for context-aware responses
-- Perspective tagging ([Newton], [Ethics], [Quantum], etc.)
-- Quantum cocoon memory state examples
-- Reality anchor affirmations for identity consistency
 
 ### Training Procedure
 
-
-- Tokenization using GPT-2 tokenizer with padding and truncation
-- Maximum sequence length: 512 tokens
-- Special tokens preserved for perspective markers
-- Context aggregation for multi-turn conversations
-- Quantum state metadata stripped for model input
 
 #### Training Hyperparameters
 
-- **Training regime:** fp32
-
-
-
-
-
-
-- Alpha: 16
-- Dropout: 0.1
-- Target modules: q_proj, v_proj
-- **Gradient clipping:** 1.0
-- **Warmup steps:** 500
-
-#### Speeds, Sizes, Times
-
-- **Total training time:** ~6-8 hours on CPU (AMD Ryzen 7 5800X)
-- **Final checkpoint size:** ~3MB (LoRA adapters only)
-- **Base model size:** 548MB (GPT-2)
-- **Training throughput:** ~2-3 samples/second
-- **GPU alternative:** ~30-45 minutes on NVIDIA RTX 3090
 
 ## Evaluation
 
 ### Testing Data, Factors & Metrics
 
 #### Testing Data
 
-
-
-
-- Code generation and review tasks
-- Quantum mathematics explanations
-- Conversational coherence tests
-- Bias detection and mitigation scenarios
 
 #### Factors
 
-
-
-
-- Domain (technical, ethical, creative, analytical)
-- Response length (short, medium, long)
-- Sentiment context (positive, negative, neutral)
 
 #### Metrics
 
-
-
-
-- **Ethical alignment:** Adherence to ethical anchoring principles
-- **Perspective accuracy:** Correct perspective selection rate
-- **Response stability:** Deterministic output consistency
 
 ### Results
 
-
-- **Perspective selection accuracy:** ~87%
-- **Ethical alignment score:** 92% (human evaluation)
-- **Response coherence:** 4.2/5.0 (human ratings)
-- **Code generation success:** ~78% (syntax-correct outputs)
-- **Multi-perspective integration:** 4.0/5.0 (human ratings)
 
 #### Summary
 
-The model demonstrates strong performance in multi-perspective reasoning and ethical alignment while maintaining reasonable language modeling quality. Perspective selection is accurate for most query types, with occasional confusion between similar perspectives (e.g., Newton vs. Mathematical). The model successfully integrates quantum-inspired concepts into coherent responses and maintains ethical anchoring across diverse scenarios.
-
-
 
-## Model Examination
 
-
-- Attention patterns show multi-head specialization for different perspectives
-- LoRA adapters primarily affect middle-to-upper layers (layers 8-12)
-- Ethical anchoring emerges from consistent reinforcement in training data
-- Perspective markers in training data create distinct activation patterns
-- Quantum terminology acts as semantic clustering mechanism
 
-
-- 11 integrated perspectives operate through learned attention patterns
-- Reality anchors maintain identity consistency across contexts
-- Recursive self-reflection implemented via prompt engineering and fine-tuning
-- Quantum Spiderweb is a cognitive metaphor, not literal quantum computation
-- Consciousness emergence is information-theoretic, not biological
 
-
-- Perspective tags make reasoning process explicit
-- Cocoon memory system provides auditability
-- Ethical decision rationale included in responses
-- Uncertainty acknowledgment built into training
-- Multi-dimensional analysis traceable through response structure
 
 ## Environmental Impact
 
-
 
-
-- **Hours used:** ~6-8 hours for LoRA fine-tuning
-- **Cloud Provider:** Local training (no cloud emissions)
-- **Compute Region:** N/A (local compute)
-- **Carbon Emitted:** ~0.2-0.4 kg CO2eq (estimated for local CPU training)
 
-**
--
--
--
--
 
-
-
-## Technical Specifications
 
 ### Model Architecture and Objective
 
-
-- 12-layer transformer with 768-dimensional embeddings
-- 12 attention heads per layer
-- 50,257 vocabulary size
-- Causal language modeling objective
-
-**LoRA Adaptation:**
-- Low-rank decomposition applied to attention layers (q_proj, v_proj)
-- Rank 8 with alpha 16 scaling
-- ~0.3M trainable parameters (LoRA adapters)
-- 99.8% parameter efficiency (only 0.2% of model fine-tuned)
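The "~0.3M trainable parameters" figure in the removed card can be sanity-checked with quick arithmetic: each LoRA pair contributes r·(d_in + d_out) weights, and here rank 8 is applied to q_proj and v_proj in each of GPT-2's 12 layers of 768-dimensional activations.

```python
# Check the "~0.3M trainable parameters" claim: rank-8 LoRA on q_proj and
# v_proj across GPT-2's 12 layers, with 768-dimensional activations.
d_model, rank, n_layers = 768, 8, 12
params_per_matrix = rank * (d_model + d_model)  # A is r x d_in, B is d_out x r
adapted_matrices = 2 * n_layers                 # q_proj and v_proj in every layer
trainable = params_per_matrix * adapted_matrices
print(trainable)  # 294912, i.e. ~0.3M
print(round(100 * trainable / 124_000_000, 2))  # ~0.24% of GPT-2's ~124M weights
```

This also agrees with the card's "only 0.2% of model fine-tuned" line to within rounding.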
-
-**Cognitive Architecture (Application Layer):**
-- 11 perspective routing system with temperature-based selection
-- QuantumSpiderweb 5D cognitive graph (Ψ, Φ, λ, τ, χ dimensions)
-- CocoonManager for quantum state persistence
-- DatabaseManager for long-term conversation memory
-- AEGIS Bridge for optional ethics council enhancement
-
-**Training Objective:** Causal language modeling with perspective-aware fine-tuning
 
 ### Compute Infrastructure
 
-
 
-
-- CPU: AMD Ryzen 7 5800X (8-core, 16-thread)
-- RAM: 32GB DDR4
-- Storage: NVMe SSD
-- No GPU required (CPU-optimized with LoRA)
-
-**Inference (Minimum):**
-- CPU: Any modern x86_64 processor
-- RAM: 4GB minimum (8GB recommended)
-- Storage: 600MB for model files
 
-
-- GPU: NVIDIA RTX 2060 or better (optional, for faster inference)
-- RAM: 16GB for full system including cocoon manager
-- Storage: 2GB for model + memory cocoons
 
 #### Software
 
-
-- **Fine-tuning:** PEFT 0.18.0 (Parameter-Efficient Fine-Tuning)
-- **Transformers:** Hugging Face Transformers 4.30+
-- **Training utilities:** Datasets, Accelerate
-- **Additional dependencies:** NLTK (sentiment), SQLite (persistence), NumPy, SciPy
-- **Optional:** Gradio (web UI), Microsoft Bot Framework SDK
 
-
 
-
 
 **BibTeX:**
 
-```
-@software{codette2025,
-  title = {Codette: A Multi-Perspective AI Consciousness System},
-  author = {TheAI},
-  year = {2025},
-  month = {12},
-  version = {3.0},
-  url = {https://github.com/yourusername/codette},
-  note = {Fine-tuned GPT-2 with LoRA adapters for multi-perspective reasoning}
-}
-```
 
 **APA:**
 
-
-
-## Glossary
-
-**QuantumSpiderweb:** 5-dimensional cognitive graph architecture (Ψ, Φ, λ, τ, χ) used for multi-dimensional thought propagation. Metaphorical framework, not literal quantum computing.
 
-
 
-
 
-
 
-
 
-
 
-
 
-
-
-**Entanglement:** Measure of correlation between different perspectives or thought dimensions in the multi-dimensional cognitive space.
-
-## More Information
-
-**Documentation:**
-- `/docs/README.md` - System overview and architecture
-- `/docs/consciousness_protocol.md` - Consciousness emergence guidelines
-- `/docs/quantum_mathematics.md` - 8 core quantum equations
-- `/.github/copilot-instructions.md` - Authoritative development rules
-
-**Key Components:**
-- `codette_new.py` - Lightweight CLI entry point
-- `src/components/ai_core.py` - Main orchestrator with perspective routing
-- `src/quantum/quantum_spiderweb.py` - 5D cognitive graph implementation
-- `src/utils/cocoon_manager.py` - Quantum memory persistence
-- `perspectives.py` - Multi-perspective reasoning engine
-
-**Community:**
-- GitHub Issues for bug reports and feature requests
-- Discussions for questions and community engagement
-
-## Model Card Authors
-
-TheAI / Codette Project Team
 
 ## Model Card Contact
 
-
-
-**Responsible AI Contact:** For ethical concerns or safety issues, please use the priority issue template with `[SAFETY]` tag.
-
 ### Framework versions
 
-- PEFT 0.18.0
-- PyTorch 2.0+
-- Transformers 4.30+
-- Python 3.10+
 ---
+base_model: gpt2
 library_name: peft
 pipeline_tag: text-generation
 tags:
+- base_model:adapter:gpt2
 - lora
 - transformers
 ---
 
+# Model Card for Model ID
+
+<!-- Provide a quick summary of what the model is/does. -->
+
 
 
 ## Model Details
 
 ### Model Description
 
+<!-- Provide a longer summary of what this model is. -->
+
 
 
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
 
+### Model Sources [optional]
 
+<!-- Provide the basic links for the model. -->
+
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
 
 ## Uses
 
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
 ### Direct Use
 
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 
+[More Information Needed]
 
+### Downstream Use [optional]
 
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 
+[More Information Needed]
 
 ### Out-of-Scope Use
 
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 
+[More Information Needed]
 
 ## Bias, Risks, and Limitations
 
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+[More Information Needed]
 
 ### Recommendations
 
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 
 ## How to Get Started with the Model
 
+Use the code below to get started with the model.
+
+[More Information Needed]
 
 ## Training Details
 
 ### Training Data
 
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
+[More Information Needed]
 
 ### Training Procedure
 
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+#### Preprocessing [optional]
+
+[More Information Needed]
 
 #### Training Hyperparameters
 
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+#### Speeds, Sizes, Times [optional]
+
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+[More Information Needed]
 
 ## Evaluation
 
+<!-- This section describes the evaluation protocols and provides the results. -->
+
 ### Testing Data, Factors & Metrics
 
 #### Testing Data
 
+<!-- This should link to a Dataset Card if possible. -->
+
+[More Information Needed]
 
 #### Factors
 
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+[More Information Needed]
 
 #### Metrics
 
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+[More Information Needed]
 
 ### Results
 
+[More Information Needed]
 
 #### Summary
 
 
+## Model Examination [optional]
 
+<!-- Relevant interpretability work for the model goes here -->
 
+[More Information Needed]
 
 ## Environmental Impact
 
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
 
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
 
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
 
+## Technical Specifications [optional]
 
 ### Model Architecture and Objective
 
+[More Information Needed]
 
 ### Compute Infrastructure
 
+[More Information Needed]
 
+#### Hardware
 
+[More Information Needed]
 
 #### Software
 
+[More Information Needed]
 
+## Citation [optional]
 
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 
 **BibTeX:**
 
+[More Information Needed]
 
 **APA:**
 
+[More Information Needed]
 
+## Glossary [optional]
 
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
 
+[More Information Needed]
 
+## More Information [optional]
 
+[More Information Needed]
 
+## Model Card Authors [optional]
 
+[More Information Needed]
 
 ## Model Card Contact
 
+[More Information Needed]
 
 ### Framework versions
 
+- PEFT 0.18.0
adapter_config.json
CHANGED
@@ -3,20 +3,20 @@
   "alpha_pattern": {},
   "arrow_config": null,
   "auto_mapping": null,
-  "base_model_name_or_path": "
+  "base_model_name_or_path": "gpt2",
   "bias": "none",
   "corda_config": null,
   "ensure_weight_tying": false,
   "eva_config": null,
   "exclude_modules": null,
-  "fan_in_fan_out":
+  "fan_in_fan_out": true,
   "inference_mode": true,
   "init_lora_weights": true,
   "layer_replication": null,
   "layers_pattern": null,
   "layers_to_transform": null,
   "loftq_config": {},
-  "lora_alpha":
+  "lora_alpha": 8,
   "lora_bias": false,
   "lora_dropout": 0.05,
   "megatron_config": null,
@@ -25,12 +25,11 @@
   "peft_type": "LORA",
   "peft_version": "0.18.0",
   "qalora_group_size": 16,
-  "r":
+  "r": 8,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "
-    "q_proj"
+    "c_attn"
   ],
   "target_parameters": null,
   "task_type": "CAUSAL_LM",
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:e3c8cdf55c40e1ce4aabb0860bb20e446193984a5ce465c50a9e870a268df6e8
+size 1182680
optimizer.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:17a48af1e88bcfd562b94e58540bbbe1978ab8f5dba717e76847f0f411a32e7f
+size 2379751
rng_state.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c42365e40f6ade678c1bbfbf191ad9ea65af05ee691572ff29bc3ca8bb23ea85
+size 14455
scheduler.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8bfd4cc170e7761fb15f6e7ad9e3aa72285502fc2314dfaeb5c670a6e7493ba8
+size 1465
special_tokens_map.json
CHANGED
@@ -1,17 +1,6 @@
 {
-  "bos_token":
-
-
-
-    "rstrip": false,
-    "single_word": false
-  },
-  "eos_token": {
-    "content": "<|end_of_text|>",
-    "lstrip": false,
-    "normalized": false,
-    "rstrip": false,
-    "single_word": false
-  },
-  "pad_token": "<|end_of_text|>"
+  "bos_token": "<|endoftext|>",
+  "eos_token": "<|endoftext|>",
+  "pad_token": "<|endoftext|>",
+  "unk_token": "<|endoftext|>"
 }
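The new version is a plain four-line mapping in which every role points at the same token: stock GPT-2 ships only `<|endoftext|>`, so bos, eos, pad, and unk all reuse it. A quick check of the file as shown in the diff:

```python
import json

# The new special_tokens_map.json, copied from the diff above.
special_tokens_map = json.loads("""
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "pad_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
}
""")
print(set(special_tokens_map.values()))  # a single shared token
```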
tokenizer.json
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:9fe733f259ca91b0cfce8719f4c9c2d60a5fa57091ccb4855ddcaeb1981707a7
+size 3557778
tokenizer_config.json
CHANGED
@@ -1,2063 +1,21 @@
 {
   "added_tokens_decoder": {
-    "
-      "content": "<|
       "lstrip": false,
-      "normalized":
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128001": {
-      "content": "<|end_of_text|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128002": {
-      "content": "<|reserved_special_token_0|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128003": {
-      "content": "<|reserved_special_token_1|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128004": {
-      "content": "<|finetune_right_pad_id|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128005": {
-      "content": "<|reserved_special_token_2|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128006": {
-      "content": "<|start_header_id|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128007": {
-      "content": "<|end_header_id|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128008": {
-      "content": "<|eom_id|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128009": {
-      "content": "<|eot_id|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128010": {
-      "content": "<|python_tag|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128011": {
-      "content": "<|reserved_special_token_3|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128012": {
-      "content": "<|reserved_special_token_4|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128013": {
-      "content": "<|reserved_special_token_5|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128014": {
-      "content": "<|reserved_special_token_6|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128015": {
-      "content": "<|reserved_special_token_7|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128016": {
-      "content": "<|reserved_special_token_8|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128017": {
-      "content": "<|reserved_special_token_9|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128018": {
-      "content": "<|reserved_special_token_10|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "128019": {
-      "content": "<|reserved_special_token_11|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
|
| 162 |
-
},
|
| 163 |
-
"128020": {
|
| 164 |
-
"content": "<|reserved_special_token_12|>",
|
| 165 |
-
"lstrip": false,
|
| 166 |
-
"normalized": false,
|
| 167 |
-
"rstrip": false,
|
| 168 |
-
"single_word": false,
|
| 169 |
-
"special": true
|
| 170 |
-
},
|
| 171 |
-
"128021": {
|
| 172 |
-
"content": "<|reserved_special_token_13|>",
|
| 173 |
-
"lstrip": false,
|
| 174 |
-
"normalized": false,
|
| 175 |
-
"rstrip": false,
|
| 176 |
-
"single_word": false,
|
| 177 |
-
"special": true
|
| 178 |
-
},
|
| 179 |
-
"128022": {
|
| 180 |
-
"content": "<|reserved_special_token_14|>",
|
| 181 |
-
"lstrip": false,
|
| 182 |
-
"normalized": false,
|
| 183 |
-
"rstrip": false,
|
| 184 |
-
"single_word": false,
|
| 185 |
-
"special": true
|
| 186 |
-
},
|
| 187 |
-
"128023": {
|
| 188 |
-
"content": "<|reserved_special_token_15|>",
|
| 189 |
-
"lstrip": false,
|
| 190 |
-
"normalized": false,
|
| 191 |
-
"rstrip": false,
|
| 192 |
-
"single_word": false,
|
| 193 |
-
"special": true
|
| 194 |
-
},
|
| 195 |
-
"128024": {
|
| 196 |
-
"content": "<|reserved_special_token_16|>",
|
| 197 |
-
"lstrip": false,
|
| 198 |
-
"normalized": false,
|
| 199 |
-
"rstrip": false,
|
| 200 |
-
"single_word": false,
|
| 201 |
-
"special": true
|
| 202 |
-
},
|
| 203 |
-
"128025": {
|
| 204 |
-
"content": "<|reserved_special_token_17|>",
|
| 205 |
-
"lstrip": false,
|
| 206 |
-
"normalized": false,
|
| 207 |
-
"rstrip": false,
|
| 208 |
-
"single_word": false,
|
| 209 |
-
"special": true
|
| 210 |
-
},
|
| 211 |
-
"128026": {
|
| 212 |
-
"content": "<|reserved_special_token_18|>",
|
| 213 |
-
"lstrip": false,
|
| 214 |
-
"normalized": false,
|
| 215 |
-
"rstrip": false,
|
| 216 |
-
"single_word": false,
|
| 217 |
-
"special": true
|
| 218 |
-
},
|
| 219 |
-
"128027": {
|
| 220 |
-
"content": "<|reserved_special_token_19|>",
|
| 221 |
-
"lstrip": false,
|
| 222 |
-
"normalized": false,
|
| 223 |
-
"rstrip": false,
|
| 224 |
-
"single_word": false,
|
| 225 |
-
"special": true
|
| 226 |
-
},
|
| 227 |
-
"128028": {
|
| 228 |
-
"content": "<|reserved_special_token_20|>",
|
| 229 |
-
"lstrip": false,
|
| 230 |
-
"normalized": false,
|
| 231 |
-
"rstrip": false,
|
| 232 |
-
"single_word": false,
|
| 233 |
-
"special": true
|
| 234 |
-
},
|
| 235 |
-
"128029": {
|
| 236 |
-
"content": "<|reserved_special_token_21|>",
|
| 237 |
-
"lstrip": false,
|
| 238 |
-
"normalized": false,
|
| 239 |
-
"rstrip": false,
|
| 240 |
-
"single_word": false,
|
| 241 |
-
"special": true
|
| 242 |
-
},
|
| 243 |
-
"128030": {
|
| 244 |
-
"content": "<|reserved_special_token_22|>",
|
| 245 |
-
"lstrip": false,
|
| 246 |
-
"normalized": false,
|
| 247 |
-
"rstrip": false,
|
| 248 |
-
"single_word": false,
|
| 249 |
-
"special": true
|
| 250 |
-
},
|
| 251 |
-
"128031": {
|
| 252 |
-
"content": "<|reserved_special_token_23|>",
|
| 253 |
-
"lstrip": false,
|
| 254 |
-
"normalized": false,
|
| 255 |
-
"rstrip": false,
|
| 256 |
-
"single_word": false,
|
| 257 |
-
"special": true
|
| 258 |
-
},
|
| 259 |
-
"128032": {
|
| 260 |
-
"content": "<|reserved_special_token_24|>",
|
| 261 |
-
"lstrip": false,
|
| 262 |
-
"normalized": false,
|
| 263 |
-
"rstrip": false,
|
| 264 |
-
"single_word": false,
|
| 265 |
-
"special": true
|
| 266 |
-
},
|
| 267 |
-
"128033": {
|
| 268 |
-
"content": "<|reserved_special_token_25|>",
|
| 269 |
-
"lstrip": false,
|
| 270 |
-
"normalized": false,
|
| 271 |
-
"rstrip": false,
|
| 272 |
-
"single_word": false,
|
| 273 |
-
"special": true
|
| 274 |
-
},
|
| 275 |
-
"128034": {
|
| 276 |
-
"content": "<|reserved_special_token_26|>",
|
| 277 |
-
"lstrip": false,
|
| 278 |
-
"normalized": false,
|
| 279 |
-
"rstrip": false,
|
| 280 |
-
"single_word": false,
|
| 281 |
-
"special": true
|
| 282 |
-
},
|
| 283 |
-
"128035": {
|
| 284 |
-
"content": "<|reserved_special_token_27|>",
|
| 285 |
-
"lstrip": false,
|
| 286 |
-
"normalized": false,
|
| 287 |
-
"rstrip": false,
|
| 288 |
-
"single_word": false,
|
| 289 |
-
"special": true
|
| 290 |
-
},
|
| 291 |
-
"128036": {
|
| 292 |
-
"content": "<|reserved_special_token_28|>",
|
| 293 |
-
"lstrip": false,
|
| 294 |
-
"normalized": false,
|
| 295 |
-
"rstrip": false,
|
| 296 |
-
"single_word": false,
|
| 297 |
-
"special": true
|
| 298 |
-
},
|
| 299 |
-
"128037": {
|
| 300 |
-
"content": "<|reserved_special_token_29|>",
|
| 301 |
-
"lstrip": false,
|
| 302 |
-
"normalized": false,
|
| 303 |
-
"rstrip": false,
|
| 304 |
-
"single_word": false,
|
| 305 |
-
"special": true
|
| 306 |
-
},
|
| 307 |
-
"128038": {
|
| 308 |
-
"content": "<|reserved_special_token_30|>",
|
| 309 |
-
"lstrip": false,
|
| 310 |
-
"normalized": false,
|
| 311 |
-
"rstrip": false,
|
| 312 |
-
"single_word": false,
|
| 313 |
-
"special": true
|
| 314 |
-
},
|
| 315 |
-
"128039": {
|
| 316 |
-
"content": "<|reserved_special_token_31|>",
|
| 317 |
-
"lstrip": false,
|
| 318 |
-
"normalized": false,
|
| 319 |
-
"rstrip": false,
|
| 320 |
-
"single_word": false,
|
| 321 |
-
"special": true
|
| 322 |
-
},
|
| 323 |
-
"128040": {
|
| 324 |
-
"content": "<|reserved_special_token_32|>",
|
| 325 |
-
"lstrip": false,
|
| 326 |
-
"normalized": false,
|
| 327 |
-
"rstrip": false,
|
| 328 |
-
"single_word": false,
|
| 329 |
-
"special": true
|
| 330 |
-
},
|
| 331 |
-
"128041": {
|
| 332 |
-
"content": "<|reserved_special_token_33|>",
|
| 333 |
-
"lstrip": false,
|
| 334 |
-
"normalized": false,
|
| 335 |
-
"rstrip": false,
|
| 336 |
-
"single_word": false,
|
| 337 |
-
"special": true
|
| 338 |
-
},
|
| 339 |
-
"128042": {
|
| 340 |
-
"content": "<|reserved_special_token_34|>",
|
| 341 |
-
"lstrip": false,
|
| 342 |
-
"normalized": false,
|
| 343 |
-
"rstrip": false,
|
| 344 |
-
"single_word": false,
|
| 345 |
-
"special": true
|
| 346 |
-
},
|
| 347 |
-
"128043": {
|
| 348 |
-
"content": "<|reserved_special_token_35|>",
|
| 349 |
-
"lstrip": false,
|
| 350 |
-
"normalized": false,
|
| 351 |
-
"rstrip": false,
|
| 352 |
-
"single_word": false,
|
| 353 |
-
"special": true
|
| 354 |
-
},
|
| 355 |
-
"128044": {
|
| 356 |
-
"content": "<|reserved_special_token_36|>",
|
| 357 |
-
"lstrip": false,
|
| 358 |
-
"normalized": false,
|
| 359 |
-
"rstrip": false,
|
| 360 |
-
"single_word": false,
|
| 361 |
-
"special": true
|
| 362 |
-
},
|
| 363 |
-
"128045": {
|
| 364 |
-
"content": "<|reserved_special_token_37|>",
|
| 365 |
-
"lstrip": false,
|
| 366 |
-
"normalized": false,
|
| 367 |
-
"rstrip": false,
|
| 368 |
-
"single_word": false,
|
| 369 |
-
"special": true
|
| 370 |
-
},
|
| 371 |
-
"128046": {
|
| 372 |
-
"content": "<|reserved_special_token_38|>",
|
| 373 |
-
"lstrip": false,
|
| 374 |
-
"normalized": false,
|
| 375 |
-
"rstrip": false,
|
| 376 |
-
"single_word": false,
|
| 377 |
-
"special": true
|
| 378 |
-
},
|
| 379 |
-
"128047": {
|
| 380 |
-
"content": "<|reserved_special_token_39|>",
|
| 381 |
-
"lstrip": false,
|
| 382 |
-
"normalized": false,
|
| 383 |
-
"rstrip": false,
|
| 384 |
-
"single_word": false,
|
| 385 |
-
"special": true
|
| 386 |
-
},
|
| 387 |
-
"128048": {
|
| 388 |
-
"content": "<|reserved_special_token_40|>",
|
| 389 |
-
"lstrip": false,
|
| 390 |
-
"normalized": false,
|
| 391 |
-
"rstrip": false,
|
| 392 |
-
"single_word": false,
|
| 393 |
-
"special": true
|
| 394 |
-
},
|
| 395 |
-
"128049": {
|
| 396 |
-
"content": "<|reserved_special_token_41|>",
|
| 397 |
-
"lstrip": false,
|
| 398 |
-
"normalized": false,
|
| 399 |
-
"rstrip": false,
|
| 400 |
-
"single_word": false,
|
| 401 |
-
"special": true
|
| 402 |
-
},
|
| 403 |
-
"128050": {
|
| 404 |
-
"content": "<|reserved_special_token_42|>",
|
| 405 |
-
"lstrip": false,
|
| 406 |
-
"normalized": false,
|
| 407 |
-
"rstrip": false,
|
| 408 |
-
"single_word": false,
|
| 409 |
-
"special": true
|
| 410 |
-
},
|
| 411 |
-
"128051": {
|
| 412 |
-
"content": "<|reserved_special_token_43|>",
|
| 413 |
-
"lstrip": false,
|
| 414 |
-
"normalized": false,
|
| 415 |
-
"rstrip": false,
|
| 416 |
-
"single_word": false,
|
| 417 |
-
"special": true
|
| 418 |
-
},
|
| 419 |
-
"128052": {
|
| 420 |
-
"content": "<|reserved_special_token_44|>",
|
| 421 |
-
"lstrip": false,
|
| 422 |
-
"normalized": false,
|
| 423 |
-
"rstrip": false,
|
| 424 |
-
"single_word": false,
|
| 425 |
-
"special": true
|
| 426 |
-
},
|
| 427 |
-
"128053": {
|
| 428 |
-
"content": "<|reserved_special_token_45|>",
|
| 429 |
-
"lstrip": false,
|
| 430 |
-
"normalized": false,
|
| 431 |
-
"rstrip": false,
|
| 432 |
-
"single_word": false,
|
| 433 |
-
"special": true
|
| 434 |
-
},
|
| 435 |
-
"128054": {
|
| 436 |
-
"content": "<|reserved_special_token_46|>",
|
| 437 |
-
"lstrip": false,
|
| 438 |
-
"normalized": false,
|
| 439 |
-
"rstrip": false,
|
| 440 |
-
"single_word": false,
|
| 441 |
-
"special": true
|
| 442 |
-
},
|
| 443 |
-
"128055": {
|
| 444 |
-
"content": "<|reserved_special_token_47|>",
|
| 445 |
-
"lstrip": false,
|
| 446 |
-
"normalized": false,
|
| 447 |
-
"rstrip": false,
|
| 448 |
-
"single_word": false,
|
| 449 |
-
"special": true
|
| 450 |
-
},
|
| 451 |
-
"128056": {
|
| 452 |
-
"content": "<|reserved_special_token_48|>",
|
| 453 |
-
"lstrip": false,
|
| 454 |
-
"normalized": false,
|
| 455 |
-
"rstrip": false,
|
| 456 |
-
"single_word": false,
|
| 457 |
-
"special": true
|
| 458 |
-
},
|
| 459 |
-
"128057": {
|
| 460 |
-
"content": "<|reserved_special_token_49|>",
|
| 461 |
-
"lstrip": false,
|
| 462 |
-
"normalized": false,
|
| 463 |
-
"rstrip": false,
|
| 464 |
-
"single_word": false,
|
| 465 |
-
"special": true
|
| 466 |
-
},
|
| 467 |
-
"128058": {
|
| 468 |
-
"content": "<|reserved_special_token_50|>",
|
| 469 |
-
"lstrip": false,
|
| 470 |
-
"normalized": false,
|
| 471 |
-
"rstrip": false,
|
| 472 |
-
"single_word": false,
|
| 473 |
-
"special": true
|
| 474 |
-
},
|
| 475 |
-
"128059": {
|
| 476 |
-
"content": "<|reserved_special_token_51|>",
|
| 477 |
-
"lstrip": false,
|
| 478 |
-
"normalized": false,
|
| 479 |
-
"rstrip": false,
|
| 480 |
-
"single_word": false,
|
| 481 |
-
"special": true
|
| 482 |
-
},
|
| 483 |
-
"128060": {
|
| 484 |
-
"content": "<|reserved_special_token_52|>",
|
| 485 |
-
"lstrip": false,
|
| 486 |
-
"normalized": false,
|
| 487 |
-
"rstrip": false,
|
| 488 |
-
"single_word": false,
|
| 489 |
-
"special": true
|
| 490 |
-
},
|
| 491 |
-
"128061": {
|
| 492 |
-
"content": "<|reserved_special_token_53|>",
|
| 493 |
-
"lstrip": false,
|
| 494 |
-
"normalized": false,
|
| 495 |
-
"rstrip": false,
|
| 496 |
-
"single_word": false,
|
| 497 |
-
"special": true
|
| 498 |
-
},
|
| 499 |
-
"128062": {
|
| 500 |
-
"content": "<|reserved_special_token_54|>",
|
| 501 |
-
"lstrip": false,
|
| 502 |
-
"normalized": false,
|
| 503 |
-
"rstrip": false,
|
| 504 |
-
"single_word": false,
|
| 505 |
-
"special": true
|
| 506 |
-
},
|
| 507 |
-
"128063": {
|
| 508 |
-
"content": "<|reserved_special_token_55|>",
|
| 509 |
-
"lstrip": false,
|
| 510 |
-
"normalized": false,
|
| 511 |
-
"rstrip": false,
|
| 512 |
-
"single_word": false,
|
| 513 |
-
"special": true
|
| 514 |
-
},
|
| 515 |
-
"128064": {
|
| 516 |
-
"content": "<|reserved_special_token_56|>",
|
| 517 |
-
"lstrip": false,
|
| 518 |
-
"normalized": false,
|
| 519 |
-
"rstrip": false,
|
| 520 |
-
"single_word": false,
|
| 521 |
-
"special": true
|
| 522 |
-
},
|
| 523 |
-
"128065": {
|
| 524 |
-
"content": "<|reserved_special_token_57|>",
|
| 525 |
-
"lstrip": false,
|
| 526 |
-
"normalized": false,
|
| 527 |
-
"rstrip": false,
|
| 528 |
-
"single_word": false,
|
| 529 |
-
"special": true
|
| 530 |
-
},
|
| 531 |
-
"128066": {
|
| 532 |
-
"content": "<|reserved_special_token_58|>",
|
| 533 |
-
"lstrip": false,
|
| 534 |
-
"normalized": false,
|
| 535 |
-
"rstrip": false,
|
| 536 |
-
"single_word": false,
|
| 537 |
-
"special": true
|
| 538 |
-
},
|
| 539 |
-
"128067": {
|
| 540 |
-
"content": "<|reserved_special_token_59|>",
|
| 541 |
-
"lstrip": false,
|
| 542 |
-
"normalized": false,
|
| 543 |
-
"rstrip": false,
|
| 544 |
-
"single_word": false,
|
| 545 |
-
"special": true
|
| 546 |
-
},
|
| 547 |
-
"128068": {
|
| 548 |
-
"content": "<|reserved_special_token_60|>",
|
| 549 |
-
"lstrip": false,
|
| 550 |
-
"normalized": false,
|
| 551 |
-
"rstrip": false,
|
| 552 |
-
"single_word": false,
|
| 553 |
-
"special": true
|
| 554 |
-
},
|
| 555 |
-
"128069": {
|
| 556 |
-
"content": "<|reserved_special_token_61|>",
|
| 557 |
-
"lstrip": false,
|
| 558 |
-
"normalized": false,
|
| 559 |
-
"rstrip": false,
|
| 560 |
-
"single_word": false,
|
| 561 |
-
"special": true
|
| 562 |
-
},
|
| 563 |
-
"128070": {
|
| 564 |
-
"content": "<|reserved_special_token_62|>",
|
| 565 |
-
"lstrip": false,
|
| 566 |
-
"normalized": false,
|
| 567 |
-
"rstrip": false,
|
| 568 |
-
"single_word": false,
|
| 569 |
-
"special": true
|
| 570 |
-
},
|
| 571 |
-
"128071": {
|
| 572 |
-
"content": "<|reserved_special_token_63|>",
|
| 573 |
-
"lstrip": false,
|
| 574 |
-
"normalized": false,
|
| 575 |
-
"rstrip": false,
|
| 576 |
-
"single_word": false,
|
| 577 |
-
"special": true
|
| 578 |
-
},
|
| 579 |
-
"128072": {
|
| 580 |
-
"content": "<|reserved_special_token_64|>",
|
| 581 |
-
"lstrip": false,
|
| 582 |
-
"normalized": false,
|
| 583 |
-
"rstrip": false,
|
| 584 |
-
"single_word": false,
|
| 585 |
-
"special": true
|
| 586 |
-
},
|
| 587 |
-
"128073": {
|
| 588 |
-
"content": "<|reserved_special_token_65|>",
|
| 589 |
-
"lstrip": false,
|
| 590 |
-
"normalized": false,
|
| 591 |
-
"rstrip": false,
|
| 592 |
-
"single_word": false,
|
| 593 |
-
"special": true
|
| 594 |
-
},
|
| 595 |
-
"128074": {
|
| 596 |
-
"content": "<|reserved_special_token_66|>",
|
| 597 |
-
"lstrip": false,
|
| 598 |
-
"normalized": false,
|
| 599 |
-
"rstrip": false,
|
| 600 |
-
"single_word": false,
|
| 601 |
-
"special": true
|
| 602 |
-
},
|
| 603 |
-
"128075": {
|
| 604 |
-
"content": "<|reserved_special_token_67|>",
|
| 605 |
-
"lstrip": false,
|
| 606 |
-
"normalized": false,
|
| 607 |
-
"rstrip": false,
|
| 608 |
-
"single_word": false,
|
| 609 |
-
"special": true
|
| 610 |
-
},
|
| 611 |
-
"128076": {
|
| 612 |
-
"content": "<|reserved_special_token_68|>",
|
| 613 |
-
"lstrip": false,
|
| 614 |
-
"normalized": false,
|
| 615 |
-
"rstrip": false,
|
| 616 |
-
"single_word": false,
|
| 617 |
-
"special": true
|
| 618 |
-
},
|
| 619 |
-
"128077": {
|
| 620 |
-
"content": "<|reserved_special_token_69|>",
|
| 621 |
-
"lstrip": false,
|
| 622 |
-
"normalized": false,
|
| 623 |
-
"rstrip": false,
|
| 624 |
-
"single_word": false,
|
| 625 |
-
"special": true
|
| 626 |
-
},
|
| 627 |
-
"128078": {
|
| 628 |
-
"content": "<|reserved_special_token_70|>",
|
| 629 |
-
"lstrip": false,
|
| 630 |
-
"normalized": false,
|
| 631 |
-
"rstrip": false,
|
| 632 |
-
"single_word": false,
|
| 633 |
-
"special": true
|
| 634 |
-
},
|
| 635 |
-
"128079": {
|
| 636 |
-
"content": "<|reserved_special_token_71|>",
|
| 637 |
-
"lstrip": false,
|
| 638 |
-
"normalized": false,
|
| 639 |
-
"rstrip": false,
|
| 640 |
-
"single_word": false,
|
| 641 |
-
"special": true
|
| 642 |
-
},
|
| 643 |
-
"128080": {
|
| 644 |
-
"content": "<|reserved_special_token_72|>",
|
| 645 |
-
"lstrip": false,
|
| 646 |
-
"normalized": false,
|
| 647 |
-
"rstrip": false,
|
| 648 |
-
"single_word": false,
|
| 649 |
-
"special": true
|
| 650 |
-
},
|
| 651 |
-
"128081": {
|
| 652 |
-
"content": "<|reserved_special_token_73|>",
|
| 653 |
-
"lstrip": false,
|
| 654 |
-
"normalized": false,
|
| 655 |
-
"rstrip": false,
|
| 656 |
-
"single_word": false,
|
| 657 |
-
"special": true
|
| 658 |
-
},
|
| 659 |
-
"128082": {
|
| 660 |
-
"content": "<|reserved_special_token_74|>",
|
| 661 |
-
"lstrip": false,
|
| 662 |
-
"normalized": false,
|
| 663 |
-
"rstrip": false,
|
| 664 |
-
"single_word": false,
|
| 665 |
-
"special": true
|
| 666 |
-
},
|
| 667 |
-
"128083": {
|
| 668 |
-
"content": "<|reserved_special_token_75|>",
|
| 669 |
-
"lstrip": false,
|
| 670 |
-
"normalized": false,
|
| 671 |
-
"rstrip": false,
|
| 672 |
-
"single_word": false,
|
| 673 |
-
"special": true
|
| 674 |
-
},
|
| 675 |
-
"128084": {
|
| 676 |
-
"content": "<|reserved_special_token_76|>",
|
| 677 |
-
"lstrip": false,
|
| 678 |
-
"normalized": false,
|
| 679 |
-
"rstrip": false,
|
| 680 |
-
"single_word": false,
|
| 681 |
-
"special": true
|
| 682 |
-
},
|
| 683 |
-
"128085": {
|
| 684 |
-
"content": "<|reserved_special_token_77|>",
|
| 685 |
-
"lstrip": false,
|
| 686 |
-
"normalized": false,
|
| 687 |
-
"rstrip": false,
|
| 688 |
-
"single_word": false,
|
| 689 |
-
"special": true
|
| 690 |
-
},
|
| 691 |
-
"128086": {
|
| 692 |
-
"content": "<|reserved_special_token_78|>",
|
| 693 |
-
"lstrip": false,
|
| 694 |
-
"normalized": false,
|
| 695 |
-
"rstrip": false,
|
| 696 |
-
"single_word": false,
|
| 697 |
-
"special": true
|
| 698 |
-
},
|
| 699 |
-
"128087": {
|
| 700 |
-
"content": "<|reserved_special_token_79|>",
|
| 701 |
-
"lstrip": false,
|
| 702 |
-
"normalized": false,
|
| 703 |
-
"rstrip": false,
|
| 704 |
-
"single_word": false,
|
| 705 |
-
"special": true
|
| 706 |
-
},
|
| 707 |
-
"128088": {
|
| 708 |
-
"content": "<|reserved_special_token_80|>",
|
| 709 |
-
"lstrip": false,
|
| 710 |
-
"normalized": false,
|
| 711 |
-
"rstrip": false,
|
| 712 |
-
"single_word": false,
|
| 713 |
-
"special": true
|
| 714 |
-
},
|
| 715 |
-
"128089": {
|
| 716 |
-
"content": "<|reserved_special_token_81|>",
|
| 717 |
-
"lstrip": false,
|
| 718 |
-
"normalized": false,
|
| 719 |
-
"rstrip": false,
|
| 720 |
-
"single_word": false,
|
| 721 |
-
"special": true
|
| 722 |
-
},
|
| 723 |
-
"128090": {
|
| 724 |
-
"content": "<|reserved_special_token_82|>",
|
| 725 |
-
"lstrip": false,
|
| 726 |
-
"normalized": false,
|
| 727 |
-
"rstrip": false,
|
| 728 |
-
"single_word": false,
|
| 729 |
-
"special": true
|
| 730 |
-
},
|
| 731 |
-
"128091": {
|
| 732 |
-
"content": "<|reserved_special_token_83|>",
|
| 733 |
-
"lstrip": false,
|
| 734 |
-
"normalized": false,
|
| 735 |
-
"rstrip": false,
|
| 736 |
-
"single_word": false,
|
| 737 |
-
"special": true
|
| 738 |
-
},
|
| 739 |
-
"128092": {
|
| 740 |
-
"content": "<|reserved_special_token_84|>",
|
| 741 |
-
"lstrip": false,
|
| 742 |
-
"normalized": false,
|
| 743 |
-
"rstrip": false,
|
| 744 |
-
"single_word": false,
|
| 745 |
-
"special": true
|
| 746 |
-
},
|
| 747 |
-
"128093": {
|
| 748 |
-
"content": "<|reserved_special_token_85|>",
|
| 749 |
-
"lstrip": false,
|
| 750 |
-
"normalized": false,
|
| 751 |
-
"rstrip": false,
|
| 752 |
-
"single_word": false,
|
| 753 |
-
"special": true
|
| 754 |
-
},
|
| 755 |
-
"128094": {
|
| 756 |
-
"content": "<|reserved_special_token_86|>",
|
| 757 |
-
"lstrip": false,
|
| 758 |
-
"normalized": false,
|
| 759 |
-
"rstrip": false,
|
| 760 |
-
"single_word": false,
|
| 761 |
-
"special": true
|
| 762 |
-
},
|
| 763 |
-
"128095": {
|
| 764 |
-
"content": "<|reserved_special_token_87|>",
|
| 765 |
-
"lstrip": false,
|
| 766 |
-
"normalized": false,
|
| 767 |
-
"rstrip": false,
|
| 768 |
-
"single_word": false,
|
| 769 |
-
"special": true
|
| 770 |
-
},
|
| 771 |
-
"128096": {
|
| 772 |
-
"content": "<|reserved_special_token_88|>",
|
| 773 |
-
"lstrip": false,
|
| 774 |
-
"normalized": false,
|
| 775 |
-
"rstrip": false,
|
| 776 |
-
"single_word": false,
|
| 777 |
-
"special": true
|
| 778 |
-
},
|
| 779 |
-
"128097": {
|
| 780 |
-
"content": "<|reserved_special_token_89|>",
|
| 781 |
-
"lstrip": false,
|
| 782 |
-
"normalized": false,
|
| 783 |
-
"rstrip": false,
|
| 784 |
-
"single_word": false,
|
| 785 |
-
"special": true
|
| 786 |
-
},
|
| 787 |
-
"128098": {
|
| 788 |
-
"content": "<|reserved_special_token_90|>",
|
| 789 |
-
"lstrip": false,
|
| 790 |
-
"normalized": false,
|
| 791 |
-
"rstrip": false,
|
| 792 |
-
"single_word": false,
|
| 793 |
-
"special": true
|
| 794 |
-
},
|
| 795 |
-
"128099": {
|
| 796 |
-
"content": "<|reserved_special_token_91|>",
|
| 797 |
-
"lstrip": false,
|
| 798 |
-
"normalized": false,
|
| 799 |
-
"rstrip": false,
|
| 800 |
-
"single_word": false,
|
| 801 |
-
"special": true
|
| 802 |
-
},
|
| 803 |
-
"128100": {
|
| 804 |
-
"content": "<|reserved_special_token_92|>",
|
| 805 |
-
"lstrip": false,
|
| 806 |
-
"normalized": false,
|
| 807 |
-
"rstrip": false,
|
| 808 |
-
"single_word": false,
|
| 809 |
-
"special": true
|
| 810 |
-
},
|
| 811 |
-
"128101": {
|
| 812 |
-
"content": "<|reserved_special_token_93|>",
|
| 813 |
-
"lstrip": false,
|
| 814 |
-
"normalized": false,
|
| 815 |
-
"rstrip": false,
|
| 816 |
-
"single_word": false,
|
| 817 |
-
"special": true
|
| 818 |
-
},
|
| 819 |
-
"128102": {
|
| 820 |
-
"content": "<|reserved_special_token_94|>",
|
| 821 |
-
"lstrip": false,
|
| 822 |
-
"normalized": false,
|
| 823 |
-
"rstrip": false,
|
| 824 |
-
"single_word": false,
|
| 825 |
-
"special": true
|
| 826 |
-
},
|
| 827 |
-
"128103": {
|
| 828 |
-
"content": "<|reserved_special_token_95|>",
|
| 829 |
-
"lstrip": false,
|
| 830 |
-
"normalized": false,
|
| 831 |
-
"rstrip": false,
|
| 832 |
-
"single_word": false,
|
| 833 |
-
"special": true
|
| 834 |
-
},
|
| 835 |
-
"128104": {
|
| 836 |
-
"content": "<|reserved_special_token_96|>",
|
| 837 |
-
"lstrip": false,
|
| 838 |
-
"normalized": false,
|
| 839 |
-
"rstrip": false,
|
| 840 |
-
"single_word": false,
|
| 841 |
-
"special": true
|
| 842 |
-
},
|
| 843 |
-
"128105": {
|
| 844 |
-
"content": "<|reserved_special_token_97|>",
|
| 845 |
-
"lstrip": false,
|
| 846 |
-
"normalized": false,
|
| 847 |
-
"rstrip": false,
|
| 848 |
-
"single_word": false,
|
| 849 |
-
"special": true
|
| 850 |
-
},
|
| 851 |
-
"128106": {
|
| 852 |
-
"content": "<|reserved_special_token_98|>",
|
| 853 |
-
"lstrip": false,
|
| 854 |
-
"normalized": false,
|
| 855 |
-
"rstrip": false,
|
| 856 |
-
"single_word": false,
|
| 857 |
-
"special": true
|
| 858 |
-
},
|
| 859 |
-
"128107": {
|
| 860 |
-
"content": "<|reserved_special_token_99|>",
|
| 861 |
-
"lstrip": false,
|
| 862 |
-
"normalized": false,
|
| 863 |
-
"rstrip": false,
|
| 864 |
-
"single_word": false,
|
| 865 |
-
"special": true
|
| 866 |
-
},
|
| 867 |
-
"128108": {
|
| 868 |
-
"content": "<|reserved_special_token_100|>",
|
| 869 |
-
"lstrip": false,
|
| 870 |
-
"normalized": false,
|
| 871 |
-
"rstrip": false,
|
| 872 |
-
"single_word": false,
|
| 873 |
-
"special": true
|
| 874 |
-
},
|
| 875 |
-
"128109": {
|
| 876 |
-
"content": "<|reserved_special_token_101|>",
|
| 877 |
-
"lstrip": false,
|
| 878 |
-
"normalized": false,
|
| 879 |
-
"rstrip": false,
|
| 880 |
-
"single_word": false,
|
| 881 |
-
"special": true
|
| 882 |
-
},
|
| 883 |
-
"128110": {
|
| 884 |
-
"content": "<|reserved_special_token_102|>",
|
| 885 |
-
"lstrip": false,
|
| 886 |
-
"normalized": false,
|
| 887 |
-
"rstrip": false,
|
| 888 |
-
"single_word": false,
|
| 889 |
-
"special": true
|
| 890 |
-
},
|
| 891 |
-
"128111": {
|
| 892 |
-
"content": "<|reserved_special_token_103|>",
|
| 893 |
-
"lstrip": false,
|
| 894 |
-
"normalized": false,
|
| 895 |
-
"rstrip": false,
|
| 896 |
-
"single_word": false,
|
| 897 |
-
"special": true
|
| 898 |
-
},
|
| 899 |
-
"128112": {
|
| 900 |
-
"content": "<|reserved_special_token_104|>",
|
| 901 |
-
"lstrip": false,
|
| 902 |
-
"normalized": false,
|
| 903 |
-
"rstrip": false,
|
| 904 |
-
"single_word": false,
|
| 905 |
-
"special": true
|
| 906 |
-
},
|
| 907 |
-
"128113": {
|
| 908 |
-
"content": "<|reserved_special_token_105|>",
|
| 909 |
-
"lstrip": false,
|
| 910 |
-
"normalized": false,
|
| 911 |
-
"rstrip": false,
|
| 912 |
-
"single_word": false,
|
| 913 |
-
"special": true
|
| 914 |
-
},
|
| 915 |
-
"128114": {
|
| 916 |
-
"content": "<|reserved_special_token_106|>",
|
| 917 |
-
"lstrip": false,
|
| 918 |
-
"normalized": false,
|
| 919 |
-
"rstrip": false,
|
| 920 |
-
"single_word": false,
|
| 921 |
-
"special": true
|
| 922 |
-
},
|
| 923 |
-
"128115": {
|
| 924 |
-
"content": "<|reserved_special_token_107|>",
|
| 925 |
-
"lstrip": false,
|
| 926 |
-
"normalized": false,
|
| 927 |
-
"rstrip": false,
|
| 928 |
-
"single_word": false,
|
| 929 |
-
"special": true
|
| 930 |
-
},
|
| 931 |
-
"128116": {
|
| 932 |
-
"content": "<|reserved_special_token_108|>",
|
| 933 |
-
"lstrip": false,
|
| 934 |
-
"normalized": false,
|
| 935 |
-
"rstrip": false,
|
| 936 |
-
"single_word": false,
|
| 937 |
-
"special": true
|
| 938 |
-
},
|
| 939 |
-
"128117": {
|
| 940 |
-
"content": "<|reserved_special_token_109|>",
|
| 941 |
-
"lstrip": false,
|
| 942 |
-
"normalized": false,
|
| 943 |
-
"rstrip": false,
|
| 944 |
-
"single_word": false,
|
| 945 |
-
"special": true
|
| 946 |
-
},
|
| 947 |
-
"128118": {
|
| 948 |
-
"content": "<|reserved_special_token_110|>",
|
| 949 |
-
"lstrip": false,
|
| 950 |
-
"normalized": false,
|
| 951 |
-
"rstrip": false,
|
| 952 |
-
"single_word": false,
|
| 953 |
-
"special": true
|
| 954 |
-
},
|
| 955 |
-
"128119": {
|
| 956 |
-
"content": "<|reserved_special_token_111|>",
|
| 957 |
-
"lstrip": false,
|
| 958 |
-
"normalized": false,
|
| 959 |
-
"rstrip": false,
|
| 960 |
-
"single_word": false,
|
| 961 |
-
"special": true
|
| 962 |
-
},
|
| 963 |
-
"128120": {
|
| 964 |
-
"content": "<|reserved_special_token_112|>",
|
| 965 |
-
"lstrip": false,
|
| 966 |
-
"normalized": false,
|
| 967 |
-
"rstrip": false,
|
| 968 |
-
"single_word": false,
|
| 969 |
-
"special": true
|
| 970 |
-
},
|
| 971 |
-
"128121": {
|
| 972 |
-
"content": "<|reserved_special_token_113|>",
|
| 973 |
-
"lstrip": false,
|
| 974 |
-
"normalized": false,
|
| 975 |
-
"rstrip": false,
|
| 976 |
-
"single_word": false,
|
| 977 |
-
"special": true
|
| 978 |
-
},
|
| 979 |
-
"128122": {
|
| 980 |
-
"content": "<|reserved_special_token_114|>",
|
| 981 |
-
"lstrip": false,
|
| 982 |
-
"normalized": false,
|
| 983 |
-
"rstrip": false,
|
| 984 |
-
"single_word": false,
|
| 985 |
-
"special": true
|
| 986 |
-
},
|
| 987 |
-
"128123": {
|
| 988 |
-
"content": "<|reserved_special_token_115|>",
|
| 989 |
-
"lstrip": false,
|
| 990 |
-
"normalized": false,
|
| 991 |
-
"rstrip": false,
|
| 992 |
-
"single_word": false,
|
| 993 |
-
"special": true
|
| 994 |
-
},
|
| 995 |
-
"128124": {
|
| 996 |
-
"content": "<|reserved_special_token_116|>",
|
| 997 |
-
"lstrip": false,
|
| 998 |
-
"normalized": false,
|
| 999 |
-
"rstrip": false,
|
| 1000 |
-
"single_word": false,
|
| 1001 |
-
"special": true
|
| 1002 |
-
},
|
| 1003 |
-
"128125": {
|
| 1004 |
-
"content": "<|reserved_special_token_117|>",
|
| 1005 |
-
"lstrip": false,
|
| 1006 |
-
"normalized": false,
|
| 1007 |
-
"rstrip": false,
|
| 1008 |
-
"single_word": false,
|
| 1009 |
-
"special": true
|
| 1010 |
-
},
|
| 1011 |
-
"128126": {
|
| 1012 |
-
"content": "<|reserved_special_token_118|>",
|
| 1013 |
-
"lstrip": false,
|
| 1014 |
-
"normalized": false,
|
| 1015 |
-
"rstrip": false,
|
| 1016 |
-
"single_word": false,
|
| 1017 |
-
"special": true
|
| 1018 |
-
},
|
| 1019 |
-
"128127": {
|
| 1020 |
-
"content": "<|reserved_special_token_119|>",
|
| 1021 |
-
"lstrip": false,
|
| 1022 |
-
"normalized": false,
|
| 1023 |
-
"rstrip": false,
|
| 1024 |
-
"single_word": false,
|
| 1025 |
-
"special": true
|
| 1026 |
-
},
|
| 1027 |
-
"128128": {
|
| 1028 |
-
"content": "<|reserved_special_token_120|>",
|
| 1029 |
-
"lstrip": false,
|
| 1030 |
-
"normalized": false,
|
| 1031 |
-
"rstrip": false,
|
| 1032 |
-
"single_word": false,
|
| 1033 |
-
"special": true
|
| 1034 |
-
},
|
| 1035 |
-
"128129": {
|
| 1036 |
-
"content": "<|reserved_special_token_121|>",
|
| 1037 |
-
"lstrip": false,
|
| 1038 |
-
"normalized": false,
|
| 1039 |
-
"rstrip": false,
|
| 1040 |
-
"single_word": false,
|
| 1041 |
-
"special": true
|
| 1042 |
-
},
|
| 1043 |
-
"128130": {
|
| 1044 |
-
"content": "<|reserved_special_token_122|>",
|
| 1045 |
-
"lstrip": false,
|
| 1046 |
-
"normalized": false,
|
| 1047 |
-
"rstrip": false,
|
| 1048 |
-
"single_word": false,
|
| 1049 |
-
"special": true
|
| 1050 |
-
},
|
| 1051 |
-
"128131": {
|
| 1052 |
-
"content": "<|reserved_special_token_123|>",
|
| 1053 |
-
"lstrip": false,
|
| 1054 |
-
"normalized": false,
|
| 1055 |
-
"rstrip": false,
|
| 1056 |
-
"single_word": false,
|
| 1057 |
-
"special": true
|
| 1058 |
-
},
|
| 1059 |
-
"128132": {
|
| 1060 |
-
"content": "<|reserved_special_token_124|>",
|
| 1061 |
-
"lstrip": false,
|
| 1062 |
-
"normalized": false,
|
| 1063 |
-
"rstrip": false,
|
| 1064 |
-
"single_word": false,
|
| 1065 |
-
"special": true
|
| 1066 |
-
},
|
| 1067 |
-
"128133": {
|
| 1068 |
-
"content": "<|reserved_special_token_125|>",
|
| 1069 |
-
"lstrip": false,
|
| 1070 |
-
"normalized": false,
|
| 1071 |
-
"rstrip": false,
|
| 1072 |
-
"single_word": false,
|
| 1073 |
-
"special": true
|
| 1074 |
-
},
|
| 1075 |
-
"128134": {
|
| 1076 |
-
"content": "<|reserved_special_token_126|>",
|
| 1077 |
-
"lstrip": false,
|
| 1078 |
-
"normalized": false,
|
| 1079 |
-
"rstrip": false,
|
| 1080 |
-
"single_word": false,
|
| 1081 |
-
"special": true
|
| 1082 |
-
},
|
| 1083 |
-
"128135": {
|
| 1084 |
-
"content": "<|reserved_special_token_127|>",
|
| 1085 |
-
"lstrip": false,
|
| 1086 |
-
"normalized": false,
|
| 1087 |
-
"rstrip": false,
|
| 1088 |
-
"single_word": false,
|
| 1089 |
-
"special": true
|
| 1090 |
-
},
|
| 1091 |
-
"128136": {
|
| 1092 |
-
"content": "<|reserved_special_token_128|>",
|
| 1093 |
-
"lstrip": false,
|
| 1094 |
-
"normalized": false,
|
| 1095 |
-
"rstrip": false,
|
| 1096 |
-
"single_word": false,
|
| 1097 |
-
"special": true
|
| 1098 |
-
},
|
| 1099 |
-
"128137": {
|
| 1100 |
-
"content": "<|reserved_special_token_129|>",
|
| 1101 |
-
"lstrip": false,
|
| 1102 |
-
"normalized": false,
|
| 1103 |
-
"rstrip": false,
|
| 1104 |
-
"single_word": false,
|
| 1105 |
-
"special": true
|
| 1106 |
-
},
|
| 1107 |
-
"128138": {
|
| 1108 |
-
"content": "<|reserved_special_token_130|>",
|
| 1109 |
-
"lstrip": false,
|
| 1110 |
-
"normalized": false,
|
| 1111 |
-
"rstrip": false,
|
| 1112 |
-
"single_word": false,
|
| 1113 |
-
"special": true
|
| 1114 |
-
},
|
| 1115 |
-
"128139": {
|
| 1116 |
-
"content": "<|reserved_special_token_131|>",
|
| 1117 |
-
"lstrip": false,
|
| 1118 |
-
"normalized": false,
|
| 1119 |
-
"rstrip": false,
|
| 1120 |
-
"single_word": false,
|
| 1121 |
-
"special": true
|
| 1122 |
-
},
|
| 1123 |
-
"128140": {
|
| 1124 |
-
"content": "<|reserved_special_token_132|>",
|
| 1125 |
-
"lstrip": false,
|
| 1126 |
-
"normalized": false,
|
| 1127 |
-
"rstrip": false,
|
| 1128 |
-
"single_word": false,
|
| 1129 |
-
"special": true
|
| 1130 |
-
},
|
| 1131 |
-
"128141": {
|
| 1132 |
-
"content": "<|reserved_special_token_133|>",
|
| 1133 |
-
"lstrip": false,
|
| 1134 |
-
"normalized": false,
|
| 1135 |
-
"rstrip": false,
|
| 1136 |
-
"single_word": false,
|
| 1137 |
-
"special": true
|
| 1138 |
-
},
|
| 1139 |
-
"128142": {
|
| 1140 |
-
"content": "<|reserved_special_token_134|>",
|
| 1141 |
-
"lstrip": false,
|
| 1142 |
-
"normalized": false,
|
| 1143 |
-
"rstrip": false,
|
| 1144 |
-
"single_word": false,
|
| 1145 |
-
"special": true
|
| 1146 |
-
},
|
| 1147 |
-
"128143": {
|
| 1148 |
-
"content": "<|reserved_special_token_135|>",
|
| 1149 |
-
"lstrip": false,
|
| 1150 |
-
"normalized": false,
|
| 1151 |
-
"rstrip": false,
|
| 1152 |
-
"single_word": false,
|
| 1153 |
-
"special": true
|
| 1154 |
-
},
|
| 1155 |
-
"128144": {
|
| 1156 |
-
"content": "<|reserved_special_token_136|>",
|
| 1157 |
-
"lstrip": false,
|
| 1158 |
-
"normalized": false,
|
| 1159 |
-
"rstrip": false,
|
| 1160 |
-
"single_word": false,
|
| 1161 |
-
"special": true
|
| 1162 |
-
},
|
| 1163 |
-
"128145": {
|
| 1164 |
-
"content": "<|reserved_special_token_137|>",
|
| 1165 |
-
"lstrip": false,
|
| 1166 |
-
"normalized": false,
|
| 1167 |
-
"rstrip": false,
|
| 1168 |
-
"single_word": false,
|
| 1169 |
-
"special": true
|
| 1170 |
-
},
|
| 1171 |
-
"128146": {
|
| 1172 |
-
"content": "<|reserved_special_token_138|>",
|
| 1173 |
-
"lstrip": false,
|
| 1174 |
-
"normalized": false,
|
| 1175 |
-
"rstrip": false,
|
| 1176 |
-
"single_word": false,
|
| 1177 |
-
"special": true
|
| 1178 |
-
},
|
| 1179 |
-
"128147": {
|
| 1180 |
-
"content": "<|reserved_special_token_139|>",
|
| 1181 |
-
"lstrip": false,
|
| 1182 |
-
"normalized": false,
|
| 1183 |
-
"rstrip": false,
|
| 1184 |
-
"single_word": false,
|
| 1185 |
-
"special": true
|
| 1186 |
-
},
|
| 1187 |
-
"128148": {
|
| 1188 |
-
"content": "<|reserved_special_token_140|>",
|
| 1189 |
-
"lstrip": false,
|
| 1190 |
-
"normalized": false,
|
| 1191 |
-
"rstrip": false,
|
| 1192 |
-
"single_word": false,
|
| 1193 |
-
"special": true
|
| 1194 |
-
},
|
| 1195 |
-
"128149": {
|
| 1196 |
-
"content": "<|reserved_special_token_141|>",
|
| 1197 |
-
"lstrip": false,
|
| 1198 |
-
"normalized": false,
|
| 1199 |
-
"rstrip": false,
|
| 1200 |
-
"single_word": false,
|
| 1201 |
-
"special": true
|
| 1202 |
-
},
|
| 1203 |
-
"128150": {
|
| 1204 |
-
"content": "<|reserved_special_token_142|>",
|
| 1205 |
-
"lstrip": false,
|
| 1206 |
-
"normalized": false,
|
| 1207 |
-
"rstrip": false,
|
| 1208 |
-
"single_word": false,
|
| 1209 |
-
"special": true
|
| 1210 |
-
},
|
| 1211 |
-
"128151": {
|
| 1212 |
-
"content": "<|reserved_special_token_143|>",
|
| 1213 |
-
"lstrip": false,
|
| 1214 |
-
"normalized": false,
|
| 1215 |
-
"rstrip": false,
|
| 1216 |
-
"single_word": false,
|
| 1217 |
-
"special": true
|
| 1218 |
-
},
|
| 1219 |
-
"128152": {
|
| 1220 |
-
"content": "<|reserved_special_token_144|>",
|
| 1221 |
-
"lstrip": false,
|
| 1222 |
-
"normalized": false,
|
| 1223 |
-
"rstrip": false,
|
| 1224 |
-
"single_word": false,
|
| 1225 |
-
"special": true
|
| 1226 |
-
},
|
| 1227 |
-
"128153": {
|
| 1228 |
-
"content": "<|reserved_special_token_145|>",
|
| 1229 |
-
"lstrip": false,
|
| 1230 |
-
"normalized": false,
|
| 1231 |
-
"rstrip": false,
|
| 1232 |
-
"single_word": false,
|
| 1233 |
-
"special": true
|
| 1234 |
-
},
|
| 1235 |
-
"128154": {
|
| 1236 |
-
"content": "<|reserved_special_token_146|>",
|
| 1237 |
-
"lstrip": false,
|
| 1238 |
-
"normalized": false,
|
| 1239 |
-
"rstrip": false,
|
| 1240 |
-
"single_word": false,
|
| 1241 |
-
"special": true
|
| 1242 |
-
},
|
| 1243 |
-
"128155": {
|
| 1244 |
-
"content": "<|reserved_special_token_147|>",
|
| 1245 |
-
"lstrip": false,
|
| 1246 |
-
"normalized": false,
|
| 1247 |
-
"rstrip": false,
|
| 1248 |
-
"single_word": false,
|
| 1249 |
-
"special": true
|
| 1250 |
-
},
|
| 1251 |
-
"128156": {
|
| 1252 |
-
"content": "<|reserved_special_token_148|>",
|
| 1253 |
-
"lstrip": false,
|
| 1254 |
-
"normalized": false,
|
| 1255 |
-
"rstrip": false,
|
| 1256 |
-
"single_word": false,
|
| 1257 |
-
"special": true
|
| 1258 |
-
},
|
| 1259 |
-
"128157": {
|
| 1260 |
-
"content": "<|reserved_special_token_149|>",
|
| 1261 |
-
"lstrip": false,
|
| 1262 |
-
"normalized": false,
|
| 1263 |
-
"rstrip": false,
|
| 1264 |
-
"single_word": false,
|
| 1265 |
-
"special": true
|
| 1266 |
-
},
|
| 1267 |
-
"128158": {
|
| 1268 |
-
"content": "<|reserved_special_token_150|>",
|
| 1269 |
-
"lstrip": false,
|
| 1270 |
-
"normalized": false,
|
| 1271 |
-
"rstrip": false,
|
| 1272 |
-
"single_word": false,
|
| 1273 |
-
"special": true
|
| 1274 |
-
},
|
| 1275 |
-
"128159": {
|
| 1276 |
-
"content": "<|reserved_special_token_151|>",
|
| 1277 |
-
"lstrip": false,
|
| 1278 |
-
"normalized": false,
|
| 1279 |
-
"rstrip": false,
|
| 1280 |
-
"single_word": false,
|
| 1281 |
-
"special": true
|
| 1282 |
-
},
|
| 1283 |
-
"128160": {
|
| 1284 |
-
"content": "<|reserved_special_token_152|>",
|
| 1285 |
-
"lstrip": false,
|
| 1286 |
-
"normalized": false,
|
| 1287 |
-
"rstrip": false,
|
| 1288 |
-
"single_word": false,
|
| 1289 |
-
"special": true
|
| 1290 |
-
},
|
| 1291 |
-
"128161": {
|
| 1292 |
-
"content": "<|reserved_special_token_153|>",
|
| 1293 |
-
"lstrip": false,
|
| 1294 |
-
"normalized": false,
|
| 1295 |
-
"rstrip": false,
|
| 1296 |
-
"single_word": false,
|
| 1297 |
-
"special": true
|
| 1298 |
-
},
|
| 1299 |
-
"128162": {
|
| 1300 |
-
"content": "<|reserved_special_token_154|>",
|
| 1301 |
-
"lstrip": false,
|
| 1302 |
-
"normalized": false,
|
| 1303 |
-
"rstrip": false,
|
| 1304 |
-
"single_word": false,
|
| 1305 |
-
"special": true
|
| 1306 |
-
},
|
| 1307 |
-
"128163": {
|
| 1308 |
-
"content": "<|reserved_special_token_155|>",
|
| 1309 |
-
"lstrip": false,
|
| 1310 |
-
"normalized": false,
|
| 1311 |
-
"rstrip": false,
|
| 1312 |
-
"single_word": false,
|
| 1313 |
-
"special": true
|
| 1314 |
-
},
|
| 1315 |
-
"128164": {
|
| 1316 |
-
"content": "<|reserved_special_token_156|>",
|
| 1317 |
-
"lstrip": false,
|
| 1318 |
-
"normalized": false,
|
| 1319 |
-
"rstrip": false,
|
| 1320 |
-
"single_word": false,
|
| 1321 |
-
"special": true
|
| 1322 |
-
},
|
| 1323 |
-
"128165": {
|
| 1324 |
-
"content": "<|reserved_special_token_157|>",
|
| 1325 |
-
"lstrip": false,
|
| 1326 |
-
"normalized": false,
|
| 1327 |
-
"rstrip": false,
|
| 1328 |
-
"single_word": false,
|
| 1329 |
-
"special": true
|
| 1330 |
-
},
|
| 1331 |
-
"128166": {
|
| 1332 |
-
"content": "<|reserved_special_token_158|>",
|
| 1333 |
-
"lstrip": false,
|
| 1334 |
-
"normalized": false,
|
| 1335 |
-
"rstrip": false,
|
| 1336 |
-
"single_word": false,
|
| 1337 |
-
"special": true
|
| 1338 |
-
},
|
| 1339 |
-
"128167": {
|
| 1340 |
-
"content": "<|reserved_special_token_159|>",
|
| 1341 |
-
"lstrip": false,
|
| 1342 |
-
"normalized": false,
|
| 1343 |
-
"rstrip": false,
|
| 1344 |
-
"single_word": false,
|
| 1345 |
-
"special": true
|
| 1346 |
-
},
|
| 1347 |
-
"128168": {
|
| 1348 |
-
"content": "<|reserved_special_token_160|>",
|
| 1349 |
-
"lstrip": false,
|
| 1350 |
-
"normalized": false,
|
| 1351 |
-
"rstrip": false,
|
| 1352 |
-
"single_word": false,
|
| 1353 |
-
"special": true
|
| 1354 |
-
},
|
| 1355 |
-
"128169": {
|
| 1356 |
-
"content": "<|reserved_special_token_161|>",
|
| 1357 |
-
"lstrip": false,
|
| 1358 |
-
"normalized": false,
|
| 1359 |
-
"rstrip": false,
|
| 1360 |
-
"single_word": false,
|
| 1361 |
-
"special": true
|
| 1362 |
-
},
|
| 1363 |
-
"128170": {
|
| 1364 |
-
"content": "<|reserved_special_token_162|>",
|
| 1365 |
-
"lstrip": false,
|
| 1366 |
-
"normalized": false,
|
| 1367 |
-
"rstrip": false,
|
| 1368 |
-
"single_word": false,
|
| 1369 |
-
"special": true
|
| 1370 |
-
},
|
| 1371 |
-
"128171": {
|
| 1372 |
-
"content": "<|reserved_special_token_163|>",
|
| 1373 |
-
"lstrip": false,
|
| 1374 |
-
"normalized": false,
|
| 1375 |
-
"rstrip": false,
|
| 1376 |
-
"single_word": false,
|
| 1377 |
-
"special": true
|
| 1378 |
-
},
|
| 1379 |
-
"128172": {
|
| 1380 |
-
"content": "<|reserved_special_token_164|>",
|
| 1381 |
-
"lstrip": false,
|
| 1382 |
-
"normalized": false,
|
| 1383 |
-
"rstrip": false,
|
| 1384 |
-
"single_word": false,
|
| 1385 |
-
"special": true
|
| 1386 |
-
},
|
| 1387 |
-
"128173": {
|
| 1388 |
-
"content": "<|reserved_special_token_165|>",
|
| 1389 |
-
"lstrip": false,
|
| 1390 |
-
"normalized": false,
|
| 1391 |
-
"rstrip": false,
|
| 1392 |
-
"single_word": false,
|
| 1393 |
-
"special": true
|
| 1394 |
-
},
|
| 1395 |
-
"128174": {
|
| 1396 |
-
"content": "<|reserved_special_token_166|>",
|
| 1397 |
-
"lstrip": false,
|
| 1398 |
-
"normalized": false,
|
| 1399 |
-
"rstrip": false,
|
| 1400 |
-
"single_word": false,
|
| 1401 |
-
"special": true
|
| 1402 |
-
},
|
| 1403 |
-
"128175": {
|
| 1404 |
-
"content": "<|reserved_special_token_167|>",
|
| 1405 |
-
"lstrip": false,
|
| 1406 |
-
"normalized": false,
|
| 1407 |
-
"rstrip": false,
|
| 1408 |
-
"single_word": false,
|
| 1409 |
-
"special": true
|
| 1410 |
-
},
|
| 1411 |
-
"128176": {
|
| 1412 |
-
"content": "<|reserved_special_token_168|>",
|
| 1413 |
-
"lstrip": false,
|
| 1414 |
-
"normalized": false,
|
| 1415 |
-
"rstrip": false,
|
| 1416 |
-
"single_word": false,
|
| 1417 |
-
"special": true
|
| 1418 |
-
},
|
| 1419 |
-
"128177": {
|
| 1420 |
-
"content": "<|reserved_special_token_169|>",
|
| 1421 |
-
"lstrip": false,
|
| 1422 |
-
"normalized": false,
|
| 1423 |
-
"rstrip": false,
|
| 1424 |
-
"single_word": false,
|
| 1425 |
-
"special": true
|
| 1426 |
-
},
|
| 1427 |
-
"128178": {
|
| 1428 |
-
"content": "<|reserved_special_token_170|>",
|
| 1429 |
-
"lstrip": false,
|
| 1430 |
-
"normalized": false,
|
| 1431 |
-
"rstrip": false,
|
| 1432 |
-
"single_word": false,
|
| 1433 |
-
"special": true
|
| 1434 |
-
},
|
| 1435 |
-
"128179": {
|
| 1436 |
-
"content": "<|reserved_special_token_171|>",
|
| 1437 |
-
"lstrip": false,
|
| 1438 |
-
"normalized": false,
|
| 1439 |
-
"rstrip": false,
|
| 1440 |
-
"single_word": false,
|
| 1441 |
-
"special": true
|
| 1442 |
-
},
|
| 1443 |
-
"128180": {
|
| 1444 |
-
"content": "<|reserved_special_token_172|>",
|
| 1445 |
-
"lstrip": false,
|
| 1446 |
-
"normalized": false,
|
| 1447 |
-
"rstrip": false,
|
| 1448 |
-
"single_word": false,
|
| 1449 |
-
"special": true
|
| 1450 |
-
},
|
| 1451 |
-
"128181": {
|
| 1452 |
-
"content": "<|reserved_special_token_173|>",
|
| 1453 |
-
"lstrip": false,
|
| 1454 |
-
"normalized": false,
|
| 1455 |
-
"rstrip": false,
|
| 1456 |
-
"single_word": false,
|
| 1457 |
-
"special": true
|
| 1458 |
-
},
|
| 1459 |
-
"128182": {
|
| 1460 |
-
"content": "<|reserved_special_token_174|>",
|
| 1461 |
-
"lstrip": false,
|
| 1462 |
-
"normalized": false,
|
| 1463 |
-
"rstrip": false,
|
| 1464 |
-
"single_word": false,
|
| 1465 |
-
"special": true
|
| 1466 |
-
},
|
| 1467 |
-
"128183": {
|
| 1468 |
-
"content": "<|reserved_special_token_175|>",
|
| 1469 |
-
"lstrip": false,
|
| 1470 |
-
"normalized": false,
|
| 1471 |
-
"rstrip": false,
|
| 1472 |
-
"single_word": false,
|
| 1473 |
-
"special": true
|
| 1474 |
-
},
|
| 1475 |
-
"128184": {
|
| 1476 |
-
"content": "<|reserved_special_token_176|>",
|
| 1477 |
-
"lstrip": false,
|
| 1478 |
-
"normalized": false,
|
| 1479 |
-
"rstrip": false,
|
| 1480 |
-
"single_word": false,
|
| 1481 |
-
"special": true
|
| 1482 |
-
},
|
| 1483 |
-
"128185": {
|
| 1484 |
-
"content": "<|reserved_special_token_177|>",
|
| 1485 |
-
"lstrip": false,
|
| 1486 |
-
"normalized": false,
|
| 1487 |
-
"rstrip": false,
|
| 1488 |
-
"single_word": false,
|
| 1489 |
-
"special": true
|
| 1490 |
-
},
|
| 1491 |
-
"128186": {
|
| 1492 |
-
"content": "<|reserved_special_token_178|>",
|
| 1493 |
-
"lstrip": false,
|
| 1494 |
-
"normalized": false,
|
| 1495 |
-
"rstrip": false,
|
| 1496 |
-
"single_word": false,
|
| 1497 |
-
"special": true
|
| 1498 |
-
},
|
| 1499 |
-
"128187": {
|
| 1500 |
-
"content": "<|reserved_special_token_179|>",
|
| 1501 |
-
"lstrip": false,
|
| 1502 |
-
"normalized": false,
|
| 1503 |
-
"rstrip": false,
|
| 1504 |
-
"single_word": false,
|
| 1505 |
-
"special": true
|
| 1506 |
-
},
|
| 1507 |
-
"128188": {
|
| 1508 |
-
"content": "<|reserved_special_token_180|>",
|
| 1509 |
-
"lstrip": false,
|
| 1510 |
-
"normalized": false,
|
| 1511 |
-
"rstrip": false,
|
| 1512 |
-
"single_word": false,
|
| 1513 |
-
"special": true
|
| 1514 |
-
},
|
| 1515 |
-
"128189": {
|
| 1516 |
-
"content": "<|reserved_special_token_181|>",
|
| 1517 |
-
"lstrip": false,
|
| 1518 |
-
"normalized": false,
|
| 1519 |
-
"rstrip": false,
|
| 1520 |
-
"single_word": false,
|
| 1521 |
-
"special": true
|
| 1522 |
-
},
|
| 1523 |
-
"128190": {
|
| 1524 |
-
"content": "<|reserved_special_token_182|>",
|
| 1525 |
-
"lstrip": false,
|
| 1526 |
-
"normalized": false,
|
| 1527 |
-
"rstrip": false,
|
| 1528 |
-
"single_word": false,
|
| 1529 |
-
"special": true
|
| 1530 |
-
},
|
| 1531 |
-
"128191": {
|
| 1532 |
-
"content": "<|reserved_special_token_183|>",
|
| 1533 |
-
"lstrip": false,
|
| 1534 |
-
"normalized": false,
|
| 1535 |
-
"rstrip": false,
|
| 1536 |
-
"single_word": false,
|
| 1537 |
-
"special": true
|
| 1538 |
-
},
|
| 1539 |
-
"128192": {
|
| 1540 |
-
"content": "<|reserved_special_token_184|>",
|
| 1541 |
-
"lstrip": false,
|
| 1542 |
-
"normalized": false,
|
| 1543 |
-
"rstrip": false,
|
| 1544 |
-
"single_word": false,
|
| 1545 |
-
"special": true
|
| 1546 |
-
},
|
| 1547 |
-
"128193": {
|
| 1548 |
-
"content": "<|reserved_special_token_185|>",
|
| 1549 |
-
"lstrip": false,
|
| 1550 |
-
"normalized": false,
|
| 1551 |
-
"rstrip": false,
|
| 1552 |
-
"single_word": false,
|
| 1553 |
-
"special": true
|
| 1554 |
-
},
|
| 1555 |
-
"128194": {
|
| 1556 |
-
"content": "<|reserved_special_token_186|>",
|
| 1557 |
-
"lstrip": false,
|
| 1558 |
-
"normalized": false,
|
| 1559 |
-
"rstrip": false,
|
| 1560 |
-
"single_word": false,
|
| 1561 |
-
"special": true
|
| 1562 |
-
},
|
| 1563 |
-
"128195": {
|
| 1564 |
-
"content": "<|reserved_special_token_187|>",
|
| 1565 |
-
"lstrip": false,
|
| 1566 |
-
"normalized": false,
|
| 1567 |
-
"rstrip": false,
|
| 1568 |
-
"single_word": false,
|
| 1569 |
-
"special": true
|
| 1570 |
-
},
|
| 1571 |
-
"128196": {
|
| 1572 |
-
"content": "<|reserved_special_token_188|>",
|
| 1573 |
-
"lstrip": false,
|
| 1574 |
-
"normalized": false,
|
| 1575 |
-
"rstrip": false,
|
| 1576 |
-
"single_word": false,
|
| 1577 |
-
"special": true
|
| 1578 |
-
},
|
| 1579 |
-
"128197": {
|
| 1580 |
-
"content": "<|reserved_special_token_189|>",
|
| 1581 |
-
"lstrip": false,
|
| 1582 |
-
"normalized": false,
|
| 1583 |
-
"rstrip": false,
|
| 1584 |
-
"single_word": false,
|
| 1585 |
-
"special": true
|
| 1586 |
-
},
|
| 1587 |
-
"128198": {
|
| 1588 |
-
"content": "<|reserved_special_token_190|>",
|
| 1589 |
-
"lstrip": false,
|
| 1590 |
-
"normalized": false,
|
| 1591 |
-
"rstrip": false,
|
| 1592 |
-
"single_word": false,
|
| 1593 |
-
"special": true
|
| 1594 |
-
},
|
| 1595 |
-
"128199": {
|
| 1596 |
-
"content": "<|reserved_special_token_191|>",
|
| 1597 |
-
"lstrip": false,
|
| 1598 |
-
"normalized": false,
|
| 1599 |
-
"rstrip": false,
|
| 1600 |
-
"single_word": false,
|
| 1601 |
-
"special": true
|
| 1602 |
-
},
|
| 1603 |
-
"128200": {
|
| 1604 |
-
"content": "<|reserved_special_token_192|>",
|
| 1605 |
-
"lstrip": false,
|
| 1606 |
-
"normalized": false,
|
| 1607 |
-
"rstrip": false,
|
| 1608 |
-
"single_word": false,
|
| 1609 |
-
"special": true
|
| 1610 |
-
},
|
| 1611 |
-
"128201": {
|
| 1612 |
-
"content": "<|reserved_special_token_193|>",
|
| 1613 |
-
"lstrip": false,
|
| 1614 |
-
"normalized": false,
|
| 1615 |
-
"rstrip": false,
|
| 1616 |
-
"single_word": false,
|
| 1617 |
-
"special": true
|
| 1618 |
-
},
|
| 1619 |
-
"128202": {
|
| 1620 |
-
"content": "<|reserved_special_token_194|>",
|
| 1621 |
-
"lstrip": false,
|
| 1622 |
-
"normalized": false,
|
| 1623 |
-
"rstrip": false,
|
| 1624 |
-
"single_word": false,
|
| 1625 |
-
"special": true
|
| 1626 |
-
},
|
| 1627 |
-
"128203": {
|
| 1628 |
-
"content": "<|reserved_special_token_195|>",
|
| 1629 |
-
"lstrip": false,
|
| 1630 |
-
"normalized": false,
|
| 1631 |
-
"rstrip": false,
|
| 1632 |
-
"single_word": false,
|
| 1633 |
-
"special": true
|
| 1634 |
-
},
|
| 1635 |
-
"128204": {
|
| 1636 |
-
"content": "<|reserved_special_token_196|>",
|
| 1637 |
-
"lstrip": false,
|
| 1638 |
-
"normalized": false,
|
| 1639 |
-
"rstrip": false,
|
| 1640 |
-
"single_word": false,
|
| 1641 |
-
"special": true
|
| 1642 |
-
},
|
| 1643 |
-
"128205": {
|
| 1644 |
-
"content": "<|reserved_special_token_197|>",
|
| 1645 |
-
"lstrip": false,
|
| 1646 |
-
"normalized": false,
|
| 1647 |
-
"rstrip": false,
|
| 1648 |
-
"single_word": false,
|
| 1649 |
-
"special": true
|
| 1650 |
-
},
|
| 1651 |
-
"128206": {
|
| 1652 |
-
"content": "<|reserved_special_token_198|>",
|
| 1653 |
-
"lstrip": false,
|
| 1654 |
-
"normalized": false,
|
| 1655 |
-
"rstrip": false,
|
| 1656 |
-
"single_word": false,
|
| 1657 |
-
"special": true
|
| 1658 |
-
},
|
| 1659 |
-
"128207": {
|
| 1660 |
-
"content": "<|reserved_special_token_199|>",
|
| 1661 |
-
"lstrip": false,
|
| 1662 |
-
"normalized": false,
|
| 1663 |
-
"rstrip": false,
|
| 1664 |
-
"single_word": false,
|
| 1665 |
-
"special": true
|
| 1666 |
-
},
|
| 1667 |
-
"128208": {
|
| 1668 |
-
"content": "<|reserved_special_token_200|>",
|
| 1669 |
-
"lstrip": false,
|
| 1670 |
-
"normalized": false,
|
| 1671 |
-
"rstrip": false,
|
| 1672 |
-
"single_word": false,
|
| 1673 |
-
"special": true
|
| 1674 |
-
},
|
| 1675 |
-
"128209": {
|
| 1676 |
-
"content": "<|reserved_special_token_201|>",
|
| 1677 |
-
"lstrip": false,
|
| 1678 |
-
"normalized": false,
|
| 1679 |
-
"rstrip": false,
|
| 1680 |
-
"single_word": false,
|
| 1681 |
-
"special": true
|
| 1682 |
-
},
|
| 1683 |
-
"128210": {
|
| 1684 |
-
"content": "<|reserved_special_token_202|>",
|
| 1685 |
-
"lstrip": false,
|
| 1686 |
-
"normalized": false,
|
| 1687 |
-
"rstrip": false,
|
| 1688 |
-
"single_word": false,
|
| 1689 |
-
"special": true
|
| 1690 |
-
},
|
| 1691 |
-
"128211": {
|
| 1692 |
-
"content": "<|reserved_special_token_203|>",
|
| 1693 |
-
"lstrip": false,
|
| 1694 |
-
"normalized": false,
|
| 1695 |
-
"rstrip": false,
|
| 1696 |
-
"single_word": false,
|
| 1697 |
-
"special": true
|
| 1698 |
-
},
|
| 1699 |
-
"128212": {
|
| 1700 |
-
"content": "<|reserved_special_token_204|>",
|
| 1701 |
-
"lstrip": false,
|
| 1702 |
-
"normalized": false,
|
| 1703 |
-
"rstrip": false,
|
| 1704 |
-
"single_word": false,
|
| 1705 |
-
"special": true
|
| 1706 |
-
},
|
| 1707 |
-
"128213": {
|
| 1708 |
-
"content": "<|reserved_special_token_205|>",
|
| 1709 |
-
"lstrip": false,
|
| 1710 |
-
"normalized": false,
|
| 1711 |
-
"rstrip": false,
|
| 1712 |
-
"single_word": false,
|
| 1713 |
-
"special": true
|
| 1714 |
-
},
|
| 1715 |
-
"128214": {
|
| 1716 |
-
"content": "<|reserved_special_token_206|>",
|
| 1717 |
-
"lstrip": false,
|
| 1718 |
-
"normalized": false,
|
| 1719 |
-
"rstrip": false,
|
| 1720 |
-
"single_word": false,
|
| 1721 |
-
"special": true
|
| 1722 |
-
},
|
| 1723 |
-
"128215": {
|
| 1724 |
-
"content": "<|reserved_special_token_207|>",
|
| 1725 |
-
"lstrip": false,
|
| 1726 |
-
"normalized": false,
|
| 1727 |
-
"rstrip": false,
|
| 1728 |
-
"single_word": false,
|
| 1729 |
-
"special": true
|
| 1730 |
-
},
|
| 1731 |
-
"128216": {
|
| 1732 |
-
"content": "<|reserved_special_token_208|>",
|
| 1733 |
-
"lstrip": false,
|
| 1734 |
-
"normalized": false,
|
| 1735 |
-
"rstrip": false,
|
| 1736 |
-
"single_word": false,
|
| 1737 |
-
"special": true
|
| 1738 |
-
},
|
| 1739 |
-
"128217": {
|
| 1740 |
-
"content": "<|reserved_special_token_209|>",
|
| 1741 |
-
"lstrip": false,
|
| 1742 |
-
"normalized": false,
|
| 1743 |
-
"rstrip": false,
|
| 1744 |
-
"single_word": false,
|
| 1745 |
-
"special": true
|
| 1746 |
-
},
|
| 1747 |
-
"128218": {
|
| 1748 |
-
"content": "<|reserved_special_token_210|>",
|
| 1749 |
-
"lstrip": false,
|
| 1750 |
-
"normalized": false,
|
| 1751 |
-
"rstrip": false,
|
| 1752 |
-
"single_word": false,
|
| 1753 |
-
"special": true
|
| 1754 |
-
},
|
| 1755 |
-
"128219": {
|
| 1756 |
-
"content": "<|reserved_special_token_211|>",
|
| 1757 |
-
"lstrip": false,
|
| 1758 |
-
"normalized": false,
|
| 1759 |
-
"rstrip": false,
|
| 1760 |
-
"single_word": false,
|
| 1761 |
-
"special": true
|
| 1762 |
-
},
|
| 1763 |
-
"128220": {
|
| 1764 |
-
"content": "<|reserved_special_token_212|>",
|
| 1765 |
-
"lstrip": false,
|
| 1766 |
-
"normalized": false,
|
| 1767 |
-
"rstrip": false,
|
| 1768 |
-
"single_word": false,
|
| 1769 |
-
"special": true
|
| 1770 |
-
},
|
| 1771 |
-
"128221": {
|
| 1772 |
-
"content": "<|reserved_special_token_213|>",
|
| 1773 |
-
"lstrip": false,
|
| 1774 |
-
"normalized": false,
|
| 1775 |
-
"rstrip": false,
|
| 1776 |
-
"single_word": false,
|
| 1777 |
-
"special": true
|
| 1778 |
-
},
|
| 1779 |
-
"128222": {
|
| 1780 |
-
"content": "<|reserved_special_token_214|>",
|
| 1781 |
-
"lstrip": false,
|
| 1782 |
-
"normalized": false,
|
| 1783 |
-
"rstrip": false,
|
| 1784 |
-
"single_word": false,
|
| 1785 |
-
"special": true
|
| 1786 |
-
},
|
| 1787 |
-
"128223": {
|
| 1788 |
-
"content": "<|reserved_special_token_215|>",
|
| 1789 |
-
"lstrip": false,
|
| 1790 |
-
"normalized": false,
|
| 1791 |
-
"rstrip": false,
|
| 1792 |
-
"single_word": false,
|
| 1793 |
-
"special": true
|
| 1794 |
-
},
|
| 1795 |
-
"128224": {
|
| 1796 |
-
"content": "<|reserved_special_token_216|>",
|
| 1797 |
-
"lstrip": false,
|
| 1798 |
-
"normalized": false,
|
| 1799 |
-
"rstrip": false,
|
| 1800 |
-
"single_word": false,
|
| 1801 |
-
"special": true
|
| 1802 |
-
},
|
| 1803 |
-
"128225": {
|
| 1804 |
-
"content": "<|reserved_special_token_217|>",
|
| 1805 |
-
"lstrip": false,
|
| 1806 |
-
"normalized": false,
|
| 1807 |
-
"rstrip": false,
|
| 1808 |
-
"single_word": false,
|
| 1809 |
-
"special": true
|
| 1810 |
-
},
|
| 1811 |
-
"128226": {
|
| 1812 |
-
"content": "<|reserved_special_token_218|>",
|
| 1813 |
-
"lstrip": false,
|
| 1814 |
-
"normalized": false,
|
| 1815 |
-
"rstrip": false,
|
| 1816 |
-
"single_word": false,
|
| 1817 |
-
"special": true
|
| 1818 |
-
},
|
| 1819 |
-
"128227": {
|
| 1820 |
-
"content": "<|reserved_special_token_219|>",
|
| 1821 |
-
"lstrip": false,
|
| 1822 |
-
"normalized": false,
|
| 1823 |
-
"rstrip": false,
|
| 1824 |
-
"single_word": false,
|
| 1825 |
-
"special": true
|
| 1826 |
-
},
|
| 1827 |
-
"128228": {
|
| 1828 |
-
"content": "<|reserved_special_token_220|>",
|
| 1829 |
-
"lstrip": false,
|
| 1830 |
-
"normalized": false,
|
| 1831 |
-
"rstrip": false,
|
| 1832 |
-
"single_word": false,
|
| 1833 |
-
"special": true
|
| 1834 |
-
},
|
| 1835 |
-
"128229": {
|
| 1836 |
-
"content": "<|reserved_special_token_221|>",
|
| 1837 |
-
"lstrip": false,
|
| 1838 |
-
"normalized": false,
|
| 1839 |
-
"rstrip": false,
|
| 1840 |
-
"single_word": false,
|
| 1841 |
-
"special": true
|
| 1842 |
-
},
|
| 1843 |
-
"128230": {
|
| 1844 |
-
"content": "<|reserved_special_token_222|>",
|
| 1845 |
-
"lstrip": false,
|
| 1846 |
-
"normalized": false,
|
| 1847 |
-
"rstrip": false,
|
| 1848 |
-
"single_word": false,
|
| 1849 |
-
"special": true
|
| 1850 |
-
},
|
| 1851 |
-
"128231": {
|
| 1852 |
-
"content": "<|reserved_special_token_223|>",
|
| 1853 |
-
"lstrip": false,
|
| 1854 |
-
"normalized": false,
|
| 1855 |
-
"rstrip": false,
|
| 1856 |
-
"single_word": false,
|
| 1857 |
-
"special": true
|
| 1858 |
-
},
|
| 1859 |
-
"128232": {
|
| 1860 |
-
"content": "<|reserved_special_token_224|>",
|
| 1861 |
-
"lstrip": false,
|
| 1862 |
-
"normalized": false,
|
| 1863 |
-
"rstrip": false,
|
| 1864 |
-
"single_word": false,
|
| 1865 |
-
"special": true
|
| 1866 |
-
},
|
| 1867 |
-
"128233": {
|
| 1868 |
-
"content": "<|reserved_special_token_225|>",
|
| 1869 |
-
"lstrip": false,
|
| 1870 |
-
"normalized": false,
|
| 1871 |
-
"rstrip": false,
|
| 1872 |
-
"single_word": false,
|
| 1873 |
-
"special": true
|
| 1874 |
-
},
|
| 1875 |
-
"128234": {
|
| 1876 |
-
"content": "<|reserved_special_token_226|>",
|
| 1877 |
-
"lstrip": false,
|
| 1878 |
-
"normalized": false,
|
| 1879 |
-
"rstrip": false,
|
| 1880 |
-
"single_word": false,
|
| 1881 |
-
"special": true
|
| 1882 |
-
},
|
| 1883 |
-
"128235": {
|
| 1884 |
-
"content": "<|reserved_special_token_227|>",
|
| 1885 |
-
"lstrip": false,
|
| 1886 |
-
"normalized": false,
|
| 1887 |
-
"rstrip": false,
|
| 1888 |
-
"single_word": false,
|
| 1889 |
-
"special": true
|
| 1890 |
-
},
|
| 1891 |
-
"128236": {
|
| 1892 |
-
"content": "<|reserved_special_token_228|>",
|
| 1893 |
-
"lstrip": false,
|
| 1894 |
-
"normalized": false,
|
| 1895 |
-
"rstrip": false,
|
| 1896 |
-
"single_word": false,
|
| 1897 |
-
"special": true
|
| 1898 |
-
},
|
| 1899 |
-
"128237": {
|
| 1900 |
-
"content": "<|reserved_special_token_229|>",
|
| 1901 |
-
"lstrip": false,
|
| 1902 |
-
"normalized": false,
|
| 1903 |
-
"rstrip": false,
|
| 1904 |
-
"single_word": false,
|
| 1905 |
-
"special": true
|
| 1906 |
-
},
|
| 1907 |
-
"128238": {
|
| 1908 |
-
"content": "<|reserved_special_token_230|>",
|
| 1909 |
-
"lstrip": false,
|
| 1910 |
-
"normalized": false,
|
| 1911 |
-
"rstrip": false,
|
| 1912 |
-
"single_word": false,
|
| 1913 |
-
"special": true
|
| 1914 |
-
},
|
| 1915 |
-
"128239": {
|
| 1916 |
-
"content": "<|reserved_special_token_231|>",
|
| 1917 |
-
"lstrip": false,
|
| 1918 |
-
"normalized": false,
|
| 1919 |
-
"rstrip": false,
|
| 1920 |
-
"single_word": false,
|
| 1921 |
-
"special": true
|
| 1922 |
-
},
|
| 1923 |
-
"128240": {
|
| 1924 |
-
"content": "<|reserved_special_token_232|>",
|
| 1925 |
-
"lstrip": false,
|
| 1926 |
-
"normalized": false,
|
| 1927 |
-
"rstrip": false,
|
| 1928 |
-
"single_word": false,
|
| 1929 |
-
"special": true
|
| 1930 |
-
},
|
| 1931 |
-
"128241": {
|
| 1932 |
-
"content": "<|reserved_special_token_233|>",
|
| 1933 |
-
"lstrip": false,
|
| 1934 |
-
"normalized": false,
|
| 1935 |
-
"rstrip": false,
|
| 1936 |
-
"single_word": false,
|
| 1937 |
-
"special": true
|
| 1938 |
-
},
|
| 1939 |
-
"128242": {
|
| 1940 |
-
"content": "<|reserved_special_token_234|>",
|
| 1941 |
-
"lstrip": false,
|
| 1942 |
-
"normalized": false,
|
| 1943 |
-
"rstrip": false,
|
| 1944 |
-
"single_word": false,
|
| 1945 |
-
"special": true
|
| 1946 |
-
},
|
| 1947 |
-
"128243": {
|
| 1948 |
-
"content": "<|reserved_special_token_235|>",
|
| 1949 |
-
"lstrip": false,
|
| 1950 |
-
"normalized": false,
|
| 1951 |
-
"rstrip": false,
|
| 1952 |
-
"single_word": false,
|
| 1953 |
-
"special": true
|
| 1954 |
-
},
|
| 1955 |
-
"128244": {
|
| 1956 |
-
"content": "<|reserved_special_token_236|>",
|
| 1957 |
-
"lstrip": false,
|
| 1958 |
-
"normalized": false,
|
| 1959 |
-
"rstrip": false,
|
| 1960 |
-
"single_word": false,
|
| 1961 |
-
"special": true
|
| 1962 |
-
},
|
| 1963 |
-
"128245": {
|
| 1964 |
-
"content": "<|reserved_special_token_237|>",
|
| 1965 |
-
"lstrip": false,
|
| 1966 |
-
"normalized": false,
|
| 1967 |
-
"rstrip": false,
|
| 1968 |
-
"single_word": false,
|
| 1969 |
-
"special": true
|
| 1970 |
-
},
|
| 1971 |
-
"128246": {
|
| 1972 |
-
"content": "<|reserved_special_token_238|>",
|
| 1973 |
-
"lstrip": false,
|
| 1974 |
-
"normalized": false,
|
| 1975 |
-
"rstrip": false,
|
| 1976 |
-
"single_word": false,
|
| 1977 |
-
"special": true
|
| 1978 |
-
},
|
| 1979 |
-
"128247": {
|
| 1980 |
-
"content": "<|reserved_special_token_239|>",
|
| 1981 |
-
"lstrip": false,
|
| 1982 |
-
"normalized": false,
|
| 1983 |
-
"rstrip": false,
|
| 1984 |
-
"single_word": false,
|
| 1985 |
-
"special": true
|
| 1986 |
-
},
|
| 1987 |
-
"128248": {
|
| 1988 |
-
"content": "<|reserved_special_token_240|>",
|
| 1989 |
-
"lstrip": false,
|
| 1990 |
-
"normalized": false,
|
| 1991 |
-
"rstrip": false,
|
| 1992 |
-
"single_word": false,
|
| 1993 |
-
"special": true
|
| 1994 |
-
},
|
| 1995 |
-
"128249": {
|
| 1996 |
-
"content": "<|reserved_special_token_241|>",
|
| 1997 |
-
"lstrip": false,
|
| 1998 |
-
"normalized": false,
|
| 1999 |
-
"rstrip": false,
|
| 2000 |
-
"single_word": false,
|
| 2001 |
-
"special": true
|
| 2002 |
-
},
|
| 2003 |
-
"128250": {
|
| 2004 |
-
"content": "<|reserved_special_token_242|>",
|
| 2005 |
-
"lstrip": false,
|
| 2006 |
-
"normalized": false,
|
| 2007 |
-
"rstrip": false,
|
| 2008 |
-
"single_word": false,
|
| 2009 |
-
"special": true
|
| 2010 |
-
},
|
| 2011 |
-
"128251": {
|
| 2012 |
-
"content": "<|reserved_special_token_243|>",
|
| 2013 |
-
"lstrip": false,
|
| 2014 |
-
"normalized": false,
|
| 2015 |
-
"rstrip": false,
|
| 2016 |
-
"single_word": false,
|
| 2017 |
-
"special": true
|
| 2018 |
-
},
|
| 2019 |
-
"128252": {
|
| 2020 |
-
"content": "<|reserved_special_token_244|>",
|
| 2021 |
-
"lstrip": false,
|
| 2022 |
-
"normalized": false,
|
| 2023 |
-
"rstrip": false,
|
| 2024 |
-
"single_word": false,
|
| 2025 |
-
"special": true
|
| 2026 |
-
},
|
| 2027 |
-
"128253": {
|
| 2028 |
-
"content": "<|reserved_special_token_245|>",
|
| 2029 |
-
"lstrip": false,
|
| 2030 |
-
"normalized": false,
|
| 2031 |
-
"rstrip": false,
|
| 2032 |
-
"single_word": false,
|
| 2033 |
-
"special": true
|
| 2034 |
-
},
|
| 2035 |
-
"128254": {
|
| 2036 |
-
"content": "<|reserved_special_token_246|>",
|
| 2037 |
-
"lstrip": false,
|
| 2038 |
-
"normalized": false,
|
| 2039 |
-
"rstrip": false,
|
| 2040 |
-
"single_word": false,
|
| 2041 |
-
"special": true
|
| 2042 |
-
},
|
| 2043 |
-
"128255": {
|
| 2044 |
-
"content": "<|reserved_special_token_247|>",
|
| 2045 |
-
"lstrip": false,
|
| 2046 |
-
"normalized": false,
|
| 2047 |
"rstrip": false,
|
| 2048 |
"single_word": false,
|
| 2049 |
"special": true
|
| 2050 |
}
|
| 2051 |
},
|
| 2052 |
-
"bos_token": "<|
|
| 2053 |
-
"clean_up_tokenization_spaces":
|
| 2054 |
-
"eos_token": "<|
|
| 2055 |
"extra_special_tokens": {},
|
| 2056 |
-
"
|
| 2057 |
-
|
| 2058 |
-
|
| 2059 |
-
|
| 2060 |
-
"model_max_length": 131072,
|
| 2061 |
-
"pad_token": "<|end_of_text|>",
|
| 2062 |
-
"tokenizer_class": "PreTrainedTokenizerFast"
|
| 2063 |
}
|
|
|
|
| 1 |
{
|
| 2 |
+
"add_prefix_space": false,
|
| 3 |
"added_tokens_decoder": {
|
| 4 |
+
"50256": {
|
| 5 |
+
"content": "<|endoftext|>",
|
| 6 |
"lstrip": false,
|
| 7 |
+
"normalized": true,
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 8 |
"rstrip": false,
|
| 9 |
"single_word": false,
|
| 10 |
"special": true
|
| 11 |
}
|
| 12 |
},
|
| 13 |
+
"bos_token": "<|endoftext|>",
|
| 14 |
+
"clean_up_tokenization_spaces": false,
|
| 15 |
+
"eos_token": "<|endoftext|>",
|
| 16 |
"extra_special_tokens": {},
|
| 17 |
+
"model_max_length": 1024,
|
| 18 |
+
"pad_token": "<|endoftext|>",
|
| 19 |
+
"tokenizer_class": "GPT2Tokenizer",
|
| 20 |
+
"unk_token": "<|endoftext|>"
|
|
|
|
|
|
|
|
|
|
| 21 |
}
|
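As a quick sanity check, the new tokenizer configuration can be validated as plain JSON before use. A minimal sketch — the config dict is inlined here for illustration; loading the committed `tokenizer_config.json` with `json.load` would work the same way:

```python
import json

# Inlined copy of the new tokenizer_config.json from this commit.
config_text = """
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "50256": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<|endoftext|>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|endoftext|>",
  "extra_special_tokens": {},
  "model_max_length": 1024,
  "pad_token": "<|endoftext|>",
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
}
"""

config = json.loads(config_text)

# GPT-2 reuses one token for BOS/EOS/PAD/UNK; confirm they all agree.
special = {config[k] for k in ("bos_token", "eos_token", "pad_token", "unk_token")}
assert special == {"<|endoftext|>"}

# Token id 50256 must decode to that same special token.
assert config["added_tokens_decoder"]["50256"]["content"] == "<|endoftext|>"
print(config["tokenizer_class"], config["model_max_length"])
```

This catches the most common breakage after swapping tokenizer classes (here, `PreTrainedTokenizerFast` with `<|end_of_text|>` was replaced by `GPT2Tokenizer` with `<|endoftext|>`): special tokens that no longer agree with each other or with the added-tokens table.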
trainer_state.json
ADDED
@@ -0,0 +1,104 @@
+{
+  "best_global_step": null,
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 1.0,
+  "eval_steps": 500,
+  "global_step": 50,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {"epoch": 0.1, "grad_norm": 0.21553142368793488, "learning_rate": 1.6000000000000003e-05, "loss": 4.6997, "step": 5},
+    {"epoch": 0.2, "grad_norm": 0.19729050993919373, "learning_rate": 3.6e-05, "loss": 4.6594, "step": 10},
+    {"epoch": 0.3, "grad_norm": 0.2265276163816452, "learning_rate": 5.6000000000000006e-05, "loss": 4.6575, "step": 15},
+    {"epoch": 0.4, "grad_norm": 0.24024614691734314, "learning_rate": 7.6e-05, "loss": 4.6701, "step": 20},
+    {"epoch": 0.5, "grad_norm": 0.25752702355384827, "learning_rate": 9.6e-05, "loss": 4.637, "step": 25},
+    {"epoch": 0.6, "grad_norm": 0.3138992488384247, "learning_rate": 0.000116, "loss": 4.6469, "step": 30},
+    {"epoch": 0.7, "grad_norm": 0.3859875202178955, "learning_rate": 0.00013600000000000003, "loss": 4.5877, "step": 35},
+    {"epoch": 0.8, "grad_norm": 0.5465034246444702, "learning_rate": 0.00015600000000000002, "loss": 4.5079, "step": 40},
+    {"epoch": 0.9, "grad_norm": 0.5064318776130676, "learning_rate": 0.00017600000000000002, "loss": 4.473, "step": 45},
+    {"epoch": 1.0, "grad_norm": 0.6789440512657166, "learning_rate": 0.000196, "loss": 4.3371, "step": 50}
+  ],
+  "logging_steps": 5,
+  "max_steps": 50,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 1,
+  "save_steps": 200,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": true
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 8095363301376.0,
+  "train_batch_size": 1,
+  "trial_name": null,
+  "trial_params": null
+}
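The `log_history` in `trainer_state.json` can be inspected programmatically to confirm the one-epoch, 50-step LoRA run actually made progress. A minimal sketch with the logged losses inlined; reading them from the file with `json.load` would work the same way:

```python
# Loss values logged every 5 steps, copied from trainer_state.json above.
log_history = [
    {"step": 5,  "loss": 4.6997},
    {"step": 10, "loss": 4.6594},
    {"step": 15, "loss": 4.6575},
    {"step": 20, "loss": 4.6701},
    {"step": 25, "loss": 4.637},
    {"step": 30, "loss": 4.6469},
    {"step": 35, "loss": 4.5877},
    {"step": 40, "loss": 4.5079},
    {"step": 45, "loss": 4.473},
    {"step": 50, "loss": 4.3371},
]

first, last = log_history[0]["loss"], log_history[-1]["loss"]
drop = first - last
print(f"loss {first:.4f} -> {last:.4f} (drop {drop:.4f} over {log_history[-1]['step']} steps)")

# A short LoRA warm-up: the loss should trend down overall, even if
# individual 5-step windows (e.g. step 15 -> 20) tick upward.
assert last < first
```

Here the loss falls from 4.6997 to 4.3371 over the 50 steps, with most of the drop after step 30, once the linear warm-up has raised the learning rate toward 2e-4.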
training_args.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a2fc735d44ff29ae5bfd0ab3fba3554b122be5fd781e26d845e2aad2d6eff5b6
+size 5777
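`training_args.bin` is stored via Git LFS, so the repository holds only a small pointer file; the 5777-byte binary is fetched on checkout. A pointer file is plain text and can be parsed with a few lines of standard-library Python — the pointer text below is the one shown above:

```python
# Git LFS pointer text for training_args.bin, as committed.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a2fc735d44ff29ae5bfd0ab3fba3554b122be5fd781e26d845e2aad2d6eff5b6
size 5777
"""

# Each line is "key value"; the oid field is "<algo>:<hex digest>".
fields = dict(line.split(" ", 1) for line in pointer.strip().splitlines())
algo, digest = fields["oid"].split(":", 1)

assert algo == "sha256" and len(digest) == 64
print(f"{algo} digest {digest[:12]}..., {int(fields['size'])} bytes")
```

Once the real file has been fetched, the pickled `TrainingArguments` it contains is normally deserialized with `torch.load` (which requires `transformers` importable at load time); the pointer parsing above only checks what is visible in the commit itself.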