---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- LLM
- CasualLLM
- StoryLLM
- SLM
- Nova-Verse
base_model:
- harshit36/Nova-Casual-LLM
datasets:
- harshit36/Decision-driven-Stories
---
<img src="NOVA-poster.png" alt="Logo" style="border-radius: 30px;" width="100%"/>
# Nova-Verse
A fine-tuned NOVA model trained specifically for decision-driven story generation.
## Model Summary
- **Model Name**: NovaForCausalLM
- **Architecture**: Custom decoder-only transformer (`NOVA`)
- **Model Type**: `nova` (Fine-tuned version)
- **Use Case**: Causal Language Modeling (text generation, auto-completion)
- **Parameters**: 14,412,400 trainable parameters
- **Pretrained Tokenizer**: `PreTrainedTokenizerFast`
- **Framework**: PyTorch
- **Hugging Face Integration**: Compatible with `transformers` via custom `AutoModel` and `AutoConfig` registration.
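The registration mentioned in the last bullet typically looks like the following (a minimal sketch, assuming the class names used on this card; the actual call site lives in the repository, and `AutoModelForCausalLM` is used here as the natural mapping for a causal LM):

```python
# Hypothetical sketch of the AutoConfig/AutoModel registration mentioned above.
# Class names are taken from this card; the real registration may differ.
from transformers import AutoConfig, AutoModelForCausalLM
from nova_modelling import NovaConfig, NovaForCausalLM

AutoConfig.register("nova", NovaConfig)
AutoModelForCausalLM.register(NovaConfig, NovaForCausalLM)
```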
---
## Files Included
| File | Description |
|---------------------------|----------------------------------------------------|
| `config.json` | Configuration of model hyperparameters |
| `model.safetensors` | Serialized model weights (efficient format) |
| `nova_modelling.py` | Custom model and config class definitions |
| `tokenizer.json` | Serialized tokenizer |
| `tokenizer_config.json` | Tokenizer configuration metadata |
| `special_tokens_map.json` | Mapping for special tokens (e.g., BOS, EOS) |
| `README.md` | Model card (you’re reading it!) |
---
## Model Architecture
### `NovaForCausalLM`
The model consists of:
- Embedding layers: token + positional
- Stack of transformer decoder blocks
- Multi-head attention (8 heads over a 640-dimensional embedding, per the configuration below)
- Layer normalization
- Final linear head for vocabulary logits
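For orientation, one pre-norm decoder block consistent with this list might look as follows. This is an illustrative sketch only; the authoritative definitions live in `nova_modelling.py`, and details such as the MLP expansion factor are assumptions:

```python
import torch.nn as nn

class DecoderBlock(nn.Module):
    """Illustrative pre-norm decoder block; actual NOVA code is in nova_modelling.py."""
    def __init__(self, n_embd: int = 640, n_head: int = 8):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),  # expansion factor assumed, not confirmed
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x, attn_mask=None):
        # Self-attention with residual connection
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=attn_mask, need_weights=False)
        x = x + a
        # Feed-forward with residual connection
        return x + self.mlp(self.ln2(x))
```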
### Configuration (`NovaConfig`)
```json
{
"model_type": "nova",
"vocab_size": 6000,
"block_size": 256,
"n_embd": 640,
"n_layer": 4,
"n_head": 8
}
```
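To sanity-check the parameter count quoted in the summary against this configuration, you can instantiate the model from the published config and count trainable parameters (assumes `nova_modelling.py` is importable, as shown under Usage below):

```python
# Load the published config, build the model, and count trainable parameters.
from nova_modelling import NovaConfig, NovaForCausalLM

config = NovaConfig.from_pretrained("harshit36/Nova-Verse")
model = NovaForCausalLM(config)
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"{n_params:,} trainable parameters")  # expected: 14,412,400 per this card
```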
## 🚀 Usage
### Step 1: Clone the repo (to get `nova_modelling.py`)
```bash
git clone https://huggingface.co/harshit36/Nova-Verse
cd Nova-Verse
```
### Step 2: Load the tokenizer, config, and model
```python
import sys
sys.path.append(".")  # the cloned repo directory containing nova_modelling.py

from transformers import PreTrainedTokenizerFast
from nova_modelling import NovaConfig, NovaForCausalLM

# Load tokenizer
tokenizer = PreTrainedTokenizerFast.from_pretrained("harshit36/Nova-Verse")

# Load config and weights via the custom class
config = NovaConfig.from_pretrained("harshit36/Nova-Verse")
model = NovaForCausalLM.from_pretrained("harshit36/Nova-Verse", config=config)

# Use the model
input_ids = tokenizer("Hello world", return_tensors="pt").input_ids
output = model.generate(input_ids)

# The tokenizer is byte-level BPE: "Ġ" marks a word boundary and "Ċ" a newline,
# so map them back before printing.
text = tokenizer.decode(output[0], skip_special_tokens=True)
print(text.replace(" ", "").replace("Ġ", " ").replace("Ċ", "\n"))
```
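`generate()` uses greedy decoding by default; for more varied stories you can enable sampling. These are standard `transformers` generation arguments, and the values below are illustrative rather than tuned for this model:

```python
# Sampling-based generation for more varied output (values are illustrative).
output = model.generate(
    input_ids,
    max_new_tokens=128,  # cap generation length (block_size is 256)
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.8,     # soften the next-token distribution
    top_k=50,            # restrict sampling to the 50 most likely tokens
)
```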
## Intended Use
- Story text generation
- Research on hybrid positional encoding (a combination of sinusoidal and learnable encodings; see the sketch below)
- Educational demonstrations of custom Hugging Face model integration
- Rapid prototyping of transformer models
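A hybrid positional encoding of the kind named above typically adds a fixed sinusoidal table and a learnable embedding to the token embeddings. The sketch below is an assumption-laden illustration; the actual NOVA implementation is in `nova_modelling.py` and may differ:

```python
# Hypothetical hybrid positional encoding: fixed sinusoidal + learnable.
# Dimensions follow NovaConfig; the real NOVA module may combine them differently.
import math
import torch
import torch.nn as nn

class HybridPositionalEncoding(nn.Module):
    def __init__(self, block_size: int = 256, n_embd: int = 640):
        super().__init__()
        # Fixed sinusoidal table (registered as a buffer, not trained)
        pe = torch.zeros(block_size, n_embd)
        pos = torch.arange(block_size).unsqueeze(1).float()
        div = torch.exp(torch.arange(0, n_embd, 2).float() * (-math.log(10000.0) / n_embd))
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("sinusoidal", pe)
        # Learnable positional embedding added on top
        self.learned = nn.Embedding(block_size, n_embd)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, n_embd) token embeddings
        t = x.size(1)
        positions = torch.arange(t, device=x.device)
        return x + self.sinusoidal[:t] + self.learned(positions)
```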