---
language: en
license: mit
library_name: transformers
tags:
- text-generation
- shakespeare
- transformer
- pytorch
pipeline_tag: text-generation
model_type: kimi-k2
---

# nanokimi-mini

<!--- Built and licensed by SV -->

This repository contains the nanoKimi model pre-trained on the Shakespeare dataset. An upgraded version of nanoKimi trained on OpenWebText will be available on Hugging Face in a few days.

## Model Details

- **Architecture**: 12 layers, 12 heads, 768 embedding dimension
- **Training Data**: Shakespeare dataset
- **Features**: Mixture of Experts (8 experts), Latent Attention
- **Model Type**: Kimi-K2
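
The dimensions listed above can be collected in a small configuration object. The following is a minimal sketch, not the repository's actual config class: the class name `NanoKimiDims` and its field names are hypothetical, and the `head_dim` calculation assumes the embedding is split evenly across heads, which the latent attention used here may not follow exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NanoKimiDims:
    """Hyperparameters from the model card; class and field names are hypothetical."""

    n_layer: int = 12    # transformer layers
    n_head: int = 12     # attention heads per layer
    n_embd: int = 768    # embedding (hidden) dimension
    n_experts: int = 8   # experts in each Mixture-of-Experts layer

    @property
    def head_dim(self) -> int:
        # Assumption: the embedding is divided evenly across the heads.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head


dims = NanoKimiDims()
print(dims.head_dim)  # 768 // 12 = 64
```

The values stored in `config.json` remain the authoritative source; this sketch only shows how the numbers in the list relate to each other.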

## Files

- `pytorch_model.bin` - Model weights
- `config.json` - Model configuration
- `src/` - Source code for model architecture
- `modeling_kimik2.py` - HuggingFace wrapper

## Usage

```python
import json

import torch
from huggingface_hub import hf_hub_download

# Download the config and weights from the Hub
config_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="config.json")
weights_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="pytorch_model.bin")

# Load the config
with open(config_path) as f:
    config = json.load(f)

# Load the weights onto the CPU
weights = torch.load(weights_path, map_location="cpu")
print("Model downloaded successfully!")
```
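
Once the checkpoint is loaded, a quick sanity check is to count its parameters per top-level module. The sketch below works on plain shape tuples so that it runs without downloading anything. The helper name `count_parameters` and the tensor names in `example_shapes` are hypothetical, and it assumes `weights` is a flat mapping from tensor names to tensors.

```python
from math import prod
from typing import Dict, Mapping, Tuple


def count_parameters(shapes: Mapping[str, Tuple[int, ...]]) -> Dict[str, int]:
    """Sum parameter counts per top-level prefix (the text before the first dot)."""
    totals: Dict[str, int] = {}
    for name, shape in shapes.items():
        prefix = name.split(".", 1)[0]
        totals[prefix] = totals.get(prefix, 0) + prod(shape)
    return totals


# Hypothetical tensor names and shapes, for illustration only.
# With the real checkpoint (assuming a flat state dict):
#   shapes = {k: tuple(v.shape) for k, v in weights.items()}
example_shapes = {
    "wte.weight": (100, 768),
    "blocks.0.attn.weight": (768, 768),
    "blocks.0.attn.bias": (768,),
}
totals = count_parameters(example_shapes)
print(totals)
print(sum(totals.values()))
```

If the checkpoint nests the state dict under another key, unwrap it before building `shapes`.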

## License

MIT License

## Contact

Raise an issue in `Files and versions` or reach out to me [here](https://docs.google.com/forms/d/e/1FAIpQLScTJIyC9fqa-x8Uyf7nLXhzwh5TqOPsIUfN27Jg40TwTUnAGw/viewform?usp=header) with any feedback or enquiries.