---
language: en
license: mit
library_name: transformers
tags:
- text-generation
- shakespeare
- transformer
- pytorch
pipeline_tag: text-generation
model_type: kimi-k2
---

# nanokimi-mini
<!--- Built and licensed by SV -->
This repository contains the nanoKimi model pre-trained on the Shakespeare dataset. An upgraded version of nanoKimi trained on OpenWebText will be available on Hugging Face in a few days.

## Model Details

- **Architecture**: 12 layers, 12 heads, 768 embedding dimension
- **Training Data**: Shakespeare dataset 
- **Features**: Mixture of Experts (8 experts), Latent Attention
- **Model Type**: Kimi-K2
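
As a quick sanity check on the numbers above, the sketch below derives the per-head dimension from the listed hyperparameters. The class and field names (`NanoKimiMiniShape`, `n_layer`, `n_head`, `n_embd`, `n_experts`) are hypothetical and may not match the keys in `config.json`. The calculation assumes standard multi-head attention, where the embedding is split evenly across heads; the Latent Attention variant may use different projection sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NanoKimiMiniShape:
    """Hyperparameters listed in this README (field names are illustrative)."""

    n_layer: int = 12    # transformer layers
    n_head: int = 12     # attention heads per layer
    n_embd: int = 768    # embedding dimension
    n_experts: int = 8   # experts in the Mixture of Experts

    @property
    def head_dim(self) -> int:
        # Assumes the embedding is divided evenly across heads.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head


shape = NanoKimiMiniShape()
print(shape.head_dim)  # 768 // 12 = 64
```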

## Files

- `pytorch_model.bin` - Model weights
- `config.json` - Model configuration 
- `src/` - Source code for model architecture
- `modeling_kimik2.py` - HuggingFace wrapper

## Usage

```python
import torch
import json
from huggingface_hub import hf_hub_download

# Download files
config_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="config.json")
weights_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="pytorch_model.bin")

# Load config and weights
with open(config_path) as f:
    config = json.load(f)

weights = torch.load(weights_path, map_location="cpu")
print("Model downloaded successfully!")
```
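
The snippet above only downloads the files and loads the raw checkpoint; it does not build the model. One way to inspect what was loaded is to count parameters per top-level module. The helper below is a minimal sketch: `count_parameters` is a hypothetical name, and it assumes `pytorch_model.bin` holds a flat mapping from parameter names to tensors, which may not be how this checkpoint is laid out.

```python
import math
from collections import Counter


def count_parameters(state_dict):
    """Return (total, per_prefix) parameter counts for a name -> tensor mapping.

    Works with any values that expose a `.shape` (such as torch tensors);
    entries without a shape are skipped.
    """
    per_prefix = Counter()
    for name, tensor in state_dict.items():
        shape = getattr(tensor, "shape", None)
        if shape is None:
            continue  # not a tensor, e.g. a stored step counter
        # Group by the first component of the dotted parameter name.
        per_prefix[name.split(".")[0]] += math.prod(shape)
    return sum(per_prefix.values()), dict(per_prefix)
```

With the `weights` object from the usage example, `total, per_prefix = count_parameters(weights)` would give the overall parameter count and a breakdown by top-level prefix, provided the assumption about the checkpoint layout holds.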

## License

MIT License

## Contact

Raise an issue in `Files and versions` or reach out to me [here](https://docs.google.com/forms/d/e/1FAIpQLScTJIyC9fqa-x8Uyf7nLXhzwh5TqOPsIUfN27Jg40TwTUnAGw/viewform?usp=header) with any feedback or enquiries.