	---
	language: en
	license: mit
	library_name: transformers
	tags:
	- text-generation
	- shakespeare
	- transformer
	- pytorch
	pipeline_tag: text-generation
	model_type: kimi-k2
	---

	# nanokimi-mini
	<!--- Built and licensed by SV -->
	This repository contains the nanoKimi model, pre-trained on the Shakespeare dataset. An upgraded version of nanoKimi, trained on OpenWebText, is expected on Hugging Face in a few days.

	## Model Details

	- Architecture: 12 layers, 12 heads, embedding dimension of 768
	- Training Data: Shakespeare dataset
	- Features: Mixture of Experts (8 experts), Latent Attention
	- Model Type: Kimi-K2
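
The numbers above can be collected into a small shape description. The sketch below is illustrative only: the class name and field names (`n_layer`, `n_head`, `n_embd`, `n_experts`) are hypothetical and may not match the keys in `config.json`. The per-head dimension assumes the embedding is split evenly across heads, which a latent-attention implementation may not do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NanoKimiMiniShape:
    # Values come from the Model Details list; the field names are
    # hypothetical and not taken from the repository's config.json.
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768
    n_experts: int = 8

    @property
    def head_dim(self) -> int:
        # Assumes an even split of the embedding across attention heads.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head


shape = NanoKimiMiniShape()
print(shape.head_dim)  # 768 // 12 = 64
```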

	## Files

	- `pytorch_model.bin` - Model weights
	- `config.json` - Model configuration
	- `src/` - Source code for model architecture
	- `modeling_kimik2.py` - HuggingFace wrapper

	## Usage

	```python
	import torch
	import json
	from huggingface_hub import hf_hub_download

	# Download files
	config_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="config.json")
	weights_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="pytorch_model.bin")

	# Load the config (parsed from JSON) and the weights (mapped onto the CPU)
	with open(config_path) as f:
	    config = json.load(f)

	weights = torch.load(weights_path, map_location="cpu")
	print("Model downloaded successfully!")
	```
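
Once the weights are loaded, a quick sanity check is to count parameters per top-level module. The helper below is a minimal sketch, not part of this repository: `summarize_state_dict` is a hypothetical name, and it assumes the checkpoint is a flat mapping from parameter names to tensors (anything exposing `numel()`). If the checkpoint is nested, pass the inner state dict instead.

```python
from collections import Counter


def summarize_state_dict(state_dict):
    """Return (total_params, params_per_prefix) for a name -> tensor mapping.

    The prefix is the part of each parameter name before the first dot.
    """
    per_prefix = Counter()
    for name, tensor in state_dict.items():
        if not hasattr(tensor, "numel"):
            continue  # skip non-tensor entries such as step counters
        prefix = name.split(".", 1)[0]
        per_prefix[prefix] += tensor.numel()
    return sum(per_prefix.values()), dict(per_prefix)
```

With the snippet above already run, `total, by_prefix = summarize_state_dict(weights)` gives the overall parameter count and a per-module breakdown.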

	## License

	MIT License

	## Contact

	Raise an issue in `Files and versions` or reach out to me [here](https://docs.google.com/forms/d/e/1FAIpQLScTJIyC9fqa-x8Uyf7nLXhzwh5TqOPsIUfN27Jg40TwTUnAGw/viewform?usp=header) with any feedback or enquiries.