---
language:
- tl
tags:
- tagalog
- text-generation
- custom-architecture
- pytorch
license: mit
---
# Henyo-70M

**Henyo** is a custom Tagalog LLM trained on a subset of Wikipedia.
## Model Architecture

- **Parameter Count**: 70M
- **Architecture**: Decoder-only Transformer (Custom)
- **Features**:
  - **SwiGLU** Activation
  - **Grouped Query Attention (GQA)**
  - **Rotary Positional Embeddings (RoPE)**
  - **RMSNorm**
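The exact modeling code for Henyo lives in the repo; as a rough illustration of two of the listed components, here is a minimal PyTorch sketch of RMSNorm and a SwiGLU feed-forward block (module and parameter names are illustrative, not taken from the actual implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square norm: rescales by 1/RMS(x), no mean subtraction or bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    """SwiGLU feed-forward block: down(SiLU(gate(x)) * up(x))."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))

# Shape check: the block maps (batch, seq, dim) back to (batch, seq, dim).
x = torch.randn(2, 16, 128)
y = SwiGLU(128, 256)(RMSNorm(128)(x))
print(y.shape)  # torch.Size([2, 16, 128])
```

SwiGLU replaces the classic ReLU MLP with a gated unit, and RMSNorm drops LayerNorm's mean-centering, a combination popularized by LLaMA-style decoders.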
## Usage

This model uses custom architecture code. You can load it with the `AutoModelForCausalLM` class and `trust_remote_code=True` (if the modeling code is uploaded to the repo), or by defining the model class manually.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("marcuscedricridia/Henyo-70M")
# Note: since this is a custom model, you may need the inference script provided in the repo.
model = AutoModelForCausalLM.from_pretrained("marcuscedricridia/Henyo-70M", trust_remote_code=True)
```