amanuelbyte committed
Commit 7b94069 · verified · 1 Parent(s): 10c3b7d

Add README model card

Files changed (1)
  1. README.md +54 -1
README.md CHANGED
@@ -1,3 +1,4 @@
+
  ---
  language: am
  license: apache-2.0
@@ -6,4 +7,56 @@ tags:
  - text-generation
  - custom-model
  - hrm
- ---
+ ---
+
+ # HRM-Text1 Amharic Model
+
+ This is a custom text generation model based on the Hierarchical Recurrent Memory (HRM) architecture. It was trained from scratch on the `amanuelbyte/Amharic_dataset` dataset.
+
+ **This is a custom model and requires `trust_remote_code=True` to load.**
+
+ ## How to Use
+
+ Because this is a custom architecture, you need to load the model by importing the `HRMText1` class from the `hrm_model.py` file in this repository.
+
+ ```python
+ import torch
+ import json
+ from transformers import T5Tokenizer
+ from huggingface_hub import hf_hub_download
+
+ # hrm_model.py (which defines HRMText1) must be available locally for this import;
+ # if needed, it can be downloaded from the repo first, e.g.:
+ #   hf_hub_download(repo_id="amanuelbyte/HRM-amharic", filename="hrm_model.py", local_dir=".")
+ from hrm_model import HRMText1  # Import the custom class
+
+ # Replace with your repo ID
+ repo_id = "amanuelbyte/HRM-amharic"
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ # 1. Load the tokenizer
+ tokenizer = T5Tokenizer.from_pretrained(repo_id)
+
+ # 2. Load the model's configuration
+ config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
+ with open(config_path, 'r') as f:
+     config = json.load(f)
+
+ # 3. Instantiate the model with the config.
+ # trust_remote_code=True is only relevant when loading through from_pretrained / the
+ # Auto classes; it is not needed here because the class is imported manually.
+ model = HRMText1(config)
+
+ # 4. Load the model weights
+ weights_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")
+ state_dict = torch.load(weights_path, map_location=device)
+ model.load_state_dict(state_dict)
+ model.to(device)
+ model.eval()
+
+ print("Model loaded successfully!")
+
+ # Now you can use the model for generation...
+ prompt = "የኢትዮጵያ ዋና ከተማ"
+ input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
+
+ with torch.inference_mode():
+     output_ids = model.generate(input_ids, max_new_tokens=50)  # Assuming a generate method exists
+
+ # Assuming generate follows the usual transformers convention and returns a batch of
+ # sequences, decode the first one.
+ print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
+ ```
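
The example above instantiates the class and loads the weights manually. If the repository's `config.json` also registers `HRMText1` under an `auto_map` entry (an assumption; the card does not say so), the same model could be loaded through the `transformers` Auto classes, which is where `trust_remote_code=True` actually comes into play. A minimal sketch of that path, with the exact Auto class depending on how the mapping is registered:

```python
# Hypothetical alternative, assuming config.json maps AutoModelForCausalLM to
# hrm_model.HRMText1; if it does not, use the manual loading shown above.
import torch
from transformers import AutoModelForCausalLM, T5Tokenizer

repo_id = "amanuelbyte/HRM-amharic"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = T5Tokenizer.from_pretrained(repo_id)
# trust_remote_code=True lets transformers import and run hrm_model.py from the repo
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True).to(device)
model.eval()

prompt = "የኢትዮጵያ ዋና ከተማ"
input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
with torch.inference_mode():
    output_ids = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

If `from_pretrained` fails with an unrecognized-architecture error, the `auto_map` entry is not present and the manual route shown above is the one to use.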