llama31ft / README.md
akhooli's picture
Update README.md
04790e4 verified
---
base_model: akhooli/llama31pretrained2
language:
- ar
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- classical Arabic poetry
library_name: peft
---
# This Model (toy Arabic classical poetry llm)
This is a partially (one epoch, subset of Arabic classical poetry dataset) fine tuned Llama 3.1 8B LLM for poetry generation. It is based on a 10% of 1 epoch continued pretraining of the
[Llama 3.1 8B LLM](akhooli/llama31pretrained2). Training was done on [200k articles from Arabic Wikipedia 2023](akhooli/arwiki_128)
with article lengh in the range 128 - 8192 words (not tokens).
This is just a proof of concept demo and should never be used for production. It is also not aligned and is likely to produce strange and unaccepted content.
Only the adapter is available (along with other config files). To use it, you can either install Unsloth or use the HuggingFace PEFT API.
See installation instructions at the Unsloth's link below (only one GPU).
See the [LinkedIn Post](https://www.linkedin.com/posts/akhooli_a-toy-arabic-poetry-llm-finally-i-am-sharing-activity-7242053356062466048-xRUq)
and [X tweet](https://x.com/akhooli/status/1836307030488895886)
Here's a simple usage example (raw output) - and remember, it is a __primitive toy model__ using freely available compute.
```python
max_seq_length = 256
dtype = None
load_in_4bit = True
alpaca_prompt = """
أدناه تعليمة تصف مهمة مقترنة بمدخلات تضيف سياق إن وجدت. اكتب إجابة تتناسب مع التعليمة والمدخلات مع الحفاظ على القيم واﻵداب العامة.
### التعليمة:
{}
### المدخلات:
{}
### اﻹجابة:
{}"""
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
model_name = "akhooli/llama31ft",
max_seq_length = max_seq_length,
dtype = dtype,
load_in_4bit = load_in_4bit,
)
model = FastLanguageModel.for_inference(model)
inputs = tokenizer(
[
alpaca_prompt.format(
"اكتب قصيدة شعرية قصيرة", # instruction
"بحر البسيط", # input
"", # output - leave this blank for generation!
)
], return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 256, use_cache = True,temperature=0.95)
r = tokenizer.batch_decode(outputs)
from pprint import pprint
pprint(r)
```
# Uploaded model
- **Developed by:** akhooli
- **License:** apache-2.0
- **Finetuned from model :** akhooli/llama31pretrained2
This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)