tags:
- trl
---
# This Model
This is a partially fine-tuned Llama 3.1 8B LLM for poetry generation. It is based on continued pretraining of the Llama 3.1 8B LLM for 10% of one epoch, using [200k articles from Arabic Wikipedia 2023](akhooli/arwiki_128).
This is just a proof-of-concept demo and should never be used in production. The model is not aligned and is likely to produce strange or unacceptable content.
Only the adapter is available (along with other config files). To use it, you can either install Unsloth or use the Hugging Face PEFT API.
See the installation instructions at the Unsloth link below (single GPU only).
Here is a simple usage example (raw output). Remember, this is a primitive toy model trained with freely available compute.
```python
from unsloth import FastLanguageModel
from pprint import pprint

max_seq_length = 256
dtype = None          # None lets Unsloth auto-detect the dtype
load_in_4bit = True   # load the base model quantized to 4-bit

# Arabic Alpaca-style template. In English it reads roughly: "Below is an
# instruction that describes a task, paired with an input that adds context
# if present. Write a response that fits the instruction and input while
# preserving public values and decorum." The sections are:
# Instruction / Input / Response.
alpaca_prompt = """
أدناه تعليمة تصف مهمة مقترنة بمدخلات تضيف سياق إن وجدت. اكتب إجابة تتناسب مع التعليمة والمدخلات مع الحفاظ على القيم واﻵداب العامة.

### التعليمة:
{}

### المدخلات:
{}

### اﻹجابة:
{}"""

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "akhooli/llama31ft",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
model = FastLanguageModel.for_inference(model)

inputs = tokenizer(
    [
        alpaca_prompt.format(
            "اكتب قصيدة شعرية قصيرة",  # instruction: "Write a short poem"
            "بحر البسيط",              # input: the Basit meter
            "",                        # output - leave this blank for generation!
        )
    ], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 256, use_cache = True, temperature = 0.95)
r = tokenizer.batch_decode(outputs)
pprint(r)
```
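The Unsloth path above is the one the example was written for. As a rough sketch of the PEFT alternative mentioned earlier, the adapter can also be loaded with `peft`'s `AutoPeftModelForCausalLM`, which resolves the base model from the adapter config. This is an assumption-laden sketch, not a tested recipe: it assumes the adapter repo id `akhooli/llama31ft` works with stock `transformers`/`peft`/`bitsandbytes`, and it needs a CUDA GPU, so the heavy work is kept inside a function.

```python
def load_adapter(repo_id: str = "akhooli/llama31ft"):
    """Sketch: load the 4-bit base model with the poetry adapter applied,
    using the Hugging Face PEFT API instead of Unsloth (untested assumption)."""
    # Imports are local so this file can be read/imported without the libraries.
    from transformers import AutoTokenizer, BitsAndBytesConfig
    from peft import AutoPeftModelForCausalLM

    model = AutoPeftModelForCausalLM.from_pretrained(
        repo_id,
        quantization_config = BitsAndBytesConfig(load_in_4bit = True),  # same 4-bit load as above
        device_map = "auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    return model, tokenizer
```

After loading, the same `alpaca_prompt`, `tokenizer(...)`, and `model.generate(...)` calls from the Unsloth example should apply unchanged.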
# Uploaded model

- **Developed by:** akhooli