tags:
- trl
---
# This Model
This is a partially fine-tuned Llama 3.1 8B LLM for poetry generation. It is based on continued pretraining of the Llama 3.1 8B LLM for 10% of one epoch, using [200k articles from Arabic Wikipedia 2023](akhooli/arwiki_128).
This is just a proof-of-concept demo and should never be used in production. The model is not aligned and is likely to produce strange or unacceptable content.
Only the adapter is available (along with other config files). To use it, you can either install Unsloth or use the Hugging Face PEFT API.
See the installation instructions at the Unsloth link below (single GPU only).
Here is a simple usage example (raw output). Remember, this is a primitive toy model trained with freely available compute.
```python
from unsloth import FastLanguageModel
from pprint import pprint

max_seq_length = 256
dtype = None          # None lets Unsloth auto-detect the dtype
load_in_4bit = True   # load the base model quantized to 4-bit

# Arabic Alpaca-style template. In English it reads roughly: "Below is an
# instruction that describes a task, paired with an input that adds context
# if present. Write a response that fits the instruction and input while
# preserving public values and decorum." The sections are:
# Instruction / Input / Response.
alpaca_prompt = """
أدناه تعليمة تصف مهمة مقترنة بمدخلات تضيف سياق إن وجدت. اكتب إجابة تتناسب مع التعليمة والمدخلات مع الحفاظ على القيم واﻵداب العامة.

### التعليمة:
{}

### المدخلات:
{}

### اﻹجابة:
{}"""

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "akhooli/llama31ft",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
model = FastLanguageModel.for_inference(model)

inputs = tokenizer(
    [
        alpaca_prompt.format(
            "اكتب قصيدة شعرية قصيرة",  # instruction: "Write a short poem"
            "بحر البسيط",              # input: the Basit meter
            "",                        # output - leave this blank for generation!
        )
    ], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 256, use_cache = True, temperature = 0.95)
r = tokenizer.batch_decode(outputs)
pprint(r)
```
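The Unsloth path above is the one the example was written for. As a rough sketch of the PEFT alternative mentioned earlier, the adapter can also be loaded with `peft`'s `AutoPeftModelForCausalLM`, which resolves the base model from the adapter config. This is an assumption-laden sketch, not a tested recipe: it assumes the adapter repo id `akhooli/llama31ft` works with stock `transformers`/`peft`/`bitsandbytes`, and it needs a CUDA GPU, so the heavy work is kept inside a function.

```python
def load_adapter(repo_id: str = "akhooli/llama31ft"):
    """Sketch: load the 4-bit base model with the poetry adapter applied,
    using the Hugging Face PEFT API instead of Unsloth (untested assumption)."""
    # Imports are local so this file can be read/imported without the libraries.
    from transformers import AutoTokenizer, BitsAndBytesConfig
    from peft import AutoPeftModelForCausalLM

    model = AutoPeftModelForCausalLM.from_pretrained(
        repo_id,
        quantization_config = BitsAndBytesConfig(load_in_4bit = True),  # same 4-bit load as above
        device_map = "auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    return model, tokenizer
```

After loading, the same `alpaca_prompt`, `tokenizer(...)`, and `model.generate(...)` calls from the Unsloth example should apply unchanged.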
# Uploaded model

- **Developed by:** akhooli