---
license: apache-2.0
base_model: unsloth/Llama-3.2-1B-Instruct
tags:
- text-generation
- llama
- fine-tuned
- news-writing
- article-generation
language:
- en
datasets:
- cnn_dailymail
---

# Article Writer - Fine-tuned Llama 3.2 1B

This model is fine-tuned on CNN/DailyMail articles to expand rough bullet-point notes into professional news articles.

## Model Details

- **Base Model**: unsloth/Llama-3.2-1B-Instruct
- **Training**: LoRA fine-tuning on CNN/DailyMail articles
- **Task**: Expand rough notes → professional articles
- **Parameters**: 1B
- **Size**: ~2.5 GB (16-bit merged weights)

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the merged 16-bit model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "aryan14072001/article-writer-rag",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("aryan14072001/article-writer-rag")

# Build a chat prompt from rough bullet-point notes
messages = [
    {"role": "system", "content": "You are a professional journalist."},
    {"role": "user", "content": "Write article from:\n\n• Fact 1\n• Fact 2"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=True is required for the temperature setting to take effect;
# a low temperature keeps the output close to the supplied facts
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.3, do_sample=True)
article = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(article)
```
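
For longer fact lists, the user message shown above can be assembled programmatically. `build_notes_prompt` below is a hypothetical helper, not part of this repository:

```python
def build_notes_prompt(facts):
    """Format rough facts as the bullet-note prompt shown in the usage example."""
    bullets = "\n".join(f"• {fact}" for fact in facts)
    return f"Write article from:\n\n{bullets}"

# Example: two rough facts become one bullet-note user message
messages = [
    {"role": "system", "content": "You are a professional journalist."},
    {"role": "user", "content": build_notes_prompt([
        "City council approved the transit budget on Tuesday",
        "The plan adds 12 electric buses by 2026",
    ])},
]
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` exactly as in the example above.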

## Training Details

- **Dataset**: CNN/DailyMail articles
- **Method**: LoRA (Low-Rank Adaptation)
- **Epochs**: 5
- **Learning Rate**: 5e-5
- **Anti-hallucination**: lower sampling temperature and stronger regularization during training
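
The card does not publish the LoRA hyperparameters. A typical PEFT configuration for a Llama-style model looks like the sketch below; `r`, `lora_alpha`, `lora_dropout`, and the target modules are illustrative assumptions, not the values actually used:

```python
from peft import LoraConfig

# Illustrative values only: the card states LoRA, 5 epochs, and lr 5e-5,
# but does not disclose the rank, alpha, dropout, or target modules.
lora_config = LoraConfig(
    r=16,                       # assumed LoRA rank
    lora_alpha=32,              # assumed scaling factor
    lora_dropout=0.05,          # assumed dropout (also acts as regularization)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

Wrapping the base model with `get_peft_model(model, lora_config)` would then train only the low-rank adapter weights.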

## Limitations

- Designed specifically for news-article generation
- May not work well for other writing styles (e.g. fiction or marketing copy)
- Requires factual input notes; the model expands the facts it is given but does not verify them

## License

Apache 2.0