Update README.md (Remove Duplicates)

#1
Files changed (1)
  1. README.md +4 -54
README.md CHANGED
```diff
@@ -1,22 +1,3 @@
----
-base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
-library_name: peft
-pipeline_tag: text-generation
-tags:
-- base_model:adapter:deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
-- lora
-- transformers
-license: apache-2.0
-datasets:
-- manu/project_gutenberg
-- oscar-corpus/oscar
-- sedthh/gutenberg_english
-language:
-- en
----
-
-# Model Card for Model ID
-
 ---
 language:
 - en
@@ -39,7 +20,9 @@ model-index:
 results: []
 ---
 
-# SamKash-Tolstoy DeepSeek LoRA (Russian Literature)
+# Model Card for Model ID
+
+# SamKash-Tolstoy - DeepSeek LoRA (Russian Literature)
 
 **Developed by Kashif Salahuddin and Samiya Kashif**, **SamKash-Tolstoy** is a domain-specialized LLM (lightweight LoRA adapter) built exclusively for Russian literature. It’s trained on **475 public-domain Russian classics** from the Project Gutenberg collection and enriched with **university and critics’ articles** filtered from the **OSCAR** web corpus, so the voice and psychological depth feel authentic without using any copyrighted books.
 
@@ -55,8 +38,6 @@ model-index:
 
 **Example prompt:** “Write a short scene in the style of Crime and Punishment: a feverish student crosses a Petersburg bridge at night.”
 
-
-
 ---
 
 ## TL;DR: Use It
@@ -88,9 +69,6 @@ out = gen(
 )[0]["generated_text"]
 print(out)
 
-
-
-
 ## Model Details
 
 ### Model Description
@@ -151,32 +129,4 @@ print(out)
 ### Recommendations
 - Keep a **human in the loop** for editing and intent verification.
 - Avoid representing outputs as genuine text by historical authors.
-- For classroom settings, clearly label generated content as synthetic.
-
----
-
-## How to Get Started with the Model
-
-```python
-import torch
-from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
-from peft import PeftModel
-
-base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
-adpt_id = "salakash/SamKash-Tolstoy"  # or local folder
-
-device = "mps" if torch.backends.mps.is_available() else "cpu"
-dtype = torch.float16 if device == "mps" else torch.float32
-
-tok = AutoTokenizer.from_pretrained(base_id, use_fast=True)
-base = AutoModelForCausalLM.from_pretrained(base_id, dtype=dtype)
-base.to(device)
-
-model = PeftModel.from_pretrained(base, adpt_id)
-model.config.use_cache = True  # inference
-
-gen = pipeline("text-generation", model=model, tokenizer=tok, device=-1)
-print(gen(
-    "Write a reflective paragraph about conscience and fate in an aristocratic household.",
-    max_new_tokens=200, do_sample=True, temperature=0.7, top_p=0.9
-)[0]["generated_text"])
+- For classroom settings, clearly label generated content as synthetic.
```
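The model card above describes SamKash-Tolstoy as a lightweight LoRA adapter over DeepSeek-R1-Distill-Qwen-1.5B. As background for reviewers, the core LoRA idea — a frozen weight matrix plus a trainable low-rank update scaled by `alpha / r` — can be sketched in plain NumPy. All shapes, names, and values here are illustrative assumptions, not the model's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 16  # hypothetical layer size, rank, scaling

W = rng.standard_normal((d_out, d_in))     # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero-init

def lora_forward(x):
    # Base path plus low-rank update; with B = 0 this equals the base layer,
    # which is why a freshly initialized adapter leaves the model unchanged.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # zero-init adapter is a no-op

B = rng.standard_normal((d_out, r)) * 0.01  # after training, B is nonzero
y = lora_forward(x)                         # base output shifted by the adapter
```

Only `A` and `B` (roughly `r * (d_in + d_out)` numbers per adapted layer) are trained, which is what keeps an adapter like this small relative to the 1.5B-parameter base model.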