Aeshp committed
Commit f9d7c4b · verified · 1 Parent(s): 6b5f01e

Update README.md

Files changed (1)
  1. README.md +83 -0

README.md CHANGED
@@ -8,3 +8,86 @@ new_version: Aeshp/deepseekR1iitgdata
  tags:
  - unsloth
  ---
+
+ # DeepSeek R1 IITG Data
+
+ This model, **Aeshp/deepseekR1iitgdata**, is a fine-tuned version of the `unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit` base model, trained on curated IIT Guwahati academic datasets to improve question answering on Data Science and AI topics.
+
+ ## Model Details
+
+ * **Authors**: Aeshp
+ * **License**: MIT
+ * **Model Type**: Causal Language Model
+ * **Base Model**: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit
+
+ ## Training Data
+
+ The model was trained on a blend of:
+
+ * IIT Guwahati syllabus materials (lecture notes, assignments)
+ * Publicly available AI & Data Science question-answer pairs
+ * Research abstracts & summaries
+
+ Total size: ~2 GB of preprocessed text, ~1 million QA pairs.
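+
+ The preprocessing pipeline itself is not published with this card. Purely as an illustration, QA pairs for instruction-style fine-tuning are commonly flattened into single prompt/response strings; the template and field names below are assumptions, not the documented pipeline:
+
+ ```python
+ # Hypothetical record layout; the template used for this model is
+ # an assumption, not documented in the card.
+ record = {
+     "question": "What is gradient descent?",
+     "answer": "Gradient descent is an iterative optimisation method ...",
+ }
+
+ def format_example(rec: dict) -> str:
+     # Instruction-style prompt/response template (illustrative only).
+     return f"### Question:\n{rec['question']}\n\n### Answer:\n{rec['answer']}"
+
+ print(format_example(record))
+ ```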
+
+ ## Intended Uses
+
+ * Accurate, concise answers to academic questions in Data Science & AI.
+ * Educational assistants, tutoring bots, and study helpers.
+
+ ### Not Intended For
+
+ * Medical, legal, or financial decision-making without expert oversight.
+ * Sensitive or real-time, safety-critical applications.
+
+ ## Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ from peft import PeftModel
+
+ repo_id = "Aeshp/deepseekR1iitgdata"
+ base_model = "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit"
+
+ # Load the tokenizer and the 4-bit base model
+ tokenizer = AutoTokenizer.from_pretrained(repo_id)
+ model_base = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
+
+ # Attach the fine-tuned adapter
+ model = PeftModel.from_pretrained(model_base, repo_id)
+
+ # Inference example
+ tokens = tokenizer("What is overfitting in machine learning?", return_tensors="pt").to(model.device)
+ output = model.generate(**tokens, max_new_tokens=100)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
+
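+ For conversational use, the prompt can also be built with the tokenizer's chat template (a minimal sketch, assuming the tokenizer inherits a chat template from the DeepSeek-R1 distill base):
+
+ ```python
+ # Sketch: chat-template prompting. Assumes tokenizer.chat_template is
+ # set, as it is for the DeepSeek-R1 distilled checkpoints.
+ messages = [{"role": "user", "content": "Explain the bias-variance tradeoff."}]
+ inputs = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+ output = model.generate(inputs, max_new_tokens=200)
+ # Decode only the newly generated tokens
+ print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
+ ```
+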
+ ## Evaluation
+
+ | Metric           | Score |
+ | ---------------- | ----- |
+ | Perplexity       | 12.5  |
+ | EM (Exact Match) | 78%   |
+ | F1 Score         | 82%   |
+
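+ For reference, a sketch of the standard SQuAD-style token-level F1 for QA (the exact evaluation script is not published with this card):
+
+ ```python
+ from collections import Counter
+
+ def token_f1(prediction: str, reference: str) -> float:
+     # SQuAD-style token-level F1 between two answer strings.
+     pred, ref = prediction.lower().split(), reference.lower().split()
+     overlap = sum((Counter(pred) & Counter(ref)).values())
+     if overlap == 0:
+         return 0.0
+     precision, recall = overlap / len(pred), overlap / len(ref)
+     return 2 * precision * recall / (precision + recall)
+ ```
+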
+ ## Limitations
+
+ * May hallucinate on out-of-domain prompts.
+ * Performance degrades on languages other than English.
+
+ ## Citation
+
+ If you use this model in your research, please cite:
+
+ ```bibtex
+ @misc{deepseek_r1_iitgdata,
+   title={DeepSeek R1 IITG Data Fine-Tuned Model},
+   author={Aeshp},
+   year={2025},
+   howpublished={\url{https://huggingface.co/Aeshp/deepseekR1iitgdata}}
+ }
+ ```
+
+ ## Acknowledgements
+
+ Thanks to the IIT Guwahati Data Science & AI program for providing training materials and evaluation benchmarks.