Aeshp/deepseekR1iitgdata

Aeshp committed on Jun 16, 2025

Commit f9d7c4b · verified · 1 Parent(s): 6b5f01e

Update README.md

Files changed (1)

README.md +83 -0

@@ -8,3 +8,86 @@ new_version: Aeshp/deepseekR1iitgdata
 tags:
 - unsloth
 ---

+# DeepSeek R1 IITG Data
+This model, **Aeshp/deepseekR1iitgdata**, is a fine-tuned version of the `unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit` base model, further trained on curated IIT Guwahati academic datasets to improve question answering on Data Science and AI topics.
+## Model Details
+* **Authors**: Aeshp
+* **License**: MIT
+* **Model Type**: Causal Language Model
+* **Base Model**: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit
+## Training Data
+Trained on a blend of:
+* IIT Guwahati syllabus materials (lecture notes, assignments)
+* Publicly available AI & Data Science question-answer pairs
+* Research abstracts & summaries
+Total size: ~2 GB of preprocessed text, ~1 million QA pairs.
+## Intended Uses
+* Accurate, concise answers to academic questions in Data Science & AI.
+* Educational assistants, tutoring bots, and study helpers.
+### Not Intended For
+* Medical, legal, or financial decision-making without expert oversight.
+* Sensitive or real-time critical applications.
+## Usage
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+from peft import PeftModel
+repo_id = "Aeshp/deepseekR1iitgdata"
+base_model = "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit"
+# Load tokenizer and 4-bit base model (requires bitsandbytes and accelerate)
+tokenizer = AutoTokenizer.from_pretrained(repo_id)
+model_base = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
+# Attach fine-tuned adapter
+model = PeftModel.from_pretrained(model_base, repo_id)
+# Inference example
+tokens = tokenizer("What is overfitting in machine learning?", return_tensors="pt").to(model.device)
+output = model.generate(**tokens, max_new_tokens=100)
+print(tokenizer.decode(output[0], skip_special_tokens=True))
+```
+## Evaluation
+| Metric           | Score |
+| ---------------- | ----- |
+| Perplexity       | 12.5  |
+| EM (Exact Match) | 78%   |
+| F1 Score         | 82%   |
+## Limitations
+* May hallucinate on out-of-domain prompts.
+* Performance degrades on languages other than English.
+## Citation
+If you use this model in your research, please cite:
+```bibtex
+@misc{deepseek_r1_iitgdata,
+  title={DeepSeek R1 IITG Data Fine-Tuned Model},
+  author={Aeshp},
+  year={2025},
+  howpublished={\url{https://huggingface.co/Aeshp/deepseekR1iitgdata}}
+}
+```
+## Acknowledgements
+Thanks to the IIT Guwahati Data Science & AI program for providing training materials and evaluation benchmarks.
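
The Evaluation table in the README reports Perplexity, EM, and F1 but does not say how they were computed or on which dataset. As a rough illustration only, the sketch below shows one common way such metrics are defined, assuming SQuAD-style answer normalization for EM and token-level F1, and perplexity as the exponential of the mean negative log-likelihood per token. The function names (`normalize`, `exact_match`, `token_f1`, `perplexity`) are hypothetical and are not taken from the model card or from any evaluation script by the author.

```python
import math
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the model card does not specify one.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def perplexity(token_log_probs: list[float]) -> float:
    """exp of the mean negative natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

Averaging `exact_match` and `token_f1` over a held-out set of question-answer pairs gives corpus-level scores comparable in kind to the table's percentages, though the reported numbers may have been produced with different definitions.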