TurkishCodeMan committed on
Commit 9ab5817 · verified · 1 Parent(s): 2022996

Update README.md

Files changed (1)
  1. README.md +57 -7
README.md CHANGED
@@ -1,22 +1,72 @@
  ---
  base_model: unsloth/csm-1b
  tags:
- - text-generation-inference
  - transformers
  - unsloth
  - csm
  - trl
  license: apache-2.0
  language:
  - en
  ---

- # Uploaded model

- - **Developed by:** TurkishCodeMan
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/csm-1b

- This csm model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
  ---
  base_model: unsloth/csm-1b
  tags:
+ - text-to-speech
  - transformers
  - unsloth
  - csm
  - trl
+ - lora
+ - finetuning
  license: apache-2.0
  language:
  - en
+ datasets:
+ - TurkishCodeMan/tts-medium-clean
+ pipeline_tag: text-to-speech
  ---

+ # TurkishCodeMan - CSM-1B (LoRA Fine-tuned)

+ ## 📌 Model Summary
+ This is a **LoRA fine-tuned** version of [unsloth/csm-1b](https://huggingface.co/unsloth/csm-1b), trained for **text-to-speech (TTS)** tasks.
+ The model was trained with [Unsloth](https://github.com/unslothai/unsloth), which advertises 2x faster fine-tuning, and Hugging Face’s [TRL](https://huggingface.co/docs/trl/index) library.

+ - **Base Model:** `unsloth/csm-1b`
+ - **Fine-tuning Method:** LoRA
+ - **Training Frameworks:** Unsloth, TRL
+ - **Dataset:** [TurkishCodeMan/tts-medium-clean](https://huggingface.co/datasets/TurkishCodeMan/tts-medium-clean)
+ - **Languages:** English, Turkish
+ - **License:** Apache-2.0

+ ---
+
+ ## 🚀 Intended Use
+ - Convert text to high-quality speech.
+ - Research and experimentation in TTS models.
+ - Transfer learning and downstream fine-tuning.
+
+ ⚠️ **Not intended** for harmful or malicious use (hate speech, deepfakes, etc.).
+
+ ---
+
+ ## 🛠️ Training Details
+ - **Method:** LoRA (low-rank adaptation) applied to the transformer layers.
+ - **Effective Batch Size:** 16 (per-device batch size 8 × 2 gradient accumulation steps).
+ - **Epochs:** 3
+ - **Trainable Parameters:** ~29M of 1.66B (≈1.75% of all parameters).
+ - **Hardware:** 1x GPU.
+ - **Optimizer:** AdamW.
+ - **Learning Rate Schedule:** Linear decay with warmup.
+
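The arithmetic behind the training details above can be checked with a small sketch. It reproduces the effective batch size and the trainable-parameter share from the figures in the list, and shows one common form of a linear-warmup, linear-decay learning rate schedule. The peak learning rate, warmup steps, and total steps are hypothetical values picked for illustration; the model card does not state them, and the helper names are not part of any library.

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of samples that contribute to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


def trainable_percent(trainable_params: float, total_params: float) -> float:
    """Share of parameters updated during LoRA fine-tuning, in percent."""
    return 100.0 * trainable_params / total_params


def linear_warmup_decay(step: int, warmup_steps: int, total_steps: int, peak_lr: float) -> float:
    """Learning rate at `step`: linear ramp up to peak_lr, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Figures taken from the list: batch 8, accumulation 2, one GPU.
    print("effective batch size:", effective_batch_size(8, 2))
    # ~29M trainable parameters out of ~1.66B total.
    print("trainable share (%):", round(trainable_percent(29e6, 1.66e9), 2))
    # Hypothetical schedule values, chosen only for illustration.
    for step in (0, 5, 30, 60):
        lr = linear_warmup_decay(step, warmup_steps=5, total_steps=60, peak_lr=2e-4)
        print(f"step {step:>2}: lr = {lr:.6f}")
```

The schedule rises from zero to the peak over the warmup steps and then falls linearly to zero at the final step, which is the usual shape implied by "linear decay with warmup".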
+ ---
+
+ ## 📊 Dataset
+ The model was fine-tuned on **[TurkishCodeMan/tts-medium-clean](https://huggingface.co/datasets/TurkishCodeMan/tts-medium-clean)**.
+ This dataset contains clean speech-text pairs suitable for TTS tasks.
+
+ ---
+
+ ## 🔧 How to Use
+
+ ```python
63
+ from transformers import AutoModelForCausalLM, AutoTokenizer
64
+
65
+ model = AutoModelForCausalLM.from_pretrained("TurkishCodeMan/csm-1b-tts-lora")
66
+ tokenizer = AutoTokenizer.from_pretrained("TurkishCodeMan/csm-1b-tts-lora")
67
+
68
+ text = "Hi !"
69
+ inputs = tokenizer(text, return_tensors="pt")
70
+
71
+ outputs = model.generate(**inputs, max_new_tokens=200)
72
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))