khazarai/Cardiology-TTS


Rustamshry committed on Sep 11, 2025

Commit 6aeb07d · verified · 1 Parent(s): 1888642

Update README.md

Files changed (1)

README.md +56 -1


@@ -1,6 +1,16 @@
 ---
 base_model: unsloth/csm-1b
 library_name: peft
 ---
 # Model Card for Model ID
@@ -71,7 +81,52 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 Use the code below to get started with the model.
-[More Information Needed]
 ## Training Details

 ---
 base_model: unsloth/csm-1b
 library_name: peft
+license: mit
+datasets:
+- Dev372/Cardiology_Medical_STT_Dataset
+language:
+- en
+pipeline_tag: text-to-speech
+tags:
+- cardiology
+- medical
+- transformers
 ---
 # Model Card for Model ID
 Use the code below to get started with the model.
+```python
+import torch
+from transformers import CsmForConditionalGeneration, AutoProcessor
+import soundfile as sf
+from peft import PeftModel
+model_id = "unsloth/csm-1b"
+device = "cuda" if torch.cuda.is_available() else "cpu"
+processor = AutoProcessor.from_pretrained(model_id)
+base_model = CsmForConditionalGeneration.from_pretrained(model_id, device_map=device)
+model = PeftModel.from_pretrained(base_model, "khazarai/Cardiology-TTS")
+text = "The coronary arteries are patent with no significant stenosis."
+speaker_id = 0
+conversation = [
+    {"role": str(speaker_id), "content": [{"type": "text", "text": text}]},
+]
+audio_values = model.generate(
+    **processor.apply_chat_template(
+        conversation,
+        tokenize=True,
+        return_dict=True,
+    ).to(device),
+    max_new_tokens=200,
+    # play with these parameters to tweak results
+    # depth_decoder_top_k=0,
+    # depth_decoder_top_p=0.9,
+    # depth_decoder_do_sample=True,
+    # depth_decoder_temperature=0.9,
+    # top_k=0,
+    # top_p=1.0,
+    # temperature=0.9,
+    # do_sample=True,
+    #########################################################
+    output_audio=True
+)
+audio = audio_values[0].to(torch.float32).cpu().numpy()
+sf.write("example.wav", audio, 24000)
+```
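The `max_new_tokens=200` setting in the snippet above caps how much audio is generated. As a rough sketch, assuming each generated token corresponds to one audio codec frame at about 12.5 frames per second (an assumption about the underlying codec, not something this model card states), the cap can be converted to a duration as follows. The helper names `max_audio_seconds` and `tokens_for_seconds` are hypothetical and not part of any library.

```python
import math

# Assumption: one generated token corresponds to one audio codec frame,
# and the codec runs at roughly 12.5 frames per second.
ASSUMED_FRAMES_PER_SECOND = 12.5


def max_audio_seconds(max_new_tokens, frames_per_second=ASSUMED_FRAMES_PER_SECOND):
    """Upper bound on clip length (seconds) for a given token budget."""
    return max_new_tokens / frames_per_second


def tokens_for_seconds(seconds, frames_per_second=ASSUMED_FRAMES_PER_SECOND):
    """Smallest token budget that covers the requested duration."""
    return math.ceil(seconds * frames_per_second)


print(max_audio_seconds(200))   # duration ceiling for the snippet's setting
print(tokens_for_seconds(30))   # budget needed for a 30-second clip
```

Under that assumption, a 200-token budget allows at most about 16 seconds of audio, so longer dictation would need a larger `max_new_tokens`.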
 ## Training Details