mrdmnd/llada-8b-instruct-4bit-gptq

Text Generation

4-bit precision

mrdmnd committed on Jun 26, 2025

Commit e4b8175 · verified · 1 Parent(s): bd78f29

Update README.md

Files changed (1)

README.md +5 -11

@@ -7,31 +7,25 @@ pipeline_tag: text-generation
 Baby's first adventure with the diffusion language model. Had to quantize this so it would fit on a 3080TI - all I've got!
-Used modal to do so:
-```
-"""
-This script uses Modal to quantize the LLaDA family of models.
 First, install modal and log into the CLI.
-```
 uv add modal
 uv run modal login
-```
 Then, add an environment and a volume to the project.
-```
 uv run modal volume create quantized-model-output
-```
 Then, run the quantization script:
-```
 uv run modal run scripts/quantize_llada.py
 ```
-"""
 import modal

 Baby's first adventure with the diffusion language model. Had to quantize this so it would fit on a 3080TI - all I've got!
+Used modal to do so. If you want to replicate what I did, try this:
 First, install modal and log into the CLI.
 uv add modal
 uv run modal login
 Then, add an environment and a volume to the project.
 uv run modal volume create quantized-model-output
 Then, run the quantization script:
 uv run modal run scripts/quantize_llada.py
 ```
+# scripts/quantize_llada.py
 import modal