Update README.md
README.md CHANGED

@@ -145,6 +145,20 @@ print(run_command("rm test.txt"))  # (empty)
 print(run_command("ls"))  # backup.txt
 ```
 
+## Quantized Versions
+
+GGUF quantizations are available for CPU inference and lower memory usage:
+
+**[LaaLM-exp-v1-GGUF](https://huggingface.co/ereniko/LaaLM-exp-v1-GGUF)**
+
+Includes Q2_K through fp16 quantizations (1.27 GB - 6.18 GB) for use with:
+- llama.cpp
+- Ollama
+- llama-cpp-python
+- Other GGUF-compatible tools
+
+Recommended: Q4_K_M (1.93 GB) for the best quality/size balance.
+
 ## Supported Commands
 
 | Command | Description | Example |
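For the llama-cpp-python route the new section lists, a minimal loading sketch. The repo id comes from the README; the Q4_K_M file-name glob and the context size are assumptions, not confirmed by the README, so check the repo's file list before running:

```python
"""Minimal sketch: loading a GGUF quantization with llama-cpp-python."""


def load_quantized(repo_id: str = "ereniko/LaaLM-exp-v1-GGUF",
                   filename: str = "*Q4_K_M.gguf"):
    """Download the first GGUF matching `filename` from the Hub and load it."""
    # Imported lazily so the sketch can be read without llama-cpp-python
    # installed; Llama.from_pretrained fetches the file via huggingface_hub
    # and memory-maps it for CPU inference.
    from llama_cpp import Llama
    # n_ctx=2048 is an assumed context size; pass other llama.cpp options here.
    return Llama.from_pretrained(repo_id=repo_id, filename=filename, n_ctx=2048)


if __name__ == "__main__":
    llm = load_quantized()
    out = llm("List the files in the current directory.", max_tokens=32)
    print(out["choices"][0]["text"])
```

The same file also works directly with `llama.cpp`'s CLI or an Ollama `Modelfile`; llama-cpp-python is shown here only because it keeps the example in Python.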