LaaLM-exp-v1 GGUF

Quantized GGUF versions of LaaLM-exp-v1 for llama.cpp, Ollama, and other GGUF-compatible inference engines.

Model Details

LaaLM-exp-v1 is a 3B parameter model that emulates a Linux terminal through conversation-based state tracking. It maintains filesystem state internally and supports 12 Linux commands with 95.4% benchmark accuracy.

See the main model card for full documentation.

Available Quantizations

Filename              Quant    Size     Use Case
exp-v1-Q2_K.gguf      Q2_K     1.27 GB  Smallest size, lower quality
exp-v1-Q3_K_S.gguf    Q3_K_S   1.45 GB  Small, decent quality
exp-v1-Q3_K_M.gguf    Q3_K_M   1.59 GB  Balanced small size
exp-v1-Q3_K_L.gguf    Q3_K_L   1.71 GB  Larger Q3 variant
exp-v1-IQ4_XS.gguf    IQ4_XS   1.75 GB  Importance matrix, high quality
exp-v1-Q4_K_S.gguf    Q4_K_S   1.83 GB  Good balance
exp-v1-Q4_K_M.gguf    Q4_K_M   1.93 GB  Recommended
exp-v1-Q5_K_S.gguf    Q5_K_S   2.17 GB  High quality
exp-v1-Q5_K_M.gguf    Q5_K_M   2.22 GB  Higher quality
exp-v1-Q6_K.gguf      Q6_K     2.54 GB  Near-original quality
exp-v1-Q8_0.gguf      Q8_0     3.29 GB  Maximum quality
exp-v1-fp16.gguf      fp16     6.18 GB  Original precision

Recommended: Q4_K_M for best quality/size balance.

Usage Examples

llama.cpp

# Download model
huggingface-cli download LaaLM/LaaLM-exp-v1-GGUF exp-v1-Q4_K_M.gguf --local-dir .

# Run inference
./llama-cli -m exp-v1-Q4_K_M.gguf \
  --color \
  -p "You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user

User: pwd
Assistant:"
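
The prompt passed to llama-cli above can also be assembled programmatically. A minimal sketch (the `build_prompt` helper and its parameters are my own; the output format mirrors the example above):

```python
def build_prompt(cwd, files, env, command):
    """Build a prompt in the format shown in the llama-cli example.

    files: list of filenames (rendered as "(empty)" when there are none).
    env:   dict of environment variables.
    """
    file_list = ", ".join(files) if files else "(empty)"
    env_list = ", ".join(f"{k}={v}" for k, v in env.items())
    return (
        "You are a Linux terminal emulator. Initial state:\n"
        f"Current directory: {cwd}\n"
        f"Files: {file_list}\n"
        f"Environment: {env_list}\n"
        f"\nUser: {command}\nAssistant:"
    )

prompt = build_prompt("/home/user", [], {"USER": "user", "HOME": "/home/user"}, "pwd")
```

The resulting string is identical to the `-p` argument above, so the same builder works for any initial state or command.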

Ollama

Create Modelfile:

FROM ./exp-v1-Q4_K_M.gguf

SYSTEM """You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user"""

PARAMETER temperature 0
PARAMETER top_p 1

Then:

ollama create laalm-exp-v1 -f Modelfile
ollama run laalm-exp-v1

Example session:

>>> pwd
/home/user

>>> touch test.txt
(empty)

>>> ls
test.txt

>>> echo hello > test.txt
(empty)

>>> cat test.txt
hello
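
A session like the one above can also be driven from code via Ollama's local HTTP API (the `/api/generate` endpoint on the default port 11434). A minimal sketch, assuming the model was created under the name `laalm-exp-v1` as shown earlier:

```python
import json
import urllib.request

# Request payload for Ollama's /api/generate endpoint; temperature 0
# matches the Modelfile above.
payload = {
    "model": "laalm-exp-v1",
    "prompt": "pwd",
    "stream": False,
    "options": {"temperature": 0},
}

def run_command(payload, host="http://localhost:11434"):
    """Send one command to a locally running Ollama server."""
    req = urllib.request.Request(
        host + "/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# run_command(payload)  # requires a running Ollama server
```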

Python (llama-cpp-python)

pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama(
    model_path="exp-v1-Q4_K_M.gguf",
    n_ctx=2048,
    n_threads=8,
    verbose=False
)

# Initialize conversation
system_prompt = """You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user"""

conversation = f"{system_prompt}\n\nUser: pwd\nAssistant:"

output = llm(
    conversation,
    max_tokens=150,
    temperature=0.0,
    stop=["User:", "\n\n"]
)

print(output['choices'][0]['text'])
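
Because the model tracks state through the conversation itself, multi-turn use means re-sending the full transcript on every call. A small sketch of that bookkeeping (the `Transcript` class is my own, not part of llama-cpp-python):

```python
# Same system prompt as in the example above.
SYSTEM_PROMPT = """You are a Linux terminal emulator. Initial state:
Current directory: /home/user
Files: (empty)
Environment: USER=user, HOME=/home/user"""

class Transcript:
    """Accumulates User/Assistant turns so each call sees prior state."""

    def __init__(self, system_prompt):
        self.text = system_prompt

    def prompt_for(self, command):
        """Return the full prompt for the next command."""
        return f"{self.text}\n\nUser: {command}\nAssistant:"

    def record(self, command, output):
        """Append a completed turn to the transcript."""
        self.text = f"{self.text}\n\nUser: {command}\nAssistant: {output}"

t = Transcript(SYSTEM_PROMPT)
first = t.prompt_for("touch test.txt")
t.record("touch test.txt", "(empty)")
second = t.prompt_for("ls")
```

Each turn, pass `t.prompt_for(cmd)` to `llm(...)` and feed the stripped completion back with `t.record(cmd, output)`, so later commands see files created earlier.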

Supported Commands

  • pwd - Print working directory
  • ls - List files
  • echo - Print text
  • touch - Create empty file
  • cat - Display file contents
  • mkdir - Create directory
  • cd - Change directory
  • rm - Remove file
  • mv - Move/rename file
  • cp - Copy file
  • echo > - Write to file
  • grep - Search in file
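
Commands outside this set are not guaranteed to behave correctly, so a caller may want to reject them up front. A minimal client-side guard (the helper and its policy are my own):

```python
# The supported commands listed above; "echo >" redirection is covered
# by the echo entry, since the first word is still "echo".
SUPPORTED = {"pwd", "ls", "echo", "touch", "cat", "mkdir",
             "cd", "rm", "mv", "cp", "grep"}

def is_supported(command_line):
    """Check the first word of a command line against the supported set."""
    parts = command_line.strip().split()
    return bool(parts) and parts[0] in SUPPORTED
```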

Performance

  • Overall Accuracy: 95.4% on benchmarks (measured on the original LaaLM-exp-v1; quantized variants may score lower depending on the quantization level)
  • File Persistence: Tracks created and modified files across the conversation
  • Error Handling: Produces bash-style error messages for invalid operations
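
Bash-style errors follow the usual `command: target: message` shape, which a caller can detect heuristically. A sketch (the pattern below covers only a few common messages and is my assumption, not a specification of the model's output):

```python
import re

# Heuristic for bash-style error lines such as
# "cat: missing.txt: No such file or directory".
ERROR_RE = re.compile(
    r"^\w+: .*: (No such file or directory|"
    r"Permission denied|Is a directory|Not a directory)$"
)

def looks_like_error(output):
    """Return True if a single output line resembles a bash error."""
    return bool(ERROR_RE.match(output.strip()))
```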

Quantization Quality Guide

  • Q2_K - Q3_K: May occasionally make mistakes on complex file operations
  • Q4_K_M - Q5_K_M: Near-original quality, recommended for most use cases
  • Q6_K - fp16: Closest to original model performance

System Requirements

Quantization  RAM Required  Notes
Q2_K          ~2 GB         Fastest
Q4_K_M        ~3 GB         Recommended
Q6_K          ~4 GB         High quality
fp16          ~8 GB         Slowest
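
The approximate RAM figures above can drive automatic selection of a quantization. A minimal sketch (the helper itself is my own; the values come from the table):

```python
# (quant, approximate RAM in GB) from the table above, smallest first.
RAM_REQUIREMENTS = [("Q2_K", 2), ("Q4_K_M", 3), ("Q6_K", 4), ("fp16", 8)]

def best_quant(available_gb):
    """Pick the highest-quality quantization that fits in available RAM."""
    fitting = [q for q, ram in RAM_REQUIREMENTS if ram <= available_gb]
    return fitting[-1] if fitting else None
```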

License

Apache 2.0 (inherited from base model)

Links

  • Main model card: LaaLM/LaaLM-exp-v1
  • Model size: 3B params
  • Architecture: qwen2


Model tree for LaaLM/LaaLM-exp-v1-GGUF

  • Base model: Qwen/Qwen2.5-3B
  • Fine-tuned: LaaLM/LaaLM-exp-v1
  • Quantized (3): this model