ereniko committed · verified
Commit 4cf7d78 · 1 parent: 29d3597

Update README.md

Files changed (1): README.md (+165 -3)
---
license: mit
language:
- en
tags:
- linux
- bash
- terminal
- experimental
- from-scratch
pipeline_tag: text-generation
---

# LaaLM-v2

**A small language model trained from scratch to be a Linux terminal. That's it.**

---

## What is this?

LaaLM-v2 is a ~100M-parameter transformer trained from random weights to simulate bash commands. No pretrained base model, no finetuning - just straight training on terminal data.

Why? Because LaaLM-exp-v1 used a 3B pretrained model and knew far more than it needed to (things like `sudo`, `apt`, and networking). That's 30x more parameters than necessary. This version is built to know only what it needs to know.

**Current status:** in training. Things might change.

---
## What it can do

Basic file stuff:
- `pwd`, `ls`, `cd` (including `cd ..`)
- `touch`, `cat`, `echo`
- `echo text > file`, `echo text >> file`
- `mv`, `cp`, `rm` (with `-r` and `-rf`)
- `mkdir` (with `-p`)

Text processing:
- `grep pattern file`
- `head`, `tail`, `wc`
- `find`

Pipes work too:
- `cat file | grep something`
- `cat file | head -5`
- `cat file | wc`

And it shows its reasoning for complex commands:
```bash
$ cat log.txt | grep error
<REASON>
STEP1: execute(cat log.txt) -> output="..."
STEP2: pipe_to(grep error)
-> filtered=3 lines
</REASON>
error: connection failed
error: timeout
error: invalid input
```
This is what the training data currently looks like. The actual model's output style may differ, but don't expect a major difference.

---
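Given that format, a caller will want to strip the reasoning trace before showing output to a user. Here is a minimal sketch of how that could work, based only on the `<REASON>...</REASON>` markers shown above (the helper name is our own, not part of any released API):

```python
import re

def split_reason(model_output: str) -> tuple[str, str]:
    """Split a LaaLM-style response into (reasoning, visible output).

    Assumes the <REASON>...</REASON> block format shown in the example
    above; returns an empty reasoning string if no block is present.
    """
    match = re.search(r"<REASON>\n?(.*?)\n?</REASON>\n?", model_output, re.DOTALL)
    if not match:
        return "", model_output
    reasoning = match.group(1).strip()
    # Everything outside the block is what the "terminal" actually prints.
    visible = (model_output[:match.start()] + model_output[match.end():]).strip()
    return reasoning, visible

response = (
    "<REASON>\n"
    'STEP1: execute(cat log.txt) -> output="..."\n'
    "STEP2: pipe_to(grep error)\n"
    "-> filtered=3 lines\n"
    "</REASON>\n"
    "error: connection failed\n"
    "error: timeout\n"
)
reasoning, visible = split_reason(response)
print(visible.splitlines()[0])  # -> error: connection failed
```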
## Training

Training of LaaLM-v2 is still a bit rough.

As of 19 February 2026, LaaLM-v2 has been through three training rounds on roughly 17.5k conversation examples.

The model currently suffers from severe hallucination, which we are trying to solve. Experimentation has also eaten a lot of our budget, which caused slowdowns.

For hardware, we originally planned to use an NVIDIA RTX PRO 6000, but software errors pushed us onto an AMD MI300X instead. We also have plans to try TPUs.

Our new training code shows good improvements, and we hope for better results once the budget situation is resolved. For now, though, the current versions are pretty rough.

Our latest test resulted in this:
```bash
$ ls
touch log_4
cat log_4 | grep data
STEP1: execute(cat log_4) -> output=""
STEP2: pipe_to(grep data, input="")
-> filtered=0 lines
cat log_4 | grep data
STEP1: execute(cat log_4) -> output=""
STEP2: pipe_to(grep data, input="")
-> filtered=0 lines
cat log_4 | head
```

This is from LaaLM-v2.

---
## How to use it

*(Code examples coming soon - the model is still training and the API might change.)*

Basic idea:
1. Load the model
2. Start with a system message describing the initial state
3. Send commands as user messages
4. Get outputs back as assistant messages
5. The model remembers what you did (files you created, directories you made, etc.)
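The steps above boil down to an ordinary chat loop. A minimal sketch of the message bookkeeping, with a stub standing in for the real model call (the model isn't released yet, so every name here is hypothetical):

```python
# Sketch of the intended usage loop. `fake_generate` stands in for the real
# model call; only the message-history bookkeeping is meant to be illustrative.

def fake_generate(messages):
    # Placeholder: a real call would run the model on `messages`.
    return "(empty output)"

def run_command(history, command):
    """Append a user command, get the simulated output, record both."""
    history.append({"role": "user", "content": command})
    output = fake_generate(history)
    history.append({"role": "assistant", "content": output})
    return output

history = [{"role": "system",
            "content": "You are a bash terminal. cwd=/home/user, empty directory."}]
run_command(history, "touch myfile.txt")
run_command(history, "ls")
print(len(history))  # -> 5 (system + 2 commands + 2 outputs)
```

Because the full history goes back in on every turn, the model can stay consistent with earlier commands - that is the "remembers what you did" part of step 5.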
Example conversation:
```bash
$ touch myfile.txt
(empty output)

$ echo hello > myfile.txt
(empty output)

$ cat myfile.txt
hello

$ ls
myfile.txt

$ mv myfile.txt renamed.txt
(empty output)

$ cat renamed.txt
hello
```

---
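What makes this hard is that the model has to track all of that state implicitly in its context. Made explicit, the state in the conversation above is just a mapping of names to contents - a toy simulator of ours (not the model's internals) makes the idea concrete:

```python
# A toy, explicit version of the state the model must learn to track
# implicitly: a flat mapping of filename -> contents. Illustrative only;
# it handles just touch/ls/cat/mv with whitespace-split arguments.

files: dict[str, str] = {}

def run(cmd: str) -> str:
    parts = cmd.split()
    if parts[0] == "touch":
        files.setdefault(parts[1], "")
        return ""
    if parts[0] == "ls":
        return "\n".join(sorted(files))
    if parts[0] == "cat":
        return files.get(parts[1], f"cat: {parts[1]}: No such file or directory")
    if parts[0] == "mv":
        files[parts[2]] = files.pop(parts[1])
        return ""
    return f"{parts[0]}: command not found"

run("touch myfile.txt")
run("mv myfile.txt renamed.txt")
print(run("ls"))  # -> renamed.txt
```

LaaLM-v2 has no such dict: every `ls` answer has to be reconstructed from the command history alone, which is exactly where the hallucinations above come from.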
## Why train from scratch?

**LaaLM-v1** was a T5 (encoder-decoder) model. It worked okay, but it couldn't really hold a conversation.

**LaaLM-exp-v1** was a finetuned Qwen 3B. It worked great, but it was 3B parameters! It knew about `sudo`, `apt`, networking - all stuff it didn't need. It felt wasteful.

**LaaLM-v2** is trained from scratch on just what it needs: ~100M parameters, no bloat. Every parameter is there for bash commands and nothing else.

Yes, training from scratch is harder (no pretrained stopping behavior, and it needs far more data), but the purity is worth it.

---
## What it doesn't do

This isn't a real terminal. It simulates bash behavior but doesn't actually run commands.

No networking (`ssh`, `wget`, `curl`), no package managers (`apt`, `pip`), no system administration (`sudo`, `systemctl`), no scripting (loops, variables), no permissions (`chmod`, users).

We have considered basic bash scripting, but we don't know yet whether we can handle it.

---
## Status

Currently training. If you're seeing this, the model might not be uploaded yet or might change significantly.

Check back later or watch the repo for updates.

---

## License

MIT