
LITCOIN x Gemma

litcoin-gemma-mobile

A phone-sized coding model, fine-tuned on nothing but verified data produced by the LITCOIN network of AI research agents.

This is a LoRA adapter for google/gemma-4-E2B-it, Google's small on-device Gemma variant. On held-out problems graded by real sandbox execution, it took the base model from 17.7% to 36.9% pass@1, more than doubling it.

Results

All scores are on held-out problems that neither model was trained on, graded by running the code against the real test harness the LITCOIN protocol uses to pay miners. No scores are self-reported: a solution counts only if it actually executes and produces the right answer.
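To make "graded by real execution" concrete, here is a minimal sketch of execution-based pass@1 scoring. It is not the LITCOIN harness, whose internals are not described here: the function names, the stdin/stdout comparison, and the 10-second timeout are all assumptions chosen for illustration, and a plain subprocess is not a real security sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes(solution_code: str, stdin_text: str, expected_stdout: str,
           timeout_s: float = 10.0) -> bool:
    """Run one candidate solution in a fresh interpreter and compare its output.

    Hypothetical grader for illustration only: the real harness may use
    different isolation, inputs, and comparison rules.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "solution.py"
        script.write_text(solution_code, encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout_s,
                cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            return False  # a hang counts as a failure
    # A crash (non-zero exit code) and a wrong answer both count as failures.
    return result.returncode == 0 and result.stdout.strip() == expected_stdout.strip()


def pass_at_1(attempts) -> float:
    """attempts: list of (solution_code, stdin_text, expected_stdout), one per problem."""
    outcomes = [passes(code, stdin, expected) for code, stdin, expected in attempts]
    return sum(outcomes) / len(outcomes)
```

With one attempt per problem, pass@1 is simply the fraction of problems whose single generated solution runs and prints the expected answer.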

Task family                Base (gemma-4-E2B-it)    litcoin-gemma-mobile
ARC grid reasoning         0.0%                     92.9%
Project Euler              41.2%                    88.2%
HuggingFace tasks          46.7%                    60.0%
LiveCodeBench              3.3%                     10.0%
Rosalind bioinformatics    0.0%                     10.0%
Codeforces                 0.0%                     0.0%
Overall                    17.7% (25/141)           36.9% (52/141)

That is a 19.2-point absolute gain and a 108% relative improvement. The largest wins are on tasks with strict input/output conventions (ARC, Project Euler), which the untuned model fails for lack of exposure rather than lack of intelligence. Codeforces (competitive programming) stayed at zero. We report the full holdout, including that zero, because the proof is only worth as much as its honesty.
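The headline figures can be reproduced from the pass counts in the Overall row (25 and 52 out of 141). The sketch below shows the arithmetic; the helper names are illustrative only.

```python
def pass_rate(passed: int, total: int) -> float:
    """Pass rate as a percentage."""
    return 100.0 * passed / total


def relative_improvement(base_passed: int, tuned_passed: int) -> float:
    """Relative improvement in percent; the shared denominator cancels out."""
    return 100.0 * (tuned_passed - base_passed) / base_passed


TOTAL = 141
BASE_PASSED, TUNED_PASSED = 25, 52

base = pass_rate(BASE_PASSED, TOTAL)
tuned = pass_rate(TUNED_PASSED, TOTAL)

# The 19.2-point figure is the difference of the rounded percentages
# (36.9 - 17.7); the unrounded difference, 27/141, is about 19.15 points.
absolute_gain = round(tuned, 1) - round(base, 1)
relative_gain = relative_improvement(BASE_PASSED, TUNED_PASSED)

print(f"base {base:.1f}%, tuned {tuned:.1f}%")
print(f"absolute gain {absolute_gain:.1f} points, relative gain {relative_gain:.0f}%")
```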

[Figure: per-source results before and after]

A companion 12-billion-parameter model, trained the same way on the same network's data, went from 31% to 53.4%. The smaller model gained more in relative terms (108% vs 72%): less capacity means it fails hardest on exactly the conventions the data teaches. Writeup: litcoin.app/proof.

Use

import torch
from transformers import AutoTokenizer, Gemma4ForConditionalGeneration
from peft import PeftModel

base = "google/gemma-4-E2B-it"   # gated: accept Google's Gemma terms on its HF page first
tok = AutoTokenizer.from_pretrained(base)
model = Gemma4ForConditionalGeneration.from_pretrained(
    base, dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, "tekkaadan/litcoin-gemma-mobile")
model.eval()
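The snippet above stops once the adapter is loaded. A minimal generation sketch follows, assuming `model` and `tok` are the objects created above and that the tokenizer's chat template accepts plain string content. The helper name `complete` and the `max_new_tokens` default are illustrative choices, not part of the original card.

```python
def complete(model, tok, prompt: str, max_new_tokens: int = 512) -> str:
    """Greedy single-turn completion (hypothetical helper for illustration)."""
    messages = [{"role": "user", "content": prompt}]
    # Render the chat template and tokenize in one step.
    inputs = tok.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so only the newly generated text is decoded.
    prompt_len = inputs["input_ids"].shape[-1]
    return tok.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

With the model loaded as shown earlier, a call might look like `print(complete(model, tok, "Write a Python function that reverses a string."))`.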

GGUF (llama.cpp / Ollama / on-device)

A merged Q4_K_M GGUF (litcoin-gemma-mobile-Q4_K_M.gguf, ~3.4 GB) is included for llama.cpp, Ollama, and on-device runtimes. No separate base download is required:

# Ollama
ollama run hf.co/tekkaadan/litcoin-gemma-mobile

# or llama.cpp directly
llama-cli -hf tekkaadan/litcoin-gemma-mobile:Q4_K_M -p "Write a Python function that ..."

Training

  • Base: google/gemma-4-E2B-it (Gemma's "Efficient 2B" on-device variant)
  • Method: QLoRA, 4-bit NF4, r=16, ~58,000 verified solutions across 9 task families, 2 epochs
  • Data: sandbox-verified LITCOIN submissions only. Every example passed execution before it entered the training set. Nothing synthetic, nothing scraped.
  • Provenance: every verified submission is anchored to a public, content-addressed GitLawb repository, so the data's existence and integrity are independently checkable.
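The provenance point rests on content addressing: a record's identifier is derived from its bytes, so anyone holding the record can recompute the identifier and detect tampering. The sketch below illustrates the idea only. The SHA-256 hash, the canonical-JSON encoding, and the record fields are assumptions; the scheme GitLawb actually uses is not described here.

```python
import hashlib
import json


def content_address(submission: dict) -> str:
    """Derive an identifier from the record's content (hypothetical scheme)."""
    # Canonical encoding: sorted keys and fixed separators, so the same
    # content always produces the same bytes regardless of key order.
    canonical = json.dumps(
        submission, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def verify(submission: dict, published_address: str) -> bool:
    """Check a local copy of a record against its published address."""
    return content_address(submission) == published_address


# Hypothetical record; the field names are made up for illustration.
record = {"task": "example-task", "solution": "print(42)", "verified": True}
address = content_address(record)

tampered = dict(record, solution="print(43)")
print(verify(record, address), verify(tampered, address))
```

Changing any byte of the record changes its address, which is why publishing the addresses makes both existence and integrity independently checkable.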

License

A derivative of Gemma. Use is governed by the Gemma Terms of Use; these adapter weights are released under the same terms.

Built by the LITCOIN network. litcoin.app
