mochaV2 ✨

[Banner image: banner.png]

mochaV2 is the shiny new second version of the Mocha series: your go-to sentence completion model. It's smarter, sharper, and better at making your text correct, formal, and factually accurate.

[Icon image: icon.png]


✨ What has improved?

Compared to Mocha V1, mochaV2 is:

  • ✅ Grammatically stronger – smoother, more correct sentences
  • ✅ Factually sharper – completions you can trust
  • ✅ Formally polished – professional and structured phrasing
  • ✅ Context-aware – understands your text better
  • ⚡ Generates more content – you get longer, richer completions
  • 🐢 Slightly slower – takes a bit longer, but worth it for the quality

Note: the metrics below show the numerical and graphical improvements only; they do not show the actual sentences the model generates.


πŸ“Š Metrics

[Mocha V1 metrics chart: metricsV1.png]
[Mocha V2 metrics chart: metricsV2.png]

These side-by-side charts highlight mochaV2’s stronger performance in grammar, factual accuracy, and sentence structure.


πŸš€ How to use

Getting started is easy! Just load the model and tokenizer like this:

import json
import torch
import gradio as gr
from transformers import AutoModelForCausalLM
from huggingface_hub import hf_hub_download
# HF repo containing your model
repo_id = "theguywhosucks/mochaV2"
# Download tokenizer files
itos_file = hf_hub_download(repo_id, "itos.json")
stoi_file = hf_hub_download(repo_id, "stoi.json")
with open(stoi_file) as f:
    stoi = json.load(f)
with open(itos_file) as f:
    itos = json.load(f)
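# Note: stoi maps each character to an integer id and itos is the inverse (id -> character);
# the encode/decode methods below work one character at a time (character-level vocabulary).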
# Convert itos dict -> list if needed
if isinstance(itos, dict):
    itos = [itos[str(i)] for i in range(len(itos))]
# Tokenizer
class SimpleTokenizer:
    def __init__(self, stoi, itos):
        self.stoi = stoi
        self.itos = itos
        self.unk_token = "<unk>" if "<unk>" in stoi else itos[0]
    def encode(self, text):
        return [self.stoi.get(c, self.stoi.get(self.unk_token, 0)) for c in text]
    def decode(self, ids):
        return "".join([self.itos[i] if i < len(self.itos) else self.unk_token for i in ids])
tokenizer = SimpleTokenizer(stoi, itos)
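# Optional sanity check: a short string should round-trip through the tokenizer
# (characters missing from the vocabulary fall back to the <unk> token).
# print(tokenizer.decode(tokenizer.encode("Hello world")))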
# Load model
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float32,
    trust_remote_code=True
)
model.to(device)
model.eval()
# Gradio function
def complete_sentence(prompt, max_new_tokens=50, temperature=0.7):
    input_ids = torch.tensor([tokenizer.encode(prompt)]).to(device)
    with torch.no_grad():
        outputs = model.generate(
            input_ids,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=temperature
        )
    return tokenizer.decode(outputs[0].tolist())
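# Example direct call without the UI (the prompt here is just an illustration):
# print(complete_sentence("The meeting was postponed because", max_new_tokens=40))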
# Launch Gradio app
gr.Interface(
    fn=complete_sentence,
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(10, 200, value=50, step=10, label="Max new tokens"),
        gr.Slider(0.1, 2.0, value=0.7, step=0.1, label="Temperature")
    ],
    outputs=gr.Textbox(label="Completed Text"),
    title="Mocha Sentence Completion",
    description="Enter a prompt and get AI completions from your model."
).launch()
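
If the dependencies aren't installed yet, the imports above correspond to the standard PyPI packages (pin versions as your environment requires):

pip install torch transformers gradio huggingface_hub

Save the snippet as a Python file (for example, app.py), run it, and open the local URL that Gradio prints to try completions in your browser.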

🎨 Assets

  • trailer.mp4 – Sneak peek of mochaV2
  • banner.png – Project banner
  • icon.png – Project icon
  • metricsV1.png – Performance metrics of Mocha V1
  • metricsV2.png – Performance metrics of Mocha V2

πŸ“œ License

mochaV2 is released under the Mocha Proprietary License. Usage is subject to the terms of this license.


🎬 Watch the Trailer

The trailer is available in this repository as trailer.mp4 (see Assets above).