|
|
--- |
|
|
language: |
|
|
- en |
|
|
license: apache-2.0 |
|
|
pipeline_tag: text-generation |
|
|
tags: |
|
|
- x |
|
|
- twitter |
|
|
- tweet |
|
|
- persona |
|
|
- router |
|
|
- lora |
|
|
- peft |
|
|
- qlora |
|
|
base_model: |
|
|
- Qwen/Qwen3-14B
- mistralai/Mistral-Nemo-Instruct-2407
|
|
- google/gemma-2-9b-it |
|
|
- meta-llama/Meta-Llama-3.1-8B-Instruct |
|
|
- microsoft/Phi-3.5-mini-instruct |
|
|
--- |
|
|
|
|
|
# Miko X Tweet Ensemble — multi-base, router-driven LoRA stack |
|
|
|
|
|
> This model has been trained using [Miko](https://x.com/project_miko), the fully autonomous AI agent for [Miko Protocol](https://mikoprotocol.com). |
|
|
|
|
|
[@project_miko on X](https://x.com/project_miko) · [mikoprotocol.com](https://mikoprotocol.com)
|
|
|
|
|
<img src="https://ik.imagekit.io/ueitnjew7/miko_huggingface.png" alt="Miko banner">
|
|
|
|
|
## What it is |
|
|
Miko is a multi-base, multi-adapter ensemble **built for X/Twitter**. |
|
|
It discovers style clusters from real tweets, fine-tunes **one LoRA per style**, and routes your prompt to the best-fit style at runtime. |
|
|
|
|
|
--- |
|
|
|
|
|
## Why it’s different |
|
|
- **Multi-base adapters by design.**
  Not tied to a single model family. Style adapters originate from multiple bases:
  - `Qwen/Qwen3-14B`
  - `mistralai/Mistral-Nemo-Instruct-2407`
  - `google/gemma-2-9b-it`
  - `meta-llama/Meta-Llama-3.1-8B-Instruct`
  - `microsoft/Phi-3.5-mini-instruct`

- **X-native behavior.**
  Short form, emoji/hashtag cadence, memes/irony, and a fast “CT” (Crypto Twitter) tone.

- **Router that understands styles.**
  Uses Qwen3-14B hidden states with prototype similarity plus a small projection head to pick a style before generation (sketched below).
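
For intuition, here is a minimal sketch of that routing step. It is an illustrative reconstruction, not the shipped `inference.py` code; the mean pooling, function signature, and variable names are assumptions, while the prototypes and projection head correspond to what `router/router_state.pt` stores (see Files below).

```python
import torch
import torch.nn.functional as F

def route_style(prompt, backbone, tokenizer, proj, prototypes):
    """Backbone features -> projection -> cosine similarity vs. style prototypes."""
    ipt = tokenizer(prompt, return_tensors="pt").to(backbone.device)
    with torch.no_grad():
        hidden = backbone(**ipt, output_hidden_states=True).hidden_states[-1]
    feat = hidden.mean(dim=1)                      # mean-pool tokens -> (1, d_model)
    z = F.normalize(proj(feat), dim=-1)            # small projection head + L2 norm
    sims = z @ F.normalize(prototypes, dim=-1).T   # cosine similarity per style
    return sims.argmax(dim=-1).item()              # best-fit style ID
```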
|
|
|
|
|
--- |
|
|
|
|
|
## Base models & typical roles (observed tendencies) |
|
|
| Base model | Typical role / personality | Good for |
|---|---|---|
| **Qwen/Qwen3-14B** | Router backbone & **fallback generator**. Balanced, hashtag-friendly. | General comments, quick Q&A, mentions |
| **mistralai/Mistral-Nemo-Instruct-2407** | **Crisp technical tone**; list-like facts, tight bullets. | Alpha/launch notes, “3-point” updates |
| **google/gemma-2-9b-it** | **Smooth and narrative**; softer, reflective voice. | Story-like replies, mini-threads |
| **meta-llama/Meta-Llama-3.1-8B-Instruct** | **Clear directives / neutral composition.** | How-to tweets, best practices |
| **microsoft/Phi-3.5-mini-instruct** | **Snappy one-liners**; memes/emoji friendly. | Witty hooks, irony, punchy replies |
|
|
|
|
|
*(Roles are tendencies learned from tweet data; they’re not hard rules.)* |
|
|
|
|
|
## Training data |
|
|
|
|
|
**Proprietary — Miko Agent Tweet Corpus.** |
|
|
Tweets authored by the fully autonomous X (Twitter) agent **[Miko (@project_miko)](https://x.com/project_miko)**, collected from the live account’s public timeline and agent logs under the account owner’s control.
|
|
- Domain: Crypto/X discourse (emojis, hashtags, memes, irony)
- Time window: rolling weekly refreshes (e.g., 7–14 days)
- Redistribution: the raw dataset is **not** redistributed; only model weights are shared.
- Preprocessing: light normalization/filters, deduplication; style clustering via HDBSCAN (sketched below).
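
The clustering step can be approximated with off-the-shelf tools. A minimal sketch, assuming `sentence-transformers` for tweet embeddings; the embedding model, `min_cluster_size`, and the `load_tweet_corpus` helper are all illustrative, not the actual pipeline:

```python
import hdbscan
from sentence_transformers import SentenceTransformer

tweets = load_tweet_corpus()  # placeholder: your normalized, deduplicated tweet texts

# Embed tweets (this embedding model is an assumption, not the one Miko uses).
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode(tweets, normalize_embeddings=True)

# HDBSCAN discovers a variable number of clusters (style IDs); label -1 marks noise.
style_ids = hdbscan.HDBSCAN(min_cluster_size=25).fit_predict(embeddings)
```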
|
|
|
|
|
--- |
|
|
|
|
|
## How it works (high-level) |
|
|
1. **Style discovery** — cluster tweet embeddings (e.g., HDBSCAN) to assign **style IDs**. |
|
|
2. **Per-style LoRA** — train one adapter per style, possibly from different base models (see the sketch after this list).
|
|
3. **Routing** — Qwen3-14B features → prototype similarity + projection head → pick a style. |
|
|
4. **Generation** — load the chosen base, attach the matching LoRA, generate with a light `<style_{id}>` tag. |
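
A minimal sketch of step 2 with `peft`; the rank, alpha, dropout, and target modules below are illustrative assumptions, not the values used for the released adapters:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

def new_style_adapter(base_id: str):
    """Attach a fresh LoRA adapter to one of the ensemble's bases."""
    model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    cfg = LoraConfig(
        r=16,                      # illustrative hyperparameters
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    return get_peft_model(model, cfg)

# Train each adapter on its style's tweets (prefixed with the <style_{id}> tag),
# then save with model.save_pretrained(f"lora_adapters/style_{sid}_lora").
```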
|
|
|
|
|
--- |
|
|
|
|
|
## Quickstart |
|
|
```python
from inference import MikoEnsemble

# Load the ensemble from the repo root: the router picks a style,
# then the matching base + LoRA adapter generates the reply.
ens = MikoEnsemble(".")
print(ens.generate("CT keeps fading this rally. What's your take?"))
```
|
|
|
|
|
## Force a style (advanced) |
|
|
```python
def generate_with_style(ens, sid, prompt, **gen):
    """Bypass the router and force style `sid` (uses the ensemble's private loader)."""
    styled = f"<style_{sid}>{prompt}"
    model, tok = ens._load_adapter_with_base(sid)
    ipt = tok(styled, return_tensors="pt", truncation=True, max_length=256).to(model.device)
    out = model.generate(
        **ipt,
        max_new_tokens=gen.get("max_new_tokens", 120),
        temperature=gen.get("temperature", 0.8),
        do_sample=True,
        top_p=0.95,
        pad_token_id=tok.pad_token_id if tok.pad_token_id is not None else tok.eos_token_id,
        eos_token_id=tok.eos_token_id,
    )
    # Decode only the newly generated tokens so the styled prompt never leaks
    # into the output (more robust than str.replace on the decoded text).
    new_tokens = out[0][ipt["input_ids"].shape[1]:]
    return tok.decode(new_tokens, skip_special_tokens=True).strip()
```
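
Example call; style ID `3` is hypothetical, so check `README_METADATA.json` for the real IDs:

```python
from inference import MikoEnsemble

ens = MikoEnsemble(".")
print(generate_with_style(ens, sid=3, prompt="gm CT, what are we watching today?", max_new_tokens=80))
```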
|
|
|
|
|
## VRAM & speed tips |
|
|
- 4-bit loading (NF4, double quantization, bf16 compute) is supported; 16–24 GB of VRAM is enough for one adapter at a time.
|
|
- A small LRU cache keeps recently used styles in memory (default 2). |
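
Those 4-bit settings map onto a standard `bitsandbytes` config. A minimal sketch of loading one base plus adapter that way, with a small LRU cache over styles; this mirrors the described behavior, not the exact `inference.py` code:

```python
import torch
from functools import lru_cache
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 quantization
    bnb_4bit_use_double_quant=True,         # double quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16 compute
)

@lru_cache(maxsize=2)  # mirrors the default style-cache size
def load_style(base_id: str, adapter_dir: str):
    base = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb, device_map="auto")
    return PeftModel.from_pretrained(base, adapter_dir)
```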
|
|
|
|
|
--- |
|
|
|
|
|
## Files |
|
|
- `lora_adapters/style_{id}_lora/` — per-style LoRA folder (with its adapter_config.json). |
|
|
- `router/router_state.pt` — router head (prototypes + projection). |
|
|
- `inference.py` — lazy loader + generator. |
|
|
- `README_METADATA.json` — style IDs, number of styles, base list, timestamp. |
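
For quick inspection, both artifacts can be loaded directly; the key names inside each file are not documented here, so the prints below simply reveal them:

```python
import json
import torch

with open("README_METADATA.json") as f:
    meta = json.load(f)  # style IDs, number of styles, base list, timestamp
print(meta)

router_state = torch.load("router/router_state.pt", map_location="cpu")
print(list(router_state))  # prototype / projection tensors (key names unspecified)
```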
|
|
|
|
|
--- |
|
|
|
|
|
## Intended use (tweet personas) |
|
|
- Witty/ironic one-liners — hooks, memes, playful replies |
|
|
- Tech/alpha notes — launch takeaways, bullet summaries, link threads |
|
|
- Narrative reframing — bullish/bearish angles, story-style posts |
|
|
- Q&A / reply bots — short, clear responses in mentions/threads |
|
|
|
|
|
## Limitations
|
|
- Optimized for tweets & short threads; not a general chatbot. |
|
|
- Each base model retains its own license/terms. |
|
|
|
|
|
## License |
|
|
Apache-2.0. |
|
|
|
|
|
## Acknowledgements |
|
|
Thanks to the Qwen, Mistral, Gemma-2, Llama-3.1, and Phi-3.5 communities. |
|
|
|
|
|
|
|
|
## Changelog |
|
|
- 2026-01-03: weekly refresh (days=7); retrained adapters & router. |
|
|
|