Nomen-AI

Nomen-AI is a production-ready pipeline for controllable, cross-lingual, morpho-phonetic synthesis of brand and YouTube channel names. It is designed to fit on a free-tier Google Colab T4 GPU (about 15 GB of usable VRAM) using Qwen2.5-1.5B-Instruct with LoRA adapters.

Current status

Status: code, data, and demo are ready; GPU training is blocked in the agent environment.

The adapter repos are initialized but do not yet contain trained weights.

Public assets

Architecture

  1. CTRL-style control-token instruction SFT:
    • [ROOT:japanese:40+nordic:60]
    • [THEME:gaming]
    • [SYL:3]
    • [LEN:8]
    • [CREATIVE:0.8]
  2. Morpho-phonetic synthetic corpus using 24 language/root families.
  3. DPO anti-generic phase where chosen names are novel and rejected names are derivative (TechHub, Brandify, GetZone).
  4. Inference-time anti-duplication matrix combining fuzzy similarity and character n-gram overlap against known brands.
  5. Creativity knob decoding: low creativity uses contrastive search; high creativity uses min-p sampling + higher temperature.
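
The anti-duplication check in step 4 can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the equal weighting, the 0.6 threshold, and the use of difflib for the fuzzy score and Jaccard overlap for character trigrams are all assumptions made for illustration, and the brand list simply reuses the derivative examples from step 3.

```python
from difflib import SequenceMatcher

# Hypothetical reference list; a real deployment would load a much larger set.
KNOWN_BRANDS = ["TechHub", "Brandify", "GetZone"]


def char_ngrams(text: str, n: int = 3) -> set:
    """Return the set of character n-grams of the lowercased text."""
    s = text.lower()
    if len(s) < n:
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_overlap(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap between the character n-gram sets of a and b."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


def fuzzy_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] from difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def duplication_score(candidate: str, brands=KNOWN_BRANDS, weight: float = 0.5) -> float:
    """Highest blended similarity between the candidate and any known brand."""
    return max(
        (
            weight * fuzzy_similarity(candidate, b)
            + (1 - weight) * ngram_overlap(candidate, b)
            for b in brands
        ),
        default=0.0,
    )


def filter_candidates(candidates, threshold: float = 0.6, brands=KNOWN_BRANDS):
    """Keep only candidates whose duplication score stays below the threshold."""
    return [c for c in candidates if duplication_score(c, brands) < threshold]
```

Taking the maximum over brands, rather than an average, means a single close match is enough to reject a candidate, which is usually the behavior wanted from a duplicate filter.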

Supported controls

24 linguistic root families: latin, greek, nordic, germanic, celtic, slavic, japanese, korean, mandarin, hindi, sanskrit, arabic, persian, turkish, swahili, yoruba, hawaiian, maori, finnish, hungarian, italian, spanish, portuguese, hebrew.

Themes: tech, gaming, beauty, vlogging, finance, lifestyle, fashion, food, fitness, music, travel, education, health, crypto, kids, luxury, eco, auto.
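
As a rough sketch of how these controls could be validated and rendered into the control tokens listed under Architecture: ControlSpec below is a hypothetical stand-in, not the project's ControlVector class. The space separator between tokens, the rule that blend weights sum to 100, and the 0 to 1 creativity range are assumptions.

```python
from dataclasses import dataclass

ROOT_FAMILIES = {
    "latin", "greek", "nordic", "germanic", "celtic", "slavic", "japanese",
    "korean", "mandarin", "hindi", "sanskrit", "arabic", "persian", "turkish",
    "swahili", "yoruba", "hawaiian", "maori", "finnish", "hungarian",
    "italian", "spanish", "portuguese", "hebrew",
}
THEMES = {
    "tech", "gaming", "beauty", "vlogging", "finance", "lifestyle", "fashion",
    "food", "fitness", "music", "travel", "education", "health", "crypto",
    "kids", "luxury", "eco", "auto",
}


@dataclass
class ControlSpec:
    """Hypothetical container for the controls; not nomen_ai's ControlVector."""
    roots: list
    blend: list
    theme: str
    syllables: int
    char_len: int
    creativity: float

    def validate(self) -> None:
        unknown = [r for r in self.roots if r not in ROOT_FAMILIES]
        if unknown:
            raise ValueError(f"unknown root families: {unknown}")
        if self.theme not in THEMES:
            raise ValueError(f"unknown theme: {self.theme}")
        if len(self.roots) != len(self.blend):
            raise ValueError("roots and blend must have the same length")
        if sum(self.blend) != 100:  # assumed convention, matching 40+60
            raise ValueError("blend weights must sum to 100")
        if not 0.0 <= self.creativity <= 1.0:  # assumed range
            raise ValueError("creativity must be between 0 and 1")

    def to_prefix(self) -> str:
        """Render the control tokens, joined by spaces (assumed separator)."""
        self.validate()
        root_part = "+".join(f"{r}:{w}" for r, w in zip(self.roots, self.blend))
        return " ".join([
            f"[ROOT:{root_part}]",
            f"[THEME:{self.theme}]",
            f"[SYL:{self.syllables}]",
            f"[LEN:{self.char_len}]",
            f"[CREATIVE:{self.creativity:g}]",
        ])
```

Validating before rendering keeps unsupported roots or themes out of the prompt, where the model would have no trained token to condition on.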

Train on Colab T4

git clone https://huggingface.co/krystv/nomen-ai
cd nomen-ai
pip install -q -r requirements.txt
huggingface-cli login
bash scripts/train_all_colab.sh

Or with Make:

make install
make all

Quick inference after training

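from nomen_ai.control import ControlVector
from nomen_ai.inference import NomenAI
# Load the DPO-tuned LoRA adapter on top of the base model.
engine = NomenAI("krystv/nomen-ai-dpo-lora", base_model="Qwen/Qwen2.5-1.5B-Instruct")
# Blend 40% Japanese and 60% Nordic roots; gaming theme, 3 syllables, 8 characters.
cv = ControlVector(roots=["japanese", "nordic"], blend=[40, 60], theme="gaming", syllables=3, char_len=8, creativity=0.8)
# Generate 10 candidate names.
print(engine.generate(cv, n=10))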
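
How the creativity value might translate into decoding settings can be sketched as follows. This is an illustrative mapping, not the project's code: the function name, the 0.5 switch point, and all numeric values are assumptions. The keys are keyword arguments of generate in Hugging Face transformers (penalty_alpha with top_k selects contrastive search; min_p with do_sample=True selects min-p sampling), though their availability depends on the installed version.

```python
def decoding_kwargs(creativity: float, threshold: float = 0.5) -> dict:
    """Map a creativity value in [0, 1] to generation keyword arguments.

    Hypothetical helper: the threshold and numeric values are illustrative.
    """
    if not 0.0 <= creativity <= 1.0:
        raise ValueError("creativity must be between 0 and 1")

    if creativity < threshold:
        # Low creativity: deterministic contrastive search.
        return {"do_sample": False, "penalty_alpha": 0.6, "top_k": 4}

    # High creativity: min-p sampling, with temperature rising with the knob.
    return {
        "do_sample": True,
        "min_p": 0.05,
        "temperature": 0.7 + 0.6 * creativity,
    }
```

The returned dictionary could then be unpacked into a call such as model.generate(**inputs, **decoding_kwargs(0.8)), assuming the installed transformers release accepts those arguments.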

Research basis
