GemmAnima Prototype Adapter Bundle

Prototype Notice

This is a v0.1 prototype adapter bundle for the standalone GemmAnima app. It is published so the current local pipeline can be reproduced and tested. It is not a safety-rated assistant, not a production image model, and not a commercial-ready release.

The repository intentionally contains GemmAnima adapters, projectors, bridge checkpoints, metadata, and docs only. Upstream base weights must be downloaded from their original model pages.

Quick Summary

Area	Status
Release type	Public prototype adapter/checkpoint bundle
Base weights	Not mirrored here
Runtime shape	One shared Gemma GGUF plus task LoRA/adapters, Anima base weights, and bridge profiles
Evaluation	Local smoke tests and bridge checks only
Safety	Not safety-rated
License posture	Adapter bundle notice plus upstream base-model restrictions

GemmAnima is a local multimodal adapter bundle that connects a Gemma/TIPO-style language and vision-tagging core to an Anima image-generation core through a HiddenStage bridge.

This repository is intended for GemmAnima-owned adapters, projectors, bridge checkpoints, and runtime metadata for the standalone GemmAnima app. It is not a single monolithic model file, and it should not mirror upstream base model weights.

Status

Release status: v0.1 public prototype adapter/checkpoint release
Safety status: not safety-rated
Evaluation status: partial smoke and bridge checks only
License status: adapter/checkpoint bundle with upstream dependency licenses; see LICENSE_NOTICES.md
Visibility: public adapter-only repository; upstream base model weights are not mirrored here

Do not treat this prototype as a promoted production model without additional evaluation. Do not treat this adapter bundle as relicensing any upstream base model.

Consumer Download

This Hugging Face repository is the adapter/checkpoint bundle. The GitHub repository is the app/source-code surface. On a GemmAnima checkout, let the app plan or download model assets:

python -m gemmanima.cli model-download-plan --json
python -m gemmanima.cli ensure-model-assets --json
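For scripted setups, the two commands above could be wrapped as follows. This is a minimal sketch: it assumes only that each subcommand prints a single JSON document to stdout when given `--json`. The helper names `cli_command` and `run_cli` are hypothetical and not part of the app, and no output schema is assumed.

```python
import json
import subprocess
import sys

# The two subcommands named in this model card.
DOCUMENTED_SUBCOMMANDS = {"model-download-plan", "ensure-model-assets"}


def cli_command(subcommand: str) -> list[str]:
    """Build the argv for one documented gemmanima.cli subcommand."""
    if subcommand not in DOCUMENTED_SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand}")
    return [sys.executable, "-m", "gemmanima.cli", subcommand, "--json"]


def run_cli(subcommand: str):
    """Run the subcommand and parse stdout as JSON.

    Assumption: stdout holds exactly one JSON document. The structure of
    that document is not described here, so it is returned unmodified.
    """
    result = subprocess.run(
        cli_command(subcommand),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Calling `run_cli("model-download-plan")` before `run_cli("ensure-model-assets")` lets a script inspect the plan before anything is downloaded.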

The app/source README owns local setup, GUI, development, and runtime details. This model card only describes the files published in the adapter bundle.

Model Parts

This repository should contain only the files produced or adapted by the GemmAnima project. Original base models should be downloaded from their original distribution pages and placed locally according to the standalone app configuration.

1. Gemma Core

The Gemma Core handles chat, language-harness behavior, canonical English Danbooru tag output, and vision tagger language behavior.

GemmAnima files:

File	Role
`text-adapter-model-f16.gguf`	Text/chat LoRA adapter
`vision-tagger-adapter-model-f16.gguf`	Vision/tagger LoRA adapter, refreshed from the mixed-pose-front v2 final prototype
`gemma4-tipo-vision.mmproj-f16.gguf`	Vision projector paired with the mixed-pose-front v2 final prototype

External requirement:

File	How to obtain
`gemma-4-E2B-it-heretic-ara-custom.Q4_K_M.gguf`	Download from `mradermacher/gemma-4-E2B-it-heretic-ara-custom-GGUF`; do not mirror it here unless redistribution is explicitly permitted and intentionally chosen.

Upstream license metadata for the GGUF page currently reports apache-2.0.

The preferred runtime shape is the upstream base GGUF loaded together with GemmAnima task-specific LoRA adapters through the llama.cpp `--lora` option. Older fully merged GGUFs are not the preferred packaging shape.
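The runtime shape described above can be sketched as a llama.cpp command line. This is an illustration only: the `llama-server` binary name, the local paths, and the `llama_server_args` helper are assumptions, and how the standalone app actually launches llama.cpp is not described in this card. The `-m`, `--lora`, and `--mmproj` options are standard llama.cpp flags.

```python
def llama_server_args(base_gguf, lora_adapter, mmproj=None, binary="llama-server"):
    """Build an argv that loads one base GGUF plus one task LoRA adapter.

    The same base GGUF is reused for every task; only the adapter (and,
    for vision tagging, the projector) changes between invocations.
    """
    args = [binary, "-m", str(base_gguf), "--lora", str(lora_adapter)]
    if mmproj is not None:
        # The vision projector is only needed for the vision/tagger task.
        args += ["--mmproj", str(mmproj)]
    return args


# Hypothetical local layout; adjust to the app configuration in use.
BASE = "models/gemma-4-E2B-it-heretic-ara-custom.Q4_K_M.gguf"

text_args = llama_server_args(BASE, "gemma_core/text-adapter-model-f16.gguf")
vision_args = llama_server_args(
    BASE,
    "gemma_core/vision-tagger-adapter-model-f16.gguf",
    mmproj="gemma_core/gemma4-tipo-vision.mmproj-f16.gguf",
)
```

Keeping the adapters separate from the base file is what allows a single downloaded GGUF to serve both the text and the vision/tagger tasks.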

2. Anima Image Core

The Anima Image Core handles diffusion sampling and VAE decoding.

GemmAnima files:

None. Anima Image Core base weights are not uploaded to this repository.

External requirements:

File	How to obtain
`split_files/diffusion_models/anima-base-v1.0.safetensors`	Download from `circlestone-labs/Anima`
`split_files/vae/qwen_image_vae.safetensors`	Download from `circlestone-labs/Anima`

The upstream Anima page currently reports circlestone-labs-non-commercial-license and states that Anima is also subject to the NVIDIA Open Model License Agreement where applicable because it is a derivative of NVIDIA Cosmos-Predict2-2B-Text2Image.

Anima text encoder weights are not part of the required standalone runtime. The current in-process renderer uses Anima-compatible tokenizer metadata (t5xxl_ids and t5xxl_weights) for conditioning shape compatibility, not the Anima text encoder weight file.

Supported generation controls in the standalone app:

Type	Supported values
Sampler	`euler`, `euler_ancestral`, `dpmpp_2m`, `dpmpp_2m_sde_gpu`
Scheduler	`normal`, `karras`, `sgm_uniform`
Resolution presets	`1024x1024`, `832x1216`, `768x1344`, custom
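A request can be checked against the supported values listed above before it reaches the renderer. The sketch below is hypothetical: `validate_controls` is not an app function, and the rule that a custom resolution is any positive `WIDTHxHEIGHT` pair is an assumption, since the app's real limits on custom sizes are not documented here.

```python
SAMPLERS = {"euler", "euler_ancestral", "dpmpp_2m", "dpmpp_2m_sde_gpu"}
SCHEDULERS = {"normal", "karras", "sgm_uniform"}
RESOLUTION_PRESETS = {"1024x1024", "832x1216", "768x1344"}


def parse_resolution(value: str) -> tuple[int, int]:
    """Parse 'WIDTHxHEIGHT' into a pair of positive integers."""
    width_text, sep, height_text = value.lower().partition("x")
    if not sep or not width_text.isdigit() or not height_text.isdigit():
        raise ValueError(f"resolution must look like 1024x1024, got {value!r}")
    width, height = int(width_text), int(height_text)
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return width, height


def validate_controls(sampler: str, scheduler: str, resolution: str) -> dict:
    """Reject samplers and schedulers outside the supported subset."""
    if sampler not in SAMPLERS:
        raise ValueError(f"unsupported sampler: {sampler}")
    if scheduler not in SCHEDULERS:
        raise ValueError(f"unsupported scheduler: {scheduler}")
    width, height = parse_resolution(resolution)
    return {
        "sampler": sampler,
        "scheduler": scheduler,
        "width": width,
        "height": height,
        # True when the size is one of the listed presets, False for custom.
        "preset": resolution.lower() in RESOLUTION_PRESETS,
    }
```

For example, `validate_controls("euler", "karras", "832x1216")` is accepted as a preset, while a sampler name outside the supported subset raises `ValueError`.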

3. HiddenStage Bridge

The HiddenStage Bridge connects Gemma hidden-state features to Anima-compatible conditioning.

GemmAnima files:

File	Role
`hiddenstage-planner-adapter.safetensors`	Planner LoRA adapter
`hiddenstage-planner-embed-vision.pt`	Planner vision embedding
`kv_proj_hiddenstage_planner_v2.pt`	HiddenStage bridge checkpoint
`kv_proj_text_delta_300k_from_epoch1_a0p35.pt`	Prototype default quality bridge profile used for normal image generation and style-tag prompts
`kv_proj_text_exact_v27_alpha35.pt`	Prototype bridge profile for signs, labels, captions, and readable-text prompts

The standalone app routes bridge profiles automatically:

Profile	Automatic use
`balanced_pose`	Normal image-generation prompts; routed to `kv_proj_text_delta_300k_from_epoch1_a0p35.pt`
`style_artist`	Style-oriented tags and surface-token-heavy prompts; routed to `kv_proj_text_delta_300k_from_epoch1_a0p35.pt`
`text_exact`	Prompts asking for readable text, signs, labels, captions, or logos
`legacy_mse`	Compatibility baseline and explicit override
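The routing table above might be approximated as follows. This is a hypothetical sketch, not the app's router: the keyword list and the `pick_profile` helper are invented for illustration, and the mapping of `text_exact` to `kv_proj_text_exact_v27_alpha35.pt` is inferred from the file role table rather than stated in the routing table. No checkpoint is given for `legacy_mse`, so none is assumed.

```python
import re

PROFILE_CHECKPOINTS = {
    # Stated in the routing table.
    "balanced_pose": "kv_proj_text_delta_300k_from_epoch1_a0p35.pt",
    "style_artist": "kv_proj_text_delta_300k_from_epoch1_a0p35.pt",
    # Assumption: inferred from the file role table, not from the routing table.
    "text_exact": "kv_proj_text_exact_v27_alpha35.pt",
}

# Hypothetical trigger words for readable-text prompts.
TEXT_HINTS = {
    "text", "sign", "signs", "label", "labels",
    "caption", "captions", "logo", "logos",
}


def pick_profile(prompt: str, override=None) -> str:
    """Choose a bridge profile name for a prompt.

    An explicit override always wins, which is how `legacy_mse` would be
    selected. `style_artist` is not detected separately here because it
    routes to the same checkpoint as `balanced_pose`.
    """
    if override is not None:
        return override
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & TEXT_HINTS:
        return "text_exact"
    return "balanced_pose"


def checkpoint_for(profile: str):
    """Return the checkpoint filename for a profile, or None if unknown."""
    return PROFILE_CHECKPOINTS.get(profile)
```

Matching on whole words rather than substrings keeps a prompt such as "textured wall" from being routed to `text_exact`.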

Current local bridge metadata:

Metric	Value
Bridge validation MSE	`0.001104317136865575`
Bridge gate	passed
Planner eval loss	`1.0061092711985111`
Planner eval threshold	`1.5`

These are engineering gate metrics and small local smoke tests, not a full end-user quality evaluation.
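The planner gate amounts to comparing the eval loss with its threshold. The sketch below uses the two planner values from the table; the `planner_gate_passes` helper and the choice of `<=` over `<` are assumptions. The bridge MSE is left out because its threshold is not listed.

```python
PLANNER_EVAL_LOSS = 1.0061092711985111
PLANNER_EVAL_THRESHOLD = 1.5


def planner_gate_passes(loss: float, threshold: float) -> bool:
    """Return True when the eval loss is at or below the threshold.

    Assumption: the gate is inclusive. The source only lists the values.
    """
    return loss <= threshold


passed = planner_gate_passes(PLANNER_EVAL_LOSS, PLANNER_EVAL_THRESHOLD)
# Headroom left before the gate would fail, roughly 0.49.
margin = PLANNER_EVAL_THRESHOLD - PLANNER_EVAL_LOSS
```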

Uploaded Files

Checksums and byte sizes for every uploaded file are recorded in adapter_manifest_v0.1.json. The main uploaded payload is:

Directory	Files
`gemma_core/`	Text LoRA, vision/tagger LoRA, vision mmproj
`hiddenstage_bridge/`	Planner adapter, planner vision embedding, legacy bridge, and two prototype bridge profiles
repository root	Hugging Face model card, license notices, model source metadata, adapter manifest, version marker
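Downloaded files can be checked against the manifest with a short script. This is a sketch under a stated assumption: the real schema of adapter_manifest_v0.1.json is not shown in this card, so the layout used here (a top-level `files` list whose entries carry `path`, `bytes`, and `sha256`) is hypothetical and would need to be adapted to the actual file.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoints are not read into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(manifest_path, root) -> list[str]:
    """Return manifest paths that are missing or differ in size or hash.

    Assumed (hypothetical) manifest layout:
    {"files": [{"path": "...", "bytes": 123, "sha256": "..."}]}
    """
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    mismatched = []
    for entry in manifest["files"]:
        file_path = Path(root) / entry["path"]
        if (
            not file_path.is_file()
            or file_path.stat().st_size != entry["bytes"]
            or sha256_of(file_path) != entry["sha256"]
        ):
            mismatched.append(entry["path"])
    return mismatched
```

An empty list from `find_mismatches` means every listed file is present and matches both its recorded byte size and its checksum.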

Approximate Size

This adapter repository should be much smaller than the full local runtime because original base weights are expected to come from their original pages:

Part	Approximate upload size
Gemma Core adapters/projector	~1.06 GB
HiddenStage Bridge	~0.40 GB
Anima Image Core base weights	not uploaded here
Total uploaded here	~1.46 GB

For local runtime planning, the full standalone runtime is about 9.3 GB (decimal units) once the user has separately downloaded the external base weights:

Part	Approximate size
Gemma Core	~4.51 GB
Anima Image Core	~4.44 GB
HiddenStage Bridge	~0.40 GB
Total	~9.34 GB

Exact size depends on final filenames and whether source adapters or compatibility reference models are included.

Intended Use

This bundle is intended for:

Local GemmAnima app runtime testing
Korean or English chat with an explicit language harness
Canonical English Danbooru tag output for tag requests
Chat-driven image-generation request planning
Anima image rendering through the app-controlled preset system

For tag requests, output tags should remain canonical English Danbooru tags even when the user-facing chat language is Korean.

Out of Scope

This bundle is not intended as:

A general-purpose safety-filtered assistant
A fully evaluated public image-generation model
A replacement for downloading or licensing upstream base components
A license override for Gemma, Anima, NVIDIA Cosmos, or any source dataset
A guarantee of pose, anatomy, text rendering, or prompt fidelity

Known Limitations

Safety and content behavior have not been fully evaluated.
Pose understanding remains an active improvement area.
Some ComfyUI samplers are intentionally not exposed because they are outside the smoke-tested subset that the app supports.
The app currently owns a curated sampler/scheduler contract rather than exposing every option from ComfyUI.
Bridge profile checkpoints are prototype routing choices and should not be promoted without separate evaluation.
Base model files are external dependencies and should remain linked to their original distribution pages.

Runtime Notes

Recommended local runtime:

Windows + PowerShell
RTX 4070 Ti SUPER as the primary PyTorch/rendering GPU
llama.cpp CUDA build for Gemma Core inference
GemmAnima standalone app for orchestration

Keep the RTX 5060 out of PyTorch cache and training paths unless it is explicitly re-enabled with a compatible PyTorch build.

Release Checklist

For this v0.1 public prototype adapter-only release:

The repository remains adapter/checkpoint-only.
Upstream base weights are referenced, not mirrored.
LICENSE_NOTICES.md is included.
SHA256 checksums are recorded in adapter_manifest_v0.1.json.

Still required before promotion:

Run and publish a small, reproducible inference smoke test.
Run a safety and content-policy review.
Add representative example outputs only after evaluation.

Example File Layout

.
|-- README.md
|-- LICENSE_NOTICES.md
|-- model_sources.json
|-- adapter_manifest_v0.1.json
|-- gemma_core/
|   |-- text-adapter-model-f16.gguf
|   |-- vision-tagger-adapter-model-f16.gguf
|   `-- gemma4-tipo-vision.mmproj-f16.gguf
`-- hiddenstage_bridge/
    |-- hiddenstage-planner-adapter.safetensors
    |-- hiddenstage-planner-embed-vision.pt
    |-- kv_proj_hiddenstage_planner_v2.pt
    |-- kv_proj_text_delta_300k_from_epoch1_a0p35.pt
    `-- kv_proj_text_exact_v27_alpha35.pt

External base weights should be downloaded separately from their original model pages and referenced by the local GemmAnima app configuration.

The standalone app download plan uses:

python -m gemmanima.cli model-download-plan --json
python -m gemmanima.cli ensure-model-assets --json
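Where the app's download plan is not used, the external base files named in this card could be fetched manually with the `huggingface_hub` library. This is a minimal sketch: the `download_external_assets` helper and the target directory are assumptions, the directory layout the app expects is defined by its own configuration, and the upstream license terms still apply to every downloaded file.

```python
# Repository IDs and filenames as listed in the Model Parts section.
EXTERNAL_ASSETS = [
    (
        "mradermacher/gemma-4-E2B-it-heretic-ara-custom-GGUF",
        "gemma-4-E2B-it-heretic-ara-custom.Q4_K_M.gguf",
    ),
    (
        "circlestone-labs/Anima",
        "split_files/diffusion_models/anima-base-v1.0.safetensors",
    ),
    (
        "circlestone-labs/Anima",
        "split_files/vae/qwen_image_vae.safetensors",
    ),
]


def download_external_assets(target_dir: str) -> list[str]:
    """Download each external base file and return the local paths.

    Requires the third-party `huggingface_hub` package. The import is
    kept inside the function so the asset list can be inspected without it.
    """
    from huggingface_hub import hf_hub_download

    local_paths = []
    for repo_id, filename in EXTERNAL_ASSETS:
        local_paths.append(
            hf_hub_download(repo_id=repo_id, filename=filename, local_dir=target_dir)
        )
    return local_paths
```

Calling `download_external_assets("models")` would place the files under a hypothetical `models` directory, preserving the upstream `split_files/...` subpaths.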

Citation and Attribution

This is a composite adapter/checkpoint bundle. It does not relicense upstream base models. Add upstream citations, original download links, and license notices for the base model, image model, VAE, NVIDIA dependency, and any training datasets before a public release.
