GemmAnima Prototype Adapter Bundle
Korean model card | GitHub app source
Prototype Notice
This is a v0.1 prototype adapter bundle for the standalone GemmAnima app. It is published so the current local pipeline can be reproduced and tested. It is not a safety-rated assistant, not a production image model, and not a commercial-ready release.
The repository intentionally contains GemmAnima adapters, projectors, bridge checkpoints, metadata, and docs only. Upstream base weights must be downloaded from their original model pages.
Quick Summary
| Area | Status |
|---|---|
| Release type | Public prototype adapter/checkpoint bundle |
| Base weights | Not mirrored here |
| Runtime shape | One shared Gemma GGUF plus task LoRA/adapters, Anima base weights, and bridge profiles |
| Evaluation | Local smoke tests and bridge checks only |
| Safety | Not safety-rated |
| License posture | Adapter bundle notice plus upstream base-model restrictions |
GemmAnima is a local multimodal adapter bundle that connects a Gemma/TIPO-style language and vision-tagging core to an Anima image-generation core through a HiddenStage bridge.
This repository is intended for GemmAnima-owned adapters, projectors, bridge checkpoints, and runtime metadata for the standalone GemmAnima app. It is not a single monolithic model file, and it should not mirror upstream base model weights.
Status
- Release status: v0.1 public prototype adapter/checkpoint release
- Safety status: not safety-rated
- Evaluation status: partial smoke and bridge checks only
- License status: adapter/checkpoint bundle with upstream dependency licenses; see LICENSE_NOTICES.md
- Visibility: public adapter-only repository; upstream base model weights are not mirrored here
Do not treat this prototype as a promoted production model without additional evaluation. Do not treat this adapter bundle as relicensing any upstream base model.
Consumer Download
This Hugging Face repository is the adapter/checkpoint bundle. The GitHub repository is the app/source-code surface. On a GemmAnima checkout, let the app plan or download model assets:
python -m gemmanima.cli model-download-plan --json
python -m gemmanima.cli ensure-model-assets --json
The app/source README owns local setup, GUI, development, and runtime details. This model card only describes the files published in the adapter bundle.
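For scripted setups, the two commands above can be wrapped from Python. This is a minimal sketch, not part of the app: the helper names `build_cli_command` and `run_cli_json` are hypothetical, and it assumes that `--json` makes the CLI print a single JSON document on stdout, which this card does not specify.

```python
import json
import subprocess
import sys

# The two subcommands documented in this model card.
DOCUMENTED_SUBCOMMANDS = ("model-download-plan", "ensure-model-assets")


def build_cli_command(subcommand: str) -> list[str]:
    """Build the argv for one documented gemmanima.cli subcommand."""
    if subcommand not in DOCUMENTED_SUBCOMMANDS:
        raise ValueError(f"unsupported subcommand: {subcommand}")
    # sys.executable keeps the call inside the current Python environment.
    return [sys.executable, "-m", "gemmanima.cli", subcommand, "--json"]


def run_cli_json(subcommand: str):
    """Run the CLI and parse stdout as JSON.

    Assumption: stdout holds exactly one JSON document when --json is set.
    """
    result = subprocess.run(
        build_cli_command(subcommand),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Calling `run_cli_json("model-download-plan")` from a GemmAnima checkout would return the parsed plan; the structure of that plan is defined by the app, not by this sketch.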
Model Parts
This repository should contain only the files produced or adapted by the GemmAnima project. Original base models should be downloaded from their original distribution pages and placed locally according to the standalone app configuration.
1. Gemma Core
The Gemma Core handles chat, language-harness behavior, canonical English Danbooru tag output, and vision tagger language behavior.
GemmAnima files:
| File | Role |
|---|---|
| text-adapter-model-f16.gguf | Text/chat LoRA adapter |
| vision-tagger-adapter-model-f16.gguf | Vision/tagger LoRA adapter, refreshed from the mixed-pose-front v2 final prototype |
| gemma4-tipo-vision.mmproj-f16.gguf | Vision projector paired with the mixed-pose-front v2 final prototype |
External requirement:
| File | How to obtain |
|---|---|
| gemma-4-E2B-it-heretic-ara-custom.Q4_K_M.gguf | Download from mradermacher/gemma-4-E2B-it-heretic-ara-custom-GGUF; do not mirror it here unless redistribution is explicitly permitted and intentionally chosen. |
Upstream license metadata for the GGUF page currently reports apache-2.0.
The preferred runtime shape is an upstream base GGUF loaded with GemmAnima
task-specific LoRA adapters through llama.cpp --lora. Older fully merged GGUFs
are not the preferred packaging shape.
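The preferred runtime shape can be illustrated by assembling a llama.cpp command line for each task. This is a sketch under stated assumptions: the function name `llama_server_args` and the local directory layout (base GGUF next to a `gemma_core/` folder) are hypothetical, and the standalone app may launch llama.cpp differently. The `-m`, `--lora`, and `--mmproj` flags are existing llama.cpp options.

```python
from pathlib import Path

BASE_GGUF = "gemma-4-E2B-it-heretic-ara-custom.Q4_K_M.gguf"
TEXT_LORA = "text-adapter-model-f16.gguf"
VISION_LORA = "vision-tagger-adapter-model-f16.gguf"
VISION_MMPROJ = "gemma4-tipo-vision.mmproj-f16.gguf"


def llama_server_args(model_dir: Path, task: str) -> list[str]:
    """Return argv for llama-server: one shared base GGUF plus a task LoRA.

    Assumed layout (hypothetical): model_dir/<base gguf> and
    model_dir/gemma_core/<adapter files>.
    """
    core = model_dir / "gemma_core"
    args = ["llama-server", "-m", str(model_dir / BASE_GGUF)]
    if task == "text":
        args += ["--lora", str(core / TEXT_LORA)]
    elif task == "vision":
        # The vision task pairs its LoRA with the matching projector.
        args += ["--lora", str(core / VISION_LORA)]
        args += ["--mmproj", str(core / VISION_MMPROJ)]
    else:
        raise ValueError(f"unknown task: {task}")
    return args
```

The same base file is reused for both tasks; only the adapter (and, for vision, the projector) changes, which is why fully merged GGUFs are not needed.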
2. Anima Image Core
The Anima Image Core handles diffusion sampling and VAE decoding.
GemmAnima files:
This repository does not need to contain Anima Image Core base weights.
External requirements:
| File | How to obtain |
|---|---|
| split_files/diffusion_models/anima-base-v1.0.safetensors | Download from circlestone-labs/Anima |
| split_files/vae/qwen_image_vae.safetensors | Download from circlestone-labs/Anima |
The upstream Anima page currently reports
circlestone-labs-non-commercial-license and states that Anima is also subject
to the NVIDIA Open Model License Agreement where applicable because it is a
derivative of NVIDIA Cosmos-Predict2-2B-Text2Image.
Anima text encoder weights are not part of the required standalone runtime. The
current in-process renderer uses Anima-compatible tokenizer metadata
(t5xxl_ids and t5xxl_weights) for conditioning shape compatibility, not the
Anima text encoder weight file.
Supported generation controls in the standalone app:
| Type | Supported values |
|---|---|
| Sampler | euler, euler_ancestral, dpmpp_2m, dpmpp_2m_sde_gpu |
| Scheduler | normal, karras, sgm_uniform |
| Resolution presets | 1024x1024, 832x1216, 768x1344, custom |
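A curated control set like the one in the table can be enforced with a small validation step before a render request is accepted. The sketch below is illustrative only: `GenerationControls` and `make_controls` are hypothetical names, and the only rule assumed for the `custom` preset is that both dimensions are positive integers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SAMPLERS = ("euler", "euler_ancestral", "dpmpp_2m", "dpmpp_2m_sde_gpu")
SCHEDULERS = ("normal", "karras", "sgm_uniform")
RESOLUTION_PRESETS = {
    "1024x1024": (1024, 1024),
    "832x1216": (832, 1216),
    "768x1344": (768, 1344),
}


@dataclass(frozen=True)
class GenerationControls:
    sampler: str
    scheduler: str
    width: int
    height: int


def make_controls(
    sampler: str,
    scheduler: str,
    resolution: str = "1024x1024",
    custom_size: Optional[Tuple[int, int]] = None,
) -> GenerationControls:
    """Validate a request against the supported values listed above."""
    if sampler not in SAMPLERS:
        raise ValueError(f"unsupported sampler: {sampler}")
    if scheduler not in SCHEDULERS:
        raise ValueError(f"unsupported scheduler: {scheduler}")
    if resolution == "custom":
        # Assumption: a custom size only needs positive integer dimensions.
        if custom_size is None or min(custom_size) <= 0:
            raise ValueError("custom resolution needs a positive (width, height)")
        width, height = custom_size
    elif resolution in RESOLUTION_PRESETS:
        width, height = RESOLUTION_PRESETS[resolution]
    else:
        raise ValueError(f"unsupported resolution preset: {resolution}")
    return GenerationControls(sampler, scheduler, width, height)
```

For example, `make_controls("euler", "karras", "832x1216")` yields a portrait request, while an unlisted sampler name is rejected with `ValueError`.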
3. HiddenStage Bridge
The HiddenStage Bridge connects Gemma hidden-state features to Anima-compatible conditioning.
GemmAnima files:
| File | Role |
|---|---|
| hiddenstage-planner-adapter.safetensors | Planner LoRA adapter |
| hiddenstage-planner-embed-vision.pt | Planner vision embedding |
| kv_proj_hiddenstage_planner_v2.pt | HiddenStage bridge checkpoint |
| kv_proj_text_delta_300k_from_epoch1_a0p35.pt | Prototype default quality bridge profile used for normal image generation and style-tag prompts |
| kv_proj_text_exact_v27_alpha35.pt | Prototype bridge profile for signs, labels, captions, and readable-text prompts |
The standalone app routes bridge profiles automatically:
| Profile | Automatic use |
|---|---|
| balanced_pose | Normal image-generation prompts; routed to kv_proj_text_delta_300k_from_epoch1_a0p35.pt |
| style_artist | Style-oriented tags and surface-token-heavy prompts; routed to kv_proj_text_delta_300k_from_epoch1_a0p35.pt |
| text_exact | Prompts asking for readable text, signs, labels, captions, or logos |
| legacy_mse | Compatibility baseline and explicit override |
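The routing described above can be pictured as a lookup from profile to checkpoint plus a prompt classifier. The sketch below is a simplified stand-in, not the app's router: the keyword sets are hypothetical, the `text_exact` mapping is inferred from the file roles listed earlier, and the `legacy_mse` mapping to `kv_proj_hiddenstage_planner_v2.pt` is an assumption.

```python
import re
from typing import Optional

PROFILE_CHECKPOINTS = {
    "balanced_pose": "kv_proj_text_delta_300k_from_epoch1_a0p35.pt",
    "style_artist": "kv_proj_text_delta_300k_from_epoch1_a0p35.pt",
    # Inferred from the file role table, not stated in the routing table.
    "text_exact": "kv_proj_text_exact_v27_alpha35.pt",
    # Assumption: the legacy baseline is the planner v2 bridge checkpoint.
    "legacy_mse": "kv_proj_hiddenstage_planner_v2.pt",
}

# Hypothetical keyword sets; the real router may use different signals.
TEXT_WORDS = {"text", "sign", "signs", "label", "labels",
              "caption", "captions", "logo", "logos"}
STYLE_WORDS = {"style", "artist"}


def pick_profile(prompt: str, override: Optional[str] = None) -> str:
    """Choose a bridge profile; an explicit override always wins."""
    if override is not None:
        if override not in PROFILE_CHECKPOINTS:
            raise ValueError(f"unknown profile: {override}")
        return override
    # Match whole words so "design" does not trigger on "sign".
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & TEXT_WORDS:
        return "text_exact"
    if words & STYLE_WORDS:
        return "style_artist"
    return "balanced_pose"


def checkpoint_for(profile: str) -> str:
    """Resolve a profile name to its bridge checkpoint filename."""
    return PROFILE_CHECKPOINTS[profile]
```

Note that `balanced_pose` and `style_artist` resolve to the same checkpoint, so in this sketch the distinction between them only matters for bookkeeping.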
Current local bridge metadata:
| Metric | Value |
|---|---|
| Bridge validation MSE | 0.001104317136865575 |
| Bridge gate | passed |
| Planner eval loss | 1.0061092711985111 |
| Planner eval threshold | 1.5 |
These are engineering gate metrics and small local smoke tests, not a full end-user quality evaluation.
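As an illustration of how such a gate reads, the planner numbers above can be compared directly. This is a minimal sketch: the dictionary keys are hypothetical, and it assumes the gate passes when the eval loss is at or below the threshold. The bridge MSE threshold is not published in this card, so only the planner gate is checked.

```python
# Values copied from the metrics table; key names are hypothetical.
METRICS = {
    "bridge_validation_mse": 0.001104317136865575,
    "bridge_gate": "passed",
    "planner_eval_loss": 1.0061092711985111,
    "planner_eval_threshold": 1.5,
}


def planner_gate_passed(metrics: dict) -> bool:
    """Assumed rule: pass when eval loss does not exceed the threshold."""
    return metrics["planner_eval_loss"] <= metrics["planner_eval_threshold"]


def planner_margin(metrics: dict) -> float:
    """Headroom between the threshold and the measured eval loss."""
    return metrics["planner_eval_threshold"] - metrics["planner_eval_loss"]
```

With the published values the planner gate passes with roughly 0.49 of headroom, which says the checkpoint cleared an engineering bar and nothing about end-user image quality.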
Uploaded Files
Checksums and byte sizes for every uploaded file are recorded in
adapter_manifest_v0.1.json. The main uploaded payload is:
| Directory | Files |
|---|---|
| gemma_core/ | Text LoRA, vision/tagger LoRA, vision mmproj |
| hiddenstage_bridge/ | Planner adapter, planner vision embedding, legacy bridge, and three prototype bridge profiles |
| repository root | Hugging Face model card, license notices, model source metadata, adapter manifest, version marker |
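Because the manifest records checksums and byte sizes, a downloaded copy can be verified locally. The sketch below assumes a hypothetical manifest schema, `{"files": [{"path": ..., "bytes": ..., "sha256": ...}]}`; the real layout of adapter_manifest_v0.1.json may differ, so the field names would need adjusting.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoints are not loaded into RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path: Path, root: Path) -> list[str]:
    """Return a list of problems; an empty list means every file matched.

    Assumed (hypothetical) schema: a top-level "files" list whose entries
    carry "path", "bytes", and "sha256".
    """
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    problems = []
    for entry in manifest["files"]:
        target = root / entry["path"]
        if not target.is_file():
            problems.append(f"missing: {entry['path']}")
        elif target.stat().st_size != entry["bytes"]:
            # Cheap size check first; skip hashing when it already fails.
            problems.append(f"size mismatch: {entry['path']}")
        elif sha256_of(target) != entry["sha256"].lower():
            problems.append(f"sha256 mismatch: {entry['path']}")
    return problems
```

Checking the byte size before hashing keeps the common failure case (a truncated download) fast, since multi-hundred-megabyte checkpoints take noticeable time to hash.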
Approximate Size
This adapter repository should be much smaller than the full local runtime because original base weights are expected to come from their original pages:
| Part | Approximate upload size |
|---|---|
| Gemma Core adapters/projector | ~1.06 GB |
| HiddenStage Bridge | ~0.40 GB |
| Anima Image Core base weights | not uploaded here |
| Total uploaded here | ~1.46 GB |
For local runtime planning, the full standalone runtime is about 9 GB in decimal units after the user downloads the external base weights separately:
| Part | Approximate size |
|---|---|
| Gemma Core | ~4.51 GB |
| Anima Image Core | ~4.44 GB |
| HiddenStage Bridge | ~0.40 GB |
| Total | ~9.34 GB |
Exact size depends on final filenames and whether source adapters or compatibility reference models are included.
Intended Use
This bundle is intended for:
- Local GemmAnima app runtime testing
- Korean or English chat with an explicit language harness
- Canonical English Danbooru tag output for tag requests
- Chat-driven image-generation request planning
- Anima image rendering through the app-controlled preset system
For tag requests, output tags should remain canonical English Danbooru tags even when the user-facing chat language is Korean.
Out of Scope
This bundle is not intended as:
- A general-purpose safety-filtered assistant
- A fully evaluated public image-generation model
- A replacement for downloading or licensing upstream base components
- A license override for Gemma, Anima, NVIDIA Cosmos, or any source dataset
- A guarantee of pose, anatomy, text rendering, or prompt fidelity
Known Limitations
- Safety and content behavior have not been fully evaluated.
- Pose understanding remains an active improvement area.
- Some broad ComfyUI samplers were intentionally not exposed because they were not part of the app-supported smoke-tested subset.
- The app currently owns a curated sampler/scheduler contract rather than exposing every option from ComfyUI.
- Bridge profile checkpoints are prototype routing choices and should not be promoted without separate evaluation.
- Base model files are external dependencies and should remain linked to their original distribution pages.
Runtime Notes
Recommended local runtime:
- Windows + PowerShell
- RTX 4070 Ti SUPER as the primary PyTorch/rendering GPU
- llama.cpp CUDA build for Gemma Core inference
- GemmAnima standalone app for orchestration
Keep RTX 5060 out of PyTorch cache/training paths unless explicitly re-enabled with a compatible PyTorch build.
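One common way to keep a GPU out of a PyTorch process is to restrict `CUDA_VISIBLE_DEVICES` before CUDA is initialized. The sketch below is an assumption about how this could be done, not the app's mechanism: the helper names are hypothetical, and the list of GPU names is expected to come from the caller (for example from `nvidia-smi` output, in PCI bus order).

```python
import os
from typing import Sequence


def cuda_visible_devices(
    gpu_names: Sequence[str],
    preferred: str = "RTX 4070 Ti SUPER",
    excluded: Sequence[str] = ("RTX 5060",),
) -> str:
    """Build a CUDA_VISIBLE_DEVICES value from GPU names indexed by position.

    Excluded GPUs are dropped and the preferred GPU is listed first, so it
    becomes cuda:0 inside the process.
    """
    keep = [
        index
        for index, name in enumerate(gpu_names)
        if not any(marker in name for marker in excluded)
    ]
    # Stable sort: entries matching `preferred` (key False) move to the front.
    keep.sort(key=lambda index: preferred not in gpu_names[index])
    return ",".join(str(index) for index in keep)


def restrict_pytorch_gpus(gpu_names: Sequence[str]) -> None:
    """Apply the restriction; call this before importing torch."""
    # PCI_BUS_ID makes CUDA indices follow the same order as nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = cuda_visible_devices(gpu_names)
```

The environment variables only affect the process that sets them, so a separately launched llama.cpp process is not restricted by this call.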
Release Checklist
For this v0.1 public prototype adapter-only release:
- The repository remains adapter/checkpoint-only.
- Upstream base weights are referenced, not mirrored.
- LICENSE_NOTICES.md is included.
- SHA256 checksums are recorded in adapter_manifest_v0.1.json.
Still required before promotion:
- Run and publish a small reproducible inference smoke.
- Run a safety and content-policy review.
- Add representative example outputs only after evaluation.
Example File Layout
.
|-- README.md
|-- LICENSE_NOTICES.md
|-- model_sources.json
|-- adapter_manifest_v0.1.json
|-- gemma_core/
|   |-- text-adapter-model-f16.gguf
|   |-- vision-tagger-adapter-model-f16.gguf
|   `-- gemma4-tipo-vision.mmproj-f16.gguf
`-- hiddenstage_bridge/
    |-- hiddenstage-planner-adapter.safetensors
    |-- hiddenstage-planner-embed-vision.pt
    |-- kv_proj_hiddenstage_planner_v2.pt
    |-- kv_proj_text_delta_300k_from_epoch1_a0p35.pt
    `-- kv_proj_text_exact_v27_alpha35.pt
External base weights should be downloaded separately from their original model pages and referenced by the local GemmAnima app configuration.
The standalone app download plan uses:
python -m gemmanima.cli model-download-plan --json
python -m gemmanima.cli ensure-model-assets --json
Citation and Attribution
This is a composite adapter/checkpoint bundle. It does not relicense upstream base models. Add upstream citations, original download links, and license notices for the base model, image model, VAE, NVIDIA dependency, and any training datasets before a public release.