---
license: apache-2.0
base_model: openbmb/MiniCPM-V-4.6
datasets:
- ASHu2/rune_goblin_visual_dataset
language:
- en
library_name: transformers
pipeline_tag: image-text-to-text
tags:
- minicpmv4_6
- minicpm-v
- vision-language
- multimodal
- image-to-text
- safetensors
- gguf
- lora
- rune-goblin
- runelang
- gradio
- game-ai
- spell-recognition
---

# GoblinV1

**GoblinV1** is a fine-tuned vision-language model for **Rune Goblin**, an AI dungeon crawler where players draw spell glyphs and the model interprets those drawings as magic.

The model is fine-tuned from **OpenBMB MiniCPM-V-4.6** to read hand-drawn **RuneLang** glyphs and produce structured JSON describing detected runes, ambiguity, spell metadata, and visual presentation hints.

It is not a generic chatbot.

It is a tiny cursed spell reader.

> Draw better runes, get stronger magic.
> Draw cursed doodles, suffer beautifully.

---

## Model Summary

| Field | Value |
|---|---|
| Model name | `ASHu2/goblinV1` |
| Base model | `openbmb/MiniCPM-V-4.6` |
| Model type | Vision-language / image-to-text |
| Architecture | MiniCPM-V 4.6 / `minicpmv4_6` |
| Fine-tuning method | LoRA fine-tuning, merged/exported |
| Primary format | Safetensors |
| Quantized format | GGUF |
| Language | English |
| License | Apache-2.0 |
| Dataset | `ASHu2/rune_goblin_visual_dataset` |
| Primary use case | Hand-drawn rune interpretation for Rune Goblin |

---

## What This Model Does

GoblinV1 reads a player-drawn spell image and returns structured spell interpretation metadata.

The model is trained to identify:

- drawn RuneLang glyphs
- ambiguous or messy rune shapes
- confidence of the visual reading
- spell name and spell type
- colors, shape, motion, particles, and sound tags
- presentation hints for game VFX
- weird, funny, cursed spell flavor consistent with Rune Goblin

The model is designed to act as the **rune reader and spell presentation planner**.

The game engine should still own final combat balance, HP changes, boss rules, inventory state, and durable quest state.

---

## Rune Goblin

Rune Goblin is a Gradio-based AI dungeon crawler where the player draws their own spells.

Instead of selecting “fireball” from a fixed menu, the player draws symbolic glyphs. GoblinV1 reads the glyphs and converts them into spell interpretation JSON.

Example RuneLang meanings:

| Rune | Meaning |
|---|---|
| Flame | burn, danger, passion |
| Leaf | healing, growth, poison |
| Bone | fear, decay, skeletons |
| Spiral | time, confusion, loops |
| Eye | reveal, inspect, prophecy |
| Mirror | reflect, copy, reverse |
| Circle | shield, trap, containment |
| Broken Mark | curse modifier |
| Bell | summon, alarm, attention |
| Coin | trade, greed, sacrifice |

Example combinations:

| Combination | Meaning |
|---|---|
| Flame + Circle | burning shield |
| Spiral + Eye | prophecy / foresight |
| Bone + Dots | skeleton swarm |
| Mirror + Jagged | reflect damage |
| Leaf + Bone | healing with decay risk |
| Broken Mark + any rune | stronger effect with cursed side effect |
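
The two tables above can be mirrored as plain lookup data on the game-engine side. The sketch below is one possible encoding, not the game's actual code: the lowercase identifiers follow the example JSON in this card, while `dots` and `jagged` are assumed names for the stroke modifiers in the combination table, and `describe_combo` is a hypothetical helper.

```python
# Hypothetical engine-side lookup data mirroring the RuneLang tables.
# "dots" and "jagged" are assumed identifiers for the stroke modifiers
# that appear in the combination table.
RUNE_MEANINGS = {
    "flame": ("burn", "danger", "passion"),
    "leaf": ("healing", "growth", "poison"),
    "bone": ("fear", "decay", "skeletons"),
    "spiral": ("time", "confusion", "loops"),
    "eye": ("reveal", "inspect", "prophecy"),
    "mirror": ("reflect", "copy", "reverse"),
    "circle": ("shield", "trap", "containment"),
    "bell": ("summon", "alarm", "attention"),
    "coin": ("trade", "greed", "sacrifice"),
}

COMBO_MEANINGS = {
    frozenset({"flame", "circle"}): "burning shield",
    frozenset({"spiral", "eye"}): "prophecy / foresight",
    frozenset({"bone", "dots"}): "skeleton swarm",
    frozenset({"mirror", "jagged"}): "reflect damage",
    frozenset({"leaf", "bone"}): "healing with decay risk",
}

CURSE_MODIFIER = "broken_mark"


def describe_combo(runes):
    """Return a short meaning for a list of rune identifiers."""
    names = [r.lower() for r in runes]
    cursed = CURSE_MODIFIER in names
    core = frozenset(n for n in names if n != CURSE_MODIFIER)

    meaning = COMBO_MEANINGS.get(core)
    if meaning is None:
        # No listed combination: fall back to each rune's first meaning.
        primary = [RUNE_MEANINGS[n][0] for n in sorted(core) if n in RUNE_MEANINGS]
        meaning = " + ".join(primary) if primary else "unknown"

    if cursed:
        meaning += " (stronger, with a cursed side effect)"
    return meaning


if __name__ == "__main__":
    print(describe_combo(["flame", "circle"]))
    print(describe_combo(["spiral", "eye", "broken_mark"]))
```

Using `frozenset` keys makes the lookup independent of drawing order, which is an assumption; a game where order matters would key on tuples instead.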

---

## Expected Input

The model expects an image containing a hand-drawn RuneLang spell.

The image usually contains 1-4 glyphs drawn by the player on a canvas.

The prompt should instruct the model to return JSON only.

Example prompt:

```text
Look at this drawn RuneLang spell. Identify the runes, ambiguity, confidence, and produce spell presentation metadata: name, type, colors, shape, motion, grandeur, particles, and sound tags. Return valid JSON only.
```

---

## Expected Output

GoblinV1 should return JSON in this style:

```json
{
  "visual_reading": {
    "detected_runes": ["flame", "circle"],
    "ambiguous_runes": [],
    "confidence": 0.91,
    "layout": "left_to_right"
  },
  "spell": {
    "spell_name": "Ember Lunchbox Ward",
    "spell_type": "fire_defense",
    "rune_combo": ["flame", "circle"],
    "summary": "A circular flame ward forms around the player.",
    "colors": ["orange", "red", "gold"],
    "shape": "burning circular shield",
    "motion": "slow clockwise rotation with pulsing embers",
    "grandeur": "medium",
    "particles": ["embers", "sparks", "heat shimmer"],
    "sound_tags": ["crackle", "whoosh", "low hum"]
  }
}
```
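
A caller will usually want to check that a response really has this shape before using it. The following is a minimal sketch of such a check using only the standard library. The field list is read off the example above; treating every field as required and limiting `confidence` to the range 0 to 1 are assumptions, and `validate_reading` is a hypothetical helper name.

```python
import json

# Field names and types taken from the example output in this card.
VISUAL_FIELDS = {
    "detected_runes": list,
    "ambiguous_runes": list,
    "confidence": (int, float),
    "layout": str,
}
SPELL_FIELDS = {
    "spell_name": str,
    "spell_type": str,
    "rune_combo": list,
    "summary": str,
    "colors": list,
    "shape": str,
    "motion": str,
    "grandeur": str,
    "particles": list,
    "sound_tags": list,
}


def _check_fields(section, fields, name):
    """Raise ValueError if a section is missing a field or has a wrong type."""
    if not isinstance(section, dict):
        raise ValueError(f"'{name}' must be a JSON object")
    for key, expected_type in fields.items():
        if key not in section:
            raise ValueError(f"'{name}.{key}' is missing")
        value = section[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"'{name}.{key}' has the wrong type")


def validate_reading(raw_text):
    """Parse model output and return the reading, or raise ValueError."""
    reading = json.loads(raw_text)  # json.JSONDecodeError subclasses ValueError
    if not isinstance(reading, dict):
        raise ValueError("top-level value must be a JSON object")
    _check_fields(reading.get("visual_reading"), VISUAL_FIELDS, "visual_reading")
    _check_fields(reading.get("spell"), SPELL_FIELDS, "spell")
    confidence = reading["visual_reading"]["confidence"]
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("'visual_reading.confidence' must be between 0 and 1")
    return reading
```

Because every failure surfaces as `ValueError`, the caller needs only one `except` clause to decide whether to retry or discard the cast.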

---

## Important Design Boundary

GoblinV1 is intentionally **not** responsible for final game-state authority.

The model may suggest spell interpretation and presentation metadata, but the game engine should decide:

- final HP changes
- damage numbers
- cooldowns
- boss immunity rules
- quest state
- inventory changes
- progression unlocks
- anti-cheat and validation logic

Recommended architecture:

```text
Player drawing
↓
GoblinV1 vision model
↓
Rune + spell JSON
↓
Game engine validates / clamps / balances
↓
Final spell effect is applied
```

This keeps Rune Goblin fun and expressive while preventing model hallucinations from breaking combat balance.
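
To make the "validates / clamps / balances" step concrete, here is a small sketch in which the engine derives damage from its own table and only uses the model's `grandeur` and `confidence` as inputs. All numbers, the `GRANDEUR_POWER` table, `MAX_DAMAGE`, and `resolve_damage` are hypothetical illustrations, not Rune Goblin's real balance rules.

```python
# Hypothetical balance table owned by the game engine, not by the model.
GRANDEUR_POWER = {"low": 5, "medium": 10, "high": 18}
MAX_DAMAGE = 25  # hard cap applied no matter what the model returns


def resolve_damage(spell, confidence):
    """Compute final damage deterministically from model-suggested metadata."""
    grandeur = spell.get("grandeur")
    if not isinstance(grandeur, str) or grandeur not in GRANDEUR_POWER:
        # Unknown or malformed grandeur falls back to the weakest tier.
        grandeur = "low"
    base = GRANDEUR_POWER[grandeur]

    # Clamp confidence into [0, 1] so an out-of-range value cannot inflate damage.
    confidence = min(max(float(confidence), 0.0), 1.0)

    # Clearer drawings hit harder: the multiplier ranges from 0.5 to 1.5.
    damage = int(base * (0.5 + confidence))
    return min(damage, MAX_DAMAGE)


if __name__ == "__main__":
    print(resolve_damage({"grandeur": "medium"}, 0.91))
```

The model never supplies a damage number in this design, so a hallucinated field such as `"damage": 9999` is simply ignored.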

---

## Training Data

GoblinV1 was fine-tuned on `ASHu2/rune_goblin_visual_dataset`, a custom visual instruction dataset for Rune Goblin.

The dataset contains hand-drawn or synthetic RuneLang spell images paired with structured chat-style targets.

Each example teaches the model to:

- read visual glyphs
- map glyphs to RuneLang symbols
- handle messy or ambiguous drawings
- follow RuneLang combination rules
- return valid JSON
- preserve the game’s cursed-comedic tone

The dataset uses image paths and conversational messages containing a system prompt, user prompt, and assistant JSON target.
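
As an illustration of that layout, the sketch below builds one record with an image path and three chat messages and serializes it as a JSON Lines row. The key names `image` and `messages`, the file path, and the prompt strings are assumptions for illustration; check the dataset card for the real column names.

```python
import json


def build_record(image_path, system_prompt, user_prompt, target):
    """Assemble one chat-style training example (hypothetical field names)."""
    return {
        "image": image_path,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
            # The assistant target is the JSON the model should emit, as a string.
            {"role": "assistant", "content": json.dumps(target, ensure_ascii=False)},
        ],
    }


def to_jsonl_line(record):
    """Serialize a record as a single JSON Lines row."""
    return json.dumps(record, ensure_ascii=False)


if __name__ == "__main__":
    record = build_record(
        "images/example_0001.png",  # hypothetical path
        "You are the Rune Goblin spell reader. Return valid JSON only.",
        "Look at this drawn RuneLang spell.",
        {"visual_reading": {"detected_runes": ["flame", "circle"]}},
    )
    print(to_jsonl_line(record))
```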

---

## Training Objective

The objective is not general image captioning.

The model is trained for:

```text
drawn rune image + instruction → structured spell interpretation JSON
```

The model learns the custom RuneLang vocabulary and the relationship between visual glyphs and spell presentation.

---

## Model Formats

This repository includes multiple usable formats.

### Safetensors

Use this for normal Transformers-based inference or deployment.

### GGUF

A quantized GGUF version is included for local inference experiments with llama.cpp-compatible runtimes.

Example GGUF file:

```text
gguf/rune-goblin-v46-Q4_K_M.gguf
```

If using the GGUF model for vision tasks, make sure your runtime supports MiniCPM-V style multimodal inference and loads the required vision/projector files when needed.

---

## Basic Transformers Usage

Install dependencies:

```bash
pip install torch torchvision pillow transformers accelerate
```

Example usage:

```python
from PIL import Image
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "ASHu2/goblinV1"

# Load the processor (tokenizer + image preprocessing) and the model weights.
processor = AutoProcessor.from_pretrained(
    model_id,
    trust_remote_code=True
)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Load the player's drawing as an RGB image.
image = Image.open("example_rune.png").convert("RGB")

# A single user turn containing the image and the JSON-only instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {
                "type": "text",
                "text": (
                    "Look at this drawn RuneLang spell. "
                    "Identify the runes, ambiguity, confidence, and produce spell presentation metadata. "
                    "Return valid JSON only."
                )
            }
        ]
    }
]

# Apply the chat template and move the tensors to the model's device.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

# Greedy decoding keeps the JSON output deterministic.
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False
    )

# Drop the prompt tokens and decode only the newly generated part.
generated = output_ids[0][inputs["input_ids"].shape[-1]:]
text = processor.decode(generated, skip_special_tokens=True)
print(text)
```

---

## llama.cpp / GGUF Usage

If using a compatible llama.cpp build:

```bash
llama-server -hf ASHu2/goblinV1:Q4_K_M
```

Or run directly:

```bash
llama-cli -hf ASHu2/goblinV1:Q4_K_M
```

For multimodal inference, ensure your client/runtime supports MiniCPM-V and any required vision projector configuration.
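
Once `llama-server` is running, a client can send a drawing through its OpenAI-compatible chat completions endpoint. The sketch below uses only the standard library and makes several assumptions: the server listens on `http://localhost:8080`, it was started with working multimodal support for this model, and it accepts images as base64 data URIs in an `image_url` content part. `build_payload` and `read_rune` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes, prompt, model="ASHu2/goblinV1:Q4_K_M"):
    """Build an OpenAI-style chat request with one inline PNG image."""
    data_uri = "data:image/png;base64," + base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "temperature": 0,  # deterministic output suits JSON generation
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": data_uri}},
                ],
            }
        ],
    }


def read_rune(image_path, prompt, base_url="http://localhost:8080/v1"):
    """POST the drawing to the local server and return the raw reply text."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), prompt)
    request = urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `read_rune("example_rune.png", "Look at this drawn RuneLang spell. Return valid JSON only.")` returns the model's text, which still needs to be parsed and validated by the caller.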

---

## Intended Use

GoblinV1 is intended for:

- Rune Goblin gameplay
- hand-drawn spell glyph interpretation
- visual rune recognition
- experimental AI game mechanics
- structured JSON generation from fantasy glyph images
- small-model multimodal game prototypes

---

## Out-of-Scope Use

GoblinV1 is not intended for:

- safety-critical image understanding
- medical, legal, financial, or security decisions
- general OCR benchmarking
- real-world symbol recognition systems
- moderation or surveillance
- authoritative factual QA
- replacing deterministic game rules

---

## Limitations

GoblinV1 may:

- misread very messy drawings
- confuse visually similar glyphs
- produce malformed JSON in some cases
- invent spell details outside the intended schema
- overfit to Rune Goblin-style symbols
- perform poorly on non-RuneLang images
- require game-engine validation before applying effects

Recommended production safeguards:

- validate output JSON
- retry once on invalid output
- clamp all numeric values in the game engine
- reject unknown runes
- keep final state transitions deterministic
- log ambiguous readings for future dataset improvement
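
Two of these safeguards, retrying once and rejecting unknown runes, can be combined in a thin wrapper around whatever inference call the game uses. This is a sketch under assumptions: `infer` is any callable that takes an image and returns the model's raw text, `KNOWN_RUNES` uses the lowercase identifiers seen in this card's examples, and returning `None` for a failed cast is an illustrative convention.

```python
import json

# Rune identifiers assumed from the RuneLang table and example outputs.
KNOWN_RUNES = {
    "flame", "leaf", "bone", "spiral", "eye",
    "mirror", "circle", "broken_mark", "bell", "coin",
}


def parse_reading(raw_text):
    """Extract the JSON object from raw model text and reject unknown runes."""
    # Tolerate stray text or code fences around the JSON object.
    start, end = raw_text.find("{"), raw_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    reading = json.loads(raw_text[start:end + 1])
    runes = reading["visual_reading"]["detected_runes"]
    unknown = [r for r in runes if r not in KNOWN_RUNES]
    if unknown:
        raise ValueError(f"unknown runes: {unknown}")
    return reading


def read_with_retry(infer, image, max_attempts=2):
    """Call the model up to max_attempts times; return None if every try fails."""
    for _ in range(max_attempts):
        raw_text = infer(image)
        try:
            return parse_reading(raw_text)
        except (ValueError, KeyError, TypeError):
            continue  # malformed or out-of-vocabulary output: try again
    return None  # the caller can treat this as a fizzled spell
```

With `max_attempts=2` the wrapper matches the "retry once" recommendation: one initial call plus a single retry.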

---

## Suggested Evaluation

Recommended metrics:

| Metric | Goal |
|---|---|
| Valid JSON rate | >95% |
| Rune detection accuracy | >85% |
| Ambiguity detection quality | Manual review |
| Schema compliance | >95% |
| Unknown rune rejection | High |
| Latency | Low enough for interactive play in the Gradio UI |
| Cursed-fantasy tone consistency | Manual review |

Suggested test cases:

```text
flame + circle
spiral + eye
bone + dots
mirror + jagged
leaf + bone
broken + coin
messy flame vs leaf
partial / incomplete glyph
empty canvas
```
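
The first two metrics in the table are easy to compute from a batch of raw model outputs and their labels. The sketch below is one way to do it; counting a prediction as correct only when the detected rune set exactly matches the expected set is an assumed definition of "rune detection accuracy", and `evaluate` is a hypothetical helper.

```python
import json


def evaluate(raw_outputs, expected_runes):
    """Return valid-JSON rate and exact-match rune accuracy over a batch."""
    if not raw_outputs or len(raw_outputs) != len(expected_runes):
        raise ValueError("inputs must be non-empty and the same length")

    valid = 0
    correct = 0
    for raw_text, expected in zip(raw_outputs, expected_runes):
        try:
            reading = json.loads(raw_text)
        except ValueError:
            continue  # not valid JSON: counts against both metrics
        valid += 1
        try:
            predicted = frozenset(reading["visual_reading"]["detected_runes"])
        except (KeyError, TypeError):
            continue  # valid JSON but wrong shape: counts against accuracy
        if predicted == frozenset(expected):
            correct += 1

    total = len(raw_outputs)
    return {
        "valid_json_rate": valid / total,
        "rune_accuracy": correct / total,
    }
```

Comparing sets ignores rune order, so a drawing read as `["circle", "flame"]` still matches a label of `["flame", "circle"]`.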

---

## Example Output

```json
{
  "visual_reading": {
    "detected_runes": ["spiral", "eye", "broken_mark"],
    "ambiguous_runes": [
      {
        "candidates": ["spiral", "wave"],
        "reason": "curved repeated stroke could indicate either looping time or water/emotion"
      }
    ],
    "confidence": 0.82,
    "layout": "clustered"
  },
  "spell": {
    "spell_name": "Cursed Foresight Loop",
    "spell_type": "prophecy_curse",
    "rune_combo": ["spiral", "eye", "broken_mark"],
    "summary": "The spell reveals a possible future, then immediately makes it worse.",
    "colors": ["violet", "black", "pale blue"],
    "shape": "floating eye inside a cracked spiral",
    "motion": "spiral contracts inward while the eye flickers",
    "grandeur": "high",
    "particles": ["purple sparks", "black motes", "thin time-rings"],
    "sound_tags": ["whisper", "glass-crack", "reverse-chime"]
  }
}
```

---

## Deployment Notes

GoblinV1 can be used as the model backend for a Gradio game.

Recommended serving layout:

```text
Gradio UI / Canvas
↓
Image preprocessing
↓
GoblinV1 inference
↓
JSON parsing + validation
↓
Rune Goblin game engine
↓
Updated battle state + animation
```

For GPU deployment, use the Safetensors model with Transformers.

For lightweight/local experiments, use the GGUF export with a compatible llama.cpp runtime.

---

## Related Project

GoblinV1 powers the Rune Goblin dungeon crawler.

Rune Goblin is an AI game where players draw spells, explore maps, fight bosses, trigger cursed outcomes, and unlock stronger effects by drawing clearer runes.

Links:

- Game/demo: https://huggingface.co/spaces/build-small-hackathon/Rune-Goblin
- Model: https://huggingface.co/ASHu2/goblinV1
- Dataset: https://huggingface.co/datasets/ASHu2/rune_goblin_visual_dataset

---

## Citation

If you use this model, please cite:

```bibtex
@misc{goblinv1_2026,
  title = {GoblinV1: A Fine-Tuned MiniCPM-V Rune Reader for Rune Goblin},
  author = {Ashutosh Mishra},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {https://huggingface.co/ASHu2/goblinV1}
}
```

Base model:

```bibtex
@misc{minicpmv46_2026,
  title = {MiniCPM-V 4.6},
  author = {OpenBMB},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {https://huggingface.co/openbmb/MiniCPM-V-4.6}
}
```

---

## License

This model is released under the Apache-2.0 license.

Please also follow the license and usage terms of the base model `openbmb/MiniCPM-V-4.6`.

---

## Acknowledgements

GoblinV1 is built on top of OpenBMB MiniCPM-V-4.6.

Thanks to the open-source multimodal model community, Hugging Face, Gradio, Modal, llama.cpp, and the cursed little goblin inside every ambiguous doodle.