Why does it print \rightarrow?
It always prints "$\rightarrow$" in my Claude Code. I'm using vLLM v0.19.1rc1.dev64+g62095e82c on 2x A100 80G.
Hi @wangtf-Kevin -
If you have a specific question, feedback, or issue related to this model, please feel free to share more details. We'd be happy to help!
That was a specific question or issue.
He's asking why your model, in its CoT block, goes "(\rightarrow)" to mark logic chains like this:
User is asking question (\rightarrow) google staff ignores it (\rightarrow) Model is sad.
It's not a problem per se, but it's weird enough to be worth asking why it's the case, as an emoji or a "->" or literally anything else would have made more sense. Is that a special token that was trained / used for logic chains?
hi @sonali-kumari11 @SerialKicked
I think it may be due to the use of Chinese. Please note the printed text: $\rightarrow$
I use gemma4 to analyze MinerU, with Chinese set as the default language in the Claude Code CLI, running in the VS Code terminal. The output looks like this:
ask (translated from Chinese):
Find all the models that mineru-models-download -s modelscope -m all needs to download
⧉ Selected 1 lines from docker/china/Dockerfile in Visual Studio Code
answer (translated from Chinese):
● Based on an analysis of the source code in mineru/cli/models_download.py, running mineru-models-download -s modelscope -m all downloads two categories of models: Pipeline models and VLM models.
Here is the detailed model list:
- Pipeline models (Pipeline Models)
The download_pipeline_models() function in the code defines the list of models to download. These models mainly handle layout analysis, formula recognition, table recognition, and OCR.
The specific models to download (corresponding to the ModelPath enum) include:
- pp_doclayout_v2: document layout analysis model (Layout Analysis)
- unimernet_small: formula recognition model (Formula Recognition)
- pytorch_paddle: base OCR model (OCR)
- slanet_plus: table structure recognition model (Table Structure Recognition)
- unet_structure: table structure recognition model (Table Structure Recognition)
- paddle_table_cls: table classification model (Table Classification)
- paddle_orientation_classification: text orientation classification model (Orientation Classification)
- pp_formulanet_plus_m: enhanced formula recognition model (Formula Recognition)
- VLM models (VLM Models)
The download_vlm_models() function in the code is responsible for downloading the VLM (Vision-Language Model) related models.
- VLM weight files: this function calls auto_download_and_get_model_root_path("/", repo_mode='vlm'), which downloads a complete VLM model weight library (usually one large model folder) used for high-accuracy end-to-end parsing.
Summary
When you run -m all, the program executes the two download functions above in sequence.
Download flow summary:
- Pipeline group $\rightarrow$ download 8 specialized models $\rightarrow$ write to models-dir.pipeline in mineru.json.
- VLM group $\rightarrow$ download the VLM weight library $\rightarrow$ write to models-dir.vlm in mineru.json.
If you plan to download manually, you need to make sure all of the models above are downloaded locally and that the paths to these two directories are configured correctly in mineru.json.
ask:
❯ use English Re-answer
answer:
● Based on the analysis of the source code in mineru/cli/models_download.py, executing the command mineru-models-download -s modelscope -m all triggers the download of two main categories of models: Pipeline Models and VLM Models.
Here is the detailed breakdown of the models that will be downloaded:
- Pipeline Models
The download_pipeline_models() function defines a list of specialized models used for layout analysis, formula recognition, table recognition, and OCR.
The specific models downloaded (corresponding to the ModelPath enum) are:
- pp_doclayout_v2: Document Layout Analysis model.
- unimernet_small: Formula Recognition model.
- pytorch_paddle: Base OCR models.
- slanet_plus: Table Structure Recognition model.
- unet_structure: Table Structure Recognition model.
- paddle_table_cls: Table Classification model.
- paddle_orientation_classification: Text Orientation Classification model.
- pp_formulanet_plus_m: Enhanced Formula Recognition model.
- VLM Models
The download_vlm_models() function handles the download of the Vision-Language Model (VLM) weights.
- VLM Weight Library: This function calls auto_download_and_get_model_root_path("/", repo_mode='vlm'), which downloads a comprehensive VLM model weights directory used for high-accuracy end-to-end parsing.
Summary Table
┌────────────────┬──────────────────────┬──────────────────────────────────────────────┬───────────────────────────┐
│ Model Category │ Scope                │ Primary Purpose                              │ Config Key in mineru.json │
├────────────────┼──────────────────────┼──────────────────────────────────────────────┼───────────────────────────┤
│ Pipeline       │ 8 Specialized Models │ Layout, OCR, Formula, and Table Recognition  │ models-dir.pipeline       │
├────────────────┼──────────────────────┼──────────────────────────────────────────────┼───────────────────────────┤
│ VLM            │ VLM Weight Library   │ High-precision end-to-end multimodal parsing │ models-dir.vlm            │
└────────────────┴──────────────────────┴──────────────────────────────────────────────┴───────────────────────────┘
Execution Workflow:
When the -m all flag is used, the program sequentially executes both download_pipeline_models() and download_vlm_models(). Once completed, it automatically updates your mineru.json configuration file with the local paths to these downloaded models.
Interesting how it's removed from your English translated version. I use the model only in English, and while I've seen $\rightarrow$ surface plenty of times in English, it was always in the thinking block, never in the responses.
Now that I think about it, I noticed the rightarrow thing in the early GGUF versions on the early llama.cpp implementation, but I don't remember seeing it recently. Then again, I haven't paid that much attention to it (I don't often read the thinking block).
Edit: Correction, I still see it in modern GGUF and backends. Both in response and thinking block. It's just not super common.
It's just a LaTeX expression embedded in markdown.
I guess it's natural for the model to use $\rightarrow$ in the thinking section, since it writes markdown in that section.
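If the literal $\rightarrow$ is distracting in a terminal that doesn't render LaTeX (like the Claude Code output earlier in this thread), one option is to post-process the text on the client side. Here is a minimal sketch, assuming you can get at the response string before it is displayed. The function name `replace_latex_arrows` and the mapping table are made up for illustration and are not part of vLLM, Gemma, or Claude Code:

```python
import re

# Hypothetical mapping from a few LaTeX arrow commands to Unicode characters.
LATEX_ARROWS = {
    r"\rightarrow": "→",
    r"\leftarrow": "←",
    r"\Rightarrow": "⇒",
    r"\leftrightarrow": "↔",
}

# Matches a single LaTeX command wrapped in inline math delimiters:
# either $\cmd$ or \(\cmd\), with optional whitespace inside.
_INLINE_CMD = re.compile(
    r"\$\s*(\\[A-Za-z]+)\s*\$|\\\(\s*(\\[A-Za-z]+)\s*\\\)"
)


def replace_latex_arrows(text: str) -> str:
    """Replace inline LaTeX arrows with Unicode; leave everything else alone."""

    def _sub(match: re.Match) -> str:
        command = match.group(1) or match.group(2)
        # Unknown commands (e.g. \alpha) are returned unchanged.
        return LATEX_ARROWS.get(command, match.group(0))

    return _INLINE_CMD.sub(_sub, text)


if __name__ == "__main__":
    sample = r"Pipeline group $\rightarrow$ download 8 models $\rightarrow$ write mineru.json"
    print(replace_latex_arrows(sample))
```

This only touches a lone arrow command between math delimiters, so real formulas and plain dollar amounts are left as they are. It doesn't explain why the model picks LaTeX arrows in the first place, it just hides them on display.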