nielsr HF Staff committed
Commit a1d8121 · verified · 1 Parent(s): 81815b0

Improve model card: Add metadata, paper link, code link, project blog link and usage instructions


This PR enriches the model card for `JacobiForcing_Coder_7B_v1` by adding:
- A link to the paper: [Fast and Accurate Causal Parallel Decoding using Jacobi Forcing](https://huggingface.co/papers/2512.14681).
- The `license: apache-2.0` metadata.
- The `library_name: transformers` metadata, since the model is built on Transformers and uses its tokenizer and generation utilities.
- The `pipeline_tag: text-generation` to correctly classify the model's functionality.
- A link to the GitHub repository and project blog.
- A "Usage" section with code examples directly from the GitHub README to demonstrate how to use the model for inference.

This will significantly improve the discoverability and usability of the model.

Files changed (1)
README.md +65 -1
README.md CHANGED
@@ -1,3 +1,67 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+
+ # Jacobi Forcing: Fast and Accurate Causal Parallel Decoding
+
+ This repository contains the `JacobiForcing_Coder_7B_v1` model, presented in the paper [Fast and Accurate Causal Parallel Decoding using Jacobi Forcing](https://huggingface.co/papers/2512.14681).
+
  Base Model: [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)

- Training Data (Jacobi trajectories): https://huggingface.co/datasets/JacobiForcing/OpenCodeInstruct_training_data_n32
+ Training Data (Jacobi trajectories): https://huggingface.co/datasets/JacobiForcing/OpenCodeInstruct_training_data_n32
+
+ Jacobi Forcing is a novel training technique that converts Large Language Models (LLMs) into native causal parallel decoders. This approach maintains the causal autoregressive backbone and addresses the AR-to-diffusion mismatch by training the model to handle noisy future blocks along its own Jacobi decoding trajectories.
+
+ It achieves up to $4.5\times$ higher tokens-per-forward and a $4\times$ wall-clock speedup on coding and math tasks, while retaining near-AR generation quality.
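+
+ To make "Jacobi decoding trajectories" concrete, here is a minimal sketch of plain greedy Jacobi decoding with a stock `transformers` causal LM. It only illustrates the fixed-point iteration that produces such trajectories; it is not the repository's implementation, and `block_size`, `max_iters`, and `init_id` are illustrative parameters (batch size 1 assumed).
+
+ ```python
+ import torch
+
+ @torch.no_grad()
+ def jacobi_decode_block(model, input_ids, block_size=16, max_iters=16, init_id=0):
+     """Refine one block of future tokens to a greedy fixed point (Jacobi iteration)."""
+     # Start from an arbitrary guess for the next `block_size` tokens.
+     block = torch.full((1, block_size), init_id, dtype=torch.long, device=input_ids.device)
+     for _ in range(max_iters):
+         # One causal forward pass refines every block position in parallel.
+         logits = model(torch.cat([input_ids, block], dim=1)).logits
+         # The logit at position t predicts token t+1, so the block's greedy
+         # predictions come from the slice just before each block position.
+         new_block = logits[:, input_ids.shape[1] - 1 : -1, :].argmax(dim=-1)
+         if torch.equal(new_block, block):
+             break  # fixed point: the block now matches greedy AR decoding
+         block = new_block
+     return torch.cat([input_ids, block], dim=1)
+ ```
+
+ Each intermediate `block` along this loop is one point on a Jacobi trajectory; Jacobi Forcing trains on such trajectories so that more block tokens become correct per forward pass.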
+
+ You can find more details on the project blog: [Jacobi Forcing Blog](https://hao-ai-lab.github.io/blogs/jacobi-forcing/)
+ The official code repository is available here: [GitHub Repository](https://github.com/hao-ai-lab/JacobiForcing)
+
+ ## Usage
+
+ You can try the chatbot demo locally or use the provided Python inference code.
+
+ ### Local Chatbot Demo
+ ```bash
+ # Modify the script to point to your local model path before running.
+ streamlit run applications/jacobi_model_chat.py
+ ```
+
+ ### Inference with Code
+ You can use our provided `eagenerate` function for accelerated generation, similar to using `generate` from Hugging Face Transformers. Here is an example:
+
+ ```python
+ import torch
+ from eagle.model.ea_model import EaModel
+ from fastchat.model import get_conversation_template
+
+ base_model_path = "Qwen/Qwen2.5-Coder-7B-Instruct"
+ EAGLE_model_path = "JacobiForcing/JacobiForcing_Coder_7B_v1"  # or your local path to the weights
+
+ model = EaModel.from_pretrained(
+     base_model_path=base_model_path,
+     ea_model_path=EAGLE_model_path,
+     torch_dtype=torch.float16,
+     low_cpu_mem_usage=True,
+     device_map="auto",
+     total_token=-1,  # automatically configure the number of draft tokens
+ )
+ model.eval()
+
+ # Build the prompt with the conversation template that matches your base model
+ # (this example uses the Vicuna template).
+ your_message = "Hello"
+ conv = get_conversation_template("vicuna")
+ conv.append_message(conv.roles[0], your_message)
+ conv.append_message(conv.roles[1], None)
+ prompt = conv.get_prompt()
+
+ input_ids = model.tokenizer([prompt]).input_ids
+ input_ids = torch.as_tensor(input_ids).cuda()
+
+ # Generation mirrors Hugging Face `generate`, but decodes blocks in parallel.
+ output_ids = model.eagenerate(input_ids, temperature=0.5, max_new_tokens=512)
+ output = model.tokenizer.decode(output_ids[0])
+ print(output)
+ ```
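+
+ To get a rough sense of the throughput on your own hardware, you can time the call and count generated tokens. This is an illustrative measurement reusing the variables above (and assuming, as the decoding step suggests, that `eagenerate` returns the full sequence including the prompt), not a rigorous benchmark:
+
+ ```python
+ import time
+
+ torch.cuda.synchronize()
+ start = time.perf_counter()
+ output_ids = model.eagenerate(input_ids, temperature=0.5, max_new_tokens=512)
+ torch.cuda.synchronize()
+ elapsed = time.perf_counter() - start
+
+ # New tokens = full sequence length minus the prompt length.
+ new_tokens = output_ids.shape[1] - input_ids.shape[1]
+ print(f"{new_tokens} tokens in {elapsed:.2f}s ({new_tokens / elapsed:.1f} tok/s)")
+ ```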