Update README.md

README.md +1 -45

@@ -14,48 +14,4 @@ library_name: transformers
 Used in [STree: Speculative Tree Decoding for Hybrid State-Space Models](https://arxiv.org/abs/2505.14969) as a draft model for speculative decoding for hybrid models.
-For more details on installation, training, and evaluation, please refer to the [GitHub repository](https://github.com/wyc1997/stree).
-## Usage
-You can use `EaModel.from_pretrained` for accelerated text generation, similar to `generate` from Hugging Face Transformers. Here is an example:
-```python
-import torch
-from eagle.model.ea_model import EaModel
-from fastchat.model import get_conversation_template
-# Load the base model and EAGLE acceleration model
-base_model_path = "JunxiongWang/Llama3.2-Mamba2-3B-distill" # Replace with your base model path
-EAGLE_model_path = "ycwu97/mamba2-distilled-small" # Replace with your EAGLE weights path
-model = EaModel.from_pretrained(
-    base_model_path=base_model_path,
-    ea_model_path=EAGLE_model_path,
-    torch_dtype=torch.float16,
-    low_cpu_mem_usage=True,
-    device_map="auto",
-    total_token=-1 # -1 for auto configuration of draft tokens
-)
-model.eval()
-# Prepare your message using a conversation template (e.g., Vicuna)
-your_message="Hello"
-conv = get_conversation_template("vicuna") # Use the correct chat template for your base model
-conv.append_message(conv.roles[0], your_message)
-conv.append_message(conv.roles[1], None)
-prompt = conv.get_prompt()
-# Tokenize the input prompt
-input_ids = model.tokenizer([prompt]).input_ids
-input_ids = torch.as_tensor(input_ids).cuda()
-# Generate output using EAGLE's accelerated decoding
-output_ids = model.eagenerate(input_ids, temperature=0.5, max_new_tokens=512)
-# Decode and print the generated text
-output = model.tokenizer.decode(output_ids[0])
-print(output)
-```
-**Note:** For chat models like Vicuna, LLaMA2-Chat, and LLaMA3-Instruct, you must use the correct chat template to ensure proper model output and EAGLE's performance.


+For more details on installation, training, and evaluation, please refer to the [GitHub repository](https://github.com/wyc1997/stree).