---
license: mit
---
# MARC: Memory-Augmented RL Token Compression for Efficient Video Understanding
[![Paper](https://img.shields.io/badge/Paper-ICLR%202026-blue)](https://arxiv.org/pdf/2510.07915)
[![Model](https://img.shields.io/badge/HF_Model-orange)](https://huggingface.co/collections/Memories-ai/marc)
[![Web](https://img.shields.io/badge/🌎_Website-MARC-blue.svg)](https://yunzeliu.github.io/MARC/)
[![Github](https://img.shields.io/badge/github-repo-blue?logo=github)](https://github.com/Gimlettt/MARC)
**MARC** (Memory-Augmented RL Token Compression) compresses video tokens for efficient video understanding. The paper is accepted at ICLR 2026.
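For intuition only, here is a toy sketch of what video token compression means at the tensor level. This is **not** MARC's method (MARC learns its compression policy with memory-augmented RL); the sketch simply scores tokens by a hypothetical saliency proxy (L2 norm) and keeps a fixed fraction, illustrating the input/output shapes involved:

```python
import torch

def compress_video_tokens(tokens: torch.Tensor, keep_ratio: float = 0.25) -> torch.Tensor:
    """Illustrative sketch: keep the top-k tokens by a simple saliency proxy.

    `tokens` has shape (num_tokens, dim). The real MARC policy is learned;
    this toy version scores tokens by L2 norm and keeps `keep_ratio` of them,
    preserving the original temporal order.
    """
    num_tokens, _ = tokens.shape
    k = max(1, int(num_tokens * keep_ratio))
    scores = tokens.norm(dim=-1)                  # per-token saliency proxy
    keep = scores.topk(k).indices.sort().values   # restore temporal order
    return tokens[keep]

# 1024 video tokens of dim 256 -> 256 tokens after 4x compression
video_tokens = torch.randn(1024, 256)
compressed = compress_video_tokens(video_tokens, keep_ratio=0.25)
print(tuple(compressed.shape))  # (256, 256)
```

Feeding 4x fewer visual tokens into the language model is where the inference savings come from; MARC's contribution is choosing *which* tokens to keep far better than a hand-crafted heuristic like the one above.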
## Quick Start
### Inference Example
```python
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch

# Load model and processor
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "path/to/model",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("path/to/model")

# Prepare video input
messages = [{
    "role": "user",
    "content": [
        {"type": "video", "video": "path/to/video.mp4"},
        {"type": "text", "text": "What is happening in this video?"},
    ],
}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
_, video_inputs = process_vision_info(messages)

# Generate with token compression enabled
inputs = processor(
    text=[text],
    videos=video_inputs,
    compress=True,  # Enable MARC token compression
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, compress=True, max_new_tokens=512)

# Decode only the newly generated tokens
response = processor.decode(
    outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
```
See [inference_script/inference_example.py](inference_script/inference_example.py) for a complete example.