---
license: cc
language:
- en
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
---

todos:
* check that numerical output matches the original VILA implementation
* check training stability
* save_pretrained()

already finished:
* AutoModel.from_pretrained() / device_map="auto" for sharding
* loading
* fix recursive imports
* text conversation
* image + text conversation:
    * .generate() / .generate_content()
    * llava/cli/infer.py
    * tests/bash/test_inference.sh

## NVILA HF Compatible Mode
Remote model loading example:

```python
from transformers import AutoConfig, AutoModel
from termcolor import colored

model_path = "Efficient-Large-Model/nvila_lite_3b_dev"
print("main_dev.py, loading from ", model_path)

# Alternative: build the model from its config without downloading weights
# config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
# model = AutoModel.from_config(config, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True, device_map="auto")
res = model.generate_content([
    "how are you today?"
])
print(colored(res, "cyan", attrs=["bold"]))

print("---" * 40)

import PIL.Image
response = model.generate_content([
    PIL.Image.open("inference_test/test_data/caption_meat.jpeg"),
    "describe the image?"
])
print(colored(response, "cyan", attrs=["bold"]))
```
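As the example shows, `generate_content()` accepts a flat list mixing PIL images and prompt strings. A minimal sketch of how such a mixed list could be separated before reaching the model (`split_media` is a hypothetical helper for illustration, not part of the NVILA repo):

```python
def split_media(content):
    """Separate a mixed content list into (media, text) parts.

    Strings are treated as prompt text; anything else (e.g. a
    PIL.Image.Image) is treated as a media item.
    """
    media = [item for item in content if not isinstance(item, str)]
    text = " ".join(item for item in content if isinstance(item, str))
    return media, text

# Example with a stand-in object in place of a real PIL image:
class FakeImage:
    pass

media, text = split_media([FakeImage(), "describe the image?"])
print(len(media), text)  # 1 describe the image?
```

This mirrors the two calls above: a text-only list yields an empty media list, while the image + text list yields one media item and the prompt string.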