nbeerbower committed on
Commit 764ea73 · verified · 1 Parent(s): 16dbd4c

Update README: switch install to artemis-vlm v0.1.0 package


Drops the old 'pip install merlina; from src.artemis_vlm import ...' hack now that the ArtemisVLM model classes live in the dedicated Schneewolf-Labs/Artemis repo. Also drops the explicit model.all_tied_weights_keys = {} workaround — fixed in artemis-vlm v0.1.0 directly. Switches to AutoModelForCausalLM.from_pretrained() now that __init__.py registers with HF AutoConfig/AutoModel.
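The import-time registration mentioned above can be illustrated with a toy registry. This is only a sketch of the general pattern, not the code in artemis-vlm or in transformers: every name below (`register_config`, `register_model`, `ToyVLMConfig`, `ToyVLMModel`, `auto_from_config_dict`) is hypothetical. In transformers itself the corresponding hooks are `AutoConfig.register(model_type, config_class)` and `AutoModelForCausalLM.register(config_class, model_class)`; how the package's `__init__.py` calls them is not shown in this commit.

```python
# Toy stand-in for an import-time model registry.
# All names here are hypothetical; this is NOT the transformers implementation.
from dataclasses import dataclass

_CONFIG_REGISTRY = {}  # model_type string -> config class
_MODEL_REGISTRY = {}   # config class -> model class


def register_config(model_type, config_cls):
    """Map a model_type string (as stored in a config file) to a config class."""
    if model_type in _CONFIG_REGISTRY:
        raise ValueError(f"model_type {model_type!r} is already registered")
    _CONFIG_REGISTRY[model_type] = config_cls


def register_model(config_cls, model_cls):
    """Map a config class to the model class that consumes it."""
    _MODEL_REGISTRY[config_cls] = model_cls


@dataclass
class ToyVLMConfig:
    model_type: str = "toy_vlm"
    hidden_size: int = 8


class ToyVLMModel:
    def __init__(self, config):
        self.config = config


def auto_from_config_dict(config_dict):
    """Resolve and build a model from the 'model_type' field of a config dict."""
    model_type = config_dict["model_type"]
    try:
        config_cls = _CONFIG_REGISTRY[model_type]
    except KeyError:
        raise ValueError(
            f"unknown model_type {model_type!r}; was the package imported?"
        ) from None
    config = config_cls(**config_dict)
    return _MODEL_REGISTRY[config_cls](config)


# What a package's __init__.py would run once, at import time:
register_config("toy_vlm", ToyVLMConfig)
register_model(ToyVLMConfig, ToyVLMModel)

model = auto_from_config_dict({"model_type": "toy_vlm", "hidden_size": 16})
print(type(model).__name__, model.config.hidden_size)
```

Because the lookup is keyed on the `model_type` string, the registering import has to run before the factory call. That is the reason the new README imports `artemis_vlm` before calling `AutoModelForCausalLM.from_pretrained()`.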

Files changed (1)
  1. README.md +21 -18
README.md CHANGED
@@ -76,33 +76,36 @@ and it sets up a real Stage-1 run.
 
 ## What's next
 
-- **A3** — full Stage-1 (~1M samples on BLIP3o-Long-Caption) + Stage-2
-  multimodal instruction FFT with text-rehearsal so the underlying A2
-  text quality is retained.
-- **Artemis** the polished named release after A3.
+- **A3** — full Stage-1 (~1M samples on BLIP3o-Long-Caption) currently training on
+  a single NVIDIA GB10. A3 is the projector-aligned successor to A3-preview.
+- **Artemis** — Stage-2 (multimodal instruction FFT with text rehearsal so A2's
+  reasoning / tool calling / identity survive). The named flagship multimodal
+  release after A3.
+
+## Install
+
+```bash
+pip install 'artemis-vlm @ git+https://github.com/Schneewolf-Labs/Artemis.git@v0.1.0'
+```
+
+The [`artemis-vlm`](https://github.com/Schneewolf-Labs/Artemis) package contains
+the model definition, processor, and data collator. On import, it registers
+`artemis_vlm` with HuggingFace AutoConfig and AutoModelForCausalLM so
+`from_pretrained()` resolves without `trust_remote_code`.
 
 ## Usage
 
 ```python
 import torch
-from transformers import AutoTokenizer
-# Requires the Schneewolf Labs Artemis VLM module:
-# pip install merlina # contains src.artemis_vlm
-# OR copy src/artemis_vlm.py from
-# https://github.com/Schneewolf-Labs/Merlina
-from src.artemis_vlm import (
-    ArtemisVLMForConditionalGeneration,
-    ArtemisVLMProcessor,
-)
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import artemis_vlm # registers ArtemisVLM with AutoConfig / AutoModel
 
-model = ArtemisVLMForConditionalGeneration.from_pretrained(
-    "schneewolflabs/A3-preview", dtype=torch.bfloat16
+model = AutoModelForCausalLM.from_pretrained(
+    "schneewolflabs/A3-preview", dtype=torch.bfloat16,
 ).to("cuda").eval()
-# transformers 5.x compat (untied weights — see Merlina #79 follow-up):
-model.all_tied_weights_keys = {}
 
 tok = AutoTokenizer.from_pretrained("schneewolflabs/A3-preview")
-processor = ArtemisVLMProcessor(
+processor = artemis_vlm.ArtemisVLMProcessor(
     tokenizer=tok, vision_config=model.visual.config,
     min_pixels=32 * 32, max_pixels=512 * 512,
 )