# Wall-X: Multimodal Foundation Model for Robotics

## Model Description

Wall-X is a multimodal foundation model designed specifically for robotics applications, combining vision, language, and action capabilities. Built on the Qwen2.5-VL 3B architecture, Wall-X incorporates specialized adaptations for robotic control tasks, enabling seamless integration of visual perception, natural language understanding, and action generation.

## Key Features

- **Multimodal Integration**: Processes visual, textual, and proprioceptive information simultaneously
- **Action Generation**: Specialized for robotic control and manipulation tasks
- **Flexible Architecture**: Based on Qwen2.5-VL with custom adaptations for robotics
- **Mixture of Experts**: Uses an MoE architecture for efficient computation
- **LeRobot Compatible**: Designed to work with LeRobot datasets and frameworks
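
The mixture-of-experts idea above can be sketched in miniature: each token carries a type id, and a router dispatches it to a matching expert. This is a toy illustration only, not Wall-X's actual routing; the real model selects neural sub-network experts per token (compare the `moe_token_types` input in the inference example):

```python
# Toy mixture-of-experts routing: tokens tagged with a type id are sent to
# a matching "expert" function. Purely illustrative; Wall-X's real experts
# are neural sub-networks, not scalar functions.
experts = {
    0: lambda x: x * 2,   # e.g. "text" expert
    1: lambda x: x + 10,  # e.g. "action" expert
}

def route(tokens, token_types):
    """Apply the expert selected by each token's type id."""
    return [experts[t](x) for x, t in zip(tokens, token_types)]

print(route([1, 2, 3], [0, 1, 0]))  # [2, 12, 6]
```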

## Quick Start

### Installation

```bash
# Create conda environment
conda create --name wallx python=3.10
conda activate wallx

# Install base requirements
pip install torch torchvision transformers
pip install huggingface_hub

# Install Wall-X from GitHub
git clone https://github.com/X-Square-Robot/wall-x.git
cd wall-x
pip install -e .
```

### Basic Usage

```python
import torch
from wall_x.model.qwen2_5_based.modeling_qwen2_5_vl_act import Qwen2_5_VLMoEForAction

# Load the model
model_path = "X-Square-Robot/wall-oss-flow"  # or your local path
model = Qwen2_5_VLMoEForAction.from_pretrained(model_path)
model.eval()

# Move to GPU if available and use bfloat16 for efficiency
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).bfloat16()

# Your inference code here...
```

## Supervised Fine-Tuning (SFT)

To train Wall-X on your own robotics datasets, see our comprehensive training guide:

**📖 [Training Documentation](https://github.com/X-Square-Robot/wall-x/blob/main/workspace/README.md)**

The training process covers:
- **Dataset Preparation**: How to format your robotics datasets as LeRobot datasets
- **Configuration Setup**: Detailed configuration of GPUs, model paths, and robot DOF settings
- **Training Scripts**: Ready-to-use training scripts with sensible hyperparameters
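
As a rough sketch of what "LeRobot format" implies, each episode is a sequence of per-step frames pairing observations with actions. The field names below are hypothetical placeholders, not the actual schema; consult the training documentation for the real dataset layout:

```python
# Illustrative per-step frames of a LeRobot-style episode.
# Field names are hypothetical; see the training guide for the real schema.
episode = [
    {
        "observation.state": [0.0] * 20,  # proprioception, e.g. 20 joint values
        "observation.image": None,        # camera frame placeholder
        "action": [0.0] * 20,             # target joint positions
        "timestamp": t * 0.05,            # e.g. a 20 Hz control loop
    }
    for t in range(50)
]
print(len(episode), len(episode[0]["action"]))  # 50 20
```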

### Quick Training Start

```bash
# Run training (see workspace/README.md for detailed configuration)
bash ./workspace/lerobot_example/run.sh
```

## Inference

For detailed inference examples and model evaluation:

**📖 [Inference Documentation](https://github.com/X-Square-Robot/wall-x/blob/main/scripts/)**

### Basic Inference Example

```python
import torch
from wall_x.model.qwen2_5_based.modeling_qwen2_5_vl_act import Qwen2_5_VLMoEForAction

# Load model
model_path = "X-Square-Robot/wall-x"
model = Qwen2_5_VLMoEForAction.from_pretrained(model_path)
model.eval()

# Setup
batch_size = 1
seq_length = 50
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).bfloat16()

# Prepare inputs (example with synthetic data)
torch.manual_seed(0)
input_ids = torch.randint(0, len(model.processor.tokenizer), (batch_size, seq_length), dtype=torch.long)
attention_mask = torch.ones((batch_size, seq_length), dtype=torch.long)
moe_token_types = torch.zeros((batch_size, seq_length), dtype=torch.long)
position_ids = torch.arange(seq_length, dtype=torch.long).unsqueeze(0).expand(batch_size, -1)

# Robotics-specific inputs
proprioception = torch.randn((batch_size, 1, 20), dtype=torch.float32)  # Joint states
agent_pos_mask = torch.ones((batch_size, 1, 20), dtype=torch.float32)
dof_mask = torch.ones((batch_size, 32, 20), dtype=torch.float32)  # DOF mask
dataset_names = ["x2_normal"]

# Move everything to the model's device and dtype
inputs = {
    "input_ids": input_ids.to(device),
    "attention_mask": attention_mask.to(device),
    "moe_token_types": moe_token_types.to(device),
    "position_ids": position_ids.to(device),
    "proprioception": proprioception.to(device).bfloat16(),
    "agent_pos_mask": agent_pos_mask.to(device).bfloat16(),
    "dof_mask": dof_mask.to(device).bfloat16(),
    "dataset_names": dataset_names,
    "mode": "validate",
}

# Run inference
with torch.no_grad():
    outputs = model(**inputs)
    print(f"Output logits shape: {outputs.logits.shape}")
```
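
The `dof_mask` in the example is a 0/1 tensor that marks which degrees of freedom are valid for a given robot. A toy illustration of that masking idea, with shapes mirroring the synthetic example (the semantics here are assumed for illustration, not taken from Wall-X internals):

```python
# Toy DOF masking: multiply predictions by a 0/1 mask so joints a robot
# doesn't have contribute nothing. Semantics assumed for illustration;
# shapes mirror the synthetic example above (horizon 32, max 20 DOF).
horizon, max_dof = 32, 20
robot_dof = 7  # e.g. a 7-DOF arm padded up to max_dof

dof_mask = [[1.0 if j < robot_dof else 0.0 for j in range(max_dof)]
            for _ in range(horizon)]
pred = [[0.5] * max_dof for _ in range(horizon)]
masked = [[p * m for p, m in zip(pr, mr)] for pr, mr in zip(pred, dof_mask)]

print(sum(masked[0]))  # 3.5 — only the 7 valid joints contribute
```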

### Advanced Inference Scripts

For production-ready inference and evaluation scripts:

```bash
# Basic inference test
python ./scripts/fake_inference.py

# Generate open-loop comparison plots
python ./scripts/draw_openloop_plot.py
```

**📁 [View all inference scripts](https://github.com/X-Square-Robot/wall-x/tree/main/scripts)**

## Complete Documentation

For comprehensive setup, training, and inference instructions:

### 🚀 **[Visit our GitHub Repository](https://github.com/X-Square-Robot/wall-x)**

The repository contains:
- **Detailed Installation Guide**: Complete environment setup with all dependencies
- **Training Tutorials**: Step-by-step SFT process with LeRobot datasets
- **Inference Examples**: Multiple inference scripts and evaluation tools
- **Configuration Templates**: Ready-to-use configs for different robot setups
- **Troubleshooting Guide**: Common issues and solutions