Sadjad Alikhani committed: Update README.md
### 8. **Tokenize and Load the Model**
Before we dive into tokenizing the dataset and loading the model, let's understand how the tokenization process is adapted to the wireless communication context. In this case, **tokenization** refers to segmenting each wireless channel into patches, similar to how Vision Transformers (ViTs) work with images. Each wireless channel is structured as a \(32 \times 32\) matrix, where rows represent antennas and columns represent subcarriers.
The tokenization process involves **dividing the channel matrix into patches**, with each patch containing information from 16 consecutive subcarriers. These patches are then **embedded** into a 64-dimensional space, providing the Transformer with a richer context for each patch. In this process, **positional encodings** are added to preserve the structural relationships within the channel, ensuring the Transformer captures both spatial and frequency dependencies.
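As a concrete illustration of the patching and embedding steps described above, here is a minimal NumPy sketch. This is not the library's actual tokenizer: the random linear projection and the sinusoidal positional encoding below are stand-ins for LWM's learned patch embedding and its own positional-encoding scheme.

```python
import numpy as np

# Toy example: a wireless channel as a 32 x 32 matrix,
# where rows are antennas and columns are subcarriers.
rng = np.random.default_rng(0)
channel = rng.standard_normal((32, 32))

# Segment each row into patches of 16 consecutive subcarriers,
# giving 32 * (32 // 16) = 64 patches of length 16.
patch_size = 16
patches = channel.reshape(32, 32 // patch_size, patch_size).reshape(-1, patch_size)
print(patches.shape)  # (64, 16)

# Embed each 16-dim patch into a 64-dim space. A random projection
# stands in here for the model's learned embedding layer.
embedding = rng.standard_normal((patch_size, 64))
embedded = patches @ embedding
print(embedded.shape)  # (64, 64)

# Add sinusoidal positional encodings so the Transformer can recover
# each patch's position (antenna row / subcarrier group) in the channel.
positions = np.arange(patches.shape[0])[:, None]  # (64, 1)
dims = np.arange(64)[None, :]                     # (1, 64)
angle = positions / np.power(10000, (2 * (dims // 2)) / 64)
pos_enc = np.where(dims % 2 == 0, np.sin(angle), np.cos(angle))
tokens = embedded + pos_enc
print(tokens.shape)  # (64, 64)
```

In practice the projection matrix is trained end-to-end with the Transformer; the sketch only shows the shapes flowing through the pipeline.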
If you choose to apply **Masked Channel Modeling (MCM)** during inference (by setting `gen_raw=False`), LWM will mask certain patches, as it did during pre-training. However, for standard inference, masking isn't necessary unless you want to test LWM's resilience to noisy inputs.
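To make the masking idea concrete, the toy snippet below zeroes out a random subset of patch tokens. LWM's actual MCM procedure (mask ratio, replacement values, which patches are eligible) lives in the library and may differ; this is only a sketch of what "masking patches" means.

```python
import numpy as np

# 64 patch embeddings of dimension 64, as produced by tokenization.
rng = np.random.default_rng(1)
tokens = rng.standard_normal((64, 64))

# Mask a random 15% of the patches (an assumed ratio for illustration).
mask_ratio = 0.15
num_masked = int(mask_ratio * tokens.shape[0])
masked_idx = rng.choice(tokens.shape[0], size=num_masked, replace=False)

# Replace the masked patches with a zero placeholder token.
masked_tokens = tokens.copy()
masked_tokens[masked_idx] = 0.0

print(num_masked)  # 9 patches masked out of 64
```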
Now, let's move on to tokenizing the dataset and loading the pre-trained LWM model.
```python
from input_preprocess import tokenizer
# ... (intermediate lines omitted in this excerpt) ...
print(f"Loading the LWM model on {device}...")
model = lwm.from_pretrained(device=device)
```
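The `device` variable used above is defined earlier in the README; a typical PyTorch selection (a representative sketch, not a quote of the README's exact line) looks like:

```python
import torch

# Prefer a GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Loading the LWM model on {device}...")
```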
With this setup, you're ready to pass your tokenized wireless channels through the pre-trained model, extracting rich, context-aware embeddings that are ready for use in downstream tasks.
---
### 9. **Perform Inference**