Sadjad Alikhani
committed on
Update README.md
README.md
CHANGED

# 📡 **LWM: Large Wireless Model**

Welcome to the **LWM** (Large Wireless Model) repository! This project hosts a pre-trained model designed to process and extract features from wireless communication datasets, specifically the **DeepMIMO** dataset. Follow the instructions below to clone the repository, load the data, and perform inference with LWM.

---

## 🛠 **How to Use**

### 1. **Clone the Repository**

To get started, clone the Hugging Face repository to your local machine with the following Python code:

```python
import subprocess
import os
import sys
import importlib.util
import torch

# Hugging Face public repository URL
repo_url = "https://huggingface.co/sadjadalikhani/LWM"

# Directory where the repo will be cloned
clone_dir = "./LWM"

# Step 1: Clone the repository if it hasn't been cloned already
if not os.path.exists(clone_dir):
    print(f"Cloning repository from {repo_url} into {clone_dir}...")
    result = subprocess.run(["git", "clone", repo_url, clone_dir], capture_output=True, text=True)

    if result.returncode != 0:
        print(f"Error cloning repository: {result.stderr}")
        sys.exit(1)
    print(f"Repository cloned successfully into {clone_dir}")
else:
    print(f"Repository already cloned into {clone_dir}")

# Step 2: Add the cloned directory to the Python path
sys.path.append(clone_dir)

# Step 3: Helper that imports all public callables from a file into globals()
def import_functions_from_file(module_name, file_path):
    try:
        spec = importlib.util.spec_from_file_location(module_name, file_path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        for function_name in dir(module):
            if callable(getattr(module, function_name)) and not function_name.startswith("__"):
                globals()[function_name] = getattr(module, function_name)
        return module
    except FileNotFoundError:
        print(f"Error: {file_path} not found!")
        sys.exit(1)

# Step 4: Import functions from the repository
import_functions_from_file("lwm_model", os.path.join(clone_dir, "lwm_model.py"))
import_functions_from_file("inference", os.path.join(clone_dir, "inference.py"))
import_functions_from_file("load_data", os.path.join(clone_dir, "load_data.py"))
import_functions_from_file("input_preprocess", os.path.join(clone_dir, "input_preprocess.py"))
print("All required functions imported successfully.")
```
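
The helper above injects every public callable from the repository files into the global namespace. If you prefer explicit imports, a hypothetical equivalent (assuming, for illustration, that `lwm_model.py` defines `LWM`, `load_data.py` defines `load_DeepMIMO_data`, `input_preprocess.py` defines `tokenizer`, and `inference.py` defines `dataset_gen` — the names used later in this guide) would look roughly like:

```python
# Hypothetical explicit imports; which file defines which name is an assumption
from lwm_model import LWM
from load_data import load_DeepMIMO_data
from input_preprocess import tokenizer
from inference import dataset_gen
```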

---

### 2. **Load the LWM Model**

Once the repository is cloned, load the pre-trained **LWM** model using the following code:

```python
# Step 5: Load the LWM model (on GPU if available, otherwise CPU)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Loading the LWM model on {device}...")
model = LWM.from_pretrained(device=device)
```
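
Optionally, as a quick sanity check (assuming `LWM` is a standard `torch.nn.Module` subclass), you can switch the model to eval mode and count its parameters:

```python
# Optional sanity check; assumes LWM is a torch.nn.Module subclass
model.eval()  # disable dropout/batch-norm updates for inference
num_params = sum(p.numel() for p in model.parameters())
print(f"LWM loaded with {num_params:,} parameters on {device}")
```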
|
| 80 |
|
| 81 |
+
---
|
| 82 |
|
| 83 |
+
### 3. **Load the DeepMIMO Dataset**
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 84 |
|
| 85 |
+
Load the DeepMIMO dataset using the pre-defined loading function:
|
| 86 |
|
| 87 |
+
```python
|
| 88 |
+
# Step 6: Load dataset (direct call, no module prefix)
|
| 89 |
+
print("Loading DeepMIMO dataset...")
|
| 90 |
+
deepmimo_data = load_DeepMIMO_data()
|
| 91 |
+
```
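
The exact structure of the returned object is defined in the repository's `load_data.py`; as a quick, non-authoritative peek:

```python
# Optional: inspect the loaded object; its exact structure is defined in load_data.py
print(f"Loaded object type: {type(deepmimo_data).__name__}")
try:
    print(f"Number of entries: {len(deepmimo_data)}")
except TypeError:
    pass  # not a sized container
```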

---

### 4. **Tokenize the DeepMIMO Dataset**

Tokenize the dataset based on specific scenarios from DeepMIMO. Below is a list of available scenarios and their links for more information:

| **Scenario** | **City** | **Link to DeepMIMO Page** |
|--------------|----------|---------------------------|
- **Paths**: 20

#### **Tokenization Code**:
Select and tokenize specific scenarios by adjusting `scenario_idxs`. In the example below, we select the first two scenarios.

```python
# Step 7: Tokenize the dataset
scenario_idxs = [0, 1]  # Select the first two scenarios
print("Tokenizing the dataset...")
preprocessed_chs = tokenizer(deepmimo_data, scenario_idxs, gen_raw=True)
```

- The dataset will be tokenized according to the selected scenarios and preprocessing configurations.

---

### 5. **LWM Inference**

Once the dataset is tokenized, generate either the **raw channels** or the **inferred LWM embeddings** by choosing the input type.

```python
# Step 8: Generate the dataset for inference
input_type = ['cls_emb', 'channel_emb', 'raw'][1]  # Modify input type as needed
dataset = dataset_gen(preprocessed_chs, input_type, model)
```

You can choose between:
- `cls_emb`: LWM CLS token embeddings
- `channel_emb`: LWM channel embeddings
- `raw`: Raw wireless channel data
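
As an optional, non-authoritative check (output shapes depend on the LWM configuration and the tokenized scenarios), you can generate all three variants and compare what each one returns:

```python
# Optional: compare the outputs of the three input types.
# Assumes each call returns an array/tensor with a .shape attribute.
for name in ['cls_emb', 'channel_emb', 'raw']:
    out = dataset_gen(preprocessed_chs, name, model)
    print(f"{name}: {tuple(out.shape)}")
```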

---

## 🔄 **Post-processing for Downstream Tasks**

### 1. **Use the Dataset in Downstream Tasks**

Finally, use the generated dataset for your downstream tasks, such as classification, prediction, or analysis.

```python
# Step 9: Print results
print(f"Dataset generated with shape: {dataset.shape}")
print("Inference completed successfully.")
```
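
As an illustrative sketch (not part of the repository), assuming `dataset` is a PyTorch tensor of embeddings and that you supply your own task labels, you could wrap it for training a downstream model:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical placeholder labels; replace with real labels for your task
labels = torch.zeros(dataset.shape[0], dtype=torch.long)

loader = DataLoader(TensorDataset(dataset, labels), batch_size=64, shuffle=True)
for batch_embeddings, batch_labels in loader:
    # Feed batch_embeddings into your task-specific model/head here
    pass
```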

---

## 📋 **Requirements**

- **Python 3.x**
- **PyTorch**
- **Git**