Since Hugging Face has not published a standalone PyTorch SmolLM2_360M_model.py that can load the released model weights and config at https://huggingface.co/HuggingFaceTB/SmolLM2-360M/ for fine-tuning and inference,
I have attempted to construct a PyTorch model.py that can load the published weights and config and at least run in inference mode. Once a functioning PyTorch model.py is built, it may be possible to export a TorchScript version of the SmolLM2 model that can run on non-Python targets such as MPUs, RISC machines, or smartphones in edge devices. The SmolLM2_360M_model.py runs but is unable to load the safetensors data. Here is the error I encountered:
C:\Users\User\OneDrive\Desktop\SmolLM2>python SmolLM2_360M_model_debugging.py
Warning: SentencePiece not found, using rudimentary BPE tokenizer. Install SentencePiece for better performance.
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.3 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last):
  File "C:\Users\User\OneDrive\Desktop\SmolLM2\SmolLM2_360M_model_debugging.py", line 470, in <module>
    model = SmolLM2_360M(config_path)
  File "C:\Users\User\OneDrive\Desktop\SmolLM2\SmolLM2_360M_model_debugging.py", line 243, in __init__
    self.embed_tokens = nn.Embedding(self.vocab_size, self.hidden_size)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\sparse.py", line 142, in __init__
    self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs),
C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\sparse.py:142: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:84.)
  self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs),
An error occurred while loading weights: File does not contain tensor lm_head.weight
C:\Users\User\OneDrive\Desktop\SmolLM2>
So, what is the story with the safetensors error "File does not contain tensor lm_head.weight"?
Is there a Python script for inspecting the safetensors file?
Why does the model.safetensors file "not contain tensor lm_head.weight"?
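
One way to inspect the file without any extra packages is to read its header directly. A safetensors file starts with an 8-byte little-endian length, followed by a JSON header that maps each tensor name to its dtype, shape, and byte offsets. The sketch below relies only on that layout; the function name `read_safetensors_header` is my own and not part of any library.

```python
import json
import struct
import sys


def read_safetensors_header(path):
    """Return {tensor_name: (dtype, shape)} from a .safetensors file header."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return {name: (info["dtype"], info["shape"]) for name, info in header.items()}


# Usage: python inspect_safetensors.py model.safetensors
if __name__ == "__main__" and len(sys.argv) > 1:
    tensors = read_safetensors_header(sys.argv[1])
    for name in sorted(tensors):
        dtype, shape = tensors[name]
        print(f"{name:60s} {dtype:6s} {shape}")
    print("lm_head.weight present:", "lm_head.weight" in tensors)
```

If the listing shows an embedding tensor (probably named something like `model.embed_tokens.weight`) but no `lm_head.weight`, that pattern is consistent with a checkpoint whose output head shares its weights with the input embedding.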
# Help Needed: Building a Standalone PyTorch SmolLM2-360M Model

The Hugging Face Hub hosts the SmolLM2-360M model ([HuggingFaceTB/SmolLM2-360M](https://huggingface.co/HuggingFaceTB/SmolLM2-360M/)), but currently lacks a standalone PyTorch `model.py` file for loading, fine-tuning, and inference. This limits the model's usability outside the Hugging Face ecosystem.

I've started creating a `SmolLM2_360M_model.py` file to address this gap, aiming for compatibility with all SmolLM2 models. The initial goal is to enable inference using the published weights and config. A successful PyTorch implementation would pave the way for exporting a TorchScript version, broadening accessibility to non-Python environments like microcontrollers, RISC-V machines, smartphones, and other edge devices.

**The Challenge:**

While my `SmolLM2_360M_model.py` runs, it encounters problems loading the `safetensors` data. I'm receiving the following error:
```
An error occurred while loading weights: File does not contain tensor lm_head.weight
```
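
A likely explanation, assuming the published `config.json` sets `tie_word_embeddings` to `true` (worth confirming in the file itself): the output head reuses the token-embedding matrix, so the checkpoint stores that matrix once and has no separate `lm_head.weight` entry. Under that assumption, the loader can fill in the missing key before handing the weights to the model. This is a minimal sketch; the helper name and the key `model.embed_tokens.weight` are assumptions that should be checked against the tensor names actually present in the file.

```python
def resolve_tied_lm_head(state_dict, tie_word_embeddings,
                         embed_key="model.embed_tokens.weight",
                         head_key="lm_head.weight"):
    """Return a copy of state_dict in which head_key is guaranteed to exist.

    If the checkpoint omits head_key and the config ties the embeddings,
    head_key is pointed at the same object as embed_key (shared, not copied).
    """
    resolved = dict(state_dict)
    if head_key in resolved:
        return resolved
    if not tie_word_embeddings:
        raise KeyError(f"{head_key} is missing and tie_word_embeddings is false")
    if embed_key not in resolved:
        raise KeyError(f"{embed_key} is missing, cannot tie {head_key}")
    resolved[head_key] = resolved[embed_key]
    return resolved
```

Inside a PyTorch module, the same effect can be had by assigning `self.lm_head.weight = self.embed_tokens.weight` after both layers are constructed, so the two layers share one parameter.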

**Call to Action:**

I'm seeking assistance from experienced PyTorch developers to debug the loading issue and complete the `SmolLM2_360M_model.py` implementation. Your contributions will significantly expand the potential applications of SmolLM2.

**Specific Areas Where Help is Needed:**

* **Safetensors Loading:** Resolving the error encountered when loading the model weights from the safetensors file.
* **Model Architecture Verification:** Confirming the correctness of the PyTorch model architecture based on the config file.
* **Inference Implementation:** Ensuring the model can perform inference correctly.
* **Fine-tuning Support (Optional):** Adding functionality for fine-tuning the model on downstream tasks.
* **TorchScript Export (Optional):** Enabling export to TorchScript for deployment on resource-constrained devices.
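
For the architecture check, one quick sanity test is to compute the parameter count implied by the config and compare it with the count of the constructed model. The sketch below assumes a Llama-style decoder (bias-free q/k/v/o projections with grouped-query attention, a gated three-matrix MLP, and two RMSNorm weights per layer). The function name is hypothetical, and the example values are what I believe `config.json` contains; they should be re-read from the actual file.

```python
def llama_style_param_count(cfg):
    """Parameter count implied by a Llama-style config dict (no biases assumed)."""
    h = cfg["hidden_size"]
    layers = cfg["num_hidden_layers"]
    inter = cfg["intermediate_size"]
    vocab = cfg["vocab_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = cfg["num_key_value_heads"] * head_dim

    attn = 2 * h * h + 2 * h * kv_dim  # q_proj and o_proj, then k_proj and v_proj
    mlp = 3 * h * inter                # gate_proj, up_proj, down_proj
    norms = 2 * h                      # input and post-attention RMSNorm weights

    total = vocab * h + layers * (attn + mlp + norms) + h  # embedding + blocks + final norm
    if not cfg.get("tie_word_embeddings", False):
        total += vocab * h             # separate lm_head matrix
    return total


# Assumed values for SmolLM2-360M; verify against the published config.json.
assumed_cfg = {
    "hidden_size": 960,
    "num_hidden_layers": 32,
    "num_attention_heads": 15,
    "num_key_value_heads": 5,
    "intermediate_size": 2560,
    "vocab_size": 49152,
    "tie_word_embeddings": True,
}
print(llama_style_param_count(assumed_cfg))  # 361821120
```

If `sum(p.numel() for p in model.parameters())` on the hand-written model disagrees with this number, either a layer shape differs from the config or the embedding and head are not actually shared.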

**How to Contribute:**

1. Fork the repository containing the `SmolLM2_360M_model.py` file.
2. Debug the code and implement the missing functionality.
3. Submit a pull request with your changes.

By working together, we can make SmolLM2 more accessible and empower a wider range of users to leverage its capabilities. Thank you for your time and expertise!

P.S. Here's a technical breakdown of the process for creating a TorchScript version of the model and deploying it to various platforms:

**1. TorchScript Creation:**

* **Trace or Script:** TorchScript offers two ways to convert your PyTorch model: tracing and scripting. Tracing records the operations performed on example inputs, creating a static graph. Scripting directly parses the model code, supporting control flow. Scripting is preferred if your model uses dynamic control flow.
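```python
import torch

# `model` is the SmolLM2_360M instance built above, switched to eval mode.
model.eval()

# Tracing example: a language model takes integer token IDs, not image tensors.
# 49152 is an assumed vocabulary size; read the real value from config.json.
example_input = torch.randint(0, 49152, (1, 16), dtype=torch.long)
traced_model = torch.jit.trace(model, example_input)

# Scripting example
scripted_model = torch.jit.script(model)
```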
* **Optimization (Optional):** TorchScript provides optimization passes to improve the performance of the exported model.
```python
# optimize_for_inference freezes the module, which requires eval mode.
optimized_model = torch.jit.optimize_for_inference(scripted_model.eval())
```
* **Saving:** Save the TorchScript model to a file.
```python
torch.jit.save(optimized_model, "smolLM2_360m.pt")
```

**2. Deployment to Target Environments:**

* **C++:** LibTorch, the C++ API for PyTorch, can load and execute TorchScript models. Integrate LibTorch into your C++ application by compiling your code and linking against the LibTorch libraries. LibTorch expects a full operating system and a supported CPU architecture, so it suits Linux-class edge devices (including RISC-V boards, if you can build it from source for them) rather than bare-metal microcontrollers.
* **Android/iOS:** Use the respective PyTorch Mobile libraries for these platforms. These libraries offer optimized runtime environments for executing TorchScript models within mobile applications. Check the current PyTorch documentation first, since PyTorch now recommends ExecuTorch for new on-device deployments.
* **Other Edge Devices:** Depending on the device and its capabilities, explore options like using a custom runtime, or if available, a cross-compilation toolchain to target the device from your development environment.

**Example C++ Deployment (Simplified):**
```c++
#include <torch/script.h>

#include <vector>

int main() {
  // Load the TorchScript model
  torch::jit::Module module = torch::jit::load("smolLM2_360m.pt");
  module.eval();

  // Prepare input tensor: a batch of token IDs (placeholder zeros here;
  // real IDs come from the tokenizer)
  torch::Tensor input_tensor = torch::zeros({1, 16}, torch::kLong);

  // Run inference without tracking gradients
  torch::NoGradGuard no_grad;
  std::vector<torch::jit::IValue> inputs;
  inputs.push_back(input_tensor);  // Add input tensor(s)
  auto output = module.forward(inputs);

  // Process output
  // ... (Handle output tensor on the device) ...
  return 0;
}
```

**Key Considerations:**

* **Hardware Limitations:** Microcontrollers and other edge devices have limited resources. Model size and complexity may need adjustments (quantization, pruning) for optimal performance.
* **Platform-Specific Tooling:** Each target platform has its own build system and toolchain. Familiarize yourself with these tools for successful deployment.
* **Cross-Compilation:** If building directly on the target device isn't feasible, cross-compilation is necessary. This typically involves setting up a cross-compilation toolchain for the target architecture.
* **Debugging:** Debugging on edge devices can be challenging. Thoroughly testing the TorchScript model within a more accessible environment (e.g., your development machine) before deploying is essential.
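
To make the hardware-limitations point concrete, the weight storage alone can be estimated from the parameter count and the bytes per parameter of the chosen dtype. This is a rough sketch: it ignores activations, the KV cache, and runtime overhead, and the 360 million figure is taken from the model name rather than measured.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_footprint_mib(num_params, dtype):
    """Approximate size of the weights alone, in MiB (1 MiB = 1024 * 1024 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)


# Nominal parameter count from the model name; an approximation, not a measurement.
nominal_params = 360_000_000
for dtype in BYTES_PER_PARAM:
    print(f"{dtype:9s} {weight_footprint_mib(nominal_params, dtype):8.1f} MiB")
```

Even at 8-bit precision the weights are in the hundreds of MiB, which is why quantization helps on phones and single-board computers but does not by itself bring a model of this size within reach of typical microcontrollers.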

This expanded explanation provides a more complete roadmap for creating and deploying TorchScript versions of the SmolLM2 model. Remember to consult the official PyTorch and LibTorch documentation for platform-specific instructions and best practices.
| --- | |
| license: apache-2.0 | |
| --- | |