doubleblind committed on
Commit 99c33d6 · verified · 1 parent: 6dabb43

Update README.md

Files changed (1): README.md (+5 −6)
README.md CHANGED
@@ -14,18 +14,17 @@ This repository contains remote code and weights for a **Native Sparse Attention
 
 To use this model, please ensure the following dependencies are installed:
 
-#### 1. Install the required sparse attention library from our custom fork:
+#### Install the required Native Sparse Attention library from our custom fork:
 ```bash
 pip install git+https://github.com/fnite1604/native-sparse-attention-pytorch.git
 ```
 
-#### 2. Install other standard dependencies:
-These are handled automatically by the Transformers library and include:
+#### Install standard dependencies:
 ```bash
-pip install transformers torch
+pip install transformers torch ...
 ```
 
-Note: We recommend using Python 3.8+ and PyTorch 2.0+ for compatibility.
+Note: We recommend using the latest stable release of PyTorch (currently 2.7.0) with CUDA 12.6 and the latest available version of Transformers.
 
 ### Example Usage
 
@@ -35,6 +34,6 @@ A `quick_start.py` script is included to help you get started with inference:
 python quick_start.py
 ```
 
-This will load the model and generate text based on a predefined prompt using Native Sparse Attention.
+This will load the model and generate text based on a predefined prompt ("What is 1 + 1?") using our Native Sparse Attention-enabled reasoning model.
 
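For context, a minimal sketch of what a `quick_start.py`-style script could look like, assuming the standard `transformers` causal-LM loading path with `trust_remote_code=True` (suggested by the README's mention of "remote code"). The model id placeholder and generation settings below are assumptions for illustration, not taken from this diff:

```python
# Hedged sketch of a quick_start.py-style inference script.
# MODEL_ID is a hypothetical placeholder; the real repo id is not shown in this diff.
MODEL_ID = "your-org/your-nsa-model"  # assumption: replace with the actual repo id

def generate(prompt: str, model_id: str = MODEL_ID) -> str:
    # Imports are kept local so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True loads the custom Native Sparse Attention modeling code
    # shipped alongside the weights.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    # The predefined prompt mentioned in the updated README.
    print(generate("What is 1 + 1?"))
```

The actual script may differ in model id, decoding parameters, and device placement; this only illustrates the load-then-generate flow the README describes.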