docs: fix decompress example (decompress is local-only; stream from Hub instead)
README.md CHANGED
@@ -55,20 +55,25 @@ model = BigSmallStreamingModel.from_pretrained(
 
 Uses up to ~12× less VRAM than standard loading by streaming layers on demand.
 
-## Stream straight from the Hub (no disk)
-
-```python
-import bigsmall
-state_dict = bigsmall.stream_from_hub("wpferrell/phi-3.5-mini-instruct-bigsmall", device="cpu")
-```
-
-Decompresses directly from the HuggingFace CDN over HTTP range requests. With the default `cache=False`, no `.bs` file is ever written to disk (V10).
-
+## Stream straight from the Hub (no disk)
+
+```python
+import bigsmall
+state_dict = bigsmall.stream_from_hub("wpferrell/phi-3.5-mini-instruct-bigsmall", device="cpu")
+```
+
+Decompresses directly from the HuggingFace CDN over HTTP range requests. With the default `cache=False`, no `.bs` file is ever written to disk (V10).
+
 ## Decompress to safetensors
 
-```
-
-
+```python
+import bigsmall
+from safetensors.torch import save_file
+
+# bigsmall decompress works on local .bs files, not Hub repos, so
+# stream the weights from the Hub and write them out as safetensors.
+state_dict = bigsmall.stream_from_hub("wpferrell/phi-3.5-mini-instruct-bigsmall", device="cpu")
+save_file(state_dict, "phi-3.5-mini-instruct-bigsmall.safetensors")
 ```
 
 ## Original model
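Note on the "HTTP range requests" wording in the README text: the diff does not show how `bigsmall` issues those requests, so the following is only a minimal sketch of the general mechanism, using the Python standard library. The helper names `build_range_request` and `chunk_ranges` and the `example.com` URL are hypothetical and are not part of the `bigsmall` API.

```python
import urllib.request


def build_range_request(url: str, start: int, length: int) -> urllib.request.Request:
    """Build a GET request for `length` bytes beginning at byte offset `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends, hence the -1.
    end = start + length - 1
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})


def chunk_ranges(total_size: int, chunk_size: int):
    """Yield (start, length) pairs that together cover `total_size` bytes."""
    for start in range(0, total_size, chunk_size):
        yield start, min(chunk_size, total_size - start)


# Hypothetical URL, used only to show the header that would be sent.
request = build_range_request("https://example.com/model.bs", 0, 1024)
print(request.get_header("Range"))
```

A server that supports range requests answers such a request with status `206 Partial Content` and only the requested bytes, which is what allows a client to process a large file piece by piece without first saving it to disk.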