---

license: apache-2.0
datasets:
- google-research-datasets/paws
language:
- en
metrics:
- accuracy
base_model:
- HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: text-generation
---

# Introduction

![Running SLMs in web browsers](docs/thumb_small_language_model.jpg)

This repository is part of a [playbook for experiments on fine-tuning small language models](https://ashishware.com/2025/11/16/slm_in_browser/) using LoRA, exporting them to ONNX, and running them locally with ONNX-compatible runtimes: JavaScript (Node.js) and WASM (in the browser).

### Before you start

- Clone the repository: https://github.com/code2k13/onnx_javascript_browser_inference
- Copy all files from this repository into the `model_files` directory of the cloned GitHub repository.
- Run `npm install` in the cloned repository.

### To run the Node.js example (Node.js + onnxruntime, server side)

- Simply run `node app.js`

This is what you should see:

![NodeJS application showing paraphrasing screen](docs/slm_nodejs.gif)
![NodeJS runtime memory usage](node_runtime_ram.png)
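Under the hood, a demo like this generates text autoregressively: each onnxruntime forward pass returns logits, and the next token is chosen greedily. A minimal sketch of that selection step in plain JavaScript (the function name is illustrative, not taken from `app.js`):

```javascript
// Greedy decoding step: return the index (token id) of the largest logit.
// This mirrors the per-step selection an autoregressive demo performs
// after each ONNX forward pass; the name here is illustrative only.
function argmaxToken(logits) {
  let best = 0;
  for (let i = 1; i < logits.length; i++) {
    if (logits[i] > logits[best]) best = i;
  }
  return best;
}

console.log(argmaxToken([0.1, 2.5, -1.0, 2.4])); // → 1
```

The chosen token id is appended to the input sequence and the model is run again, until an end-of-sequence token or a length limit is reached.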

### To run the web browser demo (WASM-based in-browser inference)

- Simply access `web.html` from a local server (for example, `http://localhost:3000/web.html`)

This is what you should see:

![Web browser showing memory usage when running onnx model using WASM](docs/slm_web_wasm.gif)
![Web browser memory usage](docs/wasm_runtime_ram.png)
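Whether running under Node.js or in-browser WASM, an instruct-tuned model expects its input wrapped in a chat template before tokenization. Assuming SmolLM2 uses a ChatML-style template (an assumption; verify against the `tokenizer_config.json` shipped with these model files), the prompt might be assembled like this:

```javascript
// Assemble a ChatML-style chat prompt for an instruct model.
// The exact special tokens are an assumption here; check
// tokenizer_config.json in the model files for the real template.
function buildPrompt(userText) {
  return (
    "<|im_start|>user\n" +
    userText +
    "<|im_end|>\n" +
    "<|im_start|>assistant\n"
  );
}

console.log(buildPrompt("Paraphrase: The cat sat on the mat."));
```

The resulting string is what gets tokenized and fed to the ONNX session; generation then continues from the assistant turn.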

### Citation

```
@misc{allal2024SmolLM,
      title={SmolLM - blazingly fast and remarkably powerful}, 
      author={Loubna Ben Allal and Anton Lozhkov and Elie Bakouch and Leandro von Werra and Thomas Wolf},
      year={2024},
}
```