Model card: drop pipeline_tag (no HF hosted inference for .pte)
README.md
CHANGED
@@ -3,7 +3,6 @@ license: other
 license_name: futo-model-weights-license-1.0
 license_link: LICENSE.md
 library_name: executorch
-pipeline_tag: token-classification
 tags:
 - swipe-typing
 - gesture-typing
@@ -23,13 +22,13 @@ Mobile-oriented models for decoding swipe gestures into text.
 <img src="https://huggingface.co/futo-org/futo-swipe/resolve/main/animations/swipe_demo_computer.gif" width="640" alt="Swipe decode of the word 'computer'">
 </p>

-See the paper
-for details.
+See the paper [(*coming soon*)](https://huggingface.co/futo-org/futo-swipe)
+for more details.

 ## Models

-This repository contains 3 models that compose together.
-Only the encoder is required
+This repository contains 3 CNN models that compose together.
+Only the encoder is required. The decoder and language model are
 additional refinements, leveraging specific layout and language information.
 The encoder can decode for **any** keyboard layout,
 while the decoder is English/QWERTY-only and the language model is English-only.
@@ -161,8 +160,8 @@ greedy decode: computer
 | **input** | `layout_keys` | `[1, 64, 2]` | Per-key `(x, y)` centers, padded to 64 keys |
 | **input** | `layout_mask` | `[1, 64]` | Boolean mask of valid keys |
 | **output** | `log_emissions` | `[1, 32, 65]` | Log-probabilities over 64 keys + blank |
-| **output** | `coefficients` | `[1, 32, 64]` | Spectral coefficients |
-| **output** | `lambda` | `[1, 32, 1]` | *Intention* gate |
+| **output** | `coefficients` | `[1, 32, 64]` | Spectral coefficients (decoder features) |
+| **output** | `lambda` | `[1, 32, 1]` | *Intention* gate (decoder features) |

 The output time dimension is 32, half the 64 input points. The encoder
 applies a 2× temporal downsample (a stride-2 adapter) inside the network, so
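For readers of this diff, the `log_emissions` row (`[1, 32, 65]`, 64 keys + blank) can be illustrated with a minimal greedy-decode sketch. It is not taken from the repository. It assumes CTC-style collapsing (merge repeated frames, then drop blanks), that the blank is the last class (index 64), and that `keys` is a caller-supplied list mapping key indices to characters. The names `greedy_decode`, `one_hot_frame`, and `BLANK` are hypothetical.

```python
from itertools import groupby

NUM_CLASSES = 65   # 64 key slots + 1 blank, per the `log_emissions` shape
BLANK = 64         # assumption: blank is the last class index


def greedy_decode(log_emissions, keys, blank=BLANK):
    """Greedy CTC-style decode of a nested list shaped [1, T, C].

    `keys` maps key index -> character and is an assumed, caller-supplied
    layout; it is not something the model card defines.
    """
    frames = log_emissions[0]  # drop the batch dimension
    # Pick the most likely class in each time frame.
    best = [max(range(len(f)), key=f.__getitem__) for f in frames]
    # Merge consecutive repeats, then drop blanks.
    collapsed = [k for k, _ in groupby(best)]
    return "".join(keys[k] for k in collapsed if k != blank)


def one_hot_frame(index, num_classes=NUM_CLASSES):
    """Build a synthetic log-probability frame peaked at `index`."""
    frame = [-10.0] * num_classes
    frame[index] = 0.0
    return frame


# Synthetic example: 32 frames spelling "cat" on a hypothetical a-z layout.
keys = list("abcdefghijklmnopqrstuvwxyz") + [""] * 38
path = [2, 2, BLANK, 0, BLANK, 19, 19] + [BLANK] * 25
emissions = [[one_hot_frame(i) for i in path]]
decoded = greedy_decode(emissions, keys)
```

With real model output, `emissions` would be the `[1, 32, 65]` tensor converted to nested lists, and `keys` would follow the same ordering as `layout_keys`.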