# DeepMoji — ONNX
ONNX export of DeepMoji (via torchMoji).
Trained on 1.2 billion tweets to predict 64 emoji classes. The penultimate attention layer produces 2304-d emotional sentence embeddings.
## Files

| File | Size | Description |
|---|---|---|
| deepmoji_fp32.onnx | 90 MB | fp32, unrolled LSTM — fastest inference |
| deepmoji_fp16.onnx | 46 MB | fp16 — ~2× smaller, same predictions |
| deepmoji_int8.onnx | 51 MB | int8 dynamic quantization, Loop-op LSTM |
| vocabulary.json | 1.1 MB | 50 000-token vocabulary |
| emoji_codes.json | 1.4 KB | 64 emoji index → character mapping |
## Usage

Install the inference library:

```bash
pip install deepmoji-onnx
```

```python
from deepmoji_onnx import DeepMojiONNX

dm = DeepMojiONNX.from_pretrained()        # downloads fp32 automatically
dm = DeepMojiONNX.from_pretrained("fp16")
dm = DeepMojiONNX.from_pretrained("int8")

dm.top_emojis(["I love this so much!!!"], k=5)
# [{':hearts:': 0.17, ':blue_heart:': 0.10, ':heart:': 0.08, ...}]

probs = dm.predict(sentences)  # (N, 64) softmax probabilities
feats = dm.encode(sentences)   # (N, 64) — use the feature-mode export for 2304-d embeddings
```
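To make the `top_emojis` output concrete, here is a hypothetical NumPy reimplementation of the top-k selection step (the actual library internals may differ; `top_emojis` and the alias list below are illustrative only):

```python
import numpy as np

def top_emojis(probs: np.ndarray, emoji_codes: list, k: int = 5) -> list:
    """For each row of a (N, 64) probability matrix, return the k most
    probable emoji aliases mapped to their probabilities."""
    results = []
    for row in probs:
        top = np.argsort(row)[::-1][:k]  # indices of the k largest probabilities
        results.append({emoji_codes[i]: float(row[i]) for i in top})
    return results

# Toy example with 4 classes instead of 64:
codes = [":hearts:", ":joy:", ":fire:", ":sob:"]
print(top_emojis(np.array([[0.1, 0.6, 0.2, 0.1]]), codes, k=2))
# [{':joy:': 0.6, ':fire:': 0.2}]
```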
## Model inputs / outputs

| Name | Shape | Dtype | Description |
|---|---|---|---|
| tokens | (B, T) | int64 | Zero-padded token IDs |
| lengths | (B,) | int64 | Actual sequence lengths |
| output | (B, 64) | float32 | Emoji softmax probabilities |
Dynamic axes: batch size and sequence length.
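Batching variable-length inputs means zero-padding token IDs to the longest sequence and recording the true lengths. A minimal sketch (`pad_batch` is an illustrative helper, not part of the package; the token IDs are made up):

```python
import numpy as np

def pad_batch(seqs):
    """Zero-pad a list of token-ID sequences to a (B, T) int64 array
    and return the true lengths as a (B,) int64 array."""
    lengths = np.array([len(s) for s in seqs], dtype=np.int64)
    tokens = np.zeros((len(seqs), int(lengths.max())), dtype=np.int64)
    for i, s in enumerate(seqs):
        tokens[i, : len(s)] = s
    return tokens, lengths

tokens, lengths = pad_batch([[5, 12, 7], [3, 9]])
# tokens -> [[5, 12, 7], [3, 9, 0]], lengths -> [3, 2]
```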
## Conversion scripts

See the `convert/` folder for the scripts used to generate these files.

```bash
# install export dependencies
pip install "deepmoji-onnx[export]"

# fp32
python convert/export_fp32.py --weights pytorch_model.bin --vocab vocabulary.json

# fp16
python convert/export_fp16.py --weights pytorch_model.bin --vocab vocabulary.json

# int8
python convert/export_int8.py --weights pytorch_model.bin --vocab vocabulary.json
```
The original pytorch_model.bin (~86 MB) can be downloaded from:
https://www.dropbox.com/s/q8lax9ary32c7t9/pytorch_model.bin?dl=1
## Architecture

```
tokens (B, T)
 └─ Embedding(50000, 256) + tanh
 └─ BiLSTM[hard sigmoid](256 → 1024)
 └─ BiLSTM[hard sigmoid](1024 → 1024)
 └─ skip-concat [lstm_1 | lstm_0 | embed] → (B, T, 2304)
 └─ Self-attention pool → (B, 2304)
 └─ Linear(2304 → 64) + Softmax → (B, 64)
```
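The self-attention pool can be sketched as a length-masked softmax over time (the masking convention and single learned weight vector `w` are assumptions based on torchMoji's attention layer; parameters here are placeholders):

```python
import numpy as np

def attention_pool(h: np.ndarray, lengths: np.ndarray, w: np.ndarray) -> np.ndarray:
    """h: (B, T, D) skip-concat features, w: (D,) learned attention vector.
    Returns (B, D) attention-weighted averages over the valid time steps."""
    T = h.shape[1]
    scores = h @ w                                   # (B, T) unnormalized scores
    mask = np.arange(T)[None, :] < lengths[:, None]  # True on real (unpadded) steps
    scores = np.where(mask, scores, -np.inf)         # padding gets zero weight
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(scores)                         # exp(-inf) -> 0 on padding
    weights /= weights.sum(axis=1, keepdims=True)    # softmax over valid steps
    return (weights[:, :, None] * h).sum(axis=1)     # (B, D) pooled embedding
```

Padding steps receive exactly zero attention weight, so a padded batch gives the same embedding as running each sentence unpadded.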
Hard sigmoid: `clamp(0.2*x + 0.5, 0, 1)`, matching the original Keras/torchMoji definition.
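For reference, the hard sigmoid used by the LSTM gates in NumPy form:

```python
import numpy as np

def hard_sigmoid(x: np.ndarray) -> np.ndarray:
    """Piecewise-linear sigmoid: clamp(0.2*x + 0.5, 0, 1).
    Saturates at 0 for x <= -2.5 and at 1 for x >= 2.5."""
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

hard_sigmoid(np.array([-5.0, 0.0, 5.0]))  # values: 0.0, 0.5, 1.0
```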
## Citation

```bibtex
@inproceedings{felbo2017,
  title     = {Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm},
  author    = {Felbo, Bjarke and Mislove, Alan and S{\o}gaard, Anders and Rahwan, Iyad and Lehmann, Sune},
  booktitle = {EMNLP},
  year      = {2017}
}
```
## License
ONNX conversion code: Apache 2.0. Original DeepMoji model weights: MIT (bfelbo/DeepMoji).