
Table of Contents

  1. TL;DR
  2. Model Details
  3. Training Details
  4. Usage
  5. Evaluation
  6. Citation

TL;DR

Model Details

Model Description

  • Developed by: https://www.tii.ae
  • Model type: Causal decoder-only
  • Architecture: Hybrid Transformers + Mamba architecture
  • Language(s) (NLP): English
  • Number of Parameters: 90M
  • License: Falcon-LLM License

Training Details

For more details about the training protocol of this model, please refer to the Falcon-H1-Tiny technical blogpost.

Usage

Transformers.js

If you haven't already, install the Transformers.js JavaScript library from NPM:

npm i @huggingface/transformers

You can then use the model as follows:

import { pipeline, TextStreamer } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "onnx-community/Falcon-H1-Tiny-Multilingual-100M-Instruct-ONNX",
  { dtype: "q4", device: "webgpu" },
);

// Define the list of messages
const messages = [
  { role: "user", content: "What's the capital of France?" },
];

// Generate a response
const output = await generator(messages, {
  max_new_tokens: 512,
  do_sample: false,
  streamer: new TextStreamer(generator.tokenizer, {
    skip_prompt: true,
    skip_special_tokens: true,
  }),
});
console.log(output[0].generated_text.at(-1).content);

Evaluation

For a detailed evaluation of the Falcon-H1-Tiny series, please refer to our technical blogpost.

Citation

If the Falcon-H1-Tiny family of models was helpful to your work, please consider citing us:

@misc{falcon_h1_tiny,
  title={Falcon-H1-Tiny: A series of extremely small, yet powerful language models redefining capabilities at small scale},
  author={Falcon-LLM Team},
  year={2026}, 
}