# Models for Echo Application
This repository contains LiteRT-compatible language model variants used by the AI engine of the Echo application.
All models are optimized and validated specifically for the LiteRT-based framework on which the application's AI engine is built.
The variants listed below are standard, stable, and fully validated for chat functionality.
Repository link:
https://huggingface.co/ANISH-j/models-for-echo-application/tree/main
## Supported Model Variants
### 1. Gemma3-1B-IT_multi-prefill-seq_q4_ekv4096.litertlm
- Model family: Gemma 3
- Size: 1B parameters
- Quantization: Q4
- Format: LiteRT model (`.litertlm`)
- KV cache: extended KV (4096 tokens)
- Features:
- Multi-prefill sequence support
- Optimized memory usage
- Efficient long-context chat handling
Recommended for:
Chat scenarios requiring longer conversational context with optimized KV-cache performance.
### 2. gemma3-1b-it-int4.task
- Model family: Gemma 3
- Size: 1B parameters
- Quantization: INT4
- Format: LiteRT task model (`.task`)
- Features:
- Low-latency inference
- Compact model size
- Stable real-time chat performance
Recommended for:
Low-resource or latency-sensitive chat applications.
## Framework Compatibility
- Compatible with LiteRT runtime
- Tested with the Echo application AI engine
- Designed for instruction-tuned (IT) chat behavior
- Not intended for direct PyTorch or TensorFlow usage without conversion
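For illustration, here is a minimal sketch of loading the `.task` variant on Android via Google's MediaPipe LLM Inference API. This is an assumption, not the documented Echo integration: the helper class name, on-device model path, and token limit below are all hypothetical, and the snippet requires the `com.google.mediapipe:tasks-genai` Android dependency plus an Android `Context`.

```java
import android.content.Context;
import com.google.mediapipe.tasks.genai.llminference.LlmInference;
import com.google.mediapipe.tasks.genai.llminference.LlmInference.LlmInferenceOptions;

// Hypothetical helper class; sketch only, not the Echo engine's actual code.
public class EchoLlm {
    static String generate(Context context, String prompt) {
        // Assumes the .task file was copied to the device beforehand
        // (the path below is a placeholder, not part of this repository).
        LlmInferenceOptions options = LlmInferenceOptions.builder()
                .setModelPath("/data/local/tmp/llm/gemma3-1b-it-int4.task")
                .setMaxTokens(512)
                .build();
        LlmInference llm = LlmInference.createFromOptions(context, options);
        return llm.generateResponse(prompt);
    }
}
```

Because the `.litertlm` variant also targets the LiteRT runtime, the same loading pattern should apply to it, subject to the engine version supporting that format.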
## Repository Structure
```
models-for-echo-application/
├── Gemma3-1B-IT_multi-prefill-seq_q4_ekv4096.litertlm
├── gemma3-1b-it-int4.task
└── README.md
```
## License
Licensed under the Apache License 2.0.
You may use, modify, and distribute these models in compliance with the license.