---
license: apache-2.0
---

Models for Echo Application

This repository contains LiteRT-compatible language model variants used by the AI engine of the Echo application.
All models here are optimized and validated for the LiteRT runtime on which the application's AI engine is built.

The models listed below are stable, fully validated variants used for chat functionality.

Repository link:
https://huggingface.co/ANISH-j/models-for-echo-application/tree/main
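Individual files can be fetched directly over HTTP using the Hub's `resolve/main` URL pattern. A minimal JVM Kotlin sketch (the destination directory and the plain-`URL` download approach are illustrative; any HTTP client or the `huggingface-cli` tool works equally well):

```kotlin
import java.io.File
import java.net.URL

// Base URL follows the Hugging Face "resolve" pattern for this repository.
const val BASE_URL =
    "https://huggingface.co/ANISH-j/models-for-echo-application/resolve/main"

// Download one model file (e.g. "gemma3-1b-it-int4.task") into destDir
// and return the resulting local file.
fun downloadModel(fileName: String, destDir: File): File {
    val dest = File(destDir, fileName)
    URL("$BASE_URL/$fileName").openStream().use { input ->
        dest.outputStream().use { output -> input.copyTo(output) }
    }
    return dest
}
```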


Supported Model Variants

1. Gemma3-1B-IT_multi-prefill-seq_q4_ekv4096.litertlm

  • Model family: Gemma 3
  • Size: 1B parameters
  • Quantization: Q4
  • Format: LiteRT model (.litertlm)
  • KV Cache: Extended KV (4096)
  • Features:
    • Multi-prefill sequence support
    • Optimized memory usage
    • Efficient long-context chat handling

Recommended for:
Chat scenarios requiring longer conversational context with optimized KV-cache performance.


2. gemma3-1b-it-int4.task

  • Model family: Gemma 3
  • Size: 1B parameters
  • Quantization: INT4
  • Format: LiteRT task model (.task)
  • Features:
    • Low-latency inference
    • Compact model size
    • Stable real-time chat performance

Recommended for:
Low-resource or latency-sensitive chat applications.
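The `.task` format is the one consumed by MediaPipe's LLM Inference API on Android. A minimal Kotlin sketch of a single chat turn, assuming the `com.google.mediapipe:tasks-genai` dependency and a model file already present on the device (the path and token limit below are illustrative, not requirements of this repository):

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Sketch: load the INT4 .task model and generate one chat response.
// The model path is illustrative -- place the file wherever your app reads it from.
fun runEchoChat(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma3-1b-it-int4.task")
        .setMaxTokens(512) // illustrative cap for this example
        .build()

    val llm = LlmInference.createFromOptions(context, options)
    val response = llm.generateResponse(prompt)
    llm.close() // release the model's native resources
    return response
}
```

In a real app you would keep the `LlmInference` instance alive across turns rather than creating and closing it per call, since model loading dominates latency.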


Framework Compatibility

  • Compatible with LiteRT runtime
  • Tested with the Echo application AI engine
  • Designed for instruction-tuned (IT) chat behavior
  • Not intended for direct PyTorch or TensorFlow usage without conversion

Repository Structure

models-for-echo-application/
β”œβ”€β”€ Gemma3-1B-IT_multi-prefill-seq_q4_ekv4096.litertlm
β”œβ”€β”€ gemma3-1b-it-int4.task
└── README.md


License

Licensed under the Apache License 2.0.
You may use, modify, and distribute these models in compliance with the license.