---
license: mit
base_model:
  - Qwen/Qwen2-Math-1.5B
language:
  - en
pipeline_tag: text-generation
---

Outlook

We have quantised the model to 8-bit so that it can run inference on low-end GPUs at scale. The quantisation was done with the llama.cpp library.
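
To give a rough idea of what 8-bit quantisation does to the weights, here is a minimal sketch of block-wise symmetric quantisation in plain Python. It is not llama.cpp's code and not the exact procedure used for this model: the block size of 32 and all function names (`quantize_block`, `dequantize_block`, `quantize`, `dequantize`) are assumptions made for illustration.

```python
# Illustrative sketch only: block-wise symmetric 8-bit quantisation.
# BLOCK_SIZE and every function name here are assumptions for illustration,
# not taken from this model card or from llama.cpp's source.
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed number of weights that share one scale


def quantize_block(values: List[float]) -> Tuple[float, List[int]]:
    """Map one block of floats to a (scale, int8 codes) pair."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # An all-zero block gets scale 1.0 so that the division below is safe.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale: float, codes: List[int]) -> List[float]:
    """Recover approximate floats from one quantised block."""
    return [scale * c for c in codes]


def quantize(weights: List[float]) -> List[Tuple[float, List[int]]]:
    """Split the weights into blocks and quantise each block separately."""
    return [
        quantize_block(weights[i:i + BLOCK_SIZE])
        for i in range(0, len(weights), BLOCK_SIZE)
    ]


def dequantize(blocks: List[Tuple[float, List[int]]]) -> List[float]:
    """Concatenate the dequantised blocks back into one flat list."""
    restored: List[float] = []
    for scale, codes in blocks:
        restored.extend(dequantize_block(scale, codes))
    return restored
```

Each block stores one floating-point scale plus one signed byte per weight, so storage drops to roughly one byte per parameter, and the rounding error of any weight is at most half of its block's scale.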