## `lyrallms` Capability Matrix

| |Attention method| |MEMOPT mode| |KV cache precision| |
|:----|:----|:----|:----|:----|:----|:----|
|Model|Unfused|FlashAttn2|W4A16|W8A16|FP16|INT8|
|LLaMA|✅|✅|✅|✅|✅|✅|
|XVERSE|✅|✅|✅|✅|✅|✅|
|Baichuan 1/2 (7B and 13B)|✅|❌|✅|✅|✅|❌|
|ChatGLM|✅|❌|❌|✅|✅|❌|
|BELLE|✅|❌|❌|✅|✅|❌|
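
The matrix can also be written down as data, which makes it easy to check a configuration before converting a model. The snippet below is only an illustrative sketch: `FEATURES`, `SUPPORTED`, and `is_supported` are hypothetical names and not part of the `lyrallms` API, the key `"Baichuan"` stands for the Baichuan 1/2 (7B and 13B) row, and it assumes that the three option groups can be combined independently, which the matrix itself does not state.

```python
# Hypothetical helper, NOT part of lyrallms: the capability matrix as data.
FEATURES = ("Unfused", "FlashAttn2", "W4A16", "W8A16", "FP16", "INT8")

# For each model family, the set of options marked as supported in the matrix.
SUPPORTED = {
    "LLaMA": set(FEATURES),
    "XVERSE": set(FEATURES),
    "Baichuan": {"Unfused", "W4A16", "W8A16", "FP16"},
    "ChatGLM": {"Unfused", "W8A16", "FP16"},
    "BELLE": {"Unfused", "W8A16", "FP16"},
}


def is_supported(model: str, attn: str, memopt: str, kv_cache: str) -> bool:
    """Return True if every requested option is marked as supported for `model`.

    Assumption: options from different column groups combine freely.
    """
    requested = {attn, memopt, kv_cache}
    unknown = requested - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    if model not in SUPPORTED:
        raise ValueError(f"unknown model: {model}")
    return requested <= SUPPORTED[model]


if __name__ == "__main__":
    print(is_supported("LLaMA", "FlashAttn2", "W4A16", "INT8"))
    print(is_supported("ChatGLM", "Unfused", "W4A16", "FP16"))
```

Storing only the supported options per model keeps each entry a direct transcription of one table row, so the data is easy to compare against the matrix when the matrix changes.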

## `lyrallms` Usage

### Calibration

See the [README.md](./calibration/README.md) in the `calibration` folder.

### Converting and Running Accelerated Models in Python

#### LLaMA

See the [README.md](./LyraLlamaPy/README.md) in the `LyraLlamaPy` folder.

#### Baichuan

See the [README.md](./LyraBaichuanPy/README.md) in the `LyraBaichuanPy` folder.