amazon/FalconLite2

chenwuml committed on Nov 2, 2023

Commit 06ffb6e · 1 Parent(s): 2b8ab24

Update README.md

Files changed (1)

README.md +4 -2

@@ -30,7 +30,7 @@ FalconLite2 evolves from [FalconLite](https://huggingface.co/amazon/FalconLite),
 ## Deploy FalconLite2 on EC2 ##
 SSH login to an AWS `g5.12x` instance with the [Deep Learning AMI](https://aws.amazon.com/releasenotes/aws-deep-learning-ami-gpu-pytorch-2-0-ubuntu-20-04/).
-### Start TGI server
+### Start TGI server-1.0.3
 ```bash
 git clone https://github.com/awslabs/extending-the-context-length-of-open-source-llms.git falconlite-dev
 cd falconlite-dev/falconlite2
@@ -67,7 +67,9 @@ python falconlite_client.py -l
 **Important** - When using FalconLite2 for inference for the first time, it may require a brief 'warm-up' period that can take 10s of seconds. However, subsequent inferences should be faster and return results in a more timely manner. This warm-up period is normal and should not affect the overall performance of the system once the initialisation period has been completed.
 ## Deploy FalconLite2 on Amazon SageMaker ##
-To deploy FalconLite2 on a SageMaker endpoint, please follow [this notebook](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/falconlite2/sm_deploy.ipynb) running on a SageMaker Notebook instance (e.g. `g5.xlarge`).
+To deploy FalconLite2 on a SageMaker endpoint with TGI-1.0.3, please follow [this notebook](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/falconlite2/sm_deploy.ipynb) running on a SageMaker Notebook instance (e.g. `g5.xlarge`).
+To deploy FalconLite2 on a SageMaker endpoint with TGI-1.1.0, please follow [this notebook](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/falconlite2-tgi1.1.0/sm_deploy.ipynb) running on a SageMaker Notebook instance (e.g. `g5.xlarge`).
 ## Evalution Result ##
 We evaluated FalconLite2 against benchmarks that are specifically designed to assess the capabilities of LLMs in handling longer contexts.