# Chill Watcher
consider deploying on:
- huggingface inference endpoints
- replicate api
- lightning.ai
# platform comparison
> all support autoscaling

|platform|prediction speed|charges|ease of deployment|
|-|-|-|-|
|huggingface|fast: 20s|high: $0.6/hr (without autoscaling)|easy: git push|
|replicate|fast if used frequently: 30s; slow if it needs initialization: 5min|low: $0.02 per generation|difficult: build image and upload|
|lightning.ai|fast with app running: 20s; slow if idle: XXs|low: $30 free per month, $0.18 per init, $0.02 per run|easy: one command|
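
To compare the charges column at a given load, here is a rough sketch using only the prices from the table. The function name `monthly_cost_cents`, the 730-hour month, linear billing, and treating the lightning.ai free $30 as a simple deduction are all assumptions made for illustration, not platform facts.

```python
# Prices copied from the comparison table, in US cents to avoid float rounding.
HF_CENTS_PER_HOUR = 60         # huggingface: $0.6/hr, endpoint kept running (no autoscaling)
REPLICATE_CENTS_PER_RUN = 2    # replicate: $0.02 per generation
LIGHTNING_CENTS_PER_INIT = 18  # lightning.ai: $0.18 per init
LIGHTNING_CENTS_PER_RUN = 2    # lightning.ai: $0.02 per run
LIGHTNING_FREE_CENTS = 3000    # lightning.ai: $30 free per month


def monthly_cost_cents(runs_per_hour, hours=730, inits=1):
    """Rough monthly cost per platform in cents (hypothetical helper).

    Assumes linear billing, an always-on huggingface endpoint, and that the
    lightning.ai free amount is simply subtracted from the bill.
    """
    runs = runs_per_hour * hours
    lightning_raw = LIGHTNING_CENTS_PER_INIT * inits + LIGHTNING_CENTS_PER_RUN * runs
    return {
        "huggingface": HF_CENTS_PER_HOUR * hours,
        "replicate": REPLICATE_CENTS_PER_RUN * runs,
        "lightning.ai": max(0, lightning_raw - LIGHTNING_FREE_CENTS),
    }


if __name__ == "__main__":
    for platform, cents in monthly_cost_cents(runs_per_hour=5).items():
        print(f"{platform}: ${cents / 100:.2f}")
```

Under these assumptions, an always-on huggingface endpoint and replicate cost the same at 30 generations per hour; below that rate, per-generation pricing is cheaper.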
# platform deploy options
## huggingface
> [docs](https://huggingface.co/docs/inference-endpoints/guides/custom_handler)

- requirements: list pip packages in `requirements.txt`
- `init()` and `predict()` functions: in `handler.py`, implement the `EndpointHandler` class
- more: modify `handler.py` to customize request handling and inference, and to explore more highly customized features
- deploy: `git push` (with git-lfs) the whole directory, including models and weights, to a huggingface repository, then use inference endpoints to deploy. Deployment is click-and-go and runs automatically, so it is very simple.
- call api: once the endpoint is ready (built, initialized and in a "running" state), make a POST request to the url provided by inference endpoints, using the request schema defined in `handler.py`
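
A minimal sketch of what `handler.py` can look like. In the custom handler interface, the "init" step is `__init__(self, path)` and the "predict" step is `__call__(self, data)`. The placeholder model and the `"output"` key are assumptions for illustration, not this repository's actual handler.

```python
from typing import Any, Dict, List


class EndpointHandler:
    def __init__(self, path: str = ""):
        # Runs once when the endpoint starts. `path` is the repository
        # directory, so real code would load models and weights from it here.
        # Placeholder "model" (assumption) so the sketch runs anywhere.
        self.model = lambda text: text.upper()

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Runs for every request. `data` is the parsed JSON body of the POST
        # request, so the keys read here define the request schema.
        inputs = data.get("inputs", "")
        return [{"output": self.model(inputs)}]


if __name__ == "__main__":
    handler = EndpointHandler(path=".")
    print(handler({"inputs": "hello"}))
```

With this sketch, the POST body sent to the endpoint url would be JSON of the form `{"inputs": "..."}`.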
## replicate
> [docs](https://replicate.com/docs/guides/push-a-model)

- requirements: specify all requirements (pip packages, system packages, python version, cuda, etc.) in `cog.yaml`
- `init()` and `predict()` functions: in `predict.py`, implement the `Predictor` class
- more: modify `predict.py`
- deploy:
  1. get a linux GPU machine with 60GB of disk space
  2. install [cog](https://replicate.com/docs/guides/push-a-model) and [docker](https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository)
  3. `git pull` the current repository from huggingface, including the large model files
  4. once `predict.py` and `cog.yaml` are correct, run `cog login` and then `cog push`. cog builds a docker image locally and pushes it to replicate. The image can take around 30GB of disk space, so the push uses a lot of network bandwidth.
- call api: if everything runs successfully and the docker image is pushed to replicate, you will see a web-ui and an API example directly in your replicate repository
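
A minimal sketch of the `Predictor` class in `predict.py`. With cog, the "init" step is `setup()` and the inference step is `predict()`. The placeholder model, the `prompt` parameter, and the import fallback (only there so the sketch can run on a machine without cog) are assumptions for illustration.

```python
try:
    from cog import BasePredictor, Input
except ImportError:
    # Fallback stubs (assumption) so this sketch runs without cog installed.
    BasePredictor = object

    def Input(default=None, **kwargs):
        return default


class Predictor(BasePredictor):
    def setup(self):
        # Runs once when the container starts: load model weights here.
        # Placeholder "model" (assumption) that just reverses the text.
        self.model = lambda prompt: prompt[::-1]

    def predict(
        self,
        prompt: str = Input(description="text to process", default=""),
    ) -> str:
        # Runs for every request; typed arguments become the API inputs.
        return self.model(prompt)


if __name__ == "__main__":
    predictor = Predictor()
    predictor.setup()
    print(predictor.predict(prompt="hello"))
```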
## lightning.ai
> docs: [code](https://lightning.ai/docs/app/stable/levels/basic/real_lightning_component_implementations.html), [deploy](https://lightning.ai/docs/app/stable/workflows/run_app_on_cloud/)

- requirements:
  - pip packages are listed in `requirements_lightning.txt`, because some requirements differ from those for huggingface. Rename it to `requirements.txt`
  - other pip packages, system packages and download commands for big model weight files can be listed in a custom build config. Check out `class CustomBuildConfig(BuildConfig)` in `app.py`. A custom build config can run many linux commands, such as `wget` and `sudo apt-get update`. The custom build config is executed in the `__init__()` of the `PythonServer` class
- `init()` and `predict()` functions: in `app.py`, implement the `PythonServer` class. Note:
  - some packages are not installed yet when the file is first loaded (they may be installed when `__init__()` is called), so some imports should go inside the functions rather than at the top of the file, or you may get import errors.
  - you can't save your own values on the `self` of `PythonServer` unless they are predefined in its variables, so don't assign any self-defined variables to `self`
  - if you use the custom build config, you have to implement `PythonServer`'s `__init__()` yourself, so don't forget to use the correct function signature
- more: ...
- deploy:
  - `pip install lightning`
  - prepare the directory on your local computer (no GPU needed)
  - list big files in the `.lightningignore` file so they are not uploaded, which saves deploy time
  - run `lightning run app app.py --cloud` in the local terminal; it uploads the files in the directory to lightning cloud and starts deploying there
  - check error logs on the web-ui, using `all logs`
- call api: the `settings` page of the web-ui shows a valid url only once the app has started successfully. Open that url to see the api
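
The note about imports can be sketched as follows. `ServerSketch` is a hypothetical stand-in, not the real `PythonServer` subclass from `app.py`, and the stdlib module `statistics` stands in for a heavy package that is only installed by the build step. The point is only where the `import` statement lives.

```python
class ServerSketch:
    """Hypothetical stand-in for the `PythonServer` subclass in `app.py`."""

    def predict(self, request: dict) -> dict:
        # Deferred import: resolved when predict() runs, not when the file is
        # loaded, so a package installed later by the build step is found.
        import statistics  # stand-in (assumption) for a heavy dependency

        # Nothing is stored on `self`; results are returned directly.
        return {"mean": statistics.mean(request["values"])}


if __name__ == "__main__":
    print(ServerSketch().predict({"values": [1, 2, 3]}))
```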
### some stackoverflow:
install docker:
- https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository

install git-lfs:
- https://github.com/git-lfs/git-lfs/blob/main/INSTALLING.md

linux:
```
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
```
---
license: apache-2.0
---