---
title: LLMServer
emoji: 🐹
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
---
# LLM Server

This repository contains a FastAPI-based server that serves open-source Large Language Models from Hugging Face.

## Getting Started

These instructions will help you set up and run the project on your local machine.

### Prerequisites

- Python 3.10 or higher (note: Python 3.13 currently fails to load quantized models; see the error log at the end of this README)
- Git

### Cloning the Repository

Choose one of the following methods to clone the repository:

#### HTTPS

```bash
git clone https://huggingface.co/spaces/TeamGenKI/LLMServer
cd LLMServer
```

#### SSH

```bash
git clone git@hf.co:spaces/TeamGenKI/LLMServer
cd LLMServer
```

### Setting Up the Virtual Environment

#### Windows

```bash
# Create virtual environment
python -m venv myenv

# Activate virtual environment
myenv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

#### Linux

```bash
# Create virtual environment
python -m venv myenv

# Activate virtual environment
source myenv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

#### macOS

```bash
# Create virtual environment
python3 -m venv myenv

# Activate virtual environment
source myenv/bin/activate

# Install dependencies
pip3 install -r requirements.txt
```

### Running the Application

Once you have set up your environment and installed the dependencies, you can start the FastAPI application:

```bash
uvicorn main.app:app --reload
```

The API will be available at `http://localhost:8000` (uvicorn's default port; pass `--port` to change it).
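
For orientation, here is a minimal sketch of how `main/app.py` could expose the application object that `uvicorn main.app:app` loads. The router, endpoint body, and response shape are illustrative assumptions rather than this project's actual implementation; only the `/api/v1/model/download` path itself appears in the server log at the end of this README.

```python
# Minimal sketch (assumptions, not the project's actual code) of an app module
# that `uvicorn main.app:app` could load.
from fastapi import APIRouter, FastAPI

router = APIRouter()

@router.post("/model/download")
def download_model(model_name: str):
    # Placeholder: the real server downloads the model weights from Hugging Face.
    return {"status": "ok", "model_name": model_name}

app = FastAPI(title="LLM Server")
app.include_router(router, prefix="/api/v1")
```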

### API Documentation

Once the application is running, you can access:

- Interactive API documentation (Swagger UI) at `http://localhost:8000/docs`
- Alternative API documentation (ReDoc) at `http://localhost:8000/redoc`
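
The two model-management endpoints that appear in the server log at the end of this README can be exercised as in the sketch below. It assumes the server is running locally on port 8000 and that the `requests` package is installed; the model name is simply the one used in the log.

```python
# Hedged sketch: calling the model download/initialize endpoints seen in the
# server log. Host, port, and model name are assumptions; adjust as needed.
import requests

BASE_URL = "http://localhost:8000/api/v1"
MODEL = "microsoft/Phi-3.5-mini-instruct"

# Download the model weights from Hugging Face
resp = requests.post(f"{BASE_URL}/model/download", params={"model_name": MODEL})
print(resp.status_code, resp.text)

# Initialize the downloaded model for generation
resp = requests.post(f"{BASE_URL}/model/initialize", params={"model_name": MODEL})
print(resp.status_code, resp.text)
```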

### Deactivating the Virtual Environment

When you're done working on the project, you can deactivate the virtual environment:

```bash
deactivate
```

## Contributing

[Add contributing guidelines here]

## License

[Add license information here]

## Project Structure

```
.
├── Dockerfile
├── main
│   ├── api.py
│   ├── app.py
│   ├── config.yaml
│   ├── env_template
│   ├── __init__.py
│   ├── logs
│   │   └── llm_api.log
│   ├── models
│   ├── __pycache__
│   │   ├── api.cpython-39.pyc
│   │   ├── app.cpython-39.pyc
│   │   ├── __init__.cpython-39.pyc
│   │   └── routes.cpython-39.pyc
│   ├── routes.py
│   ├── test_locally.py
│   └── utils
│       ├── errors.py
│       ├── helpers.py
│       ├── __init__.py
│       ├── logging.py
│       ├── __pycache__
│       │   ├── helpers.cpython-39.pyc
│       │   ├── __init__.cpython-39.pyc
│       │   ├── logging.cpython-39.pyc
│       │   └── validation.cpython-39.pyc
│       └── validation.py
├── README.md
└── requirements.txt
```

## Known Error

Model initialization currently fails when the virtual environment uses Python 3.13: the bitsandbytes CUDA binary cannot be found, and the `transformers` bitsandbytes integration aborts because Dynamo does not support Python 3.13+.

```
INFO: 127.0.0.1:60874 - "POST /api/v1/model/download?model_name=microsoft%2FPhi-3.5-mini-instruct HTTP/1.1" 200 OK
2025-01-13 16:18:45,409 - api_routes - INFO - Received request to initialize model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,409 - llm_api - INFO - Initializing generation model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,412 - llm_api - INFO - Loading model from local path: main/models/Phi-3.5-mini-instruct
The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
Could not find the bitsandbytes CUDA binary at PosixPath('/home/aurelio/Desktop/Projects/LLMServer/myenv/lib/python3.13/site-packages/bitsandbytes/libbitsandbytes_cuda124.so')
g++ (GCC) 14.2.1 20240910
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
2025-01-13 16:18:45,982 - llm_api - ERROR - Failed to initialize generation model microsoft/Phi-3.5-mini-instruct: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
Dynamo is not supported on Python 3.13+
2025-01-13 16:18:45,982 - api_routes - ERROR - Error initializing model: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
Dynamo is not supported on Python 3.13+
INFO: 127.0.0.1:38330 - "POST /api/v1/model/initialize?model_name=microsoft%2FPhi-3.5-mini-instruct HTTP/1.1" 500 Internal Server Error
```
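
The deprecation warning in the log suggests replacing the `load_in_4bit`/`load_in_8bit` arguments with an explicit `BitsAndBytesConfig`. Below is a minimal sketch of that pattern, assuming a CUDA-capable environment on a supported Python version; the local model path is taken from the log, while the dtype and quantization settings are illustrative assumptions rather than this project's configuration.

```python
# Sketch: passing quantization settings via BitsAndBytesConfig instead of the
# deprecated load_in_4bit argument on from_pretrained. Settings are examples.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4-bit
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype (assumption)
    bnb_4bit_quant_type="nf4",              # NF4 quantization (assumption)
)

model = AutoModelForCausalLM.from_pretrained(
    "main/models/Phi-3.5-mini-instruct",    # local path from the log above
    quantization_config=quant_config,
    device_map="auto",
)
```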