---
title: Text2vector
emoji: 📊
colorFrom: purple
colorTo: green
sdk: docker
pinned: false
short_description: Create a vector embedding from text
---

# Embedding API

API to call an embedding model ([intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large)) for generating multilingual text embeddings.
The embedding model takes a text string and converts it into a 1024-dimensional vector.
Send a `POST` request to the `/embed` endpoint with a list of texts, and the API returns their corresponding embeddings.
A maximum of 2000 characters per text is enforced so that the tokenizer does not truncate the input and lose information.
Each text must start with either "query: " or "passage: ".
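
The two input rules above (a required prefix and a 2000-character limit) can be checked on the client side before a request is sent. The following is a minimal sketch, not the validation code used by this API: the function name `validate_texts` and the constants are hypothetical, and it assumes the 2000-character limit counts the prefix as part of the text.

```python
MAX_CHARS = 2000  # per-text limit stated above (assumed to include the prefix)
PREFIXES = ("query: ", "passage: ")  # required prefixes stated above


def validate_texts(texts: list[str]) -> list[str]:
    """Return a list of problems found; the list is empty if every text passes.

    Hypothetical helper for illustration; the server performs its own checks.
    """
    problems = []
    for i, text in enumerate(texts):
        # str.startswith accepts a tuple and matches any of its entries.
        if not text.startswith(PREFIXES):
            problems.append(f"texts[{i}]: must start with 'query: ' or 'passage: '")
        if len(text) > MAX_CHARS:
            problems.append(f"texts[{i}]: {len(text)} characters exceeds {MAX_CHARS}")
    return problems
```

Running such a check before calling the API avoids a round trip for inputs that would be rejected anyway.
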

The API is deployed as a Hugging Face Docker Space:
[https://emilbm-text2vector.hf.space](https://emilbm-text2vector.hf.space)

The Swagger UI can be accessed at:
[https://emilbm-text2vector.hf.space/docs](https://emilbm-text2vector.hf.space/docs)

## Features

- FastAPI-based REST API
- `/embed` endpoint for generating embeddings from a list of texts
- `/health` endpoint for checking the API status
- Uses HuggingFace Transformers and PyTorch
- Includes linting and unit tests
- Dockerfile for containerization
- CI/CD with GitHub Actions to build, lint, test, and deploy to Hugging Face

## Local Development

### Requirements

- Python 3.12+
- [UV](https://docs.astral.sh/uv/)
- (Optional) Docker

### Installation

1. **Clone the repository:**
   ```sh
   git clone https://github.com/EmilbMadsen/embedding-api.git
   cd embedding-api
   ```

2. **Create a virtual environment and activate it:**
   ```sh
   uv venv
   source .venv/bin/activate
   ```

3. **Install dependencies:**
   ```sh
   uv sync
   ```

### Formatting, Linting and Unit Tests

- **Formatting (with Black and Ruff) and linting (with Black, Ruff, and MyPy):**
  ```sh
  make format
  make lint
  ```

- **Run unit tests:**
  ```sh
  make test
  ```

### Running Locally (without Docker)

Start the API server with Uvicorn:

```sh
uvicorn app.main:app --reload --port 7860
```

### Running Locally (with Docker)

Build and start the API server with Docker:

```sh
docker build -t embedding-api .
docker run -p 7860:7860 embedding-api
```

### Test the endpoint

Test the endpoint with either:

```sh
curl -X 'POST' \
  'http://127.0.0.1:7860/embed' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "texts": [
    "query: what is the capital of France?",
    "passage: Paris is the capital of France."
  ]
}'
```

Or through the Swagger UI.

## Usage

### Embed Endpoint

- **POST** `/embed`
- **Request Body:**
  ```json
  {
    "texts": [
      "query: what is the capital of France?",
      "passage: Paris is the capital of France."
    ]
  }
  ```
- **Response:**
  ```json
  {
    "embeddings": [[...], [...]]
  }
  ```

### Health Endpoint

- **GET** `/health`
- **Response:**
  ```json
  {
    "status": "ok"
  }
  ```

## Project Structure

```
app/
  main.py              # FastAPI app
  embeddings.py        # Embedding logic
  models.py            # Request/response models
  logger.py            # Logging setup
tests/
  test_api.py          # API tests
  test_embeddings.py   # Embedding tests
```
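
## Python Client Sketch

For callers who prefer Python over curl, the `/embed` request and response shapes documented under Usage can be wrapped in a small client. This is a minimal sketch using only the standard library, not part of the repository: the names `build_embed_request` and `embed` are hypothetical, and the base URL assumes the server is running locally on port 7860 as described under Local Development.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7860"  # assumption: local server on port 7860


def build_embed_request(texts: list[str], base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /embed with the documented JSON body."""
    body = json.dumps({"texts": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def embed(texts: list[str], base_url: str = BASE_URL, timeout: float = 60.0) -> list[list[float]]:
    """Send the texts to /embed and return the "embeddings" field of the response."""
    request = build_embed_request(texts, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["embeddings"]
```

With a server running, `embed(["query: what is the capital of France?"])` should return a list containing one vector per input text. Splitting request construction from sending keeps the payload logic testable without network access.
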