Nick Starkov
---
title: Idioms
emoji: 🌍
colorFrom: pink
colorTo: blue
sdk: docker
pinned: false
short_description: Idioms API
---
# Idioms API
This is a Hugging Face Space that provides an API to retrieve 50 random idioms from the [UCSC-Admire/idiom-SFT-dataset-561-2024-12-06_00-40-30](https://huggingface.co/datasets/UCSC-Admire/idiom-SFT-dataset-561-2024-12-06_00-40-30) dataset. The API is built using Flask and runs in a Docker container.
## API Endpoint
- **URL**: `https://<username>-idioms.hf.space/api/idioms`
- **Method**: GET
- **Response**: JSON array of 50 idioms, each with the following fields:
  - `idiom`: The idiom phrase (from the dataset's `compound` field).
  - `example`: An example sentence using the idiom (from the dataset's `sentence` field).
  - `definition`: The compound type description (from the `Compound Type` field in the dataset's `output` JSON, with quotes removed).

Example response:
```json
[
  {
    "idiom": "across the board",
    "example": "The company implemented changes across the board to improve efficiency.",
    "definition": "Literal - Direct spatial description of marking a line across a surface"
  },
  ...
]
```
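
A client may want to confirm that a response matches the shape described above before using it. The following is a minimal sketch, not part of this Space: `validate_idioms` and `REQUIRED_FIELDS` are hypothetical names, and the check assumes all three fields are strings.

```python
import json

# Field names taken from the response description in this README.
REQUIRED_FIELDS = ("idiom", "example", "definition")


def validate_idioms(payload):
    """Return the parsed list if it matches the documented shape, else raise ValueError.

    `payload` may be a JSON string or an already-decoded Python object.
    """
    items = json.loads(payload) if isinstance(payload, str) else payload
    if not isinstance(items, list):
        raise ValueError("expected a JSON array")
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            raise ValueError(f"item {i} is not an object")
        missing = [f for f in REQUIRED_FIELDS if f not in item]
        if missing:
            raise ValueError(f"item {i} is missing fields: {missing}")
        # Assumption: every documented field is a string (possibly empty).
        if not all(isinstance(item[f], str) for f in REQUIRED_FIELDS):
            raise ValueError(f"item {i} has a non-string field")
    return items
```

The sketch deliberately does not enforce a length of 50, so it can also be used on truncated samples while debugging.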
## Setup Instructions
1. **Clone the Repository**:
   ```bash
   git clone https://huggingface.co/spaces/<username>/idioms
   cd idioms
   ```
2. **File Structure**:
   - `Dockerfile`: Defines the Docker container setup with Python 3.10-slim.
   - `app.py`: Flask application that serves the `/api/idioms` endpoint.
   - `requirements.txt`: Lists dependencies (`flask` and `datasets`).
   - `README.md`: This file.
3. **Push Changes**:
   ```bash
   git add .
   git commit -m "Update app with idioms API"
   git push
   ```
4. **Deploy**:
   - Hugging Face Spaces automatically builds and deploys the Docker container.
   - Check the Space's status in the Hugging Face interface.
5. **Test the API**:
   ```bash
   curl https://<username>-idioms.hf.space/api/idioms
   ```
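
The `curl` check in the last step can also be done from Python. This is a rough sketch using only the standard library. `build_url` and `fetch_idioms` are hypothetical helper names, and the URL pattern is simply the one shown in this README.

```python
import json
import urllib.request


def build_url(username, space="idioms"):
    """Build the endpoint URL from the pattern shown in this README."""
    return f"https://{username}-{space}.hf.space/api/idioms"


def fetch_idioms(url, timeout=30):
    """GET the endpoint and decode the JSON body.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    failing Space surfaces as an exception rather than an empty result.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_idioms(build_url("<username>"))` with a real username should return a list of idiom objects once the Space is running. A generous timeout helps if the Space is still starting up.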
## Dependencies
- `flask==2.3.3`: Web framework for the API.
- `datasets==2.21.0`: Hugging Face library to load the dataset.
## Notes
- Ensure your Hugging Face account has access to the dataset.
- The `output` field in the dataset is parsed as JSON to extract `Compound Type`. If parsing fails, an empty string is returned for `definition`.
- The API listens on port 8000. Docker Spaces expose port 7860 by default, so a different port has to be declared with `app_port` in the README's YAML front matter.

For issues, check the Space's logs in the Hugging Face interface.
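
The field mapping and the `output` fallback described in the notes could look roughly like the sketch below. It is not the code in `app.py`. The helper names are hypothetical, rows are assumed to be plain dicts, and "quotes removed" is assumed to mean dropping double-quote characters.

```python
import json
import random


def extract_definition(output):
    """Pull `Compound Type` out of the raw `output` string, or return ""."""
    try:
        parsed = json.loads(output)
    except (TypeError, ValueError):
        # Covers None, non-string input, and malformed JSON.
        return ""
    if not isinstance(parsed, dict):
        return ""
    value = parsed.get("Compound Type", "")
    if not isinstance(value, str):
        return ""
    # Assumption: "quotes removed" means dropping double-quote characters.
    return value.replace('"', "").strip()


def to_record(row):
    """Map one dataset row to the response fields listed in this README."""
    return {
        "idiom": row.get("compound", ""),
        "example": row.get("sentence", ""),
        "definition": extract_definition(row.get("output")),
    }


def sample_records(rows, k=50, rng=random):
    """Pick up to k rows at random, without replacement, and convert them."""
    rows = list(rows)
    picked = rng.sample(rows, min(k, len(rows)))
    return [to_record(r) for r in picked]
```

Capping `k` at `len(rows)` keeps `random.sample` from raising `ValueError` if fewer than 50 rows are available.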