	---
	title: LLaVA Chat
	emoji: 🖼️
	colorFrom: blue
	colorTo: indigo
	sdk: gradio
	sdk_version: 4.19.2
	app_file: app.py
	pinned: false
	license: mit
	---

	# LLaVA Chat

	A lightweight implementation of LLaVA (Large Language and Vision Assistant) that pairs a CLIP ViT-Base vision encoder with the TinyLlama-1.1B-Chat language model and is optimized for deployment on Hugging Face Spaces.

	## Features

	- Efficient model loading with 8-bit quantization
	- Memory-optimized inference
	- FastAPI backend with Gradio interface
	- Support for image understanding and visual conversations
	- Optimized for deployment on Hugging Face Spaces

	## Quick Start

	1. Visit the [Hugging Face Space](https://huggingface.co/spaces/Prashant26am/llava-chat)
	2. Upload an image
	3. Ask questions about the image
	4. Get AI-powered responses

	## Local Development

	1. Clone the repository:
	```bash
	git clone https://github.com/Prashant-ambati/llava-implementation.git
	cd llava-implementation
	```

	2. Install dependencies:
	```bash
	pip install -r requirements.txt
	```

	3. Run the application:
	```bash
	python llava-chat/app.py
	```

	## Model Architecture

	- Vision Model: CLIP ViT-Base
	- Language Model: TinyLlama-1.1B-Chat
	- Projection Layer: MLP with configurable hidden dimensions
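
To make the role of the projection layer concrete, the sketch below estimates how many parameters an MLP projector would need to map vision features into the language model's embedding space. It assumes the published hidden widths of CLIP ViT-Base (768) and TinyLlama-1.1B (2048). The two-layer layout, the hidden size of 2048, and the function names are illustrative assumptions, since this README only states that the hidden dimensions are configurable.

```python
# Assumed sizes: 768 and 2048 are the published hidden widths of
# CLIP ViT-Base and TinyLlama-1.1B. The two-layer MLP layout is a guess.
VISION_DIM = 768
LLM_DIM = 2048


def mlp_projector_params(in_dim: int, hidden_dim: int, out_dim: int) -> int:
    """Count weights and biases of Linear(in, hidden) -> activation -> Linear(hidden, out)."""
    first = in_dim * hidden_dim + hidden_dim
    second = hidden_dim * out_dim + out_dim
    return first + second


def projected_shape(num_patches: int, out_dim: int) -> tuple[int, int]:
    """Each patch embedding becomes one token-sized vector for the language model."""
    return (num_patches, out_dim)


if __name__ == "__main__":
    params = mlp_projector_params(VISION_DIM, hidden_dim=2048, out_dim=LLM_DIM)
    print(f"projector parameters: {params:,}")
    # 49 patches corresponds to a 224x224 image cut into 32-pixel patches
    # (the patch size is an assumption; the README does not state it).
    print("projected shape:", projected_shape(49, LLM_DIM))
```

Under these assumptions the projector holds roughly 5.8 million parameters, which is small next to the 1.1 billion in the language model.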

	## Memory Optimization

	The implementation includes several memory optimization techniques:
	- 8-bit quantization for the language model
	- Efficient image processing
	- Gradient checkpointing
	- Memory-efficient attention
	- Automatic mixed precision
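
As a rough illustration of why 8-bit quantization saves memory, the sketch below applies simple absmax quantization to a handful of weights and compares the weight storage for a 1.1B-parameter model at 16 and 8 bits. This is a simplified stand-in written in plain Python, not the quantization routine this project or any particular library uses, and the function names are hypothetical.

```python
def quantize_absmax(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the int8 values."""
    return [q * scale for q in quantized]


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Storage for the weights alone; activations and overhead are ignored."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    weights = [0.4, -1.0, 0.25, 0.0]
    q, scale = quantize_absmax(weights)
    restored = dequantize(q, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("int8 values:", q)
    print("worst round-trip error:", worst)
    print("fp16 weights (GB):", weight_memory_gb(1.1e9, 2))
    print("int8 weights (GB):", weight_memory_gb(1.1e9, 1))
```

The round-trip error stays within half a quantization step, while the weight storage halves relative to 16-bit floats.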

	## API Endpoints

	- `POST /process_image`: Process an image with a prompt
	- `GET /status`: Check model and application status
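
A client for these endpoints might look like the sketch below, which uses only the Python standard library. The README does not document the request or response format, so the base URL, the multipart field names `image` and `prompt`, and the assumption that `/status` returns JSON are all hypothetical; check `app.py` for the actual contract.

```python
import json
import urllib.request
import uuid

BASE_URL = "http://localhost:7860"  # hypothetical; point this at the running app


def build_process_image_request(image_bytes: bytes, prompt: str,
                                filename: str = "image.png") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for /process_image.

    The field names "image" and "prompt" are assumptions, not confirmed by the README.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="prompt"\r\n\r\n'
        f"{prompt}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/process_image",
        data=head + image_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def get_status() -> dict:
    """Call GET /status and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(f"{BASE_URL}/status", timeout=10) as resp:
        return json.load(resp)
```

To send the request, pass the object returned by `build_process_image_request` to `urllib.request.urlopen` and read the response body.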

	## License

	This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

	## Acknowledgments

	- Based on the paper "Visual Instruction Tuning" (NeurIPS 2023)
	- Uses models from Hugging Face Transformers
	- Built with FastAPI and Gradio