Add requirements.txt and installation guide

- Complete requirements.txt with all dependencies
- Detailed installation guide in Portuguese
- Performance tuning instructions
- Troubleshooting section

Files changed (2)

INSTALLATION_GUIDE.md +133 -0
requirements.txt +67 -0

INSTALLATION_GUIDE.md ADDED

	@@ -0,0 +1,133 @@

+# 🚀 Complete Installation Guide - LLaMA-Omni2
+## Prerequisites
+- Python 3.10+
+- CUDA 12.1+ (for GPU use)
+- 24 GB+ VRAM (RTX A5000 or better recommended)
+- Ubuntu 20.04+ or a compatible system
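The prerequisites above can be partly checked before installing anything. The following is a minimal sketch using only the standard library; the helper names `check_python` and `check_gpu_tools` are hypothetical and not part of the repository, and finding `nvidia-smi` on PATH only suggests that a driver is installed.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # taken from the prerequisites list


def check_python(version_info=sys.version_info) -> bool:
    """Return True when the interpreter meets the Python 3.10+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def check_gpu_tools() -> dict:
    """Report whether common NVIDIA command-line tools are on PATH.

    This does not prove that PyTorch can use CUDA; it only shows that the
    driver utility (nvidia-smi) or the CUDA compiler (nvcc) can be found.
    """
    return {tool: shutil.which(tool) is not None for tool in ("nvidia-smi", "nvcc")}


if __name__ == "__main__":
    print("Python 3.10+:", check_python())
    for tool, found in check_gpu_tools().items():
        print(f"{tool} on PATH:", found)
```

VRAM size is not checked here, since that requires either parsing `nvidia-smi` output or an installed PyTorch.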
+## Quick Installation
+### 1. Clone the repository
+```bash
+git clone https://huggingface.co/marcosremar2/llama-omni2-official-code
+cd llama-omni2-official-code
+```
+### 2. Run the automatic installation script
+```bash
+chmod +x install.sh
+./install.sh
+```
+## Manual Installation
+### 1. Create a virtual environment
+```bash
+python -m venv venv
+source venv/bin/activate  # Linux/Mac
+# or
+venv\Scripts\activate  # Windows
+```
+### 2. Install the dependencies
+```bash
+pip install --upgrade pip
+pip install -r requirements.txt
+```
+### 3. Install the project
+```bash
+pip install -e .
+```
+### 4. Download the models
+#### Whisper (Speech Recognition)
+```python
+import whisper
+model = whisper.load_model("base", download_root="models/")
+```
+#### Qwen 2.5 (LLM)
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
+model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
+```
+## Quick Test
+### 1. Basic system test
+```bash
+python simple_speech_chat.py
+```
+### 2. Audio test
+```bash
+python generate_test_audios.py
+python test_latency_final.py
+```
+## Project Structure
+```
+llama-omni2-official-code/
+├── llama_omni2/          # Main module
+│   ├── model/           # Models and architectures
+│   ├── serve/           # Web server and APIs
+│   └── inference/       # Inference scripts
+├── simple_speech_chat.py # Simple voice chat
+├── install.sh           # Installation script
+├── requirements.txt     # Python dependencies
+└── pyproject.toml      # Project configuration
+```
+## Performance Tuning
+### For lower latency (< 1000 ms)
+```python
+# In simple_speech_chat.py
+whisper_model = "base"  # Faster
+max_new_tokens = 20     # Short responses
+temperature = 0.0       # Greedy decoding
+```
+### For better quality
+```python
+whisper_model = "small"  # More accurate
+max_new_tokens = 50     # Fuller responses
+temperature = 0.7       # More creative
+```
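How `simple_speech_chat.py` consumes these values is not shown in this diff, so the following is only a sketch of one way the two configurations could be kept side by side and turned into keyword arguments for the Transformers `generate` method. The names `Preset` and `generation_kwargs` are hypothetical. The one detail worth encoding is that a temperature of 0.0 has to be expressed as `do_sample=False`, because sampling requires a strictly positive temperature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    whisper_model: str
    max_new_tokens: int
    temperature: float


# Values copied from the two configurations in the guide.
LOW_LATENCY = Preset(whisper_model="base", max_new_tokens=20, temperature=0.0)
HIGH_QUALITY = Preset(whisper_model="small", max_new_tokens=50, temperature=0.7)


def generation_kwargs(preset: Preset) -> dict:
    """Build keyword arguments intended for a Transformers `model.generate` call.

    A temperature of 0.0 is mapped to greedy decoding (do_sample=False)
    instead of being passed through as a sampling temperature.
    """
    kwargs = {"max_new_tokens": preset.max_new_tokens}
    if preset.temperature > 0.0:
        kwargs.update(do_sample=True, temperature=preset.temperature)
    else:
        kwargs["do_sample"] = False
    return kwargs


if __name__ == "__main__":
    print(generation_kwargs(LOW_LATENCY))
    print(generation_kwargs(HIGH_QUALITY))
```

With a loaded model, the result would be used as `model.generate(**inputs, **generation_kwargs(LOW_LATENCY))`.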
+## Troubleshooting
+### CUDA not available
+```bash
+# Check CUDA
+python -c "import torch; print(torch.cuda.is_available())"
+# Reinstall PyTorch with CUDA, keeping the versions pinned in requirements.txt
+pip install torch==2.3.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
+```
+### GPU out-of-memory error
+- Reduce batch_size
+- Use a smaller model (Qwen 0.5B instead of 1.5B)
+- Use quantization (bitsandbytes)
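To get a rough sense of how much the last two options help, the memory taken by the weights alone can be estimated as parameter count times bytes per parameter. The sketch below uses nominal parameter counts (0.5e9 and 1.5e9) as an assumption and ignores activations, the KV cache, quantization overhead, and the Whisper and TTS models, so real usage will be higher.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate memory for the model weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Nominal sizes; the exact parameter counts of the checkpoints differ slightly.
    for label, params in (("0.5B", 0.5e9), ("1.5B", 1.5e9)):
        for dtype in ("fp16", "int4"):
            print(f"{label} {dtype}: ~{weight_memory_gb(params, dtype):.2f} GB")
```

By this estimate the weights are a small part of a 24 GB card, which suggests that out-of-memory errors are more likely to come from batch size, sequence length, or the other models loaded alongside the LLM.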
+### Audio does not work
+```bash
+# Install ffmpeg
+sudo apt-get update && sudo apt-get install -y ffmpeg
+# Test gTTS
+python -c "from gtts import gTTS; tts = gTTS('teste', lang='pt'); tts.save('test.mp3')"
+```
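Before reinstalling, it can help to confirm whether ffmpeg is already visible to the Python process. A minimal sketch, assuming only the standard library; the helper name `ffmpeg_version` is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def ffmpeg_version() -> Optional[str]:
    """Return the first line of `ffmpeg -version`, or None if ffmpeg is not on PATH."""
    path = shutil.which("ffmpeg")
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = ffmpeg_version()
    print(version if version else "ffmpeg not found on PATH")
```

If this prints a version line inside the virtual environment but audio still fails, the problem is more likely in the Python audio packages than in ffmpeg itself.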
+## Support
+- Repository: https://huggingface.co/marcosremar2/llama-omni2-official-code
+- Issues: Open an issue on Hugging Face
+## License
+Apache 2.0 - See the LICENSE file for details

requirements.txt ADDED

	@@ -0,0 +1,67 @@

+# Core dependencies
+torch==2.3.1
+torchaudio==2.3.1
+transformers==4.43.4
+tokenizers==0.19.1
+sentencepiece==0.1.99
+# LLM dependencies
+accelerate==0.33.0
+peft==0.14.0
+bitsandbytes==0.45.0
+# Audio processing
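+# openai-whisper provides the `whisper` module; the unrelated PyPI package `whisper` would conflict with it
+openai-whisper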
+faster-whisper
+soundfile
+# librosa is pinned further down (librosa==0.10.2)
+scipy
+webrtcvad-wheels
+# TTS engines
+gtts
+edge-tts
+pyttsx3
+# Web server
+gradio==5.3.0
+gradio_client==1.4.2
+fastapi
+uvicorn[standard]
+websockets
+httpx
+# Utilities
+numpy==1.26.4
+scikit-learn==1.2.2
+pydantic==2.7.0
+markdown2[all]==2.5.2
+shortuuid==1.0.13
+einops
+onnxscript
+omegaconf
+pypinyin>=0.44.0
+pytorch-lightning
+setuptools>=69.5.1
+conformer==0.3.2
+diffusers==0.30.3
+grpcio==1.67.0
+hydra-core==1.3.2
+hyperpyyaml
+inflect==7.0.0
+librosa==0.10.2
+matplotlib
+matcha-tts
+ModelScope
+networkx
+onnx==1.17.0
+onnxruntime-gpu==1.20.0
+pydub
+rich
+rotary-embedding-torch
+sounddevice
+tensorboard
+tiktoken==0.8.0
+tqdm
+WeTextProcessing
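Because this file mixes pinned and unpinned entries across several groups, a quick check for names listed more than once can be useful before installing. The sketch below is an assumption about how such a check could look, not a tool from the repository; it compares names after PEP 503 style normalization, and the function names are hypothetical.

```python
import re
from collections import Counter

# Leading distribution name of a requirement line,
# e.g. "uvicorn" in "uvicorn[standard]>=0.30".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name: str) -> str:
    """Normalize a project name: lowercase, runs of '-', '_', '.' become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def duplicate_requirements(lines):
    """Return the sorted normalized names that appear on more than one line."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0]  # drop comments
        match = _NAME.match(line)
        if match:  # option lines such as "-r other.txt" do not match
            names.append(normalize(match.group(1)))
    return sorted(name for name, count in Counter(names).items() if count > 1)


if __name__ == "__main__":
    sample = ["scipy", "example-pkg", "Example_Pkg==1.0  # pinned", "# a comment"]
    print(duplicate_requirements(sample))
```

To run it against the real file, pass the open file object: `duplicate_requirements(open("requirements.txt", encoding="utf-8"))`.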