MarcosFRGames committed on
Commit a9f151a · verified · 1 Parent(s): 0feaca0

Upload 5 files

Files changed (5)
  1. Dockerfile (1).txt +26 -0
  2. README (4).md +191 -0
  3. app (3).py +365 -0
  4. gitattributes.txt +35 -0
  5. requirements (2).txt +5 -0
Dockerfile (1).txt ADDED
@@ -0,0 +1,26 @@
FROM python:3.11-slim

WORKDIR /app

# Install the system dependencies needed by llama-cpp-python
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    build-essential \
    curl \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first (for better Docker layer caching)
COPY requirements.txt .

# Install the Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application
COPY app.py .

# Expose the port
EXPOSE 7860

# Start command
CMD ["python", "-m", "gunicorn", "--bind", "0.0.0.0:7860", "--workers", "1", "--timeout", "120", "app:app"]
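
For local testing outside Docker, the same start command can be run directly. This is a sketch, assuming the packages from requirements.txt are installed in the active environment:

```bash
# Local equivalent of the container's start command
python -m gunicorn --bind 0.0.0.0:7860 --workers 1 --timeout 120 app:app
```
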
README (4).md ADDED
@@ -0,0 +1,191 @@
---
title: Ollama API Space
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
---

# 🚀 Ollama API Space

A Hugging Face Space that provides a REST API interface for Ollama models, allowing you to run local LLMs through a web API.

## 🌟 Features

- **Model Management**: List and pull Ollama models
- **Text Generation**: Generate text using any available Ollama model
- **REST API**: Simple HTTP endpoints for easy integration
- **Health Monitoring**: Built-in health checks and status monitoring
- **OpenWebUI Integration**: Compatible with OpenWebUI for a full chat interface

## 🚀 Quick Start

### 1. Deploy to Hugging Face Spaces

1. Fork this repository or create a new Space
2. Upload these files to your Space
3. **No environment variables needed** - Ollama runs inside the Space!
4. Wait for the build to complete (may take 10-15 minutes due to the Ollama installation)

### 2. Local Development

```bash
# Clone the repository
git clone <your-repo-url>
cd ollama-space

# Install dependencies
pip install -r requirements.txt

# Install Ollama locally
curl -fsSL https://ollama.ai/install.sh | sh

# Start Ollama in another terminal
ollama serve

# Run the application
python app.py
```

## 📡 API Endpoints

### GET `/api/models`
List all available Ollama models.

**Response:**
```json
{
  "status": "success",
  "models": ["llama2", "codellama", "neural-chat"],
  "count": 3
}
```
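
For example, with the Space running locally on port 7860 (see Local Development above; the host is a local placeholder, not a fixed address):

```bash
# List the models the Space currently exposes
curl http://localhost:7860/api/models
```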

### POST `/api/models/pull`
Pull (download) a model from the Ollama library into the instance.

**Request Body:**
```json
{
  "name": "llama2"
}
```

**Response:**
```json
{
  "status": "success",
  "model": "llama2"
}
```
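
The equivalent request with curl against a local instance:

```bash
# Ask the Space to pull llama2
curl -X POST http://localhost:7860/api/models/pull \
  -H "Content-Type: application/json" \
  -d '{"name": "llama2"}'
```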

### POST `/api/generate`
Generate text using a model.

**Request Body:**
```json
{
  "model": "llama2",
  "prompt": "Hello, how are you?",
  "temperature": 0.7,
  "max_tokens": 100
}
```

**Response:**
```json
{
  "status": "success",
  "response": "Hello! I'm doing well, thank you for asking...",
  "model": "llama2",
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 15,
    "total_tokens": 22
  }
}
```
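
The same request from the command line (local placeholder URL; adjust the model and prompt as needed):

```bash
# Generate a short completion with llama2
curl -X POST http://localhost:7860/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2", "prompt": "Hello, how are you?", "temperature": 0.7, "max_tokens": 100}'
```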

### GET `/health`
Health check endpoint.

**Response:**
```json
{
  "status": "healthy",
  "ollama_connection": "connected",
  "available_models": 3
}
```
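
A quick check against a local instance:

```bash
# Report service status and the Ollama connection
curl http://localhost:7860/health
```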

## 🔧 Configuration

### Environment Variables

- `OLLAMA_BASE_URL`: URL of your Ollama instance (default: `http://localhost:11434` - **Ollama runs inside this Space!**)
- `MODELS_DIR`: Directory for storing models (default: `/models`)
- `ALLOWED_MODELS`: Comma-separated list of allowed models (default: all models)

**Note**: This Space now includes Ollama installed directly inside it, so you don't need an external Ollama instance!

### Supported Models

By default, the following models are allowed:
- `llama2`
- `llama2:13b`
- `llama2:70b`
- `codellama`
- `neural-chat`

You can customize this list by setting the `ALLOWED_MODELS` environment variable.
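
For example, when developing locally (the values are illustrative):

```bash
# Restrict the API to two models before starting the app
export ALLOWED_MODELS="llama2,codellama"
export MODELS_DIR="/models"
python app.py
```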

## 🌐 Integration with OpenWebUI

This Space is designed to work seamlessly with OpenWebUI. You can:

1. Use this Space as a backend API for OpenWebUI
2. Configure OpenWebUI to connect to this Space's endpoints (see the sketch below)
3. Enjoy a full chat interface with your local Ollama models

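A minimal sketch of step 2, assuming OpenWebUI's Docker image and its `OLLAMA_BASE_URL` setting (both external to this repository), with a placeholder Space URL:

```bash
# Run OpenWebUI locally and point it at the Space's API
docker run -p 3000:8080 \
  -e OLLAMA_BASE_URL="https://<your-space>.hf.space" \
  ghcr.io/open-webui/open-webui:main
```
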
## 🐳 Docker Support

The Space includes a Dockerfile for containerized deployment:

```bash
# Build the image
docker build -t ollama-space .

# Run the container
docker run -p 7860:7860 -e OLLAMA_BASE_URL=http://host.docker.internal:11434 ollama-space
```

## 🔒 Security Considerations

- The Space only allows access to models specified in `ALLOWED_MODELS`
- All API endpoints are publicly accessible (consider adding authentication for production use)
- The Space connects to your Ollama instance - ensure proper network security

## 🚨 Troubleshooting

### Common Issues

1. **Connection to Ollama failed**: Check if Ollama is running and accessible
2. **Model not found**: Ensure the model is available in your Ollama instance
3. **Timeout errors**: Large models may take time to load - increase timeout values

### Health Check

Use the `/health` endpoint to monitor the Space's status and Ollama connection.

## 📝 License

This project is open source and available under the MIT License.

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## 📞 Support

If you encounter any issues or have questions, please open an issue on the repository.
app (3).py ADDED
@@ -0,0 +1,365 @@
from flask import Flask, request, jsonify, Response
import os
import logging
import time
from llama_cpp import Llama
import requests
import tempfile

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)

# TOKEN CONFIGURATION
MAX_CONTEXT_TOKENS = 1024 * 8
MAX_GENERATION_TOKENS = 1024 * 4

MODELS = [
    {
        "url": "https://huggingface.co/Novaciano/Qwen2.5-0.5B-NSFW_Amoral_Christmas-GGUF/resolve/main/Qwen2.5-0.5b-NSFW_Amoral_Christmas.gguf",
        "name": "qwen2.5-0.5b-nsfw-amoral-christmas"
    },
    {
        "url": "https://huggingface.co/afrideva/dolphin-2_6-phi-2_oasst2_chatML_V2-GGUF/resolve/main/dolphin-2_6-phi-2_oasst2_chatml_v2.q4_k_m.gguf",
        "name": "phi-2"
    }
]

class LLMManager:
    def __init__(self, models_config):
        self.models = {}
        self.models_config = models_config
        self.load_all_models()

    def load_all_models(self):
        """Load every configured model into RAM."""
        for model_config in self.models_config:
            try:
                model_name = model_config["name"]
                logging.info(f"🚀 Loading model: {model_name}")

                temp_path = self._download_model(model_config["url"])

                actual_size = os.path.getsize(temp_path)
                actual_gb = actual_size / (1024 * 1024 * 1024)
                logging.info(f"📊 Downloaded size for {model_name}: {actual_gb:.2f} GB")

                logging.info(f"🔄 Loading {model_name} into RAM…")
                llm_instance = Llama(
                    model_path=temp_path,
                    n_ctx=MAX_CONTEXT_TOKENS,
                    n_batch=128,
                    n_threads=6,
                    n_threads_batch=6,
                    use_mlock=True,
                    use_mmap=True,
                    low_vram=False,
                    vocab_only=False
                )

                os.remove(temp_path)

                self.models[model_name] = {
                    "instance": llm_instance,
                    "loaded": True,
                    "config": model_config
                }
                logging.info(f"✅ Model {model_name} loaded")

            except Exception as e:
                logging.error(f"❌ Error loading model {model_config['name']}: {e}")
                self.models[model_config["name"]] = {
                    "instance": None,
                    "loaded": False,
                    "config": model_config,
                    "error": str(e)
                }

    def _download_model(self, model_url):
        """Download a GGUF model to a temporary file and return its path."""
        temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".gguf")
        temp_path = temp_file.name
        temp_file.close()

        logging.info("📥 Downloading model…")

        response = requests.get(model_url, stream=True, timeout=300)
        response.raise_for_status()

        downloaded = 0
        with open(temp_path, 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                if chunk:
                    f.write(chunk)
                    downloaded += len(chunk)

        return temp_path

    def get_model(self, model_name):
        """Return the stored entry for a model name, or None."""
        return self.models.get(model_name)

    def chat_completion(self, model_name, messages, **kwargs):
        """Generate a chat completion with a specific model."""
        model_data = self.get_model(model_name)

        if not model_data or not model_data["loaded"]:
            error_msg = f"Model {model_name} is not loaded"
            if model_data and "error" in model_data:
                error_msg += f": {model_data['error']}"
            return {"error": error_msg}

        response = model_data["instance"].create_chat_completion(
            messages=messages,
            **kwargs
        )

        response["provider"] = "telechars-ai"
        response["model"] = model_name
        return response

    def get_loaded_models(self):
        """Return the names of the models that loaded successfully."""
        loaded = []
        for name, data in self.models.items():
            if data["loaded"]:
                loaded.append(name)
        return loaded

    def get_all_models_status(self):
        """Return the load status of every configured model."""
        status = {}
        for name, data in self.models.items():
            status[name] = {
                "loaded": data["loaded"],
                "url": data["config"]["url"]
            }
            if "error" in data:
                status[name]["error"] = data["error"]
        return status

# Initialize the manager with all configured models
llm_manager = LLMManager(MODELS)

@app.route('/')
def home():
    loaded_models = llm_manager.get_loaded_models()
    status_html = "<ul>"
    for model_name, model_data in llm_manager.models.items():
        status = "✅ YES" if model_data["loaded"] else "❌ NO"
        status_html += f"<li>{model_name}: {status}</li>"
    status_html += "</ul>"

    return f'''
    <!DOCTYPE html>
    <html>
    <head>
        <title>TeleChars AI API</title>
        <style>
            body {{ font-family: Arial, sans-serif; margin: 40px; }}
            .config {{ background: #f0f0f0; padding: 15px; border-radius: 5px; margin-bottom: 20px; }}
            .endpoint {{ background: #e8f4f8; padding: 10px; border-left: 4px solid #2196F3; margin: 10px 0; }}
        </style>
    </head>
    <body>
        <h1>TeleChars AI API</h1>

        <div class="config">
            <h3>⚙️ Configuration</h3>
            <p><strong>Max Context Tokens:</strong> {MAX_CONTEXT_TOKENS}</p>
            <p><strong>Max Generation Tokens:</strong> {MAX_GENERATION_TOKENS}</p>
        </div>

        <h2>📦 Loaded models:</h2>
        {status_html}
        <p>Total models: {len(loaded_models)}/{len(MODELS)}</p>

        <h2>🔗 Available endpoints:</h2>
        <div class="endpoint">
            <strong>GET /generate/&lt;message&gt;[?params]</strong><br>
            Returns only the generated text. Optional parameters:<br>
            • system= (system instructions)<br>
            • temperature= (0.0-2.0)<br>
            • top_p= (0.0-1.0)<br>
            • model= (model name)<br>
            • max_tokens= (maximum tokens to generate, default: {MAX_GENERATION_TOKENS})
        </div>

        <div class="endpoint">
            <strong>POST /v1/chat/completions</strong><br>
            OpenAI API compatible
        </div>

        <div class="endpoint">
            <strong>GET /health</strong><br>
            Service status
        </div>

        <div class="endpoint">
            <strong>GET /models</strong><br>
            Lists all available models
        </div>
    </body>
    </html>
    '''

@app.route('/v1/chat/completions', methods=['POST'])
def chat_completions():
    try:
        data = request.get_json()
        messages = data.get('messages', [])
        model_name = data.get('model', MODELS[0]["name"])

        if model_name not in llm_manager.models:
            return jsonify({"error": f"Model '{model_name}' not found. Available models: {list(llm_manager.models.keys())}"}), 400

        kwargs = {}
        for key in data.keys():
            if key not in ['messages', 'model']:
                kwargs[key] = data[key]

        # Apply the generation limit when max_tokens is not specified
        if 'max_tokens' not in kwargs:
            kwargs['max_tokens'] = MAX_GENERATION_TOKENS

        result = llm_manager.chat_completion(model_name, messages, **kwargs)

        if "error" in result:
            return jsonify(result), 500

        return jsonify(result), 200

    except Exception as e:
        return jsonify({"error": str(e)}), 500

@app.route('/generate/<path:user_message>', methods=['GET'])
def generate_endpoint(user_message):
    """GET endpoint for generating responses - returns plain text only."""
    try:
        # Read GET parameters with default values
        system_instruction = request.args.get('system', 'You are a helpful assistant.')
        temperature = float(request.args.get('temperature', 0.7))
        top_p = float(request.args.get('top_p', 0.95))
        model_name = request.args.get('model', MODELS[0]["name"])
        max_tokens = int(request.args.get('max_tokens', MAX_GENERATION_TOKENS))

        # Validate ranges
        if not 0 <= temperature <= 2:
            return Response(
                "Error: the 'temperature' parameter must be between 0 and 2",
                status=400,
                mimetype='text/plain'
            )

        if not 0 <= top_p <= 1:
            return Response(
                "Error: the 'top_p' parameter must be between 0 and 1",
                status=400,
                mimetype='text/plain'
            )

        # Cap max_tokens at the configured maximum
        if max_tokens > MAX_GENERATION_TOKENS:
            max_tokens = MAX_GENERATION_TOKENS

        # Check that the requested model exists
        if model_name not in llm_manager.models:
            return Response(
                f"Error: model '{model_name}' not found. Available models: {', '.join(llm_manager.models.keys())}",
                status=400,
                mimetype='text/plain'
            )

        # Build the messages
        messages = [
            {"role": "system", "content": system_instruction},
            {"role": "user", "content": user_message}
        ]

        # Set generation parameters
        kwargs = {
            "temperature": temperature,
            "top_p": top_p,
            "max_tokens": max_tokens,
            "stream": False
        }

        # Generate the response
        result = llm_manager.chat_completion(model_name, messages, **kwargs)

        if "error" in result:
            return Response(
                f"Error: {result['error']}",
                status=500,
                mimetype='text/plain'
            )

        response_text = result.get("choices", [{}])[0].get("message", {}).get("content", "")

        if not response_text:
            response_text = "No response was generated"

        # Return plain text only
        return Response(
            response_text,
            status=200,
            mimetype='text/plain'
        )

    except ValueError as e:
        return Response(
            f"Error: invalid parameters - {str(e)}. Make sure temperature, top_p and max_tokens are valid numbers.",
            status=400,
            mimetype='text/plain'
        )
    except Exception as e:
        return Response(
            f"Error: {str(e)}",
            status=500,
            mimetype='text/plain'
        )

@app.route('/health', methods=['GET'])
def health():
    loaded_models = llm_manager.get_loaded_models()
    return jsonify({
        "status": "healthy" if len(loaded_models) > 0 else "error",
        "loaded_models": loaded_models,
        "total_models": len(MODELS),
        "config": {
            "max_context_tokens": MAX_CONTEXT_TOKENS,
            "max_generation_tokens": MAX_GENERATION_TOKENS
        }
    })

@app.route('/models', methods=['GET'])
def list_models():
    """List every configured model and its status."""
    return jsonify({
        "available_models": MODELS,
        "status": llm_manager.get_all_models_status(),
        "config": {
            "max_context_tokens": MAX_CONTEXT_TOKENS,
            "max_generation_tokens": MAX_GENERATION_TOKENS
        }
    })

@app.route('/models/<model_name>', methods=['GET'])
def get_model_status(model_name):
    """Return the status of a single model."""
    model_data = llm_manager.get_model(model_name)
    if not model_data:
        return jsonify({"error": f"Model '{model_name}' not found"}), 404

    return jsonify({
        "model": model_name,
        "loaded": model_data["loaded"],
        "url": model_data["config"]["url"],
        "error": model_data.get("error"),
        "config": {
            "max_context_tokens": MAX_CONTEXT_TOKENS,
            "max_generation_tokens": MAX_GENERATION_TOKENS
        }
    })

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=7860, debug=False)
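
For a quick end-to-end check once the app is running on port 7860, here is a sketch of example requests (the model name `phi-2` comes from the `MODELS` list above; adjust it to your configuration):

```bash
# Plain-text generation via the GET endpoint (URL-encode the message)
curl "http://localhost:7860/generate/Hello%20there?model=phi-2&temperature=0.7&max_tokens=128"

# OpenAI-compatible chat completion
curl -X POST http://localhost:7860/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "phi-2", "messages": [{"role": "user", "content": "Hello there"}], "max_tokens": 128}'

# Service status and the list of loaded models
curl http://localhost:7860/health
```
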
gitattributes.txt ADDED
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
requirements (2).txt ADDED
@@ -0,0 +1,5 @@
llama-cpp-python==0.3.1
gunicorn>=21.2.0
flask>=2.3.3
requests>=2.31.0
psutil>=5.9.6