haneph033 committed on
Commit 2c4cdea · 1 Parent(s): efc0d7e

update app first

Files changed (4)
  1. .gitignore +140 -0
  2. README.md +114 -5
  3. app.py +303 -0
  4. requirements.txt +10 -0
.gitignore ADDED
@@ -0,0 +1,140 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ MANIFEST
+
+ # PyInstaller
+ *.manifest
+ *.spec
+
+ # Installer logs
+ pip-log.txt
+ pip-delete-this-directory.txt
+
+ # Unit test / coverage reports
+ htmlcov/
+ .tox/
+ .coverage
+ .coverage.*
+ .cache
+ nosetests.xml
+ coverage.xml
+ *.cover
+ .hypothesis/
+ .pytest_cache/
+
+ # Translations
+ *.mo
+ *.pot
+
+ # Django stuff:
+ *.log
+ local_settings.py
+ db.sqlite3
+
+ # Flask stuff:
+ instance/
+ .webassets-cache
+
+ # Scrapy stuff:
+ .scrapy
+
+ # Sphinx documentation
+ docs/_build/
+
+ # PyBuilder
+ target/
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # pyenv
+ .python-version
+
+ # celery beat schedule file
+ celerybeat-schedule
+
+ # SageMath parsed files
+ *.sage.py
+
+ # Environments
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ env.bak/
+ venv.bak/
+
+ # Spyder project settings
+ .spyderproject
+ .spyproject
+
+ # Rope project settings
+ .ropeproject
+
+ # mkdocs documentation
+ /site
+
+ # mypy
+ .mypy_cache/
+ .dmypy.json
+ dmypy.json
+
+ # Temporary files
+ *.tmp
+ *.temp
+ temp/
+ tmp/
+
+ # Audio files
+ *.mp3
+ *.wav
+ *.ogg
+ *.m4a
+
+ # Model files (if too large)
+ *.bin
+ *.safetensors
+
+ # Hugging Face cache
+ .cache/
+ huggingface/
+
+ # OS generated files
+ .DS_Store
+ .DS_Store?
+ ._*
+ .Spotlight-V100
+ .Trashes
+ ehthumbs.db
+ Thumbs.db
+
+ # IDE files
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # Logs
+ logs/
+ *.log
README.md CHANGED
@@ -1,12 +1,121 @@
  ---
- title: Ersi
- emoji: 🌖
- colorFrom: yellow
  colorTo: purple
  sdk: gradio
- sdk_version: 5.49.0
  app_file: app.py
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Health Article Generator
+ emoji: 🏥
+ colorFrom: blue
  colorTo: purple
  sdk: gradio
+ sdk_version: 5.43.1
  app_file: app.py
  pinned: false
  ---

+ # Health Article Generator
+
+ An AI application that generates health articles with the Meta Llama 3.1 8B Instruct model, with text-to-speech and MP3 audio download.
+
+ ## 🚀 Features
+
+ - **AI Text Generation**: Uses Meta Llama 3.1 8B Instruct to generate high-quality health articles
+ - **Health Topics**: 20+ health topics to choose from
+ - **Customizable Length**: Choice of article length (Short, Medium, Long)
+ - **Subtopics**: Up to 5 optional subtopics to focus the article
+ - **Text-to-Speech**: Converts the article to audio with Google TTS
+ - **Download Audio**: Download the resulting audio in MP3 format
+ - **Modern UI**: User-friendly interface built with Gradio 5.43.1
+
+ ## 📋 Requirements
+
+ - Python 3.8+
+ - CUDA (optional, for GPU acceleration)
+ - 8GB+ RAM (for the Llama 3.1 8B model)
+
+ ## 🛠️ Installation
+
+ 1. Clone this repository:
+ ```bash
+ git clone <repository-url>
+ cd ersi
+ ```
+
+ 2. Install dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. Run the application:
+ ```bash
+ python app.py
+ ```
+
+ ## 🚀 Deployment to Hugging Face Spaces
+
+ 1. Create an account on [Hugging Face](https://huggingface.co)
+ 2. Create a new Space of type "Gradio"
+ 3. Upload all files to the Space repository
+ 4. Set environment variables if needed
+ 5. The Space deploys automatically
+
+ ### Files required for deployment:
+ - `app.py` - Main application
+ - `requirements.txt` - Dependencies
+ - `README.md` - Documentation
+ - `.gitignore` - Git ignore file
+
+ ## 📖 How to Use
+
+ 1. **Choose a Topic**: Select a health topic from the dropdown
+ 2. **Set the Length**: Choose the desired article length
+ 3. **Add Subtopics** (Optional): Enter up to 5 subtopics to focus the article
+ 4. **Generate**: Click the "Generate Article" button
+ 5. **Convert to Speech**: Click "Convert to Speech" to generate audio
+ 6. **Download**: Download the generated MP3 file
+
+ ## 🎯 Available Health Topics
+
+ - Nutrition and Healthy Diet (Nutrisi dan Diet Sehat)
+ - Exercise and Fitness (Olahraga dan Kebugaran)
+ - Mental Health (Kesehatan Mental)
+ - Heart Disease (Penyakit Jantung)
+ - Diabetes and Blood Sugar (Diabetes dan Gula Darah)
+ - Digestive Health (Kesehatan Pencernaan)
+ - Skin Health (Kesehatan Kulit)
+ - Eye Health (Kesehatan Mata)
+ - Dental and Oral Health (Kesehatan Gigi dan Mulut)
+ - Reproductive Health (Kesehatan Reproduksi)
+ - Child Health (Kesehatan Anak)
+ - Elderly Health (Kesehatan Lansia)
+ - Cancer Prevention (Pencegahan Kanker)
+ - Bone and Joint Health (Kesehatan Tulang dan Sendi)
+ - Respiratory Health (Kesehatan Pernapasan)
+ - Liver Health (Kesehatan Hati)
+ - Kidney Health (Kesehatan Ginjal)
+ - Nerve Health (Kesehatan Saraf)
+ - Cardiovascular Health (Kesehatan Kardiovaskular)
+ - Immune Health (Kesehatan Imunitas)
+
+ ## 🔧 Technical Details
+
+ - **Model**: Meta-Llama-3.1-8B-Instruct
+ - **Framework**: Gradio 5.43.1
+ - **TTS Engine**: Google Text-to-Speech (gTTS)
+ - **Audio Format**: MP3
+ - **Language**: Indonesian
+
+ ## 📝 Notes
+
+ - The model is downloaded automatically on first run
+ - For best performance, use a CUDA-capable GPU
+ - Audio generation requires an internet connection for Google TTS
+ - Temporary audio files are deleted automatically
+
+ ## 🤝 Contributing
+
+ Pull requests and suggestions are welcome! For major changes, please open an issue first.
+
+ ## 📄 License
+
+ MIT License - see the LICENSE file for details.
+
+ ## 🆘 Support
+
+ If you run into problems, please open an issue in this repository or contact the developer.
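The README's notes state that temporary audio files are deleted automatically. In the committed `app.py`, `text_to_speech` creates the MP3 with `NamedTemporaryFile(delete=False)`, and no code in this diff removes the file afterwards, so some explicit cleanup would be needed for that note to hold. Below is a minimal sketch of one possible approach using only the standard library. `TempAudioRegistry` and its method names are hypothetical and are not part of the commit.

```python
import atexit
import os
import tempfile


class TempAudioRegistry:
    """Hypothetical helper: tracks temp audio paths and deletes them on demand."""

    def __init__(self):
        self._paths = set()

    def new_path(self, suffix=".mp3"):
        # Reserve a file name the same way app.py does, but remember it.
        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp_file:
            path = tmp_file.name
        self._paths.add(path)
        return path

    def cleanup(self):
        # Remove every tracked file that still exists; skip ones already gone.
        for path in list(self._paths):
            try:
                os.remove(path)
            except FileNotFoundError:
                pass
            self._paths.discard(path)


registry = TempAudioRegistry()
atexit.register(registry.cleanup)  # best-effort cleanup at interpreter exit
```

A handler could call `registry.new_path()` instead of creating the temporary file directly. Whether cleanup at exit is enough for a long-running Space, or files should be removed after each request, is a design choice this sketch leaves open.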
app.py ADDED
@@ -0,0 +1,303 @@
+ import gradio as gr
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ from gtts import gTTS
+ import tempfile
+ import os
+ import json
+ from typing import List, Optional
+
+ class HealthTextGenerator:
+     def __init__(self):
+         self.model = None
+         self.tokenizer = None
+         self.device = "cuda" if torch.cuda.is_available() else "cpu"
+
+     def load_model(self):
+         """Load the Llama 3.1 8B Instruct model"""
+         if self.model is None:
+             print("Loading Llama 3.1 8B Instruct model...")
+             model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
+
+             self.tokenizer = AutoTokenizer.from_pretrained(model_name)
+             self.model = AutoModelForCausalLM.from_pretrained(
+                 model_name,
+                 torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
+                 device_map="auto" if self.device == "cuda" else None,
+                 low_cpu_mem_usage=True
+             )
+
+             if self.device == "cpu":
+                 self.model = self.model.to(self.device)
+
+             print("Model loaded successfully!")
+
+     def generate_health_text(self, topic: str, text_length: str, subtopics: List[str]) -> str:
+         """Generate health-related text based on topic and subtopics"""
+         if self.model is None:
+             self.load_model()
+
+         # Prepare subtopics text
+         subtopics_text = ""
+         if subtopics and any(subtopic.strip() for subtopic in subtopics):
+             valid_subtopics = [s.strip() for s in subtopics if s.strip()]
+             if valid_subtopics:
+                 subtopics_text = f" dengan fokus pada: {', '.join(valid_subtopics)}"  # "with a focus on: ..."
+
+         # Create prompt based on text length (prompt text is Indonesian, the app's output language)
+         length_instructions = {
+             "Pendek (100-200 kata)": "Buatlah artikel kesehatan yang singkat dan padat",  # short and concise
+             "Sedang (300-500 kata)": "Buatlah artikel kesehatan yang informatif dan detail",  # informative and detailed
+             "Panjang (600-1000 kata)": "Buatlah artikel kesehatan yang komprehensif dan mendalam"  # comprehensive and in-depth
+         }
+
+         prompt = f"""Buatlah artikel kesehatan tentang {topic}{subtopics_text}.
+ {length_instructions.get(text_length, "Buatlah artikel kesehatan yang informatif")}.
+
+ Pastikan artikel:
+ - Berisi informasi yang akurat dan bermanfaat
+ - Menggunakan bahasa Indonesia yang mudah dipahami
+ - Menyertakan tips praktis jika relevan
+ - Memiliki struktur yang jelas dengan paragraf yang terorganisir
+
+ Artikel:"""
+
+         # Tokenize and generate
+         inputs = self.tokenizer(prompt, return_tensors="pt").to(self.device)
+
+         with torch.no_grad():
+             outputs = self.model.generate(
+                 **inputs,
+                 max_new_tokens=1024,
+                 temperature=0.7,
+                 do_sample=True,
+                 pad_token_id=self.tokenizer.eos_token_id,
+                 eos_token_id=self.tokenizer.eos_token_id,
+                 repetition_penalty=1.1
+             )
+
+         # Decode the generated text
+         generated_text = self.tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+         # Extract only the generated part (remove the prompt)
+         generated_text = generated_text[len(prompt):].strip()
+
+         return generated_text
+
+     def text_to_speech(self, text: str, language: str = "id") -> Optional[str]:
+         """Convert text to speech and return the audio file path (None on empty input or failure)"""
+         if not text.strip():
+             return None
+
+         try:
+             # Create temporary file
+             with tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") as tmp_file:
+                 temp_path = tmp_file.name
+
+             # Generate speech
+             tts = gTTS(text=text, lang=language, slow=False)
+             tts.save(temp_path)
+
+             return temp_path
+         except Exception as e:
+             print(f"Error in text-to-speech: {e}")
+             return None
+
+ # Initialize the generator
+ generator = HealthTextGenerator()
+
+ # Health topics
+ HEALTH_TOPICS = [
+     "Nutrisi dan Diet Sehat",
+     "Olahraga dan Kebugaran",
+     "Kesehatan Mental",
+     "Penyakit Jantung",
+     "Diabetes dan Gula Darah",
+     "Kesehatan Pencernaan",
+     "Kesehatan Kulit",
+     "Kesehatan Mata",
+     "Kesehatan Gigi dan Mulut",
+     "Kesehatan Reproduksi",
+     "Kesehatan Anak",
+     "Kesehatan Lansia",
+     "Pencegahan Kanker",
+     "Kesehatan Tulang dan Sendi",
+     "Kesehatan Pernapasan",
+     "Kesehatan Hati",
+     "Kesehatan Ginjal",
+     "Kesehatan Saraf",
+     "Kesehatan Kardiovaskular",
+     "Kesehatan Imunitas"
+ ]
+
+ def generate_article(topic, text_length, subtopic1, subtopic2, subtopic3, subtopic4, subtopic5):
+     """Generate health article with given parameters"""
+     subtopics = [subtopic1, subtopic2, subtopic3, subtopic4, subtopic5]
+     subtopics = [s for s in subtopics if s and s.strip()]
+
+     try:
+         article = generator.generate_health_text(topic, text_length, subtopics)
+         return article, None
+     except Exception as e:
+         return f"Error generating article: {str(e)}", None
+
+ def convert_to_speech(text):
+     """Convert generated text to speech"""
+     if not text or not text.strip():
+         return None
+
+     try:
+         audio_path = generator.text_to_speech(text)
+         return audio_path
+     except Exception as e:
+         print(f"Error converting to speech: {e}")
+         return None
+
+ # Create Gradio interface
+ with gr.Blocks(
+     title="Health Article Generator",
+     theme=gr.themes.Soft(),
+     css="""
+     .gradio-container {
+         max-width: 1200px !important;
+         margin: auto !important;
+     }
+     .main-header {
+         text-align: center;
+         background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+         color: white;
+         padding: 2rem;
+         border-radius: 10px;
+         margin-bottom: 2rem;
+     }
+     """
+ ) as app:
+
+     gr.HTML("""
+     <div class="main-header">
+         <h1>🏥 Health Article Generator</h1>
+         <p>Generate comprehensive health articles using AI and convert them to speech</p>
+     </div>
+     """)
+
+     with gr.Row():
+         with gr.Column(scale=1):
+             gr.Markdown("### ⚙️ Settings")
+
+             topic = gr.Dropdown(
+                 choices=HEALTH_TOPICS,
+                 label="Pilih Topik Kesehatan",  # "Choose a Health Topic"
+                 value=HEALTH_TOPICS[0],
+                 interactive=True
+             )
+
+             text_length = gr.Radio(
+                 choices=["Pendek (100-200 kata)", "Sedang (300-500 kata)", "Panjang (600-1000 kata)"],  # Short / Medium / Long (word counts)
+                 label="Panjang Artikel",  # "Article Length"
+                 value="Sedang (300-500 kata)",
+                 interactive=True
+             )
+
+             gr.Markdown("### 📝 Subtopik (Opsional)")  # "Subtopics (Optional)"
+             gr.Markdown("Tambahkan hingga 5 subtopik untuk fokus artikel:")  # "Add up to 5 subtopics to focus the article:"
+
+             subtopic1 = gr.Textbox(
+                 label="Subtopik 1",
+                 placeholder="Contoh: Tips diet sehat",  # "Example: Healthy diet tips"
+                 interactive=True
+             )
+
+             subtopic2 = gr.Textbox(
+                 label="Subtopik 2",
+                 placeholder="Contoh: Makanan yang harus dihindari",  # "Example: Foods to avoid"
+                 interactive=True
+             )
+
+             subtopic3 = gr.Textbox(
+                 label="Subtopik 3",
+                 placeholder="Contoh: Jadwal makan yang baik",  # "Example: A good meal schedule"
+                 interactive=True
+             )
+
+             subtopic4 = gr.Textbox(
+                 label="Subtopik 4",
+                 placeholder="Contoh: Suplemen yang direkomendasikan",  # "Example: Recommended supplements"
+                 interactive=True
+             )
+
+             subtopic5 = gr.Textbox(
+                 label="Subtopik 5",
+                 placeholder="Contoh: Olahraga pendukung",  # "Example: Supporting exercise"
+                 interactive=True
+             )
+
+             generate_btn = gr.Button(
+                 "🚀 Generate Article",
+                 variant="primary",
+                 size="lg"
+             )
+
+         with gr.Column(scale=2):
+             gr.Markdown("### 📄 Generated Article")
+
+             article_output = gr.Textbox(
+                 label="Artikel yang Dihasilkan",  # "Generated Article"
+                 lines=15,
+                 max_lines=20,
+                 interactive=False,
+                 show_copy_button=True
+             )
+
+             with gr.Row():
+                 tts_btn = gr.Button(
+                     "🔊 Convert to Speech",
+                     variant="secondary"
+                 )
+
+                 download_audio = gr.File(
+                     label="Download Audio (MP3)",
+                     visible=False
+                 )
+
+             gr.Markdown("### 🔊 Audio Player")
+             audio_player = gr.Audio(
+                 label="Generated Audio",
+                 type="filepath",
+                 visible=False
+             )
+
+     # Event handlers
+     generate_btn.click(
+         fn=generate_article,
+         inputs=[topic, text_length, subtopic1, subtopic2, subtopic3, subtopic4, subtopic5],
+         outputs=[article_output, audio_player]
+     )
+
+     tts_btn.click(
+         fn=convert_to_speech,
+         inputs=[article_output],
+         outputs=[audio_player]
+     )
+
+     # Show download button when audio is generated
+     audio_player.change(
+         fn=lambda x: gr.update(visible=True, value=x) if x else gr.update(visible=False),
+         inputs=[audio_player],
+         outputs=[download_audio]
+     )
+
+     # Footer
+     gr.HTML("""
+     <div style="text-align: center; margin-top: 2rem; padding: 1rem; background-color: #f8f9fa; border-radius: 10px;">
+         <p><strong>Health Article Generator</strong> - Powered by Meta Llama 3.1 8B Instruct</p>
+         <p>Generate comprehensive health articles and convert them to speech for better accessibility</p>
+     </div>
+     """)
+
+ if __name__ == "__main__":
+     app.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         share=False,
+         show_error=True
+     )
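`generate_health_text` removes the prompt by slicing the decoded string at `len(prompt)`. That relies on the tokenizer decoding the prompt back to exactly the same characters, which is not guaranteed in general. A common alternative is to drop the prompt's token ids before decoding. The sketch below shows the idea with plain Python lists standing in for tensors. `split_new_tokens` is a hypothetical helper and is not part of the commit.

```python
from typing import List, Sequence


def split_new_tokens(output_ids: Sequence[int], prompt_len: int) -> List[int]:
    """Return only the ids that follow the first `prompt_len` prompt ids."""
    if prompt_len < 0 or prompt_len > len(output_ids):
        raise ValueError("prompt_len must be between 0 and len(output_ids)")
    return list(output_ids[prompt_len:])


# Toy example: 3 prompt ids followed by 2 generated ids.
prompt_ids = [101, 7, 42]
output_ids = prompt_ids + [900, 901]
new_ids = split_new_tokens(output_ids, len(prompt_ids))
print(new_ids)  # [900, 901]
```

With the objects in `app.py`, the equivalent would presumably be to decode `outputs[0][inputs["input_ids"].shape[1]:]` with `skip_special_tokens=True`. This has not been tested against the commit.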
requirements.txt ADDED
@@ -0,0 +1,10 @@
+ gradio==5.43.1
+ transformers==4.45.2
+ torch==2.4.1
+ accelerate==0.33.2
+ sentencepiece==0.2.0
+ protobuf==4.25.5
+ gtts==2.5.1
+ pydub==0.25.1
+ requests==2.32.3
+ numpy==1.26.4
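Exact pins like the ones in `requirements.txt` can be compared against what is installed locally before the app is launched. The sketch below covers only the parsing half and handles only `name==version` lines. `parse_pins` is a hypothetical helper and is not part of the commit.

```python
from typing import Dict


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Parse 'name==version' lines into a dict; skip blanks and comments."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, version = line.partition("==")
        if not sep:
            continue  # not an exact pin; out of scope for this sketch
        pins[name.strip().lower()] = version.strip()
    return pins


pins = parse_pins("gradio==5.43.1\ngtts==2.5.1\n")
print(pins)  # {'gradio': '5.43.1', 'gtts': '2.5.1'}
```

Each parsed version could then be compared with `importlib.metadata.version(name)` from the standard library, which raises `PackageNotFoundError` when the package is not installed.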