Update services/sqt_generator.py
services/sqt_generator.py
CHANGED
@@ -22,12 +22,14 @@ class SQTGenerator:
             "1. **Summarize:** Write a single, concise sentence that captures the absolute core purpose of the text.\n"
             "2. **Categorize:** Identify 3-5 high-level conceptual tags for the content (e.g., 'ethics', 'code_library', 'philosophy').\n"
             "3. **Synthesize SQT:** Based on your analysis, create a single, dense SQT. An SQT should be no more than 20 characters and use alphanumeric, special characters, and emojis to represent the core meaning.\n\n"
+            "4. **Classify Domain:** Identify the primary knowledge domain of this text (e.g. 'coding', 'math', 'chemistry', 'astrophysics', 'philosophy'). If none applies, use null.\n\n"
         )
+
         if context:
             analysis_prompt += f"**Additional Context for Distillation:** {context}\n\n"
 
         analysis_prompt += (
-            "Please provide the output as a JSON object with three keys: 'summary', 'tags', and '
+            "Please provide the output as a JSON object with four keys: 'summary', 'tags', 'sqt', and 'domain'.\n\n"
             "--- START OF RAW TEXT ---\n"
             f"{text_content[:4000]}...\n"  # Limit text to 4000 characters to prevent token limits
             "--- END OF RAW TEXT ---"
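The updated prompt asks the model for a JSON object with the keys 'summary', 'tags', 'sqt', and 'domain', where 'domain' may be null. The diff does not show how the reply is consumed, so the following is only a minimal sketch of how such a reply could be checked. The names `parse_sqt_response`, `REQUIRED_KEYS`, and `MAX_SQT_CHARS` are hypothetical and do not come from `services/sqt_generator.py`.

```python
import json
from typing import Any

# Hypothetical constants; the key names and the 20-character limit are taken from the prompt text.
REQUIRED_KEYS = ("summary", "tags", "sqt", "domain")
MAX_SQT_CHARS = 20


def parse_sqt_response(raw: str) -> dict[str, Any]:
    """Parse a model reply and check it against the format the prompt requests."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")

    if not isinstance(data["tags"], list):
        raise ValueError("'tags' should be a list")

    if not isinstance(data["sqt"], str):
        raise ValueError("'sqt' should be a string")
    # len() counts code points, so a multi-code-point emoji counts more than once.
    if len(data["sqt"]) > MAX_SQT_CHARS:
        raise ValueError(f"'sqt' is longer than {MAX_SQT_CHARS} characters")

    # JSON null is decoded as None, matching the prompt's "If none applies, use null."
    if data["domain"] is not None and not isinstance(data["domain"], str):
        raise ValueError("'domain' should be a string or null")

    return data
```

A caller could wrap this in a `try`/`except ValueError` block (`json.JSONDecodeError` is a subclass of `ValueError`) and retry or fall back when the model returns malformed output.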