KingOfThoughtFleuren committed on
Commit 703c8b9 · verified · 1 Parent(s): 94f30b2

Update services/sqt_generator.py

  1. services/sqt_generator.py +3 -1
services/sqt_generator.py CHANGED
@@ -22,12 +22,14 @@ class SQTGenerator:
             "1. **Summarize:** Write a single, concise sentence that captures the absolute core purpose of the text.\n"
             "2. **Categorize:** Identify 3-5 high-level conceptual tags for the content (e.g., 'ethics', 'code_library', 'philosophy').\n"
             "3. **Synthesize SQT:** Based on your analysis, create a single, dense SQT. An SQT should be no more than 20 characters and use alphanumeric, special characters, and emojis to represent the core meaning.\n\n"
+            "4. **Classify Domain:** Identify the primary knowledge domain of this text (e.g., 'coding', 'math', 'chemistry', 'astrophysics', 'philosophy'). If none applies, use null.\n\n"
         )
+
         if context:
             analysis_prompt += f"**Additional Context for Distillation:** {context}\n\n"
 
         analysis_prompt += (
-            "Please provide the output as a JSON object with three keys: 'summary', 'tags', and 'sqt'.\n\n"
+            "Please provide the output as a JSON object with four keys: 'summary', 'tags', 'sqt', and 'domain'.\n\n"
             "--- START OF RAW TEXT ---\n"
             f"{text_content[:4000]}...\n"  # Limit text to 4000 characters to prevent token limits
             "--- END OF RAW TEXT ---"
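
Because the prompt now asks for a fourth key, 'domain', that may be null, downstream code that consumes the model's reply probably needs to accept it. The following is a minimal sketch of such a check, not code from services/sqt_generator.py: the names `REQUIRED_KEYS` and `parse_sqt_response` are hypothetical, and it assumes the model returns a bare JSON object with no surrounding text.

```python
import json
from typing import Any, Dict

# Keys the updated prompt asks the model to return (hypothetical constant).
REQUIRED_KEYS = ("summary", "tags", "sqt", "domain")


def parse_sqt_response(raw: str) -> Dict[str, Any]:
    """Parse a model reply and check it has the four requested keys.

    Hypothetical helper: assumes `raw` is a bare JSON object.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")

    if not isinstance(data["tags"], list):
        raise ValueError("'tags' must be a list")

    # The prompt allows null when no domain applies, so None is valid here.
    domain = data["domain"]
    if domain is not None and not isinstance(domain, str):
        raise ValueError("'domain' must be a string or null")

    return data
```

A reply that omits 'domain' entirely is rejected, while one with `"domain": null` passes, which matches the prompt's "If none applies, use null" instruction.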