YagiASAFAS committed
Commit 68775ac · verified · 1 Parent(s): d78c1e3

Update README.md

Files changed (1)
  1. README.md +0 -106
README.md CHANGED
@@ -82,112 +82,6 @@ The training data was aggregated from multiple sources:
  OpenAI API labeling was performed by combining Human-in-the-loop machine learning—where prompt engineering was applied to select the most accurate prompt—with the OpenAI API (gpt-4o-mini) to generate labels.


- #### Translation Prompt
- ```
- You are a professional translator.
- Your task is to translate the given text into English if it is not already in English.
- The text originates from Malaysian news articles and public commentary. Therefore, please pay extra attention to local language expressions, proper names, abbreviations, and cultural nuances that are specific to Malaysia.
- The possible source languages are Malay, Chinese, and Tamil.
- If the text is already in English, simply return None.
-
- Text: {text}
-
- Output the result in the following JSON format:
- {
- "translated_text": "<Translated text or None>"
- }
- ```
-
- #### Classification Prompt
- ```
- INSTRUCTION
- You are a classifier focusing on Malaysian news articles.
- Classify the following text according to 12 topics, and for each topic,
- assign exactly one of [unknown, negative, neutral, positive].
-
- The 12 topics are:
- 1. democracy
- 2. economy
- 3. race
- 4. leadership
- 5. development
- 6. corruption
- 7. political instability
- 8. safety
- 9. administration
- 10. education
- 11. religion
- 12. environment
-
- GUIDELINES:
- - If the article does not mention or imply anything about the topic, label it as "unknown".
- - If the article mentions the topic in a negative or critical way, label it as "negative".
- - If the article mentions the topic without clear negativity or positivity, label it as "neutral".
- - If the article mentions the topic in a clearly positive or supportive way, label it as "positive".
- - Return only the JSON object with the 12 keys, no extra explanation or text.
-
- EXAMPLES:
- [Example1]
- 1. Race: Malaysia's rich tapestry of cultures and ethnicities fosters a vibrant society where diversity is celebrated, promoting unity and mutual respect among its people.
- 2. Economy: The Malaysian economy is showing resilience and adaptability, with innovative sectors emerging that promise sustainable growth and increased opportunities for all citizens.
- 3. Development: Malaysia's commitment to sustainable development is evident in its investment in green technologies and infrastructure, paving the way for a brighter and more sustainable future for generations to come.
- Classify as:
- {{
- "democracy": "unknown",
- "economy": "positive"
- "race" "positive",
- "leadership": "unknown",
- "development: "positive",
- "corruption: "unknown",
- "political instability: "unknown",
- "safety: "unknown",
- "administration: "unknown",
- "education: "unknown",
- "religion: "unknown",
- "environment: "unknown"
- }}
-
- [Example2]
- Corruption remains a significant challenge in Malaysia, influencing various sectors and prompting ongoing discussions about governance and accountability. Addressing this issue is vital for fostering trust in public institutions and promoting sustainable development. #Malaysia #Corruption
- Classify as:
- {{
- "democracy": "unknown",
- "economy": "unknown",
- "race" "unknown",
- "leadership": "unknown",
- "development: "unknown",
- "corruption: "neutral",
- "political instability: "unknown",
- "safety: "unknown",
- "administration: "unknown",
- "education: "unknown",
- "religion: "unknown",
- "environment: "unknown"
- }}
-
- [Example3]
- The 13 May incident was an episode of Sino-Malay sectarian violence that took place in Kuala Lumpur, the capital of Malaysia, on 13 May 1969. The riot occurred in the aftermath of the 1969 Malaysian general election when opposition parties such as the Democratic Action Party and Gerakan made gains at the expense of the ruling coalition, the Alliance Party.
- Classify as:
- {{
- "democracy": "unknown",
- "economy": "unknown",
- "race" "negative",
- "leadership": "unknown",
- "development: "unknown",
- "corruption: "unknown",
- "political instability: "negative",
- "safety: "unknown",
- "administration: "unknown",
- "education: "unknown",
- "religion: "unknown",
- "environment: "unknown"
- }}
-
-
- TEXT:
- {text}
- ```
-
  #### Synthetic Data via Data Augmentation
  - **Method**: Synthetic data was generated to balance the dataset by augmenting underrepresented labels or sentiments.
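
The classification prompt removed by this commit asks the model to return only a JSON object with the 12 topic keys, each set to one of `unknown`, `negative`, `neutral`, or `positive`. A minimal sketch of how such a reply could be checked before it is used as a label is shown here. The topic and label names come from the prompt; the helper `parse_classification` is hypothetical, since the README does not describe how replies were validated.

```python
import json
from typing import Dict

# Topic and label names copied from the removed classification prompt.
TOPICS = [
    "democracy", "economy", "race", "leadership", "development",
    "corruption", "political instability", "safety", "administration",
    "education", "religion", "environment",
]
LABELS = {"unknown", "negative", "neutral", "positive"}


def parse_classification(raw: str) -> Dict[str, str]:
    """Parse a model reply and check it holds exactly the 12 topic keys.

    Hypothetical helper (an assumption, not the author's pipeline).
    Raises ValueError on malformed JSON, wrong keys, or invalid labels.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    missing = [t for t in TOPICS if t not in data]
    extra = [k for k in data if k not in TOPICS]
    if missing or extra:
        raise ValueError(f"missing keys: {missing}; unexpected keys: {extra}")

    bad = {
        k: v for k, v in data.items()
        if not isinstance(v, str) or v not in LABELS
    }
    if bad:
        raise ValueError(f"invalid labels: {bad}")
    return data


if __name__ == "__main__":
    # Build a reply shaped like the prompt's second example.
    example = {t: "unknown" for t in TOPICS}
    example["corruption"] = "neutral"
    labels = parse_classification(json.dumps(example))
    print(labels["corruption"])
```

Rejecting a reply outright, rather than filling in missing topics with `unknown`, keeps silent labeling errors out of the training data; whether to retry or discard a rejected reply is left open here.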