praisethefool committed on
Commit
b467d9f
·
verified ·
1 Parent(s): 8949568

Create app.py

Files changed (1)
  1. app.py +342 -0
app.py ADDED
@@ -0,0 +1,342 @@
+ from datasets import load_dataset
+ import gradio as gr
+ import pandas as pd
+
+ # Load the pre-rated Works for Distroid Digest Issue 44.
+ dataset = load_dataset(
+     "praisethefool/dca-distroid_digest-issue_44",
+     keep_default_na=False)
+
+ df = pd.DataFrame(dataset['train'])
+
+ # Normalize column names: drop the 'fields.' prefix and lowercase everything.
+ for x in df.columns:
+     if 'fields' in x:
+         y = x.replace('fields.', '')
+         y = y.lower()
+         df.rename({x: y}, axis='columns', inplace=True)
+     else:
+         y = x.lower()
+         df.rename({x: y}, axis='columns', inplace=True)
+
+ description = """
+ # Overview
+ This Gradio demo shows how end-users could be empowered to create their own personalized feeds by customizing algorithmic recommendation systems.
+
+ In this demo, I focus on how readers could personalize the [Distroid Digest](https://distroid.substack.com/) to fit their needs by customizing the Distroid Curator Algorithm (DCA),
+ the (planned) curation algorithm used by curators for curating Works collected in the Distroid Catalogue Knowledge Graph (DCKG) into the Distroid Digest newsletter issues.
+
+ The DCA’s objective is to produce a ranked feed of items (here, Works in the DCKG) that increases understanding of frontier information,
+ and more specifically, understanding of how to diagnose and improve the human-technology relationship.
+
+ We assume that items with higher scores have the potential to provide readers with a greater understanding of frontier information than items with lower scores.
+
+ For DCA Version 0.1 (V0.1), Works are rated on the following six quality signals:
+
+ 1. ELI5: "The ability to explain complex topics in layman's terms",
+
+ 2. Implications: "The real, imagined, or theorized positive, neutral, or negative outcomes (or impacts) of frontier information (or discoveries, technologies, and cultures) on society, environment, economy, or in other areas.",
+
+ 3. Idea Machine Intersectionality: "The number of idea machines the work is classified under.",
+
+ 4. Novelty: "New knowledge that moves the knowledge frontier.",
+
+ 5. Informative: "The content improved my understanding of a topic.", and
+
+ 6. Evergreen: "Knowledge that is applicable regardless of time or location".
+
+ # Dataset
+
+ I used the Works curated in [Distroid Digest Issue 44](https://distroid.substack.com/p/digest-issue-44-how-bluesky-works) for this demo.
+
+ # Objective Function
+
+ In this demo, the objective function is a weighted sum: each signal's rating is multiplied by a weight
+ between zero and twenty (0-20), and the products are added together to give the Work's DCA score.
+
+ # Functionality
+
+ 1. Users can customize the feed by setting a weight from zero to twenty (0-20) for each signal described above.
+ 2. Users can set a minimum DCA score for Works to be added to their feed.
+
+ # Tips & Tricks
+ If you think a signal should not be included in your feed, you can set that signal's weight to zero (0).
+
+ # Learn more
+
+ You can read more about the early work on the DCA [here](https://ledgerback.pubpub.org/pub/9ibht7wp/release/8).
+
+ For background information on recommender systems, please read [Recommender Systems 101](https://kgi.georgetown.edu/wp-content/uploads/2025/02/Recommender-Systems-101.pdf).
+
+ # Related Work
+ Related work on creating alternative feeds and newsletters can be found below:
+
+ 1. [Fedi-Feed](https://foryoufeed.vercel.app/login)
+ 2. [News Minimalist](https://www.newsminimalist.com/)
+ 3. [Building a Social Media Algorithm That Actually Promotes Societal Values](https://hai.stanford.edu/news/building-social-media-algorithm-actually-promotes-societal-values)
+ 4. [PDN Pro-Social with Smitha Milli: Ranking by User Value](https://www.youtube.com/watch?v=6ltsAT5RUrI)
+
+ # Outputs
+
+ 1. DCA Objective Function: The current objective function after the parameters are set.
+ 2. Feed: A table of Works at or above the minimum score, sorted by their DCA score in the Score column. Also includes each Work's title and URL.
+ 3. Scores per Signal: A table showing the scores for each signal after setting the weights.
+
+ # Caveats
+
+ 1. The Works are pre-rated, so you cannot edit the ratings per signal.
+ 2. The Weights and Minimum Score ranges are pre-set.
+ 3. In the Scores per Signal tab, Idea Machine Intersectionality has been shortened to 'imi'.
+ """
+
+ def grad_wg_int(
+     w_nov,  # Novelty weight
+     w_eve,  # Evergreen weight
+     w_inf,  # Informative weight
+     w_imi,  # Idea Machine Intersectionality weight
+     w_eli,  # ELI5 weight
+     w_imp,  # Implications weight
+     min_score,  # Minimum DCA score for a Work to appear in the feed
+ ):
+     # Build the feed: score every Work, filter by min_score, sort descending.
+     muse = []
+
+     weights = {
+         "w_nov": w_nov,
+         "w_eve": w_eve,
+         "w_inf": w_inf,
+         "w_imi": w_imi,
+         "w_eli": w_eli,
+         "w_imp": w_imp,
+     }
+
+     dc_algo_mk = zip(df['novelty'],
+                      df['evergreen'],
+                      df['informative'],
+                      df['idea machine intersectionality'],
+                      df['eli5'],
+                      df['implications'],
+                      df['title'],
+                      df['url'],
+                      )
+
+     for m_nov, m_eve, m_inf, m_imi, m_eli, m_imp, title, url in dc_algo_mk:
+
+         score_nov = weights['w_nov'] * int(m_nov)
+         score_eve = weights['w_eve'] * int(m_eve)
+         score_inf = weights['w_inf'] * int(m_inf)
+         score_imi = weights['w_imi'] * int(m_imi)
+         score_eli = weights['w_eli'] * int(m_eli)
+         score_imp = weights['w_imp'] * int(m_imp)
+
+         # TODO: save the weight and score for each marker into
+         # a table with key included
+         # DCA score: weighted sum of the six signal ratings.
+         rank_sum = (score_nov +
+                     score_eve +
+                     score_inf +
+                     score_imi +
+                     score_eli +
+                     score_imp)
+
+         rank_sum = round(float(rank_sum), 2)
+
+         score_rank = {
+             "Score": rank_sum,
+             "Title": title,
+             'URL': url,
+         }
+
+         muse.append(score_rank)
+
+     tug = pd.DataFrame(muse)
+
+     tug = tug.query(f"Score >= {min_score}")
+
+     # Reassign instead of sorting the filtered frame in place, which can
+     # trigger pandas' SettingWithCopyWarning.
+     tug = tug.sort_values('Score', ascending=False)
+
+     return tug
+
+ def grad_wg_int_scr(
+     w_nov,  # Novelty weight
+     w_eve,  # Evergreen weight
+     w_inf,  # Informative weight
+     w_imi,  # Idea Machine Intersectionality weight
+     w_eli,  # ELI5 weight
+     w_imp,  # Implications weight
+ ):
+     # Build the per-signal table: each rating multiplied by its weight.
+     weights = {
+         "w_nov": w_nov,
+         "w_eve": w_eve,
+         "w_inf": w_inf,
+         "w_imi": w_imi,
+         "w_eli": w_eli,
+         "w_imp": w_imp,
+     }
+
+     df_ = df.copy()
+
+     df_['novelty'] = df['novelty'] * weights['w_nov']
+     df_['evergreen'] = weights['w_eve'] * df['evergreen']
+     df_['informative'] = weights['w_inf'] * df['informative']
+     df_['idea machine intersectionality'] = df['idea machine intersectionality'] * weights['w_imi']
+     df_['eli5'] = df['eli5'] * weights['w_eli']
+     df_['implications'] = df['implications'] * weights['w_imp']
+
+     # Shorten the long column name for display.
+     df_.rename({'idea machine intersectionality': 'imi'}, axis='columns', inplace=True)
+
+     df_.drop(['url', 'likeable'], axis='columns', inplace=True)
+
+     df_ = df_.round(1)
+
+     return df_
+
+ def grad_wg_int_form(
+     w_nov,  # Novelty weight
+     w_eve,  # Evergreen weight
+     w_inf,  # Informative weight
+     w_imi,  # Idea Machine Intersectionality weight
+     w_eli,  # ELI5 weight
+     w_imp,  # Implications weight
+ ):
+     # Render the current objective function as text for display.
+     weights = {
+         "w_nov": w_nov,
+         "w_eve": w_eve,
+         "w_inf": w_inf,
+         "w_imi": w_imi,
+         "w_eli": w_eli,
+         "w_imp": w_imp,
+     }
+
+     formula_a = f"""
+     ({weights['w_nov']} * Novelty) + ({weights['w_eve']} * Evergreen) + \n\n
+     ({weights['w_inf']} * Informative) + ({weights['w_eli']} * ELI5) + \n\n
+     ({weights['w_imp']} * Implications) + ({weights['w_imi']} * Idea Machine Intersectionality)
+     """
+
+     return formula_a
+
+ with gr.Blocks(fill_width=True) as demo:
+
+     gr.Markdown('# Welcome to the DCA Personalized Feed Demo')
+
+     with gr.Row():
+         with gr.Accordion():
+             gr.Markdown(description)
+
+     with gr.Row():
+
+         with gr.Sidebar():
+
+             gr.Markdown("### Customize")
+
+             tune_eli = gr.Slider(0.00, 20.00, value=1, label="ELI5 Weight", info="Choose between 0 and 20")
+             tune_evg = gr.Slider(0.00, 20.00, value=1, label="Evergreen Weight", info="Choose between 0 and 20")
+             tune_inf = gr.Slider(0.00, 20.00, value=1, label="Informative Weight", info="Choose between 0 and 20")
+             tune_imp = gr.Slider(0.00, 20.00, value=1, label="Implications Weight", info="Choose between 0 and 20")
+             tune_nov = gr.Slider(0.00, 20.00, value=1, label="Novelty Weight", info="Choose between 0 and 20")
+             tune_imi = gr.Slider(0.00, 20.00, value=1, label="Idea Machine Intersectionality Weight", info="Choose between 0 and 20")
+             tune_min = gr.Slider(0.00, 50.00, value=1, label="Minimum DCA Score", info="Choose between 0 and 50")
+
+             text_button = gr.Button(value="Set Parameters")
+             clear_button = gr.ClearButton(value="Clear Parameters")
+
+         with gr.Column(scale=3):
+
+             form_plot = gr.Label(label="DCA Objective Function")
+
+             # Inputs are passed positionally, so they must follow the handler
+             # signature: w_nov, w_eve, w_inf, w_imi, w_eli, w_imp.
+             # (tune_imi and tune_eli were previously swapped.)
+             text_button.click(grad_wg_int_form,
+                               inputs=[
+                                   tune_nov,
+                                   tune_evg,
+                                   tune_inf,
+                                   tune_imi,
+                                   tune_eli,
+                                   tune_imp,
+                               ],
+                               outputs=[form_plot])
+
+             with gr.Tab("Scores per Signal"):
+
+                 scores_df = gr.DataFrame(
+                     wrap=True,
+                     show_search='filter',
+                     show_copy_button=True,
+                     show_fullscreen_button=True)
+
+                 text_button.click(
+                     grad_wg_int_scr,
+                     inputs=[
+                         tune_nov,
+                         tune_evg,
+                         tune_inf,
+                         tune_imi,
+                         tune_eli,
+                         tune_imp,
+                     ],
+                     outputs=[scores_df])
+
+             with gr.Tab("Feed"):
+
+                 feed_df = gr.DataFrame(
+                     wrap=True,
+                     show_search='filter',
+                     show_copy_button=True,
+                     show_fullscreen_button=True)
+
+                 text_button.click(
+                     grad_wg_int,
+                     inputs=[
+                         tune_nov,
+                         tune_evg,
+                         tune_inf,
+                         tune_imi,
+                         tune_eli,
+                         tune_imp,
+                         tune_min,
+                     ],
+                     outputs=[feed_df])
+
+     clear_button.add([
+         tune_eli,
+         tune_evg,
+         tune_inf,
+         tune_imp,
+         tune_nov,
+         tune_imi,
+         tune_min,
+     ])
+
+
+ if __name__ == "__main__":
+     demo.launch()
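
The weighted-sum objective in `app.py` can be tried without Gradio or the dataset. The following is a minimal sketch, not the committed code: the two Works and their ratings are made up for illustration, and the helper names `dca_score` and `build_feed` are hypothetical. It mirrors the same three steps as the app: multiply each rating by its weight, sum the products, then filter by a minimum score and sort in descending order.

```python
# Hypothetical ratings for two Works; the real demo loads ratings from the dataset.
WORKS = [
    {"title": "Work A", "novelty": 3, "evergreen": 1, "informative": 2,
     "imi": 1, "eli5": 2, "implications": 0},
    {"title": "Work B", "novelty": 1, "evergreen": 2, "informative": 2,
     "imi": 1, "eli5": 1, "implications": 1},
]

# The six quality signals ('imi' = Idea Machine Intersectionality).
SIGNALS = ("novelty", "evergreen", "informative", "imi", "eli5", "implications")


def dca_score(work, weights):
    """Weighted sum of the six signal ratings for one Work."""
    total = sum(weights[signal] * work[signal] for signal in SIGNALS)
    return round(float(total), 2)


def build_feed(works, weights, min_score=0.0):
    """Score every Work, drop those below min_score, sort highest first."""
    scored = [{"Score": dca_score(w, weights), "Title": w["title"]} for w in works]
    kept = [row for row in scored if row["Score"] >= min_score]
    return sorted(kept, key=lambda row: row["Score"], reverse=True)


if __name__ == "__main__":
    # Every signal weighted 1, except Novelty weighted 2.
    weights = dict.fromkeys(SIGNALS, 1.0)
    weights["novelty"] = 2.0
    for row in build_feed(WORKS, weights, min_score=10):
        print(row)
```

Keying the weights by signal name, rather than passing them positionally, avoids the risk of two weights being swapped between the sliders and the scoring function.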