Spaces: GenAIDevTOProd/Reddit-SemanticSearch-Prototype
GenAIDevTOProd committed on Aug 6

Commit 49b5d56 (verified) · 1 parent: 2f0a24e

Update app.py

Files changed (1)

app.py

@@ -7,8 +7,6 @@ Original file is located at
     https://colab.research.google.com/drive/1nLqIbyBDiBI96gDZ0TziLNX8I4uWnl9G
 """
-pip install datasets
 """Picking subreddits, split=sub as the data on huggingface datasets is split w.r.t subreddits and not train/test/validation.
 Streaming = True, because we don't want to load all the data into local memory
@@ -17,7 +15,8 @@ loading and combining all the iterables together.
 """
-from datasets import load_dataset, concatenate_datasets
+from huggingface_hub import hf_hub_url, cached_download
+import json
 target_subreddits = ["askscience", "gaming", "technology", "todayilearned", "programming"]
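Review note: the commit replaces `datasets` streaming with `huggingface_hub` plus `json`, but the diff does not show how the downloaded files are read. A minimal sketch of one way to keep the "combine all the iterables" behaviour the docstring describes is given below. It assumes each subreddit is stored as a JSON Lines file; the file layout, the `subreddit` and `body` field names, and the helper names are hypothetical and not taken from app.py. In the real app the local paths would come from the Hub; recent `huggingface_hub` releases deprecate `cached_download` in favour of `hf_hub_download`, so that call is the likelier choice there.

```python
import itertools
import json
import os
import tempfile
from typing import Dict, Iterable, Iterator, List


def iter_json_lines(path: str) -> Iterator[Dict]:
    """Yield one parsed record per non-empty line, without loading the whole file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def iter_subreddits(paths: Iterable[str]) -> Iterator[Dict]:
    """Chain the per-subreddit iterators into one lazy stream of records."""
    return itertools.chain.from_iterable(iter_json_lines(p) for p in paths)


def demo() -> List[Dict]:
    """Write two small hypothetical files, then read them back as one stream."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for sub in ["askscience", "gaming"]:
            path = os.path.join(tmp, f"{sub}.jsonl")
            with open(path, "w", encoding="utf-8") as handle:
                for i in range(2):
                    # Field names are assumptions made for illustration only.
                    record = {"subreddit": sub, "body": f"comment {i}"}
                    handle.write(json.dumps(record) + "\n")
            paths.append(path)
        # Materialise inside the `with` block, before the files are deleted.
        return list(iter_subreddits(paths))


if __name__ == "__main__":
    for row in demo():
        print(row["subreddit"], row["body"])
```

Because both helpers are generators, nothing is read until the caller iterates, which preserves the memory motivation given for `Streaming = True` in the original docstring.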