compendious committed · Commit 81760e6 · 1 parent: bd2f24a

> woops forgot tags

Files changed (2):

1. `.github/README.md` +138 -0
2. `README.md` +12 -0
.github/README.md ADDED
@@ -0,0 +1,138 @@
# Précis

<!-- This version of the README is created just for HuggingFace to work -->

A system for compressing long-form content into clear, structured summaries. Précis is designed for videos, articles, and papers. Paste a YouTube link, drop in an article, or upload a text file. Précis pulls the key facts into a single sentence using a local LLM via [Ollama](https://ollama.com).

## Features

- **YouTube summarization**: paste a URL; the transcript is fetched automatically via `youtube-transcript-api`
- **Article / transcript**: paste any text directly
- **File upload**: drag-and-drop `.txt` files
- **Streaming**: summaries stream token-by-token from Ollama via NDJSON
- **Model switching**: choose between available Ollama models from the UI

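The streaming feature above delivers the summary as newline-delimited JSON, one JSON object per line. A minimal sketch of decoding such a stream on the client side follows; note the event schema (`token`, `done` fields) is an assumption for illustration, not the actual wire format:

```python
import json

def collect_ndjson(lines):
    """Accumulate summary text from an NDJSON token stream.

    Each line is a standalone JSON object. We assume a hypothetical
    {"token": "...", "done": bool} schema for this sketch.
    """
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip keep-alive blank lines
        event = json.loads(line)
        if event.get("done"):
            break  # final event carries no token
        parts.append(event.get("token", ""))
    return "".join(parts)

# Simulated stream, as the chunks might arrive over HTTP:
stream = [
    '{"token": "Precis "}',
    '{"token": "summarizes "}',
    '{"token": "content."}',
    '{"done": true}',
]
print(collect_ndjson(stream))  # → Precis summarizes content.
```

The same loop works unchanged over a real HTTP response iterated line by line, which is the main reason NDJSON is convenient for token streaming.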
## API Endpoints

| Method | Path                    | Description              |
|--------|-------------------------|--------------------------|
| `GET`  | `/health`               | Health check             |
| `GET`  | `/status`               | Ollama connection status |
| `GET`  | `/models`               | List available models    |
| `POST` | `/summarize/transcript` | Raw text summary         |
| `POST` | `/summarize/youtube`    | YouTube video by URL     |
| `POST` | `/summarize/file`       | `.txt` file summary      |

All `/summarize/*` endpoints accept an optional `model` field to override the default.

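As a concrete illustration, a transcript-summarization request with the optional `model` override could be built like this. This is a minimal sketch using only Python's standard library; the JSON field names (`text`, `model`) are assumptions based on the endpoint table above, not a documented schema:

```python
import json
import urllib.request

def build_summarize_request(text, model=None, base="http://localhost:8000"):
    """Build (but do not send) a POST to /summarize/transcript.

    The body field names ("text", "model") are hypothetical here;
    check the interactive docs at /docs for the real schema.
    """
    payload = {"text": text}
    if model:
        payload["model"] = model  # optional override of the default model
    return urllib.request.Request(
        f"{base}/summarize/transcript",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_summarize_request("Long transcript...", model="phi4-mini:latest")
# With the backend running, urllib.request.urlopen(req) returns the NDJSON stream.
```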
## Local Setup

### Prerequisites

- Python 3.11+
- Node.js 18+ (or an alternative like [Bun](https://bun.sh))
- [Ollama](https://ollama.com) installed and running (`ollama serve`, though it may start automatically)
- At least one model pulled, e.g. `ollama pull phi4-mini:latest`

### Run the Fine-Tuning

Follow the scripts in `scripts/`, using any model you prefer. This project has been primarily tested with phi4-mini (from Microsoft) and Qwen3-4B (from Alibaba; `ollama pull qwen3:4b` to pull it).

### Start the Backend

```bash
# Create and activate a venv, conda environment, or whatever else you may want
cd backend
pip install -r ../requirements.txt
uvicorn app:app --reload
```

Runs on `http://localhost:8000`. Interactive docs at `/docs`.

### Run the Frontend

```bash
cd frontend
npm install   # or whatever replacement for npm you may be using
npm run dev
```

Runs on `http://localhost:5173`.

<!-- ## Data -->

<!-- Later, for fine-tuning data details -->

<!-- Interview Dataset -->
<!--

@article{zhu2021mediasum,
  title   = {MediaSum: A Large-scale Media Interview Dataset for Dialogue Summarization},
  author  = {Zhu, Chenguang and Liu, Yang and Mei, Jie and Zeng, Michael},
  journal = {arXiv preprint arXiv:2103.06410},
  year    = {2021}
}

-->

<!--------------------------------------------------------------------------------------------------->

<!--

@inproceedings{chen-etal-2021-dialogsum,
  title     = "{D}ialog{S}um: {A} Real-Life Scenario Dialogue Summarization Dataset",
  author    = "Chen, Yulong and Liu, Yang and Chen, Liang and Zhang, Yue",
  booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
  month     = aug,
  year      = "2021",
  address   = "Online",
  publisher = "Association for Computational Linguistics",
  url       = "https://aclanthology.org/2021.findings-acl.449",
  doi       = "10.18653/v1/2021.findings-acl.449",
  pages     = "5062--5074",
}

-->

<!------------------------------------------------------------------------------------------------->

<!-- "Single question followed by an answer" dataset -->

<!--

@article{wang2022squality,
  title         = {SQuALITY: Building a Long-Document Summarization Dataset the Hard Way},
  author        = {Wang, Alex and Pang, Richard Yuanzhe and Chen, Angelica and Phang, Jason and Bowman, Samuel R.},
  journal       = {arXiv preprint arXiv:2205.11465},
  year          = {2022},
  archivePrefix = {arXiv},
  eprint        = {2205.11465},
  primaryClass  = {cs.CL},
  doi           = {10.48550/arXiv.2205.11465},
  url           = {https://doi.org/10.48550/arXiv.2205.11465}
}

-->

<!------------------------------------------------------------------------------------------------->

<!-- High Quality Query-Answer (concise) examples -->

<!--

@inproceedings{nguyen2016msmarco,
  title     = {MS MARCO: A Human Generated Machine Reading Comprehension Dataset},
  author    = {Nguyen, Tri and Rosenberg, Mir and Song, Xia and Gao, Jianfeng and Tiwary, Saurabh and Majumder, Rangan and Deng, Li},
  booktitle = {Proceedings of the Workshop on Cognitive Computation: Integrating Neural and Symbolic Approaches 2016},
  year      = {2016},
  publisher = {CEUR-WS.org}
}

-->

## License

[GPL-3.0](LICENSE.md)
README.md CHANGED
@@ -1,3 +1,15 @@
+ ---
+ title: Précis
+ emoji: 📝
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ sdk_version: "1"
+ python_version: "3.11"
+ app_file: app.py
+ pinned: false
+ ---
+
  # Précis

  A system for compressing long-form content into clear, structured summaries. Précis is designed for videos, articles, and papers. Paste a YouTube link, drop in an article, or upload a text file. Précis pulls the key facts into a single sentence using a local LLM via [Ollama](https://ollama.com).