mnoorchenar committed
Commit 0c077ab · 1 Parent(s): 4d55138

Update 2026-01-28 20:20:23

Files changed (2)
  1. Dockerfile +17 -5
  2. README.md +17 -201
Dockerfile CHANGED
@@ -1,6 +1,18 @@
  FROM python:3.9-slim
- WORKDIR /app
- COPY requirements.txt .
- RUN pip install --no-cache-dir -r requirements.txt
- COPY . .
- CMD ["python", "app.py"]
+
+ WORKDIR /app
+
+ # Copy requirements
+ COPY requirements.txt .
+
+ # Install dependencies
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy application
+ COPY . .
+
+ # Expose port (HF Spaces uses 7860)
+ EXPOSE 7860
+
+ # Run Flask app
+ CMD ["python", "app.py"]
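The new Dockerfile exposes port 7860 but still launches the app with `python app.py`, while the README removed in this same commit describes the app as listening on `http://localhost:5000`. `app.py` is not part of this commit, so how it binds is unknown. The following is only a minimal sketch of how the entry point could pick its host and port so that it matches the exposed port; `resolve_bind` and the `HOST`/`PORT` environment variables are hypothetical names, not taken from the repository.

```python
import os

DEFAULT_HOST = "0.0.0.0"  # listen on all interfaces so the container port is reachable
DEFAULT_PORT = 7860       # the port exposed by the Dockerfile in this commit


def resolve_bind(env=None):
    """Return (host, port), allowing overrides through environment variables.

    HOST and PORT are assumed variable names used here for illustration.
    """
    env = os.environ if env is None else env
    host = env.get("HOST", DEFAULT_HOST)
    port = int(env.get("PORT", DEFAULT_PORT))
    return host, port


if __name__ == "__main__":
    host, port = resolve_bind()
    # In a Flask app.py this pair would be passed to the development server:
    #     app.run(host=host, port=port)
    print(f"would bind to {host}:{port}")
```

Keeping the lookup in one small function means local runs can still use another port (for example `PORT=5000 python app.py`) without editing the code.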
README.md CHANGED
@@ -1,205 +1,21 @@
- # Reference Management System - Flask Application
-
- A web-based application for processing, enriching, and managing BibTeX references. Users paste their BibTeX entries, apply optional processing steps, view results, and save to a local database.
-
- ## Features
-
- - **Paste & Process**: Directly input BibTeX entries without needing files
- - **Crossref Enrichment**: Fetch updated metadata from Crossref API
- - **Journal Abbreviations**: Apply LTWA abbreviations using local `ltwa.txt` file
- - **Acronym Protection**: Wrap acronyms in braces to preserve LaTeX capitalization
- - **Database Storage**: Save all versions to SQLite database
- - **Multiple Export Formats**: Export as CSV or BibTeX file
- - **Database Management**: View, delete, and manage stored references
- - **Real-time Statistics**: Track total entries, types, and years
-
- ## Installation
-
- ### Requirements
- - Python 3.8+
- - pip
-
- ### Setup Steps
-
- 1. **Install Dependencies**
- ```bash
- pip install -r requirements.txt
- ```
-
- 2. **Add Journal Abbreviations (Optional)**
- - Place `ltwa.txt` in the same directory as `app.py`
- - Format: Tab-separated values
- ```
- Journal Full Name Abbrev
- Nature Biotechnology Nat Biotechnol
- ```
-
- 3. **Run the Application**
- ```bash
- python app.py
- ```
- - Open browser to `http://localhost:5000`
-
- ## Usage
-
- ### Basic Workflow
-
- 1. **Paste BibTeX**
- - Copy your BibTeX entries into the input area
- - Supports multiple entries
-
- 2. **Select Processing Options**
- - ✅ **Enrich with Crossref**: Query Crossref API (3-5 sec per entry)
- - ✅ **Apply Journal Abbreviations**: Requires ltwa.txt file
- - ✅ **Protect Acronyms**: Wraps multi-uppercase words in braces
- - ✅ **Save to Database**: Store for later access
-
- 3. **Process & View Results**
- - Click "Process References"
- - View in three formats: Table, BibTeX, JSON
- - Results include metadata, similarity scores, multiple BibTeX versions
-
- 4. **Manage & Export**
- - View database entries with statistics
- - Delete individual entries
- - Export all data as CSV or BibTeX
-
- ### Processing Options Explained
-
- #### Enrich with Crossref
- - Queries Crossref API for each reference
- - Fetches complete metadata and BibTeX
- - Shows title similarity percentage (95%+ uses Crossref data)
- - Slower but most accurate for published papers
-
- #### Journal Abbreviations
- - **Requires**: `ltwa.txt` file in app directory
- - Replaces full journal names with standard abbreviations
- - Essential for formatting journal names correctly
-
- #### Protect Acronyms
- - Wraps acronyms/multi-uppercase words in braces: `{RNN}`, `{LSTM}`
- - Prevents LaTeX from lowercasing acronyms in titles
- - Applied to title, booktitle, and journal fields
-
- #### Save to Database
- - Stores original and all processed versions
- - Preserves session/import date
- - Enables future bulk exports
-
- ## Database Structure
-
- The SQLite database (`refs_management.db`) includes:
-
- | Column | Purpose |
- |--------|---------|
- | id | Primary key |
- | key | BibTeX citation key |
- | type | Entry type (article, book, etc.) |
- | authors | Author names |
- | title | Paper/book title |
- | journal_booktitle | Journal or conference name |
- | year | Publication year |
- | bibtex | Original user-provided BibTeX |
- | crossref_bibtex | Crossref API fetched version |
- | crossref_bibtex_abbrev | With journal abbreviations |
- | crossref_bibtex_protected | With protected acronyms |
- | title_similarity | Crossref match percentage |
- | imported_date | When imported |
-
- ## API Endpoints
-
- ### Process References
- - **POST** `/api/process`
- - **Body**:
- ```json
- {
- "bibtex_content": "...",
- "enrich": true,
- "abbreviate": true,
- "protect": true,
- "save_to_db": true
- }
- ```
-
- ### Database Management
- - **GET** `/api/database/entries` - List all entries
- - **DELETE** `/api/database/delete/<key>` - Delete entry
- - **GET** `/api/database/export` - Export CSV
- - **GET** `/api/database/export-bibtex` - Export BibTeX
-
- ### Statistics
- - **GET** `/api/stats` - Database statistics
-
- ## File Structure
-
- ```
- .
- ├── app.py # Flask application
- ├── templates/
- │ └── index.html # Web interface
- ├── requirements.txt # Python dependencies
- ├── refs_management.db # Database (created automatically)
- └── ltwa.txt # Journal abbreviations (optional)
+ ---
+ title: Reference Management System
+ emoji: 📚
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ app_file: app.py
+ pinned: false
+ ---
  ```
 
- ## Example BibTeX Input
-
- ```bibtex
- @article{Smith2024,
- author = {Smith, John and Doe, Jane},
- title = {Machine Learning for Natural Language Processing},
- journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
- year = {2024},
- volume = {46},
- pages = {1234-1250}
- }
+ Then keep your existing documentation below it.
 
- @book{Johnson2023,
- author = {Johnson, Michael},
- title = {Deep Learning Fundamentals},
- publisher = {Academic Press},
- year = {2023}
- }
+ Final repo should have:
  ```
-
- ## Troubleshooting
-
- ### "No journal abbreviations found"
- - Ensure `ltwa.txt` is in the same directory as `app.py`
- - Verify tab-separated format
-
- ### Crossref enrichment is slow
- - Normal behavior (API rate limiting)
- - Average: 3-5 seconds per entry
- - Results cached in database
-
- ### Database not updating
- - Check `refs_management.db` permissions
- - Ensure "Save to Database" option is checked
-
- ### Port 5000 already in use
- - Change in app.py: `app.run(debug=True, port=5001)`
-
- ## Performance Notes
-
- - **Crossref API**: ~3-5 seconds per reference (rate-limited)
- - **Batch processing**: Process 10 references in ~30-50 seconds
- - **Database**: Handles 1000+ entries efficiently
-
- ## Future Enhancements
-
- - Multiple database tables for different projects
- - BibTeX key normalization
- - Duplicate detection
- - Advanced search/filtering
- - Batch upload capability
-
- ## Support
-
- For issues with the Crossref API, visit: https://crossref.org
- For journal abbreviation standards: https://www.nlm.nih.gov/bsd/ltwa_mainpage.html
-
- ## License
-
- This application is provided as-is for reference management purposes.
+ app.py
+ requirements.txt
+ Dockerfile
+ .gitattributes
+ README.md (with HF header)
+ templates/index.html
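The new README lists the files the final repository should contain and opens with a `---`-delimited header. As a rough illustration only, the sketch below checks a local checkout against that list and tests whether a README text starts with such a header. The helper names `missing_files` and `has_front_matter` are hypothetical, and the check is purely structural: it does not validate the header's keys or values.

```python
from pathlib import Path

# File list copied from the README added in this commit.
EXPECTED_FILES = [
    "app.py",
    "requirements.txt",
    "Dockerfile",
    ".gitattributes",
    "README.md",
    "templates/index.html",
]


def missing_files(repo_root):
    """Return the expected files that are absent under repo_root, in list order."""
    root = Path(repo_root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def has_front_matter(readme_text):
    """True if the text opens with a '---' line and a later line closes it with '---'."""
    lines = [line.strip() for line in readme_text.splitlines()]
    return bool(lines) and lines[0] == "---" and "---" in lines[1:]


if __name__ == "__main__":
    print("missing:", missing_files("."))
    readme = Path("README.md")
    if readme.is_file():
        print("front matter:", has_front_matter(readme.read_text(encoding="utf-8")))
```

Running it from the repository root before pushing gives a quick signal that nothing from the list was left out.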