fffffwl committed on
Commit
0b8530c
·
0 Parent(s):

Initial HF Space for Swedish CEFR web app

.dockerignore ADDED
@@ -0,0 +1,5 @@
+ __pycache__/
+ *.pyc
+ *.log
+ *.tmp
+ .venv/
.gitattributes ADDED
@@ -0,0 +1 @@
+ *.pt filter=lfs diff=lfs merge=lfs -text
Dockerfile ADDED
@@ -0,0 +1,16 @@
+ FROM python:3.10-slim
+
+ ENV PYTHONDONTWRITEBYTECODE=1
+ ENV PYTHONUNBUFFERED=1
+ ENV PORT=7860
+
+ WORKDIR /app
+
+ COPY web_app/requirements.txt /app/web_app/requirements.txt
+ RUN pip install --no-cache-dir -r /app/web_app/requirements.txt gunicorn
+
+ COPY . /app
+ WORKDIR /app/web_app
+
+ EXPOSE 7860
+ CMD ["gunicorn", "-w", "1", "-k", "gthread", "--threads", "4", "-b", "0.0.0.0:7860", "app:app"]
README.md ADDED
@@ -0,0 +1,16 @@
+ ---
+ title: Swedish CEFR Sentence Grader
+ colorFrom: yellow
+ colorTo: blue
+ sdk: docker
+ app_port: 7860
+ pinned: false
+ ---
+
+ # Swedish CEFR Sentence Grader
+
+ Flask web app for sentence-level CEFR assessment in Swedish using a Metric Proto K3 model.
+
+ - Base model: KB/bert-base-swedish-cased
+ - Levels: A1-C2
+ - Input: Swedish text, auto sentence splitting
runs/metric-proto-k3/metric_proto.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:30137ef1bc2c6def1b17e3e018edc704eda2ea4411840cfc45867d43a727b4fe
+ size 498903733
web_app/README.md ADDED
@@ -0,0 +1,244 @@
1
+ # CEFR Sentence-Level Assessment Web Application
2
+
3
+ A Flask-based web interface for assessing Swedish text at the sentence level using a trained CEFR (Common European Framework of Reference for Languages) classification model.
4
+
5
+ ## Features
6
+
7
+ - **Web Interface**: Clean, modern UI for easy text input and analysis
8
+ - **Sentence Segmentation**: Automatically splits text into sentences
9
+ - **CEFR Level Assessment**: Assigns proficiency levels (A1-C2) to each sentence
10
+ - **Real-time Results**: Visual highlighting of CEFR levels in the text
11
+ - **Statistics Dashboard**: Shows distribution of levels and confidence scores
12
+ - **Detailed Table**: View all sentences with their levels and confidence
13
+
14
+ ## Model Information
15
+
16
+ - **Architecture**: Metric Proto K3 (Prototype-based Classification)
17
+ - **Base Model**: KB/bert-base-swedish-cased
18
+ - **Prototypes**: 3 prototypes per CEFR level (K=3)
19
+ - **Temperature**: 10.0
20
+ - **Performance**: 84.1% macro F1, 87.3% accuracy, 94.5% QWK
21
+ - **Device**: CUDA (if available) or CPU
22
+
23
+ ## Project Structure
24
+
25
+ ```
26
+ web_app/
27
+ ├── app.py # Flask application
28
+ ├── model.py # Model loading and inference
29
+ ├── requirements.txt # Python dependencies
30
+ ├── templates/
31
+ │ └── index.html # Main HTML template
32
+ └── static/
33
+ ├── css/
34
+ │ └── style.css # Styling
35
+ └── js/
36
+ └── app.js # Frontend JavaScript
37
+ ```
38
+
39
+ ## Installation
40
+
41
+ ### Prerequisites
42
+
43
+ - Python 3.8+
44
+ - CUDA-compatible GPU (optional but recommended)
45
+ - Linux/macOS
46
+
47
+ ### Setup
48
+
49
+ 1. **Ensure virtual environment is set up** (from project root):
50
+ ```bash
51
+ cd /home/fwl/src/textmining
52
+ # Virtual environment should already exist at .venv/
53
+ ```
54
+
55
+ 2. **Activate virtual environment**:
56
+ ```bash
57
+ source .venv/bin/activate
58
+ ```
59
+
60
+ 3. **Install Flask** (if not already installed):
61
+ ```bash
62
+ pip install flask flask-cors
63
+ ```
64
+
65
+ 4. **Navigate to web app directory**:
66
+ ```bash
67
+ cd web_app
68
+ ```
69
+
70
+ 5. **Verify model weights exist**:
71
+ ```bash
72
+ ls ../runs/metric-proto-k3/metric_proto.pt
73
+ ```
74
+
75
+ ## Running the Application
76
+
77
+ ### Development Server
78
+
79
+ 1. **Start the Flask application**:
80
+ ```bash
81
+ # Make sure virtual environment is activated
82
+ source /home/fwl/src/textmining/.venv/bin/activate
83
+
84
+ # Run the app
85
+ cd /home/fwl/src/textmining/web_app
86
+ python -m flask run --host=0.0.0.0 --port=5000
87
+ ```
88
+
89
+ 2. **Access the web interface**:
90
+ Open your browser and go to: http://localhost:5000
91
+
92
+ ### Production Deployment (Gunicorn)
93
+
94
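+ For production use, install Gunicorn and run the command below. Each worker process loads its own copy of the model, so lower `--workers` if memory is limited: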
95
+
96
+ ```bash
97
+ pip install gunicorn
98
+ gunicorn --bind 0.0.0.0:5000 app:app --workers 4
99
+ ```
100
+
101
+ ## Usage
102
+
103
+ ### Web Interface
104
+
105
+ 1. **Enter Swedish text** in the large text area
106
+ 2. **Click "Analyze Text"** button
107
+ 3. **View results**:
108
+ - Statistics overview (total sentences, average confidence, dominant level)
109
+ - CEFR level distribution bar chart
110
+ - Annotated text with color-coded levels
111
+ - Detailed table of all sentences
112
+
113
+ ### API Endpoints
114
+
115
+ #### Assess Text
116
+ ```http
117
+ POST /assess
118
+ Content-Type: application/json
119
+
120
+ {
121
+ "text": "Jag heter Anna. Jag kommer från Sverige."
122
+ }
123
+ ```
124
+
125
+ Response:
126
+ ```json
127
+ {
128
+ "results": [
129
+ {
130
+ "sentence": "Jag heter Anna.",
131
+ "level": "A1",
132
+ "confidence": 0.85
133
+ }
134
+ ],
135
+ "stats": {
136
+ "total_sentences": 1,
137
+ "avg_confidence": 0.85,
138
+ "level_distribution": {"A1": 1},
139
+ "most_common_level": {"level": "A1", "count": 1, "percentage": 100}
140
+ }
141
+ }
142
+ ```
143
+
144
+ #### Batch Predict API
145
+ ```http
146
+ POST /api/predict
147
+ Content-Type: application/json
148
+
149
+ {
150
+ "sentences": ["Sentence 1", "Sentence 2", ...]
151
+ }
152
+ ```
153
+
154
+ Response:
155
+ ```json
156
+ {
157
+ "predictions": [
158
+ {
159
+ "sentence": "Sentence 1",
160
+ "level": "B1",
161
+ "confidence": 0.72
162
+ }
163
+ ],
164
+ "count": 1
165
+ }
166
+ ```
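A minimal client sketch for the batch endpoint above, assuming the development server is reachable at `http://localhost:5000`. The helper names `build_predict_request` and `predict` are illustrative and not part of the app; the 100-sentence check mirrors the batch limit enforced in `app.py`.

```python
import json
import urllib.request

# Assumption: the development server address used elsewhere in this README.
BASE_URL = "http://localhost:5000"


def build_predict_request(sentences, base_url=BASE_URL):
    """Build (but do not send) a POST request for /api/predict."""
    if not isinstance(sentences, list) or not sentences:
        raise ValueError("sentences must be a non-empty list")
    if len(sentences) > 100:
        raise ValueError("the endpoint accepts at most 100 sentences per call")
    payload = json.dumps({"sentences": sentences}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(sentences, base_url=BASE_URL, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_predict_request(sentences, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `predict(["Jag heter Anna."])["predictions"]` should yield a list of objects with `sentence`, `level`, and `confidence` keys, as in the response shown above.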
167
+
168
+ ## CEFR Level Reference
169
+
170
+ | Level | Name | Description | Color |
+ |-------|------|-------------|-------|
+ | A1 | Beginner | Basic phrases and simple sentences | 🔴 Red |
+ | A2 | Elementary | Simple direct exchanges of information | 🟠 Orange |
+ | B1 | Intermediate | Simple connected text on familiar topics | 🟡 Yellow |
+ | B2 | Upper Intermediate | Complex text, technical discussions | 🟢 Green |
+ | C1 | Advanced | Flexible, effective, nuanced expression | 🔵 Blue |
+ | C2 | Proficient | Precise, sophisticated, complex content | 🟣 Purple |
178
+
179
+ ## Troubleshooting
180
+
181
+ ### Model Loading Issues
182
+
183
+ If model fails to load:
184
+ 1. Check that model weights exist: `runs/metric-proto-k3/metric_proto.pt`
185
+ 2. Verify virtual environment is activated
186
+ 3. Check CUDA availability: `python -c "import torch; print(torch.cuda.is_available())"`
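Because `*.pt` files are tracked with Git LFS in this repository, one further cause worth ruling out is a checkpoint that was cloned as an LFS pointer rather than the real weights. The sketch below is an assumption-labeled helper (the name `checkpoint_status` is hypothetical, not part of `model.py`) that distinguishes a missing file, an LFS pointer, and a regular file.

```python
from pathlib import Path

# First bytes of a Git LFS pointer file (as committed for metric_proto.pt).
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def checkpoint_status(path: str) -> str:
    """Return 'missing', 'lfs-pointer', or 'ok' for a checkpoint path."""
    checkpoint = Path(path)
    if not checkpoint.is_file():
        return "missing"
    with checkpoint.open("rb") as handle:
        head = handle.read(len(LFS_HEADER))
    if head == LFS_HEADER:
        # Only the pointer was checked out; the real weights were not fetched.
        return "lfs-pointer"
    return "ok"


if __name__ == "__main__":
    print(checkpoint_status("runs/metric-proto-k3/metric_proto.pt"))
```

If the result is `lfs-pointer`, running `git lfs pull` in the repository normally fetches the actual file.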
187
+
188
+ ### Out of Memory Errors
189
+
190
+ If you encounter OOM errors:
191
+ 1. Reduce batch size in `model.py` (modify `predict_batch`)
192
+ 2. Use CPU instead of GPU: Set `device='cpu'` in CEFRModel initialization
193
+ 3. Process text in smaller chunks
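One way to apply the "smaller chunks" advice without editing the model is to wrap the batch call. This is a sketch under stated assumptions: `predict_fn` stands in for something like `model.predict_batch`, and the names `chunked` and `predict_in_chunks` as well as the default chunk size of 16 are illustrative, not taken from the app.

```python
from typing import Callable, Iterator, List, Tuple


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_chunks(
    sentences: List[str],
    predict_fn: Callable[[List[str]], List[Tuple[str, float]]],
    chunk_size: int = 16,  # assumption: tune to the available memory
) -> List[Tuple[str, float]]:
    """Run predict_fn on small batches and concatenate results in order."""
    results: List[Tuple[str, float]] = []
    for batch in chunked(sentences, chunk_size):
        results.extend(predict_fn(batch))
    return results
```

Peak memory then scales with `chunk_size` rather than with the total number of sentences, at the cost of more forward passes.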
194
+
195
+ ### Prediction Time
196
+
197
+ The model is loaded once when the application starts, so startup can be slow (the checkpoint is roughly 500 MB). The first prediction may also take longer than subsequent ones.
198
+
199
+ ## Model Details
200
+
201
+ ### Architecture
202
+
203
+ The model uses a prototype-based approach:
204
+ - Encodes sentences using Swedish BERT
205
+ - Computes cosine similarity to learned prototypes
206
+ - Each CEFR level has 3 prototypes (K=3)
207
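+ - Similarities are averaged over the K prototypes of each level, then multiplied by the temperature (T=10.0) before the softmax, which sharpens predictions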
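The scoring steps above can be sketched in plain Python on toy vectors. This is an illustration of the idea only, not the code in `model.py` (which uses PyTorch tensors and BERT embeddings); the function names and the 2-dimensional example are assumptions made for readability.

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def prototype_logits(
    embedding: List[float],
    prototypes: List[List[List[float]]],  # indexed as [class][k][dim]
    temperature: float = 10.0,
) -> List[float]:
    """Mean cosine similarity to each class's prototypes, times temperature."""
    x = l2_normalize(embedding)
    logits = []
    for class_prototypes in prototypes:
        similarities = [
            sum(a * b for a, b in zip(x, l2_normalize(proto)))
            for proto in class_prototypes
        ]
        logits.append(temperature * sum(similarities) / len(similarities))
    return logits


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    toy_prototypes = [
        [[1.0, 0.0], [1.0, 0.1]],  # class 0 prototypes
        [[0.0, 1.0], [0.1, 1.0]],  # class 1 prototypes
    ]
    print(softmax(prototype_logits([1.0, 0.0], toy_prototypes)))
```

Because cosine similarities lie in [-1, 1], the unscaled logits differ by at most 2; multiplying by a temperature above 1 widens that gap, which is why the softmax output becomes more peaked.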
208
+
209
+ ### Training Data
210
+
211
+ Model trained on Swedish CEFR-labeled sentences from:
212
+ - SUC 3.0 corpus
213
+ - COCTAILL corpus
214
+ - Filtered for quality and length constraints
215
+
216
+ ### Performance Metrics
217
+
218
+ - **Accuracy**: 87.3%
219
+ - **Macro F1**: 84.1%
220
+ - **Quadratic Weighted Kappa**: 94.5%
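For readers unfamiliar with the last metric, here is a small sketch of how quadratic weighted kappa is commonly computed for ordinal labels such as the six CEFR levels (encoded 0 to 5). Whether the figure above was produced with exactly this formulation is not stated in this README, so treat the code as a reference definition rather than the project's evaluation script.

```python
from typing import List


def quadratic_weighted_kappa(
    y_true: List[int], y_pred: List[int], num_classes: int = 6
) -> float:
    """Cohen's kappa with quadratic disagreement weights."""
    n = len(y_true)
    # Observed confusion matrix: rows are true labels, columns predictions.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    true_hist = [sum(row) for row in observed]
    pred_hist = [
        sum(observed[i][j] for i in range(num_classes))
        for j in range(num_classes)
    ]

    numerator = 0.0
    denominator = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Penalty grows with the squared distance between levels.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            expected = true_hist[i] * pred_hist[j] / n
            numerator += weight * observed[i][j]
            denominator += weight * expected

    # If chance disagreement is zero (a single label throughout), return 1.0.
    return 1.0 - numerator / denominator if denominator else 1.0
```

Unlike accuracy, this metric penalizes an A1 sentence predicted as C2 far more than one predicted as A2, which suits ordered proficiency levels.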
221
+
222
+ ## Development
223
+
224
+ ### Adding Features
225
+
226
+ To add new features:
227
+ 1. Modify `app.py` for backend logic
228
+ 2. Update `templates/index.html` for UI
229
+ 3. Add styles to `static/css/style.css`
230
+ 4. Implement frontend logic in `static/js/app.js`
231
+
232
+ ### Frontend Structure
233
+
234
+ - Vanilla JavaScript (no frameworks required)
235
+ - Responsive design with CSS Grid and Flexbox
236
+ - Modern UI with animations and transitions
237
+
238
+ ## License
239
+
240
+ Same as parent project.
241
+
242
+ ## Citation
243
+
244
+ If you use this web application in your research, please cite the original CEFR-SP paper and this implementation.
web_app/STARTUP.md ADDED
@@ -0,0 +1,73 @@
1
+ # CEFR Auto-Grader Web App - Quick Start Guide
2
+
3
+ ## Application Status
4
+ ✅ **RUNNING** - Fully functional
5
+
6
+ ## Quick Access
7
+ - **Web Interface**: http://localhost:5000
8
+ - **LAN Access**: http://192.168.1.11:5000
9
+
10
+ ## Starting the Application
11
+
12
+ If the app is not running, start it from the project root:
13
+
14
+ ```bash
15
+ cd /home/fwl/src/textmining
16
+ source .venv/bin/activate
17
+ python web_app/app.py
18
+ ```
19
+
20
+ Or run in background:
21
+ ```bash
22
+ nohup python web_app/app.py > web_app/flask.log 2>&1 &
23
+ ```
24
+
25
+ ## Model Information
26
+
27
+ - **Architecture**: Metric Proto K3
28
+ - **Base Model**: KB/bert-base-swedish-cased
29
+ - **Device**: CUDA (if available) or CPU
30
+ - **Performance**: 84.1% macro F1, 87.3% accuracy
31
+
32
+ ## Testing Examples
33
+
34
+ | Sentence | Predicted Level | Confidence |
+ |----------|----------------|------------|
+ | "Hej." | A1 | 98.9% |
+ | "Jag heter Anna." | A1 | 98.9% |
+ | "Jag studerar svenska." | A1 | 99.1% |
+ | "Den komplexa algoritmen..." | B2 | 99.0% |
+ | "Det metodologiska ramverket..." | C1 | 99.1% |
41
+
42
+ ## Features
43
+
44
+ - 📝 Large text input area
45
+ - 🔍 Automatic sentence segmentation
46
+ - 🎨 Color-coded CEFR levels (A1-C2)
47
+ - 📊 Statistics dashboard
48
+ - 📈 Level distribution visualization
49
+ - 📋 Detailed results table
50
+ - ⚡ Real-time processing
51
+
52
+ ## Files
53
+
54
+ - `app.py` - Flask application
55
+ - `model.py` - Model loading & inference
56
+ - `templates/index.html` - Web interface
57
+ - `static/css/style.css` - Styling
58
+ - `static/js/app.js` - Frontend logic
59
+
60
+ ## Troubleshooting
61
+
62
+ If predictions are all the same level:
63
+ 1. Check model loaded: `grep "Loading model" web_app/flask.log`
64
+ 2. Verify model path: `ls runs/metric-proto-k3/metric_proto.pt`
65
+ 3. Restart from project root: `cd /home/fwl/src/textmining`
66
+
67
+ ## API Endpoint
68
+
69
+ ```bash
70
+ curl -X POST http://localhost:5000/assess \
71
+ -H "Content-Type: application/json" \
72
+ -d '{"text": "Jag heter Anna."}'
73
+ ```
web_app/app.py ADDED
@@ -0,0 +1,174 @@
1
+ """
2
+ CEFR Sentence Level Assessment Web Application
3
+ Flask-based web interface for assessing Swedish text at sentence level
4
+ """
5
+
6
+ import os
7
+ from pathlib import Path
8
+
9
+ from flask import Flask, render_template, request, jsonify
10
+
11
+ from model import CEFRModel, assess_text
12
+
13
+ # Initialize Flask app
14
+ app = Flask(__name__)
15
+ app.config['SECRET_KEY'] = os.environ.get('SECRET_KEY', 'cefr-assessment-app')
16
+
17
+ # Initialize model
18
+ print("Loading CEFR assessment model...")
19
+ model_path = os.environ.get('MODEL_PATH', 'runs/metric-proto-k3/metric_proto.pt')
20
+ model = CEFRModel(model_path=model_path)
21
+ print(f"Model loaded successfully! Using device: {model.device}")
22
+
23
+ # CEFR level styles for HTML display
24
+ CEFR_STYLES = {
25
+ 'A1': {'color': '#E74C3C', 'name': 'A1 - Beginner'},
26
+ 'A2': {'color': '#E67E22', 'name': 'A2 - Elementary'},
27
+ 'B1': {'color': '#F39C12', 'name': 'B1 - Intermediate'},
28
+ 'B2': {'color': '#27AE60', 'name': 'B2 - Upper Intermediate'},
29
+ 'C1': {'color': '#3498DB', 'name': 'C1 - Advanced'},
30
+ 'C2': {'color': '#9B59B6', 'name': 'C2 - Proficient'},
31
+ }
32
+
33
+
34
+ @app.route('/')
35
+ def index():
36
+ """Home page with text input form"""
37
+ return render_template('index.html')
38
+
39
+
40
+ @app.route('/assess', methods=['POST'])
41
+ def assess():
42
+ """Assess text and return results"""
43
+ try:
44
+ # Get text from the JSON request body
45
+ data = request.get_json(silent=True) or {}
46
+ text = data.get('text', '').strip()
47
+
48
+ if not text:
49
+ return jsonify({'error': 'Please enter some text to assess'}), 400
50
+
51
+ # Limit text length
52
+ if len(text) > 50000: # ~50KB limit
53
+ return jsonify({'error': 'Text is too long. Please limit to 50,000 characters.'}), 400
54
+
55
+ # Assess text
56
+ results = assess_text(text, model)
57
+
58
+ if not results:
59
+ return jsonify({'error': 'No valid sentences found in the text'}), 400
60
+
61
+ # Prepare response
62
+ response = {
63
+ 'results': results,
64
+ 'cefr_styles': CEFR_STYLES,
65
+ 'stats': compute_stats(results)
66
+ }
67
+
68
+ return jsonify(response)
69
+
70
+ except Exception as e:
71
+ print(f"Error in assessment: {str(e)}")
72
+ return jsonify({'error': f'An error occurred during assessment: {str(e)}'}), 500
73
+
74
+
75
+ @app.route('/api/predict', methods=['POST'])
76
+ def api_predict():
77
+ """API endpoint for batch predictions"""
78
+ try:
79
+ data = request.get_json(silent=True) or {}
80
+ sentences = data.get('sentences', [])
81
+
82
+ if not sentences:
83
+ return jsonify({'error': 'No sentences provided'}), 400
84
+
85
+ if not isinstance(sentences, list):
86
+ return jsonify({'error': 'Sentences must be a list'}), 400
87
+
88
+ # Limit batch size
89
+ if len(sentences) > 100:
90
+ return jsonify({'error': 'Batch size limited to 100 sentences'}), 400
91
+
92
+ # Predict
93
+ predictions = model.predict_batch(sentences)
94
+
95
+ # Format response
96
+ results = []
97
+ for sent, (level, confidence) in zip(sentences, predictions):
98
+ results.append({
99
+ 'sentence': sent,
100
+ 'level': level,
101
+ 'confidence': confidence
102
+ })
103
+
104
+ return jsonify({
105
+ 'predictions': results,
106
+ 'count': len(results)
107
+ })
108
+
109
+ except Exception as e:
110
+ print(f"Error in API prediction: {str(e)}")
111
+ return jsonify({'error': str(e)}), 500
112
+
113
+
114
+ def compute_stats(results: list) -> dict:
115
+ """Compute statistics about the assessment results"""
116
+ if not results:
117
+ return {}
118
+
119
+ # Count levels
120
+ level_counts = {}
121
+ for item in results:
122
+ level = item['level']
123
+ level_counts[level] = level_counts.get(level, 0) + 1
124
+
125
+ # Average confidence
126
+ avg_confidence = sum(item['confidence'] for item in results) / len(results)
127
+
128
+ # Most common level
129
+ if level_counts:
130
+ most_common = max(level_counts, key=level_counts.get)
131
+ most_common_count = level_counts[most_common]
132
+ most_common_pct = (most_common_count / len(results)) * 100
133
+ else:
134
+ most_common = None
135
+ most_common_count = 0
136
+ most_common_pct = 0
137
+
138
+ return {
139
+ 'total_sentences': len(results),
140
+ 'level_distribution': level_counts,
141
+ 'avg_confidence': avg_confidence,
142
+ 'most_common_level': {
143
+ 'level': most_common,
144
+ 'count': most_common_count,
145
+ 'percentage': round(most_common_pct, 1)
146
+ }
147
+ }
148
+
149
+
150
+ @app.context_processor
151
+ def utility_processor():
152
+ """Utility functions for Jinja templates"""
153
+ return dict(
154
+ round=round,
155
+ len=len
156
+ )
157
+
158
+
159
+ if __name__ == '__main__':
160
+ # Create uploads directory
161
+ os.makedirs('uploads', exist_ok=True)
162
+
163
+ print("Starting CEFR Assessment Web App...")
164
+ print(f"\nModel path: {model_path}")
165
+ print(f"Model device: {model.device}")
166
+ print("\nStarting Flask server...")
167
+
168
+ # Run app
169
+ app.run(
170
+ debug=os.environ.get('FLASK_DEBUG') == '1',  # off by default; the server binds to 0.0.0.0
171
+ host='0.0.0.0',
172
+ port=5000,
173
+ threaded=True
174
+ )
web_app/debug_model.py ADDED
@@ -0,0 +1,57 @@
1
+ #!/usr/bin/env python
2
+ import sys
3
+ sys.path.append('/home/fwl/src/textmining')
4
+
5
+ from web_app.model import CEFRModel
6
+ import torch
7
+
8
+ print("Loading model...")
9
+ model = CEFRModel(model_path='runs/metric-proto-k3/metric_proto.pt')
10
+
11
+ # Test simple sentence
12
+ sentences = ["Jag är bra."]
13
+ print(f"\nTesting: {sentences}")
14
+
15
+ # Tokenize
16
+ encoded = model.tokenize(sentences)
17
+ input_ids = encoded["input_ids"].to(model.device)
18
+ attention_mask = encoded["attention_mask"].to(model.device)
19
+
20
+ print(f"Input shape: {input_ids.shape}")
21
+ print(f"Device: {model.device}")
22
+
23
+ # Predict
24
+ with torch.no_grad():
25
+ logits = model.model(input_ids, attention_mask)["logits"]
26
+ print(f"Logits shape: {logits.shape}")
27
+ print(f"Logits: {logits}")
28
+
29
+ probs = torch.softmax(logits, dim=1)
30
+ print(f"Probs shape: {probs.shape}")
31
+ print(f"Probs: {probs}")
32
+
33
+ predictions = torch.argmax(logits, dim=1)
34
+ print(f"Predictions: {predictions}")
35
+
36
+ # Test different ways to extract confidence
37
+ cpu_probs = probs.cpu()
38
+ for i, pred in enumerate(predictions.cpu().numpy()):
39
+ print(f"\nSentence {i}: '{sentences[i]}'")
40
+ print(f" Predicted class: {pred}")
41
+ print(f" Predicted level: {model.id_to_label[pred]}")
42
+ print(f" Method 1 - probs[i][pred]: {probs[i][pred].item()}")
43
+ print(f" Method 2 - cpu_probs[i][pred]: {cpu_probs[i][pred].item()}")
44
+ print(f" Method 3 - float(cpu_probs[i][pred].item()): {float(cpu_probs[i][pred].item())}")
45
+
46
+ # Test using predict_batch
47
+ print("\n" + "="*60)
48
+ print("Using predict_batch method:")
49
+ results = model.predict_batch(sentences)
50
+ for sent, (level, conf) in zip(sentences, results):
51
+ print(f" {level} ({conf*100:.1f}%): {sent}")
52
+
53
+ # Test using predict_sentence
54
+ print("\n" + "="*60)
55
+ print("Using predict_sentence method:")
56
+ level, conf = model.predict_sentence(sentences[0])
57
+ print(f" {level} ({conf*100:.1f}%): {sentences[0]}")
web_app/model.py ADDED
@@ -0,0 +1,297 @@
1
+ """
2
+ CEFR Sentence Level Assessment Model
3
+ Loads and runs inference with the metric proto k3 model
4
+ """
5
+
6
+ import re
7
+ from pathlib import Path
8
+ from typing import List, Tuple, Dict
9
+
10
+ import torch
11
+ from transformers import AutoTokenizer, AutoModel
12
+
13
+
14
+ class PrototypeClassifier(torch.nn.Module):
15
+ """Metric-based prototype classifier for CEFR level assessment"""
16
+
17
+ def __init__(
18
+ self,
19
+ encoder,
20
+ num_labels: int,
21
+ hidden_size: int,
22
+ prototypes_per_class: int,
23
+ temperature: float = 10.0,
24
+ layer_index: int = -2,
25
+ ):
26
+ super().__init__()
27
+ self.encoder = encoder
28
+ self.num_labels = num_labels
29
+ self.prototypes_per_class = prototypes_per_class
30
+ self.temperature = temperature
31
+ self.layer_index = layer_index
32
+ self.prototypes = torch.nn.Parameter(
33
+ torch.empty(num_labels, prototypes_per_class, hidden_size)
34
+ )
35
+
36
+ def set_prototypes(self, proto_tensor: torch.Tensor) -> None:
37
+ """Set prototype weights"""
38
+ with torch.no_grad():
39
+ self.prototypes.copy_(proto_tensor)
40
+
41
+ def encode(self, input_ids, attention_mask, token_type_ids=None) -> torch.Tensor:
42
+ """Encode input sentences to normalized embeddings"""
43
+ outputs = self.encoder(
44
+ input_ids=input_ids,
45
+ attention_mask=attention_mask,
46
+ token_type_ids=token_type_ids,
47
+ output_hidden_states=True,
48
+ )
49
+ hidden = outputs.hidden_states[self.layer_index]
50
+ # mean pooling
51
+ mask = attention_mask.unsqueeze(-1).float()
52
+ summed = torch.sum(hidden * mask, dim=1)
53
+ counts = torch.clamp(mask.sum(dim=1), min=1e-9)
54
+ pooled = summed / counts
55
+ pooled = torch.nn.functional.normalize(pooled, p=2, dim=1)
56
+ return pooled
57
+
58
+ def forward(self, input_ids, attention_mask, token_type_ids=None):
59
+ """Forward pass returning logits"""
60
+ x = self.encode(input_ids, attention_mask, token_type_ids)
61
+ # cosine similarity with prototypes, average over K for each class
62
+ protos = torch.nn.functional.normalize(self.prototypes, p=2, dim=-1)
63
+ # [B, H] x [C,K,H] -> [B,C,K]
64
+ sim = torch.einsum("bh,ckh->bck", x, protos)
65
+ sim_mean = sim.mean(dim=2) # average over K
66
+ logits = sim_mean * self.temperature
67
+ return {"logits": logits}
68
+
69
+ def predict(self, input_ids, attention_mask, token_type_ids=None) -> torch.Tensor:
70
+ """Predict CEFR levels"""
71
+ outputs = self.forward(input_ids, attention_mask, token_type_ids)
72
+ return torch.argmax(outputs["logits"], dim=1)
73
+
74
+
75
+ class CEFRModel:
76
+ """Wrapper class for CEFR assessment model"""
77
+
78
+ def __init__(self, model_path: str = None, device: str = None):
79
+ """
80
+ Initialize the CEFR assessment model
81
+
82
+ Args:
83
+ model_path: Path to the trained model checkpoint
84
+ device: Device to run inference on ('cuda' or 'cpu')
85
+ """
86
+ if device is None:
87
+ self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
88
+ else:
89
+ self.device = torch.device(device)
90
+
91
+ # CEFR level mapping
92
+ self.id_to_label = {0: "A1", 1: "A2", 2: "B1", 3: "B2", 4: "C1", 5: "C2"}
93
+ self.label_to_id = {v: k for k, v in self.id_to_label.items()}
94
+
95
+ # Model parameters
96
+ self.model_name = "KB/bert-base-swedish-cased"
97
+ self.hidden_size = 768
98
+ self.num_labels = 6
99
+ self.prototypes_per_class = 3
100
+ self.temperature = 10.0
101
+
102
+ # Load tokenizer
103
+ self.tokenizer = AutoTokenizer.from_pretrained(self.model_name)
104
+
105
+ # Load model
106
+ encoder = AutoModel.from_pretrained(self.model_name)
107
+ self.model = PrototypeClassifier(
108
+ encoder=encoder,
109
+ num_labels=self.num_labels,
110
+ hidden_size=self.hidden_size,
111
+ prototypes_per_class=self.prototypes_per_class,
112
+ temperature=self.temperature,
113
+ )
114
+
115
+ # Load trained weights
116
+ if model_path is None:
117
+ # Try to find the model automatically
118
+ default_paths = [
119
+ "runs/metric-proto-k3/metric_proto.pt",
120
+ "runs/metric-proto/metric_proto.pt",
121
+ "runs/bert-baseline/bert_baseline.pt",
122
+ "../runs/metric-proto-k3/metric_proto.pt", # Relative to web_app/
123
+ ]
124
+ for path in default_paths:
125
+ if Path(path).exists():
126
+ model_path = path
127
+ print(f"Auto-detected model: {model_path}")
128
+ break
129
+
130
+ if model_path:
131
+ # Try different relative paths
132
+ possible_paths = [
133
+ Path(model_path),
134
+ Path(__file__).parent / model_path,
135
+ Path(__file__).parent.parent / model_path,
136
+ ]
137
+
138
+ checkpoint = None
139
+ for path in possible_paths:
140
+ if path.exists():
141
+ print(f"Loading model from {path}")
142
+ checkpoint = torch.load(path, map_location=self.device, weights_only=False)
143
+ break
144
+
145
+ if checkpoint is None:
146
+ print(f"Warning: Model file not found at {model_path}")
147
+ print("Model will be initialized with random weights!")
148
+ else:
149
+ print("Warning: No model path specified. Model will be initialized with random weights!")
150
+ checkpoint = None
151
+
152
+ if checkpoint is not None:
153
+
154
+ # Load model state dict
155
+ if "state_dict" in checkpoint:
156
+ state_dict = checkpoint["state_dict"]
157
+ # Strip a leading 'model.' prefix from checkpoint keys (e.g. left by a training wrapper)
158
+ new_state_dict = {}
159
+ for key, value in state_dict.items():
160
+ if key.startswith("model."):
161
+ new_key = key[6:] # Remove 'model.' prefix
162
+ else:
163
+ new_key = key
164
+ new_state_dict[new_key] = value
165
+ self.model.load_state_dict(new_state_dict, strict=False)
166
+ else:
167
+ self.model.load_state_dict(checkpoint)
168
+
169
+ # Load prototypes if available
170
+ if "prototypes" in checkpoint:
171
+ self.model.set_prototypes(checkpoint["prototypes"].to(self.device))
172
+
173
+ self.model.to(self.device)
174
+ self.model.eval()
175
+
176
+ def tokenize(self, texts: List[str], max_length: int = 128) -> Dict[str, torch.Tensor]:
177
+ """Tokenize input texts"""
178
+ encoded = self.tokenizer(
179
+ texts,
180
+ truncation=True,
181
+ padding=True,
182
+ max_length=max_length,
183
+ return_tensors="pt",
184
+ )
185
+ return encoded
186
+
187
+ def predict_batch(self, sentences: List[str]) -> List[Tuple[str, float]]:
188
+ """
189
+ Predict CEFR levels for a batch of sentences
190
+
191
+ Args:
192
+ sentences: List of sentences to assess
193
+
194
+ Returns:
195
+ List of (level, confidence) tuples
196
+ """
197
+ if not sentences:
198
+ return []
199
+
200
+ # Tokenize
201
+ encoded = self.tokenize(sentences)
202
+ input_ids = encoded["input_ids"].to(self.device)
203
+ attention_mask = encoded["attention_mask"].to(self.device)
204
+
205
+ # Predict
206
+ with torch.no_grad():
207
+ logits = self.model(input_ids, attention_mask)["logits"]
208
+ probs = torch.softmax(logits, dim=1)
209
+ predictions = torch.argmax(logits, dim=1)
210
+
211
+ # Format results
212
+ results = []
213
+ cpu_probs = probs.cpu()
214
+ for i, pred in enumerate(predictions.cpu().numpy()):
215
+ level = self.id_to_label[pred]
216
+ confidence = float(cpu_probs[i][pred].item())
217
+ # Handle NaN values
218
+ if torch.isnan(cpu_probs[i][pred]):
219
+ confidence = 1.0 / self.num_labels
220
+ results.append((level, confidence))
221
+
222
+ return results
223
+
224
+ def predict_sentence(self, sentence: str) -> Tuple[str, float]:
225
+ """Predict CEFR level for a single sentence"""
226
+ results = self.predict_batch([sentence])
227
+ return results[0]
228
+
229
+
230
+ def split_into_sentences(text: str) -> List[str]:
+     """
+     Split text into sentences
+
+     Args:
+         text: Input text (Swedish)
+
+     Returns:
+         List of sentences
+     """
+     # Simple sentence splitting based on punctuation
+     # Swedish sentence endings: . ! ?
+     # Split on sentence-ending punctuation followed by whitespace. The
+     # punctuation is captured so it can be re-attached below. Note that
+     # abbreviations such as "t.ex." also trigger a split.
+
+     sentences = re.split(r'([.!?])\s+', text)
+
+     # Combine punctuation with previous sentence
+     combined = []
+     for i in range(0, len(sentences) - 1, 2):
+         if i + 1 < len(sentences):
+             combined.append(sentences[i] + sentences[i + 1])
+         else:
+             combined.append(sentences[i])
+
+     # Handle the trailing segment (text after the last split point)
+     if len(sentences) % 2 == 1 and sentences[-1].strip():
+         combined.append(sentences[-1])
+
+     # Clean up sentences
+     cleaned = []
+     for sent in combined:
+         sent = sent.strip()
+         if sent:
+             cleaned.append(sent)
+
+     return cleaned
266
+
267
+
268
+ def assess_text(text: str, model: CEFRModel) -> List[Dict[str, object]]:
269
+ """
270
+ Assess a text and return sentence-level CEFR annotations
271
+
272
+ Args:
273
+ text: Input text (Swedish)
274
+ model: CEFR assessment model
275
+
276
+ Returns:
277
+ List of dictionaries with sentence and level information
278
+ """
279
+ # Split text into sentences
280
+ sentences = split_into_sentences(text)
281
+
282
+ if not sentences:
283
+ return []
284
+
285
+ # Predict CEFR levels
286
+ predictions = model.predict_batch(sentences)
287
+
288
+ # Format results
289
+ results = []
290
+ for sent, (level, confidence) in zip(sentences, predictions):
291
+ results.append({
292
+ "sentence": sent,
293
+ "level": level,
294
+ "confidence": confidence,
295
+ })
296
+
297
+ return results
web_app/requirements.txt ADDED
@@ -0,0 +1,5 @@
 
+ torch>=1.9.0
+ transformers>=4.0.0
+ flask>=2.0.0
+ flask-cors>=3.0.0
+ numpy>=1.21.0
web_app/static/css/style.css ADDED
@@ -0,0 +1,625 @@
1
+ /* CSS Reset and Base Styles */
2
+ * {
3
+ margin: 0;
4
+ padding: 0;
5
+ box-sizing: border-box;
6
+ }
7
+
8
+ :root {
9
+ /* Color Palette - Modern Deep Blues */
10
+ --primary-color: #1A3A6C;
11
+ --primary-dark: #0D2147;
12
+ --primary-light: #2C5282;
13
+ --accent-color: #2B89E0;
14
+ --success-color: #27AE60;
15
+ --warning-color: #F39C12;
16
+ --error-color: #E74C3C;
17
+
18
+ /* CEFR Level Colors */
19
+ --a1-color: #E74C3C;
20
+ --a2-color: #E67E22;
21
+ --b1-color: #F39C12;
22
+ --b2-color: #27AE60;
23
+ --c1-color: #3498DB;
24
+ --c2-color: #9B59B6;
25
+
26
+ /* Neutral Colors */
27
+ --bg-color: #F8FAFC;
28
+ --card-bg: #FFFFFF;
29
+ --text-primary: #1E293B;
30
+ --text-secondary: #64748B;
31
+ --border-color: #E2E8F0;
32
+ --shadow: 0 1px 3px 0 rgba(0, 0, 0, 0.1);
33
+ --shadow-lg: 0 10px 15px -3px rgba(0, 0, 0, 0.1);
34
+
35
+ /* Typography */
36
+ --font-family: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
37
+ --font-size-sm: 0.875rem;
38
+ --font-size-base: 1rem;
39
+ --font-size-lg: 1.125rem;
40
+ --font-size-xl: 1.25rem;
41
+ --font-size-2xl: 1.5rem;
42
+
43
+ /* Spacing */
44
+ --spacing-xs: 0.5rem;
45
+ --spacing-sm: 0.75rem;
46
+ --spacing-md: 1rem;
47
+ --spacing-lg: 1.5rem;
48
+ --spacing-xl: 2rem;
49
+ --spacing-2xl: 3rem;
50
+
51
+ /* Border Radius */
52
+ --radius-sm: 4px;
53
+ --radius-md: 8px;
54
+ --radius-lg: 12px;
55
+ }
56
+
57
+ /* Base Styles */
58
+ body {
59
+ font-family: var(--font-family);
60
+ background-color: var(--bg-color);
61
+ color: var(--text-primary);
62
+ line-height: 1.6;
63
+ }
64
+
65
+ .container {
66
+ max-width: 1200px;
67
+ margin: 0 auto;
68
+ padding: var(--spacing-md);
69
+ }
70
+
71
+ /* Header */
72
+ .header {
73
+ padding: var(--spacing-lg) 0;
74
+ margin-bottom: var(--spacing-lg);
75
+ display: flex;
76
+ justify-content: space-between;
77
+ align-items: center;
78
+ border-bottom: 1px solid var(--border-color);
79
+ }
80
+
81
+ .logo h1 {
82
+ font-size: var(--font-size-xl);
83
+ font-weight: 600;
84
+ color: var(--text-primary);
85
+ }
86
+
87
+ .github-link a {
88
+ color: var(--primary-color);
89
+ text-decoration: none;
90
+ font-weight: 500;
91
+ font-size: var(--font-size-base);
92
+ }
93
+
94
+ .github-link a:hover {
95
+ text-decoration: underline;
96
+ }
97
+
98
+ /* Cards */
99
+ .card {
100
+ background: var(--card-bg);
101
+ border-radius: var(--radius-lg);
102
+ padding: var(--spacing-xl);
103
+ box-shadow: var(--shadow);
104
+ margin-bottom: var(--spacing-xl);
105
+ border: 1px solid var(--border-color);
106
+ }
107
+
108
+ /* Section Headers */
109
+ h2, h3 {
110
+ color: var(--text-primary);
111
+ margin-bottom: var(--spacing-md);
112
+ }
113
+
114
+ h2 {
115
+ font-size: var(--font-size-xl);
116
+ font-weight: 600;
117
+ }
118
+
119
+ h3 {
120
+ font-size: var(--font-size-lg);
121
+ font-weight: 600;
122
+ }
123
+
124
+ .section-description {
125
+ color: var(--text-secondary);
126
+ margin-bottom: var(--spacing-lg);
127
+ font-size: var(--font-size-base);
128
+ }
129
+
130
+ /* Forms */
131
+ .form-group {
132
+ margin-bottom: var(--spacing-lg);
133
+ }
134
+
135
+ .form-label {
136
+ display: block;
137
+ margin-bottom: var(--spacing-sm);
138
+ font-weight: 500;
139
+ color: var(--text-primary);
140
+ font-size: var(--font-size-base);
141
+ }
142
+
143
+ .text-input {
144
+ width: 100%;
145
+ padding: var(--spacing-md);
146
+ border: 2px solid var(--border-color);
147
+ border-radius: var(--radius-md);
148
+ font-size: var(--font-size-base);
149
+ font-family: inherit;
150
+ resize: vertical;
151
+ min-height: 200px;
152
+ transition: border-color 0.2s ease;
153
+ }
154
+
155
+ .text-input:focus {
156
+ outline: none;
157
+ border-color: var(--accent-color);
158
+ box-shadow: 0 0 0 3px rgba(43, 137, 224, 0.1);
159
+ }
160
+
161
+ .input-hint {
162
+ margin-top: var(--spacing-sm);
163
+ font-size: var(--font-size-sm);
164
+ color: var(--text-secondary);
165
+ }
166
+
167
+ /* Buttons */
168
+ .button-group {
169
+ display: flex;
170
+ gap: var(--spacing-md);
171
+ flex-wrap: wrap;
172
+ }
173
+
174
+ .btn {
175
+ padding: var(--spacing-sm) var(--spacing-lg);
176
+ border: none;
177
+ border-radius: var(--radius-md);
178
+ font-size: var(--font-size-base);
179
+ font-weight: 500;
180
+ cursor: pointer;
181
+ display: inline-flex;
182
+ align-items: center;
183
+ gap: var(--spacing-sm);
184
+ transition: all 0.2s ease;
185
+ position: relative;
186
+ }
187
+
188
+ .btn-primary {
189
+ background: linear-gradient(135deg, var(--accent-color) 0%, var(--primary-light) 100%);
190
+ color: white;
191
+ }
192
+
193
+ .btn-primary:hover {
194
+ transform: translateY(-1px);
195
+ box-shadow: 0 4px 12px rgba(43, 137, 224, 0.3);
196
+ }
197
+
198
+ .btn-primary:active {
199
+ transform: translateY(0);
200
+ }
201
+
202
+ .btn-primary:disabled {
203
+ opacity: 0.6;
204
+ cursor: not-allowed;
205
+ transform: none;
206
+ }
207
+
208
+ .btn-secondary {
209
+ background: var(--card-bg);
210
+ color: var(--text-primary);
211
+ border: 1px solid var(--border-color);
212
+ }
213
+
214
+ .btn-secondary:hover {
215
+ background: #F1F5F9;
216
+ border-color: #CBD5E1;
217
+ }
218
+
219
+ .btn-small {
220
+ padding: var(--spacing-xs) var(--spacing-sm);
221
+ font-size: var(--font-size-sm);
222
+ }
223
+
224
+ .btn-outline {
225
+ background: transparent;
226
+ border: 1px solid var(--primary-light);
227
+ color: var(--primary-light);
228
+ }
229
+
230
+ .btn-outline:hover {
231
+ background: rgba(44, 82, 130, 0.05);
232
+ }
233
+
234
+ /* Button Loader */
235
+ .btn-loader {
236
+ display: none;
237
+ width: 16px;
238
+ height: 16px;
239
+ border: 2px solid #ffffff;
240
+ border-radius: 50%;
241
+ border-top-color: transparent;
242
+ animation: spin 0.8s linear infinite;
243
+ }
244
+
245
+ .btn-loader.active {
246
+ display: block;
247
+ }
248
+
249
+ @keyframes spin {
250
+ to { transform: rotate(360deg); }
251
+ }
252
+
253
+ /* Compact Stats */
254
+ .compact-stats {
255
+ display: flex;
256
+ gap: var(--spacing-lg);
257
+ margin: var(--spacing-md) 0;
258
+ padding: 0 var(--spacing-lg);
259
+ font-size: var(--font-size-sm);
260
+ color: var(--text-secondary);
261
+ }
262
+
263
+ .stat-item {
264
+ display: flex;
265
+ align-items: baseline;
266
+ gap: var(--spacing-xs);
267
+ }
268
+
269
+ .stat-value {
270
+ font-weight: 600;
271
+ color: var(--text-primary);
272
+ }
273
+
274
+ .stat-name {
275
+ font-weight: 400;
276
+ }
277
+
278
+ /* Distribution Bars */
279
+ .distribution-container h3 {
280
+ margin-bottom: var(--spacing-md);
281
+ font-size: var(--font-size-lg);
282
+ font-weight: 600;
283
+ }
284
+
285
+ .distribution-bars {
286
+ display: flex;
287
+ flex-direction: column;
288
+ gap: var(--spacing-sm);
289
+ }
290
+
291
+ .distribution-bar {
292
+ display: flex;
293
+ align-items: center;
294
+ gap: var(--spacing-md);
295
+ font-size: var(--font-size-sm);
296
+ }
297
+
298
+ .distribution-label {
299
+ width: 40px;
300
+ font-weight: 500;
301
+ }
302
+
303
+ .distribution-track {
304
+ flex: 1;
305
+ height: 20px;
306
+ background: var(--border-color);
307
+ border-radius: var(--radius-sm);
308
+ overflow: hidden;
309
+ position: relative;
310
+ }
311
+
312
+ .distribution-fill {
313
+ height: 100%;
314
+ border-radius: var(--radius-sm);
315
+ display: flex;
316
+ align-items: center;
317
+ justify-content: flex-end;
318
+ padding-right: var(--spacing-sm);
319
+ color: white;
320
+ font-size: var(--font-size-sm);
321
+ font-weight: 500;
322
+ transition: width 0.5s ease;
323
+ }
324
+
325
+ .distribution-count {
326
+ width: 30px;
327
+ text-align: right;
328
+ color: var(--text-secondary);
329
+ }
330
+
331
+ /* Annotated Text */
332
+ .container-header {
333
+ display: flex;
334
+ justify-content: space-between;
335
+ align-items: center;
336
+ margin-bottom: var(--spacing-md);
337
+ }
338
+
339
+ .main-result {
340
+ padding: var(--spacing-xl);
341
+ }
342
+
343
+ .annotated-text {
344
+ line-height: 2.5; /* Generous line height for the underlines */
345
+ font-size: var(--font-size-lg);
346
+ font-family: var(--font-family);
347
+ white-space: pre-wrap;
348
+ word-wrap: break-word;
349
+ }
350
+
351
+ .annotation {
352
+ display: inline;
353
+ padding-bottom: 2px;
354
+ border-bottom-width: 3px;
355
+ border-bottom-style: solid;
356
+ background-color: transparent; /* Override any potential background utilities */
357
+ box-decoration-break: clone;
358
+ -webkit-box-decoration-break: clone;
359
+ transition: background-color 0.2s ease, border-color 0.2s ease;
360
+ cursor: help;
361
+ }
362
+
363
+ .annotation:hover {
364
+ background-color: rgba(0, 0, 0, 0.03); /* Very subtle hover effect */
365
+ }
366
+
367
+ /* Specific border colors for annotations - Overriding the general background utility classes */
368
+ .annotated-text .annotation.level-a1 { border-bottom-color: var(--a1-color); background-color: transparent; }
369
+ .annotated-text .annotation.level-a2 { border-bottom-color: var(--a2-color); background-color: transparent; }
370
+ .annotated-text .annotation.level-b1 { border-bottom-color: var(--b1-color); background-color: transparent; }
371
+ .annotated-text .annotation.level-b2 { border-bottom-color: var(--b2-color); background-color: transparent; }
372
+ .annotated-text .annotation.level-c1 { border-bottom-color: var(--c1-color); background-color: transparent; }
373
+ .annotated-text .annotation.level-c2 { border-bottom-color: var(--c2-color); background-color: transparent; }
374
+
375
+ /* In case the utility classes win specificity wise, we ensure these apply */
376
+ .annotation.level-a1, .annotation.level-a2, .annotation.level-b1,
377
+ .annotation.level-b2, .annotation.level-c1, .annotation.level-c2 {
378
+ background-color: transparent;
379
+ }
380
+
381
+ .annotation:hover.level-a1 { background-color: rgba(231, 76, 60, 0.1); }
382
+ .annotation:hover.level-a2 { background-color: rgba(230, 126, 34, 0.1); }
383
+ .annotation:hover.level-b1 { background-color: rgba(243, 156, 18, 0.1); }
384
+ .annotation:hover.level-b2 { background-color: rgba(39, 174, 96, 0.1); }
385
+ .annotation:hover.level-c1 { background-color: rgba(52, 152, 219, 0.1); }
386
+ .annotation:hover.level-c2 { background-color: rgba(155, 89, 182, 0.1); }
387
+
388
+ .annotation-hidden {
389
+ border-bottom-color: transparent !important;
390
+ }
391
+
392
+ .cefr-badge {
393
+ /* Deprecated but kept to prevent errors if stale JS runs */
394
+ display: none;
395
+ }
396
+
397
+ /* Sentence Table */
398
+ .table-wrapper {
399
+ overflow-x: auto;
400
+ border-radius: var(--radius-md);
401
+ border: 1px solid var(--border-color);
402
+ }
403
+
404
+ .sentence-table {
405
+ width: 100%;
406
+ border-collapse: collapse;
407
+ font-size: var(--font-size-base);
408
+ }
409
+
410
+ .sentence-table th {
411
+ background: #F1F5F9;
412
+ padding: var(--spacing-md);
413
+ text-align: left;
414
+ font-weight: 600;
415
+ color: var(--text-primary);
416
+ border-bottom: 2px solid var(--border-color);
417
+ }
418
+
419
+ .sentence-table td {
420
+ padding: var(--spacing-md);
421
+ border-bottom: 1px solid var(--border-color);
422
+ }
423
+
424
+ .sentence-table tbody tr:last-child td {
425
+ border-bottom: none;
426
+ }
427
+
428
+ .sentence-table tbody tr:nth-child(even) {
429
+ background: #F8FAFC;
430
+ }
431
+
432
+ .sentence-table tbody tr:hover {
433
+ background: #E2E8F0;
434
+ }
435
+
436
+ .sentence-text {
437
+ max-width: 600px;
438
+ word-wrap: break-word;
439
+ }
440
+
441
+ .level-cell {
442
+ display: flex;
443
+ align-items: center;
444
+ gap: var(--spacing-sm);
445
+ }
446
+
447
+ .level-indicator {
448
+ width: 12px;
449
+ height: 12px;
450
+ border-radius: 50%;
451
+ flex-shrink: 0;
452
+ }
453
+
454
+ .confidence-bar {
455
+ display: inline-block;
456
+ width: 60px;
457
+ height: 6px;
458
+ background: var(--border-color);
459
+ border-radius: 3px;
460
+ position: relative;
461
+ margin-left: var(--spacing-sm);
462
+ }
463
+
464
+ .confidence-fill {
465
+ position: absolute;
466
+ left: 0;
467
+ top: 0;
468
+ height: 100%;
469
+ background: var(--accent-color);
470
+ border-radius: 3px;
471
+ }
472
+
473
+ /* Modal */
474
+ .modal {
475
+ position: fixed;
476
+ top: 0;
477
+ left: 0;
478
+ right: 0;
479
+ bottom: 0;
480
+ background: rgba(0, 0, 0, 0.5);
481
+ z-index: 1000;
482
+ display: flex;
483
+ align-items: center;
484
+ justify-content: center;
485
+ padding: var(--spacing-md);
486
+ }
487
+
488
+ .modal-content {
489
+ background: white;
490
+ border-radius: var(--radius-lg);
491
+ max-width: 500px;
492
+ width: 100%;
493
+ box-shadow: var(--shadow-lg);
494
+ overflow: hidden;
495
+ }
496
+
497
+ .modal-header {
498
+ padding: var(--spacing-lg);
499
+ background: var(--primary-color);
500
+ color: white;
501
+ display: flex;
502
+ justify-content: space-between;
503
+ align-items: center;
504
+ }
505
+
506
+ .modal-header h3 {
507
+ margin: 0;
508
+ }
509
+
510
+ .modal-header .modal-close {
511
+ background: none;
512
+ border: none;
513
+ color: white;
514
+ font-size: 1.5rem;
515
+ cursor: pointer;
516
+ width: 32px;
517
+ height: 32px;
518
+ display: flex;
519
+ align-items: center;
520
+ justify-content: center;
521
+ border-radius: 50%;
522
+ transition: background 0.2s ease;
523
+ }
524
+
525
+ .modal-header .modal-close:hover {
526
+ background: rgba(255, 255, 255, 0.1);
527
+ }
528
+
529
+ .modal-body {
530
+ padding: var(--spacing-lg);
531
+ color: var(--text-primary);
532
+ }
533
+
534
+ .modal-footer {
535
+ padding: var(--spacing-lg);
536
+ border-top: 1px solid var(--border-color);
537
+ display: flex;
538
+ justify-content: flex-end;
539
+ }
540
+
541
+ /* Footer */
542
+ .footer {
543
+ text-align: center;
544
+ padding: var(--spacing-lg);
545
+ color: var(--text-secondary);
546
+ font-size: var(--font-size-sm);
547
+ margin-top: var(--spacing-xl);
548
+ }
549
+
550
+ /* Responsive Design */
551
+ @media (max-width: 768px) {
552
+ .container {
553
+ padding: var(--spacing-sm);
554
+ }
555
+
556
+ .header {
557
+ flex-direction: column;
558
+ gap: var(--spacing-md);
559
+ text-align: center;
560
+ padding: var(--spacing-lg);
561
+ }
562
+
563
+ .compact-stats {
564
+ flex-wrap: wrap;
565
+ }
566
+
567
+ .button-group {
568
+ flex-direction: column;
569
+ }
570
+
571
+ .btn {
572
+ width: 100%;
573
+ justify-content: center;
574
+ }
575
+
576
+ .container-header {
577
+ flex-direction: column;
578
+ gap: var(--spacing-md);
579
+ align-items: flex-start;
580
+ }
581
+
582
+ .sentence-table {
583
+ font-size: var(--font-size-sm);
584
+ }
585
+
586
+ .sentence-table th,
587
+ .sentence-table td {
588
+ padding: var(--spacing-sm);
589
+ }
590
+ }
591
+
592
+ /* Animations */
593
+ @keyframes fadeIn {
594
+ from {
595
+ opacity: 0;
596
+ transform: translateY(20px);
597
+ }
598
+ to {
599
+ opacity: 1;
600
+ transform: translateY(0);
601
+ }
602
+ }
603
+
604
+ .card {
605
+ animation: fadeIn 0.4s ease forwards;
606
+ }
607
+
608
+ /* CEFR Level Colors */
609
+ .level-a1 { background-color: var(--a1-color); }
610
+ .level-a2 { background-color: var(--a2-color); }
611
+ .level-b1 { background-color: var(--b1-color); }
612
+ .level-b2 { background-color: var(--b2-color); }
613
+ .level-c1 { background-color: var(--c1-color); }
614
+ .level-c2 { background-color: var(--c2-color); }
615
+
616
+ /* Utility Classes */
617
+ .text-center { text-align: center; }
618
+ .text-left { text-align: left; }
619
+ .text-right { text-align: right; }
620
+ .mt-sm { margin-top: var(--spacing-sm); }
621
+ .mt-md { margin-top: var(--spacing-md); }
622
+ .mt-lg { margin-top: var(--spacing-lg); }
623
+ .mb-sm { margin-bottom: var(--spacing-sm); }
624
+ .mb-md { margin-bottom: var(--spacing-md); }
625
+ .mb-lg { margin-bottom: var(--spacing-lg); }
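The stylesheet above relies on selector specificity so that the annotation underline rules take precedence over the `.level-*` background utilities, even though the utilities appear later in the file. The following is a minimal sketch of that comparison. The helper `classLevelSpecificity` is hypothetical and not part of this commit, and it assumes selectors made only of classes and pseudo-classes, which is all the compared rules use.

```javascript
// Simplified sketch: counts only class (.foo) and pseudo-class (:hover) parts.
// IDs, element names, attribute selectors and pseudo-elements are deliberately
// not handled, so this is not a general specificity calculator.
function classLevelSpecificity(selector) {
  const parts = selector.match(/[.:][\w-]+/g);
  return parts ? parts.length : 0;
}

// Selectors taken from style.css in this commit.
const utility = classLevelSpecificity('.level-a1');
const scoped = classLevelSpecificity('.annotated-text .annotation.level-a1');
const hover = classLevelSpecificity('.annotation:hover.level-a1');

console.log({ utility, scoped, hover });
```

Under these assumptions the scoped rule outranks the utility, so the annotation background stays transparent. The hover rule ties with the scoped rule and applies only because it comes later in the file.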
web_app/static/js/app.js ADDED
@@ -0,0 +1,273 @@
1
+ // CEFR Assessment Web App JavaScript
2
+
3
+ class CEFRApp {
4
+ constructor() {
5
+ this.elements = {
6
+ form: document.getElementById('assessment-form'),
7
+ textInput: document.getElementById('text-input'),
8
+ charCount: document.getElementById('char-count'),
9
+ assessBtn: document.getElementById('assess-btn'),
10
+ btnLoader: document.getElementById('btn-loader'),
11
+ btnText: document.querySelector('#assess-btn .btn-text'),
12
+ clearBtn: document.getElementById('clear-btn'),
13
+ resultsSection: document.getElementById('results-section'),
14
+ totalSentences: document.getElementById('total-sentences'),
15
+ avgConfidence: document.getElementById('avg-confidence'),
16
+ dominantLevel: document.getElementById('dominant-level'),
17
+ distributionBars: document.getElementById('distribution-bars'),
18
+ annotatedText: document.getElementById('annotated-text'),
19
+ sentenceTbody: document.getElementById('sentence-tbody'),
20
+ toggleHighlight: document.getElementById('toggle-highlight'),
21
+ errorModal: document.getElementById('error-modal'),
22
+ errorMessage: document.getElementById('error-message'),
23
+ };
24
+
25
+ this.cefrStyles = {
26
+ 'A1': { color: '#E74C3C', name: 'A1 - Beginner' },
27
+ 'A2': { color: '#E67E22', name: 'A2 - Elementary' },
28
+ 'B1': { color: '#F39C12', name: 'B1 - Intermediate' },
29
+ 'B2': { color: '#27AE60', name: 'B2 - Upper Intermediate' },
30
+ 'C1': { color: '#3498DB', name: 'C1 - Advanced' },
31
+ 'C2': { color: '#9B59B6', name: 'C2 - Proficient' },
32
+ };
33
+
34
+ this.showHighlights = true;
35
+
36
+ this.init();
37
+ }
38
+
39
+ init() {
40
+ // Event Listeners
41
+ this.elements.form.addEventListener('submit', (e) => this.handleSubmit(e));
42
+ this.elements.clearBtn.addEventListener('click', () => this.clearText());
43
+ this.elements.textInput.addEventListener('input', () => this.updateCharCount());
44
+ this.elements.toggleHighlight.addEventListener('click', () => this.toggleHighlighting());
45
+
46
+ // Modal close events
47
+ document.querySelectorAll('.modal-close').forEach(btn => {
48
+ btn.addEventListener('click', () => this.hideError());
49
+ });
50
+
51
+ this.elements.errorModal.addEventListener('click', (e) => {
52
+ if (e.target === this.elements.errorModal) {
53
+ this.hideError();
54
+ }
55
+ });
56
+
57
+ // Initial char count
58
+ this.updateCharCount();
59
+ }
60
+
61
+ updateCharCount() {
62
+ const count = this.elements.textInput.value.length;
63
+ const maxLength = 50000;
64
+ this.elements.charCount.textContent = `${count.toLocaleString()} / ${maxLength.toLocaleString()} characters`;
65
+
66
+ if (count > maxLength * 0.9) {
67
+ this.elements.charCount.style.color = '#E74C3C';
68
+ } else if (count > maxLength * 0.8) {
69
+ this.elements.charCount.style.color = '#F39C12';
70
+ } else {
71
+ this.elements.charCount.style.color = '#64748B';
72
+ }
73
+ }
74
+
75
+ async handleSubmit(e) {
76
+ e.preventDefault();
77
+
78
+ const text = this.elements.textInput.value.trim();
79
+ if (!text) {
80
+ this.showError('Please enter some text to analyze.');
81
+ return;
82
+ }
83
+
84
+ this.setLoading(true);
85
+ this.hideResults();
86
+
87
+ try {
88
+ const response = await fetch('/assess', {
89
+ method: 'POST',
90
+ headers: {
91
+ 'Content-Type': 'application/json',
92
+ },
93
+ body: JSON.stringify({ text }),
94
+ });
95
+
96
+ const data = await response.json().catch(() => ({})); // tolerate non-JSON error bodies
97
+
98
+ if (!response.ok) {
99
+ throw new Error(data.error || 'An error occurred');
100
+ }
101
+
102
+ this.displayResults(data);
103
+ this.showResults();
104
+
105
+ // Scroll to results
106
+ setTimeout(() => {
107
+ this.elements.resultsSection.scrollIntoView({ behavior: 'smooth' });
108
+ }, 100);
109
+
110
+ } catch (error) {
111
+ console.error('Error:', error);
112
+ this.showError(error.message);
113
+ } finally {
114
+ this.setLoading(false);
115
+ }
116
+ }
117
+
118
+ setLoading(loading) {
119
+ if (loading) {
120
+ this.elements.assessBtn.disabled = true;
121
+ this.elements.btnLoader.classList.add('active');
122
+ this.elements.btnText.textContent = 'Analyzing...';
123
+ } else {
124
+ this.elements.assessBtn.disabled = false;
125
+ this.elements.btnLoader.classList.remove('active');
126
+ this.elements.btnText.textContent = 'Analyze Text';
127
+ }
128
+ }
129
+
130
+ displayResults(data) {
131
+ // Update stats
132
+ this.elements.totalSentences.textContent = data.stats.total_sentences;
133
+ this.elements.avgConfidence.textContent =
134
+ Math.round(data.stats.avg_confidence * 100) + '%';
135
+ this.elements.dominantLevel.textContent = data.stats.most_common_level.level;
136
+ this.elements.dominantLevel.style.color =
137
+ this.cefrStyles[data.stats.most_common_level.level]?.color || '#000';
138
+
139
+ // Update distribution
140
+ this.displayDistribution(data.stats.level_distribution, data.stats.total_sentences);
141
+
142
+ // Update annotated text
143
+ this.displayAnnotatedText(data.results);
144
+
145
+ // Update table
146
+ this.displayTable(data.results);
147
+ }
148
+
149
+ displayDistribution(distribution, total) {
150
+ const levels = ['A1', 'A2', 'B1', 'B2', 'C1', 'C2'];
151
+
152
+ this.elements.distributionBars.innerHTML = '';
153
+
154
+ levels.forEach(level => {
155
+ const count = distribution[level] || 0;
156
+ const percentage = total > 0 ? (count / total) * 100 : 0;
157
+ const style = this.cefrStyles[level] || { color: '#000' };
158
+
159
+ const bar = document.createElement('div');
160
+ bar.className = 'distribution-bar';
161
+ bar.innerHTML = `
162
+ <div class="distribution-label" style="color: ${style.color}">
163
+ ${level}
164
+ </div>
165
+ <div class="distribution-track">
166
+ <div class="distribution-fill level-${level.toLowerCase()}"
167
+ style="width: ${percentage}%;">
168
+ ${percentage > 10 ? Math.round(percentage) + '%' : ''}
169
+ </div>
170
+ </div>
171
+ <div class="distribution-count">${count}</div>
172
+ `;
173
+
174
+ this.elements.distributionBars.appendChild(bar);
175
+ });
176
+ }
177
+
178
+ displayAnnotatedText(results) {
179
+ this.elements.annotatedText.innerHTML = '';
180
+
181
+ results.forEach((item, index) => {
182
+ const style = this.cefrStyles[item.level] || { color: '#000' };
183
+
184
+ const annotation = document.createElement('span');
185
+ annotation.className = `annotation level-${item.level.toLowerCase()}`;
186
+ annotation.title = style.name || item.level; // name already starts with the level, e.g. "A1 - Beginner"
187
+ annotation.textContent = item.sentence;
188
+
189
+ this.elements.annotatedText.appendChild(annotation);
190
+
191
+ // Add single space between sentences instead of newline
192
+ if (index < results.length - 1) {
193
+ this.elements.annotatedText.appendChild(document.createTextNode(' '));
194
+ }
195
+ });
196
+ }
197
+
198
+ displayTable(results) {
199
+ this.elements.sentenceTbody.innerHTML = '';
200
+
201
+ results.forEach((item, index) => {
202
+ const style = this.cefrStyles[item.level] || { color: '#000' };
203
+ const confidence = Math.round(item.confidence * 100);
204
+ const confidenceWidth = confidence;
205
+
206
+ const row = document.createElement('tr');
207
+ row.innerHTML = `
208
+ <td class="sentence-text"></td>
209
+ <td>
210
+ <div class="level-cell">
211
+ <div class="level-indicator level-${item.level.toLowerCase()}"
212
+ style="background-color: ${style.color}">
213
+ </div>
214
+ <span>${item.level}</span>
215
+ </div>
216
+ </td>
217
+ <td>
218
+ ${confidence}%
219
+ <div class="confidence-bar">
220
+ <div class="confidence-fill" style="width: ${confidenceWidth}%"></div>
221
+ </div>
222
+ </td>
223
+ `;
224
+ row.querySelector('.sentence-text').textContent = item.sentence; // plain text, never parsed as HTML
225
+ this.elements.sentenceTbody.appendChild(row);
226
+ });
227
+ }
228
+
229
+ toggleHighlighting() {
230
+ this.showHighlights = !this.showHighlights;
231
+
232
+ if (this.showHighlights) {
233
+ this.elements.toggleHighlight.textContent = 'Hide Markers';
234
+ document.querySelectorAll('.annotation').forEach(annotation => {
235
+ annotation.classList.remove('annotation-hidden');
236
+ });
237
+ } else {
238
+ this.elements.toggleHighlight.textContent = 'Show Markers';
239
+ document.querySelectorAll('.annotation').forEach(annotation => {
240
+ annotation.classList.add('annotation-hidden');
241
+ });
242
+ }
243
+ }
244
+
245
+ clearText() {
246
+ this.elements.textInput.value = '';
247
+ this.updateCharCount();
248
+ this.hideResults();
249
+ }
250
+
251
+ showResults() {
252
+ this.elements.resultsSection.style.display = 'block';
253
+ }
254
+
255
+ hideResults() {
256
+ this.elements.resultsSection.style.display = 'none';
257
+ }
258
+
259
+ showError(message) {
260
+ this.elements.errorMessage.textContent = message;
261
+ this.elements.errorModal.style.display = 'flex';
262
+ }
263
+
264
+ hideError() {
265
+ this.elements.errorModal.style.display = 'none';
266
+ this.elements.errorMessage.textContent = '';
267
+ }
268
+ }
269
+
270
+ // Initialize app when DOM is loaded
271
+ document.addEventListener('DOMContentLoaded', () => {
272
+ new CEFRApp();
273
+ });
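`displayDistribution` and `displayTable` above assemble their rows as template strings assigned to `innerHTML`. Text that comes from the user or the server is safest when set through `textContent` or escaped before it is interpolated. Below is a minimal sketch of such an escaping step. The helper `escapeHtml` is an assumption for illustration and does not exist in this commit.

```javascript
// Hypothetical helper: replaces the five characters that are significant
// in HTML text and attribute contexts with their entity equivalents.
function escapeHtml(value) {
  const replacements = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return String(value).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

// Example: a sentence containing markup is rendered inert.
const unsafeSentence = '<b>Hej & "då"</b>';
console.log(escapeHtml(unsafeSentence));
```

A template could then interpolate `${escapeHtml(item.level)}` instead of the raw value. Setting `textContent` on a created element avoids the need for escaping altogether.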
web_app/templates/index.html ADDED
@@ -0,0 +1,144 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>CEFR Sentence Level Assessment</title>
7
+ <link rel="stylesheet" href="{{ url_for('static', filename='css/style.css') }}">
8
+ <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&display=swap" rel="stylesheet">
9
+ </head>
10
+ <body>
11
+ <div class="container">
12
+ <!-- Header -->
13
+ <header class="header">
14
+ <div class="logo">
15
+ <h1>Swedish sentence-level CEFR analyzer</h1>
16
+ </div>
17
+ <div class="github-link">
18
+ <a href="https://github.com/fanwenlin/swe-cefr-sp" target="_blank" rel="noopener noreferrer">GitHub</a>
19
+ </div>
20
+ </header>
21
+
22
+ <!-- Main Content -->
23
+ <main class="main-content">
24
+ <!-- Input Section -->
25
+ <section class="input-section card">
26
+ <h2>Analyze Swedish Text</h2>
27
+ <p class="section-description">
28
+ Enter Swedish text below to assess the CEFR level of each sentence.
29
+ The model will analyze sentence complexity and assign proficiency levels from A1 to C2.
30
+ </p>
31
+
32
+ <form id="assessment-form">
33
+ <div class="form-group">
34
+ <label for="text-input" class="form-label">Input Text (Swedish)</label>
35
+ <textarea
36
+ id="text-input"
37
+ name="text"
38
+ class="text-input"
39
+ placeholder="Skriv din text här... (Write your text here...)
40
+
41
+ Example:
42
+ Jag heter Anna. Jag kommer från Sverige. Jag studerar datavetenskap på universitetet."
43
+ rows="12"
44
+ maxlength="50000"
45
+ ></textarea>
46
+ <div class="input-hint">
47
+ <span id="char-count">0 / 50,000 characters</span>
48
+ </div>
49
+ </div>
50
+
51
+ <div class="button-group">
52
+ <button type="submit" id="assess-btn" class="btn btn-primary">
53
+ <span class="btn-text">Analyze Text</span>
54
+ <div class="btn-loader" id="btn-loader"></div>
55
+ </button>
56
+ <button type="button" id="clear-btn" class="btn btn-secondary">Clear</button>
57
+ </div>
58
+ </form>
59
+ </section>
60
+
61
+ <!-- Results Section -->
62
+ <section class="results-section" id="results-section" style="display: none;">
63
+ <!-- Annotated Text - Main visual focus -->
64
+ <div class="annotated-text-container card main-result">
65
+ <div class="container-header">
66
+ <h3>Analyzed Text</h3>
67
+ <button id="toggle-highlight" class="btn btn-small btn-outline">Hide Markers</button>
68
+ </div>
69
+ <div class="annotated-text" id="annotated-text">
70
+ <!-- Results will be populated by JavaScript -->
71
+ </div>
72
+ </div>
73
+
74
+ <!-- Compact Stats -->
75
+ <div class="compact-stats">
76
+ <div class="stat-item">
77
+ <span class="stat-value" id="total-sentences">0</span>
78
+ <span class="stat-name">sentences</span>
79
+ </div>
80
+ <div class="stat-item">
81
+ <span class="stat-value" id="avg-confidence">0%</span>
82
+ <span class="stat-name">avg confidence</span>
83
+ </div>
84
+ <div class="stat-item">
85
+ <span class="stat-value" id="dominant-level">-</span>
86
+ <span class="stat-name">dominant level</span>
87
+ </div>
88
+ </div>
89
+
90
+ <!-- Level Distribution -->
91
+ <div class="distribution-container card">
92
+ <h3>Level Distribution</h3>
93
+ <div class="distribution-bars" id="distribution-bars">
94
+ <!-- Bars will be populated by JavaScript -->
95
+ </div>
96
+ </div>
97
+
98
+ <!-- Sentence Table -->
99
+ <div class="sentence-table-container card">
100
+ <h3>Detailed Results</h3>
101
+ <div class="table-wrapper">
102
+ <table class="sentence-table" id="sentence-table">
103
+ <thead>
104
+ <tr>
105
+ <th>Sentence</th>
106
+ <th>Level</th>
107
+ <th>Confidence</th>
108
+ </tr>
109
+ </thead>
110
+ <tbody id="sentence-tbody">
111
+ <!-- Results will be populated by JavaScript -->
112
+ </tbody>
113
+ </table>
114
+ </div>
115
+ </div>
116
+ </section>
117
+ </main>
118
+
119
+ <!-- Footer -->
120
+ <footer class="footer">
121
+ <p>Powered by Metric Proto K3 • Swedish BERT-base Model</p>
122
+ <p>CEFR Levels: A1 (Beginner) → C2 (Proficient)</p>
123
+ </footer>
124
+ </div>
125
+
126
+ <!-- Error Modal -->
127
+ <div id="error-modal" class="modal" style="display: none;">
128
+ <div class="modal-content">
129
+ <div class="modal-header">
130
+ <h3>Error</h3>
131
+ <button type="button" class="modal-close" aria-label="Close">&times;</button>
132
+ </div>
133
+ <div class="modal-body" id="error-message">
134
+ <!-- Error message will be populated -->
135
+ </div>
136
+ <div class="modal-footer">
137
+ <button class="btn btn-primary modal-close">OK</button>
138
+ </div>
139
+ </div>
140
+ </div>
141
+
142
+ <script src="{{ url_for('static', filename='js/app.js') }}"></script>
143
+ </body>
144
+ </html>
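The compact stats, distribution bars and sentence table in this template are filled from the JSON that `app.js` receives from `POST /assess`. The sketch below shows one way a payload with that shape could be summarized. The field names `total_sentences`, `avg_confidence`, `most_common_level.level` and `level_distribution` are the ones `app.js` reads. The `summarize` function, the `count` field and the sample values are assumptions for illustration, not the server's actual implementation.

```javascript
// Hypothetical helper: derives the stats object from per-sentence results.
function summarize(results) {
  const levelDistribution = {};
  let confidenceSum = 0;

  for (const { level, confidence } of results) {
    levelDistribution[level] = (levelDistribution[level] || 0) + 1;
    confidenceSum += confidence;
  }

  // Pick the level with the highest count; on a tie the first one seen wins.
  let mostCommon = { level: '-', count: 0 };
  for (const [level, count] of Object.entries(levelDistribution)) {
    if (count > mostCommon.count) mostCommon = { level, count };
  }

  return {
    total_sentences: results.length,
    avg_confidence: results.length ? confidenceSum / results.length : 0,
    most_common_level: mostCommon,
    level_distribution: levelDistribution,
  };
}

// Illustrative input only; the confidences are made up.
const sampleResults = [
  { sentence: 'Jag heter Anna.', level: 'A1', confidence: 1.0 },
  { sentence: 'Jag kommer från Sverige.', level: 'A1', confidence: 0.75 },
  { sentence: 'Jag studerar datavetenskap på universitetet.', level: 'B1', confidence: 0.5 },
];
const stats = summarize(sampleResults);
console.log(stats);
```

An empty result list yields zero sentences and `-` as the dominant level, which matches the placeholders in the template.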