bebechien committed · verified
Commit 64ae41c · 1 Parent(s): ba200cc

Upload folder using huggingface_hub

Files changed (12):
  1. README.md +161 -7
  2. app.py +364 -0
  3. cli_mood_reader.py +179 -0
  4. config.py +56 -0
  5. data_fetcher.py +112 -0
  6. flask_app.py +58 -0
  7. hn_mood_reader.py +71 -0
  8. model_trainer.py +132 -0
  9. requirements.txt +9 -0
  10. templates/error.html +13 -0
  11. templates/index.html +127 -0
  12. vibe_logic.py +85 -0
README.md CHANGED
@@ -1,14 +1,168 @@
  ---
- title: Embeddinggemma Modkit
- emoji: 🏆
- colorFrom: red
- colorTo: yellow
  sdk: gradio
  sdk_version: 5.49.1
  app_file: app.py
  pinned: false
- license: apache-2.0
- short_description: EmbeddingGemma Mod Kit
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Embedding Gemma Modkit
+ emoji: 😻
+ colorFrom: green
+ colorTo: indigo
  sdk: gradio
  sdk_version: 5.49.1
  app_file: app.py
  pinned: false
  ---

+ # 🤖 Embedding Gemma Modkit: Fine-Tuning and Mood Reader
+
+ This project provides a set of tools to fine-tune a sentence-embedding model to understand your personal taste in Hacker News titles, then uses that model to score and rank new articles by their "vibe."
+
+ It includes three main applications:
+ 1. A **Gradio App** for interactive fine-tuning, evaluation, and real-time "vibe checks."
+ 2. An interactive **Command-Line (CLI) App** for viewing and scrolling through the scored feed directly in your terminal.
+ 3. A **Flask App** for a simple, deployable web "mood reader" that displays the live HN feed.
+
+ ---
+
+ ## ✨ Features
+
+ * **Interactive Fine-Tuning:** Use a Gradio interface to select your favorite Hacker News titles and fine-tune the `google/embeddinggemma-300m` model on your preferences.
+ * **Semantic Search Evaluation:** See the immediate impact of your training by comparing semantic search results before and after fine-tuning.
+ * **Live "Vibe Check":** Input any news title or text to get a real-time similarity score (its "vibe") against your personalized anchor.
+ * **Interactive CLI:** A terminal-based mood reader with color-coded output, scrolling, and live refresh.
+ * **Hacker News Mood Reader:** View the live Hacker News feed with each story scored and color-coded based on the current model's understanding of your taste.
+ * **Data & Model Management:** Import additional training data, export the generated dataset, and download the fine-tuned model as a ZIP file.
+ * **Standalone Flask App:** A lightweight, read-only web app that continuously displays the scored HN feed, suitable for simple deployment.
+
+ ---
+
+ ## 🔧 How It Works
+
+ The core idea is to measure the "vibe" of a news title by computing the semantic similarity between its embedding and the embedding of a fixed anchor phrase (`QUERY_ANCHOR` in `config.py`, which is the text **`MY_FAVORITE_NEWS`**).
+
+ 1. **Embedding:** The `sentence-transformers` library converts news titles and the anchor phrase into high-dimensional vectors (embeddings).
+ 2. **Scoring:** The cosine similarity (equivalently, the dot product of normalized embeddings) between a title's embedding and the anchor's embedding is computed. A higher score means a better "vibe."
+ 3. **Fine-Tuning:** The Gradio app generates a contrastive-learning dataset from your selections.
+    * **Positive Pairs:** (`MY_FAVORITE_NEWS`, `[a title you selected]`)
+    * **Negative Pairs:** (`MY_FAVORITE_NEWS`, `[a title you did not select]`)
+ 4. **Training:** The model is trained with `MultipleNegativesRankingLoss`, which pulls the embeddings of your "favorite" titles closer to the anchor phrase and pushes the others away.
+
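The scoring step above boils down to plain cosine similarity. A minimal, illustrative sketch with NumPy — the toy 3-dimensional vectors below stand in for real embeddings, which the apps obtain from the model's `encode()` output:

```python
import numpy as np

def vibe_score(title_vec: np.ndarray, anchor_vec: np.ndarray) -> float:
    """Cosine similarity between a title embedding and the anchor embedding."""
    title_vec = title_vec / np.linalg.norm(title_vec)
    anchor_vec = anchor_vec / np.linalg.norm(anchor_vec)
    return float(np.dot(title_vec, anchor_vec))

# Toy "embeddings" (real ones come from model.encode()).
anchor = np.array([1.0, 0.0, 0.0])
aligned = np.array([2.0, 0.0, 0.0])      # same direction as the anchor
orthogonal = np.array([0.0, 1.0, 0.0])   # unrelated direction

print(vibe_score(aligned, anchor))      # 1.0 (perfect vibe)
print(vibe_score(orthogonal, anchor))   # 0.0 (no relation)
```

Fine-tuning simply reshapes the embedding space so that titles you like land closer to the anchor under this same metric.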
+ ## 🚀 Getting Started
+
+ ### 1. Prerequisites
+ * Python 3.12+
+ * Git
+
+ ### 2. Installation
+
+ ```bash
+ # Clone the repository
+ git clone https://huggingface.co/spaces/bebechien/news-vibe-checker
+ cd news-vibe-checker
+
+ # Create and activate a virtual environment (recommended)
+ python -m venv venv
+ source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
+
+ # Install the required packages
+ pip install -r requirements.txt
+ ```
+
+ ### 3. (Optional) Hugging Face Authentication
+
+ If you plan to use gated models or push your fine-tuned model to the Hugging Face Hub, you need to authenticate.
+
+ ```bash
+ # Set your Hugging Face token as an environment variable
+ export HF_TOKEN="your_hf_token_here"
+ ```
+
+ ---
+
+ ## 🖥️ Running the Applications
+
+ You can run any of the three applications depending on your needs.
+
+ ### Option A: Interactive Fine-Tuning (Gradio App)
+
+ This is the main application for creating and evaluating a personalized model.
+
+ **▶️ To run:**
+
+ ```bash
+ python app.py
+ ```
+
+ Navigate to the local URL provided (e.g., `http://127.0.0.1:7860`).
+
+ ### Option B: Interactive Terminal Viewer (CLI App)
+
+ This app runs directly in your terminal, letting you quickly view and scroll through the scored Hacker News feed.
+
+ **▶️ To run:**
+
+ ```bash
+ python cli_mood_reader.py
+ ```
+
+ **Interactive Controls:**
+
+ * **[↑|↓]** arrow keys to scroll through the story list.
+ * **[SPACE]** to refresh the feed with the latest stories.
+ * **[q]** to quit the application.
+
+ You can also start it with options:
+
+ ```bash
+ # Specify a different model from Hugging Face
+ python cli_mood_reader.py --model google/embeddinggemma-300m
+
+ # Show 10 stories per screen instead of the default 15
+ python cli_mood_reader.py --top 10
+ ```
+
+ ### Option C: Standalone Web Viewer (Flask App)
+
+ This app is a simple, read-only web page that fetches and displays the scored HN feed. It's ideal for deploying a finished model.
+
+ **▶️ To run:**
+
+ ```bash
+ # (Optional) Specify a model from the Hugging Face Hub
+ export MOOD_MODEL="bebechien/embedding-gemma-finetuned-hn"
+
+ # Run the Flask server
+ python flask_app.py
+ ```
+
+ Navigate to `http://127.0.0.1:5000` to see the results.
+
+ ---
+
+ ## ⚙️ Configuration
+
+ Key parameters can be adjusted in `config.py`:
+
+ * `MODEL_NAME`: The base model used for fine-tuning (e.g., `'google/embeddinggemma-300m'`).
+ * `QUERY_ANCHOR`: The anchor text used for similarity scoring (e.g., `"MY_FAVORITE_NEWS"`).
+ * `DEFAULT_MOOD_READER_MODEL`: The default model used by the Flask and CLI apps.
+ * `HN_RSS_URL`: The RSS feed URL.
+ * `CACHE_DURATION_SECONDS`: How long to cache the RSS feed data.
+
+ ---
+
+ ## 📂 File Structure
+
+ ```
+ .
+ ├── app.py               # Main Gradio application for fine-tuning
+ ├── cli_mood_reader.py   # Interactive command-line mood reader
+ ├── flask_app.py         # Standalone Flask application for mood reading
+ ├── hn_mood_reader.py    # Core logic for fetching and scoring (used by Flask/CLI)
+ ├── model_trainer.py     # Handles model loading and fine-tuning
+ ├── vibe_logic.py        # Calculates similarity scores and "vibe" status
+ ├── data_fetcher.py      # Fetches and caches the Hacker News RSS feed
+ ├── config.py            # Central configuration for all modules
+ ├── requirements.txt     # Python package dependencies
+ ├── README.md            # This file
+ └── templates/           # HTML templates for the Flask app
+     ├── index.html
+     └── error.html
+ ```
app.py ADDED
@@ -0,0 +1,364 @@
+ import gradio as gr
+ import os
+ import shutil
+ import time
+ import csv
+ from itertools import cycle
+ from typing import List, Iterable, Tuple, Optional, Callable
+ from datetime import datetime
+
+ # Import modules
+ from data_fetcher import read_hacker_news_rss, format_published_time
+ from model_trainer import (
+     authenticate_hf,
+     train_with_dataset,
+     get_top_hits,
+     load_embedding_model
+ )
+ from config import AppConfig
+ from vibe_logic import VibeChecker
+ from sentence_transformers import SentenceTransformer
+
+ # --- Main Application Class ---
+
+ class HackerNewsFineTuner:
+     """
+     Encapsulates all application logic and state for the Gradio interface.
+     Manages the embedding model, news data, and training datasets.
+     """
+
+     def __init__(self, config: AppConfig = AppConfig):
+         # --- Dependencies ---
+         self.config = config
+
+         # --- Application State ---
+         self.model: Optional[SentenceTransformer] = None
+         self.vibe_checker: Optional[VibeChecker] = None
+         self.titles: List[str] = []            # Top titles for user selection
+         self.target_titles: List[str] = []     # Remaining titles for semantic search target pool
+         self.number_list: List[int] = []       # [0, 1, 2, ...] for checkbox indexing
+         self.last_hn_dataset: List[List[str]] = []   # Last generated dataset from HN selection
+         self.imported_dataset: List[List[str]] = []  # Manually imported dataset
+
+         # Setup
+         os.makedirs(self.config.ARTIFACTS_DIR, exist_ok=True)
+         print(f"Created artifact directory: {self.config.ARTIFACTS_DIR}")
+
+         authenticate_hf(self.config.HF_TOKEN)
+
+         # Load initial data on startup
+         self._initial_load()
+
+     def _initial_load(self):
+         """Helper to run the refresh function once at startup."""
+         print("--- Running Initial Data Load ---")
+         self.refresh_data_and_model()
+         print("--- Initial Load Complete ---")
+
+     def _update_vibe_checker(self):
+         """Initializes or updates the VibeChecker with the current model state."""
+         if self.model:
+             print("Updating VibeChecker instance with the current model.")
+             self.vibe_checker = VibeChecker(
+                 model=self.model,
+                 query_anchor=self.config.QUERY_ANCHOR,
+                 task_name=self.config.TASK_NAME
+             )
+         else:
+             self.vibe_checker = None
+
+     ## Data and Model Management ##
+
+     def refresh_data_and_model(self) -> Tuple[gr.update, gr.update]:
+         """
+         1. Reloads the embedding model to clear fine-tuning.
+         2. Fetches fresh news data (from cache or web).
+         3. Updates the class state and returns Gradio updates for the UI.
+         """
+         print("\n" + "=" * 50)
+         print("RELOADING MODEL and RE-FETCHING DATA")
+
+         # Reset dataset state
+         self.last_hn_dataset = []
+         self.imported_dataset = []
+
+         # 1. Reload the base embedding model
+         try:
+             self.model = load_embedding_model(self.config.MODEL_NAME)
+             self._update_vibe_checker()
+         except Exception as e:
+             # Warn (not raise) so the fallback UI updates below still reach the user.
+             gr.Warning(f"Model load failed: {e}")
+             self.model = None
+             self._update_vibe_checker()
+             return (
+                 gr.update(choices=[], label="Model Load Failed"),
+                 gr.update(value=f"CRITICAL ERROR: Model failed to load. {e}")
+             )
+
+         # 2. Fetch fresh news data
+         news_feed, status_msg = read_hacker_news_rss(self.config)
+         titles_out, target_titles_out = [], []
+         status_value: str = f"Model and data reloaded. Status: {status_msg}. Click 'Run Fine-Tuning' to begin."
+
+         if news_feed is not None and news_feed.entries:
+             # Use constant for clarity
+             titles_out = [item.title for item in news_feed.entries[:self.config.TOP_TITLES_COUNT]]
+             target_titles_out = [item.title for item in news_feed.entries[self.config.TOP_TITLES_COUNT:]]
+             print(f"Data reloaded: {len(titles_out)} selection titles, {len(target_titles_out)} target titles.")
+         else:
+             titles_out = ["Error fetching news, check console.", "Could not load feed.", "No data available."]
+             gr.Warning(f"Data reload failed. Using error placeholders. Details: {status_msg}")
+
+         self.titles = titles_out
+         self.target_titles = target_titles_out
+         self.number_list = list(range(len(self.titles)))
+
+         # Return Gradio updates for CheckboxGroup and Textbox
+         return (
+             gr.update(
+                 choices=self.titles,
+                 label=f"Hacker News Top {len(self.titles)} (Select your favorites)"
+             ),
+             gr.update(value=status_value)
+         )
+
+     # --- Import Dataset/Export ---
+     def import_additional_dataset(self, file_path: str) -> str:
+         if not file_path:
+             return "Please upload a CSV file."
+         new_dataset, num_imported = [], 0
+         try:
+             with open(file_path, 'r', newline='', encoding='utf-8') as f:
+                 reader = csv.reader(f)
+                 try:
+                     header = next(reader)
+                     if not (header and header[0].lower().strip() == 'anchor'):
+                         f.seek(0)
+                 except StopIteration:
+                     return "Error: Uploaded file is empty."
+
+                 for row in reader:
+                     if len(row) == 3:
+                         new_dataset.append([s.strip() for s in row])
+                         num_imported += 1
+             if num_imported == 0:
+                 raise ValueError("No valid [Anchor, Positive, Negative] rows found in the CSV.")
+             self.imported_dataset = new_dataset
+             return f"Successfully imported {num_imported} additional training triplets."
+         except Exception as e:
+             # Warn (not raise) so the status string below still reaches the UI.
+             gr.Warning(f"Import failed. Ensure the CSV format is: [Anchor, Positive, Negative]. Error: {e}")
+             return "Import failed. Check console for details."
+
+     def export_dataset(self) -> Optional[str]:
+         if not self.last_hn_dataset:
+             gr.Warning("No dataset has been generated from current selection yet. Please run fine-tuning first.")
+             return None
+         file_path = self.config.DATASET_EXPORT_FILENAME
+         try:
+             print(f"Exporting dataset to {file_path}...")
+             with open(file_path, 'w', newline='', encoding='utf-8') as f:
+                 writer = csv.writer(f)
+                 writer.writerow(['Anchor', 'Positive', 'Negative'])
+                 writer.writerows(self.last_hn_dataset)
+             gr.Info(f"Dataset successfully exported to {file_path}")
+             return str(file_path)
+         except Exception as e:
+             gr.Warning(f"Failed to export the dataset to CSV. Error: {e}")
+             return None
+
+     def download_model(self) -> Optional[str]:
+         if not os.path.exists(self.config.OUTPUT_DIR):
+             gr.Warning(f"The model directory '{self.config.OUTPUT_DIR}' does not exist. Please run training first.")
+             return None
+         timestamp = int(time.time())
+         try:
+             base_name = os.path.join(self.config.ARTIFACTS_DIR, f"embedding_gemma_finetuned_{timestamp}")
+             archive_path = shutil.make_archive(
+                 base_name=base_name,
+                 format='zip',
+                 root_dir=self.config.OUTPUT_DIR,
+             )
+             gr.Info(f"Model files successfully zipped to: {archive_path}")
+             return archive_path
+         except Exception as e:
+             gr.Warning(f"Failed to create the model ZIP file. Error: {e}")
+             return None
+
+     ## Training Logic ##
+     def _create_hn_dataset(self, selected_ids: List[int]) -> Tuple[List[List[str]], str, str]:
+         """
+         Internal function to generate the [Anchor, Positive, Negative] triplets
+         from the user's Hacker News title selection.
+         Returns (dataset, favorite_title, non_favorite_title)
+         """
+         total_ids, selected_ids = set(self.number_list), set(selected_ids)
+         non_selected_ids = total_ids - selected_ids
+         is_minority = len(selected_ids) < (len(total_ids) / 2)
+
+         anchor_ids, pool_ids = (non_selected_ids, list(selected_ids)) if is_minority else (selected_ids, list(non_selected_ids))
+
+         def get_titles(anchor_id, pool_id):
+             return (self.titles[pool_id], self.titles[anchor_id]) if is_minority else (self.titles[anchor_id], self.titles[pool_id])
+
+         fav_idx = pool_ids[0] if is_minority else list(anchor_ids)[0]
+         non_fav_idx = list(anchor_ids)[0] if is_minority else pool_ids[0]
+
+         hn_dataset = []
+         pool_cycler = cycle(pool_ids)
+         for anchor_id in sorted(list(anchor_ids)):
+             fav, non_fav = get_titles(anchor_id, next(pool_cycler))
+             hn_dataset.append([self.config.QUERY_ANCHOR, fav, non_fav])
+
+         return hn_dataset, self.titles[fav_idx], self.titles[non_fav_idx]
+
+     def training(self, selected_ids: List[int]) -> str:
+         """
+         Generates a training dataset from user selection and runs the fine-tuning process.
+         """
+         if self.model is None:
+             raise gr.Error("Training failed: Embedding model is not loaded.")
+         if not selected_ids:
+             raise gr.Error("You must select at least one title.")
+         if len(selected_ids) == len(self.number_list):
+             raise gr.Error("You can't select all titles; a non-favorite is needed.")
+
+         hn_dataset, example_fav, _ = self._create_hn_dataset(selected_ids)
+         self.last_hn_dataset = hn_dataset
+         final_dataset = self.last_hn_dataset + self.imported_dataset
+         if not final_dataset:
+             raise gr.Error("Training failed: Final dataset is empty.")
+         print(f"Combined dataset size: {len(final_dataset)} triplets.")
+
+         def semantic_search_fn() -> str:
+             return get_top_hits(model=self.model, target_titles=self.target_titles, task_name=self.config.TASK_NAME, query=self.config.QUERY_ANCHOR)
+
+         result = "### Semantic Search Results (Before Training):\n" + f"{semantic_search_fn()}\n\n"
+         print("-" * 50 + "\nStarting Fine-tuning...")
+         train_with_dataset(model=self.model, dataset=final_dataset, output_dir=self.config.OUTPUT_DIR, task_name=self.config.TASK_NAME, search_fn=semantic_search_fn)
+         self._update_vibe_checker()
+         print("Fine-tuning Complete.\n" + "-" * 50)
+
+         result += "### Semantic Search Results (After Training):\n" + f"{semantic_search_fn()}"
+         return result
+
+     ## Vibe Check Logic (Tab 2) ##
+     def get_vibe_check(self, news_text: str) -> Tuple[str, str, gr.update]:
+         if not self.vibe_checker:
+             gr.Warning("Model/VibeChecker not loaded.")
+             return "N/A", "Model Error", gr.update(value=self._generate_vibe_html("gray"))
+         if not news_text or len(news_text.split()) < 3:
+             gr.Warning("Please enter a longer text for a meaningful check.")
+             return "N/A", "Please enter text", gr.update(value=self._generate_vibe_html("white"))
+
+         try:
+             vibe_result = self.vibe_checker.check(news_text)
+             status = vibe_result.status_html.split('>')[1].split('<')[0]  # Extract text from HTML
+             return f"{vibe_result.raw_score:.4f}", status, gr.update(value=self._generate_vibe_html(vibe_result.color_hsl))
+         except Exception as e:
+             # Warn (not raise) so the fallback outputs below are still returned.
+             gr.Warning(f"Vibe check failed. Error: {e}")
+             return "N/A", f"Processing Error: {e}", gr.update(value=self._generate_vibe_html("gray"))
+
+     def _generate_vibe_html(self, color: str) -> str:
+         return f'<div style="background-color: {color}; height: 100px; border-radius: 12px; border: 2px solid #ccc;"></div>'
+
+     ## Mood Reader Logic (Tab 3) ##
+     def fetch_and_display_mood_feed(self) -> str:
+         if not self.vibe_checker:
+             return "**FATAL ERROR:** The Mood Reader failed to initialize. Check console."
+
+         feed, status = read_hacker_news_rss(self.config)
+         if not feed or not feed.entries:
+             return f"**An error occurred while fetching the feed:** {status}"
+
+         scored_entries = []
+         for entry in feed.entries:
+             title = entry.get('title')
+             if not title:
+                 continue
+
+             vibe_result = self.vibe_checker.check(title)
+             scored_entries.append({
+                 "title": title,
+                 "link": entry.get('link', '#'),
+                 "comments": entry.get('comments', '#'),
+                 "published": format_published_time(entry.published_parsed),
+                 "mood": vibe_result
+             })
+
+         scored_entries.sort(key=lambda x: x["mood"].raw_score, reverse=True)
+
+         md = (f"## Hacker News Top Stories (Model: `{self.config.MODEL_NAME}`{' - Fine-tuned' if self.last_hn_dataset else ''}) ⬇️\n"
+               f"**Last Updated:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n"
+               "| Vibe | Title | Comments | Published |\n|---|---|---|---|\n")
+
+         for item in scored_entries:
+             md += (f"| {item['mood'].status_html} "
+                    f"| [{item['title']}]({item['link']}) "
+                    f"| [Comments]({item['comments']}) "
+                    f"| {item['published']} |\n")
+         return md
+
+     ## Gradio Interface Setup ##
+     def build_interface(self) -> gr.Blocks:
+         with gr.Blocks(title="Embedding Gemma Modkit") as demo:
+             gr.Markdown("# 🤖 Embedding Gemma Modkit: Fine-Tuning and Mood Reader")
+             gr.Markdown("See [README](./README.md) for more details.")
+             with gr.Tab("🚀 Fine-Tuning & Evaluation"):
+                 self._build_training_interface()
+             with gr.Tab("💡 News Vibe Check"):
+                 self._build_vibe_check_interface()
+             with gr.Tab("📰 Hacker News Mood Reader"):
+                 self._build_mood_reader_interface()
+         return demo
+
+     def _build_training_interface(self):
+         with gr.Column():
+             gr.Markdown("## Fine-Tuning & Semantic Search\nSelect titles to fine-tune the model towards making them more similar to **`MY_FAVORITE_NEWS`**.")
+             with gr.Row():
+                 favorite_list = gr.CheckboxGroup(self.titles, type="index", label=f"Hacker News Top {len(self.titles)}", show_select_all=True)
+                 output = gr.Textbox(lines=24, label="Training and Search Results", value="Click 'Run Fine-Tuning' to begin.")
+             with gr.Row():
+                 clear_reload_btn = gr.Button("Clear & Reload Model/Data")
+                 run_training_btn = gr.Button("🚀 Run Fine-Tuning", variant="primary")
+             gr.Markdown("--- \n ## Dataset & Model Management")
+             with gr.Row():
+                 import_file = gr.File(label="Upload Additional Dataset (.csv)", file_types=[".csv"], height=50)
+                 download_dataset_btn = gr.Button("💾 Export Last HN Dataset")
+                 download_model_btn = gr.Button("⬇️ Download Fine-Tuned Model")
+             download_status = gr.Markdown("Ready.")
+             with gr.Row():
+                 dataset_output = gr.File(label="Download Dataset CSV", height=50, visible=False, interactive=False)
+                 model_output = gr.File(label="Download Model ZIP", height=50, visible=False, interactive=False)
+
+             run_training_btn.click(fn=self.training, inputs=favorite_list, outputs=output)
+             clear_reload_btn.click(fn=self.refresh_data_and_model, inputs=None, outputs=[favorite_list, output], queue=False)
+             import_file.change(fn=self.import_additional_dataset, inputs=[import_file], outputs=download_status)
+             download_dataset_btn.click(lambda: [gr.update(value=None, visible=False), "Generating..."], None, [dataset_output, download_status], queue=False).then(self.export_dataset, None, dataset_output).then(lambda p: [gr.update(visible=p is not None, value=p), "CSV ready." if p else "Export failed."], [dataset_output], [dataset_output, download_status])
+             download_model_btn.click(lambda: [gr.update(value=None, visible=False), "Zipping..."], None, [model_output, download_status], queue=False).then(self.download_model, None, model_output).then(lambda p: [gr.update(visible=p is not None, value=p), "ZIP ready." if p else "Zipping failed."], [model_output], [model_output, download_status])
+
+     def _build_vibe_check_interface(self):
+         with gr.Column():
+             gr.Markdown(f"## News Vibe Check Mood Lamp\nEnter text to see its similarity to **`{self.config.QUERY_ANCHOR}`**.\n**Vibe Key:** Green = High, Red = Low")
+             news_input = gr.Textbox(label="Enter News Title or Summary", lines=3)
+             vibe_check_btn = gr.Button("Check Vibe", variant="primary")
+             with gr.Row():
+                 vibe_color_block = gr.HTML(value=self._generate_vibe_html("white"), label="Mood Lamp")
+                 with gr.Column():
+                     vibe_score = gr.Textbox(label="Cosine Similarity Score", value="N/A", interactive=False)
+                     vibe_status = gr.Textbox(label="Vibe Status", value="Enter text and click 'Check Vibe'", interactive=False, lines=2)
+             vibe_check_btn.click(fn=self.get_vibe_check, inputs=[news_input], outputs=[vibe_score, vibe_status, vibe_color_block])
+
+     def _build_mood_reader_interface(self):
+         with gr.Column():
+             gr.Markdown(f"## Live Hacker News Feed Vibe\nThis feed uses the current model (base or fine-tuned) to score the vibe of live HN stories against **`{self.config.QUERY_ANCHOR}`**.")
+             feed_output = gr.Markdown(value="Click 'Refresh Feed' to load stories.", label="Latest Stories")
+             refresh_button = gr.Button("Refresh Feed 🔄", size="lg", variant="primary")
+             refresh_button.click(fn=self.fetch_and_display_mood_feed, inputs=None, outputs=feed_output)
+
+
+ if __name__ == "__main__":
+     app = HackerNewsFineTuner(AppConfig)
+     demo = app.build_interface()
+     print("Starting Gradio App...")
+     demo.launch()
cli_mood_reader.py ADDED
@@ -0,0 +1,179 @@
+ import os
+ import sys
+ import shutil
+ import click
+ from datetime import datetime
+ from typing import List, Tuple
+
+ # --- Core Logic Imports ---
+ # These modules contain the application's functionality.
+ from config import AppConfig
+ from hn_mood_reader import HnMoodReader, FeedEntry
+ from vibe_logic import VIBE_THRESHOLDS
+
+ # --- Helper Functions ---
+
+ def get_status_text_and_color(score: float) -> Tuple[str, str]:
+     """
+     Determines the plain-text status and a corresponding color for a given score.
+     """
+     clamped_score = max(0.0, min(1.0, score))
+
+     # Define colors for different vibe levels
+     color_map = {
+         "VIBE:HIGH": "green",
+         "VIBE:GOOD": "cyan",
+         "VIBE:FLAT": "yellow",
+         "VIBE:LOW": "red"
+     }
+
+     for threshold in VIBE_THRESHOLDS:
+         if clamped_score >= threshold.score:
+             status = threshold.status.split(" ")[-1].replace('&nbsp;', '')
+             return status, color_map.get(status, "white")
+
+     # Fallback for the lowest score
+     status = VIBE_THRESHOLDS[-1].status.split(" ")[-1].replace('&nbsp;', '')
+     return status, color_map.get(status, "white")
+
+ def initialize_reader(model_name: str) -> HnMoodReader:
+     """
+     Initializes the HnMoodReader instance with the specified model.
+     Exits the script if the model fails to load.
+     """
+     click.echo(f"Initializing mood reader with model: '{model_name}'...", err=True)
+     try:
+         reader = HnMoodReader(model_name=model_name)
+         click.secho("✅ Model loaded successfully.", fg="green", err=True)
+         return reader
+     except Exception as e:
+         click.secho(f"❌ FATAL: Could not initialize model '{model_name}'.", fg="red", err=True)
+         click.secho(f"   Error: {e}", fg="red", err=True)
+         sys.exit(1)  # Exit with a non-zero code to indicate failure
+
+ def display_feed(scored_entries: List[FeedEntry], top: int, offset: int, model_name: str):
+     """Clears the screen and displays the current slice of the feed."""
+     click.clear()
+
+     # Get the terminal width, falling back to 80 when it cannot be
+     # determined, to avoid breaking the layout.
+     try:
+         terminal_width = shutil.get_terminal_size()[0]
+     except OSError:  # e.g. when output is piped
+         terminal_width = 80
+
+     click.echo("📰 Hacker News Mood Reader")
+     click.echo(f"   Model: {model_name}")
+     click.echo(f"   Showing {offset + 1}-{min(offset + top, len(scored_entries))} of {len(scored_entries)} stories")
+     click.secho("=" * terminal_width, fg="blue")
+
+     header = f"{'VIBE':<5} | {'SCORE':<7} | {'PUBLISHED':<16} | {'TITLE'}"
+     click.secho(header, bold=True)
+     click.secho("-" * terminal_width, fg="blue")
+
+     # Fixed width of the columns before the title:
+     #   Vibe: 5
+     #   Score: '| ' + '0.0000' + ' ' = 9
+     #   Published: '| ' + 'YYYY-MM-DD HH:MM' + ' | ' = 21
+     #   Total fixed width = 5 + 9 + 21 = 35
+     fixed_width = 35
+     max_title_width = terminal_width - fixed_width
+
+     if not scored_entries:
+         click.echo("No entries found in the feed.")
+     else:
+         # Display the current "page" of entries based on the offset
+         for entry in scored_entries[offset:offset + top]:
+             status, color = get_status_text_and_color(entry.mood.raw_score)
+
+             # Drop the "VIBE:" prefix so only the level (HIGH/GOOD/FLAT/LOW)
+             # is shown in the 5-character column.
+             truncated_status = status[5:]
+             vibe_part = click.style(f"{truncated_status:<5}", fg=color)
+
+             score_part = f"| {entry.mood.raw_score:>.4f} "
+             published_part = f"| {entry.published_time_str:<16} | "
+
+             # Truncate long titles, reserving 3 chars for the ellipsis
+             full_title = entry.title
+             if len(full_title) > max_title_width:
+                 title_part = full_title[:max_title_width - 3] + "..."
+             else:
+                 title_part = full_title
+
+             # Combine parts and print
+             full_line = vibe_part + score_part + published_part + title_part
+             click.echo(full_line)
+
+     click.secho("-" * terminal_width, fg="blue")
+
+
115
+ # --- Main Application Logic (CLI Command) ---
116
+
117
+ @click.command()
118
+ @click.option(
119
+ "-m", "--model",
120
+ help="Name of the Sentence Transformer model from Hugging Face. Overrides MOOD_MODEL env var.",
121
+ default=None,
122
+ show_default=False
123
+ )
124
+ @click.option(
125
+ "-n", "--top",
126
+ help="Number of stories to display on screen at once.",
127
+ default=15,
128
+ type=int,
129
+ show_default=True
130
+ )
131
+ def main(model, top):
132
+ """
133
+ Fetch and display Hacker News stories scored by a sentence-embedding model.
134
+ Runs continuously. Use arrow keys to scroll, [SPACE] to refresh, [q] to quit.
135
+ """
136
+ # --- State Management ---
137
+ model_name = model or os.environ.get("MOOD_MODEL") or AppConfig.DEFAULT_MOOD_READER_MODEL
138
+ reader = initialize_reader(model_name)
139
+ scored_entries: List[FeedEntry] = []
140
+ scroll_offset = 0
141
+
142
+ # --- Initial Fetch ---
143
+ click.echo("Fetching initial feed...", err=True)
144
+ try:
145
+ scored_entries = reader.fetch_and_score_feed()
146
+ except Exception as e:
147
+ click.secho(f"❌ ERROR: Initial fetch failed: {e}", fg="red", err=True)
148
+
149
+ # --- Main Loop ---
150
+ while True:
151
+ display_feed(scored_entries, top, scroll_offset, reader.model_name)
152
+
153
+ click.secho("Use [↑|↓] to scroll, [SPACE] to refresh, or [q] to quit.", bold=True, err=True)
154
+ key = click.getchar()
155
+
156
+ if key == ' ':
157
+ click.echo("Refreshing feed...", err=True)
158
+ try:
159
+ scored_entries = reader.fetch_and_score_feed()
160
+ scroll_offset = 0 # Reset scroll on refresh
161
+ except Exception as e:
162
+ click.secho(f"❌ ERROR: Refresh failed: {e}", fg="red", err=True)
163
+ continue
164
+
165
+ elif key in ('q', 'Q'):
166
+ click.echo("Exiting.")
167
+ break
168
+
169
+ # Arrow key handling for scrolling (might produce escape sequences)
170
+ elif key == '\x1b[A': # Up Arrow
171
+ scroll_offset = max(0, scroll_offset - 1)
172
+ elif key == '\x1b[B': # Down Arrow
173
+ # Prevent scrolling past the last page
174
+ scroll_offset = min(scroll_offset + 1, max(0, len(scored_entries) - top))
175
+
176
+ if __name__ == "__main__":
177
+ main()
178
+
179
+
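The scroll handling above clamps the offset into `[0, len(entries) - top]` so the view can never run past either end of the list. A minimal standalone sketch of that clamping arithmetic (function and variable names here are illustrative, not from the repo):

```python
def clamp_scroll(offset: int, total: int, page: int) -> int:
    """Keep a scroll offset within [0, max(0, total - page)]."""
    return max(0, min(offset, max(0, total - page)))

above_top = clamp_scroll(-3, 50, 15)   # scrolled above the top -> 0
past_end = clamp_scroll(40, 50, 15)    # past the last page -> 35
short_list = clamp_scroll(5, 10, 15)   # fewer entries than a page -> 0
print(above_top, past_end, short_list)
```

The inner `max(0, total - page)` guards the case where the feed has fewer entries than one screenful, which would otherwise make the upper bound negative.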
config.py ADDED
@@ -0,0 +1,56 @@
+ import os
+ from typing import Final
+ from pathlib import Path
+
+ # --- Base Directory Definition ---
+ # Use Path for modern, OS-agnostic path handling
+ ARTIFACTS_DIR: Final[Path] = Path("artifacts")
+
+ class AppConfig:
+     """
+     Central configuration class for the Hacker News Fine-Tuner application.
+     """
+
+     # --- Directory/Environment Configuration ---
+     ARTIFACTS_DIR: Final[Path] = ARTIFACTS_DIR
+
+     # Environment variable for Hugging Face token (used by model_trainer)
+     HF_TOKEN: Final[str | None] = os.getenv('HF_TOKEN')
+
+     # --- Caching/Data Fetching Configuration ---
+     HN_RSS_URL: Final[str] = "https://news.ycombinator.com/rss"
+
+     # Filename for the pickled cache data (using Path.joinpath)
+     CACHE_FILE: Final[Path] = ARTIFACTS_DIR.joinpath("hacker_news_cache.pkl")
+
+     # Cache duration set to 30 minutes (1800 seconds)
+     CACHE_DURATION_SECONDS: Final[int] = 60 * 30
+
+     # --- Model/Training Configuration ---
+
+     # Name of the pre-trained embedding model
+     MODEL_NAME: Final[str] = 'google/embeddinggemma-300M'
+
+     # Task name for prompting the embedding model (e.g., for instruction tuning)
+     TASK_NAME: Final[str] = "Classification"
+
+     # Output directory for the fine-tuned model
+     OUTPUT_DIR: Final[Path] = ARTIFACTS_DIR.joinpath("embedding-gemma-finetuned-hn")
+
+     # --- Gradio/App-Specific Configuration ---
+
+     # Anchor text used for contrastive learning dataset generation
+     QUERY_ANCHOR: Final[str] = "MY_FAVORITE_NEWS"
+
+     # Number of titles shown for user selection in the Gradio interface
+     TOP_TITLES_COUNT: Final[int] = 10
+
+     # Default export path for the dataset CSV
+     DATASET_EXPORT_FILENAME: Final[Path] = ARTIFACTS_DIR.joinpath("training_dataset.csv")
+
+     # Default model for the standalone Mood Reader tab
+     DEFAULT_MOOD_READER_MODEL: Final[str] = "bebechien/embedding-gemma-finetuned-hn"
+
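`config.py` centralizes paths and constants as plain class attributes built with `pathlib`. The same pattern in miniature, with hypothetical names and values standing in for the real config:

```python
from pathlib import Path
from typing import Final

# Hypothetical base directory, mirroring ARTIFACTS_DIR in config.py
BASE: Final[Path] = Path("artifacts")

class DemoConfig:
    # Derived paths use Path.joinpath, so they stay OS-agnostic
    CACHE_FILE: Final[Path] = BASE.joinpath("cache.pkl")
    CACHE_DURATION_SECONDS: Final[int] = 60 * 30  # 30 minutes

cache_posix = DemoConfig.CACHE_FILE.as_posix()
print(cache_posix)                       # artifacts/cache.pkl
print(DemoConfig.CACHE_DURATION_SECONDS) # 1800
```

Consumers import the class and read attributes directly (`AppConfig.CACHE_FILE`), so there is one place to change a path or duration.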
data_fetcher.py ADDED
@@ -0,0 +1,112 @@
+ import feedparser
+ import pickle
+ import os
+ import time
+ from datetime import datetime
+ from typing import Tuple, Any, Optional
+
+ # AppConfig is passed in via dependency injection in the refactored main app.
+
+ def format_published_time(published_parsed: Optional[time.struct_time]) -> str:
+     """Safely converts a feedparser time struct to a formatted string."""
+     if published_parsed:
+         try:
+             dt_obj = datetime.fromtimestamp(time.mktime(published_parsed))
+             return dt_obj.strftime('%Y-%m-%d %H:%M')
+         except Exception:
+             return 'N/A'
+     return 'N/A'
+
+ def load_feed_from_cache(config: Any) -> Tuple[Optional[Any], str]:
+     """Attempts to load a feed object from the cache file if it exists and is not expired."""
+     if not os.path.exists(config.CACHE_FILE):
+         return None, "Cache file not found."
+
+     try:
+         # Check cache age
+         file_age_seconds = time.time() - os.path.getmtime(config.CACHE_FILE)
+
+         if file_age_seconds > config.CACHE_DURATION_SECONDS:
+             # The cache is too old
+             return None, f"Cache expired ({file_age_seconds:.0f}s old, limit is {config.CACHE_DURATION_SECONDS}s)."
+
+         with open(config.CACHE_FILE, 'rb') as f:
+             feed = pickle.load(f)
+         return feed, f"Loaded successfully from cache (Age: {file_age_seconds:.0f}s)."
+
+     except Exception as e:
+         # If loading fails, treat it as a miss and attempt to clean up
+         print(f"Warning: Failed to load cache file. Deleting corrupted cache. Reason: {e}")
+         try:
+             os.remove(config.CACHE_FILE)
+         except OSError:
+             pass  # Ignore if removal fails
+         return None, "Cache file corrupted or invalid. Will re-fetch."
+
+ def save_feed_to_cache(config: Any, feed: Any) -> None:
+     """Saves the fetched feed object to the cache file."""
+     try:
+         with open(config.CACHE_FILE, 'wb') as f:
+             pickle.dump(feed, f)
+         print(f"Successfully saved new feed data to cache: {config.CACHE_FILE}")
+     except Exception as e:
+         print(f"Error saving to cache: {e}")
+
+ def read_hacker_news_rss(config: Any) -> Tuple[Optional[Any], str]:
+     """
+     Reads and parses the Hacker News RSS feed, using a cache if available.
+     Returns the feedparser object and a status message.
+     """
+     url = config.HN_RSS_URL
+     print(f"Attempting to fetch and parse RSS feed from: {url}")
+     print("-" * 50)
+
+     # 1. Attempt to load from cache
+     feed, cache_status = load_feed_from_cache(config)
+     print(f"Cache Status: {cache_status}")
+
+     # 2. If cache miss or stale, fetch from web
+     if feed is None:
+         print("Starting network fetch...")
+         try:
+             # Use feedparser to fetch and parse the feed
+             feed = feedparser.parse(url)
+
+             # 'status' is only set when an HTTP fetch actually happened,
+             # so fall back to 200 if the attribute is absent.
+             if getattr(feed, 'status', 200) >= 400:
+                 status_msg = f"Error fetching the feed. HTTP Status: {feed.status}"
+                 print(status_msg)
+                 return None, status_msg
+
+             if feed.bozo:
+                 # Bozo is set if any error occurred, even non-critical ones.
+                 print(f"Warning: Failed to fully parse the feed. Reason: {feed.get('bozo_exception')}")
+
+             # 3. If fetch successful, save new data to cache
+             if feed.entries:
+                 save_feed_to_cache(config, feed)
+                 status_msg = f"Successfully fetched and cached {len(feed.entries)} entries."
+             else:
+                 status_msg = "Fetch successful, but no entries found in the feed."
+                 print(status_msg)
+                 feed = None  # Ensure feed is None if no entries
+
+         except Exception as e:
+             status_msg = f"An unexpected error occurred during network processing: {e}"
+             print(status_msg)
+             return None, status_msg
+
+     else:
+         status_msg = cache_status
+
+     return feed, status_msg
+
+ # Example usage for quick manual testing
+ if __name__ == '__main__':
+     from config import AppConfig
+     feed, status = read_hacker_news_rss(AppConfig)
+     if feed and feed.entries:
+         print(f"\nFetched {len(feed.entries)} entries. Top 3 titles:")
+         for entry in feed.entries[:3]:
+             print(f"- {entry.title}")
+     else:
+         print(f"Could not fetch the feed. Status: {status}")
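The cache logic in `load_feed_from_cache` hinges on comparing the cache file's modification time against `CACHE_DURATION_SECONDS`. A self-contained sketch of just that expiry check, using a throwaway temp file and illustrative names:

```python
import os
import tempfile
import time

def is_cache_fresh(path: str, max_age_seconds: float) -> bool:
    """Return True if `path` exists and was modified within `max_age_seconds`."""
    if not os.path.exists(path):
        return False
    age_seconds = time.time() - os.path.getmtime(path)
    return age_seconds <= max_age_seconds

# Demo: a file written just now is fresh for a 30-minute window,
# but stale for an impossible (negative) window.
with tempfile.NamedTemporaryFile(delete=False) as f:
    cache_path = f.name

fresh = is_cache_fresh(cache_path, 60 * 30)
stale = is_cache_fresh(cache_path, -1)
missing = is_cache_fresh(cache_path + ".does-not-exist", 60 * 30)
os.unlink(cache_path)

print(fresh, stale, missing)
```

The real module additionally treats an unpicklable cache file as a miss and deletes it, which this sketch omits.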
flask_app.py ADDED
@@ -0,0 +1,58 @@
+ # flask_app.py
+
+ import os
+ import sys
+ from datetime import datetime
+ from typing import Optional
+
+ from flask import Flask, render_template
+
+ # Existing config and core logic
+ from config import AppConfig
+ from hn_mood_reader import HnMoodReader, FeedEntry
+
+ # --- Flask App Initialization ---
+ app = Flask(__name__)
+
+ # --- Global Cache for the Model ---
+ global_reader: Optional[HnMoodReader] = None
+
+ def initialize_reader() -> HnMoodReader:
+     """
+     Initializes the HnMoodReader instance. This function is called once
+     when the application starts.
+     """
+     print("Attempting to initialize the mood reader model...")
+     model_name = os.environ.get("MOOD_MODEL", AppConfig.DEFAULT_MOOD_READER_MODEL)
+     try:
+         reader = HnMoodReader(model_name=model_name)
+         print("Model loaded successfully.")
+         return reader
+     except Exception as e:
+         # If the model fails to load, print a fatal error and exit the app.
+         print(f"FATAL: Could not initialize model '{model_name}'. Error: {e}", file=sys.stderr)
+         sys.exit(1)  # Exit with a non-zero code to indicate failure
+
+ # --- Initialize the reader as soon as the app starts ---
+ global_reader = initialize_reader()
+
+ # --- Flask Route ---
+ @app.route('/')
+ def index():
+     """Main page route."""
+     try:
+         scored_entries = global_reader.fetch_and_score_feed()
+
+         return render_template(
+             'index.html',
+             entries=scored_entries,
+             model_name=global_reader.model_name,
+             last_updated=datetime.now().strftime('%H:%M:%S')
+         )
+     except Exception as e:
+         # Render a simple error page if something goes wrong
+         return render_template('error.html', error=str(e)), 500
+
+ if __name__ == '__main__':
+     # debug=False is recommended for a stable display;
+     # use_reloader=False prevents the app from initializing the model twice in debug mode
+     app.run(host='0.0.0.0', port=5000, debug=False, use_reloader=False)
hn_mood_reader.py ADDED
@@ -0,0 +1,71 @@
+ # hn_mood_reader.py
+
+ import feedparser
+ from dataclasses import dataclass
+ from typing import List
+
+ # Assuming these are in separate files as in the original structure
+ from config import AppConfig
+ from data_fetcher import format_published_time
+ from vibe_logic import VibeChecker, VibeResult
+
+ # --- Data Structures ---
+ @dataclass(frozen=True)
+ class FeedEntry:
+     """Stores necessary data for a single HN story, including its calculated mood."""
+     title: str
+     link: str
+     comments_link: str
+     published_time_str: str
+     mood: VibeResult
+
+ # --- Core Logic Class ---
+ class HnMoodReader:
+     """Handles model initialization and mood scoring for Hacker News titles."""
+     def __init__(self, model_name: str):
+         try:
+             from sentence_transformers import SentenceTransformer
+         except ImportError as e:
+             raise ImportError("Please install 'sentence-transformers'") from e
+
+         print(f"Initializing SentenceTransformer with model: {model_name}...")
+         self.model = SentenceTransformer(model_name, truncate_dim=128)
+         print("Model initialized successfully.")
+
+         self.vibe_checker = VibeChecker(
+             model=self.model,
+             query_anchor=AppConfig.QUERY_ANCHOR,
+             task_name=AppConfig.TASK_NAME
+         )
+         self.model_name = model_name
+
+     def _get_mood_result(self, title: str) -> VibeResult:
+         """Calculates the mood for a title using the VibeChecker."""
+         return self.vibe_checker.check(title)
+
+     def fetch_and_score_feed(self) -> List[FeedEntry]:
+         """Fetches, scores, and sorts entries from the HN RSS feed."""
+         feed = feedparser.parse(AppConfig.HN_RSS_URL)
+         if feed.bozo:
+             raise IOError(f"Error parsing feed from {AppConfig.HN_RSS_URL}.")
+
+         scored_entries: List[FeedEntry] = []
+         for entry in feed.entries:
+             title, link = entry.get('title'), entry.get('link')
+             if not title or not link:
+                 continue
+
+             scored_entries.append(
+                 FeedEntry(
+                     title=title,
+                     link=link,
+                     comments_link=entry.get('comments', '#'),
+                     # .get() avoids an AttributeError when an entry lacks a date
+                     published_time_str=format_published_time(entry.get('published_parsed')),
+                     mood=self._get_mood_result(title)
+                 )
+             )
+
+         scored_entries.sort(key=lambda x: x.mood.raw_score, reverse=True)
+         return scored_entries
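The final sort in `fetch_and_score_feed` orders stories by raw similarity, highest first. A standalone sketch of that pattern with a simplified stand-in for `FeedEntry` (the class and titles below are illustrative, not from the repo):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScoredStory:  # simplified stand-in for FeedEntry + its nested mood
    title: str
    raw_score: float

stories = [
    ScoredStory("Show HN: a toy raytracer", 0.41),
    ScoredStory("New embedding model released", 0.83),
    ScoredStory("Ask HN: keyboard advice", 0.12),
]

# Mirrors scored_entries.sort(key=lambda x: x.mood.raw_score, reverse=True)
stories.sort(key=lambda s: s.raw_score, reverse=True)
titles = [s.title for s in stories]
print(titles)
```

Because the sort key reads only `raw_score`, ties keep their original feed order (Python's sort is stable).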
model_trainer.py ADDED
@@ -0,0 +1,132 @@
+ from huggingface_hub import login
+ from sentence_transformers import SentenceTransformer, util
+ from datasets import Dataset
+ from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
+ from sentence_transformers.losses import MultipleNegativesRankingLoss
+ from transformers import TrainerCallback, TrainingArguments
+ from typing import List, Callable, Optional
+ from pathlib import Path
+
+ # --- Model/Utility Functions ---
+
+ def authenticate_hf(token: Optional[str]) -> None:
+     """Logs into the Hugging Face Hub."""
+     if token:
+         print("Logging into Hugging Face Hub...")
+         login(token=token)
+     else:
+         print("Skipping Hugging Face login: HF_TOKEN not set.")
+
+ def load_embedding_model(model_name: str) -> SentenceTransformer:
+     """Initializes the Sentence Transformer model."""
+     print(f"Loading Sentence Transformer model: {model_name}")
+     try:
+         model = SentenceTransformer(model_name)
+         print("Model loaded successfully.")
+         return model
+     except Exception as e:
+         print(f"Error loading Sentence Transformer model {model_name}: {e}")
+         raise
+
+ def get_top_hits(
+     model: SentenceTransformer,
+     target_titles: List[str],
+     task_name: str,
+     query: str = "MY_FAVORITE_NEWS",
+     top_k: int = 5
+ ) -> str:
+     """Performs semantic search on target_titles and returns a formatted result string."""
+     if not target_titles:
+         return "No target titles available for search."
+
+     # Encode the query
+     query_embedding = model.encode(query, prompt_name=task_name)
+
+     # Encode the target titles (only done once per call)
+     title_embeddings = model.encode(target_titles, prompt_name=task_name)
+
+     # Perform semantic search
+     top_hits = util.semantic_search(query_embedding, title_embeddings, top_k=top_k)[0]
+
+     result = []
+     for hit in top_hits:
+         title = target_titles[hit['corpus_id']]
+         score = hit['score']
+         result.append(f"[{title}] {score:.4f}")
+
+     return "\n".join(result)
+
+ # --- Training Class and Function ---
+
+ class EvaluationCallback(TrainerCallback):
+     """
+     A callback that runs the semantic search evaluation at each logging step.
+     The search function is passed in during initialization.
+     """
+     def __init__(self, search_fn: Callable[[], str]):
+         self.search_fn = search_fn
+
+     def on_log(self, args: TrainingArguments, state, control, **kwargs):
+         print(f"Step {state.global_step} finished. Running evaluation:")
+         print(f"\n{self.search_fn()}\n")
+
+
+ def train_with_dataset(
+     model: SentenceTransformer,
+     dataset: List[List[str]],
+     output_dir: Path,
+     task_name: str,
+     search_fn: Callable[[], str]
+ ) -> None:
+     """
+     Fine-tunes the provided Sentence Transformer model on the dataset.
+
+     The dataset should be a list of lists: [[anchor, positive, negative], ...].
+     """
+     # Convert to Hugging Face Dataset format
+     data_as_dicts = [
+         {"anchor": row[0], "positive": row[1], "negative": row[2]}
+         for row in dataset
+     ]
+
+     train_dataset = Dataset.from_list(data_as_dicts)
+
+     # Use MultipleNegativesRankingLoss, suitable for contrastive learning
+     loss = MultipleNegativesRankingLoss(model)
+
+     # SentenceTransformer models typically expose a 'prompts' dict;
+     # look up the prompt string for this task to apply during training.
+     prompts = getattr(model, 'prompts', {}).get(task_name)
+     if not prompts:
+         print(f"Warning: Could not find prompts for task '{task_name}' in model. Training may be less effective.")
+         # Fall back to training without a prompt
+         prompts = None
+
+     args = SentenceTransformerTrainingArguments(
+         output_dir=str(output_dir),
+         prompts=prompts,
+         num_train_epochs=4,
+         per_device_train_batch_size=1,
+         learning_rate=2e-5,
+         warmup_ratio=0.1,
+         logging_steps=train_dataset.num_rows,
+         report_to="none",
+         save_strategy="no"  # No checkpoints during training; save only at the end
+     )
+
+     trainer = SentenceTransformerTrainer(
+         model=model,
+         args=args,
+         train_dataset=train_dataset,
+         loss=loss,
+         callbacks=[EvaluationCallback(search_fn)]
+     )
+
+     trainer.train()
+
+     print("Training finished. Model weights are updated in memory.")
+
+     # Save the final fine-tuned model
+     trainer.save_model()
+
+     print(f"Model saved locally to: {output_dir}")
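`train_with_dataset` expects positional triplets of `[anchor, positive, negative]` strings and converts them to named columns before building the Hugging Face `Dataset`. That row conversion in isolation, sketched without the `datasets` dependency (the titles are made up):

```python
# Positional triplets: [anchor, positive, negative]
rows = [
    ["MY_FAVORITE_NEWS", "Liked title A", "Disliked title B"],
    ["MY_FAVORITE_NEWS", "Liked title C", "Disliked title D"],
]

# Same transformation as in train_with_dataset: triplet -> keyed row.
# Dataset.from_list(data_as_dicts) then turns each key into a column.
data_as_dicts = [
    {"anchor": a, "positive": p, "negative": n}
    for a, p, n in rows
]
print(data_as_dicts[0])
```

The column names matter: `MultipleNegativesRankingLoss` treats each `(anchor, positive)` pair as a match and uses the `negative` column (plus in-batch positives from other rows) as non-matches.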
requirements.txt ADDED
@@ -0,0 +1,9 @@
+ accelerate
+ beautifulsoup4
+ datasets
+ feedparser
+ flask
+ gradio
+ html_to_markdown
+ sentence-transformers
+ git+https://github.com/huggingface/transformers@v4.56.0-Embedding-Gemma-preview
templates/error.html ADDED
@@ -0,0 +1,13 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <title>Error</title>
+     <style>body { background-color: #121212; color: #ff5555; font-family: sans-serif; padding: 2rem; }</style>
+ </head>
+ <body>
+     <h1>An Error Occurred</h1>
+     <p>Could not load the feed. See server logs for details.</p>
+     <pre>{{ error }}</pre>
+ </body>
+ </html>
templates/index.html ADDED
@@ -0,0 +1,127 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <meta name="viewport" content="width=device-width, initial-scale=1.0">
+     <meta http-equiv="refresh" content="300">
+     <title>Hacker News Vibe Reader</title>
+     <link rel="preconnect" href="https://fonts.googleapis.com">
+     <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+     <link href="https://fonts.googleapis.com/css2?family=Press+Start+2P&display=swap" rel="stylesheet">
+
+     <style>
+         body {
+             /* Use the imported pixel font */
+             font-family: 'Press Start 2P', cursive;
+             background-color: #1a1a1a; /* Dark background */
+             color: #00ff00; /* Classic green terminal text */
+             margin: 0;
+             padding: 1rem;
+             /* Prevents font anti-aliasing to keep it crisp */
+             -webkit-font-smoothing: none;
+             -moz-osx-font-smoothing: grayscale;
+             /* Ensures emojis are rendered as pixels */
+             image-rendering: pixelated;
+         }
+
+         .container {
+             max-width: 900px;
+             margin: 1rem auto;
+             border: 2px solid #00ff00;
+             padding: 1.5rem;
+             /* Hard, blocky shadow for a retro UI feel */
+             box-shadow: 5px 5px 0px #005f00;
+         }
+
+         h1 {
+             font-size: 1.2rem;
+             color: #ffffff;
+             text-shadow: 2px 2px #00ff00;
+             margin-top: 0;
+         }
+
+         .meta-info {
+             font-size: 0.7rem;
+             color: #8cff8c;
+             margin-bottom: 2rem;
+             border-bottom: 2px solid #005f00;
+             padding-bottom: 1rem;
+         }
+
+         ul {
+             list-style-type: none;
+             padding: 0;
+         }
+
+         li {
+             display: flex;
+             align-items: baseline;
+             margin-bottom: 1.5rem;
+         }
+
+         .vibe {
+             flex-shrink: 0;
+             margin-right: 1rem;
+             font-size: 1.5rem;
+         }
+
+         .title a {
+             color: #ffffff; /* Brighter white for main links */
+             text-decoration: none;
+             font-size: 0.8rem;
+             line-height: 1.5;
+         }
+
+         .title a:hover {
+             background-color: #00ff00;
+             color: #1a1a1a;
+         }
+
+         .details {
+             font-size: 0.7rem;
+             color: #8cff8c; /* Dimmer green for details */
+             margin-top: 0.5rem;
+         }
+
+         .details a {
+             color: #00ff00;
+             text-decoration: underline;
+         }
+
+         .details a:hover {
+             color: #ffffff;
+             background-color: transparent;
+         }
+
+         /* Make sure code tags also use the pixel font */
+         code {
+             font-family: 'Press Start 2P', cursive;
+             background-color: #005f00;
+             padding: 2px 4px;
+         }
+     </style>
+ </head>
+ <body>
+     <div class="container">
+         <h1>[ Hacker News Vibe Reader ]</h1>
+         <div class="meta-info">
+             MODEL: <code>{{ model_name }}</code> <br>
+             UPDATED: {{ last_updated }}
+         </div>
+         <ul>
+             {% for item in entries %}
+             <li>
+                 <div class="vibe">{{ item.mood.status_html | safe }}</div>
+                 <div>
+                     <div class="title"><a href="{{ item.link }}" target="_blank">{{ item.title }}</a></div>
+                     <div class="details">
+                         {{ item.published_time_str }} | <a href="{{ item.comments_link }}" target="_blank">COMMENTS</a>
+                     </div>
+                 </div>
+             </li>
+             {% endfor %}
+         </ul>
+     </div>
+ </body>
+ </html>
vibe_logic.py ADDED
@@ -0,0 +1,85 @@
+ from dataclasses import dataclass
+ from math import floor
+ from typing import List
+ from sentence_transformers import SentenceTransformer, util
+
+ # --- Data Structures ---
+
+ @dataclass(frozen=True)
+ class VibeThreshold:
+     """Defines a threshold for a Vibe status."""
+     score: float
+     status: str
+
+ @dataclass(frozen=True)
+ class VibeResult:
+     """Stores the calculated HSL color and status for a given score."""
+     raw_score: float
+     status_html: str  # Pre-formatted HTML for display
+     color_hsl: str    # Raw HSL color string
+
+ # Define the status thresholds from highest score to lowest score
+ VIBE_THRESHOLDS: List[VibeThreshold] = [
+     VibeThreshold(score=0.8, status="✨ VIBE:HIGH"),
+     VibeThreshold(score=0.5, status="👍 VIBE:GOOD"),
+     VibeThreshold(score=0.2, status="😐 VIBE:FLAT"),
+     VibeThreshold(score=0.0, status="👎 VIBE:LOW&nbsp;"),  # Base case for scores < 0.2
+ ]
+
+ # --- Utility Functions ---
+
+ def map_score_to_vibe(score: float) -> VibeResult:
+     """
+     Maps a cosine similarity score to a VibeResult containing status, HTML, and color.
+     """
+     # 1. Clamp score for safety
+     clamped_score = max(0.0, min(1.0, score))
+
+     # 2. Color Calculation
+     hue = floor(clamped_score * 120)  # Linear interpolation: 0 (Red) -> 120 (Green)
+     color_hsl = f"hsl({hue}, 80%, 50%)"
+
+     # 3. Status Determination
+     status_text: str = VIBE_THRESHOLDS[-1].status  # Default to the lowest status
+     for threshold in VIBE_THRESHOLDS:
+         if clamped_score >= threshold.score:
+             status_text = threshold.status
+             break
+
+     # 4. Create the pre-formatted HTML for display
+     status_html = f"<span style='color: {color_hsl}; font-weight: bold;'>{status_text}</span>"
+
+     return VibeResult(raw_score=score, status_html=status_html, color_hsl=color_hsl)
+
+
+ # --- Core Logic Class ---
+
+ class VibeChecker:
+     """
+     Handles similarity scoring using a SentenceTransformer model and a pre-set anchor query.
+     """
+     def __init__(self, model: SentenceTransformer, query_anchor: str, task_name: str):
+         self.model = model
+         self.query_anchor = query_anchor
+         self.task_name = task_name
+
+         # Pre-calculate the anchor embedding for efficiency
+         self.query_embedding = self.model.encode(
+             self.query_anchor,
+             prompt_name=self.task_name,
+             normalize_embeddings=True
+         )
+
+     def check(self, text: str) -> VibeResult:
+         """
+         Calculates the "vibe" of a given text against the pre-configured anchor.
+         """
+         title_embedding = self.model.encode(
+             text,
+             prompt_name=self.task_name,
+             normalize_embeddings=True
+         )
+         # Use dot product for similarity with normalized embeddings
+         score: float = util.dot_score(self.query_embedding, title_embedding).item()
+
+         return map_score_to_vibe(score)
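The score-to-status mapping in `map_score_to_vibe` is a first-match walk over thresholds sorted from highest to lowest, plus a linear hue interpolation from red (0) to green (120). A model-free sketch of just that arithmetic, with simplified status names:

```python
from math import floor

# Simplified thresholds mirroring VIBE_THRESHOLDS (highest first)
THRESHOLDS = [(0.8, "HIGH"), (0.5, "GOOD"), (0.2, "FLAT"), (0.0, "LOW")]

def vibe_status(score: float) -> tuple:
    """Return (status, hue) for a similarity score, clamped to [0, 1]."""
    clamped = max(0.0, min(1.0, score))
    hue = floor(clamped * 120)  # 0 = red, 120 = green
    status = THRESHOLDS[-1][1]  # default to the lowest status
    for limit, name in THRESHOLDS:
        if clamped >= limit:
            status = name
            break
    return status, hue

high = vibe_status(0.9)   # ('HIGH', 108)
low = vibe_status(-0.5)   # clamped to 0.0 -> ('LOW', 0)
flat = vibe_status(0.3)   # between 0.2 and 0.5 -> FLAT
print(high, low, flat)
```

Clamping matters because cosine/dot similarity can go slightly negative; without it the hue calculation would produce an invalid HSL value.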