---

# Part 3: Embeddings
We have selected three distinct Transformer models to evaluate the trade-off between semantic understanding and computational efficiency for our recipe recommendation engine:
**sentence-transformers/all-MiniLM-L6-v2** (The Baseline): Chosen for its extreme speed and compact size (80MB). It represents the industry standard for lightweight CPU-based inference, serving as our baseline for "maximum efficiency."

We selected **BAAI/bge-small-en-v1.5** as the optimal embedding model for our recipe recommendation engine:

* **Performance:** Crucially, it achieved the **highest similarity score** in our evaluation, demonstrating superior semantic understanding compared to the faster but less accurate `all-MiniLM-L6-v2`.
* **Efficiency:** It matched the precision of the resource-heavy `all-mpnet-base-v2` (which requires 420 MB) while maintaining a significantly lighter footprint.
* **Conclusion:** This specific balance allows our system to deliver the most relevant recipe recommendations without compromising on computational efficiency.
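The similarity scores behind this comparison boil down to cosine similarity between embedding vectors. A minimal sketch of the ranking step with toy vectors standing in for real model outputs (in the actual pipeline the vectors would come from `sentence-transformers` encodings):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1] for real vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" of a user query and two candidate recipes
# (illustrative numbers only, not real model outputs).
query = [0.1, 0.8, 0.3, 0.4]
recipes = {
    "garlic pasta": [0.1, 0.7, 0.4, 0.4],
    "fruit salad": [0.9, 0.1, 0.2, 0.1],
}

# Rank candidate recipes by similarity to the query, highest first.
ranked = sorted(
    recipes,
    key=lambda name: cosine_similarity(query, recipes[name]),
    reverse=True,
)
print(ranked[0])  # the closest recipe
```

The same ranking logic applies regardless of which of the three models produced the vectors; only the embedding quality (and hence the scores) changes.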
# Part 4: IO Pipeline
On our first attempt, we used an OCR model, `TrOCRProcessor`.

It wasn't successful: the model couldn't transcribe some handwritten recipes, and sometimes it even hallucinated.

We then decided to try the **Qwen2.5-VL** vision-language model, and the results were much better!
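A minimal sketch of how a transcription request for such a vision-language model could be assembled, using the single-turn multimodal chat-message layout common in the `transformers` ecosystem (the image path and instruction text are placeholders; actual inference would additionally load the Qwen2.5-VL checkpoint and its processor):

```python
def build_transcription_request(image_path, instruction="Transcribe this handwritten recipe."):
    """Assemble a single-turn multimodal chat message: one image plus a text instruction."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": instruction},
            ],
        }
    ]

# Hypothetical input image; in practice this would be a scanned recipe photo.
messages = build_transcription_request("recipe_photo.jpg")
print(messages[0]["role"])
```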
Comparison:
