mbudisic committed · commit a66160a · parent ded2340

Minor edits

Files changed (1): ANSWER.md (+11 −8)

ANSWER.md CHANGED
```diff
@@ -27,7 +27,7 @@ Marko Budisic
 - [Task 4: Building a Quick End-to-End Prototype](#task-4-building-a-quick-end-to-end-prototype)
 - [4.1. The Prototype Application 🖥️](#41-the-prototype-application-️)
 - [4.2. Deployment 🚀 (Hugging Face Space)](#42-deployment--hugging-face-space)
-- [Task 5: Creating a Golden Test Data Set](#task-5-creating-a-golden-test-data-set)
+- [Task 5: Creating a Golden Data Set](#task-5-creating-a-golden-data-set)
 - [5.1. RAGAS Framework Assessment \& Results 📊](#51-ragas-framework-assessment--results-)
 - [Task 6: Fine-Tuning Open-Source Embeddings](#task-6-fine-tuning-open-source-embeddings)
 - [6.1. Fine-Tuning Process and Model Link 🔗:\*\*](#61-fine-tuning-process-and-model-link-)
@@ -132,13 +132,14 @@ The `app.py` script is the core prototype. It uses Chainlit for the UI, LangChai
 ### 4.2. Deployment 🚀 (Hugging Face Space)
 
 The repository is structured for Hugging Face Space deployment:
-- `README.md` contains Hugging Face Space metadata (e.g., `sdk: docker`).
-- A `Dockerfile` enables containerization for deployment.
-This setup indicates the prototype is packaged for public deployment. 🌍
 
-## Task 5: Creating a Golden Test Data Set
+- `README.md` contains Hugging Face Space metadata.
+- `Dockerfile` enables containerization for deployment. Note the load of `web`
+  dependencies from `pyproject.toml`.
 
+## Task 5: Creating a Golden Data Set
+
-The creation of the "Golden Test Data Set" is documented in the `create_golden_dataset.ipynb` notebook in the [`PsTuts-VQA-Data-Operations` repository](https://github.com/mbudisic/PsTuts-VQA-Data-Operations). This dataset was then utilized in the `notebooks/evaluate_rag.ipynb` of the current project to assess the initial RAG pipeline with RAGAS. 🌟
+The creation of the "Golden Data Set" is documented in the `create_golden_dataset.ipynb` notebook in the [`PsTuts-VQA-Data-Operations` repository](https://github.com/mbudisic/PsTuts-VQA-Data-Operations). This dataset was then utilized in the `notebooks/evaluate_rag.ipynb` of the current project to assess the initial RAG pipeline with RAGAS. 🌟
 
 ### 5.1. RAGAS Framework Assessment & Results 📊
 
@@ -175,7 +176,7 @@ to compute the objective function in the training loop, while `validate` was use
 - **Monitoring:** 🛠️ W&B tracked the process and evaluation. 📈
 - **Resulting Model:** The fine-tuned model (for the Photoshop example) was saved and pushed to the Hugging Face Hub. 🤗 [mbudisic/snowflake-arctic-embed-s-ft-pstuts](https://huggingface.co/mbudisic/snowflake-arctic-embed-s-ft-pstuts)
 
-_(Evidence for this is in `notebooks/Fine_Tuning_Embedding_for_PSTuts.ipynb`, specifically the `model.push_to_hub` call and its output. The `app.py` can be (or is) configured to use such a fine-tuned model for the embedding step in the RAG pipeline.)_
+_(See `notebooks/Fine_Tuning_Embedding_for_PSTuts.ipynb`, specifically the `model.push_to_hub` call and its output. The `app.py` is configured to use such a fine-tuned model for the embedding step in the RAG pipeline.)_
 
 ## Task 7: Assessing Performance
 
@@ -195,6 +196,8 @@ The notebook provides a comparison between "Base", "SOTA" (OpenAI's `text-embedd
 | Factual Correctness | 0.654 | 0.598 | -0.056 |
 | Context Entity Recall | 0.636 | 0.636 | 0.000 |
 
+_(Note: These are mean scores.)_
+
 Additionally, statistical significance of changes `Base -> FT` and `FT -> SOTA` was assessed.
 
 **Overall conclusion is that all of these models perform similarly.**
@@ -204,7 +207,7 @@ appropriate context and fine-tuning did not bring much benefit.
 
 The Hugging Face live demo runs the fine-tuned model.
 
-_(Note: These are mean scores. `Factual Correctness` is `factual_correctness(mode=f1)` in the notebook.)_
+
 
 ## 8. Future changes
 
```
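The new bullet about loading `web` dependencies from `pyproject.toml` presumably refers to installing an optional dependency group at image build time. A minimal Dockerfile sketch, under assumptions: the group name `web` comes from the diff, but the base image, app entry point, and port are hypothetical (7860 is the conventional Hugging Face Spaces port):

```dockerfile
# Hypothetical sketch, not the repository's actual Dockerfile.
FROM python:3.11-slim
WORKDIR /app
COPY . .
# ".[web]" installs the project plus the extras listed under
# [project.optional-dependencies] -> web in pyproject.toml.
RUN pip install --no-cache-dir ".[web]"
CMD ["chainlit", "run", "app.py", "--host", "0.0.0.0", "--port", "7860"]
```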
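As a sanity check, the Δ column of the Base-vs-FT comparison table (mean FT score minus mean Base score, rounded to three decimals) can be recomputed directly. The dictionary keys below are shorthand for illustration, not the notebook's metric identifiers:

```python
# Mean RAGAS scores from the two rows shown in the comparison table.
base = {"factual_correctness": 0.654, "context_entity_recall": 0.636}
ft = {"factual_correctness": 0.598, "context_entity_recall": 0.636}

# Delta column: FT minus Base, rounded as in the table.
delta = {metric: round(ft[metric] - base[metric], 3) for metric in base}
print(delta)  # {'factual_correctness': -0.056, 'context_entity_recall': 0.0}
```

The small, mixed-sign deltas are consistent with the commit's stated conclusion that the models perform similarly.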