Uploading FoodExtract-Vision demo app.py
app.py
CHANGED
@@ -86,6 +86,7 @@ Except one model has been fine-tuned on the structured data whereas the other ha
 Notable next steps would be:
 * **Remove the input prompt:** Just train the model to go straight from image -> text (no text prompt on input), this would save on inference tokens.
 * **Fine-tune on more real-world data:** Right now the model is only trained on 1k food images (from Food101) and 500 not food (random internet images), training on real world data would likely significantly improve performance.
+* **Fix the repetitive generation:** The model can sometimes get stuck in a repetitive generation pattern, e.g. "onions", "onions", "onions", etc. We could look into patterns to help reduce this.
 """
 
 demo = gr.Interface(
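The added bullet leaves the fix for repetitive generation open. One possible pattern, sketched here as an illustration rather than as part of this commit, is to post-process the extracted items and drop repeats. The helper name `collapse_repeats` is hypothetical, and the sketch assumes the model output has already been parsed into a list of strings; the actual output format of the app is not shown in this diff.

```python
def collapse_repeats(items):
    """Drop repeated items, keeping the first occurrence of each.

    Hypothetical helper: assumes the model output was already parsed
    into a list of strings such as ["onions", "onions", "garlic"].
    Comparison ignores case and surrounding whitespace.
    """
    seen = set()
    unique = []
    for item in items:
        cleaned = item.strip()
        key = cleaned.lower()
        # Skip empty strings and anything already emitted.
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


if __name__ == "__main__":
    print(collapse_repeats(["onions", "onions", "Onions ", "garlic"]))
```

This only cleans up the output after the fact and does not save any inference tokens. Generation-time settings such as `repetition_penalty` or `no_repeat_ngram_size` in Hugging Face Transformers' `generate` may address the loop at its source, though whether they suit this model would need testing.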