Update my_model/tabs/home.py
my_model/tabs/home.py (+85 −17)
```diff
@@ -2,26 +2,94 @@ import streamlit as st
 import streamlit.components.v1 as components
 
 
-def run_home():
+def run_home() -> None:
     """
     Displays the home page for the Knowledge-Based Visual Question Answering (KB-VQA) project using Streamlit.
     This function sets up the main home page for demonstrating the project.
+
+    Returns:
+        None
     """
 
-    st.markdown(
-    st.
-    st.write("""
+    st.markdown("""
+        <div style="text-align: justify;">
+
+        \n\n\n**Welcome to the interactive application for the Knowledge-Based Visual Question Answering (KB-VQA)
+        project. This application is an integral part of a
+        [Master’s dissertation in Artificial Intelligence](https://info.online.bath.ac.uk/msai/) at the
+        [University of Bath](https://www.bath.ac.uk/). As we delve into the fascinating world of VQA, I invite you
+        to explore the intersection of visual perception, language understanding, and cutting-edge AI research.**
+        </div>""",
+        unsafe_allow_html=True)
+    st.markdown("### Background")
+    with st.expander("Read Background"):
+        st.write("""
+        <div style="text-align: justify;">
+
+        Since its inception by **Alan Turing** in 1950, the **Turing Test** has been a fundamental benchmark for
+        evaluating machine intelligence against human standards. As technology evolves, so too must the criteria
+        for assessing AI. The **Visual Turing Test** represents a modern extension that includes visual cognition
+        within the scope of AI evaluation. At the forefront of this advancement is **Visual Question Answering
+        (VQA)**, a field that challenges AI systems to perceive, comprehend, and articulate insights about
+        visual inputs in natural language. This progression reflects the complex interplay between perception
+        and cognition that characterizes human intelligence, positioning VQA as a crucial metric for gauging
+        AI’s ability to emulate human-like understanding.
+
+        Mature VQA systems hold transformative potential across various domains. In robotics, VQA systems can
+        enhance autonomous decision-making by enabling robots to interpret and respond to visual cues. In
+        medical imaging and diagnosis, VQA systems can assist healthcare professionals by accurately
+        interpreting complex medical images and providing insightful answers to diagnostic questions, thereby
+        enhancing both the speed and accuracy of medical assessments. In manufacturing, VQA systems can optimize
+        quality control processes by enabling automated systems to identify defects and ensure product
+        consistency with minimal human intervention. These advancements underscore the importance of developing
+        robust VQA capabilities, as they push the boundaries of the Visual Turing Test and bring us closer to
+        achieving true human-like AI cognition.
+
+        Unlike other vision-language tasks, VQA requires many CV sub-tasks to be solved in the process,
+        including: **Object recognition**, **Object detection**, **Attribute classification**, **Scene
+        classification**, **Counting**, **Activity recognition**, **Spatial relationships among objects**,
+        and **Commonsense reasoning**. These VQA tasks often do not require external factual knowledge and only
+        in rare cases require common-sense reasoning. Furthermore, VQA models cannot derive additional knowledge
+        from existing VQA datasets should a question require it, therefore **Knowledge-Based Visual Question
+        Answering (KB-VQA)** has been introduced. KB-VQA is a relatively new extension to VQA with datasets
+        representing a knowledge-based VQA task where the visual question cannot be answered without external
+        knowledge, where the essence of this task is centred around knowledge acquisition and integration with
+        the visual contents of the image.
+        </div>""",
+        unsafe_allow_html=True)
 
+    st.write("""
+        <div style="text-align: justify;">
+
+        This application showcases the advanced capabilities of the KB-VQA model, empowering users to seamlessly
+        upload images, pose questions, and obtain answers derived from both visual and textual data.
+        By leveraging sophisticated Multimodal Learning techniques, this project bridges the gap between visual
+        perception and linguistic interpretation, effectively merging these modalities to provide coherent and
+        contextually relevant responses. This research not only showcases the cutting-edge progress in artificial
+        intelligence but also pushes the boundaries of AI systems towards passing the **Visual Turing Test**, where
+        machines exhibit **human-like** understanding and reasoning in processing and responding to visual
+        information.
+
+        ## Tools:
+
+        - **Dataset Analysis**: Provides an overview of the KB-VQA datasets and displays various analysis of the
+        OK-VQA dataset.
+        - **Model Architecture**: Displays the model architecture and accompanying abstract and design details for
+        the Knowledge-Based Visual Question Answering (KB-VQA) model.
+        - **Results**: Manages the interactive Streamlit demo for visualizing model evaluation results and analysis.
+        It provides an interface for users to explore different aspects of the model performance and evaluation
+        samples.
+        - **Run Inference**: This tool allows users to run inference to test and use the fine-tuned KB-VQA model
+        using various configurations.
+        </div>""",
+        unsafe_allow_html=True)
+    st.markdown("<br>" * 1, unsafe_allow_html=True)
+    st.write(" ##### Developed by: [Mohammed H AlHaj](https://www.linkedin.com/in/m7mdal7aj)")
+    st.markdown("<br>" * 1, unsafe_allow_html=True)
+    st.write("""
+        **Credit:**
+        * The project predominantly uses [LLaMA-2](https://ai.meta.com/llama/) language inference. It is
+        made available under [Meta LlaMA license](https://ai.meta.com/llama/license/).
+        * This application is built on [Streamlit](https://streamlit.io), providing an interactive and user-friendly
+        interface.
+        """)
```
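The home page above lists four sibling tools (Dataset Analysis, Model Architecture, Results, Run Inference), each presumably rendered by its own `run_*` function under `my_model/tabs/`. A minimal sketch of how a main app might route between such tab renderers is shown below; the renderer names other than `run_home` and the `select_renderer` helper are hypothetical illustrations, not functions confirmed by this diff, and the bodies are stand-ins for the real Streamlit calls.

```python
# Hypothetical sketch: routing between tab renderers like run_home().
# Only run_home is named in the diff; the other names are assumptions
# based on the tools listed on the home page.

def run_home() -> None:
    """Stand-in for my_model/tabs/home.py:run_home (the real one renders Streamlit)."""


def run_dataset_analysis() -> None:
    """Stand-in for a hypothetical Dataset Analysis tab renderer."""


# Map tab labels to their renderer callables.
TABS = {
    "Home": run_home,
    "Dataset Analysis": run_dataset_analysis,
}


def select_renderer(tab_name: str):
    """Return the renderer for a tab label, falling back to the home page."""
    return TABS.get(tab_name, run_home)
```

In a real Streamlit app, the chosen renderer would be invoked once per rerun, e.g. `select_renderer(st.sidebar.radio("Tools", list(TABS)))()`.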