Update readme information
- README.md +12 -0
- data/data.md +5 -0
README.md
CHANGED
@@ -1,6 +1,18 @@
+# AI Scout
+Scouting has been a part of my life since I was a young man, and now as a dad I volunteer my time for my son's troop. Because Scouting has remained relevant to me, I wanted to create an AI that helps with questions about Scouting.
 
+## Required Keys
+OpenAI API Key - The user must enter their OpenAI API key in the Gradio user interface. The application will display an error message if
+no API key is entered or if the OpenAI API key is invalid.
 
+## Cost
+The application uses GPT-4o-mini, and the expected costs are as follows:
 
+| Type   | Tokens | Cost     |
+| ------ | ------ | -------- |
+| Input  | 15,000 | $0.00225 |
+| Output | 5,000  | $0.003   |
+| Total  |        | $0.00525 |
 
 ## Data
 Data processing information can be found here [Data](./data/data.md).
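The arithmetic behind the Cost table in the README change can be sketched as follows. This is a minimal illustration, not code from the repository: it reads the table's count column as tokens and assumes per-million-token rates of $0.15 (input) and $0.60 (output), which are the rates implied by the table's figures. The function name `estimate_cost` and the constant names are hypothetical.

```python
# Assumed rates in USD per one million tokens. These are inferred from the
# README cost table (0.00225 / 15,000 and 0.003 / 5,000), not read from the app.
INPUT_USD_PER_MILLION = 0.15
OUTPUT_USD_PER_MILLION = 0.60


def estimate_cost(input_tokens: int, output_tokens: int) -> dict:
    """Return the estimated input, output, and total cost in USD."""
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return {
        "input": input_cost,
        "output": output_cost,
        "total": input_cost + output_cost,
    }


# Token counts taken from the README cost table.
costs = estimate_cost(15_000, 5_000)
for name, value in costs.items():
    print(f"{name}: ${value:.5f}")
```

If the model's pricing changes, only the two rate constants need updating; the token counts can be swapped for measured usage.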
data/data.md
CHANGED
@@ -22,6 +22,8 @@ Run the following command to execute the data scraper script:
 
 `python3 data_scraper.py`
 
+
+
 ## Post Processing - optional
 
 The downloaded data is approximately 200 MB. To review it, format it first; otherwise it will all be on a single line.
@@ -31,3 +33,6 @@ Run the following command from your terminal:
 `cat scout_information.json | python -m json.tool > pretty_scout_information.json`
 
 After reviewing the data for completeness, delete the pretty_scout_information.json file, as it is not needed for processing.
+
+The second step in the data scraper is creating a vector store. Before this repo was created, this vector store was uploaded to
+Hugging Face Datasets and can be found here: `marty331/scouts_dataset_vector_store`.
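The optional post-processing step in data/data.md pipes the scraped file through `python -m json.tool`. For readers without `cat` or shell redirection, roughly the same result can be obtained from Python directly. This is a sketch under assumptions: the helper name `pretty_print_json` is hypothetical and not part of the repository, and the file names are the ones used in the command above.

```python
import json
from pathlib import Path


def pretty_print_json(source: Path, target: Path, indent: int = 4) -> None:
    """Load a single-line JSON file and write an indented copy for review."""
    with source.open(encoding="utf-8") as f:
        data = json.load(f)  # the whole file is held in memory
    with target.open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=indent)
        f.write("\n")


if __name__ == "__main__":
    source = Path("scout_information.json")
    # Only run when the scraper output is present in the working directory.
    if source.exists():
        pretty_print_json(source, Path("pretty_scout_information.json"))
```

As with the shell command, the formatted copy is only for review and can be deleted afterwards.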