Commit History
Included exact/fuzzy phrase matching. Updated packages and added basic logging. dbad462
Can now handle backslashes in metadata when loading data for semantic search 20b4aa0
Adjusted data file load parameters for keyword search 19f22c9
Improvements with embeddings load and file save 650da6e
Updated Dockerfile and requirements files to create a smaller container 91bd588
Minor bug fix to connections parameter function b1c3d49
Cognito authorisation option added to app, some other minor changes. 759001a
When running on cloud now checks for relevant header details on load f9e3451
Now accepts .zip file as inputs. Moved semantic search option bar. Minor API mode changes. 7f029b5
Changed embedding model to MiniLM-L6 as it is faster. Compressed embeddings are now int8. General improvements to API mode. ea0dd40
Minor changes to function outputs. Attempted Python downgrade to 3.10 to address xlsx output issues 2806807
General code improvements and refinements. a95ef9f
Set bm25 in functions explicitly. Some API updates. Now can get connection params on startup. 2393537
Some package updates and minor changes 2754a2b
Embedding files now write to output folder 1dc162b
Changed all intermediate file outputs to save to output folder fea085c
Allowed for custom output folder, returned Dockerfile to work under user account and port 7860 d3ff2e2
Specified boto3 sts region ee015e4
Correct bm25 filename usage 4bb8d6f
Now checks for output folder before saving. Minor code cleaning 2089141
Fixed cleaning for semantic search. Handles text containing backslashes (if cleaned). Updated packages. Added requirements file for keyword search only. 8466e45
Assigned AWS bucket name to environment variable 7bdc986
Added additional password auth for AWS-based files. Changed 'Clean' default to no 651ef78
AWS credentials no longer a requirement for app to work 30b5dc1
Gradio 4.21. Limitations on file size and creating embeddings. Added AWS integration e0fe055
Fixed incorrectly specified string query in fuzzy search ff8dfa3
Sean-Case committed on
Now loads in embedding model locally in Dockerfile 3034296
Improved code for cleaning and outputting files. Added Dockerfile 4ee3470
Improved xlsx output formatting. Deals better with cleaning data then analysing in same session. 352c02a
Added highlight search term functionality to keyword search output 36a404e
Updated to Gradio 4.16.0. Now works correctly with BGE embeddings 2bcd818
Upgraded to Gradio 4.16.0. Added Spacy fuzzy search functionality. 4ce2224
Cut out semantic search temporarily while issues with the Jina gated model are resolved. Improved error/progress tracking and messaging. Placeholder for Spacy fuzzy search. 739b386
Better error checking. Doesn't load in embeddings file twice now. 63049fe
Fixed data input for semantic search. Allowed for docs to be loaded in directly for semantic search. 0.2.1 3df8e40
Minor changes to file path for outputs, documentation, location of pyinstaller build dependencies 200480d
Many changes to code organisation. More efficient searches from using intermediate outputs. Version 0.1 99d6fba
Now works correctly with npz. Minor formatting improvements d3b1ac5
Added semantic search using Jina ceb8617
Faster embedding with GPU, fast document split, writes to chromadb file correctly. No longer needs FAISS or langchain 2cb9977
Now outputs correct dataframe for semantic search. Can join on extra details 2a8aba8
Added basic semantic search functionality 78d71d4
Initial commit a9c2120