RAG_AIEXP_01 / documents_prep.py

Commit History

a new way with keywords
90e6b4c

MrSimple07 committed on

new way of chunking
f85ad1c

MrSimple07 committed on

added the new chunking and loggings
822ef8c

MrSimple07 committed on

new chunking with a size of 3,000 for all types of data
c0bcb11

MrSimple07 committed on

Removed all custom table configurations (CUSTOM_TABLE_CONFIGS)
0067c9d

MrSimple07 committed on

Removed duplicate logs throughout all files
c2a83c1

MrSimple07 committed on

Removed duplicate logs throughout all files
c81fd8c

MrSimple07 committed on

max size 25000 + improved table prep
9ad6501

MrSimple07 committed on

table processing + new version of np104
5884230

MrSimple07 committed on

chunk size = 8192
bf0077f

MrSimple07 committed on

token based chunking 2
dd15743

MrSimple07 committed on

token based chunking
04b4160

MrSimple07 committed on

token based chunking
79a7114

MrSimple07 committed on

process_documents_with_chunking improvement
3f09b3e

MrSimple07 committed on

Merge branch 'main' of https://huggingface.co/spaces/MrSimple01/RAG_AIEXP_01
19e03d0

MrSimple07 committed on

added table data + image data, + replaces with the old version for the table processing
74a8708

MrSimple07 committed on

fixing chunk size + overlap
bfd4369

MrSimple07 committed on

fixing chunk size + overlap
cce8eb4

MrSimple07 committed on

new table document + image document processing functions + added more comprehensive loggings
931a79f

MrSimple07 committed on

new table document + image document processing functions + added more comprehensive loggings
e7d927a

MrSimple07 committed on

new table document + image document processing functions + added more comprehensive loggings
2df0370

MrSimple07 committed on

new table document + image document processing functions + added more comprehensive loggings
45d5cbd

MrSimple07 committed on

font is black + fixed table + image downloading issues
52249e8

MrSimple07 committed on

font is black + fixed table + image downloading issues
07d4035

MrSimple07 committed on

new code for showing chunks
1c5766c

MrSimple07 committed on

new code for showing chunks
78b9517

MrSimple07 committed on

added new window for chunking results + added hybrid approach for chunking max limit is 2048
d65910d

MrSimple07 committed on

added new window for chunking results + added hybrid approach for chunking max limit is 2048
68ff9c7

MrSimple07 committed on

added new window for chunking results + added hybrid approach for chunking max limit is 2048
b7082d5

MrSimple07 committed on

added new window for chunking results + added hybrid approach for chunking max limit is 2048
d490230

MrSimple07 committed on

improved chunk size to 2048
6c977f5

MrSimple07 committed on

improved chunk size to 2048
5d5d2cd

MrSimple07 committed on

improved the docs prep json txt
f7d949d

MrSimple07 committed on

improved the docs prep json txt
2875c88

MrSimple07 committed on

complete new structure
142fd08

MrSimple07 committed on

complete new structure
2e8b03f

MrSimple07 committed on

complete new structure
8f55c9f

MrSimple07 committed on

complete new structure
080a9f6

MrSimple07 committed on

complete new structure
e10965e

MrSimple07 committed on

complete new structure
ba52088

MrSimple07 committed on

added new prompts + removed section showing in html sources
aa1a7c4

MrSimple07 committed on

added new prompts + removed section showing in html sources
eb321a3

MrSimple07 committed on

fixing the json zip file reading
f71c373

MrSimple07 committed on

fixing the json zip file reading
d3d0d1e

MrSimple07 committed on

added text_json files + added the document prep
e02c18a

MrSimple07 committed on

added text_json files + added the document prep
600d58a

MrSimple07 committed on