Starting CodeRabbit review in plain text mode...
Connecting to review service
Setting up
Analyzing
Reviewing
============================================================================
File: analisecoderabbit_debug.md
Line: 1 to 3
Type: potential_issue
Prompt for AI Agent:
Verify each finding against the current code and only fix it if needed.
In @analisecoderabbit_debug.md around lines 1 - 3: Remove the accidental debug/temporary file analisecoderabbit_debug.md from the commit (it contains "Starting CodeRabbit review in plain text mode..." and connection-log lines). Delete the file from the branch and update the commit/PR so it no longer appears in the changes. Also check .gitignore or CI rules to prevent similar debug output from being committed in the future.
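The cleanup above can be sketched as a few git commands. This is a minimal, self-contained demo in a scratch repository (the `mktemp`/`git init` scaffolding only simulates the situation); in the real branch, only the `git rm`, `.gitignore` update, and commit steps apply, and the `*_debug.md` ignore pattern is an assumed convention, not something the finding mandates:

```shell
set -e
# Scaffolding: simulate a repo that accidentally committed the debug file.
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git config user.email you@example.com
git config user.name you
printf 'Starting CodeRabbit review in plain text mode...\n' > analisecoderabbit_debug.md
git add analisecoderabbit_debug.md
git commit -qm "accidental commit of debug log"

# The actual fix: drop the file and guard against similar debug output.
git rm -q analisecoderabbit_debug.md
printf '*_debug.md\n' >> .gitignore
git add .gitignore
git commit -qm "chore: remove CodeRabbit debug log; ignore *_debug.md"
```

If the file was already pushed on a PR branch, a follow-up commit like this is usually enough; rewriting history is only needed if the log contained secrets.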
============================================================================
File: ingest_job.py
Line: 195 to 198
Type: nitpick
Prompt for AI Agent:
Verify each finding against the current code and only fix it if needed.
In @ingest_job.py around lines 195 - 198: The sort key currently uses the absolute path via str(p).lower(), which includes the machine-specific path prefix and is non-deterministic for names that differ only by case. Change the key to sort by the path relative to base, with a deterministic tiebreaker for case differences: use a case-normalized form of p.relative_to(base) as the primary key and the original relative path as the secondary key. Update the files assignment (the list comprehension and the key=lambda) to compute and sort by the relative path with this stable tiebreaker.
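A minimal sketch of the suggested key, assuming the files are gathered with a glob over `base` (the function name and the `*.md` pattern are placeholders; the real list comprehension in ingest_job.py may differ):

```python
from pathlib import Path


def sorted_ingest_files(base: Path, pattern: str = "*.md") -> list[Path]:
    """Sort candidate files deterministically, independent of machine prefix.

    Primary key: the case-normalized path relative to ``base`` (so results do
    not depend on where the repo is checked out). Secondary key: the exact
    relative path, so names differing only by case still order stably.
    """
    return sorted(
        (p for p in base.rglob(pattern) if p.is_file()),
        key=lambda p: (str(p.relative_to(base)).lower(), str(p.relative_to(base))),
    )
```

The tuple key keeps the case-insensitive ordering the original `str(p).lower()` aimed for, while the second element breaks ties reproducibly.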
============================================================================
File: ingest_job.py
Line: 232
Type: nitpick
Prompt for AI Agent:
Verify each finding against the current code and only fix it if needed.
In @ingest_job.py at line 232: Replace the hardcoded batch_size=64 with a configurable EMBED_BATCH_SIZE read from the environment: add EMBED_BATCH_SIZE = int(os.getenv("EMBED_BATCH_SIZE", "64")) (or similar) where the other hyperparameters such as CHUNK_CHARS and CHUNK_OVERLAP are defined, ensure os is imported if needed, validate that it is a positive integer, and pass EMBED_BATCH_SIZE into the call instead of the literal 64.
============================================================================
File: ingest_job.py
Line: 230 to 235
Type: nitpick
Prompt for AI Agent:
Verify each finding against the current code and only fix it if needed.
In @ingest_job.py around lines 230 - 235: The call model.encode(..., show_progress_bar=False) silences encoding progress. Before calling model.encode (where vectors are built from [chunk["text"] for chunk in chunks]), log a "starting encoding" message with a timestamp and capture the start time (e.g., start = now), call model.encode as-is, then log "finished encoding" with the duration and the number of chunks so operators know the job is running. Ensure the logs are clear and use the same logger used elsewhere in this module.
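The logging the finding describes can be sketched as a thin wrapper. Assumptions: `model` has an `.encode(texts, ...)` method as in sentence-transformers, the module-level `logger` stands in for whatever logger ingest_job.py already uses, and the wrapper name is hypothetical:

```python
import logging
import time

logger = logging.getLogger(__name__)  # stand-in for the module's existing logger


def encode_with_logging(model, chunks, batch_size=64):
    """Call model.encode with start/finish logs so operators can see progress."""
    texts = [chunk["text"] for chunk in chunks]
    logger.info("starting encoding of %d chunks", len(texts))
    start = time.monotonic()
    # The encode call itself is unchanged; only logging is added around it.
    vectors = model.encode(texts, batch_size=batch_size, show_progress_bar=False)
    duration = time.monotonic() - start
    logger.info("finished encoding %d chunks in %.1fs", len(texts), duration)
    return vectors
```

`time.monotonic()` is preferred over `time.time()` for durations because it is unaffected by system clock adjustments.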
Review completed: 4 findings ✔