Apache ORC String Length Integer Overflow (CWE-190)
Vulnerability
Root cause: StringDirectColumnReader::computeSize() in ColumnReader.cc:697 casts int64_t string lengths to size_t without validating for negative values.
String lengths are decoded by an unsigned RLE decoder (isSigned=false) but stored into int64_t* arrays. When a crafted .orc file encodes a length >= 2^63, the stored value is a negative int64_t. static_cast<size_t>(negative) then produces a huge positive value, between 2^63 and SIZE_MAX on 64-bit platforms.
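The sign flip can be reproduced in isolation. The following is a minimal sketch, not ORC code: storeAsSigned and castLikeComputeSize are hypothetical names that stand in for the decoder writing into an int64_t slot and for the cast in computeSize(). It assumes a mainstream compiler where out-of-range unsigned-to-signed conversion is modular (guaranteed only from C++20 onward).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an unsigned decoded value landing in an int64_t slot.
// The conversion is modular on mainstream compilers (and by the standard since
// C++20), so any value >= 2^63 comes out negative.
constexpr int64_t storeAsSigned(uint64_t decoded) {
  return static_cast<int64_t>(decoded);
}

// Hypothetical stand-in for the cast in the vulnerable loop: no negative check,
// so a negative length is converted modulo 2^N into a very large size_t.
constexpr size_t castLikeComputeSize(int64_t length) {
  return static_cast<size_t>(length);
}
```

For example, storeAsSigned(UINT64_MAX) evaluates to -1, and castLikeComputeSize(-1) evaluates to SIZE_MAX.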
Vulnerable Code
// ColumnReader.cc:691-706
size_t totalLength = 0;
for (size_t i = 0; i < numValues; ++i) {
totalLength += static_cast<size_t>(lengths[i]); // NO NEGATIVE CHECK!
}
// ...
byteBatch.blob.resize(totalLength); // OOM or undersized
Impact
- DoS via OOM: Single huge length → blob.resize(9.2 exabytes) → crash
- Wild pointers: Two lengths wrapping totalLength to 0 → empty blob → ptr += negative_length → OOB read
- Stripe offset overflow: Reader.cc:591 → uint64 addition overflow, no checked arithmetic
- ORC C++ has zero safe arithmetic in 3600+ lines of core parsing code
- Used by Apache Hive, Spark, Presto → ORC files from external sources
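The "Wild pointers" and "Stripe offset overflow" items both come down to silent unsigned wraparound. The sketch below is illustrative only: uncheckedTotal mirrors the accumulation pattern from the vulnerable loop for two values, and addWouldOverflow is a hypothetical helper showing one portable way to test a uint64 addition before performing it. Neither name exists in ORC.

```cpp
#include <cstddef>
#include <cstdint>

// Mirrors the unchecked accumulation pattern for two lengths.
// Unsigned arithmetic wraps silently, so a negative length paired with its
// positive counterpart brings the total back to 0.
constexpr size_t uncheckedTotal(int64_t a, int64_t b) {
  size_t total = 0;
  total += static_cast<size_t>(a);
  total += static_cast<size_t>(b);
  return total;
}

// Hypothetical helper: true if a + b would not fit in uint64_t.
// The comparison itself cannot overflow because UINT64_MAX - b is always valid.
constexpr bool addWouldOverflow(uint64_t a, uint64_t b) {
  return a > UINT64_MAX - b;
}
```

With lengths -5 and 5, uncheckedTotal returns 0, so the blob would be sized as empty even though the per-value lengths are still used afterwards. A check such as addWouldOverflow(offset, length) before the addition is one way an offset computation could reject the input instead of wrapping.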
Fix
Add if (lengths[i] < 0) throw ParseError(...) before the cast.
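A minimal sketch of what the checked loop could look like, written as a standalone function so it compiles on its own. checkedTotalLength is a hypothetical name, and the ParseError defined here is a local stand-in for ORC's exception type, not the library's definition. The extra sum-overflow check goes beyond the one-line fix above and is an assumption about what a hardened version might also want.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Local stand-in for ORC's ParseError so the sketch is self-contained.
struct ParseError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical free-function version of the summation loop with validation.
size_t checkedTotalLength(const int64_t* lengths, size_t numValues) {
  size_t totalLength = 0;
  for (size_t i = 0; i < numValues; ++i) {
    // The fix described above: reject negative lengths before the cast.
    if (lengths[i] < 0) {
      throw ParseError("negative string length at index " + std::to_string(i));
    }
    const uint64_t length = static_cast<uint64_t>(lengths[i]);
    // Additional assumption: also reject totals that do not fit in size_t.
    if (length > std::numeric_limits<size_t>::max() - totalLength) {
      throw ParseError("total string length overflows size_t");
    }
    totalLength += static_cast<size_t>(length);
  }
  return totalLength;
}
```

Throwing before blob.resize() means a malformed file is reported as a parse failure rather than reaching the allocation or the pointer arithmetic that follows it.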