DABStep submission validator rejects valid string columns under pandas StringDtype
Hi DABStep team,
The submission endpoint appears to reject valid submissions with the error:
Columns with non-string data type: task_id, agent_answer
This happens even for a minimal one-line JSONL where both fields are explicit JSON strings:
{"task_id": "x", "agent_answer": "y"}
Root cause
I checked the public Space source. The validator currently does:
submission_df = pd.read_json(submission_file, lines=True, dtype=str)
non_string_columns = [
    col for col in submission_df.columns
    if submission_df[col].dtype != "object"
]
This check is brittle under newer pandas string dtype behavior.
With pandas==2.3.3 and string inference enabled, pd.read_json(..., dtype=str) returns StringDtype, not object, even though the column contains valid strings.
Reproduction
pandas 2.3.3, default:
dtypes: object
validator passes
pandas 2.3.3, PANDAS_FUTURE_INFER_STRING=1:
dtypes: StringDtype
validator fails with task_id, agent_answer
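A minimal sketch of the reproduction, for anyone who wants to check their own environment. `flagged_columns` and `load_sample` are hypothetical helper names, not functions from the Space; the comparison inside `flagged_columns` mirrors the check quoted above. The `"string"` alias is used here as a stand-in for the dtype produced when string inference is enabled, which is an assumption made to keep the script independent of environment variables.

```python
import io

import pandas as pd

# The one-line submission from the report.
SAMPLE = '{"task_id": "x", "agent_answer": "y"}\n'


def flagged_columns(df):
    """Replicate the dtype comparison quoted from the Space source."""
    return [col for col in df.columns if df[col].dtype != "object"]


def load_sample():
    """Read the sample the same way the validator does (hypothetical helper)."""
    return pd.read_json(io.StringIO(SAMPLE), lines=True, dtype=str)


df = load_sample()
print("pandas", pd.__version__)
print("dtypes:", {col: str(dtype) for col, dtype in df.dtypes.items()})
print("flagged in this environment:", flagged_columns(df))

# Force each dtype explicitly so the comparison can be seen side by side.
as_object = df.astype(object)    # legacy representation of strings
as_string = df.astype("string")  # assumption: stand-in for the inferred string dtype
print("object columns flagged:", flagged_columns(as_object))
print("StringDtype columns flagged:", flagged_columns(as_string))
```

The first `flagged` line depends on the installed pandas version and its string-inference setting; the last two lines show the behavior of the comparison itself, whatever the environment.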
Environment (current Space)
datasets==4.8.1
gradio==6.10.0
pandas==2.3.3
pyarrow==24.0.0
This likely explains why submissions that worked before the Apr 27–28 Space updates now fail without any change in JSONL format.
Suggested fixes
Option 1: Use proper string dtype check
from pandas.api.types import is_string_dtype
non_string_columns = [
    col for col in submission_df.columns
    if not is_string_dtype(submission_df[col])
]
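As a rough, self-contained illustration of Option 1, the sketch below runs the `is_string_dtype` check on a small frame that mixes both string representations. `find_non_string_columns` is a hypothetical helper name, and the `score` column is made up purely to show that a genuinely non-string column is still caught.

```python
import pandas as pd
from pandas.api.types import is_string_dtype


def find_non_string_columns(df):
    """Return columns whose dtype pandas does not consider a string dtype."""
    return [col for col in df.columns if not is_string_dtype(df[col])]


frame = pd.DataFrame(
    {
        # Strings stored the legacy way (object dtype).
        "task_id": pd.Series(["1", "2"], dtype=object),
        # Strings stored as the extension StringDtype.
        "agent_answer": pd.Series(["a", "b"], dtype="string"),
        # Hypothetical non-string column, for contrast only.
        "score": pd.Series([1, 2], dtype="int64"),
    }
)

non_string_columns = find_non_string_columns(frame)
print(non_string_columns)
```

With this check, both string representations are accepted and only the integer column is reported.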
Option 2: Cast before validation
submission_df = submission_df.astype(object)
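A short sketch of how Option 2 would behave, assuming the original `dtype != "object"` comparison is kept unchanged after the cast. `validate_after_cast` is a hypothetical helper name, and the `"string"` alias is again used as a stand-in for the inferred string dtype.

```python
import pandas as pd


def validate_after_cast(df):
    """Cast every column to object, then apply the original dtype comparison."""
    casted = df.astype(object)
    flagged = [col for col in casted.columns if casted[col].dtype != "object"]
    return casted, flagged


# Columns that start out as StringDtype, as in the failing case.
string_frame = pd.DataFrame(
    {
        "task_id": pd.Series(["x"], dtype="string"),
        "agent_answer": pd.Series(["y"], dtype="string"),
    }
)
casted_strings, flagged_strings = validate_after_cast(string_frame)
print(flagged_strings)

# Caveat: an integer column also becomes object after the cast.
int_frame = pd.DataFrame({"task_id": pd.Series([1], dtype="int64")})
casted_ints, flagged_ints = validate_after_cast(int_frame)
print(flagged_ints)
```

One design consideration: because `astype(object)` turns every column into object dtype, the comparison passes for any input after the cast, so this option loosens the check rather than only removing the false positive. Option 1 keeps the check meaningful.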
Option 3: Pin / adjust environment
Ensure pd.read_json(..., dtype=str) returns object dtype as before.
Thanks!
Since the day the LB went down on the HF side, we have basically not been able to make any submissions. It's been 3 days. https://huggingface.co/spaces/adyen/DABstep/discussions/24 Now the issue is on the benchmark side, but no one is answering. I have to say that I'm really disappointed with the maintenance of this challenge, including how late I got answers to my questions in the past week. Very, very disappointing.
I’ve opened a PR for the same issue
hopefully they fix it quick : )
@frisokingma @jeanmarcs @drublackberry @martinigoyanes @antonioramos @MindyKasting @davidlever @rokpopov @JorgeZapa @AaronAtAdyen @andreumora @KoenRoelofs @hannav @sergioadyen @zoranaAtadyen @wolfsinemm @BelleB @moktay @lchumaceiro @olgakostinaadyen @robertAdyen @tomjadams
Can someone from this company, for the sake of kindness and respectfulness, let us know whether this benchmark is maintained at all? If not, that's fine; just let us know so that we don't waste our time.
@frisokingma @jeanmarcs @drublackberry @martinigoyanes @antonioramos @MindyKasting @davidlever @rokpopov @JorgeZapa @AaronAtAdyen @andreumora @KoenRoelofs @hannav @sergioadyen @zoranaAtadyen @wolfsinemm @BelleB @moktay @lchumaceiro @olgakostinaadyen @robertAdyen @tomjadams @iadyen @eggie5-adyen @martinigoyanes @andreumora
Can someone from this company, for the sake of kindness and respectfulness, let us know whether this benchmark is maintained at all? If not, that's fine; just let us know so that we don't waste our time.