kitchoc committed on
Commit 8bd5bb1 · verified · 1 Parent(s): 81810c3

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -102,7 +102,7 @@ To enhance the model's capability in structured table reasoning, we conduct Supe
  In order to fully leverage the reasoning capabilities of the trained model in tabular data scenarios, a structured data analysis-oriented workflow is proposed in this work. The workflow consists of 4 core components: table preprocessing, table sensing, tool-integrated reasoning and prompt engineering. These components form an end-to-end pipeline designed to enhance the model's ability to understand and reason over tabular datasets.
 
  ### Table Preprocessing
- Before table analysis and reasoning, a table preprocessing is adopted in our workflow such that the input table gets clean, structured and properly formatted. Table preprocessing involves handling missing values, splitting merged cells, and standardizing column headers, identifying column headers. After the preprocessing, the table is transformed into a normalized structure that can facilitate downstream table understanding and reasoning by the model. Moreover, the structured format reduces ambiguity in table layout and ensures consistent alignment between natural language queries and the corresponding data fields.
+ Before table analysis and reasoning, a table preprocessing is adopted in our workflow such that the input table gets clean, structured and properly formatted. Table preprocessing involves handling missing values, splitting merged cells, standardizing column headers, and identifying column headers. After the preprocessing, the table is transformed into a normalized structure that can facilitate downstream table understanding and reasoning by the model. Moreover, the structured format reduces ambiguity in table layout and ensures consistent alignment between natural language queries and the corresponding data fields.
 
  ### Table Sensing
  Table sensing refers to the model’s contextual understanding of a table’s structure, semantics, and relationships. In this stage, column headers and sample rows of each table are provided to the model. During the table sensing stage, the model identifies the types of each column (e.g., categorical, numerical, textual), infers potential relationships among columns, and detects any implicit hierarchies or grouping patterns. It also involves the understanding of header semantics, including the disambiguation of abbreviations, units, and domain-specific terminology. By observing sample rows, the model gains insight into typical value ranges, formats, and anomalies, enabling it to develop a robust “sense” of the data context.
@@ -179,11 +179,11 @@ print(response)
  </p>
  <ul>
  <li>
- A open-source <a href="https://huggingface.co/datasets/JT-LM/JIUTIAN-TReB">dataset</a> combining cleaned public benchmarks, real-world web tables, and proprietary data,
+ An open-source <a href="https://huggingface.co/datasets/JT-LM/JIUTIAN-TReB">dataset</a> combining cleaned public benchmarks, real-world web tables, and proprietary data,
  covering six core capabilities and 26 subtasks to support diverse table reasoning evaluations.
  </li>
  <li>
- A open-source <a href="https://github.com/JT-LM/jiutian-treb">framework code</a> specifically designed to evaluate LLM performance on table reasoning tasks. It integrates diverse inference modes and reliable metrics, enabling precise and multi-dimensional evaluations.
+ An open-source <a href="https://github.com/JT-LM/jiutian-treb">framework code</a> specifically designed to evaluate LLM performance on table reasoning tasks. It integrates diverse inference modes and reliable metrics, enabling precise and multi-dimensional evaluations.
  </li>
  </ul>
 
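
For readers of the Table Preprocessing and Table Sensing paragraphs touched by this diff, the two stages could be sketched roughly as follows. This is only an illustration under stated assumptions, not the repository's implementation: the function names (`standardize_header`, `preprocess`, `infer_column_type`), the `MISSING` token set, the forward-fill treatment of split merged cells, and the categorical threshold are all hypothetical choices made for the example.

```python
import re

# Assumption: these tokens mark a missing value; the README does not list them.
MISSING = {"", "na", "n/a", "null", "none", "-"}


def standardize_header(name):
    """Lower-case a header and collapse non-alphanumeric runs into underscores."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def preprocess(rows, merged_columns=()):
    """Normalize a table given as a list of string rows (first row = header).

    merged_columns holds the indexes of columns whose blank cells are assumed
    to come from a split merged cell, so they inherit the value from the row above.
    """
    header = [standardize_header(h) for h in rows[0]]
    cleaned = []
    previous = [None] * len(header)
    for raw in rows[1:]:
        row = []
        for i, cell in enumerate(raw):
            text = cell.strip()
            value = None if text.lower() in MISSING else text
            if value is None and i in merged_columns:
                value = previous[i]  # forward-fill the split merged cell
            row.append(value)
        previous = row
        cleaned.append(row)
    return header, cleaned


def infer_column_type(values):
    """Label a column as numerical, categorical, or textual from sample values."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"

    def is_number(v):
        try:
            float(v)
            return True
        except ValueError:
            return False

    if all(is_number(v) for v in present):
        return "numerical"
    # Assumption: a column is categorical when at most half its values are distinct.
    if len(set(present)) <= max(1, len(present) // 2):
        return "categorical"
    return "textual"


rows = [
    ["Region ", "Product Name", "Unit Price (USD)"],
    ["North", "Widget", "9.50"],
    ["", "Gadget", "N/A"],
    ["South", "Widget", "12"],
    ["", "Doohickey", "7.25"],
]
header, table = preprocess(rows, merged_columns={0})
schema = {
    name: infer_column_type([row[i] for row in table])
    for i, name in enumerate(header)
}
print(header)
print(schema)
```

In the workflow described by the README, the sensing step is performed by the model itself from headers and sample rows; the heuristic above only mimics the kind of column-type summary that stage is said to produce.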