Improving Table Reasoning through Table Decomposition and Normalization

  • Author / Creator
    Nahid, Md Mahadi Hasan
  • Abstract
    Table reasoning is a challenging task that requires understanding both natural language questions and structured tabular data. While Large Language Models (LLMs) have shown impressive capabilities in natural language understanding and generation, they often struggle with large tables due to their limited input length. Additionally, LLMs face challenges in tasks involving tabular data, especially those requiring symbolic reasoning, due to the structural variance and inconsistency in table cell values commonly found in web tables. In this thesis, we address these challenges with two novel approaches. First, we propose TabSQLify, a method that leverages Text-to-SQL generation to decompose tables into smaller, relevant sub-tables containing only essential information. This approach significantly reduces the input context length, making the task more scalable and efficient for large-scale table reasoning applications.
    Our evaluation on challenging datasets, including WikiTableQuestions and TabFact, demonstrates that TabSQLify outperforms prevailing methods, with notable accuracy improvements over LLM-based baseline models.
    In the second part of our study, we focus on enhancing the symbolic reasoning performance of LLMs when dealing with tabular data, particularly web tables with structural variance and inconsistency in cell values. We introduce NormTab, a framework for normalizing web tables as a one-time preprocessing step. This normalization improves consistency and structure, thereby supporting symbolic reasoning on tabular data. Our experimental evaluation on challenging datasets shows that NormTab significantly enhances symbolic reasoning performance, highlighting the importance and effectiveness of table normalization in LLM-based reasoning tasks.
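
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the toy table, the helper names `to_int` and `extract_subtable`, and the hand-written SQL query are all assumptions made for illustration. In TabSQLify the query would be generated by a Text-to-SQL model, and in NormTab the normalization would be performed by an LLM rather than by the simple rule used here.

```python
import sqlite3


def to_int(cell):
    """Hypothetical normalization rule: drop thousands separators ("91,000" -> 91000)."""
    return int(cell.replace(",", ""))


def extract_subtable(columns, rows, sql):
    """Load a table into in-memory SQLite and return the sub-table selected by `sql`."""
    conn = sqlite3.connect(":memory:")
    col_defs = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f"CREATE TABLE t ({col_defs})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO t VALUES ({placeholders})", rows)
    cur = conn.execute(sql)
    header = [d[0] for d in cur.description]
    result = cur.fetchall()
    conn.close()
    return header, result


# Toy web-style table with inconsistent numeric formatting (illustrative data only).
columns = ["year", "city", "attendance"]
raw_rows = [
    ("2008", "Beijing", "91,000"),
    ("2012", "London", "80,000"),
    ("2016", "Rio de Janeiro", "78,838"),
]

# One-time normalization step: make numeric cells actual numbers.
rows = [(int(year), city, to_int(att)) for year, city, att in raw_rows]

# Stand-in for a model-generated query answering
# "Which cities had attendance above 80,000?"
sql = "SELECT city, attendance FROM t WHERE attendance > 80000"

# The sub-table keeps only the columns and rows relevant to the question.
header, sub_rows = extract_subtable(columns, rows, sql)
print(header, sub_rows)
```

Only the small sub-table, rather than the full table, would then be passed to the language model together with the question. This is how the decomposition step reduces the input context length.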

  • Subjects / Keywords
  • Graduation date
    Fall 2024
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-ckmh-a783
  • License
    This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.