Welcome back to the Visualization for Machine Learning Lab!

Week 12: Visualization for NLP

Miscellaneous

  • Proposal grades and feedback are on Brightspace
    • Check PDFs for comments
    • Longer comments may need to be viewed in a PDF reader (like Acrobat) rather than the browser
  • Presentations 5/2
    • Written report due sometime during finals week (will get back to you with a date on Monday)
  • Homework 4 (OPTIONAL - extra credit) on Brightspace
    • Amount of extra credit TBD
    • Due April 19th

NLP + Vis Applications

  • These slides are adapted from a half-day tutorial at the 2023 EMNLP Conference by Shafiq Joty (Salesforce AI Research and Nanyang Technological University), Enamul Hoque (York University), and Jesse Vig (Salesforce AI Research)
  • You can find their full tutorial slides here

What Can NLP + Vis Be Used For?

  • Visual text analytics
  • Natural language interfaces for visualizations
  • Text generation for visualizations
  • Automatic visual story generation
  • Visualization retrieval and recommendation
  • Etc.

What Can NLP + Vis Be Used For?

  • We will first divide these applications into two categories:
    • Natural language as input
    • Natural language as output

NL as Input: ChartQA

ChartQA Dataset

  • Real-world charts crawled from various online sources
  • 9.6K human-authored and 23.1K machine-generated questions
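
To get a feel for the data, here is a minimal loading sketch using the Hugging Face datasets library. The Hub ID (a community mirror) and the field names are assumptions; verify against the authors' repository (github.com/vis-nlp/ChartQA) and inspect ds.features before relying on them.

```python
from datasets import load_dataset

# Hub ID and field names are assumptions; check the authors' repo
# (github.com/vis-nlp/ChartQA) for the canonical release.
ds = load_dataset("HuggingFaceM4/ChartQA", split="train")

example = ds[0]
print(example["query"])  # the natural language question about the chart
print(example["label"])  # the gold answer
```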

ChartQA Approach

ChartQA Evaluation

  • VisionTaPas achieves state-of-the-art (SOTA) performance
  • Accuracies are lower on the authors’ dataset than on previous benchmarks, mainly due to the human-written visual and logical reasoning questions
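
ChartQA scores predictions with “relaxed accuracy”: numeric answers may deviate from the gold value by up to 5%, while textual answers must match exactly. A minimal sketch of that metric (details such as case handling are assumptions, not the official evaluation script):

```python
def relaxed_match(prediction: str, target: str, tolerance: float = 0.05) -> bool:
    """Relaxed accuracy for one ChartQA-style answer: numeric answers
    get a 5% relative tolerance, textual answers need an exact match.
    Case-insensitive string comparison is an assumption made here."""
    try:
        pred, gold = float(prediction), float(target)
    except ValueError:
        return prediction.strip().lower() == target.strip().lower()
    if gold == 0:
        return pred == 0
    return abs(pred - gold) / abs(gold) <= tolerance
```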

OpenAI’s Study of GPT-4 on ChartQA Benchmark

NL as Input: Multimodal Inputs for Visualizations

  • Ambiguity widgets: Eviza (Setlur et al., 2016)
  • Allows users to refine or repair ambiguous queries

NL as Input: Multimodal Inputs for Visualizations

  • Query completion through text and interactive vis: Sneak Pique (Setlur et al., 2020)
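
A toy illustration of the idea, not the paper's implementation: suggestions are drawn from the data itself (column names and cell values), which is what makes the autocompletion a data discovery scaffold. All names below are hypothetical.

```python
def autocomplete(prefix: str, columns: dict[str, list[str]]) -> list[str]:
    """Suggest query completions from the data itself (column names
    and cell values), a simplified take on data-aware autocompletion.
    Toy sketch; Sneak Pique's actual system is far richer."""
    prefix = prefix.lower()
    candidates = list(columns) + [v for vals in columns.values() for v in vals]
    return [c for c in candidates if c.lower().startswith(prefix)]

data = {"country": ["Canada", "Chile", "China"], "year": ["2019", "2020"]}
print(autocomplete("ch", data))  # ['Chile', 'China']
```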

NL as Output: Chart-to-Text

Chart-to-Text Example Models

  • Full fine-tuning of BART/T5 on the authors’ datasets
  • Setup 1: linearize the underlying data table as the input
  • Setup 2: feed OCR-extracted text from the chart image as the input
  • Prefix for T5: “translate Chart to Text:”
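
A minimal sketch of Setup 1's input format using Hugging Face transformers. The linearization scheme (row-major text with “|” separators) is an assumption, and a stock t5-base without the authors' fine-tuning will not produce a meaningful summary; the sketch only illustrates the input/output plumbing.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Hypothetical table; the exact linearization format is an assumption.
table = [("Year", "Sales"), ("2019", "120"), ("2020", "95"), ("2021", "160")]
linearized = " | ".join(" ".join(row) for row in table)

# Setup 1: task prefix (from the slide) + linearized table as input.
inputs = tokenizer("translate Chart to Text: " + linearized, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```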

Chart-to-Text Sample Output

VisText

  • 12.4K charts with generated + crowd-sourced captions
  • Scene graphs with a hierarchical representation of a chart’s visual elements
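
For intuition, here is a hypothetical scene graph for a small bar chart; the actual VisText schema differs, so treat this purely as an illustration of the hierarchical idea.

```python
# Illustrative scene graph for a single-series bar chart, loosely in
# the spirit of VisText's hierarchical representation. The schema is
# an assumption; see the VisText release for the real format.
scene_graph = {
    "chart": {
        "title": "Annual Sales",
        "axes": {
            "x": {"label": "Year", "ticks": ["2019", "2020", "2021"]},
            "y": {"label": "Sales (units)"},
        },
        "marks": [
            {"type": "bar", "x": "2019", "y": 120},
            {"type": "bar", "x": "2020", "y": 95},
            {"type": "bar", "x": "2021", "y": 160},
        ],
    }
}
```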

VisText Sample Output

  • Correctly identifies the upward trend, but repeats the claim twice

NL as Output: Open-Ended Question Answering with Charts

Combining Language and Visualizations as Output

  • Roles of natural language
    • Generating an explanatory answer
    • Explaining the answer

Combining Language and Visualizations as Output

  • An example of combining text and vis as a multimodal output

Combining Language and Visualizations as Output

  • DataShot (Wang et al., 2020)
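
DataShot mines “data facts” (extremes, trends, proportions, and so on) from a table and assembles the top-ranked ones into a fact sheet. A toy extraction sketch under simplified assumptions; the paper's fact taxonomy and importance ranking are much richer.

```python
# Toy "data fact" extraction in the spirit of DataShot: scan a series
# for simple facts (extreme, overall trend) that could populate a fact
# sheet. Fact types and phrasing here are simplified assumptions.
def extract_facts(years, values):
    facts = []
    hi = max(range(len(values)), key=values.__getitem__)
    facts.append(f"{years[hi]} had the highest value ({values[hi]}).")
    if values[-1] > values[0]:
        facts.append(f"Values rose overall from {values[0]} to {values[-1]}.")
    return facts

print(extract_facts(["2019", "2020", "2021"], [120, 95, 160]))
```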

Conversational QA With Visualization

  • Evizeon (Hoque et al., TVCG 2017)
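
Evizeon's key mechanism is conversational context that persists across utterances, so a follow-up like “what about Canada?” inherits constraints from earlier queries. A toy sketch of that state carry-over; the parsing and vocabularies are hypothetical simplifications.

```python
class ConversationState:
    """Carries filters across utterances so follow-up questions inherit
    context from earlier ones. Keyword matching stands in for Evizeon's
    real language pragmatics; the vocabularies here are hypothetical."""

    DISEASES = {"measles", "mumps"}
    PLACES = {"uk", "canada", "france"}

    def __init__(self):
        self.filters = {}

    def handle(self, utterance: str) -> dict:
        for token in utterance.lower().replace("?", "").split():
            if token in self.DISEASES:
                self.filters["disease"] = token
            elif token in self.PLACES:
                self.filters["place"] = token
        return dict(self.filters)

state = ConversationState()
print(state.handle("show measles in the uk"))  # {'disease': 'measles', 'place': 'uk'}
print(state.handle("what about canada"))       # disease kept, place updated
```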

Open Challenges & Ongoing Research

  • Design of natural language interfaces
    • Must consider richness and ambiguities of natural language
    • Complex reasoning required to predict the answer
    • Computer vision challenges for automatic understanding of image charts
    • Inherently interdisciplinary (HCI, ML, NLP, InfoVis, Computer Vision)

Open Challenges & Ongoing Research

  • Dataset creation
    • Need for large-scale, real-world benchmark datasets
    • Most existing datasets lack realism
    • For many problem setups, there is no benchmark

Open Challenges & Ongoing Research

  • Challenges with natural language generation
    • Hallucinations
    • Factual errors
    • Perceptual and reasoning aspects
    • Computer vision challenges

Open Challenges & Ongoing Research

  • Improving logical and visual reasoning

Open Challenges & Ongoing Research

  • Computer vision challenges (e.g. chart data extraction)

Open Challenges & Ongoing Research

  • How can we effectively combine text and visualization in data stories?

Open Challenges & Ongoing Research

  • NLP for visualization accessibility

Project Proposal Feedback

  • For the remainder of the lab, please look over the feedback on your project proposals with your group members
    • Any questions? Concerns? Any progress you want feedback on? Feel free to ask now

References

Alam, Md Zubair Ibne, Shehnaz Islam, and Enamul Hoque. 2023. “SeeChart: Enabling Accessible Visualizations Through Interactive Natural Language Interface for People with Visual Impairments.” IUI ’23. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3581641.3584099.
Kantharaj, Shankar, Xuan Long Do, Rixie Tiffany Leong, Jia Qing Tan, Enamul Hoque, and Shafiq Joty. 2022. “OpenCQA: Open-Ended Question Answering with Charts.” In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, edited by Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang, 11817–37. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.811.
Kantharaj, Shankar, Rixie Tiffany Leong, Xiang Lin, Ahmed Masry, Megh Thakkar, Enamul Hoque, and Shafiq Joty. 2022. “Chart-to-Text: A Large-Scale Benchmark for Chart Summarization.” In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), edited by Smaranda Muresan, Preslav Nakov, and Aline Villavicencio, 4005–23. Dublin, Ireland: Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.277.
Masry, Ahmed, Xuan Long Do, Jia Qing Tan, Shafiq Joty, and Enamul Hoque. 2022. “ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning.” In Findings of the Association for Computational Linguistics: ACL 2022, edited by Smaranda Muresan, Preslav Nakov, and Aline Villavicencio, 2263–79. Dublin, Ireland: Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-acl.177.
Setlur, Vidya, Enamul Hoque, Dae Hyun Kim, and Angel X. Chang. 2020. “Sneak Pique: Exploring Autocompletion as a Data Discovery Scaffold for Supporting Visual Analysis.” In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology, 966–78. UIST ’20. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3379337.3415813.
Sharif, Ather, Olivia H. Wang, Alida T. Muongchan, Katharina Reinecke, and Jacob O. Wobbrock. 2022. “VoxLens: Making Online Data Visualizations Accessible with an Interactive JavaScript Plug-in.” In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. CHI ’22. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3491102.3517431.
Shi, Danqing, Xinyue Xu, Fuling Sun, Yang Shi, and Nan Cao. 2020. “Calliope: Automatic Visual Data Story Generation from a Spreadsheet.” CoRR abs/2010.09975. https://arxiv.org/abs/2010.09975.
Tang, Benny, Angie Boggust, and Arvind Satyanarayan. 2023. “VisText: A Benchmark for Semantically Rich Chart Captioning.” In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), edited by Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, 7268–98. Toronto, Canada: Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.acl-long.401.
Wang, Yun, Zhida Sun, Haidong Zhang, Weiwei Cui, Ke Xu, Xiaojuan Ma, and Dongmei Zhang. 2020. “DataShot: Automatic Generation of Fact Sheets from Tabular Data.” IEEE Transactions on Visualization and Computer Graphics 26: 895–905. https://api.semanticscholar.org/CorpusID:201093978.