The First International Workshop on

Bringing Semantic Knowledge into
Vision and Text Understanding

In conjunction with IJCAI-2019, Macao, China

Submission Deadline: April 18, 2019

Extracting and understanding the high-level semantic information in vision and text data is considered one of the key capabilities of effective artificial intelligence (AI) systems, and it has been explored in many areas of AI, including computer vision, natural language processing, machine learning, data mining, and knowledge representation. Owing to the success of deep representation learning, we have observed increasing research efforts at the intersection of vision and language aimed at a better understanding of semantics, such as image captioning and visual question answering. In addition, exploiting external semantic knowledge (e.g., semantic relations, knowledge graphs) for vision and text understanding deserves more attention: the vast amount of external semantic knowledge could support a “deeper” understanding of vision and/or text data, e.g., describing the contents of images in a more natural way, constructing a comprehensive knowledge graph for movies, or building a dialog system equipped with commonsense knowledge.

This workshop will provide a forum for researchers to review recent progress in vision and text understanding, with an emphasis on novel approaches that involve deeper and better semantic understanding of vision and text data. The workshop targets a broad audience, including researchers and practitioners in computer vision, natural language processing, machine learning, and data mining.

Workshop Topics

Image and Video Captioning

Visual Question Answering and Visual Dialog

Scene Graph Generation from Visual Data

Video Prediction and Reasoning

Scene Understanding

Knowledge Graph Construction

Knowledge Graph Embedding

Representation Learning

Question Answering over Knowledge Bases

Dialog Systems Using Knowledge Graphs

Adversarial Generation of Language & Images

Graphical Causal Models

Multimodal Representation and Fusion

Transfer Learning across Vision and Text

Submission Guidelines

Three types of submissions are invited to the workshop: long papers (up to 7 pages), short papers (up to 4 pages) and demo papers (up to 4 pages).

All submissions should be formatted according to the IJCAI-2019 Formatting Instructions and Templates. Authors are required to submit their papers electronically in PDF format to the Microsoft CMT submission site.

At least one author of each accepted paper must register for the workshop; registration information can be found on the IJCAI-2019 website. Authors of accepted papers are expected to present their work at the workshop.

For any questions regarding paper submission, please email us: [AT] or [AT]

  • Submission Deadline: April 18, 2019 (11:59PM UTC-12)
  • Notification: May 10, 2019 (11:59PM UTC-12)
  • Camera Ready: June 1, 2019 (11:59PM UTC-12)
  • TBD



    Sheng Li

    Assistant Professor
    University of Georgia


    Yaliang Li

    Research Scientist
    Alibaba Group


    Jing Gao

    Associate Professor
    University at Buffalo


    Yun Fu

    Associate Professor
    Northeastern University

    Program Committee