Resource List by category
[pdf]: paper PDF online link
[repo]: paper PDF repo link
[github]: github link
[web]: website link
- multimodal knowledge graph
- multimodal representation learning
- information extraction
- tutorials
- datasets
Year | Author | Conf. | Title | Links |
---|---|---|---|---|
2015.11 | Zhu et al. | arXiv | Building a Large-scale Multimodal Knowledge Base System for Answering Visual Queries | [pdf] [repo] |
2017.08 | Xie et al. | IJCAI'17 | Image-embodied Knowledge Representation Learning | [pdf] [repo] |
2018.01 | Saha et al. | AAAI'18 | Towards Building Large Scale Multimodal Domain-Aware Conversation Systems | [pdf] [repo] |
2018.06 | Mousselly-Sergieh et al. | *SEM'18 | A Multimodal Translation-Based Approach for Knowledge Graph Representation Learning | [pdf] [repo] |
2018.11 | Pezeshkpour et al. | EMNLP'18 | Embedding Multimodal Relational Data for Knowledge Base Completion | [pdf] [repo] |
2019.03 | Liu et al. | ESWC'19 | MMKG: Multi-Modal Knowledge Graphs | [pdf] [repo] |
2019.05 | Rubio et al. | AKBC'19 | Answering Visual-Relational Queries in Web-Extracted Knowledge Graphs | [pdf] [repo] |
2019.06 | Wang et al. | IJCNN'19 | Multimodal Data Enhanced Representation Learning for Knowledge Graphs | [pdf] [repo] |
2019.10 | Liu et al. | ACM-MM'19 | Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding | [pdf] [repo] |
2019.10 | Zhang et al. | ACM-MM'19 | Multi-modal Knowledge-aware Hierarchical Attention Network for Explainable Medical Question Answering | [pdf] [repo] |
2020.07 | Li et al. | ACL'20 | GAIA: A Fine-grained Multimedia Knowledge Extraction System | [pdf] [repo] |
2020.08 | Chen et al. | KSEM'20 | MMEA: Entity Alignment for Multi-modal Knowledge Graph | [pdf] [repo] |
2020.08 | Xie et al. | EasyChair | Construction of Multi-modal Chinese Tourism Knowledge Graph | [pdf] [repo] |
2020.10 | Wang et al. | ICMR'20 | Fake News Detection via Knowledge-driven Multimodal Graph Convolutional Networks | [pdf] [repo] |
2020.10 | Kannan et al. | CIKM'20 | Multimodal Knowledge Graph for Deep Learning Papers and Code | [pdf] [repo] |
2020.10 | Sun et al. | CIKM'20 | Multi-modal Knowledge Graphs for Recommender Systems | [pdf] [repo] |
2020.11 | Wang et al. | EMNLP'20 | Incorporating Multimodal Information in Open-Domain Web Keyphrase Extraction | [pdf] [repo] [github] |
2021.02 | Liu et al. | AAAI'21 | Visual Pivoting for (Unsupervised) Entity Alignment | [pdf] [repo] |
2021.02 | Sun et al. | AAAI'21 | RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER | [pdf] [repo] [github] |
2021.06 | Wang et al. | CVPR'21 | Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation | [pdf] [repo] |
2021.06 | Zhang et al. | CVPR'21 | Explicit Knowledge Incorporation for Visual Reasoning | [pdf] [repo] |
2021.10 | Wang et al. | ACM-MM'21 | Is Visual Context Really Helpful for Knowledge Graph? A Representation Learning Perspective | [pdf] [repo] |
2022.02 | Zhu et al. | IEEE | Multi-Modal Knowledge Graph Construction and Application: A Survey | [pdf] [repo] |
2022.05 | Chen et al. | SIGIR'22 | Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion | [pdf] [repo] [github] |
2022.05 | Wang et al. | ACL'22 | WikiDiverse: A Multimodal Entity Linking Dataset with Diversified Contextual Topics and Entity Types | [pdf] [repo] [dataset] |
2022.05 | Chen et al. | NAACL'22 | Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction | [pdf] [repo] [github] |
2022.06 | Ding et al. | CVPR'22 | MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering | [pdf] [repo] |
2022.06 | Chang et al. | CVPR'22 | WebQA: Multihop and Multimodal QA | [pdf] [repo] |
2022.08 | Chen et al. | KDD'22 | Multi-modal Siamese Network for Entity Alignment | [pdf] [repo] [github] |
2022.10 | Cao et al. | ACM-MM'22 | Cross-modal Knowledge Graph Contrastive Learning for Machine Learning Method Recommendation | [pdf] [repo] |
2022.10 | Xu et al. | ACM-MM'22 | Relation-enhanced Negative Sampling for Multimodal Knowledge Graph Completion | [pdf] [repo] |
2022.11 | Cao et al. | NeurIPS'22 | OTKGE: Multi-modal Knowledge Graph Embeddings via Optimal Transport | [pdf] [repo] |
2022.11 | Lin et al. | NeurIPS'22 | REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering | [pdf] [repo] |
2022.11 | Pan et al. | NeurIPS'22 | Contrastive Language-Image Pre-Training with Knowledge Graphs | [pdf] [repo] |
2022.11 | Yang et al. | NeurIPS'22 | Rethinking Knowledge Graph Evaluation Under the Open-World Assumption | [pdf] [repo] |
2022.12 | Zhao et al. | EMNLP'22 | MoSE: Modality Split and Ensemble for Multimodal Knowledge Graph Completion | [pdf] [repo] [github] |
2022.12 | Zhou et al. | EMNLP'22 | A Span-based Multimodal Variational Autoencoder for Semi-supervised Multimodal Named Entity Recognition | [pdf] [repo] [github] |
2023.02 | Feng et al. | TOMM'23 | MKVSE: Multimodal Knowledge Enhanced Visual-Semantic Embedding for Image-Text Retrieval | [pdf] [repo] [github] |
2023.04 | Li et al. | WWW'23 | Attribute-Consistent Knowledge Graph Representation Learning for Multi-Modal Entity Alignment | [pdf] [repo] |
2023.04 | Yao et al. | WWW'23 | CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge | [pdf] [repo] |
2023.07 | Si et al. | ACL'23 | Combo of Thinking and Observing for Outside-Knowledge VQA | [pdf] [repo] |
2023.07 | Yao et al. | ACL'23 | VisKoP: Visual Knowledge oriented Programming for Interactive Knowledge Base Question Answering | [pdf] [repo] |
2023.07 | Luo et al. | KDD'23 | Multi-Grained Multimodal Interaction Network for Entity Linking | [pdf] [repo] |
2023.07 | Wu et al. | KDD'23 | Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation | [pdf] [repo] |
2023.11 | Deng et al. | ICDE'23 | Construction and Applications of Billion-Scale Pre-Trained Multimodal Business Knowledge Graph | [pdf] [repo] |
2023.11 | Wen et al. | IEEE'23 | IMKGA-SM: Interpretable Multimodal Knowledge Graph Answer Prediction via Sequence Modeling | [pdf] [repo] |
2024.01 | Mondal et al. | AAAI'24 | KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning | [pdf] [repo] |
2024.01 | Liu et al. | AAAI'24 | Beyond Entities: A Large-Scale Multi-Modal Knowledge Graph with Triplet Fact Grounding | [pdf] [repo] [github] |
2024.01 | Liang et al. | AAAI'24 | Self-Supervised Multi-Modal Knowledge Graph Contrastive Hashing for Cross-Modal Search | [pdf] [repo] |
2024.01 | Shang et al. | AAAI'24 | LAFA: Multimodal Knowledge Graph Completion with Link Aware Fusion and Aggregation | [pdf] [repo] |
2024.02 | Chen et al. | arXiv | Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey | [pdf] [repo] [github] |
2024.02 | Zhang et al. | COLING'24 | Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion | [pdf] [repo] [github] |
Year | Author | Conf. | Title | Links |
---|---|---|---|---|
2016.03 | Vendrov et al. | ICLR'16 | Order-Embeddings of Image and Language | [pdf] [repo] [github] |
2018.09 | Huang et al. | ECCV'18 | Multimodal Unsupervised Image-to-Image Translation | [pdf] [repo] |
2018.11 | Wang et al. | EMNLP'18 | Associative Multichannel Autoencoder for Multimodal Word Representation | [pdf] [repo] |
2019.04 | Yang et al. | arXiv | Shared Predictive Cross-Modal Deep Quantization | [pdf] [repo] |
2019.10 | Guo et al. | ACM-MM'19 | Aligning Linguistic Words and Visual Semantic Units for Image Captioning | [pdf] [repo] [github] |
2019.10 | He et al. | ACM-MM'19 | A New Benchmark and Approach for Fine-grained Cross-media Retrieval | [pdf] [repo] [github] |
2019.10 | Huang et al. | ACM-MM'19 | Annotation Efficient Cross-Modal Retrieval with Adversarial Attentive Alignment | [pdf] [repo] |
2019.10 | Nie et al. | ACM-MM'19 | Multimodal Dialog System: Generating Responses via Adaptive Decoders | [pdf] [repo] |
2020.01 | Xi et al. | ICMLSC'20 | Multimodal Sentiment Analysis based on Multi-head Attention Mechanism | [pdf] [repo] |
2020.01 | Park et al. | WACV'20 | MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding | [pdf] [repo] |
2020.02 | Kim et al. | AAAI'20 | MULE: Multimodal Universal Language Embedding | [pdf] [repo] |
2020.02 | Mai et al. | AAAI'20 | Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion | [pdf] [repo] |
2020.03 | Zhang et al. | JSTSP/IEEE | Multimodal Intelligence: Representation Learning, Information Fusion, and Applications | [pdf] [repo] |
2020.08 | Wang et al. | KDD'20 | Multimodal Learning with Incomplete Modalities by Knowledge Distillation | [pdf] [repo] |
2020.10 | Chiou et al. | arXiv | Visual Relationship Detection with Visual-Linguistic Knowledge from Multimodal Representations | [pdf] [repo] |
2020.11 | Tsai et al. | EMNLP'20 | Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis | [pdf] [repo] [github] |
2020.12 | Wang et al. | NeurIPS'20 | Deep Multimodal Fusion by Channel Exchanging | [pdf] [repo] [github] |
2021.04 | Zhu et al. | EACL'21 | Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | [pdf] [repo] [github] |
2021.04 | Sun et al. | EACL'21 | A New View of Multi-modal Language Analysis: Audio and Video Features as Text Styles | [pdf] [repo] |
2021.04 | Sahu et al. | EACL'21 | Adaptive Fusion Techniques for Multimodal Data | [pdf] [repo] [github] |
2021.06 | Xu et al. | ACL'21 | LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding | [pdf] [repo] |
2021.06 | Cao et al. | ACL'21 | Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases | [pdf] [repo] |
2021.06 | Xing et al. | ACL'21 | KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation | [pdf] [repo] |
2021.06 | Su et al. | ACL'21 | GEM: A General Evaluation Benchmark for Multimodal Tasks | [pdf] [repo] |
2021.06 | Marino et al. | CVPR'21 | KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA | [pdf] [repo] |
2021.06 | Yuan et al. | CVPR'21 | Multimodal Contrastive Training for Visual Representation Learning | [pdf] [repo] |
2022.01 | Salin et al. | AAAI'22 | Are Vision-Language Transformers Learning Multimodal Representations? A Probing Perspective | [pdf] [repo] |
2022.04 | Cai et al. | WWW'22 | Multimodal Continual Graph Learning with Neural Architecture Search | [pdf] [repo] |
2022.04 | Eyuboglu et al. | ICLR'22 | Domino: Discovering Systematic Errors with Cross-Modal Embeddings | [pdf] [repo] |
2022.05 | Wu et al. | ACL'22 | Understanding Multimodal Procedural Knowledge by Sequencing Multimodal Instructional Manuals | [pdf] [repo] |
2022.05 | Zhang et al. | ACL'22 | Modeling Temporal-Modal Entity Graph for Procedural Multimodal Machine Comprehension | [pdf] [repo] |
2022.05 | Wang et al. | NAACL'22 | ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition | [pdf] [repo] |
2022.06 | Ma et al. | CVPR'22 | Are Multimodal Transformers Robust to Missing Modality? | [pdf] [repo] |
2022.10 | Jia et al. | ACM-MM'22 | Query Prior Matters: A MRC Framework for Multimodal Named Entity Recognition | [pdf] [repo] |
2022.10 | Zhao et al. | ACM-MM'22 | Learning from Different text-image Pairs: A Relation-enhanced Graph Convolutional Network for Multimodal NER | [pdf] [repo] |
2022.11 | Kim et al. | NeurIPS'22 | Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching | [pdf] [repo] |
2022.11 | Huang et al. | NeurIPS'22 | MACK: Multimodal Aligned Conceptual Knowledge for Unpaired Image-text Matching | [pdf] [repo] |
2022.11 | Liang et al. | NeurIPS'22 | Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning | [pdf] [repo] |
2023.02 | Zeng et al. | AAAI'23 | Multi-Modal Knowledge Hypergraph for Diverse Image Retrieval | [pdf] [repo] |
2023.07 | Luo et al. | ACL'23 | End-to-end Knowledge Retrieval with Multi-modal Queries | [pdf] [repo] [github] |
2023.08 | Peng et al. | IJCAI'23 | An Empirical Study on the Language Modal in Visual Question Answering | [pdf] [repo] |
2023.08 | Yan et al. | IJCAI'23 | Prompt Learns Prompt: Exploring Knowledge-Aware Generative Prompt Collaboration for Video Captioning | [pdf] [repo] |
2024.01 | Cui et al. | AAAI'24 | Continual Vision-Language Retrieval via Dynamic Knowledge Rectification | [pdf] [repo] |
2024.01 | Kim et al. | AAAI'24 | Improving Open Set Recognition via Visual Prompts Distilled from Common-Sense Knowledge | [pdf] [repo] |
2024.06 | Chen et al. | CVPR'24 | LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge | [pdf] [repo] |
2024.07 | Lin et al. | arXiv | Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge | [pdf] [repo] |
Year | Author | Conf. | Title | Links |
---|---|---|---|---|
2009.08 | Mintz et al. | ACL'09 | Distant Supervision for Relation Extraction without Labeled Data | [pdf] [repo] |
2013.12 | Bordes et al. | NIPS'13 | Translating Embeddings for Modeling Multi-relational Data | [pdf] [repo] |
2014.07 | Wang et al. | AAAI'14 | Knowledge Graph Embedding by Translating on Hyperplanes | [pdf] [repo] |
2014.08 | Zeng et al. | COLING'14 | Relation Classification via Convolutional Deep Neural Network | [pdf] [repo] |
2015.08 | Zeng et al. | EMNLP'15 | Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks | [pdf] [repo] |
2015.10 | Ji et al. | ACL'15 | Knowledge Graph Embedding via Dynamic Mapping Matrix | [pdf] [repo] |
2016.02 | Ji et al. | AAAI'16 | Knowledge Graph Completion with Adaptive Sparse Transfer Matrix | [pdf] [repo] |
2016.08 | Lin et al. | ACL'16 | Neural Relation Extraction with Selective Attention over Instances | [pdf] [repo] |
2016.08 | Xiao et al. | ACL'16 | TransG: A Generative Model for Knowledge Graph Embedding | [pdf] [repo] |
2021.11 | Yuan et al. | EMNLP'21 | Interactive Machine Comprehension with Dynamic Knowledge Graphs | [pdf] [repo] |
2021.11 | Guo et al. | EMNLP'21 | BiQUE: Biquaternionic Embeddings of Knowledge Graphs | [pdf] [repo] |
2021.11 | Dash et al. | EMNLP'21 | Open Knowledge Graphs Canonicalization using Variational Autoencoders | [pdf] [repo] |
2021.11 | Meng et al. | EMNLP'21 | Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT | [pdf] [repo] |
2021.11 | Oliya et al. | EMNLP'21 | End-to-End Entity Resolution and Question Answering Using Differentiable Knowledge Graphs | [pdf] [repo] |
2023.02 | Chung et al. | AAAI'23 | Learning Representations of Bi-level Knowledge Graph for Reasoning beyond Link Prediction | [pdf] [repo] |
2023.02 | Huang et al. | AAAI'23 | Enabling Knowledge Refinement upon New Concepts in Abductive Learning | [pdf] [repo] |
2023.04 | Zhang et al. | WWW'23 | Structure Pretraining and Prompt Tuning for Knowledge Graph Transfer | [pdf] [repo] |
2023.07 | Baek et al. | ACL'23 | Direct Fact Retrieval from Knowledge Graphs without Entity Linking | [pdf] [repo] |
2023.07 | Kim et al. | ACL'23 | FACTKG: Fact Verification via Reasoning on Knowledge Graphs | [pdf] [repo] |
2023.07 | Li et al. | ACL'23 | To Copy Rather Than Memorize: A Vertical Learning Paradigm for Knowledge Graph Completion | [pdf] [repo] [github] |
Author | Conf. / Uni | Title | Links |
---|---|---|---|
Heiko Paulheim | Universität Mannheim | Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge Extraction or Knowledge Hallucination? | [pdf] |
Ilharco et al. | ICDM'20 | Online Multimodal Knowledge Discovery | [pdf] |
Shih-Fu Chang | CVPR'19 & Uni. Columbia | Multimodal Knowledge Graphs: Automatic Extraction & Applications | [pdf] [repo] |
Dihong Gong | Uni. Florida | Towards Building Large-Scale Multimodal Knowledge Bases | [pdf] [repo] |
Elias Karle & Umutcan Simsek | Uni Innsbruck | How To Build a Knowledge Graph | [pdf] [repo] |
Jay Pujara & Sameer Singh | WSDM'18 | Mining Knowledge Graphs from Text | [pdf] [repo1] [repo2A] [repo2B] [repo3] [repo4] [repo5] |
Sameer Singh | Uni California, Irvine | Injecting Prior Information and Multiple Modalities into Knowledge Base Embeddings | [pdf] [repo] |
Year | Author | Conf. | Title | Links |
---|---|---|---|---|
2022.11 | Lu et al. | NeurIPS'22 | Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | [pdf] [repo] |