Resource List by category

[pdf]: paper PDF online link
[repo]: paper PDF repo link
[github]: github link
[web]: website link

multimodal knowledge graph
multimodal representation learning
information extraction
tutorials
datasets

multimodal knowledge graph

Year	Author	Conf.	Title	Links
2015.11	Zhu et al.	arXiv	Building a Large-scale Multimodal Knowledge Base System for Answering Visual Queries	[pdf] [repo]
2017.08	Xie et al.	IJCAI'17	Image-embodied Knowledge Representation Learning	[pdf] [repo]
2018.01	Saha et al.	AAAI'18	Towards Building Large Scale Multimodal Domain-Aware Conversation Systems	[pdf] [repo]
2018.06	Mousselly-Sergieh et al.	*SEM'18	A Multimodal Translation-Based Approach for Knowledge Graph Representation Learning	[pdf] [repo]
2018.11	Pezeshkpour et al.	EMNLP'18	Embedding Multimodal Relational Data for Knowledge Base Completion	[pdf] [repo]
2019.03	Liu et al.	ESWC'19	MMKG: Multi-Modal Knowledge Graphs	[pdf] [repo]
2019.05	Rubio et al.	AKBC'19	Answering Visual-Relational Queries in Web-Extracted Knowledge Graphs	[pdf] [repo]
2019.06	Wang et al.	IJCNN'19	Multimodal Data Enhanced Representation Learning for Knowledge Graphs	[pdf] [repo]
2019.10	Liu et al.	ACM-MM'19	Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding	[pdf] [repo]
2019.10	Zhang et al.	ACM-MM'19	Multi-modal Knowledge-aware Hierarchical Attention Network for Explainable Medical Question Answering	[pdf] [repo]
2020.07	Li et al.	ACL'20	GAIA: A Fine-grained Multimedia Knowledge Extraction System	[pdf] [repo]
2020.08	Chen et al.	KSEM'20	MMEA: Entity Alignment for Multi-modal Knowledge Graph	[pdf] [repo]
2020.08	Xie et al.	EasyChair	Construction of Multi-modal Chinese Tourism Knowledge Graph	[pdf] [repo]
2020.10	Wang et al.	ICMR'20	Fake News Detection via Knowledge-driven Multimodal Graph Convolutional Networks	[pdf] [repo]
2020.10	Kannan et al.	CIKM'20	Multimodal Knowledge Graph for Deep Learning Papers and Code	[pdf] [repo]
2020.10	Sun et al.	CIKM'20	Multi-modal Knowledge Graphs for Recommender Systems	[pdf] [repo]
2020.11	Wang et al.	EMNLP'20	Incorporating Multimodal Information in Open-Domain Web Keyphrase Extraction	[pdf] [repo] [github]
2021.02	Liu et al.	AAAI'21	Visual Pivoting for (Unsupervised) Entity Alignment	[pdf] [repo]
2021.02	Sun et al.	AAAI'21	RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER	[pdf] [repo] [github]
2021.06	Wang et al.	CVPR'21	Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation	[pdf] [repo]
2021.06	Zhang et al.	CVPR'21	Explicit Knowledge Incorporation for Visual Reasoning	[pdf] [repo]
2021.10	Wang et al.	ACM-MM'21	Is Visual Context Really Helpful for Knowledge Graph? A Representation Learning Perspective	[pdf] [repo]
2022.02	Zhu et al.	IEEE	Multi-Modal Knowledge Graph Construction and Application: A Survey	[pdf] [repo]
2022.05	Chen et al.	SIGIR'22	Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion	[pdf] [repo] [github]
2022.05	Wang et al.	ACL'22	WikiDiverse: A Multimodal Entity Linking Dataset with Diversified Contextual Topics and Entity Types	[pdf] [repo] [dataset]
2022.05	Chen et al.	NAACL'22	Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction	[pdf] [repo] [github]
2022.06	Ding et al.	CVPR'22	MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering	[pdf] [repo]
2022.06	Chang et al.	CVPR'22	WebQA: Multihop and Multimodal QA	[pdf] [repo]
2022.08	Chen et al.	KDD'22	Multi-modal Siamese Network for Entity Alignment	[pdf] [repo] [github]
2022.10	Cao et al.	ACM-MM'22	Cross-modal Knowledge Graph Contrastive Learning for Machine Learning Method Recommendation	[pdf] [repo]
2022.10	Xu et al.	ACM-MM'22	Relation-enhanced Negative Sampling for Multimodal Knowledge Graph Completion	[pdf] [repo]
2022.11	Cao et al.	NeurIPS'22	OTKGE: Multi-modal Knowledge Graph Embeddings via Optimal Transport	[pdf] [repo]
2022.11	Lin et al.	NeurIPS'22	REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering	[pdf] [repo]
2022.11	Pan et al.	NeurIPS'22	Contrastive Language-Image Pre-Training with Knowledge Graphs	[pdf] [repo]
2022.11	Yang et al.	NeurIPS'22	Rethinking Knowledge Graph Evaluation Under the Open-World Assumption	[pdf] [repo]
2022.12	Zhao et al.	EMNLP'22	MoSE: Modality Split and Ensemble for Multimodal Knowledge Graph Completion	[pdf] [repo] [github]
2022.12	Zhou et al.	EMNLP'22	A Span-based Multimodal Variational Autoencoder for Semi-supervised Multimodal Named Entity Recognition	[pdf] [repo] [github]
2023.02	Feng et al.	TOMM'23	MKVSE: Multimodal Knowledge Enhanced Visual-Semantic Embedding for Image-Text Retrieval	[pdf] [repo] [github]
2023.04	Li et al.	WWW'23	Attribute-Consistent Knowledge Graph Representation Learning for Multi-Modal Entity Alignment	[pdf] [repo]
2023.04	Yao et al.	WWW'23	CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge	[pdf] [repo]
2023.07	Si et al.	ACL'23	Combo of Thinking and Observing for Outside-Knowledge VQA	[pdf] [repo]
2023.07	Yao et al.	ACL'23	VisKoP: Visual Knowledge oriented Programming for Interactive Knowledge Base Question Answering	[pdf] [repo]
2023.07	Luo et al.	KDD'23	Multi-Grained Multimodal Interaction Network for Entity Linking	[pdf] [repo]
2023.07	Wu et al.	KDD'23	Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation	[pdf] [repo]
2023.11	Deng et al.	ICDE'23	Construction and Applications of Billion-Scale Pre-Trained Multimodal Business Knowledge Graph	[pdf] [repo]
2023.11	Wen et al.	IEEE'23	IMKGA-SM: Interpretable Multimodal Knowledge Graph Answer Prediction via Sequence Modeling	[pdf] [repo]
2024.01	Mondal et al.	AAAI'24	KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning	[pdf] [repo]
2024.01	Liu et al.	AAAI'24	Beyond Entities: A Large-Scale Multi-Modal Knowledge Graph with Triplet Fact Grounding	[pdf] [repo] [github]
2024.01	Liang et al.	AAAI'24	Self-Supervised Multi-Modal Knowledge Graph Contrastive Hashing for Cross-Modal Search	[pdf] [repo]
2024.01	Shang et al.	AAAI'24	LAFA: Multimodal Knowledge Graph Completion with Link Aware Fusion and Aggregation	[pdf] [repo]
2024.02	Chen et al.	arXiv	Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey	[pdf] [repo] [github]
2024.02	Zhang et al.	COLING'24	Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion	[pdf] [repo] [github]

multimodal representation learning

Year	Author	Conf.	Title	Links
2016.03	Vendrov et al.	ICLR'16	Order-Embeddings of Image and Language	[pdf] [repo] [github]
2018.09	Huang et al.	ECCV'18	Multimodal Unsupervised Image-to-Image Translation	[pdf] [repo]
2018.11	Wang et al.	EMNLP'18	Associative Multichannel Autoencoder for Multimodal Word Representation	[pdf] [repo]
2019.04	Yang et al.	arXiv	Shared Predictive Cross-Modal Deep Quantization	[pdf] [repo]
2019.10	Guo et al.	ACM-MM'19	Aligning Linguistic Words and Visual Semantic Units for Image Captioning	[pdf] [repo] [github]
2019.10	He et al.	ACM-MM'19	A New Benchmark and Approach for Fine-grained Cross-media Retrieval	[pdf] [repo] [github]
2019.10	Huang et al.	ACM-MM'19	Annotation Efficient Cross-Modal Retrieval with Adversarial Attentive Alignment	[pdf] [repo]
2019.10	Nie et al.	ACM-MM'19	Multimodal Dialog System: Generating Responses via Adaptive Decoders	[pdf] [repo]
2020.01	Xi et al.	ICMLSC'20	Multimodal Sentiment Analysis based on Multi-head Attention Mechanism	[pdf] [repo]
2020.01	Park et al.	WACV'20	MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding	[pdf] [repo]
2020.02	Kim et al.	AAAI'20	MULE: Multimodal Universal Language Embedding	[pdf] [repo]
2020.02	Mai et al.	AAAI'20	Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion	[pdf] [repo]
2020.03	Zhang et al.	JSTSP/IEEE	Multimodal Intelligence: Representation Learning, Information Fusion, and Applications	[pdf] [repo]
2020.08	Wang et al.	KDD'20	Multimodal Learning with Incomplete Modalities by Knowledge Distillation	[pdf] [repo]
2020.10	Chiou et al.	arXiv	Visual Relationship Detection with Visual-Linguistic Knowledge from Multimodal Representations	[pdf] [repo]
2020.11	Tsai et al.	EMNLP'20	Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis	[pdf] [repo] [github]
2020.12	Wang et al.	NeurIPS'20	Deep Multimodal Fusion by Channel Exchanging	[pdf] [repo] [github]
2021.04	Zhu et al.	EACL'21	Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation	[pdf] [repo] [github]
2021.04	Sun et al.	EACL'21	A New View of Multi-modal Language Analysis: Audio and Video Features as Text Styles	[pdf] [repo]
2021.04	Sahu et al.	EACL'21	Adaptive Fusion Techniques for Multimodal Data	[pdf] [repo] [github]
2021.06	Xu et al.	ACL'21	LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding	[pdf] [repo]
2021.06	Cao et al.	ACL'21	Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases	[pdf] [repo]
2021.06	Xing et al.	ACL'21	KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation	[pdf] [repo]
2021.06	Su et al.	ACL'21	GEM: A General Evaluation Benchmark for Multimodal Tasks	[pdf] [repo]
2021.06	Marino et al.	CVPR'21	KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA	[pdf] [repo]
2021.06	Yuan et al.	CVPR'21	Multimodal Contrastive Training for Visual Representation Learning	[pdf] [repo]
2022.01	Salin et al.	AAAI'22	Are Vision-Language Transformers Learning Multimodal Representations? A Probing Perspective	[pdf] [repo]
2022.04	Cai et al.	WWW'22	Multimodal Continual Graph Learning with Neural Architecture Search	[pdf] [repo]
2022.04	Eyuboglu et al.	ICLR'22	Domino: Discovering Systematic Errors with Cross-Modal Embeddings	[pdf] [repo]
2022.05	Wu et al.	ACL'22	Understanding Multimodal Procedural Knowledge by Sequencing Multimodal Instructional Manuals	[pdf] [repo]
2022.05	Zhang et al.	ACL'22	Modeling Temporal-Modal Entity Graph for Procedural Multimodal Machine Comprehension	[pdf] [repo]
2022.05	Wang et al.	NAACL'22	ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition	[pdf] [repo]
2022.06	Ma et al.	CVPR'22	Are Multimodal Transformers Robust to Missing Modality?	[pdf] [repo]
2022.10	Jia et al.	ACM-MM'22	Query Prior Matters: A MRC Framework for Multimodal Named Entity Recognition	[pdf] [repo]
2022.10	Zhao et al.	ACM-MM'22	Learning from Different text-image Pairs: A Relation-enhanced Graph Convolutional Network for Multimodal NER	[pdf] [repo]
2022.11	Kim et al.	NeurIPS'22	Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching	[pdf] [repo]
2022.11	Huang et al.	NeurIPS'22	MACK: Multimodal Aligned Conceptual Knowledge for Unpaired Image-text Matching	[pdf] [repo]
2022.11	Liang et al.	NeurIPS'22	Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning	[pdf] [repo]
2023.02	Zeng et al.	AAAI'23	Multi-Modal Knowledge Hypergraph for Diverse Image Retrieval	[pdf] [repo]
2023.07	Luo et al.	ACL'23	End-to-end Knowledge Retrieval with Multi-modal Queries	[pdf] [repo] [github]
2023.08	Peng et al.	IJCAI'23	An Empirical Study on the Language Modal in Visual Question Answering	[pdf] [repo]
2023.08	Yan et al.	IJCAI'23	Prompt Learns Prompt: Exploring Knowledge-Aware Generative Prompt Collaboration for Video Captioning	[pdf] [repo]
2024.01	Cui et al.	AAAI'24	Continual Vision-Language Retrieval via Dynamic Knowledge Rectification	[pdf] [repo]
2024.01	Kim et al.	AAAI'24	Improving Open Set Recognition via Visual Prompts Distilled from Common-Sense Knowledge	[pdf] [repo]
2024.06	Chen et al.	CVPR'24	LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge	[pdf] [repo]
2024.07	Lin et al.	arXiv	Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge	[pdf] [repo]

information extraction

Year	Author	Conf.	Title	Links
2009.08	Mintz et al.	ACL'09	Distant Supervision for Relation Extraction without Labeled Data	[pdf] [repo]
2013.12	Bordes et al.	NIPS'13	Translating Embeddings for Modeling Multi-relational Data	[pdf] [repo]
2014.07	Wang et al.	AAAI'14	Knowledge Graph Embedding by Translating on Hyperplanes	[pdf] [repo]
2014.08	Zeng et al.	COLING'14	Relation Classification via Convolutional Deep Neural Network	[pdf] [repo]
2015.08	Zeng et al.	EMNLP'15	Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks	[pdf] [repo]
2015.10	Ji et al.	ACL'15	Knowledge Graph Embedding via Dynamic Mapping Matrix	[pdf] [repo]
2016.02	Ji et al.	AAAI'16	Knowledge Graph Completion with Adaptive Sparse Transfer Matrix	[pdf] [repo]
2016.08	Lin et al.	ACL'16	Neural Relation Extraction with Selective Attention over Instances	[pdf] [repo]
2016.08	Xiao et al.	ACL'16	TransG: A Generative Model for Knowledge Graph Embedding	[pdf] [repo]
2021.11	Yuan et al.	EMNLP'21	Interactive Machine Comprehension with Dynamic Knowledge Graphs	[pdf] [repo]
2021.11	Guo et al.	EMNLP'21	BiQUE: Biquaternionic Embeddings of Knowledge Graphs	[pdf] [repo]
2021.11	Dash et al.	EMNLP'21	Open Knowledge Graphs Canonicalization using Variational Autoencoders	[pdf] [repo]
2021.11	Meng et al.	EMNLP'21	Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT	[pdf] [repo]
2021.11	Oliya et al.	EMNLP'21	End-to-End Entity Resolution and Question Answering Using Differentiable Knowledge Graphs	[pdf] [repo]
2023.02	Chung et al.	AAAI'23	Learning Representations of Bi-level Knowledge Graph for Reasoning beyond Link Prediction	[pdf] [repo]
2023.02	Huang et al.	AAAI'23	Enabling Knowledge Refinement upon New Concepts in Abductive Learning	[pdf] [repo]
2023.04	Zhang et al.	WWW'23	Structure Pretraining and Prompt Tuning for Knowledge Graph Transfer	[pdf] [repo]
2023.07	Baek et al.	ACL'23	Direct Fact Retrieval from Knowledge Graphs without Entity Linking	[pdf] [repo]
2023.07	Kim et al.	ACL'23	FACTKG: Fact Verification via Reasoning on Knowledge Graphs	[pdf] [repo]
2023.07	Li et al.	ACL'23	To Copy Rather Than Memorize: A Vertical Learning Paradigm for Knowledge Graph Completion	[pdf] [repo] [github]

tutorials

Author	Conf. / Uni	Title	Links
Heiko Paulheim	Universität Mannheim	Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge Extraction or Knowledge Hallucination?	[pdf]
Ilharco et al.	ICDM'20	Online Multimodal Knowledge Discovery	[pdf]
Shih-Fu Chang	CVPR'19 & Uni. Columbia	Multimodal Knowledge Graphs: Automatic Extraction & Applications	[pdf] [repo]
Dihong Gong	Uni. Florida	Towards Building Large-Scale Multimodal Knowledge Bases	[pdf] [repo]
Elias Karle & Umutcan Simsek	Uni Innsbruck	How To Build a Knowledge Graph	[pdf] [repo]
Jay Pujara & Sameer Singh	WSDM'18	Mining Knowledge Graphs from Text	[pdf] [repo1] [repo2A] [repo2B] [repo3] [repo4] [repo5]
Sameer Singh	Uni California, Irvine	Injecting Prior Information and Multiple Modalities into Knowledge Base Embeddings	[pdf] [repo]

datasets

Year	Author	Conf.	Title	Links
2022.11	Lu et al.	NeurIPS'22	Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering	[pdf] [repo]
