Publications in CVPR 2024

Transformers

Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers
FlowerFormer: Empowering Neural Architecture Encoding using a Flow-aware Graph Transformer
Compositional Video Understanding with Spatiotemporal Structure-based Transformers
MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers
Bezier Everywhere All at Once: Learning Drivable Lanes as Bezier Graphs

Multiple Modalities

Adaptive Hyper-graph Aggregation for Modality-Agnostic Federated Learning
OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning
Compositional Chain-of-Thought Prompting for Large Multimodal Models
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective

Contrastive Learning

Improving Graph Contrastive Learning via Adaptive Positive Sampling
CLIP-Driven Open-Vocabulary 3D Scene Graph Generation via Cross-Modality Contrastive Learning

Scene Graphs

SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks
Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships
GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs
Multi-Level Neural Scene Graphs for Dynamic Urban Environments
Composing Object Relations and Attributes for Image-Text Matching
Neighbor Relations Matter in Video Scene Detection

Scene Graph Generation

DSGG: Dense Relation Transformer for an End-to-end Scene Graph Generation
EGTR: Extracting Graph from Transformer for Scene Graph Generation
HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation
HIG: Hierarchical Interlacement Graph Approach to Scene Graph Generation in Video Understanding
OED: Towards One-stage End-to-End Dynamic Scene Graph Generation
LLM4SGG: Large Language Models for Weakly Supervised Scene Graph Generation
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models
Leveraging Predicate and Triplet Learning for Scene Graph Generation

Point Clouds

DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching
Denoising Point Clouds in Latent Space via Graph Convolution and Invertible Neural Network
Object Dynamics Modeling with Hierarchical Point Cloud-based Representations

LiDAR

GLiDR: Topologically Regularized Graph Generative Network for Sparse LiDAR Point Clouds
LiDAR-based Person Re-identification

Dynamic Graphs

Dynamic Graph Representation with Knowledge-aware Attention for Histopathology Whole Slide Image Analysis
GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs

Generative Models

Generating Handwritten Mathematical Expressions From Symbol Graphs: An End-to-End Pipeline
HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D Text2Shape Generation

Layout Generation

Constrained Layout Generation with Factor Graphs
MaskPLAN: Masked Generative Layout Planning from Partial Input

Diffusion

HHMR: Holistic Hand Mesh Recovery by Enhancing the Multimodal Controllability of Graph Diffusion Models
Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs
DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly
Relation Rectification in Diffusion Model
Neural Sign Actors: A Diffusion Model for 3D Sign Language Production from Text

Molecules

Molecular Data Programming: Towards Molecule Pseudo-labeling with Systematic Weak Supervision
Clustering for Protein Representation Learning

Trajectories

Multi-agent Long-term 3D Human Pose Forecasting via Interaction-aware Trajectory Conditioning
Higher-order Relational Reasoning for Pedestrian Trajectory Prediction

Healthcare

Tumor Micro-environment Interactions Guided Graph Learning for Survival Analysis of Human Cancers from Whole-slide Pathological Images
XFibrosis: Explicit Vessel-Fiber Modeling for Fibrosis Staging from Liver Pathology Images

Skeleton-based Modelling

BlockGCN: Redefine Topology Awareness for Skeleton-Based Action Recognition
Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing

Benchmarks

SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge
Advancing Saliency Ranking with Human Fixations: Dataset, Models and Benchmarks

Matching

Neural Markov Random Field for Stereo Matching
MESA: Matching Everything by Segmenting Anything
CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition

3D Data

MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation
CAGE: Controllable Articulation GEneration
G-FARS: Gradient-Field-based Auto-Regressive Sampling for 3D Part Grouping
3D Feature Tracking via Event Camera
TeMO: Towards Text-Driven 3D Stylization for Multi-Object Meshes
Category-Level Multi-Part Multi-Joint 3D Shape Assembly
VS: Reconstructing Clothed 3D Human from Single Image via Vertex Shift

Miscellaneous

FC-GNN: Recovering Reliable and Accurate Correspondences from Interferences
Domain Separation Graph Neural Networks for Saliency Object Ranking
Improving Out-of-Distribution Generalization in Graphs via Hierarchical Semantic Environments
SignGraph: A Sign Sequence is Worth Graphs of Nodes
Image Processing GNN: Breaking Rigidity in Super-Resolution
Learning Structure-from-Motion with Graph Attention Networks
MemoNav: Working Memory Model for Visual Navigation
Error Detection in Egocentric Procedural Task Videos
Semantic-Aware Multi-Label Adversarial Attacks
Learning for Transductive Threshold Calibration in Open-World Recognition
