- Transformer with Implicit Edges for Particle-based Physics Simulation
- VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
- IGFormer: Interaction Graph Transformer for Skeleton-based Human Interaction Recognition
- GTCaR: Graph Transformer for Camera Re-localization
- Video Graph Transformer for Video Question Answering
- Learning from Unlabeled 3D Environments for Vision-and-Language Navigation
- MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes
- Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification
- Tree Structure-Aware Few-Shot Image Classification via Hierarchical Aggregation
- PoserNet: Refining Relative Camera Poses Exploiting Object Detections
- Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph
- AutoAlignV2: Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection
- Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
- GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs
- Generative Subgraph Contrast for Self-Supervised Graph Representation Learning
- Self-supervised Social Relation Representation for Human Group Detection
- Unsupervised Segmentation in Real-World Images via Spelke Object Inference
- Self-Supervised Learning of Visual Graph Matching
- Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective
- CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
- FRT-PAD: Effective Presentation Attack Detection Driven by Face Related Task
- Adversarial Label Poisoning Attack on Graph Neural Networks via Label Propagation
- Exploring the Devil in Graph Spectral Domain for 3D Point Cloud Attacks
- diffConv: Analyzing Irregular Point Clouds with an Irregular View
- GraphFit: Learning Multi-scale Graph-Convolutional Representation for Point Cloud Normal Estimation
- Revisiting Point Cloud Simplification: A Learnable Feature Preserving Approach
- 3D Human Pose Estimation Using Möbius Graph Convolutional Networks
- Pose Forecasting in Industrial Human-Robot Collaboration
- DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks
- Diverse Human Motion Prediction Guided by Multi-level Spatial-Temporal Anchors
- Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction
- Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction
- Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos
- S2Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning
- Graph Neural Network for Cell Tracking in Microscopy Videos
- Robust Landmark-based Stent Tracking in X-ray Fluoroscopy
- PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking?
- Panoramic Human Activity Recognition
- D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights
- TIDEE: Tidying Up Novel Rooms Using Visuo-Semantic Commonsense Priors
- End-to-end Graph-constrained Vectorized Floorplan Generation with Panoptic Refinement
- SelectionConv: Convolutional Neural Networks for Non-rectilinear Image Data
- Learning Graph Neural Networks for Image Style Transfer
- TD-Road: Top-Down Road Network Extraction with Holistic Graph Construction
- Learning Self-prior for Mesh Denoising using Dual Graph Convolutional Networks
- Relationship Spatialization for Depth Estimation
- The Shape Part Slot Machine: Contact-Based Reasoning for Generating 3D Shapes from Parts
- An Efficient Person Clustering Algorithm for Open Checkout-free Groceries