- SimMatchV2: Semi-Supervised Learning with Graph Consistency
- Improved Knowledge Transfer for Semi-Supervised Domain Adaptation via Trico Training Strategy
- VQA-GNN: Reasoning with Multimodal Knowledge via Graph Neural Networks for Visual Question Answering
- SGAligner: 3D Scene Alignment with Scene Graphs
- Visual Traffic Knowledge Graph Generation from Scene Images
- Vision HGNN: An Image is More than a Graph of Nodes
- Detecting Objects with Context-Likelihood Graphs and Graph Refinement
- Face Clustering via Graph Convolutional Networks with Confidence Edges
- CLIP-Cluster: CLIP-Guided Attribute Hallucination for Face Clustering
- GraphEcho: Graph-Driven Unsupervised Domain Adaptation for Echocardiogram Video Segmentation
- ShapeScaffolder: Structure-Aware 3D Shape Generation from Text
- End2End Multi-View Feature Matching with Differentiable Pose Optimization
- GlueStick: Robust Image Matching by Sticking Points and Lines Together
- DMNet: Delaunay Meshing Network for 3D Shape Representation
- PC-Adapter: Topology-Aware Adapter for Efficient Domain Adaption on Point Clouds with Rectified Pseudo-label
- Chasing Clouds: Differentiable Volumetric Rasterisation of Point Clouds as a Highly Efficient and Accurate Loss for Large-Scale Deformable 3D Registration
- CO-PILOT: Dynamic Top-Down Point Cloud with Conditional Neighborhood Aggregation for Multi-Gigapixel Histopathology Image Representation
- Learning Adaptive Neighborhoods for Graph Neural Networks
- 3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking
- Learn TAROT with MENTOR: A Meta-Learned Self-Supervised Approach for Trajectory Prediction
- Efficient Deep Space Filling Curve
- DDG-Net: Discriminability-Driven Graph Network for Weakly-supervised Temporal Action Localization
- WaterMask: Instance Segmentation for Underwater Imagery
- Re:PolyWorld - A Graph Neural Network for Polygonal Scene Parsing
- Reconstructing Groups of People with Hypergraph Relational Reasoning
- Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image
- Dynamic Hyperbolic Attention Network for Fine Hand-object Reconstruction
- Spectral Graphormer: Spectral Graph-Based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images
- DCPB: Deformable Convolution Based on the Poincaré Ball for Top-view Fisheye Cameras
- Video Action Segmentation via Contextually Refined Temporal Keypoints
- HopFIR: Hop-wise GraphFormer with Intragroup Joint Refinement for 3D Human Pose Estimation
- GLA-GCN: Global-local Adaptive Graph Convolutional Network for 3D Human Pose Estimation from Monocular Video
- HDG-ODE: A Hierarchical Continuous-Time Model for Human Pose Forecasting
- CheckerPose: Progressive Dense Keypoint Localization for Object Pose Estimation with Graph Neural Network
- HandR2N2: Iterative 3D Hand Pose Estimation Using a Residual Recurrent Neural Network
- Normalizing Flows for Human Pose Anomaly Detection
- SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training
- Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching
- FSAR: Federated Skeleton-based Action Recognition with Adaptive Topology Structure and Knowledge Distillation
- SkeleTR: Towards Skeleton-based Action Recognition in the Wild
- Hierarchically Decomposed Graph Convolutional Networks for Skeleton-Based Action Recognition
- Physics-Augmented Autoencoder for 3D Skeleton-Based Gait Recognition
- GPGait: Generalized Pose-based Gait Recognition
- Leveraging Spatio-Temporal Dependency for Skeleton-Based Action Recognition
- ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking
- Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model
- Narrator: Towards Natural Control of Human-Scene Interaction Generation via Relationship Reasoning
- Scene Graph Contrastive Learning for Embodied Navigation
- MAPConNet: Self-supervised 3D Pose Transfer with Mesh and Point Contrastive Learning
- ReactioNet: Learning High-Order Facial Behavior from Universal Stimulus-Reaction by Dyadic Relation Reasoning
- GlobalMapper: Arbitrary-Shaped Urban Layout Generation
- Persistent-Transient Duality: A Multi-Mechanism Approach for Modeling Human-Object Interaction
- RLSAC: Reinforcement Learning Enhanced Sample Consensus for End-to-End Robust Estimation
- Virtual Try-On with Pose-Garment Keypoints Guided Inpainting
- VertexSerum: Poisoning Graph Neural Networks for Link Inference
- MolGrapher: Graph-based Visual Recognition of Chemical Structures
- Open-vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models
- CoSign: Exploring Co-occurrence Signals in Skeleton-based Continuous Sign Language Recognition
- Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views
- HM-ViT: Hetero-Modal Vehicle-to-Vehicle Cooperative Perception with Vision Transformer