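ECCV 2020 will be held online on August 23-28, 2020, and 1,361 papers were accepted this year. This post is the first part of the accepted paper list; for the second part, see the link.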


| Paper ID | Paper Title | Category |
|---|---|---|
| 267 | Quaternion Equivariant Capsule Networks for 3D Point Clouds | Oral |
| 283 | DeepFit: 3D Surface Fitting by Neural Network Weighted Least Squares | Oral |
| 343 | MoSaNAS: Multi-Objective Surrogate-Assisted Neural Architecture Search | Oral |
| 384 | Describing Textures using Natural Language | Oral |
| 410 | Empowering Relational Network by Self-Attention Augmented Conditional Random Fields for Group Activity Recognition | Oral |
| 445 | AiR: Attention with Reasoning Capability | Oral |
| 500 | Self6D: Self-Supervised Monocular 6D Object Pose Estimation | Oral |
| 529 | Invertible Image Rescaling | Oral |
| 612 | Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation | Oral |
| 677 | House-GAN: Relational Generative Adversarial Networks for Graph-constrained House Layout Generation | Oral |
| 736 | Crowdsampling the Plenoptic Function | Oral |
| 738 | End-to-End Estimation of Multi-Person 3D Poses from Multiple Cameras | Oral |
| 832 | End-to-End Object Detection with Transformers | Oral |
| 840 | DeepSFM: Structure From Motion Via Deep Bundle Adjustment | Oral |
| 1044 | Ladybird: Deep Implicit Field Based 3D Reconstruction with Sampling and Symmetry | Oral |
| 1059 | Segment as Points for Efficient Online Multi-Object Tracking and Segmentation | Oral |
| 1105 | Conditional Convolutions for Instance Segmentation | Oral |
| 1196 | MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution | Oral |
| 1203 | Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset | Oral |
| 1273 | Privacy Preserving Structure-from-Motion | Oral |
| 1326 | Rewriting a Deep Generative Model | Oral |
| 1417 | Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets | Oral |
| 1448 | Long-term Human Motion Prediction with Scene Context | Oral |
| 1473 | NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis | Oral |
| 1501 | ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes | Oral |
| 1737 | MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images | Oral |
| 1793 | Learning and aggregating deep local descriptors for instance-level recognition | Oral |
| 1969 | A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem | Oral |
| 2096 | Learn to Recover Visible Color for Video Surveillance in a Day | Oral |
| 2149 | Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single-view Images | Oral |
| 2193 | Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation | Oral |
| 2211 | BorderDet: Border Feature for Dense Object Detection | Oral |
| 2258 | Regularization with Latent Space Virtual Adversarial Training | Oral |
| 2263 | Du$^2$Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels | Oral |
| 2307 | Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot learning | Oral |
| 2463 | Targeted Attack for Deep Hashing based Retrieval | Oral |
| 2471 | Gradient Centralization: A New Optimization Technique for Deep Neural Networks | Oral |
| 2503 | Content-Aware Unsupervised Deep Homography Estimation | Oral |
| 2556 | Multi-View Optimization of Local Feature Geometry | Oral |
| 2597 | Efficient Model Fitting by Combining Lifted Optimization with Phong Surface Models | Oral |
| 2641 | Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video | Oral |
| 2683 | Learning Stereo from Single Images | Oral |
| 2748 | Prototype Rectification for Few-Shot Learning | Oral |
| 2784 | Learning Feature Descriptors using Camera Pose Supervision | Oral |
| 2785 | Semantic Flow for Fast and Accurate Scene Parsing | Oral |
| 2788 | Appearance Consensus Driven Self-Supervised Human Mesh Recovery | Oral |
| 2825 | Diffraction Line Imaging | Oral |
| 2834 | Aligning and Projecting Images to Class-conditional Generative Networks | Oral |
| 2852 | Suppress and Balance: A Simple Gated Network for Salient Object Detection | Oral |
| 2904 | Visual Memorability for Robotic Interestingness Prediction via Unsupervised Online Learning | Oral |
| 2949 | Post-Training Piecewise Linear Quantization for Deep Neural Networks | Oral |
| 2974 | Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification | Oral |
| 2978 | In-Home Daily-Life Captioning Using Radio Signals | Oral |
| 3018 | Self-Challenging Improves Cross-Domain Generalization | Oral |
| 3029 | A Competence-aware Curriculum for Visual Concepts Learning via Question Answering | Oral |
| 3047 | Multi-task Learning Increases Adversarial Robustness | Oral |
| 3054 | S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search | Oral |
| 3112 | Improving Deep Video Compression by Resolution-adaptive Flow Coding | Oral |
| 3158 | Motion Capture from Internet Videos | Oral |
| 3183 | Appearance-Preserving 3D Convolution for Video-based Person Re-identification | Oral |
| 3241 | Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization | Oral |
| 3265 | Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation | Oral |
| 3312 | Deep Spatial-angular Regularization for Compressive Light Field Reconstruction over Coded Apertures | Oral |
| 3331 | Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling | Oral |
| 3356 | Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction | Oral |
| 3376 | Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network | Oral |
| 3387 | Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation | Oral |
| 3439 | Coherent full scene 3D reconstruction from a single RGB image | Oral |
| 3482 | Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs | Oral |
| 3526 | RAFT: Recurrent All-Pairs Field Transforms for Optical Flow | Oral |
| 3528 | Domain-invariant Stereo Matching Networks | Oral |
| 3538 | DeepHandMesh: Weakly-supervised Deep Encoder-Decoder Framework for High-fidelity Hand Mesh Modeling from a Single RGB Image | Oral |
| 3544 | Content Adaptive and Error Propagation Aware Deep Video Compression | Oral |
| 3553 | Towards Streaming Image Understanding | Oral |
| 3570 | Towards Automated Testing and Robustification by Semantic Adversarial Data Generation | Oral |
| 3582 | Adversarial Generative Grammars for Human Activity Prediction | Oral |
| 3587 | Greedy Sampler and Dumb Learner: A Surprisingly Effective Approach for Continual Learning | Oral |
| 3622 | Learning Lane Graph Representations for Motion Forecasting | Oral |
| 3651 | What Matters in Unsupervised Optical Flow | Oral |
| 3678 | Synthesis and Completion of Facades from Satellite Imagery | Oral |
| 3772 | Mapillary Planet-Scale Depth Dataset | Oral |
| 3838 | V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction | Oral |
| 3891 | Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters | Oral |
| 3948 | EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning | Oral |
| 3975 | Intrinsic Point Cloud Interpolation via Dual Latent Space Navigation | Oral |
| 3976 | Cross-Domain Cascaded Deep Translation | Oral |
| 4043 | "Look Ma, no landmarks!" - Unsupervised, model-based dense face alignment | Oral |
| 4158 | Online Invariance Selection for Local Feature Descriptors | Oral |
| 4179 | Rethinking image inpainting via a mutual encoder-decoder with feature equalization | Oral |
| 4358 | TextCaps: a Dataset for Image Captioning with Reading Comprehension | Oral |
| 4423 | It is not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction | Oral |
| 4440 | Learning What to Learn for Video Object Segmentation | Oral |
| 4732 | SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing | Oral |
| 4866 | LIMP: Learning Latent Shape Representations with Metric Preservation Priors | Oral |
| 5277 | Unsupervised Sketch-to-Photo Synthesis | Oral |
| 5360 | A simple way to make neural networks robust against diverse image corruptions | Oral |
| 5457 | SoftpoolNet: Shape Descriptor for Point Cloud Completion and Classification | Oral |
| 5800 | Hierarchical Face Aging through Disentangled Latent Characteristics | Oral |
| 5859 | Hybrid Models for Open Set Recognition | Oral |
| 5932 | TopoGAN: A Topology-Aware Generative Adversarial Network | Oral |
| 6101 | Learning to Localize Actions from Moments | Oral |
| 6147 | ForkGAN: Seeing into the Rainy Night | Oral |
| 6209 | TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning | Oral |
| 6502 | ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval | Oral |
| 22 | A Simple and Versatile Framework for Image-to-Image Translation | Spotlight |
| 43 | ProxyBNN: Learning Binarized Neural Networks via Proxy Matrices | Spotlight |
| 87 | Fair Attribute Classification through Latent Space De-biasing | Spotlight |
| 148 | HMOR: Hierarchical Multi-person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation | Spotlight |
| 193 | Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve | Spotlight |
| 223 | A Unified Framework of Surrogate Loss by Refactorization and Interpolation | Spotlight |
| 362 | Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images | Spotlight |
| 366 | Memory-augmented Dense Predictive Coding for Video Representation Learning | Spotlight |
| 378 | PointMixup: Augmentation for Point Clouds | Spotlight |
| 415 | Identity-Guided Human Semantic Parsing Learning for Person Re-Identification | Spotlight |
| 462 | Learning Gradient Fields for Shape Generation | Spotlight |
| 467 | Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder | Spotlight |
| 492 | Corner Proposal Network for Anchor-free, Two-stage Object Detection | Spotlight |
| 495 | PhraseClick: Toward Achieving Flexible Interactive Segmentation by Phrase and Click | Spotlight |
| 513 | Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing | Spotlight |
| 526 | Learning Delicate Local Representations for Multi-Person Pose Estimation | Spotlight |
| 544 | Learning to plan with uncertain topological maps | Spotlight |
| 574 | Neural Design Network: Graphic Layout Generation with Constraints | Spotlight |
| 591 | Learning Open Set Network with Discriminative Reciprocal Points | Spotlight |
| 597 | Convolutional Occupancy Networks | Spotlight |
| 672 | Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry | Spotlight |
| 849 | A General Toolbox for Understanding Errors in Object Detection | Spotlight |
| 893 | PointContrast: Unsupervised Pretraining for 3D Point Cloud Understanding | Spotlight |
| 922 | DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation | Spotlight |
| 990 | Circumventing Outliers of AutoAugment with Knowledge Distillation | Spotlight |
| 997 | S2DNet: Learning accurate correspondences for sparse-to-dense feature matching | Spotlight |
| 1054 | RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving | Spotlight |
| 1062 | Video Object Segmentation with Graph Memory Network | Spotlight |
| 1101 | Rethinking Bottleneck Structure for Efficient Mobile Network Design | Spotlight |
| 1104 | Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks | Spotlight |
| 1121 | Towards Part-aware Monocular 3D Human Pose Estimation: An Architecture Search Approach | Spotlight |
| 1207 | A Tool for Measuring and Mitigating Bias in Visual Datasets | Spotlight |
| 1327 | Contrastive Learning for Weakly Supervised Phrase Grounding | Spotlight |
| 1362 | Collaborative Learning of Gesture Recognition and 3D Hand Pose Estimation with Multi-Order Feature Analysis | Spotlight |
| 1425 | Studying the Transferability of Adversarial Attacks on Object Detectors | Spotlight |
| 1449 | TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images | Spotlight |
| 1479 | Semi-Siamese Training for Shallow Face Learning | Spotlight |
| 1488 | GAN Slimming: All-in-One Unified GAN Compression | Spotlight |
| 1526 | Human Interaction Learning on 3D Skeleton Point Clouds for Video Violence Recognition | Spotlight |
| 1530 | Binarized Neural Network for Single Image Super Resolution | Spotlight |
| 1564 | Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation | Spotlight |
| 1605 | Adaptive Computationally Efficient Network for Monocular 3D Hand Pose Estimation | Spotlight |
| 1624 | Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking | Spotlight |
| 1631 | Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets | Spotlight |
| 1676 | Hamiltonian Dynamics for Real-World Shape Interpolation | Spotlight |
| 1694 | Learning to Scale Multilingual Representations for Vision-Language Tasks | Spotlight |
| 1710 | Multi-modal Transformer for Video Retrieval | Spotlight |
| 1761 | Matching Feature Matters: End-to-End Learning for Neural Texture Transfer | Spotlight |
| 1802 | RobustFusion: Human Volumetric Capture with Data-driven Visual Cues using a RGBD Camera | Spotlight |
| 1886 | Surface Normal Estimation of Tilted Images via Spatial Rectifier | Spotlight |
| 1915 | Multimodal Shape Completion via Conditional Generative Adversarial Networks | Spotlight |
| 1977 | Generative Sparse Detection Network for 3D Single-shot Object Detection | Spotlight |
| 1987 | Grounded Situation Recognition | Spotlight |
| 2019 | Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos | Spotlight |
| 2157 | Unpaired Learning of Deep Blind Image Denoising | Spotlight |
| 2191 | Self-supervising Fine-grained Region Similarities for Large-scale Image Localization | Spotlight |
| 2215 | Rotationally-Temporally Consistent Novel-View Synthesis of Human Performance Video | Spotlight |
| 2272 | Side-Aware Boundary Localization for More Precise Object Detection | Spotlight |
| 2314 | SF-Net: Single-Frame Supervision for Temporal Action Localization | Spotlight |
| 2317 | Negative Margin Matters: Understanding Margin in Few-shot Classification | Spotlight |
| 2323 | Particularity beyond Commonality: Unpaired Identity Transfer with Multiple References | Spotlight |
| 2342 | Tracking objects as points | Spotlight |
| 2390 | CPGAN: Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis | Spotlight |
| 2402 | Transporting Labels via Hierarchical Optimal Transport for Semi-Supervised Learning | Spotlight |
| 2449 | MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning | Spotlight |
| 2473 | Learning to Factorize a City | Spotlight |
| 2495 | Region Graph Embedding Network for Zero-Shot Learning | Spotlight |
| 2534 | GRAB: A Dataset of Whole-Body Human Grasping of Objects | Spotlight |
| 2616 | DEMEA: Deep Mesh Autoencoders for Non-Rigidly Deforming Objects | Spotlight |
| 2623 | RANSAC-Flow: generic two-stage image alignment | Spotlight |
| 2632 | Semantic Object Prediction with Binaural Sounds | Spotlight |
| 2636 | Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images | Spotlight |
| 2666 | Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking | Spotlight |
| 2707 | Pixel-Pair Occlusion Relationship Map (P2ORM): Formulation, Inference & Application | Spotlight |
| 2710 | MovieNet: A Holistic Dataset for Movie Understanding | Spotlight |
| 2723 | Short-Term and Long-Term Context Aggregation Network for Video Inpainting | Spotlight |
| 2754 | Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DOF Relocalization | Spotlight |
| 2755 | Face Super-Resolution Guided by 3D Facial Priors | Spotlight |
| 2763 | Label Propagation with Augmented Anchors: A Simple Semi-Supervised Learning baseline for Unsupervised Domain Adaptation | Spotlight |
| 2767 | Are Labels Necessary for Neural Architecture Search? | Spotlight |
| 2776 | BLSM: A Bone-Level Skinned Model of the Human Mesh | Spotlight |
| 2826 | Associative Alignment for Few-shot Image Classification | Spotlight |
| 2873 | Cyclic Functional Mapping: Self-supervised correspondence between non-isometric deformable shapes | Spotlight |
| 2905 | View-Invariant Probabilistic Embedding for Human Pose | Spotlight |
| 2918 | Contact and Human Dynamics from Monocular Video | Spotlight |
| 2950 | PointPWC-Net: Cost Volume on Point Clouds for (Self-)Supervised Scene Flow Estimation | Spotlight |
| 2965 | Point2Surf: Learning Implicit Surfaces from Point Cloud Patches | Spotlight |
| 2983 | Few-Shot Scene-Adaptive Anomaly Detection | Spotlight |
| 2986 | Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting | Spotlight |
| 2988 | Entropy Minimisation Framework for Event-based Vision Model Estimation | Spotlight |
| 2992 | Reconstructing NBA Players | Spotlight |
| 3087 | PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments | Spotlight |
| 3089 | TENet: Triple Excitation Network for Video Salient Object Detection | Spotlight |
| 3099 | Deep Feedback Inverse Problem Solver | Spotlight |
| 3119 | Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification | Spotlight |
| 3120 | Hallucinating Visual Instances in Total Absentia | Spotlight |
| 3125 | Unsupervised 3D Shape Completion in the Wild | Spotlight |
| 3335 | DTVNet: Dynamic Time-lapse Video Generation via Single Still Image | Spotlight |
| 3365 | CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss | Spotlight |
| 3385 | Collaborative Video Object Segmentation by Foreground-Background Integration | Spotlight |
| 3456 | Adaptive Margin Diversity Regularizer for handling Data Imbalance in Zero-Shot SBIR | Spotlight |
| 3477 | XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation | Spotlight |
| 3499 | Calibration-free Structure-from-Motion with Calibrated Radial Trifocal Tensors | Spotlight |
| 3594 | Occupancy anticipation for efficient navigation | Spotlight |
| 3601 | Unified Image and Video Saliency Modeling | Spotlight |
| 3604 | TAO: A Large-scale Benchmark for Tracking Any Object | Spotlight |
| 3657 | A Generalization of Otsu's Method and Minimum Error Thresholding | Spotlight |
| 3663 | A Cordial Sync: Moving Furniture by Moving Beyond Marginal Policies | Spotlight |
| 3665 | Big Transfer (BiT): General Visual Representation Learning | Spotlight |
| 3684 | Visual Commonsense Graphs: Reasoning about the Dynamic Context of a Still Image | Spotlight |
| 3831 | Few-shot Action Recognition via Permutation-invariant Attention | Spotlight |
| 3913 | Character Grounding and Re-Identification in Story of Videos and Text Descriptions | Spotlight |
| 3977 | AABO: Adaptive Anchor Box Optimization for Object Detection via Bayesian Sub-sampling | Spotlight |
| 3984 | Learning Visual Context by Comparison | Spotlight |
| 3994 | Large scale holistic video understanding | Spotlight |
| 3995 | Indirect Local Attacks for Context-aware Semantic Segmentation Networks | Spotlight |
| 4294 | Inferring Visual Overlap of Images through Interpretable Non-Metric Embeddings | Spotlight |
| 4296 | Connecting Vision and Language with Localized Narratives | Spotlight |
| 4383 | Adversarial T-shirt! Evading Person Detectors in A Physical World | Spotlight |
| 4404 | Bounding-box Channels for Visual Relationship Detection | Spotlight |
| 4407 | Minimal Rolling Shutter Absolute Pose with Unknown Focal Length and Radial Distortion | Spotlight |
| 4442 | SRFlow: Learning the Super-Resolution Space with Normalizing Flow | Spotlight |
| 4452 | DeepGMR: Learning Latent Gaussian Mixture Models for Registration | Spotlight |
| 4458 | Active 3D Perception using Light Curtains | Spotlight |
| 4521 | Invertible Neural BRDF for Object Inverse Rendering | Spotlight |
| 4545 | Semi-supervised Semantic Segmentation via Strong-weak Dual-branch Network | Spotlight |
| 4571 | Practical Deep Raw Image Denoising on Mobile Devices | Spotlight |
| 4577 | Audio-Visual Embodied Navigation | Spotlight |
| 4602 | Two-Stream Consensus Networks for Weakly-Supervised Temporal Action Localization | Spotlight |
| 4677 | Erasing Appearance Preservation in Image Smoothing | Spotlight |
| 4727 | Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler | Spotlight |
| 4749 | Guided Deep Decoder: Unsupervised Image Pair Fusion | Spotlight |
| 4809 | Filter Style Transfer between Photos | Spotlight |
| 4860 | JGR-P2O: Joint Graph Reasoning based Pixel-to-Offset Prediction Network for 3D Hand Pose Estimation from a Single Depth Image | Spotlight |
| 4867 | Dynamic Group Convolution for Accelerating Convolutional Neural Networks | Spotlight |
| 4880 | RD-GAN: Few/Zero-Shot Chinese Character Style Transfer via Radical Decomposition and Rendering | Spotlight |
| 5021 | Object-Contextual Representations for Semantic Segmentation | Spotlight |
| 5116 | Spatio-Temporal Efficient Recurrent Neural Network for Video Deblurring | Spotlight |
| 5393 | The Semantic Mutex Watershed for Efficient Bottom-Up Semantic Instance Segmentation | Spotlight |
| 5471 | Photon-Efficient 3D Imaging with A Non-Local Neural Network | Spotlight |
| 5554 | Generative Latent Textured Proxies for Category-Level Object Modeling | Spotlight |
| 5672 | Improving Vision-and-Language Navigation with Image-Text Pairs from the Web | Spotlight |
| 5685 | Directional Temporal Modeling for Action Recognition | Spotlight |
| 5714 | Shonan Rotation Averaging: Global Optimality by Surfing $SO(p)^n$ | Spotlight |
| 5723 | Semantic Curiosity for Visual Navigation | Spotlight |
| 5821 | Multi-Temporal Recurrent Neural Networks For Progressive Non-Uniform Single Image Deblurring With Incremental Temporal Training | Spotlight |
| 5975 | ProgressFace: Scale-Aware Progressive Learning for Face Detection | Spotlight |
| 6025 | Learning Multi-layer Latent Variable Model with Short Run Inference Dynamics | Spotlight |
| 6053 | CoTeRe-Net: Discovering Collaborative Ternary Relations in Videos | Spotlight |
| 6100 | Modeling the Effects of Windshield Refraction for Camera Calibration | Spotlight |
| 6124 | Skin Segmentation from NIR Images using Unsupervised Domain Adaptation through Generative Latent Search | Spotlight |
| 6254 | PROFIT: A Novel Training Method for sub-4-bit MobileNet Models | Spotlight |
| 6277 | Visual Relation Grounding in Videos | Spotlight |
| 6296 | Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows | Spotlight |
| 6314 | Controlling semantics and style in conditional image synthesis | Spotlight |
| 6360 | Jointly learning visual motion and confidence from local patches in event cameras | Spotlight |
| 6406 | SODA: Story Oriented Dense Video Captioning Evaluation Framework | Spotlight |
| 6490 | Sketch-Guided Object Localization in Natural Images | Spotlight |
| 6496 | Metric learning: cross-entropy vs. pairwise losses | Spotlight |
| 6959 | Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models | Spotlight |
| 7231 | The Hessian Penalty: A Weak Prior for Unsupervised Disentanglement | Spotlight |
| 5 | STAR: Sparse Trained Articulated Human Body Regressor | Poster |
| 13 | Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer | Poster |
| 15 | Collaboration by Competition: Self-coordinated Knowledge Amalgamation for Multi-talent Student Learning | Poster |
| 25 | Do Not Disturb Me: Person Re-identification Under the Interference of Other Pedestrians | Poster |
| 31 | Learning 3D Part Assembly from A Single Image | Poster |
| 32 | PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions | Poster |
| 50 | Highly Efficient Salient Object Detection with 100K Parameters | Poster |
| 69 | HardGAN: A Haze-Aware Representation Distillation GAN for Single Image Dehazing | Poster |
| 88 | Lifespan Age Transformation Synthesis | Poster |
| 90 | Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation | Poster |
| 106 | Synthesizing Content Consistent Vehicle Datasets with Attribute Descent | Poster |
| 116 | Multiview Pedestrian Detection with Feature Perspective Transformation | Poster |
| 121 | Learning Object Relation Graph and Tentative Policy for Visual Navigation | Poster |
| 123 | Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition | Poster |
| 132 | Across Scales & Across Dimensions: Temporal Super-Resolution using Deep Internal Learning | Poster |
| 138 | Inducing Optimal Attributes Representations for Conditional GANs | Poster |
| 152 | AR-Net: Adaptive Frame Resolution for Efficient Action Recognition | Poster |
| 156 | Image-to-Voxel Model Translation for 3D Scene Reconstruction and Segmentation | Poster |
| 157 | Consistency Guided Scene Flow Estimation | Poster |
| 160 | Autoregressive Unsupervised Image Segmentation | Poster |
| 169 | Controllable Image Synthesis via SegVAE | Poster |
| 173 | Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search | Poster |
| 177 | Efficient Non-Line-of-Sight Imaging by Circular and Confocal Scanning | Poster |
| 181 | Texture Hallucination for Large-Factor Painting Super-Resolution | Poster |
| 183 | Learning Progressive Joint Propagation for Human Motion Prediction | Poster |
| 184 | Rolling Shutter Image Stitching and Rectification via Differential Homography | Poster |
| 186 | ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds | Poster |
| 188 | The Group Loss for Deep Metric Learning | Poster |
| 203 | Learning Object Depth from Camera Motion and Video Object Segmentation | Poster |
| 206 | OnlineAugment: Online Data Augmentation with Less Domain Knowledge | Poster |
| 209 | Learning Inter-Plane Relations for Piecewise Planar Reconstruction | Poster |
| 230 | Intra-class Compactness Distillation for Semantic Segmentation | Poster |
| 233 | Temporal Distinct Representation Learning for 2D-CNN-based Action Recognition | Poster |
| 241 | Representative Graph Neural Network | Poster |
| 264 | Deformation-Aware 3D Shape Embedding and Retrieval | Poster |
| 277 | Atlas: End-to-End 3D Scene Reconstruction from Posed Images | Poster |
| 278 | Multiple Class Novelty Detection Under the Data Distribution Shift | Poster |
| 281 | Colorization of Depth Map via Disentanglement | Poster |
| 287 | Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes | Poster |
| 292 | GeoGraph: Learning graph-based multi-view object detection with geometric cues end-to-end | Poster |
| 300 | Localizing the Common Action Among a Few Videos | Poster |
| 306 | TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification | Poster |
| 312 | Traffic Accident Analysis by Cause and Effect Events Localization | Poster |
| 318 | Face Anti-Spoofing with Human Material Perception | Poster |
| 328 | How Can I See My Future? FvTraj: Using First-person View for Pedestrian Trajectory Prediction | Poster |
| 338 | Multiple Expert Brainstorming for Domain Adaptive Person Re-identification | Poster |
| 344 | NASA: Neural Articulated Shape Approximation | Poster |
| 350 | Towards Unique and Informative Captioning of Images | Poster |
| 352 | When Does Self-supervision Improve Few-shot Learning? | Poster |
| 355 | Two-branch Recurrent Network for Isolating Deepfakes in Videos | Poster |
| 360 | Incremental Few-Shot Meta-Learning via Indirect Feature Alignment | Poster |
| 363 | BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models | Poster |
| 386 | Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation | Poster |
| 392 | Global Distance-distributions Separation for Unsupervised Person Re-identification | Poster |
| 397 | I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image | Poster |
| 398 | Pose2Mesh: Graph Convolutional Network for 3D human Pose and Mesh Recovery from 2D Human Pose | Poster |
| 402 | ALRe: Outlier Detection for Guided Refinement | Poster |
| 414 | Weakly-Supervised Crowd Counting Learns from Sorting rather than Locations | Poster |
| 429 | Unsupervised Domain Attention Adaptation Network for Caricature Attribute Recognition | Poster |
| 438 | Many-shot from Low-shot: Learning to Annotate using Mixed Supervision for Object Detection | Poster |
| 441 | Curriculum DeepSDF | Poster |
| 444 | Meshing Point Clouds with Predicted Intrinsic-Extrinsic Ratio Guidance | Poster |
| 457 | Improved Adversarial Training via Learned Optimizer | Poster |
| 471 | Component Divide-and-Conquer for Real-World Image Super-Resolution | Poster |
| 479 | Enabling Deep Residual Networks for Weakly Supervised Object Detection | Poster |
| 494 | Deep near-light photometric stereo for spatially varying reflectances | Poster |
| 498 | Learning Visual Representations with Caption Annotations | Poster |
| 509 | Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier | Poster |
| 512 | Regression of Instance Boundary by Aggregated CNN and GCN | Poster |
| 520 | Social Adaptive Module for Weakly-supervised Group Activity Recognition | Poster |
| 521 | RGB-D Salient Object Detection with Cross-Modality Modulation and Selection | Poster |
| 524 | RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval | Poster |
| 536 | Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection | Poster |
| 566 | Faster Person Re-Identification | Poster |
| 570 | Quantization Guided JPEG Artifact Correction | Poster |
| 571 | 3PointTM: Faster Measurement of High-Dimensional Transmission Matrices | Poster |
| 575 | Joint Bilateral Learning for Real-time Universal Photorealistic Style Transfer | Poster |
| 581 | Beyond 3DMM Space: Towards Fine-grained 3D Face Reconstruction | Poster |
| 587 | World-Consistent Video-to-Video Synthesis | Poster |
| 596 | Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation | Poster |
| 598 | GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation in the Wild | Poster |
| 600 | Event-based Asynchronous Sparse Convolutional Networks | Poster |
| 604 | AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption | Poster |
| 607 | Spatiotemporal Attention Cell Search for Video Classification | Poster |
| 609 | REMIND Your Neural Network to Prevent Catastrophic Forgetting | Poster |
| 611 | Image Classification in the dark using Quanta Image Sensors | Poster |
| 615 | $n$-Reference Transfer Learning for Saliency Prediction | Poster |
| 618 | Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection | Poster |
| 622 | Bottom-Up Temporal Action Localization with Mutual Regularization | Poster |
| 623 | On Learning to Modulate the Gradient for Fast Adaptation of Neural Networks | Poster |
| 634 | Domain-Specific Mappings for Generative Adversarial Style Transfer | Poster |
| 636 | DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning | Poster |
| 637 | DHP: Differentiable Meta Pruning via HyperNetworks | Poster |
| 639 | Deep Transferring Quantization | Poster |
| 645 | Deep Credible Metric Learning for Unsupervised Domain Adaptation Person Re-identification | Poster |
| 648 | Temporal Coherence or Temporal Motion: Which is More Critical for Video-based Person Re-identification? | Poster |
| 666 | Arbitrary-Oriented Object Detection with Circular Smooth Label | Poster |
| 671 | Learning Event-Driven Video Deblurring and Interpolation | Poster |
| 678 | Vectorizing world buildings: planar graph reconstruction by primitive detection and relationship inference | Poster |
| 692 | Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation | Poster |
| 696 | CSCL: Critical Semantic-Consistent Learning for Unsupervised Domain Adaptation | Poster |
| 700 | Prototype Mixture Models for Few-shot Semantic Segmentation | Poster |
| 701 | Webly Supervised Image Classification with Self-Contained Confidence | Poster |
| 704 | Search what you want: Barrier Panelty NAS for mixed precision quantization | Poster |
| 709 | Monocular 3D Object Detection via Feature Domain Adaptation | Poster |
| 718 | Talking-head Generation with Rhythmic Head Motion | Poster |
| 719 | AUTO3D: Novel view synthesis through unsupervised-learned variational viewpoints and global 3D representations | Poster |
| 720 | VPN: Learning Video-Pose Embedding for Activities of Daily Living | Poster |
| 721 | Soft Anchor-Point Object Detection | Poster |
| 735 | Deformable Grid | Poster |
| 751 | Soft Expert Reward Learning for Vision-and-Language Navigation | Poster |
| 754 | Part-aware Prototype Network for Few-shot Semantic Segmentation | Poster |
| 759 | Learning from Extrinsic and Intrinsic Supervisions for Domain Generalization | Poster |
| 761 | Joint Learning of Social Groups, Individuals Action and Sub-group Activities in Videos | Poster |
| 768 | Whole-Body Human Pose Estimation in the Wild | Poster |
| 770 | Relative Pose Estimation of Calibrated Cameras with Known $\mathrm{SE}(3)$ Invariants | Poster |
| 777 | A Novel Compressed Sensing Approach on Convolutions and Runge-Kutta Methods | Poster |
| 779 | Deep Hough Transform for Semantic Line Detection | Poster |
| 781 | Cross-domain Structured Landmark Detection via Progressive Topology-Adapting Deep Graph Learning | Poster |
| 787 | 3D Human Shape and Pose from a Single Low-Resolution Image | Poster |
| 790 | Learning to Balance Specificity and Invariance for In and Out of Domain Generalization | Poster |
| 792 | Contrastive Learning for Conditional Image Generation | Poster |
| 794 | DLow: Diversifying Latent Flows for Diverse Human Motion Prediction | Poster |
| 798 | GRNet: Gridding Residual Network for Dense Point Cloud Completion | Poster |
| 800 | Learning Discriminative and Compact Representations for Gait Recognition | Poster |
| 806 | Blind Face Restoration via Deep Multi-scale Component Dictionaries | Poster |
| 866 | Robust Neural Networks inspired by Strong Stability Preserving Runge-Kutta methods | Poster |
| 867 | Inequality-Constrained and Robust 3D Face Model Fitting | Poster |
| 869 | Gabor Layers Enhance Network Robustness | Poster |
| 871 | Conditional Image Repainting via Semantic Bridge and Piecewise Value Function | Poster |
| 872 | Learnable Cost Volume using the Cayley Representation | Poster |
| 884 | Learning to Adapt: Towards Resource-Efficient On-Device Adaptation Beyond Gradient Descent | Poster |
| 890 | Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling | Poster |
| 894 | BroadFace: Looking at Tens of Thousands of People at Once for Face Recognition | Poster |
| 895 | Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision | Poster |
| 896 | Domain Adaptive Semantic Segmentation Using Weak Labels | Poster |
| 898 | Knowledge Distillation Meets Self-Supervision | Poster |
| 909 | Efficient Neighbourhood Consensus Networks via Submanifold Sparse Convolutions | Poster |
| 910 | Reconstructing the Noise Manifold for Image Denoising | Poster |
| 916 | Occlusion-Aware Depth Estimation with Adaptive Normal Constraints | Poster |
| 927 | VisualEchoes: Spatial Image Representation Learning through Echolocation | Poster |
| 929 | Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval | Poster |
| 942 | Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation | Poster |
| 946 | Spatially Aware Multimodal Transformers for TextVQA | Poster |
| 948 | Every Pixel Matters: Center-aware Feature Alignment for Domain Adaptive Object Detector | Poster |
| 960 | URIE: Universal Image Enhancement for Visual Recognition in the Wild | Poster |
| 961 | Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation | Poster |
| 977 | SPL-MLL: Selecting Predictable Landmarks for Multi-Label Learning | Poster |
| 978 | Unpaired Image-to-Image Translation using Adversarial Consistency Loss | Poster |
| 981 | Discriminability Distillation in Group Representation Learning | Poster |
| 983 | Monocular Expressive Body Regression through Body-Driven Attention | Poster |
| 984 | Dual Adversarial Network: Toward Real Noise Removal and Noise Generation | Poster |
| 986 | Linguistic Structure Guided Context Modeling for Referring Image Segmentation | Poster |
| 988 | Meta-Learning across Meta-Tasks for Few-Shot Learning | Poster |
| 994 | Federated Visual Classification with Real-World Data Distribution | Poster |
| 996 | Robust Re-Identification by Multiple Views Knowledge Distillation | Poster |
| 1003 | Defocus Deblurring Using Dual-Pixel Data | Poster |
| 1008 | RhyRNN: Rhythmic RNN for Recognizing Events in Long and Complex Videos | Poster |
| 1012 | Take an Emotion Walk: Perceiving Emotions from Gaits Using Hierarchical Attention Pooling and Affective Mapping | Poster |
| 1022 | Weighting Counts: Sequential Crowd Counting by Reinforcement Learning | Poster |
| 1024 | Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks | Poster |
| 1035 | Learning to Learn with Variational Information Bottleneck for Domain Generalization | Poster |
| 1045 | Deep Positional and Relational Feature Learning for Rotation-Invariant Point Cloud Analysis | Poster |
| 1046 | Thanks for Nothing: Predicting Zero-Valued Activations with Lightweight Convolutional Neural Networks | Poster |
| 1051 | Layered Neighborhood Expansion for Incremental Multiple Graph Matching | Poster |
| 1057 | Learning To Classify Images Without Labels | Poster |
| 1060 | Graph convolutional networks for learning with few clean and many noisy labels | Poster |
| 1078 | Object-and-Action Aware Model for Visual Language Navigation | Poster |
| 1079 | A Comprehensive Study of Weight Sharing in Graph Networks for 3D Human Pose Estimation | Poster |
| 1086 | MuCAN: Multi-Correspondence Aggregation Network for Video Super-Resolution | Poster |
| 1094 | Efficient Semantic Video Segmentation with Per-frame Inference | Poster |
| 1097 | Increasing the Robustness of Semantic Segmentation Models with Painting-by-Numbers | Poster |
| 1103 | Deep Spiking Neural Network: Energy Efficiency Through Time based Coding | Poster |
| 1137 | InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information Modeling | Poster |
| 1139 | Utilizing Patch-level Category Activation Patterns for Multiple Class Novelty Detection | Poster |
| 1143 | People as Scene Probes | Poster |
| 1147 | Mapping in a Cycle: Sinkhorn Regularized Unsupervised Learning for Point Cloud Shapes | Poster |
| 1148 | Label-Efficient Learning on Point Clouds using Approximate Convex Decompositions | Poster |
| 1152 | TexMesh: Reconstructing Human Texture and Geometry from Monocular Video | Poster |
| 1153 | Consistency-based Semi-supervised Active Learning: Towards Minimizing Labeling Cost | Poster |
| 1162 | Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation | Poster |
| 1163 | Modeling 3D shapes by Reinforcement Learning | Poster |
| 1164 | LST-Net: Learning a Convolutional Neural Network with a Learnable Sparse Transform | Poster |
| 1165 | Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision | Poster |
| 1171 | CN: Channel Normalization in Point Cloud | Poster |
| 1182 | Rethinking the Defocus Blur Detection Problem and A Real-Time Deep DBD Model | Poster |
| 1184 | AutoMix: Mixup Networks for Sample Interpolation via Cooperative Barycenter Learning | Poster |
| 1186 | Scene Text Image Super-Resolution in the Wild | Poster |
| 1220 | Coupling Explicit and Implicit Surface Representations for Generative 3D Modeling | Poster |
| 1227 | Learning Disentangled Representations with Latent Variation Predictability | Poster |
| 1232 | Deep Space-Time Video Upsampling Networks | Poster |
| 1242 | Large-Scale Few-Shot Learning via Multi-Modal Knowledge Discovery | Poster |
| 1248 | Fast Video Object Segmentation using Global Context Module | Poster |
| 1263 | Uncertainty-aware Weakly Supervised Action Detection from Long Videos | Poster |
| 1267 | Selecting Relevant Features from a Universal Representation for Few-shot Learning | Poster |
| 1276 | MessyTable: Instance Association in Multiple Camera Views | Poster |
| 1277 | A Unified Framework for Shot Type Classification Based on Subject Centric Lens | Poster |
| 1279 | BSL-1K: Scaling up co-articulated sign recognition using mouthing cues | Poster |
| 1280 | Parametric Hand Texture Model for 3D Hand Reconstruction and Personalization | Poster |
| 1290 | CycAs: Self-supervised Cycle Association for Learning Re-identifiable Person Descriptions | Poster |
| 1291 | Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions | Poster |
| 1292 | Towards Real-time MOT: A Joint Solution for Detection and Appearance Embedding | Poster |
| 1294 | A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation | Poster |
| 1295 | Unsupervised Deep Metric Learning with Transformed Attention Consistency and Contrastive Clustering Loss | Poster |
| 1299 | STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos | Poster |
| 1302 | Hierarchical Style-based Networks for Motion Synthesis | Poster |
| 1303 | Who left the dogs out? 3D Animal Reconstruction with Expectation Maximization in the Loop | Poster |
| 1308 | Learning to Count in the Crowd from Limited Labeled Data | Poster |
| 1314 | SPOD: Selective Point Cloud Densification for Better Localization in Point Cloud Object Detection | Poster |
| 1319 | Explainable Face Recognition | Poster |
| 1321 | From Shadow Segmentation to Shadow Removal | Poster |
| 1322 | Diverse and Admissible Trajectory Prediction through Multimodal Context Understanding | Poster |
| 1332 | CONFIG: Controllable Neural Face Image Generation | Poster |
| 1337 | Scene Scale Estimation from Single Image in the Wild | Poster |
| 1340 | Procedure Planning in Instructional Videos | Poster |
| 1342 | Funnel Activation for Visual Recognition | Poster |
| 1354 | GIQA: Generated Image Quality Assessment | Poster |
| 1355 | Adversarial Continual Learning | Poster |
| 1358 | Adapting Object Detectors with Conditional Domain Normalization | Poster |
| 1360 | HARD-Net: Hardness-AwaRe Discrimination Network for 3D Early Activity Prediction | Poster |
| 1363 | Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction | Poster |
| 1369 | Interpretable and Generalizable Person Re-identification with Query-adaptive Convolution and Temporal Lifting | Poster |
| 1372 | Unsupervised Bayesian Deep Learning for Image Reconstruction in Compressive Sensing | Poster |
| 1380 | Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement | Poster |
| 1381 | Semi-supervised Learning with a Teacher-student Network for Generalized Attribute Prediction | Poster |
| 1391 | Unsupervised Domain Adaptation with Noise Resistible Mutual-Training for Person Re-identification | Poster |
| 1395 | DPDist: Comparing Point Clouds Using Deep Point Cloud Distance | Poster |
| 1399 | Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation | Poster |
| 1408 | FaceMix: Privacy-Preserving Facial Attribute Classification on the Cloud | Poster |
| 1415 | Neural Re-Rendering of Humans from a Single Image | Poster |
| 1420 | Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation | Poster |
| 1421 | PIPAL: a Large-Scale Image Quality Assessment Dataset for Perceptual Image Restoration | Poster |
| 1422 | Why do These Match? Explaining the Behavior of Image Similarity Models | Poster |
| 1426 | CooGAN: A Memory-Efficient Framework for High-Resolution Facial Attribute Editing | Poster |
| 1430 | Progressive Transformers for End-to-End Sign Language Production | Poster |
| 1436 | Mask TextSpotter V3: Segmentation Proposal Network for Robust Scene Text Spotting | Poster |
| 1440 | Making Affine Correspondences Work in Camera Geometry Computation | Poster |
| 1445 | Sub-center ArcFace: Boosting Face Recognition by Large-scale Noisy Web Faces | Poster |
| 1450 | Foley Music: Learning to Generate Music from Videos | Poster |
| 1453 | Contrastive Multiview Coding | Poster |
| 1456 | Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses | Poster |
| 1469 | Generative Low-bitwidth Data Free Quantization | Poster |
| 1470 | Local Correlation Consistency for Knowledge Distillation | Poster |
| 1474 | Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the Wild | Poster |
| 1483 | Sep-Stereo: Visual-Guided Stereophonic Audio Generation by Associating Source Separation | Poster |
| 1485 | CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations | Poster |
| 1486 | Thinking in Frequency: Face Forgery Detection by Mining Frequency-aware Clues | Poster |
| 1489 | Weakly-Supervised Cell Tracking via Backward-and-Forward Propagation | Poster |
| 1491 | SeqHAND: RGB-Sequence-Based 3D Hand Pose and Shape Estimation | Poster |
| 1493 | Rethinking the Distribution Gap of Person Re-identification with Camera-based Batch Normalization | Poster |
| 1509 | AMLN: Adversarial-based Mutual Learning Network for Online Knowledge Distillation | Poster |
| 1514 | Online Multi-modal Person Search in Videos | Poster |
| 1520 | Single Image Super-Resolution via a Holistic Attention Network | Poster |
| 1535 | Can You Read Me Now? Content Aware Rectification using Angle Supervision | Poster |
| 1538 | Momentum Batch Normalization for Deep Learning with Small Batch Size | Poster |
| 1541 | AdvPC: Transferable Adversarial Perturbations on 3D Point Clouds | Poster |
| 1543 | Edge-aware Graph Representation Learning and Reasoning for Face Parsing | Poster |
| 1547 | BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network | Poster |
| 1557 | G-LBM: Generative Low-dimensional Background Model Estimation from Video Sequences | Poster |
| 1561 | H3DNet: 3D Object Detection Using Hybrid Geometric Primitives | Poster |
| 1567 | Expressive Telepresence via Modular Codec Avatar | Poster |
| 1571 | Cascade Graph Neural Networks for RGB-D Salient Object Detection | Poster |
| 1585 | FairALM: Augmented Lagrangian Method for Training Fair Models with Little Regret | Poster |
| 1586 | Generating Videos of Zero-Shot Compositions of Actions and Objects | Poster |
| 1593 | ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural Language | Poster |
| 1600 | Renovating Parsing R-CNN for Accurate Multiple Human Parsing | Poster |
| 1612 | Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning | Poster |
| 1615 | Gradient-Induced Co-Saliency Detection | Poster |
| 1616 | Nighttime Defogging Using High-Low Frequency Decomposition and Grayscale-Color Networks | Poster |
| 1633 | SegFix: Model-Agnostic Boundary Refinement for Segmentation | Poster |
| 1636 | Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction | Poster |
| 1637 | Fast Bi-layer Neural Synthesis of One-Shot Realistic Head Avatars | Poster |
| 1644 | Neural Geometric Parser for Single Image Camera Calibration | Poster |
| 1647 | Learning Flow-based Feature Warping for Face Frontalization with Illumination Inconsistent Supervision | Poster |
| 1652 | Learning Architectures for Binary Networks | Poster |
| 1653 | Semantic View Synthesis | Poster |
| 1659 | An Analysis of Sketched IRLS for Accelerated Sparse Residual Regression | Poster |
| 1677 | Relative pose from deep learned depth and affine correspondences | Poster |
| 1698 | Video Super-Resolution with Recurrent Structure-Detail Network | Poster |
| 1702 | Shape Adaptor: A Learnable Resizing Module | Poster |
| 1712 | Shuffle and Attend: Video Domain Adaptation | Poster |
| 1714 | DRG: Dual Relation Graph for Human-Object Interaction Detection | Poster |
| 1715 | Flow-edge Guided Video Completion | Poster |
| 1721 | Deep End-to-End Trainable Active Contours for Building Footprint Delineation | Poster |
| 1728 | Towards End-to-end Video-based Eye-Tracking | Poster |
| 1732 | Generating Handwriting via Decoupled Style Descriptors | Poster |
| 1742 | LEED: Label-Free Expression Editing via Disentanglement | Poster |
| 1763 | Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards | Poster |
| 1765 | Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder | Poster |
| 1766 | Unsupervised Cross-Modal Alignment For Multi-Person 3D Pose Estimation | Poster |
| 1769 | Class-Incremental Domain Adaptation | Poster |
| 1789 | Anti-Bandit Neural Architecture Search for Model Defense | Poster |
| 1792 | Wavelet-Based Dual-Branch Neural Network for Image Demoireing | Poster |
| 1809 | Low light video Enhancement using Synthetic Data Produced with an Intermediate Domain Mapping | Poster |
| 1810 | Non-Local Spatial Propagation Network for Depth Completion | Poster |
| 1816 | DanbooRegion: Illustration and Cartoon Region Dataset Annotated by Real-life Artists | Poster |
| 1819 | Event Enhanced High-Quality Image Recovery | Poster |
| 1821 | PackDet: Packed Long-Head Object Detector | Poster |
| 1825 | A Generic Graph-based Neural Architecture Encoding Scheme for Predictor-based NAS | Poster |
| 1829 | Learning Semantic Neural Tree for Human Parsing | Poster |
| 1834 | Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation | Poster |
| 1848 | Burst Denoising via Temporally Shifted Wavelet Transforms | Poster |
| 1849 | JSSR: Joint Synthesis Segmentation and Registration System for 3D Multi-Model Image Analysis | Poster |
| 1850 | SimAug: Learning Robust Representations from 3D Simulation for Pedestrian Trajectory Prediction in Unseen Cameras | Poster |
| 1851 | ScribbleBox: Interactive Annotation Framework for Video Object Segmentation | Poster |
| 1862 | Rethinking Pseudo-LiDAR Representation | Poster |
| 1868 | Deep Multi Depth Panoramas for View Synthesis | Poster |
| 1880 | MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection | Poster |
| 1889 | ContactPose: A Dataset of Grasps with Object Contact and Hand Pose | Poster |
| 1895 | API-Net: Robust Generative Classifier via a Single Discriminator | Poster |
| 1905 | Bias-based Universal Adversarial Patch Attack for Automatic Check-out | Poster |
| 1912 | Imbalanced Continual Learning with Partitioning Reservoir Sampling | Poster |
| 1932 | Guided Collaborative Training for Pixel-wise Semi-Supervised Learning | Poster |
| 1938 | Stacking Networks Dynamically for Image Restoration Based on the Plug-and-Play Framework | Poster |
| 1942 | Efficient Transfer Learning via Joint Adaptation of Network Architecture and Weight | Poster |
| 1951 | Spatial Attention Pyramid Network for Unsupervised Domain Adaptation | Poster |
| 1955 | GSIR: Generalizable 3D Shape Interpretation and Reconstruction | Poster |
| 1956 | Weakly Supervised 3D Object Detection from Lidar Point Cloud | Poster |
| 1960 | Two-phase Pseudo Label Densification for Self-training based Domain Adaptation | Poster |
| 1972 | Adaptive Offline Quintuplet Loss for Image-text Matching | Poster |
| 1973 | Learning Object Placement by Inpainting for Compositional Data Augmentation | Poster |
| 1978 | Deep Vectorization of Technical Drawings | Poster |
| 1979 | Shape Fitting with Deformable CAD Models | Poster |
| 1991 | An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices | Poster |
| 2006 | AutoTrajectory: Label-free Trajectory Extraction and Prediction from Videos using Dynamic Points | Poster |
| 2013 | Multi-Agent Embodied Question Answering in Interactive Environments via 3D Reconstruction | Poster |
| 2014 | Conditional Sequential Modulation for Efficient Image Retouching | Poster |
| 2016 | Segmenting Transparent Objects in the Wild | Poster |
| 2035 | Length Controllable Image Captioning | Poster |
| 2042 | Few-Shot Semantic Segmentation with Democratic Attention Networks | Poster |
| 2044 | Defocus Blur Detection via Depth Distillation | Poster |
| 2054 | Motion Guided 3D Pose Estimation from Video | Poster |
| 2055 | Reflection Separation via Multi-bounce Polarization State Tracing | Poster |
| 2057 | SIP: Spatial Information Preservation for Fast Instance Segmentation | Poster |
| 2059 | SemanticAdv: Generating Adversarial Examples via Attribute-conditioned Image Editing | Poster |
| 2062 | Learning with Noisy Class Labels for Instance Segmentation | Poster |
| 2085 | Deep Image Clustering with Category-Style Representation | Poster |
| 2090 | Self-supervised Learning of Motion Representation via Scattering Local Motion Cues | Poster |
| 2094 | Improving Monocular Depth Estimation by Leveraging Structural Awareness and Complementary Datasets | Poster |
| 2095 | BMBC: Bilateral Motion Estimation with Bilateral Cost Volume for Video Interpolation | Poster |
| 2100 | Hard negative examples are hard, but useful | Poster |
| 2106 | ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions | Poster |
| 2107 | Video Object Detection via Object-level Temporal Aggregation | Poster |
| 2113 | Object Detection with a Unified Label Space from Multiple Datasets | Poster |
| 2114 | Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D | Poster |
| 2115 | Comprehensive Image Captioning via Scene Graph Decomposition | Poster |
| 2116 | Symbiotic Adversarial Learning for Attribute-Based Person Search | Poster |
| 2117 | Amplifying Key Cues for Human-Object-Interaction Detection | Poster |
| 2118 | Rethinking few-shot image classification: a good embedding is all you need? | Poster |
| 2121 | Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization | Poster |
| 2129 | Action Localization through Continual Predictive Learning | Poster |
| 2130 | Generative View-Correlation Adaptation for Semi-Supervised Multi-View Learning | Poster |
| 2135 | ReAD: Reciprocal Attention Discriminator for Image-to-Video Re-Identification | Poster |
| 2136 | Detailed Human Shape and Pose Estimation from a Single Polarization Image | Poster |
| 2142 | The Devil is in the Details: Self-Supervised Attention for Vehicle Re-Identification | Poster |
| 2152 | Improving One-stage Visual Grounding by Recursive Sub-query Construction | Poster |
| 2160 | Multi-level Wavelet-based Generative Adversarial Network for Perceptual Quality Enhancement of Compressed Video | Poster |
| 2168 | Example-Guided Image Synthesis across Arbitrary Scenes using Masked Spatial-Channel Attention and Self-Supervision | Poster |
| 2178 | Content-Consistent Matching for Domain Adaptive Semantic Segmentation | Poster |
| 2183 | AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting | Poster |
| 2186 | History Repeats Itself: Human Motion Prediction via Motion Attention | Poster |
| 2189 | Unsupervised Video Object Segmentation with Joint Hotspot Tracking | Poster |
| 2201 | SRNet: Improving Generalization in 3D Human Pose Estimation with a Split-and-Recombine Approach | Poster |
| 2202 | CAFE-GAN: Arbitrary Face Attribute Editing with Complementary Attention Feature | Poster |
| 2209 | MimicDet: Bridging the Gap Between One-Stage and Two-Stage Object Detection | Poster |
| 2212 | Topic-aware Multi-Label Classification | Poster |
| 2216 | Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning | Poster |
| 2235 | Attract, Perturb, and Explore: Learning a Feature Alignment Network for Semi-supervised Domain Adaptation | Poster |
| 2238 | Curriculum Manager for Source Selection in Multi-Source Domain Adaptation | Poster |
| 2244 | Powering One-shot Topological NAS with Stabilized Share-parameter Proxy | Poster |
| 2246 | Classes Matter: A Fine-grained Adversarial Approach to Cross-domain Semantic Segmentation | Poster |
| 2252 | Boundary-preserving Mask R-CNN | Poster |
| 2253 | Self-supervised Single-view 3D Reconstruction via Semantic Consistency | Poster |
| 2255 | MetaDistiller: Network Self-boosting via Meta-learned Top-down Distillation | Poster |
| 2256 | Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling | Poster |
| 2257 | The Devil is in Classification: A Simple Framework for Long-tail Instance Segmentation | Poster |
| 2266 | What is Learned in Deep Uncalibrated Photometric Stereo? | Poster |
| 2270 | Prior-based Domain Adaptive Object Detection for Hazy and Rainy Conditions | Poster |
| 2274 | Adversarial Ranking Attack and Defense | Poster |
| 2279 | ReDro: Efficiently Learning Large-sized SPD Visual Representation | Poster |
| 2287 | Graph-Based Social Relation Reasoning | Poster |
| 2290 | EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection | Poster |
| 2293 | Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency | Poster |
| 2295 | Asynchronous Interaction Aggregation for Action Detection | Poster |
| 2305 | Shape and Viewpoint without Keypoints | Poster |
| 2306 | Learning Attentive and Hierarchical Representations for 3D Shape Recognition | Poster |
| 2308 | TF-NAS: Rethinking Three Search Freedoms of Latency-Constrained Differentiable Neural Architecture Search | Poster |
| 2313 | Associative3D: Volumetric Reconstruction from Sparse Views | Poster |
| 2318 | PlugNet: Degradation Aware Scene Text Recognition Supervised by a Pluggable Super-Resolution Unit | Poster |
| 2319 | Memory Selection Network for Video Propagation | Poster |
| 2325 | Disentangled Non-local Neural Networks | Poster |
| 2327 | URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark | Poster |
| 2329 | Generalizing Person Re-Identification by Camera-Aware Invariance Learning and Cross-Domain Mixup | Poster |
| 2330 | Semi-supervised Crowd Counting via Self-training on Surrogate Tasks | Poster |
| 2335 | Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training | Poster |
| 2336 | Boosting Decision-based Black-box Adversarial Attacks with Random Sign Flip | Poster |
| 2338 | Knowledge Transfer via Dense Cross-layer Mutual-distillation | Poster |
| 2339 | Matching Guided Distillation | Poster |
| 2341 | Clustering-driven Deep Autoencoder for Video Anomaly Detection | Poster |
| 2343 | Learning to Compose Hypercolumns for Visual Correspondence | Poster |
| 2348 | Stochastic Bundle Adjustment for Efficient and Scalable Structure from Motion | Poster |
| 2353 | Object-based Illumination Estimation with Rendering-aware Neural Networks | Poster |
| 2354 | Progressive Point Cloud Deconvolution Generation Network | Poster |
| 2356 | SSCGAN: Facial Attribute Editing via Style Skip Connections | Poster |
| 2374 | Negative Pseudo Labeling using Class Proportion for Semantic Segmentation in Pathology | Poster |
| 2376 | Learn to Propagate Reliably on Noisy Affinity Graphs | Poster |
| 2382 | Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search | Poster |
| 2383 | TANet: Towards Fully Automatic Tooth Arrangement | Poster |
| 2391 | UnionDet: Union-Level Detector Towards Real-Time Human-Object Interaction Detection | Poster |
| 2393 | GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision | Poster |
| 2394 | Resolution Switchable Networks for Runtime Efficient Image Classification | Poster |
| 2395 | SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation | Poster |
| 2396 | Learning to Detect Open Classes for Universal Domain Adaptation | Poster |
| 2400 | Visual Compositional Learning for Human Object Interaction Detection | Poster |
| 2422 | Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches | Poster |
| 2423 | Rethinking Class Activation Mapping for Weakly Supervised Object Localization | Poster |
| 2424 | OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features | Poster |
| 2426 | Interpretable Neural Networks Decoupling | Poster |
| 2433 | Omni-sourced Webly-supervised Video Recognition | Poster |
| 2437 | CurveLane-NAS: Unifying Lane-Sensitive Architecture Search and Adaptive Point Blending | Poster |
| 2442 | Contextual-Relation Consistent Domain Adaptation for Semantic Segmentation | Poster |
| 2455 | Estimating People Flows to Better Count Them in Crowded Scenes | Poster |
| 2456 | RAN: Resolution Adaption Network for Low-resolution Face Recognition | Poster |
| 2460 | Learning Feature Embeddings for Discriminant Model based Tracking | Poster |
| 2461 | WeightNet: Revisiting the Design Space of Weight Networks | Poster |
| 2472 | Partially-Shared Variational Auto-encoders for Unsupervised Domain Adaptation with Target Shift | Poster |
| 2475 | Learning Where to Focus for Efficient Video Object Detection | Poster |
| 2481 | Learning Object Permanence from Video | Poster |
| 2492 | Adaptive Text Recognition through Visual Matching | Poster |
| 2497 | Actions as Moving Points | Poster |
| 2499 | Learning to Exploit Multiple Vision Modalities by Using Grafted Networks | Poster |
| 2501 | Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild | Poster |
| 2505 | 3D Fluid Flow Reconstruction Using Compact Light Field PIV | Poster |
| 2510 | Contextual Diversity for Active Learning | Poster |
