Overview
English title | Chinese title | Notes |
Finding Task-Relevant Features for Few-Shot Learning by Category Traversal | 少镜头学习中用类别遍历法寻找任务相关特征 | |
Edge-Labeling Graph Neural Network for Few-Shot Learning | 用于少镜头学习的边缘标记图神经网络 | |
Generating Classification Weights With GNN Denoising Autoencoders for Few-Shot Learning | 用GNN去噪自编码器生成分类权重实现少镜头学习 | |
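Kervolutional Neural Networks | 核化卷积神经网络 | Extends the standard convolution in neural networks to kervolution (kernel convolution): a nonlinear generalization of convolution in which a nonlinear mapping is applied to both the input features and the convolution kernel before convolving (Eq. 4). |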
Why ReLU Networks Yield High-Confidence Predictions Far Away From the Training Data and How to Mitigate the Problem | 为什么ReLU网络产生远离训练数据的高置信度预测以及如何缓解问题 | |
On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions | 深度卷积网络对傅立叶基函数方向的结构灵敏度 | |
Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization | 神经再生:通过提高计算资源利用率改进深度网络训练 | |
Hardness-Aware Deep Metric Learning | 硬度感知深度测量学习 | |
Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation | Auto-DeepLab:语义图像分割的层次神经结构搜索 | Automatically searches for and optimizes the network architecture instead of defining it in advance. |
Learning Loss for Active Learning | 主动学习的学习损失 | |
Striking the Right Balance With Uncertainty | 以不确定性达到正确的平衡 | |
AutoAugment: Learning Augmentation Strategies From Data | 自增强:从数据中学习增强策略 | |
SDRSAC: Semidefinite-Based Randomized Approach for Robust Point Cloud Registration Without Correspondences | SDRSAC:无通信的基于半定的随机方法实现鲁棒点云配准 | |
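BAD SLAM: Bundle Adjusted Direct RGB-D SLAM | BAD SLAM:Bundle Adjusted直接RGB-D SLAM | Proposes a real-time bundle adjustment (BA) method for dense SLAM; conventional dense BA is time-consuming. The main algorithmic contribution is the use of surfels: each surfel represents a group of pixels, which makes dense BA feasible. The cost function is given in Eq. 1 and the BA optimization in Algorithm 1. Code: www.eth3d.net |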
Revealing Scenes by Inverting Structure From Motion Reconstructions | 通过structure From Motion重建反转来显示场景 | |
Strand-Accurate Multi-View Hair Capture | 精确的多视图头发捕捉 | |
DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation | deepSDF:学习连续符号距离函数的形状表示 | |
Pushing the Boundaries of View Extrapolation With Multiplane Images | 使用多平面图像推送视图外推边界 | |
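GA-Net: Guided Aggregation Net for End-To-End Stereo Matching | GA-Net:端到端立体匹配的引导聚合网 | Proposes two cost aggregation methods, semi-global and local, which target textureless regions and thin structures/edges, respectively. |
Real-Time Self-Adaptive Deep Stereo | 实时自适应深度立体 | MADNet: uses online adaptation to address domain shift (the training set is synthetic, while the test data are real scenes). In deployment, each stereo pair is used both to compute disparity and to update the network weights online, so the network adapts as it runs. |
LAF-Net: Locally Adaptive Fusion Networks for Stereo Confidence Estimation | LAF-Net:用于立体置信估计的局部(L)自适应(A)融合(F)网络 | A confidence map measures the confidence of the estimated disparity at each pixel (Fig. 1), so that disparities can be post-processed (e.g., refined) according to their confidence. |
NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences | NM-Net:挖掘可靠的邻域,以实现强大的特征对应 | Feature correspondences are usually initialized by matching local features such as SIFT, but the initial matches inevitably contain wrong correspondences, so a post-processing step is needed to select the correct ones. This paper focuses on a learning-based method for that selection. |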
Coordinate-Free Carlsson-Weinshall Duality and Relative Multi-View Geometry | 无坐标Carlsson-Weinshall对偶及相对多视图几何 | |
Deep Reinforcement Learning of Volume-Guided Progressive View Inpainting for 3D Point Scene Completion From a Single Depth Image | 利用深度强化学习实现单深度图像的基于体引导渐进视图修补的三维点场景补全 | |
Video Action Transformer Network | 视频动作转换网络 | |
Timeception for Complex Action Recognition | 复杂动作识别的时间感知 | |
STEP: Spatio-Temporal Progressive Learning for Video Action Detection | STEP:视频动作检测的时空渐进学习 | |
Relational Action Forecasting | 关系动作预测 | |
Long-Term Feature Banks for Detailed Video Understanding | 详细视频理解的长期功能库 | |
Which Way Are You Going? Imitative Decision Learning for Path Forecasting in Dynamic Scenes | 你往哪边走?动态场景中路径预测的模拟决策学习 | |
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment | 你的表现如何?行动质量评估的多任务学习方法 | |
MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation | MHP-VOS:视频对象分割的多假设传播 | |
2.5D Visual Sound | 2.5D视觉声音 | |
Language-Driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model | 语言驱动的时间活动定位:语义匹配的强化学习模型 | |
Gaussian Temporal Awareness Networks for Action Localization | 用于动作定位的高斯时间感知网络 | |
Efficient Video Classification Using Fewer Frames | 使用更少帧的高效视频分类 | |
Parsing R-CNN for Instance-Level Human Analysis | 解析R-CNN实现实例级的人分析 | |
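Large Scale Incremental Learning | 大规模增量学习 | Incremental learning: learning as new classes keep being added. As new classes arrive, fewer samples of the old classes are available, which creates a data imbalance and degrades recognition of the old classes. This paper addresses that class imbalance. |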
TopNet: Structural Point Cloud Decoder | TopNet:结构化点云解码器 | |
Perceive Where to Focus: Learning Visibility-Aware Part-Level Features for Partial Person Re-Identification | 感知关注点:学习可见性感知部分级特征实现部分人重识别 | |
Meta-Transfer Learning for Few-Shot Learning | 元转移学习实现少镜头学习 | |
Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation | 用于精确图像分类和语义分割的结构化二元神经网络 | Makes the network lightweight by modifying the original architecture and binarizing its weights. |
Deep RNN Framework for Visual Sequential Applications | 用于视觉序列应用的深度RNN框架 | |
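Graph-Based Global Reasoning Networks | 基于图的全局推理网络 | Introduces global information to offset the locality of convolution. As shown in Figs. 1 and 2, pixels in coordinate space (Cartesian coordinates) are first projected into an interaction space, where a fully connected graph network captures global information; the result is then projected back to the original space. |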
SSN: Learning Sparse Switchable Normalization via SparsestMax | SSN:通过SparsestMax学习稀疏可切换规范化 | |
Spherical Fractal Convolutional Neural Networks for Point Cloud Recognition | 用于点云识别的球形分形卷积神经网络 | |
Learning to Generate Synthetic Data via Compositing | 学习通过合成生成合成数据 | |
Divide and Conquer the Embedding Space for Metric Learning | 划分并征服嵌入空间实现度量学习 | |
Latent Space Autoregression for Novelty Detection | 新颖性检测的潜在空间自回归 | |
Attending to Discriminative Certainty for Domain Adaptation | 注意判别确定性实现域适应 | |
Feature Denoising for Improving Adversarial Robustness | 特征去噪提高对抗鲁棒性 | |
Selective Kernel Networks | 选择性核网络 | |
On Implicit Filter Level Sparsity in Convolutional Neural Networks | 卷积神经网络的隐式滤波级稀疏性 | Studies and compares the sparsity of network coefficients under different choices of regularization, optimizer, and so on. |
FlowNet3D: Learning Scene Flow in 3D Point Clouds | FlowNet3D:学习三维点云中的场景流 | |
Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks | 远程任务中基于场景记忆变换器的嵌入式代理 | |
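Co-Occurrent Features in Semantic Segmentation | 语义分割中的共现特征 | Considers the relationships between different semantic concepts in segmentation (co-occurrence, Fig. 3); in practice this amounts to using dot-product information between different positions. |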
Bag of Tricks for Image Classification with Convolutional Neural Networks | 基于卷积神经网络的图像分类中采用的技巧 | |
Learning Channel-Wise Interactions for Binary Convolutional Neural Networks | 二元卷积神经网络的通道交互学习 | |
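Knowledge Adaptation for Efficient Semantic Segmentation | 有效语义分割的知识自适应 | Performs semantic segmentation with knowledge distillation: a complex teacher network guides a simple student network, giving faster inference with better accuracy. |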
Parametric Noise Injection: Trainable Randomness to Improve Deep Neural Network Robustness Against Adversarial Attack | 参数噪声注入:可训练的随机性以提高深度神经网络对抗攻击的鲁棒性 | |
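Invariance Matters: Exemplar Memory for Domain Adaptive Person Re-Identification | 不变性问题:基于范例记忆的域适应人再识别 | Trains a domain-adaptive, cross-domain ReID model using labeled source-domain samples together with unlabeled target-domain samples. As shown in Fig. 2, three kinds of invariance are considered for the target-domain samples, forming an exemplar memory module that assists training. |
Dissecting Person Re-Identification From the Viewpoint of Viewpoint | 从视角的视角剖析人再识别 | Two contributions: (1) an engine (algorithm) for building training sets with different viewpoints; (2) an analysis of how viewpoint affects ReID. |
Learning to Reduce Dual-Level Discrepancy for Infrared-Visible Person Re-Identification | 学习减少双级差异实现红外可见人再识别 | ReID involving infrared images; the two kinds of discrepancy are handled by two separate sub-networks. |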
Progressive Feature Alignment for Unsupervised Domain Adaptation | 基于渐进特征对齐的无监督域自适应 | |
Feature-Level Frankenstein: Eliminating Variations for Discriminative Recognition | 特征级Frankenstein:基于差异消除的判别性识别 | |
Learning a Deep ConvNet for Multi-Label Classification With Partial Labels | 基于深度ConvNet学习的局部标签多标签分类 | |
Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression | 联合上的广义交集:用于BoundingBox回归的度量和损失 | |
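Densely Semantically Aligned Person Re-Identification | 基于密集语义对齐的人再识别 | First uses the DensePose model to segment the human body into semantic parts (24 part classes) and then aligns the segmented parts. The resulting set of 24 aligned images is fed into an auxiliary network that helps improve the ReID performance of the main network (Fig. 3). |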
Generalising Fine-Grained Sketch-Based Image Retrieval | 基于细粒度草图的图像检索 | |
Adapting Object Detectors via Selective Cross-Domain Alignment | 选择性跨域对齐实现目标检测器调整 | |
Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation | 基于循环引导的弱监督联合检测与分割 | |
Thinking Outside the Pool: Active Training Image Creation for Relative Attributes | 池化外思维:基于主动训练图像创建的相关属性 | |
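Generalizable Person Re-Identification by Domain-Invariant Mapping Network | 基于域不变映射网络的可推广人再识别 | Trains on data from multiple domains to obtain a ReID model that generalizes across domains (no update is needed for a new domain). It follows a meta-learning approach; the network is shown in Fig. 1. |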
Visual Attention Consistency Under Image Transforms for Multi-Label Image Classification | 图像变换下视觉注意一致性实现多标签图像分类 | |
Re-Ranking via Metric Fusion for Object Retrieval and Person Re-Identification | 基于度量融合的重新排序实现目标检索和人再识别 | Re-ranking after person re-identification; unifies several fusion algorithms. Objective function: Eq. 10. |
Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization | 基于语义差异最小化的无监督开放域识别 | |
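Weakly Supervised Person Re-Identification | 弱监督人再识别 | Here "weakly supervised" means that the gallery consists of video frames, each containing several people, and the labels only state which people appear, not which detection corresponds to which person. Each probe is a patch of a single person with a definite identity label. This is a multi-label, multi-instance problem. |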
PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud | PointRCNN:从点云实现三维对象Proposal生成和检测 | |
Automatic Adaptation of Object Detectors to New Domains Using Self-Training | 利用自训练使目标探测器自动适应新领域 | |
Deep Sketch-Shape Hashing With Segmented 3D Stochastic Viewing | 基于分段三维随机视图的深度草图形状散列 | |
Generative Dual Adversarial Network for Generalized Zero-Shot Learning | 基于生成对偶对抗网络的广义零镜头学习 | |
Query-Guided End-To-End Person Search | 基于查询引导的端到端人员搜索 | |
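Libra R-CNN: Towards Balanced Learning for Object Detection | Libra R-CNN:目标检测的平衡学习 | In R-CNN-style detectors, the authors identify three kinds of imbalance (sampling imbalance, imbalance among feature levels, and imbalance among the terms of the loss function) that degrade performance. The paper adds a different rebalancing module at each of the corresponding places in the network (Fig. 2). This yields a gain of about two points (Table 1). Code: https://github.com/OceanPang/Libra_R-CNN |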
Learning a Unified Classifier Incrementally via Rebalancing | 通过重新平衡实现统一分类器的逐步学习 | |
Feature Selective Anchor-Free Module for Single-Shot Object Detection | 基于特征选择无锚模块的单镜头目标检测 | |
Bottom-Up Object Detection by Grouping Extreme and Center Points | 通过对极值点和中心点进行分组的自下而上目标检测 | |
Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples | 特征蒸馏:基于DNN的JPEG压缩与对抗性示例 | |
SCOPS: Self-Supervised Co-Part Segmentation | SCOPS:自监督共部分分割 | |
Unsupervised Moving Object Detection via Contextual Information Separation | 基于上下文信息分离的无监督运动目标检测 | |
Pose2Seg: Detection Free Human Instance Segmentation | Pose2Seg:无需检测人实例分割 | Targets human instance segmentation under mutual occlusion, using human skeleton features for prediction. |
DrivingStereo: A Large-Scale Dataset for Stereo Matching in Autonomous Driving Scenarios | 驾驶立体:用于自动驾驶场景中立体匹配的大规模数据集 | |
PartNet: A Large-Scale Benchmark for Fine-Grained and Hierarchical Part-Level 3D Object Understanding | PartNet:一个用于精细和层次化Part-Level三维对象理解的大规模基准 | |
A Dataset and Benchmark for Large-Scale Multi-Modal Face Anti-Spoofing | 大型多模人脸防欺骗的数据集与基准 | |
Unsupervised Learning of Consensus Maximization for 3D Vision Problems | 三维视觉问题共识最大化的无监督学习 | |
VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People | VizWiz-Priv:一个数据集,用于识别盲人拍摄的图像中私人视觉信息的存在和目的。 | |
Structural Relational Reasoning of Point Clouds | 点云的结构关系推理 | |
MVF-Net: Multi-View 3D Face Morphable Model Regression | MVF-Net:多视图三维人脸形态模型回归 | |
Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction | 光度网格优化实现基于视频对齐的三维对象重建 | |
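Guided Stereo Matching | 引导立体匹配 | Given sparse (accurate) depth values, which are easily converted to disparities at the corresponding points, uses them as guidance for stereo matching (Eqs. 1-4). Figs. 2b and 2c show a clear performance gain. |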
Unsupervised Event-Based Learning of Optical Flow, Depth, and Egomotion | 无监督的基于事件的光流、深度和自我学习 | |
Modeling Local Geometric Structure of 3D Point Clouds Using Geo-CNN | 基于Geo-CNN的三维点云的局部几何结构建模 | |
3D Point Capsule Networks | 三维点的胶囊网络 | |
GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving | GS3D:一种高效的自动驾驶三维目标检测框架 | |
Single-Image Piece-Wise Planar 3D Reconstruction via Associative Embedding | 基于关联嵌入的单幅图像平面三维重建 | |
3DN: 3D Deformation Network | 3DN:3D变形网络 | |
HorizonNet: Learning Room Layout With 1D Representation and Pano Stretch Data Augmentation | HorizonNet:基于一维表示和Pano拉伸数据扩充的室布局学习 | |
Deep Fitting Degree Scoring Network for Monocular 3D Object Detection | 基于深度拟合度评分网络的单目三维目标检测 | |
Pushing the Envelope for RGB-Based Dense 3D Hand Pose Estimation via Neural Rendering | 利用神经渲染实现基于RGB的密集三维手部姿态估计 | |
Self-Supervised Learning of 3D Human Pose Using Multi-View Geometry | 基于多视图几何的三维人体姿态自监督学习 | |
FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation From a Single Image | FSA-Net:细粒度结构聚合学习实现单个图像头部姿势估计 | |
Dense 3D Face Decoding Over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders | 2500fps以上密集三维人脸解码:联合纹理和形状卷积网格解码器 | |
Does Learning Specific Features for Related Parts Help Human Pose Estimation? | 学习相关部分的特定特征是否有助于人体姿势估计? | |
Linkage Based Face Clustering via Graph Convolution Network | 基于图卷积网络的人脸聚类 | The "graph" here is the pairwise distance matrix; the "graph convolution" is essentially a matrix multiplication (Eq. 2). |
Towards High-Fidelity Nonlinear 3D Face Morphable Model | 面向高保真非线性三维人脸变形模型 | |
RegularFace: Deep Face Recognition via Exclusive Regularization | RegularFace:基于排他性规则化的深度人脸识别 | |
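BridgeNet: A Continuity-Aware Probabilistic Network for Age Estimation | BridgeNet:一种连续性感知概率网络实现年龄估计 | The backbone is a CNN; the rest is essentially a network representation of a Gaussian mixture model (GMM). There are two branches: one gives the (set of) Gaussian models and the other the weights. The weight branch is represented by a bridge-tree (a modified decision tree, Fig. 3). The network architecture is shown in Fig. 2. |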
GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction | GANFIT:高保真三维人脸重建的GAN拟合 | |
Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition With Multimodal Training | 多模训练提高单模动态手势识别性能 | |
Learning to Reconstruct People in Clothing From a Single RGB Camera | 学习从一台RGB相机中重建穿着衣服的人 | |
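Distilled Person Re-Identification: Towards a More Scalable System | 蒸馏人再识别:朝着更可扩展的系统发展 | Independent of network architecture, the paper focuses on knowledge distillation: knowledge from teacher networks trained on source domains is transferred to a lightweight student network for a target domain, where the source-domain data are unavailable but a few labeled and many unlabeled target samples are. The transfer relies mainly on sample-to-sample similarity information (Eq. 3). |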
A Perceptual Prediction Framework for Self Supervised Event Segmentation | 一种用于自监督事件分割的感知预测框架 | |
COIN: A Large-Scale Dataset for Comprehensive Instructional Video Analysis | COIN:用于综合教学视频分析的大规模数据集 | |
Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization | 用于联合人群计数和精确定位的反复关注缩放 | |
An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition | 基于骨架的动作识别的注意力增强图卷积LSTM网络 | |
Graph Convolutional Label Noise Cleaner: Train a Plug-And-Play Action Classifier for Anomaly Detection | 图形卷积标签噪声清洗器:用于异常检测的训练即插即用动作分类器 | |
MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment | MAN:矩对齐网络实现基于迭代图调整的自然语言矩检索 | |
Less Is More: Learning Highlight Detection From Video Duration | 少即是多:从视频持续时间中学习Highlight检测 | |
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition | DMC-Net:生成识别运动线索,用于快速压缩视频动作识别 | |
AdaFrame: Adaptive Frame Selection for Fast Video Recognition | AdaFrame:用于快速视频识别的自适应帧选择 | |
Spatio-Temporal Video Re-Localization by Warp LSTM | 基于Warp LSTM的时空视频重定位 | |
Completeness Modeling and Context Separation for Weakly Supervised Temporal Action Localization | 基于完整性建模与上下文分离的弱监督时间行为定位 | |
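Unsupervised Deep Tracking | 无监督深度跟踪 | Learns without supervision through forward/backward tracking. As shown in Fig. 2, a deep network extracts features and a correlation filter performs the tracking. Code: https://github.com/594422814/UDT |
Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers | 动画跟踪:多目标注意力跟踪器的无监督学习 | Unsupervised multi-object tracking: a set of trackers is defined, their outputs are used to transform the previous frame into the next frame, and the error is the difference between the transformed data and the actual next frame. |
Fast Online Object Tracking and Segmentation: A Unifying Approach | 快速在线目标跟踪与分割:一种统一的方法 | Builds on the Siamese trackers SiamFC and SiamRPN and adds a mask branch, so tracking yields a pixel-level segmentation rather than a bounding box. The idea appears to be borrowed from the step from Faster R-CNN to Mask R-CNN. Code: http://www.robots.ox.ac.uk/~qwang/SiamMask |
Object Tracking by Reconstruction With View-Specific Discriminative Correlation Filters | 基于特定视图判别相关滤波器重构的目标跟踪 | Visual tracking with RGB-D input; an improvement on CSR-DCF. (1) Combines tracking with 3D reconstruction (estimating R and T) so that each benefits the other. (2) Stores multiple views of the target to improve tracking accuracy. Code: https://github.com/ugurkart |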
SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints | SoPhie:基于注意力GAN的符合社会和身体约束的路径预测。 | |
Leveraging Shape Completion for 3D Siamese Tracking | 利用形状补全进行三维孪生跟踪 | |
Target-Aware Deep Tracking | 目标感知深度跟踪 | |
Spatiotemporal CNN for Video Object Segmentation | 用于视频对象分割的时空CNN | |
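Towards Rich Feature Discovery With Class Activation Maps Augmentation for Person Re-Identification | 基于类激活图的扩充的丰富的特征发现,实现人再识别 | As shown in Fig. 2, adds new branches to a conventional ReID network to impose additional constraints on the discriminative spatial locations (Eqs. 5 and 6). |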
Wide-Context Semantic Image Extrapolation | 宽上下文语义图像外推 | |
End-To-End Time-Lapse Video Synthesis From a Single Outdoor Image | 从单个室外图像端到端延时视频合成 | |
GIF2Video: Color Dequantization and Temporal Interpolation of GIF Images | GIF2video:GIF图像的颜色去量化和时间插值 | |
Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis | 基于模式搜索GAN的多种图像合成 | |
Pluralistic Image Completion | 多元图像补全 | |
Salient Object Detection With Pyramid Attention and Salient Edges | 基于金字塔注意和显著边缘的显著物体检测 | |
Latent Filter Scaling for Multimodal Unsupervised Image-To-Image Translation | 基于潜在滤波尺度的多模态无监督图像到图像转换 | |
Attention-Aware Multi-Stroke Style Transfer | 基于注意力感知的多笔画风格转换 | |
Feedback Adversarial Learning: Spatial Feedback for Improving Generative Adversarial Networks | 反馈对抗学习:基于空间反馈的改进GAN | |
Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting | 学习金字塔-上下文编码器网络实现高质量图像修复 | |
Example-Guided Style-Consistent Image Synthesis From Semantic Labeling | 基于语义标记的示例引导风格一致性图像合成 | |
MirrorGAN: Learning Text-To-Image Generation by Redescription | MirrorGAN:通过重新描述学习文本到图像生成 | |
Light Field Messaging With Deep Photographic Steganography | 基于深度摄影隐写术的光场信息发送 | |
Im2Pencil: Controllable Pencil Illustration From Photographs | Im2Pencil:照片中的可控制铅笔插图 | |
When Color Constancy Goes Wrong: Correcting Improperly White-Balanced Images | 当颜色恒定性出错时:纠正不正确的白平衡图像 | |
Beyond Volumetric Albedo -- A Surface Optimization Framework for Non-Line-Of-Sight Imaging | 超越体积反照率--非视线成像的表面优化框架 | |
Reflection Removal Using a Dual-Pixel Sensor | 使用双像素传感器消除反射 | |
Practical Coding Function Design for Time-Of-Flight Imaging | 基于实用编码函数设计的飞行时间成像 | |
Meta-SR: A Magnification-Arbitrary Network for Super-Resolution | Meta-SR:基于放大任意网络的超分辨率 | |
Multispectral and Hyperspectral Image Fusion by MS/HS Fusion Net | 基于MS/HS融合网的多光谱和高光谱图像融合 | |
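Learning Attraction Field Representation for Robust Line Segment Detection | 基于吸引场表示学习的鲁棒线段检测 | Applies deep-learning-based semantic segmentation to line segment detection. A mapping is first defined between the line segments and a line-segment-based partition of the whole image; a segmentation network then predicts that partition, and the result is mapped back to line segments. |
Blind Super-Resolution With Iterative Kernel Correction | 基于迭代核校正的盲超分辨 | Defines three deep networks, for super-resolution, blur-kernel estimation, and blur-kernel correction. The trained networks are applied iteratively to refine the blur kernel and the super-resolved result (Algorithm 1). |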
Video Magnification in the Wild Using Fractional Anisotropy in Temporal Distribution | 基于时间分布中分数各向异性的野外视频放大 | |
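Attentive Feedback Network for Boundary-Aware Salient Object Detection | 边界感知反馈显著目标检测的注意力反馈网络 | (1) An encoder-decoder network that fuses corresponding encoder and decoder features in two stages with feedback; (2) a loss function that takes the boundary information of the ground truth into account. |
Heavy Rain Image Restoration: Integrating Physics Model and Conditional Adversarial Learning | 暴雨图像恢复:物理模型与条件对抗学习的集成 | (1) Gives an image formation model for heavy rain (Eq. 2). (2) The network has two stages (Fig. 2): (a) model estimation and image restoration; (b) refinement of the restored image with a cGAN. (3) The training images and their rain parameters are synthetic, which allows supervised training (see 8-12). (4) The input image is split into high-frequency and low-frequency channels, guided by the color-channel residue (Eq. 6) to avoid smoothing out detail, and the two are processed separately (Fig. 2). |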
Learning to Calibrate Straight Lines for Fisheye Image Rectification | 鱼眼图像校正中直线标定学习 | |
Camera Lens Super-Resolution | 相机镜头超分辨率 | |
Frame-Consistent Recurrent Video Deraining With Dual-Level Flow | 基于双级流的连续视频去雨 | |
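Deep Plug-And-Play Super-Resolution for Arbitrary Blur Kernels | 面向任意模糊核的深度即插即用超分辨率 | There are two common degradation models for super-resolution (Eqs. 1 and 2): methods for Eq. 1 often have no readily available code, while Eq. 2 is simple but gives poor results. The paper proposes a new degradation model (Eq. 3) and, through an iterative scheme, extends the DNN-based methods built for Eq. 2 to the new model. The DNN is pretrained under the degradation model of Eq. 2. |
Sea-Thru: A Method for Removing Water From Underwater Images | 海底穿越:一种从水下图像中去除水的方法 | Takes RGB-D input; models the formation of images captured underwater and restores them. |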
Deep Network Interpolation for Continuous Imagery Effect Transition | 基于深度网络插值的连续图像效果转换 | |
Spatially Variant Linear Representation Models for Joint Filtering | 基于空间可变线性表示模型的联合滤波 | |
Toward Convolutional Blind Denoising of Real Photographs | 真实照片的卷积盲去噪 | |
Towards Real Scene Super-Resolution With Raw Images | 使用原始图像实现真实场景的超分辨率 | |
ODE-Inspired Network Design for Single Image Super-Resolution | 基于ODE激励网络的单图像超分辨率 | |
Blind Image Deblurring With Local Maximum Gradient Prior | 基于局部最大梯度先验的盲图像去模糊 | |
Attention-Guided Network for Ghost-Free High Dynamic Range Imaging | 基于注意引导网络的无幽灵高动态范围成像 | |
Searching for a Robust Neural Architecture in Four GPU Hours | 在四个GPU小时内寻找一个强大的神经结构 | |
Hierarchy Denoising Recursive Autoencoders for 3D Scene Layout Prediction | 用于三维场景布局预测的层次去噪递归自编码器 | |
Adaptively Connected Neural Networks | 自适应连接神经网络 | |
CrDoCo: Pixel-Level Domain Transfer With Cross-Domain Consistency | CrDoCo:基于跨域一致性的像素级域迁移 | |
Temporal Cycle-Consistency Learning | 时间周期一致性学习 | |
Predicting Future Frames Using Retrospective Cycle GAN | 使用回顾性Cycle GAN预测未来帧 | |
Density Map Regression Guided Detection Network for RGB-D Crowd Counting and Localization | 用于RGB-D人群计数和定位的密度图回归引导检测网络 | |
TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning | TAFE-Net:基于任务感知特征嵌入的少镜头学习 | |
Learning Semantic Segmentation From Synthetic Data: A Geometrically Guided Input-Output Adaptation Approach | 从合成数据学习语义分割:一种几何引导的输入输出自适应方法 | |
Attentive Single-Tasking of Multiple Tasks | 专注地完成多项任务中的一项任务 | |
Deep Metric Learning to Rank | 深度度量学习排名 | |
End-To-End Multi-Task Learning With Attention | 基于注意力的端到端多任务学习 | |
Self-Supervised Learning via Conditional Motion Propagation | 基于条件运动传播的自监督学习 | |
Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence | 通过时空对应桥接立体匹配和光流 | |
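All About Structure: Adapting Structural Information Across Domains for Boosting Semantic Segmentation | 关于结构:跨域调整结构信息以推进语义分割 | Argues that high-level structural features are domain-invariant while low-level texture features vary across domains. As shown in Fig. 2, an encoder-decoder architecture separates structure from texture, and the domain-invariant structural features are used to train the segmentation network. A corresponding set of loss functions is defined. |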
Iterative Reorganization With Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning | 弱空间约束下的迭代重组:无监督表示学习中任意拼图问题的求解 | |
Revisiting Self-Supervised Visual Representation Learning | 再研究自我监督的视觉表征学习 | |
It's Not About the Journey; It's About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning | 这与旅行无关;与目的地有关:在问题引导下沿着软路径进行视觉推理 | |
Actively Seeking and Learning From Live Data | 从实时数据中主动地寻求和学习 | |
Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing | 用跨模态注意引导擦除改进指代表达式Grounding | |
Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks | 邻里观察:通过语言引导的图形注意力网络进行指代表达理解 | Referring expression comprehension: an object in the image is specified in natural language and the algorithm detects it automatically: http://vision2.cs.unc.edu/refer/comprehension |
Scene Graph Generation With External Knowledge and Image Reconstruction | 基于外部知识和图像重构的场景图生成 | |
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval | 用于跨模态检索的多义视觉-语义嵌入 | |
MUREL: Multimodal Relational Reasoning for Visual Question Answering | 基于多模态关系推理的视觉问答 | |
Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering | 基于异构记忆增强多模注意力模型的视频问答 | |
Information Maximizing Visual Question Generation | 信息最大化视觉问题生成 | |
Learning to Detect Human-Object Interactions With Knowledge | 利用知识学习发现人类-物体的交互 | |
Learning Words by Drawing Images | 画图学字 | |
Factor Graph Attention | 因子图注意 | |
Reducing Uncertainty in Undersampled MRI Reconstruction With Active Acquisition | 利用主动获取实现下采样MRI重建中不确定性降低 | |
ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification | 基于迭代图像校正的端到端场景文本识别 | |
ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape | ROI-10D:单目二维检测提升到6D姿势和公制形状 | |
Collaborative Learning of Semi-Supervised Segmentation and Classification for Medical Images | 医学图像半监督分割与分类的协同学习 | |
Biologically-Constrained Graphs for Global Connectomics Reconstruction | 基于生物学约束图的全局连接体重建 | |
P3SGD: Patient Privacy Preserving SGD for Regularizing Deep CNNs in Pathological Image Classification | P3SGD:保留患者隐私的SGD,用于病理图像分类中的深层CNN正则化 | |
Elastic Boundary Projection for 3D Medical Image Segmentation | 基于弹性边界投影的三维医学图像分割 | |
SIXray: A Large-Scale Security Inspection X-Ray Benchmark for Prohibited Item Discovery in Overlapping Images | SIXray:一个大型安全检查X射线基准,用于在重叠图像中发现违禁物品 | |
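Noise2Void - Learning Denoising From Single Noisy Images | Noise2Void:从单个噪声图像学习图像去噪 | Reviews two training regimes for deep-learning denoising (with ground truth, and without ground truth but with another noisy image) and proposes training a deep network from the noisy images alone. The idea is simple (Fig.a): when training on a given pixel, that pixel is removed from its own receptive field, which forces the model to predict its value from the surrounding pixels. |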
Joint Discriminative and Generative Learning for Person Re-Identification | 基于联合辨别与生成学习的人再识别 | |
Unsupervised Person Re-Identification by Soft Multilabel Learning | 基于软多标签学习的无监督人再识别 | |
Learning Context Graph for Person Search | 用于人员搜索的上下文图学习 | |
Gradient Matching Generative Networks for Zero-Shot Learning | 基于梯度匹配生成网络的零镜头学习 | |
Doodle to Search: Practical Zero-Shot Sketch-Based Image Retrieval | 涂鸦搜索:实用的基于零镜头草图的图像检索 | |
Zero-Shot Task Transfer | 零镜头任务迁移 | |
C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection | 基于连续多实例学习的弱监督目标检测 | |
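Weakly Supervised Learning of Instance Segmentation With Inter-Pixel Relations | 基于像素间关系的弱监督实例分割 | Weakly supervised instance segmentation from class-level labels. Building on CAMs, it additionally considers class-agnostic regions and inter-pixel relations (affinity), as shown in Fig. 2. |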
Attention-Based Dropout Layer for Weakly Supervised Object Localization | 基于注意的Dropout层实现弱监督目标定位 | |
Domain Generalization by Solving Jigsaw Puzzles | 基于求解拼图的域泛化 | |
Transferrable Prototypical Networks for Unsupervised Domain Adaptation | 基于可转移原型网络的无监督域自适应 | |
Blending-Target Domain Adaptation by Adversarial Meta-Adaptation Networks | 基于对抗性元适应网络的混合目标域自适应 | |
ELASTIC: Improving CNNs With Dynamic Scaling Policies | ELASTIC:通过动态缩放策略改进CNN | |
ScratchDet: Training Single-Shot Object Detectors From Scratch | ScratchDet:从零开始训练单镜头目标检测器 | |
SFNet: Learning Object-Aware Semantic Correspondence | 对象感知语义对应学习 | |
Deep Metric Learning Beyond Binary Supervision | 超越二元监督的深度度量学习 | |
Learning to Cluster Faces on an Affinity Graph | 学习在关联图上聚类人脸 | |
C2AE: Class Conditioned Auto-Encoder for Open-Set Recognition | C2AE:用于开放集识别的类条件自编码器 | |
Shapes and Context: In-The-Wild Image Synthesis & Manipulation | 形状与背景:在野外图像合成与操作 | |
Semantics Disentangling for Text-To-Image Generation | 基于语义分离的文本到图像生成 | |
Semantic Image Synthesis With Spatially-Adaptive Normalization | 空间自适应归一化的语义图像合成 | |
Progressive Pose Attention Transfer for Person Image Generation | 用于人像生成的渐进式姿势-注意力转移 | |
Unsupervised Person Image Generation With Semantic Parsing Transformation | 基于语义解析转换的无监督人像生成 | |
DeepView: View Synthesis With Learned Gradient Descent | DeepView:基于梯度下降学习的视图合成 | |
Animating Arbitrary Objects via Deep Motion Transfer | 通过深度运动传输实现任意对象的动画 | |
Textured Neural Avatars | 纹理神经化身(Avatars) | |
IM-Net for High Resolution Video Frame Interpolation | 用于高分辨率视频帧插值的IM网络 | |
Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation | 基于同态隐空间插值的非配对图像到图像转换 | |
Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation | 基于级联语义指导的多通道注意选择GAN实现跨视图图像翻译 | |
Geometry-Consistent Generative Adversarial Networks for One-Sided Unsupervised Domain Mapping | 基于几何一致GAN的单侧无监督域映射 | |
DeepVoxels: Learning Persistent 3D Feature Embeddings | DeepVoxels:学习持久的3D功能嵌入 | |
Inverse Path Tracing for Joint Material and Lighting Estimation | 关节材料反路径跟踪与光照估计 | |
The Visual Centrifuge: Model-Free Layered Video Representations | 视觉离心机:无模型分层视频表示 | |
Label-Noise Robust Generative Adversarial Networks | 标签噪声鲁棒GAN | |
DLOW: Domain Flow for Adaptation and Generalization | 基于域流的适应和泛化 | |
CollaGAN: Collaborative GAN for Missing Image Data Imputation | CollaGAN:基于协作GAN的缺失图像数据插补 | |
d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding | 基于随机邻域嵌入的域自适应 | |
Taking a Closer Look at Domain Shift: Category-Level Adversaries for Semantics Consistent Domain Adaptation | 更仔细地看域迁移:基于类别级对抗的语义一致域自适应 | |
ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation | ADVENT:基于对抗熵最小化的语义分割域适应 | |
ContextDesc: Local Descriptor Augmentation With Cross-Modality Context | ContextDesc:使用跨模态上下文的局部描述符扩充 | |
Large-Scale Long-Tailed Recognition in an Open World | 开放世界中的大规模长拖尾识别 | |
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations Rather Than Data | AET与AED:通过自编码转换而非数据的无监督表示学习 | |
SDC - Stacked Dilated Convolution: A Unified Descriptor Network for Dense Matching Tasks | 层叠空洞卷积:密集匹配任务的统一描述网络 | |
Learning Correspondence From the Cycle-Consistency of Time | 从时间的循环一致性中学习对应关系 | |
AE2-Nets: Autoencoder in Autoencoder Networks | AE2-Net:AutoEncoder网络中的AutoEncoder | |
Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach | 图像表示中的减轻信息泄漏:最大熵方法 | |
Learning Spatial Common Sense With Geometry-Aware Recurrent Networks | 利用几何感知循环网络学习空间Common Sense | |
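Structured Knowledge Distillation for Semantic Segmentation | 基于结构化知识蒸馏的语义分割 | Combines three kinds of knowledge distillation (pixel-wise, pair-wise, and holistic) to distill knowledge from a complex network into a simple one. |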
Scan2CAD: Learning CAD Model Alignment in RGB-D Scans | Scan2CAD:在RGB-D扫描中学习CAD模型对齐 | |
Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation | 面向场景理解:基于语义感知表示的无监督单目深度估计 | |
Tell Me Where I Am: Object-Level Scene Context Prediction | 告诉我我在哪里:对象级场景上下文预测 | |
Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation | 基于归一化目标坐标空间的类别级6D对象姿态和尺寸估计 | |
Supervised Fitting of Geometric Primitives to 3D Point Clouds | 几何基元到三维点云的监督拟合 | |
Do Better ImageNet Models Transfer Better? | 更好的ImageNet模型能得到更好的传输吗? | |
Gotta Adapt 'Em All: Joint Pixel and Feature-Level Domain Adaptation for Recognition in the Wild | 联合像素和特征级域适应实现野外识别 | |
Understanding the Disharmony Between Dropout and Batch Normalization by Variance Shift | 用方差变换理解Dropout与Batch Normalization之间的不协调性 | |
Circulant Binary Convolutional Networks: Enhancing the Performance of 1-Bit DCNNs With Circulant Back Propagation | 循环二元卷积网络:利用循环反向传播增强1bit DCNN的性能 | |
DeFusionNET: Defocus Blur Detection via Recurrently Fusing and Refining Multi-Scale Deep Features | DeFusionNET:通过反复融合和细化多尺度深度特征进行散焦模糊检测 | |
Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks | 基于深层虚拟网络的多任务的内存高效推理 | |
Universal Domain Adaptation | 通用域适应 | |
Improving Transferability of Adversarial Examples With Input Diversity | 利用输入多样性提高对抗性实例的可传递性 | |
Sequence-To-Sequence Domain Adaptation Network for Robust Text Image Recognition | 序列-序列的域自适应网络实现鲁棒文本图像识别 | |
Hybrid-Attention Based Decoupled Metric Learning for Zero-Shot Image Retrieval | 基于混合注意的解耦度量学习实现零镜头图像检索 | |
Learning to Sample | 学习采样 | |
Few-Shot Learning via Saliency-Guided Hallucination of Samples | 通过显著性引导的样本幻觉进行的少镜头学习 | |
Variational Convolutional Neural Network Pruning | 变分卷积神经网络剪枝 | |
Towards Optimal Structured CNN Pruning via Generative Adversarial Learning | 基于生成对抗学习的CNN优化结构修剪 | |
Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression | 利用核稀疏性和熵实现可解释CNN压缩 | |
Fully Quantized Network for Object Detection | 基于全量化网络的目标检测 | |
MnasNet: Platform-Aware Neural Architecture Search for Mobile | MnasNet:移动设备中平台感知神经架构搜索 | |
Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More | 学生成为大师:基于知识融合的联合场景分析、深度估计等 | |
K-Nearest Neighbors Hashing | K-最近邻哈希 | |
Learning RoI Transformer for Oriented Object Detection in Aerial Images | 用于航空图像定向目标检测的学习型ROI变换器 | |
Snapshot Distillation: Teacher-Student Optimization in One Generation | 快速蒸馏:一代中的师生优化 | |
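Geometry-Aware Distillation for Indoor Semantic Segmentation | 用于室内语义分割的几何感知蒸馏 | (1) "Geometry" here means depth. (2) The method performs depth estimation and semantic segmentation jointly, with a depth-aware segmentation pipeline. (3) The training set consists of RGB images with depth. |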
LiveSketch: Query Perturbations for Guided Sketch-Based Visual Search | LiveSketch:基于引导草图的视觉搜索的查询扰动 | |
Bounding Box Regression With Uncertainty for Accurate Object Detection | 具有不确定性的边界盒回归用于精确目标检测 | |
OCGAN: One-Class Novelty Detection Using GANs With Constrained Latent Representations | OCGAN:基于约束潜在表示的GAN实现一类新颖性检测 | |
Learning Metrics From Teachers: Compact Networks for Image Embedding | 由教师学习度量:图像嵌入的紧凑网络 | |
Activity Driven Weakly Supervised Object Detection | 活动驱动的弱监督目标检测 | |
Separate to Adapt: Open Set Domain Adaptation via Progressive Separation | 分离适应:通过渐进分离的开放集域适应 | |
Layout-Graph Reasoning for Fashion Landmark Detection | 基于布局图推理的时尚标记检测 | |
DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs | 提取哈希:通过蒸馏数据对进行无监督的深度哈希 | |
Mind Your Neighbours: Image Annotation With Metadata Neighbourhood Graph Co-Attention Networks | 注意你的邻居:基于元数据邻域图共同关注网络的图像注释 | |
Region Proposal by Guided Anchoring | 基于引导锚定的区域建议 | |
Distant Supervised Centroid Shift: A Simple and Efficient Approach to Visual Domain Adaptation | 远距离监督质心偏移:一种简单有效的视觉域自适应方法 | |
Learning to Transfer Examples for Partial Domain Adaptation | 基于学习转移例子的部分域适应 | |
Generalized Zero-Shot Recognition Based on Visually Semantic Embedding | 基于视觉语义嵌入的广义零镜头识别 | |
Towards Visual Feature Translation | 面向视觉特征翻译 | |
Amodal Instance Segmentation With KINS Dataset | 基于KINS数据集的Amodal实例分割 | |
Global Second-Order Pooling Convolutional Networks | 全局二阶池化卷积网络 | |
Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up | 弱监督互补部分模型实现自底向上细粒度图像分类 | |
NetTailor: Tuning the Architecture, Not Just the Weights | NetTailor:调整架构,而不仅仅是权重 | |
Learning-Based Sampling for Natural Image Matting | 基于学习的采样实现自然图像抠图 | |
Learning Unsupervised Video Object Segmentation Through Visual Attention | 通过视觉注意学习无监督视频对象分割 | |
4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks | 4D时空ConvNet:Minkowski卷积神经网络 | |
Pyramid Feature Attention Network for Saliency Detection | 基于金字塔特征关注网络的显著性检测 | |
Co-Saliency Detection via Mask-Guided Fully Convolutional Networks With Multi-Scale Label Smoothing | 基于多尺度标签平滑的掩模引导全卷积网络实现共显著性检测 | |
SAIL-VOS: Semantic Amodal Instance Level Video Object Segmentation - A Synthetic Dataset and Baselines | SAIL-VOS:语义Amodal实例级视频对象分割-合成数据集和基线 | |
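Learning Instance Activation Maps for Weakly Supervised Instance Segmentation | 基于实例激活映射学习的弱监督实例分割 | Weakly supervised instance segmentation learned from class-level labels. It exploits the fact that feature maps in a classification network activate strongly on parts of an instance, fills in from those activations to obtain pseudo instance labels, and trains on them. |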
Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation | 译码器对语义分割很重要:数据相关解码实现灵活地特征聚合 | |
Box-Driven Class-Wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation | 基于盒驱动类域掩蔽和填充率导引损失的弱监督语义分割 | |
Dual Attention Network for Scene Segmentation | 用于场景分割的双注意网络 | |
InverseRenderNet: Learning Single Image Inverse Rendering | InverseRenderNet:单个图像的反向渲染学习 | |
A Variational Auto-Encoder Model for Stochastic Point Processes | 基于变分自编码模型的随机点过程 | |
Unifying Heterogeneous Classifiers With Distillation | 利用蒸馏实现非均匀分类器的统一 | |
Assessment of Faster R-CNN in Man-Machine Collaborative Search | 人机协同搜索中Faster R-CNN的评估 | |
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge | OK-VQA:一个需要外部知识的视觉问题解答基准 | |
NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction | 神经判别降维实现多任务CNN的分层特征融合 | |
Spectral Metric for Dataset Complexity Assessment | 利用谱度量实现数据集复杂性评估 | |
ADCrowdNet: An Attention-Injective Deformable Convolutional Network for Crowd Understanding | ADCrowdNet:一种用于群体理解的注意力注入可变形卷积网络 | |
VERI-Wild: A Large Dataset and a New Method for Vehicle Re-Identification in the Wild | VERI-Wild:一个大型数据集和一种新的野外车辆再识别方法 | |
3D Local Features for Direct Pairwise Registration | 基于3D局部特征的直接成对配准 | |
HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-Scale Point Clouds | HPLFlowNet:用于大规模点云上场景流估计的层次置换格FlowNet | |
GPSfM: Global Projective SFM Using Algebraic Constraints on Multi-View Fundamental Matrices | 基于多视图基本矩阵代数约束的全局投影SFM | |
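Group-Wise Correlation Stereo Network | 群相关立体网络 | An improvement on PSMNet in two respects: 1. The cost volume combines concatenation (Eq. 2) with correlation (Eq. 1); the correlation is computed per channel group (Eq. 3) to retain more information, which makes a simpler aggregation network possible. 2. The aggregation network is modified (Fig. 2) to run faster. |
Multi-Level Context Ultra-Aggregation for Stereo Matching | 基于多级上下文超聚合的立体匹配 | An improvement on PSMNet. The main change is in the front-end matching cost calculation (Fig. 3): an added child branch defines the so-called "interesting level combination" (colored solid lines in the figure), while connections within a module are dense; the paper calls this a higher-order RCNN. The other change is at the output (Fig. 2), where a residual module refines the result. |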
Large-Scale, Metric Structure From Motion for Unordered Light Fields | 无序光场运动的大尺度测度SFM | |
Understanding the Limitations of CNN-Based Absolute Camera Pose Regression | 理解基于CNN的绝对摄像机姿态回归的局限性 | |
DeepLiDAR: Deep Surface Normal Guided Depth Prediction for Outdoor Scene From Sparse LiDAR Data and Single Color Image | DeepLiDAR:基于稀疏激光雷达数据和单幅彩色图像的室外场景深度表面法线引导的深度预测 | |
Modeling Point Clouds With Self-Attention and Gumbel Subset Sampling | 利用自关注和Gumbel子集采样对点云进行建模 | |
Learning With Batch-Wise Optimal Transport Loss for 3D Shape Recognition | 基于分批最优传输损失的三维形状识别学习 | |
DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion | 密集融合:基于迭代密集融合的6D目标姿态估计 | |
Dense Depth Posterior (DDP) From Single Image and Sparse Range | 基于单幅图像和稀疏距离测量的密集深度后验(DDP) | |
DuLa-Net: A Dual-Projection Network for Estimating Room Layouts From a Single RGB Panorama | DuLa-Net:从单一的RGB全景图估算房间布局的双投影网 | |
Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach | 通过多任务几何和语义场景理解方法实现的时间一致深度预测 | |
Segmentation-Driven 6D Object Pose Estimation | 分割驱动的6D目标姿态估计 | |
Exploiting Temporal Context for 3D Human Pose Estimation in the Wild | 利用时间上下文实现野外三维人体姿态估计 | |
What Do Single-View 3D Reconstruction Networks Learn? | 单视图三维重建网络学习什么? | |
UniformFace: Learning Deep Equidistributed Representation for Face Recognition | UniformFace:学习人脸识别的深度均匀表示 | |
Semantic Graph Convolutional Networks for 3D Human Pose Regression | 基于语义图卷积网络的三维人体姿态回归 | |
Mask-Guided Portrait Editing With Conditional GANs | 基于条件GAN的掩模引导式肖像编辑 | |
Group Sampling for Scale Invariant Face Detection | 基于群抽样的尺度不变人脸检测 | |
Joint Representation and Estimator Learning for Facial Action Unit Intensity Estimation | 基于联合表示与估计学习的面部动作单元强度估计 | |
Semantic Alignment: Finding Semantically Consistent Ground-Truth for Facial Landmark Detection | 语义对齐:为人脸关键点检测找到语义一致的Ground-Truth | |
LAEO-Net: Revisiting People Looking at Each Other in Videos | LAEO-Net:重温视频中互相注视的人 | |
Robust Facial Landmark Detection via Occlusion-Adaptive Deep Networks | 基于遮挡自适应深度网络的鲁棒人脸关键点检测 | |
Learning Individual Styles of Conversational Gesture | 学习会话手势的个人风格 | |
Face Anti-Spoofing: Model Matters, so Does Data | 人脸反欺骗:模型很重要,数据也很重要 | |
Fast Human Pose Estimation | 快速人体姿态估计 | |
Decorrelated Adversarial Learning for Age-Invariant Face Recognition | 基于非相关对抗学习的年龄不变人脸识别 | |
Cross-Task Weakly Supervised Learning From Instructional Videos | 从指导视频中实现交叉任务弱监督学习 | |
D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation | D3TW:判别性可微动态时间规整实现弱监督动作对齐和分割 | |
Progressive Teacher-Student Learning for Early Action Prediction | 基于渐进师生学习的早期行动预测 | |
Social Relation Recognition From Videos via Multi-Scale Spatial-Temporal Reasoning | 基于多尺度时空推理的视频社会关系识别 | |
MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation | 基于多级时间卷积网络的动作分割 | |
Transferable Interactiveness Knowledge for Human-Object Interaction Detection | 基于可转移交互知识的人-物交互检测 | |
Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition | 动作-结构图卷积网络实现基于骨架的动作识别 | |
Multi-Granularity Generator for Temporal Action Proposal | 基于多粒度生成器的时域动作建议 | |
Deep Rigid Instance Scene Flow | 深度刚性实例场景流 | |
See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks | 看到更多,了解更多:基于共同关注孪生网络的无监督视频对象分割 | |
Patch-Based Discriminative Feature Learning for Unsupervised Person Re-Identification | 基于Patch的判别特征学习实现无监督人再识别 | |
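SPM-Tracker: Series-Parallel Matching for Real-Time Visual Object Tracking | SPM-Tracker:基于串并行匹配的实时视觉目标跟踪 | An improvement on SiamFC (Fig. 2). After deep feature extraction, matching proceeds in two stages: CM, which focuses on robustness, and FM, which focuses on accuracy. |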
Spatial Fusion GAN for Image Synthesis | 基于空间融合GAN的图像合成 | |
Text Guided Person Image Synthesis | 文本引导的人图像合成 | |
STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing | 一种统一的选择传输网络实现任意图像属性编辑 | |
Towards Instance-Level Image-To-Image Translation | 面向实例级的图像到图像转换 | |
Dense Intrinsic Appearance Flow for Human Pose Transfer | 基于稠密内在表象流的人体姿态转换 | |
Depth-Aware Video Frame Interpolation | 深度感知视频帧插值 | |
Sliced Wasserstein Generative Models | 切片化Wasserstein生成模型 | |
Deep Flow-Guided Video Inpainting | 深度流引导视频修复 | |
Video Generation From Single Semantic Label Map | 从单一语义标签映射生成视频 | |
Polarimetric Camera Calibration Using an LCD Monitor | 使用LCD监视器校准偏光照相机 | |
Fully Automatic Video Colorization With Self-Regularization and Diversity | 具有自规则性和多样性的全自动视频着色 | |
Zoom to Learn, Learn to Zoom | 缩放以学习,学习以缩放 | |
Single Image Reflection Removal Beyond Linearity | 线性以外的单一图像反射消除 | |
Learning to Separate Multiple Illuminants in a Single Image | 学习在单个图像中分离多个光源 | |
Shape Unicode: A Unified Shape Representation | 形状Unicode:统一的形状表示 | |
Robust Video Stabilization by Optimization in CNN Weight Space | CNN权重空间中的优化实现鲁棒视频稳定 | |
Learning Linear Transformations for Fast Image and Video Style Transfer | 基于线性转换的快速图像和视频样式转换 | |
Local Detection of Stereo Occlusion Boundaries | 立体遮挡边界的局部检测 | |
Bi-Directional Cascade Network for Perceptual Edge Detection | 基于双向级联网络的感知边缘检测 | |
Single Image Deraining: A Comprehensive Benchmark Analysis | 单图像去雨:综合基准分析 | |
Dynamic Scene Deblurring With Parameter Selective Sharing and Nested Skip Connections | 基于参数选择共享和嵌套跳过连接的动态场景去模糊 | |
Events-To-Video: Bringing Modern Computer Vision to Event Cameras | 事件到视频:现代计算机视觉与事件摄像头的桥梁 | |
Feedback Network for Image Super-Resolution | 基于反馈网络的图像超分辨率 | |
Semi-Supervised Transfer Learning for Image Rain Removal | 基于半监督迁移学习的图像去雨 | |
EventNet: Asynchronous Recursive Event Processing | EventNet:异步递归事件处理 | |
Recurrent Back-Projection Network for Video Super-Resolution | 基于递归反投影网络的视频超分辨率 | |
Cascaded Partial Decoder for Fast and Accurate Salient Object Detection | 级联部分译码器实现快速准确的显著目标检测 | |
A Simple Pooling-Based Design for Real-Time Salient Object Detection | 一种简单的基于池化的实时显著目标检测设计 | |
Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection | 基于对比度先验和流体金字塔集成的RGBD显著目标检测 | |
Progressive Image Deraining Networks: A Better and Simpler Baseline | 渐进式图像去雨网络:一个更好和更简单的基线 | |
GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud | 基于生成形状建议网络的点云中三维实例分割 | |
Attentive Relational Networks for Mapping Images to Scene Graphs | 用于将图像映射到场景图的注意力关系网络 | |
Relational Knowledge Distillation | 关系知识蒸馏 | |
Compressing Convolutional Neural Networks via Factorized Convolutional Filters | 用因子分解卷积滤波器压缩卷积神经网络 | |
On the Intrinsic Dimensionality of Image Representations | 论图像表示的内在维数 | |
Part-Regularized Near-Duplicate Vehicle Re-Identification | 部分规则化近重复车辆重新识别 | |
Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics | 基于运动和外观统计预测的视频自监督时空表示学习 | |
Classification-Reconstruction Learning for Open-Set Recognition | 开放集识别的分类重构学习 | |
Emotion-Aware Human Attention Prediction | 情绪感知人类注意力预测 | |
Residual Regression With Semantic Prior for Crowd Counting | 基于语义先验残差回归的群体计数 | |
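Context-Reinforced Semantic Segmentation | 上下文强化的语义分割 | Uses context to improve semantic segmentation; the context and the segmentation result reinforce each other through reinforcement learning (Fig. 2). |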
Adversarial Structure Matching for Structured Prediction Tasks | 基于对抗结构匹配的结构化预测任务 | |
Deep Spectral Clustering Using Dual Autoencoder Network | 利用双自编码器网络进行深度谱聚类 | |
Deep Asymmetric Metric Learning via Rich Relationship Mining | 基于丰富关系挖掘的深度非对称度量学习 | |
Did It Change? Learning to Detect Point-Of-Interest Changes for Proactive Map Updates | 学习检测兴趣点变化以进行主动地图更新 | |
Associatively Segmenting Instances and Semantics in Point Clouds | 点云中实例与语义的关联分割 | |
Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation | 模式亲和传播在深度、表面法向和语义分割上的应用 | |
Scene Categorization From Contours: Medial Axis Based Salience Measures | 基于轮廓的场景分类:基于中轴的显著测量 | |
Unsupervised Image Captioning | 无监督图像字幕 | |
Exact Adversarial Attack to Image Captioning via Structured Output Learning With Latent Variables | 利用基于潜在变量的结构化输出学习对图像字幕的精确攻击 | |
Cross-Modal Relationship Inference for Grounding Referring Expressions | 基于跨模态关系推理的指代表达Grounding | Referring expression: an object in the image is specified in natural language and the algorithm locates it automatically. See http://vision2.cs.unc.edu/refer/comprehension |
What's to Know? Uncertainty as a Guide to Asking Goal-Oriented Questions | 要知道什么?不确定性作为引导提问面向目标的问题 | |
Iterative Alignment Network for Continuous Sign Language Recognition | 基于迭代对齐网络的连续手语识别 | |
Neural Sequential Phrase Grounding (SeqGROUND) | 神经序列短语Grounding(SeqGROUND) | |
CLEVR-Ref+: Diagnosing Visual Reasoning With Referring Expressions | CLEVR-Ref+:用引用表达式实现诊断视觉推理 | |
Describing Like Humans: On Diversity in Image Captioning | 像人类一样的描述:图像字幕的多样性 | |
MSCap: Multi-Style Image Captioning With Unpaired Stylized Text | MSCap:利用不成对的样式化文本实现多风格图像字幕 | |
CRAVES: Controlling Robotic Arm With a Vision-Based Economic System | CRAVES:用基于视觉的经济系统实现机械臂控制 | |
Networks for Joint Affine and Non-Parametric Image Registration | 联合仿射与非参数图像配准的网络 | |
Learning Shape-Aware Embedding for Scene Text Detection | 用于场景文本检测的形状感知嵌入学习 | |
Learning to Film From Professional Human Motion Videos | 从专业的人体运动视频中学习拍摄 | |
Pay Attention! - Robustifying a Deep Visuomotor Policy Through Task-Focused Visual Attention | 通过以任务为中心的视觉关注实现深度视觉运动策略鲁棒化 | |
Deep Blind Video Decaptioning by Temporal Aggregation and Recurrence | 基于时间聚集和重现的深度盲视频去字幕 | |
Learning Video Representations From Correspondence Proposals | 从对应关系建议中学习视频表示 | |
SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks | SiamRPN++:具有非常深网络的孪生视觉跟踪的演变 | An improvement on SiamRPN: 1. A deeper backbone that still preserves translation invariance (Fig. 2). 2. Layer-wise feature fusion (Fig. 2). 3. An improved cross-correlation (Fig. 3). Code: http://bo-li.info/SiamRPN++ |
Sphere Generative Adversarial Network Based on Geometric Moment Matching | 基于几何矩匹配的球面GAN | |
Adversarial Attacks Beyond the Image Space | 图像空间之外的对抗攻击 | |
Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks | 通过平移不变攻击规避针对可迁移对抗样本的防御 | |
Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses | 解耦方向与范数:高效的基于梯度的L2对抗攻击与防御 | |
A General and Adaptive Robust Loss Function | 一种通用的自适应鲁棒损失函数 | |
Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration | 基于几何中值的滤波器剪枝实现深度卷积神经网络加速 | |
Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss | 通过优化有任务损失的量化区间学习量化深度网络 | |
Not All Areas Are Equal: Transfer Learning for Semantic Segmentation via Hierarchical Region Selection | 基于层次区域选择的迁移学习实现语义分割 | |
Unsupervised Learning of Dense Shape Correspondence | 密集形状对应的无监督学习 | |
Unsupervised Visual Domain Adaptation: A Deep Max-Margin Gaussian Process Approach | 无监督视觉域自适应:一种深度最大边缘高斯过程方法 | |
Balanced Self-Paced Learning for Generative Adversarial Clustering Network | 基于平衡自学习的生成对抗性聚类网络 | |
A Style-Based Generator Architecture for Generative Adversarial Networks | 一种基于风格的生成器结构实现GAN | |
Parallel Optimal Transport GAN | 并行最优传输GAN | |
3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans | 3D-SIS:RGB-D扫描的3D语义实例分割 | |
Causes and Corrections for Bimodal Multi-Path Scanning With Structured Light | 结构光的双峰多径扫描的原因及修正 | |
TextureNet: Consistent Local Parametrizations for Learning From High-Resolution Signals on Meshes | TextureNet:从网格上的高分辨率信号学习的一致局部参数化 | |
PlaneRCNN: 3D Plane Detection and Reconstruction From a Single Image | PlaneRCNN:单个图像的三维平面检测和重建 | |
Occupancy Networks: Learning 3D Reconstruction in Function Space | 占用网络:在函数空间中学习三维重建 | |
3D Shape Reconstruction From Images in the Frequency Domain | 基于频域的图像的三维形状重建 | |
SiCloPe: Silhouette-Based Clothed People | SiCloPe:基于剪影的着装人体 | |
Detailed Human Shape Estimation From a Single Image by Hierarchical Mesh Deformation | 基于层次网格变形的单个图像的人体形状详细估计 | |
Convolutional Mesh Regression for Single-Image Human Shape Reconstruction | 基于卷积网格回归的单图像人的形状重建 | |
H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions | H+O:三维手-物体姿势和交互的统一自我中心识别 | |
Learning the Depths of Moving People by Watching Frozen People | 通过观察静止的人来学习移动的人的深度 | |
Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion | 基于场景补全的RGB-D扫描的极端相对姿态估计 | |
A Skeleton-Bridged Deep Learning Approach for Generating Meshes of Complex Topologies From Single RGB Images | 骨架-桥接深度学习实现从单个RGB图像生成复杂拓扑网格 | |
Learning Structure-And-Motion-Aware Rolling Shutter Correction | 基于结构与运动感知学习的卷帘(Rolling Shutter)校正 | |
PVNet: Pixel-Wise Voting Network for 6DoF Pose Estimation | 基于像素级投票网络的6自由度姿态估计 | |
SelFlow: Self-Supervised Learning of Optical Flow | SelFlow:基于自监督学习的光流 | |
Taking a Deeper Look at the Inverse Compositional Algorithm | 深入研究逆合成算法 | |
Deeper and Wider Siamese Networks for Real-Time Visual Tracking | 用于实时视觉跟踪的更深更宽的孪生网络 | Improves end-to-end tracking with Siamese networks (SiamFC/SiamRPN) by using deeper and wider backbones. Code: https://github.com/researchmm/SiamDW |
Self-Supervised Adaptation of High-Fidelity Face Models for Monocular Performance Tracking | 高保真人脸模型的自监督自适应实现单目性能跟踪 | |
Diverse Generation for Multi-Agent Sports Games | 多智能体体育比赛的多样化生成 | |
Efficient Online Multi-Person 2D Pose Tracking With Recurrent Spatio-Temporal Affinity Fields | 基于循环时空相似场的高效在线多人二维姿态跟踪 | |
GFrames: Gradient-Based Local Reference Frame for 3D Shape Matching | GFrames:用于三维形状匹配的基于梯度的局部参考帧 | |
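Eliminating Exposure Bias and Metric Mismatch in Multiple Object Tracking | 消除多目标跟踪中的曝光偏差和度量失配 | Addresses metric mismatch and exposure bias in methods that detect frame by frame to form tracklets and then merge them. |
Graph Convolutional Tracking | 图卷积跟踪 | An end-to-end improvement on SiamFC: 1. Uses graph convolution (based on the pairwise distance matrix and the Laplacian matrix). 2. Uses spatio-temporal information together with context information (Eqs. 2 and 3). Code: http://nlpr-web.ia.ac.cn/mmc/homepage/jygao/gct_cvpr2019.html |
ATOM: Accurate Tracking by Overlap Maximization | ATOM:通过重叠最大化实现精确跟踪 | Deep-network tracking with online and offline components (recommended reading). 1. The network (Fig. 2) has two sub-networks: an accurate target estimation module (learned offline, an IoUNet-based deep network that predicts an IoU score, Fig. 3) and a foreground/background classification network (learned online, a correlation-filter-style deep network that separates foreground from background using a heat map). 2. Runs in real time at 30 FPS on a GPU. Code: https://github.com/visionml/pytracking |
Visual Tracking via Adaptive Spatially-Regularized Correlation Filters | 基于自适应空间正则化相关滤波器的视觉跟踪 | An extension of SRDCF and BACF (both are special cases of it), given in Eq. 4, together with an ADMM-based optimization algorithm. |
Deep Tree Learning for Zero-Shot Face Anti-Spoofing | 零样本人脸反欺骗的深度树学习 | |
ArcFace: Additive Angular Margin Loss for Deep Face Recognition | ArcFace:用于深度人脸识别的加性角度间隔损失 | |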
Learning Joint Gait Representation via Quintuplet Loss Minimization | 通过五重损失最小化学习联合步态表示 | |
Gait Recognition via Disentangled Representation Learning | 基于分离表示学习的步态识别 | |
Reversible GANs for Memory-Efficient Image-To-Image Translation | 基于可逆GANs的内存高效的图像-图像转换 | |
Sensitive-Sample Fingerprinting of Deep Neural Networks | 深度神经网络中的敏感样本指纹 | |
Soft Labels for Ordinal Regression | 用于序数回归的软标签 | |
Local to Global Learning: Gradually Adding Classes for Training Deep Neural Networks | 局部到全局学习:通过逐步增加类别训练深度神经网络 | |
What Does It Mean to Learn in Deep Networks? And, How Does One Detect Adversarial Attacks? | 在深度网络中学习意味着什么?以及,如何检测对抗攻击? | |
Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning | 基于对抗学习的低资源手写体识别 | |
Adversarial Defense Through Network Profiling Based Path Extraction | 基于网络仿形的路径提取实现对抗防御 | |
RENAS: Reinforced Evolutionary Neural Architecture Search | 强化演进神经架构搜索 | |
Co-Occurrence Neural Network | 共现神经网络 | |
SpotTune: Transfer Learning Through Adaptive Fine-Tuning | SpotTune:自适应微调迁移学习 | |
Signal-To-Noise Ratio: A Robust Distance Metric for Deep Metric Learning | 信噪比:一种用于深度度量学习的鲁棒距离度量 | |
Detection Based Defense Against Adversarial Examples From the Steganalysis Point of View | 从隐写分析的角度看基于检测的对抗实例防御 | |
HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs | 异构核卷积在深度CNN中的应用 | |
Strike (With) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects | 摆姿势:神经网络很容易被熟悉物体的奇怪姿势愚弄 | |
Blind Geometric Distortion Correction on Images Through Deep Learning | 基于深度学习的图像盲几何失真校正 | |
Instance-Level Meta Normalization | 实例级元规范化 | |
Iterative Normalization: Beyond Standardization Towards Efficient Whitening | 迭代归一化:超越标准化,迈向高效白化 | |
On Learning Density Aware Embeddings | 论密度感知嵌入学习 | |
Contrastive Adaptation Network for Unsupervised Domain Adaptation | 基于对比度自适应网络的无监督域自适应 | |
LP-3DCNN: Unveiling Local Phase in 3D Convolutional Neural Networks | LP-3DCNN:揭示三维卷积神经网络中的局部相位 | |
Attribute-Driven Feature Disentangling and Temporal Aggregation for Video Person Re-Identification | 属性驱动的特征分离与时间聚合实现视频人再识别 | |
Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit? | 二元集成神经网络:每个网络的比特数更多还是每个比特的网络数更多? | |
Distilling Object Detectors With Fine-Grained Feature Imitation | 基于细粒度特征模拟的目标检测蒸馏 | |
Centripetal SGD for Pruning Very Deep Convolutional Networks With Complicated Structure | 用于修剪结构复杂的极深卷积网络的向心SGD | |
Knockoff Nets: Stealing Functionality of Black-Box Models | Knockoff Nets:窃取黑盒模型的功能 | |
Deep Embedding Learning With Discriminative Sampling Policy | 基于判别抽样策略的深度嵌入学习 | |
Hybrid Task Cascade for Instance Segmentation | 混合任务级联实例分割 | |
Multi-Task Self-Supervised Object Detection via Recycling of Bounding Box Annotations | 通过回收边界框注释实现多任务自监督目标检测 | |
ClusterNet: Deep Hierarchical Cluster Network With Rigorously Rotation-Invariant Representation for Point Cloud Analysis | ClusterNet:用于点云分析的严格旋转不变表示的深度层次集群网络 | |
Learning to Learn Relation for Important People Detection in Still Images | 通过学习关系实现静止图像中重要人物检测 | |
Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition | 寻找细节中的魔鬼:学习三线性注意力采样网络进行细粒度图像识别 | |
Multi-Similarity Loss With General Pair Weighting for Deep Metric Learning | 多相似度损失的一般配对加权实现深度度量学习 | |
Domain-Symmetric Networks for Adversarial Domain Adaptation | 基于域对称网络的对抗域适应 | |
End-To-End Supervised Product Quantization for Image Search and Retrieval | 基于端到端监督乘积量化的图像搜索和检索 | |
Learning to Learn From Noisy Labeled Data | 学习从带噪的标签数据中学习 | |
DSFD: Dual Shot Face Detector | 双镜头人脸检测器 | |
Label Propagation for Deep Semi-Supervised Learning | 标签传播在深度半监督学习中的应用 | |
Deep Global Generalized Gaussian Networks | 深度全局广义高斯网络 | |
Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-Based Image Retrieval | 语义关联成对循环一致性实现零样本基于草图的图像检索 | |
Context-Aware Crowd Counting | 上下文感知的群组计数 | |
Detect-To-Retrieve: Efficient Regional Aggregation for Image Search | 检测到检索:基于高效区域聚合的图像搜索 | |
Towards Accurate One-Stage Object Detection With AP-Loss | 基于AP损失的精确一阶段目标检测 | |
On Exploring Undetermined Relationships for Visual Relationship Detection | 视觉关系检测中未定关系的探讨 | |
Learning Without Memorizing | 无需记忆的学习 | |
Dynamic Recursive Neural Network | 动态递归神经网络 | |
Destruction and Construction Learning for Fine-Grained Image Recognition | 基于破坏与构造学习的细粒度图像识别 | |
Distraction-Aware Shadow Detection | 分心感知阴影检测 | |
Multi-Label Image Recognition With Graph Convolutional Networks | 基于图卷积网络的多标签图像识别 | |
High-Level Semantic Feature Detection: A New Perspective for Pedestrian Detection | 高级语义特征检测:行人检测的新视角 | |
RepMet: Representative-Based Metric Learning for Classification and Few-Shot Object Detection | RepMet:基于代表的度量学习实现分类和少样本目标检测 | |
Ranked List Loss for Deep Metric Learning | 基于排名损失的深度度量学习 | |
CANet: Class-Agnostic Segmentation Networks With Iterative Refinement and Attentive Few-Shot Learning | CANet:具有迭代细化和专注的少样本学习的类不可知分割网络 | |
Precise Detection in Densely Packed Scenes | 密集场景中的精确检测 | |
KE-GAN: Knowledge Embedded Generative Adversarial Networks for Semi-Supervised Scene Parsing | 基于知识嵌入式GAN的半监督场景解析 | |
Fast User-Guided Video Object Segmentation by Interaction-And-Propagation Networks | 基于交互和传播网络的快速用户引导视频对象分割 | |
Fast Interactive Object Annotation With Curve-GCN | 基于曲线GCN的快速交互对象标注 | |
FickleNet: Weakly and Semi-Supervised Semantic Image Segmentation Using Stochastic Inference | FickleNet:基于随机推理的弱监督和半监督语义图像分割 | |
RVOS: End-To-End Recurrent Network for Video Object Segmentation | 视频对象分割的端到端循环网络 | |
DeepFlux for Skeletons in the Wild | 基于DeepFlux的野外骨架 | |
Interactive Image Segmentation via Backpropagating Refinement Scheme | 基于后向传播细化方案的交互式图像分割 | |
Scene Parsing via Integrated Classification Model and Variance-Based Regularization | 基于综合分类模型和方差正则化的场景分析 | |
RAVEN: A Dataset for Relational and Analogical Visual REasoNing | RAVEN:用于关系和类比视觉推理的数据集 | |
Surface Reconstruction From Normals: A Robust DGP-Based Discontinuity Preservation Approach | 基于法线的曲面重构:一种基于DGP的鲁棒不连续性保持方法 | |
DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images | DeepFashion2:服装图像的检测、姿势估计、分割和再识别的通用基准 | |
Jumping Manifolds: Geometry Aware Dense Non-Rigid Structure From Motion | 跳跃流形:几何感知密集非刚性SFM | |
LVIS: A Dataset for Large Vocabulary Instance Segmentation | 大词汇实例分割的数据集 | |
Fast Object Class Labelling via Speech | 通过语音实现快速标记对象类 | |
LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking | LaSOT:大规模单目标跟踪的高质量基准 | |
Creative Flow+ Dataset | 创意流+数据集 | |
Weakly Supervised Open-Set Domain Adaptation by Dual-Domain Collaboration | 基于双域协作的弱监督开放集域自适应 | |
A Neurobiological Evaluation Metric for Neural Network Model Search | 用于神经网络模型搜索的神经生物学评价指标 | |
Iterative Projection and Matching: Finding Structure-Preserving Representatives and Its Application to Computer Vision | 迭代投影与匹配:寻找保结构表示及其在计算机视觉中的应用 | |
Efficient Multi-Domain Learning by Covariance Normalization | 基于协方差归一化的高效多域学习 | |
Predicting Visible Image Differences Under Varying Display Brightness and Viewing Distance | 不同显示亮度和视距下的可见图像差异预测 | |
A Bayesian Perspective on the Deep Image Prior | 深度图像先验的贝叶斯视角 | |
ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving | ApolloCar3D:面向自动驾驶的一个大型3D汽车实例理解基准 | |
Compressing Unknown Images With Product Quantizer for Efficient Zero-Shot Classification | 用乘积量化器压缩未知图像实现高效的零样本分类 | |
Self-Supervised Convolutional Subspace Clustering Network | 自监督卷积子空间聚类网络 | |
Multi-Scale Geometric Consistency Guided Multi-View Stereo | 多尺度几何一致性引导的多视角立体 | |
Privacy Preserving Image-Based Localization | 隐私保护的基于图像的定位 | |
SimulCap : Single-View Human Performance Capture With Cloth Simulation | SimulCap:用布料模拟单视图人的表现捕捉 | |
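Hierarchical Deep Stereo Matching on High-Resolution Images | 高分辨率图像的分层深度立体匹配 | Uses spatial pyramid pooling (SPP) to extract features at four scales and builds a multi-scale cost volume from them. The coarse-scale features both estimate disparity on their own and help the finer-scale features compute the finer-scale cost volume (Figs. 2 and 3); this is the "hierarchy" in the title. The paper also introduces data augmentation methods and a new training dataset. |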
Recurrent MVSNet for High-Resolution Multi-View Stereo Depth Inference | 基于循环MVSNet的高分辨率多视图立体深度推断 | |
Synthesizing 3D Shapes From Silhouette Image Collections Using Multi-Projection Generative Adversarial Networks | 使用多投影GAN从轮廓图像集合合成三维形状 | |
The Perfect Match: 3D Point Cloud Matching With Smoothed Densities | 完美匹配:基于平滑密度的三维点云匹配 | |
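Recurrent Neural Network for (Un-)Supervised Learning of Monocular Video Visual Odometry and Depth | 用于(非)监督学习单眼视频视觉里程计和深度的循环神经网络 | Joint estimation of depth and position with an LSTM-based deep network. The network architecture is in Fig. 3, the pipeline in Fig. 2, and the loss functions in Eqs. 1, 5, 6, 7, and 8. |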
PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing | PointWeb:基于局部邻域功能增强的点云处理 | |
Scan2Mesh: From Unstructured Range Scans to 3D Meshes | Scan2Mesh:从非结构化范围扫描到三维网格 | |
Unsupervised Domain Adaptation for ToF Data Denoising With Adversarial Learning | 利用对抗学习实现基于无监督域自适应的ToF数据去噪 | |
Learning Independent Object Motion From Unlabelled Stereoscopic Videos | 从未标记的立体视频中学习独立物体运动 | |
Learning Single-Image Depth From Videos Using Quality Assessment Networks | 使用质量评估网络从视频中学习单个图像深度 | |
Learning 3D Human Dynamics From Video | 从视频中学习三维人体动力学 | |
Lending Orientation to Neural Networks for Cross-View Geo-Localization | 为神经网络引入方向信息实现跨视图地理定位 | |
Visual Localization by Learning Objects-Of-Interest Dense Match Regression | 通过感兴趣对象的密集匹配回归学习实现视觉定位 | |
Bilateral Cyclic Constraint and Adaptive Regularization for Unsupervised Monocular Depth Prediction | 双边循环约束与自适应正则化实现无监督单目深度预测 | |
Face Parsing With RoI Tanh-Warping | 用RoI Tanh变形实现人脸解析 | |
Multi-Person Articulated Tracking With Spatial and Temporal Embeddings | 基于时空嵌入的多人关节跟踪 | |
Multi-Person Pose Estimation With Enhanced Channel-Wise and Spatial Information | 基于增强通道和空间信息的多人姿态估计 | |
A Compact Embedding for Facial Expression Similarity | 基于紧凑嵌入的面部表情相似性 | |
Deep High-Resolution Representation Learning for Human Pose Estimation | 基于深度高分辨率表示学习的人体姿态估计 | |
Feature Transfer Learning for Face Recognition With Under-Represented Data | 欠表示数据下基于特征迁移学习的人脸识别 | |
Unsupervised 3D Pose Estimation With Geometric Self-Supervision | 基于几何自监督的无监督三维姿态估计 | |
Peeking Into the Future: Predicting Future Person Activities and Locations in Videos | 展望未来:在视频中预测未来人的活动和地点 | |
Re-Identification With Consistent Attentive Siamese Networks | 用一致注意的孪生网络实现重识别 | |
On the Continuity of Rotation Representations in Neural Networks | 论神经网络中旋转表示的连续性 | |
Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation | 基于迭代残差精化的联合光流和遮挡估计 | |
Inverse Discriminative Networks for Handwritten Signature Verification | 基于反向判别网络的手写签名验证 | |
Led3D: A Lightweight and Efficient Deep Approach to Recognizing Low-Quality 3D Faces | Led3D:识别低质量三维人脸的一种轻量级和高效的深度方法 | |
ROI Pooled Correlation Filters for Visual Tracking | 用于视觉跟踪的ROI池化相关滤波器 | |
Deep Video Inpainting | 深度视频修复 | |
DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-To-Image Synthesis | DM-GAN:基于动态存储GAN的文本-图像合成 | |
Non-Adversarial Image Synthesis With Generative Latent Nearest Neighbors | 基于生成潜在最近邻的非对抗性图像合成 | |
Mixture Density Generative Adversarial Networks | 混合密度GAN | |
SketchGAN: Joint Sketch Completion and Recognition With Generative Adversarial Network | SketchGAN:基于GAN的联合草图补全与识别 | |
Foreground-Aware Image Inpainting | 前景感知的图像修补 | |
Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-To-Image Translation | Art2Real:通过语义感知的图像-图像的翻译来展现艺术作品的真实性 | |
Structure-Preserving Stereoscopic View Synthesis With Multi-Scale Adversarial Correlation Matching | 基于多尺度对抗相关匹配的保结构立体视图合成 | |
DynTypo: Example-Based Dynamic Text Effects Transfer | DynTypo:基于示例的动态文本效果传输 | |
Arbitrary Style Transfer With Style-Attentional Networks | 基于样式注意力网络的任意样式转换 | |
Typography With Decor: Intelligent Text Style Transfer | 带装饰的印刷:智能文本样式转换 | |
RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion | 基于增强学习代理控制GAN网络的实时点云形状补全 | |
Photo Wake-Up: 3D Character Animation From a Single Photo | 照片唤醒:来自单个照片的3D角色动画 | |
DeepLight: Learning Illumination for Unconstrained Mobile Mixed Reality | DeepLight:基于照明学习的无约束移动混合现实 | |
Iterative Residual CNNs for Burst Photography Applications | 迭代残差CNN在突发摄影中的应用 | |
Learning Implicit Fields for Generative Shape Modeling | 基于隐式域学习的生成形状建模 | |
Reliable and Efficient Image Cropping: A Grid Anchor Based Approach | 可靠高效的图像裁剪:基于网格锚的方法 | |
Patch-Based Progressive 3D Point Set Upsampling | 基于Patch的渐进式三维点集上采样 | |
An Iterative and Cooperative Top-Down and Bottom-Up Inference Network for Salient Object Detection | 一种迭代协作的自顶向下和自下而上的显著目标检测推理网络 | |
Deep Stacked Hierarchical Multi-Patch Network for Image Deblurring | 用于图像去模糊的深度层级多Patch网络 | |
Turn a Silicon Camera Into an InGaAs Camera | 把硅相机变成InGaAs相机 | |
Low-Rank Tensor Completion With a New Tensor Nuclear Norm Induced by Invertible Linear Transforms | 基于可逆线性变换诱导的新张量核范数的低秩张量补全 | |
Joint Representative Selection and Feature Learning: A Semi-Supervised Approach | 联合代表选择与特征学习:一种半监督方法 | |
The Domain Transform Solver | 域变换求解器 | |
CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection | CapSal:利用字幕增强语义实现显著目标检测 | |
Phase-Only Image Based Kernel Estimation for Single Image Blind Deblurring | 基于纯相位图像的核估计实现单图像盲去模糊 | |
Hierarchical Discrete Distribution Decomposition for Match Density Estimation | 基于层次离散分布分解的匹配密度估计 | |
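FOCNet: A Fractional Optimal Control Network for Image Denoising | FOCNet:一种用于图像去噪的分数阶最优控制网络 | The "control network" view interprets a particular class of deep networks (those satisfying Eq. 1) as a dynamical system with a given initial state (Eq. 2), and then solves for the optimal parameters of that system (i.e., the network parameters). |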
Orthogonal Decomposition Network for Pixel-Wise Binary Classification | 像素级二元分类的正交分解网络 | |
Multi-Source Weak Supervision for Saliency Detection | 多源弱监督的显著性检测 | |
ComDefend: An Efficient Image Compression Model to Defend Adversarial Examples | 一种有效的图像压缩模型来防御对抗性的例子 | |
Combinatorial Persistency Criteria for Multicut and Max-Cut | 基于组合持久性准则的多Cut和最大Cut | |
S4Net: Single Stage Salient-Instance Segmentation | S4Net:单阶段显著实例分割 | |
A Decomposition Algorithm for the Sparse Generalized Eigenvalue Problem | 稀疏广义特征值问题的分解算法 | |
Polynomial Representation for Persistence Diagram | 持久图的多项式表示 | |
Crowd Counting and Density Estimation by Trellis Encoder-Decoder Networks | 基于网格编码-译码器网络的人群计数和密度估计 | |
Cross-Atlas Convolution for Parameterization Invariant Learning on Textured Mesh Surface | 利用交叉图集卷积实现纹理网格表面的参数化不变学习 | |
Deep Surface Normal Estimation With Hierarchical RGB-D Fusion | 基于层次化RGB-D融合的深度表面法向估计 | |
Knowledge-Embedded Routing Network for Scene Graph Generation | 用于场景图生成的知识嵌入式路由网络 | |
An End-To-End Network for Panoptic Segmentation | 一种用于全景分割的端到端网络 | |
Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models | 基于深度卷积生成模型的快速灵活的室内场景合成 | |
Marginalized Latent Semantic Encoder for Zero-Shot Learning | 用于零样本学习的边缘化潜在语义编码器 | |
Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation | 尺度自适应神经密集特征:基于层次上下文聚合的学习 | |
Unsupervised Embedding Learning via Invariant and Spreading Instance Feature | 基于不变量和扩展实例特征的无监督嵌入学习 | |
AOGNets: Compositional Grammatical Architectures for Deep Learning | AOGNets:用于深度学习的复合语法体系结构 | |
A Robust Local Spectral Descriptor for Matching Non-Rigid Shapes With Incompatible Shape Structures | 用于非刚性形状与不相容形状结构匹配的鲁棒局部谱描述符 | |
Context and Attribute Grounded Dense Captioning | 上下文和属性固定的密集字幕 | |
Spot and Learn: A Maximum-Entropy Patch Sampler for Few-Shot Image Classification | 发现并学习:基于最大熵Patch采样的少样本图像分类 | |
Interpreting CNNs via Decision Trees | 通过决策树解释CNN | |
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning | 密集关系字幕:基于三流网络的关系字幕 | |
Deep Modular Co-Attention Networks for Visual Question Answering | 基于深度模块化协同注意网络的视觉问答 | |
Synthesizing Environment-Aware Activities via Activity Sketches | 通过活动草图的环境感知活动合成 | |
Self-Critical N-Step Training for Image Captioning | 基于自批判N步训练的图像描述 | |
Multi-Target Embodied Question Answering | 多目标具身问答 | |
Visual Question Answering as Reading Comprehension | 作为阅读理解的视觉问答 | |
StoryGAN: A Sequential Conditional GAN for Story Visualization | StoryGAN:用于故事可视化的序列条件GAN | |
Noise-Aware Unsupervised Deep Lidar-Stereo Fusion | 噪声感知的无监督深度激光雷达-立体视觉融合 | |
Versatile Multiple Choice Learning and Its Application to Vision Computing | 通用多选择学习及其在视觉计算中的应用 | |
EV-Gait: Event-Based Robust Gait Recognition Using Dynamic Vision Sensors | EV-Gait:动态视觉传感器中基于事件的鲁棒步态识别 | |
ToothNet: Automatic Tooth Instance Segmentation and Identification From Cone Beam CT Images | ToothNet:基于锥束CT图像的牙齿实例自动分割与识别 | |
Modularized Textual Grounding for Counterfactual Resilience | 面向反事实鲁棒性的模块化文本定位 | |
L3-Net: Towards Learning Based LiDAR Localization for Autonomous Driving | L3-Net:面向自动驾驶的基于学习的激光雷达定位 | |
Panoptic Feature Pyramid Networks | 全景特征金字塔网络 | |
Mask Scoring R-CNN | 掩码评分R-CNN | |
Reasoning-RCNN: Unifying Adaptive Global Reasoning Into Large-Scale Object Detection | Reasoning-RCNN:将自适应全局推理统一到大规模目标检测中 | |
Cross-Modality Personalization for Retrieval | 交叉模态个性化检索 | |
Composing Text and Image for Image Retrieval - an Empirical Odyssey | 组合文本与图像进行图像检索:一次实证探索 | |
Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation | 基于自适应文本区域表示的任意形状场景文本检测 | |
Adaptive NMS: Refining Pedestrian Detection in a Crowd | 自适应NMS:改进人群中的行人检测 | |
Point in, Box Out: Beyond Counting Persons in Crowds | 点进,框出:人群计数之外 | |
Locating Objects Without Bounding Boxes | 无需边界框的目标定位 | |
FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery | FineGAN:基于无监督层次解耦的细粒度对象生成和发现 | |
Mutual Learning of Complementary Networks via Residual Correction for Improving Semi-Supervised Classification | 基于残差校正的互补网络互学习实现半监督分类的改进 | |
Sampling Techniques for Large-Scale Object Detection From Sparsely Annotated Objects | 基于稀疏标注目标的大规模目标检测采样技术 | |
Curls & Whey: Boosting Black-Box Adversarial Attacks | Curls与Whey:增强黑盒对抗性攻击 | |
Barrage of Random Transforms for Adversarially Robust Defense | 基于密集随机变换的对抗鲁棒防御 | |
Aggregation Cross-Entropy for Sequence Recognition | 基于聚集交叉熵的序列识别 | |
LaSO: Label-Set Operations Networks for Multi-Label Few-Shot Learning | LaSO:用于多标签小样本学习的标签集操作网络 | |
Few-Shot Learning With Localization in Realistic Settings | 现实场景下结合定位的小样本学习 | |
AdaGraph: Unifying Predictive and Continuous Domain Adaptation Through Graphs | AdaGraph:通过图统一预测式和连续式域自适应 | |
Grounded Video Description | 基于视觉定位的视频描述 | |
Streamlined Dense Video Captioning | 精简的密集视频描述 | |
Adversarial Inference for Multi-Sentence Video Description | 多句视频描述的对抗推理 | |
Unified Visual-Semantic Embeddings: Bridging Vision and Language With Structured Meaning Representations | 统一的视觉-语义嵌入:利用结构化的意义表达将视觉和语言连接起来 | |
Learning to Compose Dynamic Tree Structures for Visual Contexts | 学习为视觉上下文构建动态树结构 | |
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation | 基于强化跨模态匹配与自监督模仿学习的视觉语言导航 | |
Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering | 基于模态内和模态间注意力流动态融合的视觉问答 | |
Cycle-Consistency for Robust Visual Question Answering | 基于循环一致性的鲁棒视觉问答 | |
Embodied Question Answering in Photorealistic Environments With Point Cloud Perception | 基于点云感知的照片级真实环境中的具身问答 | |
Reasoning Visual Dialogs With Structural and Partial Observations | 基于结构和局部观察的视觉对话推理 | |
Recursive Visual Attention in Visual Dialog | 视觉对话中的递归视觉注意 | |
Two Body Problem: Collaborative Visual Task Completion | 二体问题:协同完成视觉任务 | |
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering | GQA:一个新的数据集,用于现实世界的视觉推理和组合问题解答 | |
Text2Scene: Generating Compositional Scenes From Textual Descriptions | Text2Scene:根据文本描述生成合成场景 | |
From Recognition to Cognition: Visual Commonsense Reasoning | 从识别到认知:视觉常识推理 | |
The Regretful Agent: Heuristic-Aided Navigation Through Progress Estimation | 后悔的智能体:通过进度估计的启发式辅助导航 | |
Tactical Rewind: Self-Correction via Backtracking in Vision-And-Language Navigation | 战术回退:视觉-语言导航中利用回溯进行自我修正 | |
Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning | 学习如何学会学习:基于元学习的自适应视觉导航 | |
High Flux Passive Imaging With Single-Photon Sensors | 单光子传感器的高通量被动成像 | |
Photon-Flooded Single-Photon 3D Cameras | 光子淹没的单光子3D相机 | |
Acoustic Non-Line-Of-Sight Imaging | 声学非视域成像 | |
Steady-State Non-Line-Of-Sight Imaging | 稳态非视域成像 | |
A Theory of Fermat Paths for Non-Line-Of-Sight Shape Reconstruction | 非视域形状重建的费马路径理论 | |
End-To-End Projector Photometric Compensation | 端到端投影仪光度补偿 | |
Bringing a Blurry Frame Alive at High Frame-Rate With an Event Camera | 使用事件相机将模糊帧复原为高帧率视频 | |
Bringing Alive Blurred Moments | 让模糊的瞬间重现 | |
Learning to Synthesize Motion Blur | 学习合成运动模糊 | |
Underexposed Photo Enhancement Using Deep Illumination Estimation | 使用深度光照估计的曝光不足照片增强 | |
Blind Visual Motif Removal From a Single Image | 单幅图像的盲视觉图案去除 | |
Non-Local Meets Global: An Integrated Paradigm for Hyperspectral Denoising | 非局部遇见全局:高光谱去噪的一种综合范式 | |
Neural Rerendering in the Wild | 真实场景下的神经重渲染 | |
GeoNet: Deep Geodesic Networks for Point Cloud Analysis | GeoNet:基于深度测地线网络的点云分析 | |
MeshAdv: Adversarial Meshes for Visual Recognition | MeshAdv:用于视觉识别的对抗性网格 | |
Fast Spatially-Varying Indoor Lighting Estimation | 快速的空间变化室内光照估计 | |
Neural Illumination: Lighting Prediction for Indoor Environments | 神经照明:室内环境的照明预测 | |
Deep Sky Modeling for Single Image Outdoor Lighting Estimation | 单图像室外照明估计的深度天空建模 | |
Bidirectional Learning for Domain Adaptation of Semantic Segmentation | 语义分割域自适应的双向学习 | 面向语义分割的域自适应:包含两个子网(从有标签源域到无标签目标域的图像翻译网络,以及目标域上的语义分割网络)。传统方法先完成源域到目标域的翻译,再训练分割网络;本文让翻译与分割两个子网双向训练、相互影响 |
Enhanced Bayesian Compression via Deep Reinforcement Learning | 通过深度强化学习增强贝叶斯压缩 | |
Strong-Weak Distribution Alignment for Adaptive Object Detection | 基于强-弱分布对齐的自适应目标检测 | |
MFAS: Multimodal Fusion Architecture Search | MFAS:多模态融合架构搜索 | |
Disentangling Adversarial Robustness and Generalization | 解耦对抗鲁棒性与泛化能力 | |
ShieldNets: Defending Against Adversarial Attacks Using Probabilistic Adversarial Robustness | ShieldNets:使用概率对抗鲁棒性防御对抗攻击 | |
Deeply-Supervised Knowledge Synergy | 深度监督知识协同 | |
Dual Residual Networks Leveraging the Potential of Paired Operations for Image Restoration | 利用配对操作的潜力进行图像恢复的对偶残差网络 | 将残差块内的操作拆分为两个成对(对偶)的操作(文中定义了4种成对操作,见Fig.4),再按Fig.1d的方式构建残差块,意图充分挖掘成对操作的潜力(传统残差块没有成对操作的概念,见Fig.1a)。文中针对五种常见的图像恢复问题,分别选用Fig.4中不同的成对操作来定义残差块,实现图像恢复 |
Probabilistic End-To-End Noise Correction for Learning With Noisy Labels | 基于概率端到端噪声校正的带噪声标签学习 | |
Attention-Guided Unified Network for Panoptic Segmentation | 基于注意力引导统一网络的全景分割 | |
NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection | NAS-FPN:学习用于目标检测的可扩展特征金字塔结构 | |
OICSR: Out-In-Channel Sparsity Regularization for Compact Deep Neural Networks | OICSR:面向紧凑深度神经网络的输出-输入通道稀疏正则化 | |
Semantically Aligned Bias Reducing Zero Shot Learning | 语义对齐、减少偏差的零样本学习 | |
Feature Space Perturbations Yield More Transferable Adversarial Examples | 特征空间扰动产生可迁移性更强的对抗样本 | |
IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction | IGE-Net:用于人体姿态估计和单视图重建的逆图形能量网络 | |
Accelerating Convolutional Neural Networks via Activation Map Compression | 通过激活图压缩加速卷积神经网络 | |
Knowledge Distillation via Instance Relationship Graph | 基于实例关系图的知识蒸馏 | |
PPGNet: Learning Point-Pair Graph for Line Segment Detection | PPGNet:用于线段检测的点对图学习 | |
Building Detail-Sensitive Semantic Segmentation Networks With Polynomial Pooling | 基于多项式池化的细节敏感语义分割网络的构建 | 提出了一种适用于语义分割、介于平均池化和最大池化之间的多项式池化(公式1),并对其进行了分析和实验 |
Variational Bayesian Dropout With a Hierarchical Prior | 具有层次先验的变分贝叶斯Dropout | |
AANet: Attribute Attention Network for Person Re-Identifications | AANet:用于行人重识别的属性注意力网络 | |
Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction | 克服混合密度网络的局限性:多模式未来预测的抽样和拟合框架 | |
A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks | 简化二值神经网络的主/辅网络框架 | |
PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet | PointNetLK:使用PointNet的鲁棒高效点云配准 | |
Few-Shot Adaptive Faster R-CNN | 小样本自适应Faster R-CNN | |
VRSTC: Occlusion-Free Video Person Re-Identification | VRSTC:无遮挡的视频行人重识别 | |
Compact Feature Learning for Multi-Domain Image Classification | 多域图像分类的紧凑特征学习 | |
Adaptive Transfer Network for Cross-Domain Person Re-Identification | 跨域行人重识别的自适应迁移网络 | |
Large-Scale Few-Shot Learning: Knowledge Transfer With Class Hierarchy | 大规模小样本学习:利用类别层次的知识迁移 | |
Moving Object Detection Under Discontinuous Change in Illumination Using Tensor Low-Rank and Invariant Sparse Decomposition | 基于张量低秩和不变稀疏分解的光照不连续变化下运动目标检测 | |
Pedestrian Detection With Autoregressive Network Phases | 基于自回归网络阶段的行人检测 | |
All You Need Is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification | 你只需要几次移位:为图像分类设计高效的卷积神经网络 | |
Stochastic Class-Based Hard Example Mining for Deep Metric Learning | 用于深度度量学习的基于类的随机难样本挖掘 | |
Revisiting Local Descriptor Based Image-To-Class Measure for Few-Shot Learning | 重新审视基于局部描述子的图像到类度量以实现小样本学习 | |
Towards Robust Curve Text Detection With Conditional Spatial Expansion | 基于条件空间扩展的鲁棒曲线文本检测 | |
Revisiting Perspective Information for Efficient Crowd Counting | 基于透视信息的有效的人群计数 | |
Towards Universal Object Detection by Domain Attention | 基于域关注的通用目标检测 | |
Ensemble Deep Manifold Similarity Learning Using Hard Proxies | 基于硬代理的集成深度流形相似性学习 | |
Quantization Networks | 量化网络 | |
RES-PCA: A Scalable Approach to Recovering Low-Rank Matrices | RES-PCA:一种低秩矩阵恢复的可扩展方法 | |
Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks | Occlusion-Net:使用图形网络进行二维/三维遮挡关键点定位 | |
Efficient Featurized Image Pyramid Network for Single Shot Detector | 用于单次检测器的高效特征化图像金字塔网络 | |
Multi-Task Multi-Sensor Fusion for 3D Object Detection | 用于三维目标检测的多任务多传感器融合 | |
Domain-Specific Batch Normalization for Unsupervised Domain Adaptation | 用于无监督域适应的域特定批归一化 | |
Grid R-CNN | 网格R-CNN | |
MetaCleaner: Learning to Hallucinate Clean Representations for Noisy-Labeled Visual Recognition | 元清洗器:用于噪声标签视觉识别的幻觉干净表示学习 | |
Mapping, Localization and Path Planning for Image-Based Navigation Using Visual Features and Map | 利用视觉特征和地图实现基于图像导航的建图、定位与路径规划 | |
Triply Supervised Decoder Networks for Joint Detection and Segmentation | 用于联合检测和分割的三重监督解码器网络 | |
Leveraging the Invariant Side of Generative Zero-Shot Learning | 利用生成式零样本学习中不变的一面 | |
Exploring the Bounds of the Utility of Context for Object Detection | 探索上下文在目标检测中的应用边界 | |
A-CNN: Annularly Convolutional Neural Networks on Point Clouds | A-CNN:点云上的环形卷积神经网络 | |
DARNet: Deep Active Ray Network for Building Segmentation | DARNet:用于建筑物分割的深度主动射线网络 | |
Point Cloud Oversegmentation With Graph-Structured Deep Metric Learning | 基于图结构深度度量学习的点云过分割 | |
Graphonomy: Universal Human Parsing via Graph Transfer Learning | Graphonomy:通过图迁移学习的通用人体解析 | |
Fitting Multiple Heterogeneous Models by Multi-Class Cascaded T-Linkage | 用多类级联T-Linkage实现多个异构模型拟合 | |
A Late Fusion CNN for Digital Matting | 用于数字抠图的后期融合CNN | |
BASNet: Boundary-Aware Salient Object Detection | BASNet:边界感知显著目标检测 | |
ZigZagNet: Fusing Top-Down and Bottom-Up Context for Object Segmentation | ZigZagNet:融合自上而下和自下而上的上下文进行对象分割 | |
Object Instance Annotation With Deep Extreme Level Set Evolution | 基于深度极值水平集演化的对象实例注释 | |
Leveraging Crowdsourced GPS Data for Road Extraction From Aerial Imagery | 利用众包GPS数据从航空影像中提取道路 | |
Adaptive Pyramid Context Network for Semantic Segmentation | 用于语义分割的自适应金字塔上下文网络 | 研究context信息在语义分割中的作用,指出context信息的使用应满足三个特点:1. 多尺度;2. 自适应;3. 全局引导的局部Affinity(两两相关)。据此提出ACM(自适应Context模块,见图2),三个特点在其中均有体现:多尺度、自适应(Affinity矩阵由学习得到)、全局引导局部Affinity(通过矩阵相乘实现) |
Isospectralization, or How to Hear Shape, Style, and Correspondence | 等谱化,或如何听出形状、风格和对应关系 | |
Speech2Face: Learning the Face Behind a Voice | Speech2Face:学习声音背后的面孔 | |
Joint Manifold Diffusion for Combining Predictions on Decoupled Observations | 联合流形扩散用于解耦合观测的组合预测 | |
Audio Visual Scene-Aware Dialog | 视听的场景感知对话 | |
Learning to Minify Photometric Stereo | 学习缩小光度立体 | |
Reflective and Fluorescent Separation Under Narrow-Band Illumination | 窄带照明下的反射和荧光分离 | |
Depth From a Polarisation + RGB Stereo Pair | 基于偏振+RGB立体图像对的深度估计 | |
Rethinking the Evaluation of Video Summaries | 对视频摘要评价的再思考 | |
What Object Should I Use? - Task Driven Object Detection | 我应该使用什么对象?-任务驱动的对象检测 | |
Triangulation Learning Network: From Monocular to Stereo 3D Object Detection | 三角测量学习网络:从单目到立体三维目标检测 | |
Connecting the Dots: Learning Representations for Active Monocular Depth Estimation | 连接点:主动单目深度估计的学习表示法 | |
Learning Non-Volumetric Depth Fusion Using Successive Reprojections | 利用连续重投影学习非体积深度融合 | |
Stereo R-CNN Based 3D Object Detection for Autonomous Driving | 基于Stereo R-CNN的自动驾驶三维目标检测 | |
Hybrid Scene Compression for Visual Localization | 用于视觉定位的混合场景压缩 | |
MMFace: A Multi-Metric Regression Network for Unconstrained Face Reconstruction | MMFace:用于无约束人脸重建的多度量回归网络 | |
3D Motion Decomposition for RGBD Future Dynamic Scene Synthesis | 三维运动分解在RGBD未来动态场景合成中的应用 | |
Single Image Depth Estimation Trained via Depth From Defocus Cues | 利用离焦深度线索训练的单图像深度估计 | |
RGBD Based Dimensional Decomposition Residual Network for 3D Semantic Scene Completion | 基于RGBD的维度分解残差网络实现三维语义场景补全 | |
Neural Scene Decomposition for Multi-Person Motion Capture | 基于神经场景分解的多人运动捕捉 | |
Efficient Decision-Based Black-Box Adversarial Attacks on Face Recognition | 对人脸识别的基于决策的黑盒对抗攻击 | |
FA-RPN: Floating Region Proposals for Face Detection | FA-RPN:基于浮动区域建议的人脸检测 | |
Bayesian Hierarchical Dynamic Model for Human Action Recognition | 基于贝叶斯层次动态模型的人类行为识别 | |
Mixed Effects Neural Networks (MeNets) With Applications to Gaze Estimation | 混合效应神经网络(MeNets)及其在视线估计中的应用 | |
3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training | 基于时间卷积和半监督训练的视频三维人体姿态估计 | |
Learning to Regress 3D Face Shape and Expression From an Image Without 3D Supervision | 学习在没有3D监督的情况下从图像中回归3D人脸形状和表情 | |
PoseFix: Model-Agnostic General Human Pose Refinement Network | PoseFix:与模型无关的通用人体姿态优化网络 | |
RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation | RepNet:用于三维人体姿态估计的对抗性重投影网络的弱监督训练 | |
Fast and Robust Multi-Person 3D Pose Estimation From Multiple Views | 多视图多人三维姿态快速鲁棒估计 | |
Face-Focused Cross-Stream Network for Deception Detection in Videos | 面向人脸的交叉流网络实现视频欺骗检测 | |
Unequal-Training for Deep Face Recognition With Long-Tailed Noisy Data | 利用非均匀训练实现长尾噪声数据下人脸识别 | |
T-Net: Parametrizing Fully Convolutional Nets With a Single High-Order Tensor | T-Net:用一个高阶张量参数化全卷积网 | |
Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss | 基于动态像素级损失的层次化跨模态说话人脸生成 | |
Object-Centric Auto-Encoders and Dummy Anomalies for Abnormal Event Detection in Video | 基于目标中心自动编码器和虚拟异常的视频异常事件检测 | |
DDLSTM: Dual-Domain LSTM for Cross-Dataset Action Recognition | DDLSTM:基于双域LSTM的跨数据集动作识别 | |
The Pros and Cons: Rank-Aware Temporal Attention for Skill Determination in Long Videos | 利与弊:用于长视频技能判定的排序感知时间注意力 | |
Collaborative Spatiotemporal Feature Learning for Video Action Recognition | 基于协同时空特征学习的视频动作识别 | |
MARS: Motion-Augmented RGB Stream for Action Recognition | MARS:用于动作识别的运动增强RGB流 | |
Convolutional Relational Machine for Group Activity Recognition | 用于群体活动识别的卷积关系机 | |
Video Summarization by Learning From Unpaired Data | 从未配对数据中学习视频摘要 | |
Skeleton-Based Action Recognition With Directed Graph Neural Networks | 利用有向图神经网络实现基于骨架的动作识别 | |
PA3D: Pose-Action 3D Machine for Video Recognition | PA3D:基于姿势动作3D机的视频识别 | |
Deep Dual Relation Modeling for Egocentric Interaction Recognition | 基于深度对偶关系模型的自我中心交互识别 | |
MOTS: Multi-Object Tracking and Segmentation | MOTS:多目标跟踪与分割 | 1. 提出了同时进行跟踪与(像素级)分割的(训练)数据集 2. 提出了跟踪与分割过程中的距离度量方法 3. 提出了基于Mask-RCNN的检测、分割方法 该算法逐帧检测,然后关联(link)各帧的检测结果 代码:https://www.vision.rwth-aachen.de/page/mots |
Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking | 基于孪生级联区域建议网络的实时视觉跟踪 | |
PointFlowNet: Learning Representations for Rigid Motion Estimation From Point Clouds | PointFlowNet:点云刚性运动估计的表示法学习 | |
Listen to the Image | 听图像 | |
Image Super-Resolution by Neural Texture Transfer | 基于神经纹理传递的图像超分辨率 | |
Conditional Adversarial Generative Flow for Controllable Image Synthesis | 基于条件对抗生成流的可控图像合成 | |
How to Make a Pizza: Learning a Compositional Layer-Based GAN Model | 如何制作披萨:学习基于合成层的GAN模型 | |
TransGaGa: Geometry-Aware Unsupervised Image-To-Image Translation | TransGaGa:几何感知的无监督图像到图像转换 | |
Depth-Attentional Features for Single-Image Rain Removal | 基于深度注意特征的单图像雨水去除 | |
Hyperspectral Image Reconstruction Using a Deep Spatial-Spectral Prior | 基于深度空-谱先验的高光谱图像重建 | |
LiFF: Light Field Features in Scale and Depth | LiFF:在尺度和深度上的光场特征 | |
Deep Exemplar-Based Video Colorization | 基于示例的深度视频着色 | |
On Finding Gray Pixels | 关于寻找灰色像素 | |
UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos | UnOS:通过观看视频进行统一的无监督光流和立体深度估计 | |
Learning Transformation Synchronization | 学习变换同步 | |
D2-Net: A Trainable CNN for Joint Description and Detection of Local Features | D2-Net:用于联合描述和检测局部特征的一个可训练的CNN | |
Recurrent Neural Networks With Intra-Frame Iterations for Video Deblurring | 视频去模糊的帧内迭代递归神经网络 | |
Learning to Extract Flawless Slow Motion From Blurry Videos | 学习从模糊的视频中提取完美的慢动作 | |
Natural and Realistic Single Image Super-Resolution With Explicit Natural Manifold Discrimination | 利用显式的自然流形判别实现自然而真实的单图像超分辨率 | |
RF-Net: An End-To-End Image Matching Network Based on Receptive Field | RF-Net:基于感受野的端到端图像匹配网络 | |
Fast Single Image Reflection Suppression via Convex Optimization | 基于凸优化的快速单图像反射抑制 | |
A Mutual Learning Method for Salient Object Detection With Intertwined Multi-Supervision | 一种多监督交织的显著目标检测互学习方法 | |
Enhanced Pix2pix Dehazing Network | 增强型Pix2Pix去雾网络 | |
Assessing Personally Perceived Image Quality via Image Features and Collaborative Filtering | 通过图像特征和协同过滤实现个人感知图像质量评估 | |
Single Image Reflection Removal Exploiting Misaligned Training Data and Network Enhancements | 利用不对齐训练数据和网络增强实现单一图像反射消除 | |
Exploring Context and Visual Pattern of Relationship for Scene Graph Generation | 利用关系的上下文和视觉模式实现场景图形生成 | |
Learning From Synthetic Data for Crowd Counting in the Wild | 利用合成数据学习实现真实场景下的人群计数 | |
A Local Block Coordinate Descent Algorithm for the CSC Model | CSC模型的局部块坐标下降算法 | |
Not Using the Car to See the Sidewalk -- Quantifying and Controlling the Effects of Context in Classification and Segmentation | 不使用汽车看人行道--在分类和分割中量化和控制上下文的影响 | |
Discovering Fair Representations in the Data Domain | 发现数据域中的公平表示 | |
Actor-Critic Instance Segmentation | Actor-Critic(演员-评论家)实例分割 | |
Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders | 基于对齐变分自编码器的广义零样本和小样本学习 | |
Semantic Projection Network for Zero- and Few-Label Semantic Segmentation | 零标签和少标签语义分割的语义投影网络 | 零样本或小样本的语义分割,网络结构如图2 |
GCAN: Graph Convolutional Adversarial Network for Unsupervised Domain Adaptation | GCAN:无监督域适应的图卷积对抗网络 | |
Seamless Scene Segmentation | 无缝场景分割 | |
Unsupervised Image Matching and Object Discovery as Optimization | 将无监督图像匹配和目标发现表述为优化问题 | |
Wide-Area Crowd Counting via Ground-Plane Density Maps and Multi-View Fusion CNNs | 通过地面密度图和多视图融合CNN实现广域人群计数 | |
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions | 展示、控制和讲述:生成可控且有视觉依据的描述的框架 | |
Towards VQA Models That Can Read | 迈向能够阅读文字的VQA模型 | |
Object-Aware Aggregation With Bidirectional Temporal Graph for Video Captioning | 基于双向时间图的对象感知聚合实现视频描述 | |
Progressive Attention Memory Network for Movie Story Question Answering | 基于渐进式注意力记忆网络的电影故事问答 | |
Memory-Attended Recurrent Network for Video Captioning | 基于记忆关注循环网络的视频描述 | |
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning | 基于实体属性图匹配推理的视觉问答 | |
Look Back and Predict Forward in Image Captioning | 基于回顾与前瞻预测的图像描述 | |
Explainable and Explicit Visual Reasoning Over Scene Graphs | 基于场景图的可解释和显式视觉推理 | |
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering | 通过无监督任务发现的迁移学习以进行视觉问答 | |
Intention Oriented Image Captions With Guiding Objects | 带有引导对象的意图导向图像描述 | |
Uncertainty Guided Multi-Scale Residual Learning-Using a Cycle Spinning CNN for Single Image De-Raining | 基于不确定性的循环旋转CNN多尺度残差学习实现单图像去雨 | |
Toward Realistic Image Compositing With Adversarial Learning | 基于对抗学习的现实图像组合 | |
Cross-Classification Clustering: An Efficient Multi-Object Tracking Technique for 3-D Instance Segmentation in Connectomics | 交叉分类聚类:一种有效的连接体三维实例分割多目标跟踪技术 | |
Deep ChArUco: Dark ChArUco Marker Pose Estimation | Deep ChArUco:暗光条件下的ChArUco标记姿态估计 | |
Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving | 基于视觉深度估计的伪激光雷达:弥合自动驾驶三维目标检测中的差距 | |
Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions | 道路规则:用语义交互卷积模型预测驾驶行为 | |
Metric Learning for Image Registration | 图像配准的度量学习 | |
LO-Net: Deep Real-Time Lidar Odometry | LO-Net:深度实时激光雷达里程计 | |
TraPHic: Trajectory Prediction in Dense and Heterogeneous Traffic Using Weighted Interactions | TraPHic:基于加权交互的密集异构交通轨迹预测 | |
World From Blur | 从模糊中恢复世界 | |
Topology Reconstruction of Tree-Like Structure in Images via Structural Similarity Measure and Dominant Set Clustering | 基于结构相似性测度和优势集聚类的图像树型结构拓扑重构 | |
Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training | 基于多损失动态训练的金字塔行人重识别 | |
Holistic and Comprehensive Annotation of Clinically Significant Findings on Diverse CT Images: Learning From Radiology Reports and Label Ontology | 对不同CT图像的临床重要发现的整体和全面注释:从放射学报告和标记本体学学习 | |
Robust Histopathology Image Analysis: To Label or to Synthesize? | 鲁棒的组织病理学图像分析:贴标签还是合成? | |
Data Augmentation Using Learned Transformations for One-Shot Medical Image Segmentation | 利用学习到的变换进行单样本医学图像分割的数据增强 | |
Shifting More Attention to Video Salient Object Detection | 将更多的注意力转移到视频显著物体检测上 | |
Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration | 神经任务图:从单个视频演示泛化到未见过的任务 | |
Beyond Tracking: Selecting Memory and Refining Poses for Deep Visual Odometry | 超越跟踪:通过选择记忆和优化位姿实现深度视觉里程计 | |
Image Generation From Layout | 从布局生成图像 | |
Multimodal Explanations by Predicting Counterfactuality in Videos | 利用视频中反事实预测实现多模态解释 | |
Learning to Explain With Complemental Examples | 学习用互补的例子解释 | |
HAQ: Hardware-Aware Automated Quantization With Mixed Precision | HAQ:利用混合精度实现硬件感知的自动量化 | |
Content Authentication for Neural Imaging Pipelines: End-To-End Optimization of Photo Provenance in Complex Distribution Channels | 神经成像管道的内容认证:复杂分发渠道中照片溯源的端到端优化 | |
Inverse Procedural Modeling of Knitwear | 针织品的逆过程建模 | |
Estimating 3D Motion and Forces of Person-Object Interactions From Monocular Video | 从单目视频估计人-物交互的三维运动和力 | |
DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds | DeepMapping:多点云的无监督地图估计 | |
End-To-End Interpretable Neural Motion Planner | 端到端可解释神经运动规划器 | |
Divergence Triangle for Joint Training of Generator Model, Energy-Based Model, and Inferential Model | 基于散度三角的生成模型、能量模型和推理模型联合训练 | |
Image Deformation Meta-Networks for One-Shot Learning | 基于图像变形元网络的单样本学习 | |
Online High Rank Matrix Completion | 在线高秩矩阵补全 | |
Multispectral Imaging for Fine-Grained Recognition of Powders on Complex Backgrounds | 复杂背景下利用多光谱成像实现粉末的细粒度识别 | |
ContactDB: Analyzing and Predicting Grasp Contact via Thermal Imaging | ContactDB:通过热成像分析和预测抓握接触 | |
Robust Subspace Clustering With Independent and Piecewise Identically Distributed Noise Modeling | 基于独立且分段同分布噪声建模的鲁棒子空间聚类 | |
What Correspondences Reveal About Unknown Camera and Motion Models? | 对应关系揭示了未知相机和运动模型的哪些信息? | |
Self-Calibrating Deep Photometric Stereo Networks | 自校准深度光度立体网络 | |
Argoverse: 3D Tracking and Forecasting With Rich Maps | Argoverse:用丰富的地图进行三维跟踪和预测 | |
Side Window Filtering | 侧窗滤波 | 一种保边缘/结构的滤波窗设计(图2)。将待处理的像素放到边缘(而不是中心)。这个方法有利于保边缘,但是降噪能力估计有所下降(根据公式4,若在平坦区,则必然下降) |
Defense Against Adversarial Images Using Web-Scale Nearest-Neighbor Search | 使用网络规模的最近邻搜索防御对抗图像 | |
Incremental Object Learning From Contiguous Views | 从相邻视图进行增量对象学习 | |
IP102: A Large-Scale Benchmark Dataset for Insect Pest Recognition | IP102:昆虫害虫识别的大规模基准数据集 | |
CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification | CityFlow:多目标多摄像机车辆跟踪与再识别的城市尺度基准 | |
Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence | Social-IQ:人工社会智能的问答基准 | |
UPSNet: A Unified Panoptic Segmentation Network | UPSNet:统一的全景分割网络 | |
JSIS3D: Joint Semantic-Instance Segmentation of 3D Point Clouds With Multi-Task Pointwise Networks and Multi-Value Conditional Random Fields | JSIS3D:基于多任务逐点网络和多值条件随机场的三维点云联合语义-实例分割 | |
Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth | 联合优化空间嵌入和聚类带宽的实例分割 | 基于聚类的(proposal-free)实例分割方法的改进,如图2,两个分支,一个分支用于预测object center(seed branch),另一个分支用于使用object center来预测实例图。本文关注对于不同大小实例采用不同margin(传统方法为相同margin)的改进算法 |
DeepCO3: Deep Instance Co-Segmentation by Co-Peak Search and Co-Saliency Detection | DeepCO3:基于共峰搜索和共显著性检测的深度实例共分割 | |
Improving Semantic Segmentation via Video Propagation and Label Relaxation | 通过视频传播和标签松弛改进语义分割 | |
Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video | Accel:一种用于高效视频语义分割的校正融合网络 | |
Shape2Motion: Joint Analysis of Motion Parts and Attributes From 3D Shapes | Shape2Motion:三维形状的运动部件和属性的联合分析 | |
Semantic Correlation Promoted Shape-Variant Context for Segmentation | 语义相关性促进的形状可变上下文实现分割 | |
Relation-Shape Convolutional Neural Network for Point Cloud Analysis | 基于关系-形状卷积神经网络的点云分析 | |
Enhancing Diversity of Defocus Blur Detectors via Cross-Ensemble Network | 利用交叉-集成网络提高离焦模糊探测器的多样性 | |
BubbleNets: Learning to Select the Guidance Frame in Video Object Segmentation by Deep Sorting Frames | BubbleNets:通过深度排序帧学习选择视频对象分割中的引导帧 | |
Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-High Resolution Images | 用于超高分辨率图像内存高效分割的协作全局-局部网络 | |
Efficient Parameter-Free Clustering Using First Neighbor Relations | 基于第一邻域关系的高效无参数聚类 | |
Learning Personalized Modular Network Guided by Structured Knowledge | 基于结构化知识的个性化模块化网络学习 | |
A Generative Appearance Model for End-To-End Video Object Segmentation | 端到端视频对象分割的生成性外观模型 | |
A Flexible Convolutional Solver for Fast Style Transfers | 用于快速风格迁移的灵活卷积求解器 | |
Cross Domain Model Compression by Structurally Weight Sharing | 基于结构化权值共享的跨域模型压缩 | |
TraVeLGAN: Image-To-Image Translation by Transformation Vector Learning | TraVeLGAN:通过变换向量学习实现图像到图像的转换 | |
Deep Robust Subjective Visual Property Prediction in Crowdsourcing | 众包中的深度鲁棒主观视觉特性预测 | |
Transferable AutoML by Model Sharing Over Grouped Datasets | 通过分组数据集上的模型共享实现可迁移的AutoML | |
Learning Not to Learn: Training Deep Neural Networks With Biased Data | 学会不学习:用有偏数据训练深度神经网络 | |
IRLAS: Inverse Reinforcement Learning for Architecture Search | IRLAS:用于架构搜索的逆强化学习 | |
Learning for Single-Shot Confidence Calibration in Deep Neural Networks Through Stochastic Inferences | 通过随机推理学习深度神经网络的单次置信度校准 | |
Attention-Based Adaptive Selection of Operations for Image Restoration in the Presence of Unknown Combined Distortions | 在未知组合失真的情况下,基于注意的自适应选择实现图像复原 | |
Fully Learnable Group Convolution for Acceleration of Deep Neural Networks | 基于完全可学习分组卷积的深度神经网络加速 | |
EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching From Scratch | EIGEN:受生态学启发的从零开始搜索神经网络结构的遗传方法 | |
Deep Incremental Hashing Network for Efficient Image Retrieval | 基于深度增量哈希网络的高效图像检索 | |
Robustness via Curvature Regularization, and Vice Versa | 通过曲率正则化实现鲁棒性,反之亦然 | |
SparseFool: A Few Pixels Make a Big Difference | SparseFool:几个像素会产生很大的差异 | |
Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks | 卷积神经网络的可解释和细粒度可视化解释 | |
Structured Pruning of Neural Networks With Budget-Aware Regularization | 基于预算感知正则化的神经网络结构剪枝 | |
MBS: Macroblock Scaling for CNN Model Reduction | MBS:基于宏块缩放的CNN模型缩减 | |
Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells | 基于辅助单元的紧凑语义分割模型的快速神经结构搜索 | 用于语义分割的神经网络结构搜索方法(图1) |
Generating 3D Adversarial Point Clouds | 生成三维对抗点云 | |
Partial Order Pruning: For Best Speed/Accuracy Trade-Off in Neural Architecture Search | 偏序剪枝:在神经架构搜索中实现最佳速度/精度权衡 | |
Memory in Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity From Spatiotemporal Dynamics | 记忆中的记忆:从时空动力学中学习高阶非平稳性的预测神经网络 | |
Variational Information Distillation for Knowledge Transfer | 基于变分信息蒸馏的知识转移 | |
You Look Twice: GaterNet for Dynamic Filter Selection in CNNs | 看两遍:用于CNN动态滤波器选择的GaterNet | |
SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360deg Images | SpherePHD:将CNN应用于360°图像的球面多面体表示 | |
ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network | ESPNetv2:一种轻量、节能、通用的卷积神经网络 | |
Assisted Excitation of Activations: A Learning Technique to Improve Object Detectors | 激活的辅助激发:一种改进目标检测器的学习技术 | |
Exploiting Edge Features for Graph Neural Networks | 在图神经网络中利用边特征 | |
Propagation Mechanism for Deep and Wide Neural Networks | 深而宽的神经网络的传播机制 | |
Catastrophic Child's Play: Easy to Perform, Hard to Defend Adversarial Attacks | 灾难性的儿童游戏:易于执行,难以防御对抗性攻击 | |
Embedding Complementary Deep Networks for Image Classification | 基于嵌入互补深度网络的图像分类 | |
Deep Multimodal Clustering for Unsupervised Audiovisual Learning | 基于深度多模态聚类的无监督视听学习 | |
Dense Classification and Implanting for Few-Shot Learning | 用于小样本学习的密集分类和植入技术 | |
Class-Balanced Loss Based on Effective Number of Samples | 基于有效样本数的类平衡损失 | |
Discovering Visual Patterns in Art Collections With Spatially-Consistent Feature Learning | 利用空间一致性特征学习发现艺术藏品中的视觉模式 | |
Min-Max Statistical Alignment for Transfer Learning | 基于最小-最大统计对齐的迁移学习 | |
Spatial-Aware Graph Relation Network for Large-Scale Object Detection | 基于空间感知图形关系网络的大规模目标检测 | |
Deformable ConvNets V2: More Deformable, Better Results | 可变形卷积网络v2:更强的可变形能力,更好的效果 | |
Interaction-And-Aggregation Network for Person Re-Identification | 用于行人重识别的交互和聚合网络 | |
Rare Event Detection Using Disentangled Representation Learning | 基于解耦表示学习的罕见事件检测 | |
Shape Robust Text Detection With Progressive Scale Expansion Network | 基于渐进式尺度扩展网络的形状鲁棒文本检测 | |
Dual Encoding for Zero-Example Video Retrieval | 零示例视频检索的双重编码 | |
MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors | MaxpoolNMS:消除两阶段目标检测器中的NMS瓶颈 | |
Character Region Awareness for Text Detection | 基于字符区域感知的文本检测 | |
Effective Aesthetics Prediction With Multi-Level Spatially Pooled Features | 基于多层次空间池化特征的有效美学预测 | |
Attentive Region Embedding Network for Zero-Shot Learning | 基于注意力区域嵌入网络的零样本学习 | |
Explicit Spatial Encoding for Deep Local Descriptors | 基于显式空间编码的深度局部描述符 | |
Panoptic Segmentation | 全景分割 | |
You Reap What You Sow: Using Videos to Generate High Precision Object Proposals for Weakly-Supervised Object Detection | 种瓜得瓜:使用视频生成高精度目标候选框实现弱监督目标检测 | |
Explore-Exploit Graph Traversal for Image Retrieval | 基于探索-利用图遍历的图像检索 | |
Dissimilarity Coefficient Based Weakly Supervised Object Detection | 基于相异系数的弱监督目标检测 | |
Kernel Transformer Networks for Compact Spherical Convolution | 基于核变换网络的紧凑球形卷积 | |
Object Detection With Location-Aware Deformable Convolution and Backward Attention Filtering | 基于位置感知的可变形卷积和反向注意滤波的目标检测 | |
Variational Prototyping-Encoder: One-Shot Learning With Prototypical Images | 变分原型编码器:基于原型图像的单样本学习 | |
Unsupervised Domain Adaptation Using Feature-Whitening and Consensus Loss | 使用特征白化和共识损失的无监督域适应 | |
FEELVOS: Fast End-To-End Embedding Learning for Video Object Segmentation | FEELVOS:视频对象分割的快速端到端嵌入学习 | |
PartNet: A Recursive Part Decomposition Network for Fine-Grained and Hierarchical Shape Segmentation | PartNet:一种用于细粒度层次形状分割的递归部件分解网络 | |
Learning Multi-Class Segmentations From Single-Class Datasets | 从单类数据集中学习多类分割 | |
Convolutional Recurrent Network for Road Boundary Extraction | 用于道路边界提取的卷积递归网络 | |
DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation | DFANet:面向实时语义分割的深度特征聚合 | 网络结构如图3,分为子网络特征聚合和子阶段特征聚合(如图2),速度较快(100FPS) |
A Cross-Season Correspondence Dataset for Robust Semantic Segmentation | 一种鲁棒语义分割的跨季节对应数据集 | 相同场景,不同季节的数据集,如图2,每对图像创建对应点 |
ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features | ManTra-Net:用于检测和定位具有异常特征的图像伪造的操纵跟踪网 | |
On Zero-Shot Recognition of Generic Objects | 关于一般对象的零样本识别 | |
Explicit Bias Discovery in Visual Question Answering Models | 视觉问答模型中的显式偏差发现 | |
REPAIR: Removing Representation Bias by Dataset Resampling | REPAIR:通过数据集重采样消除表示偏差 | |
Label Efficient Semi-Supervised Learning via Graph Filtering | 基于图过滤的标签高效半监督学习 | |
MVTec AD -- A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection | MVTec AD——一个用于无监督异常检测的综合现实数据集 | |
ABC: A Big CAD Model Dataset for Geometric Deep Learning | 一个用于几何深度学习的大型CAD模型数据集 | |
Tightness-Aware Evaluation Protocol for Scene Text Detection | 基于紧密性感知评估协议的场景文本检测 | |
PointConv: Deep Convolutional Networks on 3D Point Clouds | PointConv:三维点云上的深度卷积网络 | |
Octree Guided CNN With Spherical Kernels for 3D Point Clouds | 用于三维点云的具有球形核的八叉树引导的CNN | |
VITAMIN-E: VIsual Tracking and MappINg With Extremely Dense Feature Points | VITAMIN-E:具有极其密集特征点的视觉跟踪和绘图 | |
Conditional Single-View Shape Generation for Multi-View Stereo Reconstruction | 基于条件单视图形状生成的多视图立体重建 | |
Learning to Adapt for Stereo | 学习自适应的立体匹配 | |
3D Appearance Super-Resolution With Deep Learning | 基于深度学习的三维外观超分辨率 | |
Radial Distortion Triangulation | 径向畸变三角测量 | |
Robust Point Cloud Based Reconstruction of Large-Scale Outdoor Scenes | 基于点云的大规模室外场景鲁棒重建 | |
Minimal Solvers for Mini-Loop Closures in 3D Multi-Scan Alignment | 三维多扫描对齐中用于微环闭合的最小解算器 | |
Volumetric Capture of Humans With a Single RGBD Camera via Semi-Parametric Learning | 通过半参数学习用单台RGBD相机对人体进行体积捕获 | |
Joint Face Detection and Facial Motion Retargeting for Multiple Faces | 联合人脸检测与面部运动重定位实现多人脸 | |
Monocular Depth Estimation Using Relative Depth Maps | 基于相对深度图的单目深度估计 | |
Unsupervised Primitive Discovery for Improved 3D Generative Modeling | 基于无监督基元发现的三维生成建模改进 | |
Learning to Explore Intrinsic Saliency for Stereoscopic Video | 学习探索立体视频的内在显著性 | |
Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on N-Spheres | 球面回归:学习N球体上的视点、曲面法线和三维旋转 | |
Refine and Distill: Exploiting Cycle-Inconsistency and Knowledge Distillation for Unsupervised Monocular Depth Estimation | 精化与蒸馏:利用循环不一致性和知识蒸馏进行无监督单目深度估计 | |
Learning View Priors for Single-View 3D Reconstruction | 基于视图先验学习的单视图三维重建 | |
Geometry-Aware Symmetric Domain Adaptation for Monocular Depth Estimation | 基于几何感知对称域自适应的单目深度估计 | |
Learning Monocular Depth Estimation Infusing Traditional Stereo Knowledge | 注入传统立体知识的单目深度估计学习 | |
SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception | 语义实例辅助的无监督三维几何感知 | |
3D Guided Fine-Grained Face Manipulation | 三维引导的细粒度人脸操作 | |
Neuro-Inspired Eye Tracking With Eye Movement Dynamics | 利用眼球运动动力学进行受神经启发的眼球跟踪 | |
Facial Emotion Distribution Learning by Exploiting Low-Rank Label Correlations Locally | 利用局部低秩标签相关进行面部情绪分布学习 | |
Unsupervised Face Normalization With Extreme Pose and Expression in the Wild | 自然场景下极端姿态和表情的无监督人脸归一化 | |
Semantic Component Decomposition for Face Attribute Manipulation | 基于语义成分分解的人脸属性操作 | |
R3 Adversarial Network for Cross Model Face Recognition | 基于R3对抗网络的跨模型人脸识别 | |
Disentangling Latent Hands for Image Synthesis and Pose Estimation | 分离潜手进行图像合成和姿态估计 | |
Generating Multiple Hypotheses for 3D Human Pose Estimation With Mixture Density Network | 用混合密度网络实现基于多假设生成的三维人体姿态估计 | |
CrossInfoNet: Multi-Task Information Sharing Based Hand Pose Estimation | CrossInfoNet:基于多任务信息共享的手部姿态估计 | |
P2SGrad: Refined Gradients for Optimizing Deep Face Models | P2SGrad:基于梯度精化的深度人脸模型优化 | |
Action Recognition From Single Timestamp Supervision in Untrimmed Videos | 未剪辑视频中单时间戳监督的动作识别 | |
Time-Conditioned Action Anticipation in One Shot | 时间条件下的一次性动作预测 | |
Dance With Flow: Two-In-One Stream Action Detection | 与流共舞:二合一流动作检测 | |
Representation Flow for Action Recognition | 基于表示流的动作识别 | |
LSTA: Long Short-Term Attention for Egocentric Action Recognition | LSTA:基于长短期注意力的第一人称视角动作识别 | |
Learning Actor Relation Graphs for Group Activity Recognition | 基于参与者关系图学习的群体活动识别 | |
A Structured Model for Action Detection | 一种结构化的动作检测模型 | |
Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition | 广义零样本动作识别的分布外检测 | |
Object Discovery in Videos as Foreground Motion Clustering | 作为前景运动聚类的视频中的对象发现 | |
Towards Natural and Accurate Future Motion Prediction of Humans and Animals | 人类和动物的自然和准确的未来运动预测 | |
Automatic Face Aging in Videos via Deep Reinforcement Learning | 通过深度强化学习实现视频中的自动面部老化 | |
Multi-Adversarial Discriminative Deep Domain Generalization for Face Presentation Attack Detection | 面向人脸呈现攻击检测的多对抗判别式深度域泛化 | |
A Content Transformation Block for Image Style Transfer | 基于内容转换块的图像样式转换 | |
BeautyGlow: On-Demand Makeup Transfer Framework With Reversible Generative Network | BeautyGlow:基于可逆生成网络的按需妆容迁移框架 | |
Style Transfer by Relaxed Optimal Transport and Self-Similarity | 基于松弛最优传输和自相似的风格转换 | |
Inserting Videos Into Videos | 将视频插入视频 | |
Learning Image and Video Compression Through Spatial-Temporal Energy Compaction | 基于时空能量压缩的图像和视频压缩学习 | |
Event-Based High Dynamic Range Image and Very High Frame Rate Video Generation Using Conditional Generative Adversarial Networks | 利用条件GAN实现基于事件的高动态范围图像和高帧速率视频生成 | |
Enhancing TripleGAN for Semi-Supervised Conditional Instance Synthesis and Classification | 基于增强TripleGAN的半监督条件实例合成与分类 | |
Capture, Learning, and Synthesis of 3D Speaking Styles | 捕捉、学习和合成3D说话风格 | |
Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds Using Convolutional Neural Networks | Nesti-Net:用卷积神经网络对非结构化三维点云进行法向估计 | |
Ray-Space Projection Model for Light Field Camera | 光场相机的光线空间投影模型 | |
Deep Geometric Prior for Surface Reconstruction | 基于深度几何先验的表面重建 | |
Analysis of Feature Visibility in Non-Line-Of-Sight Measurements | 非视线测量中特征可见度的分析 | |
Hyperspectral Imaging With Random Printed Mask | 基于随机打印掩模的高光谱成像 | |
All-Weather Deep Outdoor Lighting Estimation | 全天候深度室外照明估算 | |
A Variational EM Framework With Adaptive Edge Selection for Blind Motion Deblurring | 基于自适应边缘选择的变分EM框架实现盲运动去模糊 | |
Viewport Proposal CNN for 360° Video Quality Assessment | 视区建议CNN进行360°视频质量评估 | |
Beyond Gradient Descent for Regularized Segmentation Losses | 超越梯度下降实现正则化分割损失 | |
MAGSAC: Marginalizing Sample Consensus | MAGSAC:将样本共识边缘化 | |
Understanding and Visualizing Deep Visual Saliency Models | 深度视觉显著性模型的理解和可视化 | |
Divergence Prior and Vessel-Tree Reconstruction | 散度先验与血管树重建 | |
Unsupervised Domain-Specific Deblurring via Disentangled Representations | 通过分离表示的无监督特定域去模糊 | |
Douglas-Rachford Networks: Learning Both the Image Prior and Data Fidelity Terms for Blind Image Deconvolution | Douglas-Rachford网:基于图像先验和数据保真度学习的盲图像反卷积 | |
Speed Invariant Time Surface for Learning to Detect Corner Points With Event-Based Cameras | 利用速度不变时间曲面实现基于事件摄像机的角点检测 | |
Training Deep Learning Based Image Denoisers From Undersampled Measurements Without Ground Truth and Without Image Prior | 在没有真值和图像先验的情况下,利用欠采样测量训练基于深度学习的图像去噪器 | D-AMP uses a denoiser to help recover compressed-sensing (CS) images (Algo. 1). LD-AMP replaces the conventional denoiser (e.g., BM3D) with a deep-learning denoiser (DnCNN), hence the name Learned D-AMP, but it requires ground-truth (GT) images. The MC-Stein unbiased estimator replaces the true MSE with an unbiased estimate, so no GT is needed. This paper combines LD-AMP with the MC-Stein unbiased estimator to obtain a deep-learning-based CS image recovery algorithm that needs no GT (Algo. 2). |
A Variational Pan-Sharpening With Local Gradient Constraints | 基于局部梯度约束的变分全色锐化 | |
F-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning | F-VAEGAN-D2:一个用于任意样本学习的特征生成框架 | |
Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation | 基于切片Wasserstein差异的无监督域适应 | |
Graph Attention Convolution for Point Cloud Semantic Segmentation | 基于图形注意卷积的点云语义分割 | |
Normalized Diversification | 规范化多元化 | |
Learning to Localize Through Compressed Binary Maps | 学习通过压缩二值地图进行定位 | |
A Parametric Top-View Representation of Complex Road Scenes | 复杂道路场景的参数化顶视图表示 | |
Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction | 基于视频剪辑顺序预测的自监督时空学习 | |
Superquadrics Revisited: Learning 3D Shape Parsing Beyond Cuboids | 超二次曲面再探讨:学习长方体以外的三维形状解析 | |
Unsupervised Disentangling of Appearance and Geometry by Deformable Generator Network | 利用变形生成网络实现外观和几何的无监督分离 | |
Self-Supervised Representation Learning by Rotation Feature Decoupling | 基于旋转特征解耦的自监督表示学习 | |
Weakly Supervised Deep Image Hashing Through Tag Embeddings | 通过标记嵌入的弱监督深度图像散列 | |
Improved Road Connectivity by Joint Learning of Orientation and Segmentation | 通过方向和分割的联合学习实现道路连通性的改善 | |
Deep Supervised Cross-Modal Retrieval | 深度监督跨模式检索 | |
A Theoretically Sound Upper Bound on the Triplet Loss for Improving the Efficiency of Deep Distance Metric Learning | 三重损失理论上合理的上界对提高深度距离度量学习效率的作用 | |
Data Representation and Learning With Graph Diffusion-Embedding Networks | 基于图扩散-嵌入网络的数据表示与学习 | |
Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph | 基于门控时空能量图的视频关系推理 | |
Image-Question-Answer Synergistic Network for Visual Dialog | 基于图像问答协同网络的视觉对话 | |
Not All Frames Are Equal: Weakly-Supervised Video Grounding With Contextual Similarity and Visual Clustering Losses | 并非所有帧都相同:基于上下文相似和视觉聚类损失的弱监督视频背景估计 | background estimation problem for videos captured by moving cameras, referred to as video grounding https://team.inria.fr/perception/research/cvvt2013/ |
Inverse Cooking: Recipe Generation From Food Images | 逆向烹饪:从食物图像中生成食谱 | |
Adversarial Semantic Alignment for Improved Image Captions | 基于对抗性语义对齐的图像标注改进 | |
Answer Them All! Toward Universal Visual Question Answering Models | 全部回答!面向通用视觉问答模型 | |
Unsupervised Multi-Modal Neural Machine Translation | 无监督多模神经机器翻译 | |
Multi-Task Learning of Hierarchical Vision-Language Representation | 层次视觉语言表示的多任务学习 | |
Cross-Modal Self-Attention Network for Referring Image Segmentation | 用于参考图像分割的跨模态自注意网络 | |
DuDoNet: Dual Domain Network for CT Metal Artifact Reduction | DuDoNet:基于双域网络的CT金属伪影消除 | |
Fast Spatio-Temporal Residual Network for Video Super-Resolution | 基于快速时空残差网络的视频超分辨率 | |
Complete the Look: Scene-Based Complementary Product Recommendation | 完成外观:基于场景的补充产品推荐 | |
Selective Sensor Fusion for Neural Visual-Inertial Odometry | 基于选择性传感器融合的神经视觉惯性里程计 | |
Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes | 不止一次看:任意形状文本的精确检测器 | |
Learning Binary Code for Personalized Fashion Recommendation | 基于二进制代码学习的个性化时尚推荐 | |
Attention Based Glaucoma Detection: A Large-Scale Database and CNN Model | 基于注意的青光眼检测:大型数据库和CNN模型 | |
Privacy Protection in Street-View Panoramas Using Depth and Multi-View Imagery | 使用深度和多视图图像的街景全景中的隐私保护 | |
Grounding Human-To-Vehicle Advice for Self-Driving Vehicles | 自动驾驶车辆中人对车建议的关联定位(grounding) | |
Multi-Step Prediction of Occupancy Grid Maps With Recurrent Neural Networks | 基于递归神经网络的占用率网格图多步预测 | |
Connecting Touch and Vision via Cross-Modal Prediction | 通过跨模式预测连接触摸和视觉 | |
X2CT-GAN: Reconstructing CT From Biplanar X-Rays With Generative Adversarial Networks | X2CT-GAN:用GAN从双平面X射线重建CT | |
Practical Full Resolution Learned Lossless Image Compression | 实用的全分辨率学习无损图像压缩 | |
Image-To-Image Translation via Group-Wise Deep Whitening-And-Coloring Transformation | 基于分组深度白化与着色变换的图像到图像翻译 | |
Max-Sliced Wasserstein Distance and Its Use for GANs | 最大切片Wasserstein距离及其在GAN上的应用 | |
Meta-Learning With Differentiable Convex Optimization | 基于可微凸优化的元学习 | |
RePr: Improved Training of Convolutional Filters | 卷积滤波器的改进训练 | |
Tangent-Normal Adversarial Regularization for Semi-Supervised Learning | 用于半监督学习的切向-法向对抗正则化 | |
Auto-Encoding Scene Graphs for Image Captioning | 基于自编码场景图的图像字幕 | |
Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech | 词性引导下快速、多样、准确的图像字幕 | |
Attention Branch Network: Learning of Attention Mechanism for Visual Explanation | 注意力分支网络:基于注意力机制学习的视觉解释 | |
Cascaded Projection: End-To-End Network Compression and Acceleration | 级联投影:端到端网络压缩和加速 | |
DeepCaps: Going Deeper With Capsule Networks | DeepCaps:胶囊网络的深入发展 | |
FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search | 基于可微神经结构搜索的硬件感知高效ConvNet设计 | |
APDrawingGAN: Generating Artistic Portrait Drawings From Face Photos With Hierarchical GANs | APDrawingGAN:利用层级GAN实现由面部照片生成艺术肖像画 | |
Constrained Generative Adversarial Networks for Interactive Image Generation | 用于交互式图像生成的约束GAN | |
WarpGAN: Automatic Caricature Generation | WarpGAN:自动漫画生成 | |
Explainability Methods for Graph Convolutional Neural Networks | 图卷积神经网络的可解释性方法 | |
A Generative Adversarial Density Estimator | 一种生成对抗密度估计 | |
SoDeep: A Sorting Deep Net to Learn Ranking Loss Surrogates | SoDeep:一个排序深度网,用于学习排名损失代理 | |
High-Quality Face Capture Using Anatomical Muscles | 使用解剖肌肉进行高质量面部捕捉 | |
FML: Face Model Learning From Videos | 从视频中学习面部模型 | |
AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations | AdaCos:自适应缩放余弦logits以有效学习深度人脸表示 | |
3D Hand Shape and Pose Estimation From a Single RGB Image | 单个RGB图像的三维手形和姿势估计 | |
3D Hand Shape and Pose From Images in the Wild | 从野外图像中获取的三维手形和姿势 | |
Self-Supervised 3D Hand Pose Estimation Through Training by Fitting | 基于拟合训练的自监督三维手部姿态估计 | |
CrowdPose: Efficient Crowded Scenes Pose Estimation and a New Benchmark | CrowdPose:有效的拥挤场景姿态估计和新的基准 | |
Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in a Triadic Interaction | 面向社会人工智能:三元交互中的非语言社会信号预测 | |
HoloPose: Holistic 3D Human Reconstruction In-The-Wild | HoloPose:野外整体三维人体重建 | |
Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation | 基于几何感知表示的三维人体姿态估计的弱监督发现 | |
In the Wild Human Pose Estimation Using Explicit 2D Features and Intermediate 3D Representations | 基于显式二维特征和中间三维表示的自然场景人体姿态估计 | |
Slim DensePose: Thrifty Learning From Sparse Annotations and Motion Cues | Slim DensePose:从稀疏的注释和运动提示中节俭地学习 | |
Self-Supervised Representation Learning From Videos for Facial Action Unit Detection | 基于视频自监督表示学习的面部动作单元检测 | |
Combining 3D Morphable Models: A Large Scale Face-And-Head Model | 组合三维可变形模型:大型面和头部模型 | |
Boosting Local Shape Matching for Dense 3D Face Correspondence | 增强局部形状匹配实现密集三维人脸对应 | |
Unsupervised Part-Based Disentangling of Object Shape and Appearance | 无监督基于部分的物体形状和外观分离 | |
Monocular Total Capture: Posing Face, Body, and Hands in the Wild | 单眼全捕获:在野外摆出面部、身体和手的姿势 | |
Expressive Body Capture: 3D Hands, Face, and Body From a Single Image | 富有表现力的身体捕捉:来自单个图像的3D手、脸和身体 | |
Neural RGB→D Sensing: Depth and Uncertainty From a Video Camera | 神经RGB→D感知:来自摄像机的深度和不确定性 | |
DAVANet: Stereo Deblurring With View Aggregation | DAVANet:基于视图聚合的立体去模糊 | |
DVC: An End-To-End Deep Video Compression Framework | 端到端深度视频压缩框架 | |
SOSNet: Second Order Similarity Regularization for Local Descriptor Learning | 基于二阶相似正则化的局部描述符学习 | |
"Double-DIP": Unsupervised Image Decomposition via Coupled Deep-Image-Priors | “Double-DIP”:通过耦合深图像先验进行无监督图像分解 | |
Unprocessing Images for Learned Raw Denoising | 对图像进行逆处理以学习RAW去噪 | |
Residual Networks for Light Field Image Super-Resolution | 基于残差网络的光场图像超分辨率 | |
Modulating Image Restoration With Continual Levels via Adaptive Feature Modification Layers | 基于自适应特征修正层的连续水平调制图像恢复 | |
Second-Order Attention Network for Single Image Super-Resolution | 基于二阶注意网络的单图像超分辨率 | |
Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations | 魔鬼在边缘:从嘈杂的注释中学习语义边界 | |
Path-Invariant Map Networks | 路径不变映射网络 | |
FilterReg: Robust and Efficient Probabilistic Point-Set Registration Using Gaussian Filter and Twist Parameterization | FilterReg:基于高斯滤波和旋量(twist)参数化的鲁棒高效概率点集配准 | |
Probabilistic Permutation Synchronization Using the Riemannian Structure of the Birkhoff Polytope | 基于Birkhoff多面体黎曼结构的概率置换同步 | |
Lifting Vectorial Variational Problems: A Natural Formulation Based on Geometric Measure Theory and Discrete Exterior Calculus | 提升向量变分问题:基于几何测度理论和离散外微分的自然表述 | |
A Sufficient Condition for Convergences of Adam and RMSProp | Adam与RMSProp收敛的一个充分条件 | |
Guaranteed Matrix Completion Under Multiple Linear Transformations | 多重线性变换下有保证的矩阵补全 | |
MAP Inference via Block-Coordinate Frank-Wolfe Algorithm | 基于块坐标Frank-Wolfe算法的最大后验推断 | |
A Convex Relaxation for Multi-Graph Matching | 基于凸松弛的多图匹配 | |
Pixel-Adaptive Convolutional Neural Networks | 像素自适应卷积神经网络 | |
Single-Frame Regularization for Temporally Stable CNNs | 基于单帧正则化的时域稳定CNN | |
An End-To-End Network for Generating Social Relationship Graphs | 用于社会关系图生成的端到端网络 | |
Meta-Learning Convolutional Neural Architectures for Multi-Target Concrete Defect Classification With the COncrete DEfect BRidge IMage Dataset | 元学习卷积神经结构实现基于混凝土缺陷桥图像集的多目标混凝土缺陷分类 | |
ECC: Platform-Independent Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model | 基于双线性回归模型的平台独立能量约束深度神经网络压缩 | |
SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization | SeerNet:通过低比特量化预测卷积神经网络特征图稀疏性 | |
Defending Against Adversarial Attacks by Randomized Diversification | 通过随机多样化防御对抗性攻击 | |
Rob-GAN: Generator, Discriminator, and Adversarial Attacker | Rob-GAN:生成器、判别器和对抗攻击者 | |
Learning From Noisy Labels by Regularized Estimation of Annotator Confusion | 通过标注者混淆的正则化估计从噪声标签中学习 | |
Task-Free Continual Learning | 无任务连续学习 | |
Importance Estimation for Neural Network Pruning | 基于重要性估计的神经网络剪枝 | |
Detecting Overfitting of Deep Generative Networks via Latent Recovery | 通过潜在恢复检测深度生成网络的过拟合 | |
Coloring With Limited Data: Few-Shot Colorization via Memory Augmented Networks | 有限数据着色:通过记忆增强网络实现小样本着色 | |
Characterizing and Avoiding Negative Transfer | 表征和避免负迁移 | |
Building Efficient Deep Neural Networks With Unitary Group Convolutions | 利用酉群卷积构造高效的深度神经网络 | |
Semi-Supervised Learning With Graph Learning-Convolutional Networks | 基于图学习卷积网络的半监督学习 | |
Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning | 学习记忆:基于突触可塑性驱动框架的持续学习 | |
AIRD: Adversarial Learning Framework for Image Repurposing Detection | AIRD:图像再利用检测的对抗性学习框架 | |
A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations | 基于核化流形映射的对抗性扰动影响减少 | |
Trust Region Based Adversarial Attack on Neural Networks | 基于信任域的神经网络对抗攻击 | |
PEPSI : Fast Image Inpainting With Parallel Decoding Network | PEPSI:基于并行解码网络的快速图像修复 | |
Model-Blind Video Denoising via Frame-To-Frame Training | 基于帧对帧训练的盲模型视频去噪 | |
End-To-End Efficient Representation Learning via Cascading Combinatorial Optimization | 基于级联组合优化的端到端高效表示学习 | |
Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation | 用于三维室内导航的仿真-真实联合强化迁移 | |
ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation | ChamNet:通过平台感知模型自适应实现高效网络设计 | |
Regularizing Activation Distribution for Training Binarized Deep Networks | 基于正则化激活分布的二值化深度网络训练 | |
Robustness Verification of Classification Deep Neural Networks via Linear Programming | 基于线性规划的分类深度神经网络鲁棒性验证 | |
Additive Adversarial Learning for Unbiased Authentication | 无偏认证的加性对抗学习 | |
Simultaneously Optimizing Weight and Quantizer of Ternary Neural Network Using Truncated Gaussian Approximation | 用截断高斯近似同时优化三值神经网络的权值和量化器 | |
Adversarial Defense by Stratified Convolutional Sparse Coding | 分层卷积稀疏编码的对抗性防御 | |
Exploring Object Relation in Mean Teacher for Cross-Domain Detection | 探索均值教师(Mean Teacher)中的目标关系以实现跨域检测 | |
Hierarchical Disentanglement of Discriminative Latent Features for Zero-Shot Learning | 判别性潜在特征的层次解耦实现零样本学习 | |
R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network | R2GAN:基于生成对抗网络的跨模态食谱检索 | |
Rethinking Knowledge Graph Propagation for Zero-Shot Learning | 重新思考零样本学习中的知识图谱传播 | |
Learning to Learn Image Classifiers With Visual Analogy | 基于视觉类比的图像分类器学习 | |
Where's Wally Now? Deep Generative and Discriminative Embeddings for Novelty Detection | Wally现在在哪里?基于深度生成和判别嵌入的新颖性检测 | |
Weakly Supervised Image Classification Through Noise Regularization | 基于噪声正则化的弱监督图像分类 | |
Data-Driven Neuron Allocation for Scale Aggregation Networks | 基于数据驱动神经元分配的尺度聚合网络 | |
Graphical Contrastive Losses for Scene Graph Parsing | 用于场景图分析的图形对比损失 | |
Deep Transfer Learning for Multiple Class Novelty Detection | 基于深度迁移学习的多类别新颖性检测 | |
QATM: Quality-Aware Template Matching for Deep Learning | QATM:基于质量感知模板匹配的深度学习 | |
Retrieval-Augmented Convolutional Neural Networks Against Adversarial Examples | 抵御对抗样本的检索增强卷积神经网络 | |
Learning Cross-Modal Embeddings With Adversarial Networks for Cooking Recipes and Food Images | 面向烹饪食谱和食物图像的基于对抗网络的跨模态嵌入学习 | |
FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network | FastDraw:通过采用顺序预测网络解决车道检测的长尾问题 | |
Weakly Supervised Video Moment Retrieval From Text Queries | 基于文本查询的弱监督视频片段检索 | |
Content-Aware Multi-Level Guidance for Interactive Instance Segmentation | 基于内容感知多级指导的交互式实例分割 | |
Greedy Structure Learning of Hierarchical Compositional Models | 层次组合模型的贪婪结构学习 | |
Interactive Full Image Segmentation by Considering All Regions Jointly | 综合考虑所有区域的交互式全图像分割 | |
Learning Active Contour Models for Medical Image Segmentation | 医学图像分割中主动轮廓模型的学习 | |
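Customizable Architecture Search for Semantic Segmentation | 基于可定制体系结构搜索的语义分割 | Emphasizes customizability: the user supplies constraints, and the search finds a lightweight network architecture that satisfies them. The constraints are expressed through a custom loss function. |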
Local Features and Visual Words Emerge in Activations | 激活中局部特征和视觉词汇的出现 | |
Hyperspectral Image Super-Resolution With Optimized RGB Guidance | 基于优化RGB引导的高光谱图像超分辨率 | |
Adaptive Confidence Smoothing for Generalized Zero-Shot Learning | 基于自适应置信平滑的广义零样本学习 | |
PMS-Net: Robust Haze Removal Based on Patch Map for Single Images | PMS网络:基于Patch图的鲁棒单图像雾去除 | |
Deep Spherical Quantization for Image Search | 基于深度球面量化的图像搜索 | |
Large-Scale Interactive Object Segmentation With Human Annotators | 借助人工标注者的大规模交互式对象分割 | |
A Poisson-Gaussian Denoising Dataset With Real Fluorescence Microscopy Images | 基于真实荧光显微镜图像的泊松高斯去噪数据集 | |
Task Agnostic Meta-Learning for Few-Shot Learning | 基于任务不可知元学习的小样本学习 | |
Progressive Ensemble Networks for Zero-Shot Recognition | 基于渐进集成网络的零样本识别 | |
Direct Object Recognition Without Line-Of-Sight Using Optical Coherence | 利用光学相干直接识别无视线物体 | |
Atlas of Digital Pathology: A Generalized Hierarchical Histological Tissue Type-Annotated Database for Deep Learning | 数字病理学图集:深度学习的广义层次组织类型注释数据库 | |
Perturbation Analysis of the 8-Point Algorithm: A Case Study for Wide FoV Cameras | 8点算法的扰动分析:宽FoV摄像机的一个实例研究 | |
Robustness of 3D Deep Learning in an Adversarial Setting | 对抗环境下三维深度学习的鲁棒性 | |
SceneCode: Monocular Dense Semantic Reconstruction Using Learned Encoded Scene Representations | SceneCode:基于学习编码场景表示的单目密集语义重建 | |
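StereoDRNet: Dilated Residual StereoNet | StereoDRNet:扩张的残差立体网 | The pipeline is shown in Fig. 2. The paper improves several submodules (feature extraction, cost filtering, regression, refinement), mainly: 1. DR: dilated convolutions and residual connections in cost filtering; 2. Vortex pooling in feature extraction; 3. the refinement stage considers geometric error (Eq. 5) in addition to photometric error (Eq. 4). See the paper's Contributions for details. |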
The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation | 球面对齐:基于全局最优球面混合对齐的相机姿态估计 | |
Learning Joint Reconstruction of Hands and Manipulated Objects | 手和被操纵物体的联合重建学习 | |
Deep Single Image Camera Calibration With Radial Distortion | 具有径向畸变的深度单像摄像机标定 | |
CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth | CAM-Convs:基于摄像机感知多尺度卷积的单视图深度 | |
Translate-to-Recognize Networks for RGB-D Scene Recognition | 基于转换到识别网络的RGB-D场景识别 | |
Re-Identification Supervised Texture Generation | 基于重新识别监督的纹理生成 | |
Action4D: Online Action Recognition in the Crowd and Clutter | Action4D:人群和混乱中的在线动作识别 | |
Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction | 利用精确建议和形状重建的单目三维目标检测 | |
Attribute-Aware Face Aging With Wavelet-Based Generative Adversarial Networks | 利用基于小波的GAN实现属性感知人脸老化 | |
Noise-Tolerant Paradigm for Training Face Recognition CNNs | 利用抗噪声范式实现人脸识别CNN训练 | |
Low-Rank Laplacian-Uniform Mixed Model for Robust Face Recognition | 用于稳健人脸识别的低秩拉普拉斯-均匀混合模型 | |
Generalizing Eye Tracking With Bayesian Adversarial Learning | 基于贝叶斯对抗学习的眼动跟踪泛化 | |
Local Relationship Learning With Person-Specific Shape Regularization for Facial Action Unit Detection | 基于特定人形状正则化的局部关系学习实现人脸动作单元检测 | |
Point-To-Pose Voting Based Hand Pose Estimation Using Residual Permutation Equivariant Layer | 利用残差置换等变层实现基于点到姿态投票的手部姿态估计 | |
Improving Few-Shot User-Specific Gaze Adaptation via Gaze Redirection Synthesis | 通过注视重定向合成改进小样本用户特定的注视适应 | |
AdaptiveFace: Adaptive Margin and Sampling for Face Recognition | AdaptiveFace:用于人脸识别的自适应间隔和采样 | |
Disentangled Representation Learning for 3D Face Shape | 三维人脸形状的分离表示学习 | |
LBS Autoencoder: Self-Supervised Fitting of Articulated Meshes to Point Clouds | LBS自编码器:铰接网格到点云的自监督拟合 | |
PifPaf: Composite Fields for Human Pose Estimation | 基于复合场的人体姿态估计 | |
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection | 基于过渡感知上下文网络的时空行为检测 | |
Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos | 基于骨架轨迹规则学习的视频异常检测 | |
Local Temporal Bilinear Pooling for Fine-Grained Action Parsing | 用于细粒度动作分析的局部时间双线性池化 | |
Improving Action Localization by Progressive Cross-Stream Cooperation | 通过渐进式跨流合作实现行动定位的改进 | |
Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition | 双流自适应图卷积网络实现基于骨架的动作识别 | |
A Neural Network Based on SPD Manifold Learning for Skeleton-Based Hand Gesture Recognition | 基于神经网络的SPD流形学习实现基于骨架的手势识别 | |
Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition | 大规模弱监督预训练实现视频动作识别 | |
Learning Spatio-Temporal Representation With Local and Global Diffusion | 利用局部和全局扩散实现时空表示学习 | |
Unsupervised Learning of Action Classes With Continuous Temporal Embedding | 利用连续时间嵌入实现动作类别的无监督学习 | |
Double Nuclear Norm Based Low Rank Representation on Grassmann Manifolds for Clustering | Grassmann流形上基于双核范数的低秩表示的聚类 | |
SR-LSTM: State Refinement for LSTM Towards Pedestrian Trajectory Prediction | SR-LSTM:基于LSTM状态精化的行人轨迹预测 | |
Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes | 基于无监督深度极线流的静止或动态场景 | |
An Efficient Schmidt-EKF for 3D Visual-Inertial SLAM | 一种用于三维视觉惯性SLAM的高效Schmidt-EKF | |
A Neural Temporal Model for Human Motion Prediction | 人类运动预测的神经时间模型 | |
Multi-Agent Tensor Fusion for Contextual Trajectory Prediction | 上下文轨迹预测的多智能体张量融合 | |
Coordinate-Based Texture Inpainting for Pose-Guided Human Image Generation | 基于坐标的纹理修补实现姿态引导的人体图像生成 | |
On Stabilizing Generative Adversarial Training With Noise | 通过噪声实现生成对抗训练稳定 | |
Self-Supervised GANs via Auxiliary Rotation Loss | 基于辅助旋转损失的自监督GAN | |
Texture Mixer: A Network for Controllable Synthesis and Interpolation of Texture | 纹理混合:一种纹理的可控合成和插值网络 | |
Object-Driven Text-To-Image Synthesis via Adversarial Training | 通过对抗性训练实现对象驱动的文本-图像合成 | |
Zoom-In-To-Check: Boosting Video Interpolation via Instance-Level Discrimination | 放大检查:通过实例级判别增强视频插值 | |
Disentangling Latent Space for VAE by Label Relevant/Irrelevant Dimensions | 通过标签相关/不相关维度分离出VAE的潜在空间 | |
Spectral Reconstruction From Dispersive Blur: A Novel Light Efficient Spectral Imager | 色散模糊的光谱重建:一种新型的光效光谱成像仪 | |
Quasi-Unsupervised Color Constancy | 准无监督颜色恒常性 | |
Deep Defocus Map Estimation Using Domain Adaptation | 基于域自适应的深度失焦图估计 | |
Using Unknown Occluders to Recover Hidden Scenes | 使用未知遮挡物恢复隐藏场景 | |
Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation | 竞争协作:深度、相机运动、光流和运动分割的联合无监督学习 | |
Learning Parallax Attention for Stereo Image Super-Resolution | 基于视差注意学习的立体图像超分辨率 | |
Knowing When to Stop: Evaluation and Verification of Conformity to Output-Size Specifications | 知道何时停止:符合输出尺寸规格的评估和验证 | |
Spatial Attentive Single-Image Deraining With a High Quality Real Rain Dataset | 用高质量的真实雨景数据集实现基于空间注意的单图像去雨 | |
Focus Is All You Need: Loss Functions for Event-Based Vision | 专注是你所需要的:基于事件的视觉的损失函数 | |
Scalable Convolutional Neural Network for Image Compressed Sensing | 基于可伸缩卷积神经网络的图像压缩感知 | |
Event Cameras, Contrast Maximization and Reward Functions: An Analysis | 事件相机、对比度最大化和奖励函数:分析 | |
Convolutional Neural Networks Can Be Deceived by Visual Illusions | 卷积神经网络可能被视觉错觉欺骗 | |
PDE Acceleration for Active Contours | 基于PDE加速的主动轮廓 | |
Dichromatic Model Based Temporal Color Constancy for AC Light Sources | 基于双色模型的AC光源时域颜色恒常性 | |
Semantic Attribute Matching Networks | 语义属性匹配网络 | |
Skin-Based Identification From Multispectral Image Data Using CNNs | 利用CNN实现多光谱图像基于皮肤的识别 | |
Large-Scale Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature for Deep Convolutional Neural Networks | 深度卷积神经网络的Kronecker因子近似曲率大规模分布二阶优化 | |
Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments | 将人类置于场景中:在3D室内环境中学习可供性(affordance) | |
PIEs: Pose Invariant Embeddings | PIEs:姿势不变嵌入 | |
Representation Similarity Analysis for Efficient Task Taxonomy & Transfer Learning | 高效任务分类与迁移学习的表示相似性分析 | |
Object Counting and Instance Segmentation With Image-Level Supervision | 基于图像级监督的目标计数与实例分割 | |
Variational Autoencoders Pursue PCA Directions (by Accident) | 变分自编码器追踪PCA方向(意外) | |
A Relation-Augmented Fully Convolutional Network for Semantic Segmentation in Aerial Scenes | 基于关系增强全卷积网络的航空场景语义分割 | |
Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping | 时间变换网络:不变和判别时间扭曲的联合学习 | |
PCAN: 3D Attention Map Learning Using Contextual Information for Point Cloud Based Retrieval | 基于上下文信息的三维注意力图学习实现基于点云的检索 | |
Depth Coefficients for Depth Completion | 基于深度系数的深度补全 | |
Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection | 多样化与匹配:一种面向对象检测的域自适应表示学习范式 | |
Good News, Everyone! Context Driven Entity-Aware Captioning for News Images | 好消息,各位!新闻图像的上下文驱动实体感知标注 | |
Multi-Level Multimodal Common Semantic Space for Image-Phrase Grounding | 用于图像-短语定位的多级多模态公共语义空间 | |
Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning | 利用时空动态和语义属性丰富的视觉编码实现视频字幕 | |
Pointing Novel Objects in Image Captioning | 在图像字幕中指向新对象 | |
Informative Object Annotations: Tell Me Something I Don't Know | 信息对象注释:告诉我一些我不知道的事情 | |
Engaging Image Captioning via Personality | 通过个性实现引人入胜的图像字幕 | |
Vision-Based Navigation With Language-Based Assistance via Imitation Learning With Indirect Intervention | 通过间接干预的模仿学习实现基于语言辅助的视觉导航 | |
TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments | TOUCHDOWN:视觉街道环境中的自然语言导航和空间推理 | |
A Simple Baseline for Audio-Visual Scene-Aware Dialog | 音视频场景感知对话的简单基线 | |
End-To-End Learned Random Walker for Seeded Image Segmentation | 用于带种子图像分割的端到端随机游走学习 | |
Efficient Neural Network Compression | 有效的神经网络压缩 | |
Cascaded Generative and Discriminative Learning for Microcalcification Detection in Breast Mammograms | 乳腺X光片微钙化检测的级联生成与判别学习 | |
C3AE: Exploring the Limits of Compact Model for Age Estimation | C3AE:探索用于年龄估计的紧致模型的极限 | |
Adaptive Weighting Multi-Field-Of-View CNN for Semantic Segmentation in Pathology | 自适应加权多视场CNN在病理学语义分割中的应用 | |
In Defense of Pre-Trained ImageNet Architectures for Real-Time Semantic Segmentation of Road-Driving Images | 为预训练ImageNet结构用于道路驾驶图像实时语义分割辩护 | |
Context-Aware Visual Compatibility Prediction | 上下文感知视觉兼容性预测 | |
Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks | 通过Sim-to-Sim实现Sim-to-Real:利用随机到规范适应网络实现数据高效的机器人抓取 | |
Multiview 2D/3D Rigid Registration via a Point-Of-Interest Network for Tracking and Triangulation | 基于兴趣点网络的多视图二维/三维刚性配准实现跟踪和三角测量 | |
Context-Aware Spatio-Recurrent Curvilinear Structure Segmentation | 上下文感知的空间-递归曲线结构分割 | |
An Alternative Deep Feature Approach to Line Level Keyword Spotting | 行级关键词定位的一种替代深度特征方法 | |
Dynamics Are Important for the Recognition of Equine Pain in Video | 动态信息对识别视频中马的疼痛很重要 | |
LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving | LaserNet:一种用于自动驾驶的高效概率三维目标检测器 | |
Machine Vision Guided 3D Medical Image Compression for Efficient Transmission and Accurate Segmentation in the Clouds | 机器视觉引导的3D医学图像压缩,实现云中高效传输和精确分割 | |
PointPillars: Fast Encoders for Object Detection From Point Clouds | 点柱:基于快速编码器的点云目标检测 | |
Motion Estimation of Non-Holonomic Ground Vehicles From a Single Feature Correspondence Measured Over N Views | 利用N个视图上单特征对应实现非完整地面车辆的运动估计 | |
From Coarse to Fine: Robust Hierarchical Localization at Large Scale | 从粗到细:大规模的鲁棒层次定位 | |
Large Scale High-Resolution Land Cover Mapping With Multi-Resolution Data | 利用多分辨率数据进行大尺度高分辨率土地覆盖图绘制 | |
Leveraging Heterogeneous Auxiliary Tasks to Assist Crowd Counting | 利用异构辅助任务来辅助人群计数 | |