Machine Perception
Research in machine perception tackles the hard problems of understanding images, sounds, music and video. In recent years, our computers have become much better at such tasks, enabling a variety of new applications such as content-based search in Google Photos and Image Search, natural handwriting interfaces for Android, optical character recognition for Google Drive documents, and recommendation systems that understand music and YouTube videos. Our approach is driven by algorithms that benefit from processing very large, partially-labeled datasets using parallel computing clusters. A good example is our recent work on object recognition using a novel deep convolutional neural network architecture known as Inception, which achieves state-of-the-art results on academic benchmarks and allows users to easily search through their large collections in Google Photos. The ability to mine meaningful information from multimedia is broadly applied throughout Google.
454 Publications
-
A Neural Representation of Sketch Drawings
ICLR 2018
-
Aperture Supervision for Monocular Depth Estimation
Pratul Srinivasan, Rahul Garg, Neal Wadhwa, Ren Ng, Jonathan T. Barron
CVPR (2018) (to appear)
-
Burst Denoising with Kernel Prediction Networks
Ben Mildenhall, Jonathan T. Barron, Jiawen Chen, Dillon Sharlet, Ren Ng, Rob Carroll
CVPR (2018) (to appear)
-
COCO-Stuff: Thing and Stuff Classes in Context
Holger Caesar, Jasper Uijlings, Vittorio Ferrari
CVPR (2018) (to appear)
-
Cross-View Training for Semi-Supervised Learning
Kevin Clark, Quoc V. Le, Thang Luong
ICLR (2018) (to appear)
-
Frame-Recurrent Video Super-Resolution
Mehdi S. M. Sajjadi, Raviteja Vemulapalli, Matthew Brown
CVPR (2018) (to appear)
-
Dale Webster, Ehsan Rahimy, Greg Corrado, Jonathan Krause, Kasumi Widner, Lily Peng, Peter Karth, Varun Gulshan
Ophthalmology (2018)
-
Intriguing Properties of Adversarial Examples
Barret Zoph, Ekin Dogus Cubuk, Quoc V. Le, Sam Schoenholz
ICLR (2018)
-
Large-Scale 3D Scene Classification With Multi-View Volumetric CNN
Dror Aiger, Brett Allen, Aleksey Golovinskiy
arXiv (2018)
-
Learning Intelligent Dialogs for Bounding-Box Annotation
Ksenia Konyushkova, Jasper Uijlings, Chris Lampert, Vittorio Ferrari
CVPR (2018) (to appear)
-
Learning with Imprinted Weights
Hang Qi, David Lowe, Matthew Brown
CVPR (2018) (to appear)
-
Matrix capsules with EM routing
Geoffrey Hinton, Sara Sabour, Nicholas Frosst
ICLR (2018) (to appear)
-
Revisiting knowledge transfer for training object class detectors
Jasper Uijlings, Stefan Popov, Vittorio Ferrari
CVPR (2018) (to appear)
-
Searching for Activation Functions
Prajit Ramachandran, Barret Zoph, Quoc Le
ICLR (2018)
-
Sequences with Low-Discrepancy Blue-Noise 2-D Projections
Helene Perrier, David Coeurjolly, Feng Xie, Matt Pharr, Pat Hanrahan, Victor Ostromoukhov
Proceedings of Eurographics (2018)
-
Thermometer Encoding: One Hot Way To Resist Adversarial Examples
Aurko Roy, Colin Raffel, Ian Goodfellow, Jacob Buckman
ICLR (2018)
-
Time-Contrastive Networks: Self-Supervised Learning from Video
Pierre Sermanet, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine
Proceedings of International Conference in Robotics and Automation (ICRA 2018) + Deep Learning for Robotic Vision (DLRV) Workshop at CVPR 2017 + Deep Reinforcement Learning Symposium at NIPS 2017 (2018)
-
Towards learning a metric for neural prosthetics
Nishal P Shah, Sasidhar Madugula, Alan Litke, Alexander Sher, EJ Chichilnisky, Yoram Singer, Jonathon Shlens
ICLR (2018) (to appear)
-
Unsupervised Learning of Depth and Egomotion from Monocular Video Using 3D Geometric Constraints
Reza Mahjourian, Martin Wicke, Anelia Angelova
CVPR (2018)
-
Unsupervised Learning of Semantic Audio Representations
Aren Jansen, Manoj Plakal, Ratheet Pandya, Dan Ellis, Shawn Hershey, Jiayang Liu, Channing Moore, Rif A. Saurous
Proceedings of ICASSP 2018 (to appear)
-
Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
Konstantinos Bousmalis, Alex Irpan, Paul Wohlhart, Yunfei Bai, Matthew Kelcey, Mrinal Kalakrishnan, Laura Downs, Julian Ibarz, Peter Pastor Sampedro, Kurt Konolige, Sergey Levine, Vincent Vanhoucke
ICRA (2018)
-
3D object classification and retrieval with Spherical CNNs
Carlos Esteves, Christine Allen-Blanchette, Ameesh Makadia, Kostas Daniilidis
arXiv (2017)
-
A Learned Representation For Artistic Style
Vincent Dumoulin, Jonathon Shlens, Manjunath Kudlur
ICLR (2017)
-
A No-Reference Video Quality Predictor for H.264 Compression and Scaling Artifacts
Deepti Ghadiyaram, Chao Chen, Sasi Inguva, Anil Kokaram
IEEE International Conference on Image Processing, IEEE (2017) (to appear)
-
A discriminative view of MRF pre-processing algorithms
Chen Wang, Charles Herrmann, Ramin Zabih
ICCV 2017
-
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
Chunhui Gu, Chen Sun, David A. Ross, Carl Vondrick, Caroline Pantofaru, Yeqing Li, Sudheendra Vijayanarasimhan, George Toderici, Susanna Ricco, Rahul Sukthankar, Cordelia Schmid, Jitendra Malik
arXiv (2017)
-
Accelerating Eulerian Fluid Simulation With Convolutional Networks
Jonathan Tompson, Kristofer Schlachter, Pablo Sprechmann, Ken Perlin
ICML (2017)
-
Adversarial Machine Learning at Scale
Alexey Kurakin, Ian J. Goodfellow, Samy Bengio
ICLR (2017)
-
Tom Brown, Dandelion Mane, Aurko Roy, Martin Abadi, Justin Gilmer
NIPS Workshop (2017)
-
Adversarial examples in the physical world
Alexey Kurakin, Ian Goodfellow, Samy Bengio
ICLR Workshop (2017)
-
Ambisonics soundfield navigation using directional decomposition and path distance estimation
Andrew Allen, Bastiaan Kleijn
(2017)
-
Appearance-and-Relation Networks for Video Classification
Limin Wang, Wei Li, Wen Li, Luc Van Gool
arXiv (2017)
-
Are GANs Created Equal? A Large-Scale Study
Mario Lučić, Karol Kurach, Marcin Michalski, Sylvain Gelly, Olivier Bousquet
arXiv (2017)
-
Philip Haeusser, Thomas Frerix, Alexander Mordvintsev, Daniel Cremers
International Conference on Computer Vision (ICCV), IEEE (2017) (to appear)
-
Attention-based Extraction of Structured Information from Street View Imagery
Zbigniew Wojna, Alex Gorban, Dar-Shyang Lee, Kevin Murphy, Qian Yu, Yeqing Li, Julian Ibarz
ICDAR (2017), pp. 8
-
Audio Set: An ontology and human-labeled dataset for audio events
Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R. Channing Moore, Manoj Plakal, Marvin Ritter
Proc. IEEE ICASSP 2017, New Orleans, LA (to appear)
-
Automatic Spatially-aware Fashion Concept Discovery
Xintong Han, Zuxuan Wu, Phoenix X Huang, Xiao Zhang, Menglong Zhu, Yuan Li, Yang Zhao, Larry S Davis
ICCV (2017)
-
BranchOut: Regularization for Online Ensemble Tracking with CNNs
Bohyung Han, Hartwig Adam, Jack Sim
CVPR (2017) (to appear)
-
CNN Architectures for Large-Scale Audio Classification
Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, Channing Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold, Malcolm Slaney, Ron Weiss, Kevin Wilson
International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE (2017)
-
Cognitive Mapping and Planning for Visual Navigation
Saurabh Gupta, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik
CVPR (2017)
-
Conditional Image Synthesis With Auxiliary Classifier GANs
Augustus Odena, Christopher Olah, Jonathon Shlens
ICML (2017)
-
Context-aware Captions from Context-agnostic Supervision
Shanmukha Ramakrishna Vedantam, Samy Bengio, Kevin Murphy, Devi Parikh, Gal Chechik
CVPR (2017)
-
CycleGAN, a Master of Steganography
Casey Chu, Andrey Zhmoginov, Mark Sandler
NIPS 2017 Workshop “Machine Deception” (2017)
-
Decomposing Motion and Content for Natural Video Sequence Prediction
Ruben Villegas, Jimei Yang, Seunghoon Hong, Xunyu Lin, Honglak Lee
ICLR (2017)
-
Deep Bilateral Learning for Real-Time Image Enhancement
Michaël Gharbi, Jiawen Chen, Jonathan T. Barron, Sam Hasinoff, Frédo Durand
ACM Transactions on Graphics, ACM (2017)
-
Deep Metric Learning via Facility Location
Hyun Oh Song, Stefanie Jegelka, Vivek Rathod, Kevin Murphy
IEEE CVPR (2017)
-
Deep Visual Foresight for Planning Robot Motion
Sergey Levine, Chelsea Finn
ICRA (2017)
-
Deformable Shape Completion with Graph Convolutional Autoencoders
Or Litany, Alex Bronstein, Michael Bronstein, Ameesh Makadia
CVPR 2018 (2017) (to appear)
-
Deformable block based motion estimation in omnidirectional image sequences
Francesca De Simone, Neil Birkbeck, Balu Adsumilli, Pascal Frossard
IEEE 19th International Workshop on Multimedia Signal Processing (2017)
-
Detecting Cancer Metastases on Gigapixel Pathology Images
Yun Liu, Krishna Kumar Gadepalli, Mohammad Norouzi, George Dahl, Timo Kohlberger, Subhashini Venugopalan, Aleksey S Boyko, Aleksei Timofeev, Philip Q Nelson, Greg Corrado, Jason Hipp, Lily Peng, Martin Stumpe
MICCAI (2017)
-
Encoding Bitrate Optimization Using Playback Statistics for HTTP-based Adaptive Video Streaming
Chao Chen, Yao-Chung Lin, Anil Kokaram, Steve Benting
arXiv (2017)
-
End-to-End Learning of Semantic Grasping
Eric Jang, Julian Ibarz, Peter Pastor Sampedro, Sergey Levine, Sudheendra Vijayanarasimhan
CoRL 2017 (2017) (to appear)
-
Enhancing Video Summarization via Vision-Language Embedding
Bryan Plummer, Matthew Brown, Svetlana Lazebnik
IEEE International Conference on Computer Vision and Pattern Recognition (2017)
-
Exploring the structure of a real-time, arbitrary neural artistic stylization network
Golnaz Ghiasi, Honglak Lee, Manjunath Kudlur, Vincent Dumoulin, Jonathon Shlens
Proceedings of the 28th British Machine Vision Conference (BMVC) (2017)
-
Extreme clicking for efficient object annotation
Dim Papadopoulos, Jasper Uijlings, Frank Keller, Vittorio Ferrari
ICCV (2017)
-
Eyemotion: Classifying facial expressions in VR using eye-tracking cameras
Steven Hickson, Nick Dufour, Avneesh Sud, Vivek Kwatra, Irfan Essa
arXiv, https://arxiv.org/abs/1707.07204 (2017)
-
Jonathan T. Barron, Yun-Ta Tsai
CVPR (2017)
-
Feature agnostic geometric alignment
Dror Aiger, Yoni Weill
Patent (2017)
-
Geometry-Based Next Frame Prediction from Monocular Video
Reza Mahjourian, Martin Wicke, Anelia Angelova
Intelligent Vehicles Symposium (2017)
-
Guetzli: Perceptually Guided JPEG Encoder
Jyrki Alakuijala, Robert Obryk, Ostap Stoliarchuk, Zoltan Szabadka, Lode Vandevenne, Jan Wassenberg
arXiv (2017)
-
Headset Removal for Virtual and Mixed Reality
Christian Frueh, Avneesh Sud, Vivek Kwatra
SIGGRAPH Talks 2017, ACM SIGGRAPH (to appear)
-
Human and Machine Hearing: Extracting Meaning from Sound
Cambridge University Press (2017)
-
Improving Phenotypic Measurements in High-Content Imaging Screens
D. Mike Ando, Cory McLean, Marc Berndl
bioRxiv (2017)
-
Improving Smiling Detection with Race and Gender Diversity
Hee Jung Ryu, Margaret Mitchell, Hartwig Adam
arXiv (2017)
-
Incoherent idempotent ambisonics rendering
W. Bastiaan Kleijn, Andrew Allen, Jan Skoglund, Felicia Lim
2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (2017)
-
Joint Wideband Source Localization and Acquisition Based on a Grid-Shift Approach
Christos Tzagkarakis, Bastiaan Kleijn, Jan Skoglund
2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (2017)
-
Large-Scale Audio Event Discovery in One Million YouTube Videos
Aren Jansen, Jort F. Gemmeke, Daniel P. W. Ellis, Xiaofeng Liu, Wade Lawrence, Dylan Freedman
Proceedings of ICASSP (2017) (to appear)
-
Large-Scale Content-Only Video Recommendation
Joonseok Lee, Sami Abu-El-Haija
International Conference on Computer Vision Workshop, Computer Vision Foundation (2017), pp. 987 - 995
-
Large-Scale Image Retrieval with Attentive Deep Local Features
Hyeonwoo Noh, Andre Araujo, Jack Sim, Tobias Weyand, Bohyung Han
Proc. ICCV (2017) (to appear)
-
Learning Discriminative and Transformation Covariant Local Feature Detectors
Xu Zhang, Felix Yu, Svebor Karaman, Shih-Fu Chang
CVPR (2017)
-
Learning From Noisy Large-Scale Datasets With Minimal Supervision
Andreas Veit, Neil Alldrin, Gal Chechik, Ivan Krasin, Abhinav Gupta, Serge Belongie
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017), pp. 839-847
-
Learning Spread-out Local Feature Descriptors
Xu Zhang, Felix Yu, Sanjiv Kumar, Shih-Fu Chang
ICCV (2017)
-
Learning Unified Embedding for Apparel Recognition
Yang Song, Yuan Li, Bo Wu, Chao-Yeh Chen, Xiao Zhang, Hartwig Adam
ICCV Computational Fashion Workshop (2017)
-
Learning by Association - A versatile semi-supervised training method for neural networks
Philip Haeusser, Alexander Mordvintsev, Daniel Cremers
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
-
Learning to Generate Long-term Future via Hierarchical Prediction
Ruben Villegas, Jimei Yang, Yuliang Zou, Sungryull Sohn, Xunyu Lin, Honglak Lee
ICML (2017)
-
Learning typographic style: from discrimination to synthesis
Machine Vision and Applications, vol. 28, Issues 5-6 (2017), pp. 551-568
-
Learning with Proxy Supervision for End-To-End Visual Learning
Jiří Čermák, Anelia Angelova
Deep Learning for Vehicle Perception Workshop, Intelligent Vehicles Symposium (2017)
-
Modulating early visual processing by language
Harm de Vries, Florian Strub, Jérémie Mary, Hugo Larochelle, Olivier Pietquin, Aaron Courville
NIPS (2017)
-
No Fuss Distance Metric Learning using Proxies
Yair Movshovitz-Attias, Alexander Toshev, Thomas Leung, Sergey Ioffe, Saurabh Singh
International Conference on Computer Vision (ICCV), IEEE (2017) (to appear)
-
Novel inter and intra prediction tools under consideration for the emerging AV1 video codec
Urvang Joshi, Debargha Mukherjee, Jingning Han, Yue Chen, Sarah Parker, Hui Su, Angie Chiang, Yaowu Xu, Zoe Liu, Yunqing Wang, Jim Bankoski, Chen Wang, Emil Keyder
SPIE Optical Engineering + Applications, vol. 10396 (2017), 10396 - 10396 - 13
-
Novel modes and adaptive block scanning order for intra prediction in AV1
Ofer Hadar, Ariel Shleifer, Debargha Mukherjee, Urvang Joshi, Itai Mazar, Michael Yuzvinsky, Nitzan Tavor, Nati Itzhak, Raz Birman
SPIE Optical Engineering + Applications, vol. 10396 (2017), 10396 - 10396 - 10
-
Object category learning and retrieval with weak supervision
Steven Hickson, Anelia Angelova, Irfan Essa, Rahul Sukthankar
NIPS Workshop on Learning With Limited Labeled Data (2017)
-
Onsets and Frames: Dual-Objective Piano Transcription
Curtis Hawthorne, Erich Elsen, Jialin Song, Adam Roberts, Ian Simon, Colin Raffel, Jesse Engel, Sageev Oore, Douglas Eck
arXiv Preprint (2017)
-
PixColor: Pixel Recursive Colorization
Sergio Guadarrama, Ryan Dahl, David Bieber, Mohammad Norouzi, Jonathon Shlens, Kevin Murphy
Proceedings of the 28th British Machine Vision Conference (BMVC) (2017)
-
Pixel Recursive Super Resolution
Ryan Dahl, Mohammad Norouzi, Jonathon Shlens
ICCV (2017)
-
Practically Efficient Nonlinear Acoustic Echo Cancellers Using Cascaded Block RLS and FLMS Adaptive Filters
Yiteng (Arden) Huang, Jan Skoglund, Alejandro Luebs
ICASSP (2017)
-
Predicting Cardiovascular Risk Factors in Retinal Fundus Photographs using Deep Learning
Ryan Poplin, Avinash Vaidyanathan Varadarajan, Katy Blumer, Yun Liu, Mike McConnell, Greg Corrado, Lily Peng, Dale Webster
arXiv (2017)
-
Quantitative evaluation of omnidirectional video quality
Neil Birkbeck, Chip Brown, Rob Suderman
Quality of Multimedia Experience (QoMEX) (2017)
-
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
Chen Sun, Abhinav Shrivastava, Saurabh Singh, Abhinav Gupta
ICCV (2017)
-
Seamless texturing of 3D meshes of objects from multiple views
Yoni Weill, Dror Aiger
Patent (2017)
-
Self-Supervised Learning of Structure and Motion from Video
Aikaterini Fragkiadaki, Bryan Seybold, Rahul Sukthankar, Sudheendra Vijayanarasimhan, Susanna Ricco
arXiv (2017)
-
Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering
Vahid Kazemi, Ali Elqursh
arXiv (2017)
-
Soft 3D Reconstruction for View Synthesis
ACM Transactions on Graphics (Proc. SIGGRAPH Asia), vol. 36 (2017) (to appear)
-
Spatially Adaptive Computation Time for Residual Networks
Dmitry P. Vetrov, Jonathan Huang, Li Zhang, Maxwell Collins, Michael Figurnov, Ruslan Salakhutdinov, Yukun Zhu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
-
Spatially adaptive image compression using a tiled deep network
David Minnen, George Toderici, Michele Covell, Troy Chinen, Nick Johnston, Joel Shor, Sung Jin Hwang, Damien Vincent, Saurabh Singh
Proceedings of the International Conference on Image Processing (2017), pp. 2796-2800
-
Spatiotemporal atlas parameterization for evolving meshes
Fabian Prada, Misha Kazhdan, Ming Chuang, Alvaro Collet, Hugues Hoppe
ACM Transactions on Graphics, vol. 36 (2017)
-
Speed and accuracy trade-offs for modern convolutional object detectors
Alireza Fathi, Anoop Korattikara, Chen Sun, Ian Fischer, Jonathan Huang, Kevin Murphy, Menglong Zhu, Sergio Guadarrama, Vivek Rathod, Yang Song, Zbigniew Wojna
CVPR 2017, Honolulu, Hawaii (2017)
-
Strategies for Foveated Compression and Transmission
Symposium for Information Display, Palisades Convention Management, Inc. 411 Lafayette Street, Suite 201 New York, NY 10003 (2017) (to appear)
-
Supervision via Competition: Robot Adversaries for Learning Tasks
Lerrel Pinto, James Davidson, Abhinav Gupta
ICRA (2017)
-
Synthesizing Normalized Faces from Facial Identity Features
Forrester Cole, David Belanger, Dilip Krishnan, Aaron Sarna, Inbar Mosseri, William T. Freeman
Conference on Computer Vision and Pattern Recognition (CVPR) (2017) (to appear)
-
TALL: Temporal Activity Localization via Language Query
Jiyang Gao, Chen Sun, Zhenheng Yang, Ram Nevatia
ICCV (2017)
-
TURN TAP: Temporal Unit Regression Networks for Temporal Action Proposals
Jiyang Gao, Zhenheng Yang, Chen Sun, Kan Chen, Ram Nevatia
ICCV (2017)
-
Zbigniew Wojna, Vittorio Ferrari, Sergio Guadarrama, Nathan Silberman, Liang-chieh Chen, Alireza Fathi, Jasper Uijlings
BMVC (2017)
-
The Kinetics Human Action Video Dataset
Andrew Zisserman, Joao Carreira, Karen Simonyan, Will Kay, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, Tim Green, Trevor Back, Paul Natsev, Mustafa Suleyman
arXiv (2017)
-
The power of sparsity in convolutional neural networks
Soravit Changpinyo, Mark Sandler, Andrey Zhmoginov
arXiv (2017)
-
Three-dimensional models visual differential
Yoni Weill, Dror Aiger
Patent (2017)
-
Towards Accurate Multi-person Pose Estimation in the Wild
George Papandreou, Tyler Zhu, Nori Kanazawa, Alexander Toshev, Jonathan Tompson, Chris Bregler, Kevin Murphy
CVPR (2017)
-
Towards Learning Semantic Audio Representations from Unlabeled Data
Aren Jansen, Manoj Plakal, Ratheet Pandya, Dan Ellis, Shawn Hershey, Jiayang Liu, Channing Moore, Rif A. Saurous
NIPS Workshop on Machine Learning for Audio Signal Processing (ML4Audio) (2017) (to appear)
-
Training object class detectors with click supervision
Dim Papadopoulos, Jasper Uijlings, Frank Keller, Vittorio Ferrari
CVPR (2017)
-
Training ultra-deep CNNs with critical initialization
Lechao Xiao, Yasaman Bahri, Sam Schoenholz, Jeffrey Pennington
NIPS Workshop (2017) (to appear)
-
Unsupervised Learning of Depth and Ego-Motion from Video
Tinghui Zhou, Matthew Brown, Noah Snavely, David Lowe
Computer Vision and Pattern Recognition, IEEE (2017)
-
Unsupervised Perceptual Rewards for Imitation Learning
Pierre Sermanet, Kelvin Xu, Sergey Levine
Proceedings of Robotics: Science and Systems (RSS 2017) + Deep Learning for Action and Interaction workshop at NIPS (2016) + International Conference on Learning Representations (ICLR 2017) Workshop (2017)
-
Unsupervised Pixel-level Domain Adaptation with Generative Adversarial Networks
Konstantinos Bousmalis, Nathan Silberman, David Dohan, Dumitru Erhan, Dilip Krishnan
CVPR (2017)
-
Unsupervised deep clustering for semantic object retrieval
Steven Hickson, Anelia Angelova, Irfan Essa, Rahul Sukthankar
BayLearn, http://www.baylearn.org/ (2017)
-
Using Perceptual Metrics for Something Other Than Compression
IS&T, Hyatt Regency, Burlingame, California (2017)
-
Video Frame Synthesis Using Deep Voxel Flow
Ziwei Liu, Raymond Yeh, Xiaoou Tang, Yiming Liu, Aseem Agarwala
Proceedings of International Conference on Computer Vision (ICCV) (2017) (to appear)
-
XGAN: Unsupervised Image-to-Image Translation for many-to-many Mappings
Amelie Royer, Konstantinos Bousmalis, Stephan Gouws, Fred Bertsch, Inbar Mosseri, Forrester Cole, Kevin Murphy
arXiv (2017)
-
YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Dataset for Object Detection in Video
Esteban Real, Jon Shlens, Stefano Mazzocchi, Vincent Vanhoucke, Xin Pan
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7464-7473
-
A Dynamic Motion Vector Referencing Scheme for Video Coding
Jingning Han, Yaowu Xu, James Bankoski
IEEE ICIP (2016)
-
A Deep Matrix Factorization Method for Learning Attribute Representations
George Trigeorgis, Konstantinos Bousmalis, Stefanos Zafeiriou, Björn W. Schuller
IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 39 (2016), pp. 417-429
-
A No-reference Perceptual Quality Metric for Videos Distorted by Spatially Correlated Noise
Chao Chen, Mohammad Izadi, Anil Kokaram
ACM Multimedia 2016, Amsterdam, The Netherlands (to appear)
-
A Perceptual Visibility Metric for Banding Artifacts
Yilin Wang, Sang-Uok Kum, Chao Chen, Anil Kokaram
IEEE International Conference on Image Processing (2016) (to appear)
-
A Staircase Transform Coding Scheme for Screen Content Video Coding
Cheng Chen, Jingning Han, Yaowu Xu, James Bankoski
IEEE ICIP (2016)
-
A Subjective Study for the Design of Multi-resolution ABR Video Streams with the VP9 Codec
Chao Chen, Sasi Inguva, Andrew Rankin, Anil Kokaram
SPIE Electronic Imaging, Human Visual Perception (2016) (to appear)
-
A cloud-based large-scale distributed video analysis system
Yongzhe Wang, Wei-Ta Chen, Huahui Wu, Anil Kokaram, Jaron Schaeffer
IEEE International Conference on Image Processing (2016)
-
An Acoustic Keystroke Transient Canceler for Speech Communication Terminals Using a Semi-Blind Adaptive Filter Model
Herbert Buchner, Simon Godsill, Jan Skoglund
ICASSP (2016)
-
Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow
International Conference on Learning Representations (2016)
-
Attend, Infer, Repeat: Fast Scene Understanding with Generative Models
S. M. Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, David Szepesvari, Koray Kavukcuoglu, Geoffrey E. Hinton
NIPS (2016)
-
Audio Deepdream: Optimizing raw audio with convolutional networks
Adam Roberts, Cinjon Resnick, Diego Ardila, Doug Eck
International Society for Music Information Retrieval Conference, Google Brain (2016)
-
Bi-Magnitude Processing Framework for Nonlinear Acoustic Echo Cancellation on Android Devices
Yiteng (Arden) Huang, Jan Skoglund, Alejandro Luebs
International Workshop on Acoustic Signal Enhancement 2016 (IWAENC2016)
-
Jiawen Chen, Andrew Adams, Neal Wadhwa, Sam Hasinoff
ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia 2016) (2016)
-
Bitrate Classification of Twice-Encoded Audio using Objective Quality Features
Colm Sloan, Naomi Harte, Anil Kokaram, Damien Kelly, Andrew Hines
8th International Conference on Quality of Multimedia Experience (QoMEX 2016)
-
Blockout: Dynamic Model Selection for Hierarchical Deep Networks
Calvin Murdock, Zhen Li, Howard Zhou, Tom Duerig
CVPR 2016
-
Burst photography for high dynamic range and low-light imaging on mobile cameras
Sam Hasinoff, Dillon Sharlet, Ryan Geiss, Andrew Adams, Jonathan T. Barron, Florian Kainz, Jiawen Chen, Marc Levoy
ACM Transactions on Graphics (Proc. SIGGRAPH Asia 2016) (2016)
-
Chained Predictions Using Convolutional Neural Networks
Georgia Gkioxari, Navdeep Jaitly, Alexander Toshev
European Conference on Computer Vision (2016)
-
Computer Vision for Active and Assisted Living
Rainer Planinc, Alexandros Chaaraoui, Martin Kampel, Francisco Florez-Revuelta
Active and Assisted Living: Technologies and Applications, IET - The institution of Engineering and Technology, Savoy Place London WC2R 0BL UK (2016)
-
Content-based Related Video Recommendations
Joonseok Lee, Nisarg Kothari, Paul Natsev
Advances in Neural Information Processing Systems (NIPS) Demonstration Track (2016)
-
DeepStereo: Learning to Predict New Views From the World's Imagery
John Flynn, Ivan Neulander, James Philbin, Noah Snavely
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
-
Density Estimation using Real NVP
Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio
arXiv preprint (2016)
-
Detecting Events and Key Actors in Multi-Person Videos
Vignesh Ramanathan, Jonathan Huang, Sami Abu-El-Haija, Alexander Gorban, Kevin Murphy, Li Fei-Fei
Computer Vision and Pattern Recognition (CVPR) (2016)
-
Discovering the physical parts of an articulated object class from multiple videos
Luca DelPero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari
CVPR (2016)
-
Do-It-Yourself Lighting Design for Product Videography
Ivaylo Boyadzhiev, Jiawen Chen, Kavita Bala, Sylvain Paris
IEEE International Conference on Computational Photography (2016)
-
Konstantinos Bousmalis, George Trigeorgis, Nathan Silberman, Dilip Krishnan, Dumitru Erhan
NIPS 2016 (2016)
-
Exploiting cyclic symmetry in convolutional neural networks
Jeffrey De Fauw, Koray Kavukcuoglu, Sander Dieleman
International Conference on Machine Learning (2016)
-
Alireza Fathi, Anoop Korattikara, Chen Sun, Ian Fischer, Jonathan Huang, Kevin Murphy, Menglong Zhu, Sergio Guadarrama, Vivek Rathod, Yang Song, Zbigniew Wojna
2nd ImageNet and COCO Visual Recognition Challenges Joint Workshop, Amsterdam (2016)
-
Globally Optimized Least-Squares Post-Filtering for Microphone Array Speech Enhancement
Yiteng (Arden) Huang, Alejandro Luebs, Jan Skoglund, W. Bastiaan Kleijn
ICASSP (2016)
-
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao, Jonathan Huang, Alexander Toshev, Oana Camburu, Kevin Murphy
Computer Vision and Pattern Recognition (2016)
-
Geometry-driven quantization for omnidirectional image coding
Francesca De Simone, Pascal Frossard, Paul Wilkins, Neil Birkbeck, Anil Kokaram
Picture Coding Symposium (PCS) (2016)
-
Improving the Robustness of Deep Neural Networks via Stability Training
Stephan Zheng, Yang Song, Thomas Leung, Ian Goodfellow
CVPR'2016, IEEE (to appear)
-
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex A. Alemi
ICLR 2016 Workshop
-
Inverting Face Embeddings with Convolutional Neural Networks
Andrey Zhmoginov, Mark Sandler
arXiv (2016)
-
Robert Anderson, David Gallup, Jonathan T. Barron, Janne Kontkanen, Noah Snavely, Carlos Hernandez Esteban, Sameer Agarwal, Steven M. Seitz
ACM Transactions on Graphics (Proc. of SIGGRAPH Asia 2016) (2016) (to appear)
-
arXiv (2016)
-
Leveraging Contextual Cues for Generating Basketball Highlights
Vinay Bettadapura, Caroline Pantofaru, Irfan Essa
ACM Multimedia (2016)
-
Multi-Task Convolutional Music Models
Adam Roberts, Cinjon Resnick, Diego Ardila, Doug Eck
BayLearn (2016)
-
On Pre-Filtering Strategies for the GCC-PHAT Algorithm
Hong-Goo Kang, Michael Graczyk, Jan Skoglund
International Workshop on Acoustic Signal Enhancement 2016 (IWAENC 2016)
-
On The Existence of Epipolar Matrices
Sameer Agarwal, Hon Leung Lee, Bernd Sturmfels, Rekha R. Thomas
International Journal of Computer Vision (2016), pp. 1-13
-
Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision
Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, Honglak Lee
NIPS (2016)
-
Perspective-aware manipulation of portrait photos
Ohad Fried, Eli Shechtman, Dan B Goldman, Adam Finkelstein
ACM Transactions on Graphics (Proc. SIGGRAPH), vol. 35(4) (2016)
-
PlaNet - Photo Geolocation with Convolutional Neural Networks
Tobias Weyand, Ilya Kostrikov, James Philbin
European Conference on Computer Vision (ECCV) (2016)
-
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Zbigniew Wojna
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (2016)
-
Robust Estimation of Reverberation Time Using Polynomial Roots
Ian Kelly, Francis Boland, Jan Skoglund
AES 60th Conference on Dereverberation and Reverberation of Audio, Music, and Speech, Google Ireland Ltd. (2016)
-
SSD: Single Shot MultiBox Detector
Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg
Proceedings of the European Conference on Computer Vision (ECCV) (2016) (to appear)
-
Scalable Learning of Non-Decomposable Objectives
Elad Eban, Mariano Schain, Alan Mackey, Ariel Gordon, Rif A. Saurous, Gal Elidan
arXiv preprint arXiv:1608.04802 (2016)
-
Harrie Oosterhuis, Sujith Ravi, Mike Bendersky
ICML 2016 Workshop on Multi-View Representation Learning
-
Jonathan T. Barron, Ben Poole
ECCV (2016)
-
The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition
Jonathan Krause, Andrew Howard, Benjamin Sapp, Howard Zhou, Alexander Toshev, Tom Duerig, James Philbin, Li Fei-Fei
Computer Vision and Pattern Recognition (2016)
-
The little Engine that Could: Regularization by Denoising (RED)
Yaniv Romano, Michael Elad, Peyman Milanfar
arXiv (2016) (to appear)
-
Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task
Nan Ding, Sebastian Goodman, Fei Sha, Radu Soricut
arXiv, https://arxiv.org/abs/1612.07833 (2016)
-
Unsupervised Learning for Physical Interaction through Video Prediction
Chelsea Finn, Ian Goodfellow, Sergey Levine
arXiv e-prints (2016)
-
Webly-supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames
Chuang Gan, Chen Sun, Lixin Duan, Boqing Gong
European Conference on Computer Vision (ECCV) (2016) (to appear)
-
YouTube-8M: A Large-Scale Video Classification Benchmark
Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Apostol (Paul) Natsev, George Toderici, Balakrishnan Varadarajan, Sudheendra Vijayanarasimhan
arXiv:1609.08675 (2016)
-
Guang Wang, Richard F. Lyon, Emmanuel M. Drakakis
IEEE Transactions on Biomedical Circuits and Systems, vol. 9 (2015), pp. 72-86
-
A Computational Approach for Obstruction-Free Photography
Tianfan Xue, Michael Rubinstein, Ce Liu, William T. Freeman
ACM Transactions on Graphics, vol. 34, no. 4 (Proc. SIGGRAPH) (2015)
-
A World of Movement
Fredo Durand, William T. Freeman, Michael Rubinstein
Scientific American, vol. 312, no. 1 (2015)
-
An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections
Yu Cheng, Felix X. Yu, Rogerio Feris, Sanjiv Kumar, Shih-Fu Chang
International Conference on Computer Vision (ICCV) (2015)
-
An estimation-theoretic approach to video denoising
Jingning Han, Timothy Kopp, Yaowu Xu
2015 IEEE International Conference on Image Processing, IEEE, pp. 4273-4277
-
Attention for fine-grained categorization
Pierre Sermanet, Andrea Frome, Esteban Real
International Conference on Learning Representations (ICLR 2015) workshop
-
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy
Proceedings of The 32nd International Conference on Machine Learning (2015), pp. 448-456
-
Best-Buddies Similarity for Robust Template Matching
Tali Dekel, Shaul Oron, Michael Rubinstein, Shai Avidan, William T. Freeman
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) (2015)
-
Beyond Short Snippets: Deep Networks for Video Classification
Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici
Computer Vision and Pattern Recognition (2015)
-
ICCV (2015) (to appear)
-
Detection and Suppression of Keyboard Transient Noise in Audio Streams with Auxiliary Keybed Microphone
Simon Godsill, Herbert Buchner, Jan Skoglund
ICASSP 2015, IEEE
-
Direct-to-Reverberant Ratio Estimation Using a Null-Steered Beamformer
James Eaton, Alastair Moore, Patrick Naylor, Jan Skoglund
ICASSP 2015, IEEE
-
Deep Networks With Large Output Spaces
Sudheendra Vijayanarasimhan, Jonathon Shlens, Rajat Monga, Jay Yagnik
International Conference on Learning Representations (2015)
-
Efficient Large Scale Video Classification
Balakrishnan Varadarajan, George Toderici, Paul Natsev, Sudheendra Vijayanarasimhan
dblp computer science bibliography, http://dblp.org (2015) (to appear)
-
Egocentric Field-of-View Localization Using First-Person Point-of-View Devices
Vinay Bettadapura, Irfan Essa, Caroline Pantofaru
Proceedings of Winter Conference on Applications of Computer Vision (WACV), IEEE (2015)
-
Fast Bilateral-Space Stereo for Synthetic Defocus
Jonathan T Barron, Andrew Adams, YiChang Shih, Carlos Hernández
CVPR (2015)
-
Fast Orthogonal Projection Based on Kronecker Product
Xu Zhang, Felix X. Yu, Ruiqi Guo, Sanjiv Kumar, Shengjin Wang, Shih-Fu Chang
International Conference on Computer Vision (ICCV) (2015)
-
Going Deeper with Convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
Computer Vision and Pattern Recognition (CVPR) (2015)
-
Im2Calories: towards an automated mobile vision food diary
Austin Myers, Nick Johnston, Vivek Rathod, Anoop Korattikara, Alex Gorban, Nathan Silberman, Sergio Guadarrama, George Papandreou, Jonathan Huang, Kevin Murphy
ICCV (2015)
-
IsoMatch: Creating Informative Grid Layouts
Ohad Fried, Stephen DiVerdi, Maciej Halber, Elena Sizikova, Adam Finkelstein
Computer Graphics Forum (Proceedings of Eurographics), vol. 34(2) (2015) (to appear)
-
Yasuhisa Fujii, Dmitriy Genzel, Ashok C. Popat, Remco Teunen
13th International Conference on Document Analysis and Recognition (ICDAR), IEEE (2015), pp. 756-760
-
Large Scale Business Discovery from Street Level Imagery
Qian Yu, Christian Szegedy, Martin C. Stumpe, Liron Yatziv, Vinay Shet, Julian Ibarz, Sacha Arnoud
arXiv (2015)
-
Learning semantic relationships for better action retrieval in images
Vignesh Ramanathan, Congcong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, Chuck Rosenberg, Li Fei-Fei
CVPR (2015)
-
Object Recognition from Short Videos for Robotic Perception
Ivan Bogun, Anelia Angelova, Navdeep Jaitly
CoRR, vol. abs/1509.01602 (2015)
-
Ontological Supervision for Fine Grained Classification of Street View Storefronts
Yair Movshovitz-Attias, Qian Yu, Martin C. Stumpe, Vinay Shet, Sacha Arnoud, Liron Yatziv
CVPR15 (2015)
-
Palette-based Photo Recoloring
Huiwen Chang, Ohad Fried, Yiming Liu, Stephen DiVerdi, Adam Finkelstein
Transactions on Graphics (Proceedings of SIGGRAPH) (2015) (to appear)
-
Pedestrian Detection with a Large-Field-Of-View Deep Network
Anelia Angelova, Alex Krizhevsky, Vincent Vanhoucke
Proceedings of ICRA 2015
-
Pose Embeddings: A Deep Architecture for Learning to Match Human Poses
Greg Mori, Caroline Pantofaru, Nisarg Kothari, Thomas Leung, George Toderici, Alexander Toshev, Weilong Yang
arXiv (2015)
-
Probabilistic Label Relation Graphs with Ising Models
Nan Ding, Jia Deng, Kevin Murphy, Hartmut Neven
International Conference on Computer Vision (2015)
-
Real-Time Grasp Detection Using Convolutional Neural Networks
Joseph Redmon, Anelia Angelova
International Conference on Robotics and Automation (ICRA), IEEE (2015)
-
Real-Time Pedestrian Detection With Deep Network Cascades
Anelia Angelova, Alex Krizhevsky, Vincent Vanhoucke, Abhijit Ogale, Dave Ferguson
Proceedings of BMVC 2015
-
Refer-to-as Relations as Semantic Knowledge
Song Feng, Sujith Ravi, Ravi Kumar, Polina Kuznetsova, Wei Liu, Alex Berg, Tamara Berg, Yejin Choi
AAAI Conference on Artificial Intelligence (2015)
-
Show and tell: A neural image caption generator
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan
Computer Vision and Pattern Recognition (2015)
-
Speech Acoustic Modeling from Raw Multichannel Waveforms
Yedid Hoshen, Ron Weiss, Kevin W Wilson
International Conference on Acoustics, Speech, and Signal Processing, IEEE (2015)
-
Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images
Chen Sun, Sanketh Shetty, Rahul Sukthankar, Ram Nevatia
ACM Multimedia (2015)
-
The latest open-source video codec VP9 - An overview and preliminary results
Debargha Mukherjee, Jingning Han, Jim Bankoski, Ronald S Bultje, Adrian Grange, John Koleszar, Paul Wilkins, Yaowu Xu
SMPTE Motion Imaging Journal, vol. 124 (2015)
-
VIP: Finding Important People in Images
Clint Solomon Mathialagan, Andrew C. Gallagher, Dhruv Batra
Computer Vision and Pattern Recognition (2015), pp. 4858-4966
-
ViSQOLAudio: An objective audio quality metric for low bitrate codecs
Andrew Hines, Eoin Gillen, Damien Kelly, Jan Skoglund, Anil Kokaram, Naomi Harte
The Journal of the Acoustical Society of America, vol. 137 (6) (2015), EL449-EL455
-
Visual Vibrometry: Estimating Material Properties from Small Motion in Video
Abe Davis, Katherine L. Bouman, Justin G. Chen, Michael Rubinstein, Fredo Durand, William T. Freeman
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) (2015)
-
What’s Cookin’? Interpreting Cooking Videos using Text, Speech and Vision
Jonathan Malmaud, Jonathan Huang, Vivek Rathod, Nicholas Johnston, Andrew Rabinovich, Kevin Murphy
North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL HLT 2015) (to appear)
-
An optimized template matching approach to intra coding in video/image compression
Hui Su, Jingning Han, Yaowu Xu
IS&T/SPIE Electronic Imaging, 2014, SPIE, pp. 1-6
-
Auto-Rectification of User Photos
Krishnendu Chaudhury (aka Krish Chaudhury), Stephen DiVerdi, Sergey Ioffe
Proceedings of International Conference on Image Processing, ICIP, IEEE (2014), pp. 3479-3483
-
Co-Segmentation of Textured 3D Shapes with Sparse Annotations
M. Ersin Yumer, Ameesh Makadia
Computer Vision and Pattern Recognition (CVPR) (2014)
-
Rui Hou, Amir Roshan Zamir, Rahul Sukthankar, Mubarak Shah
Proceedings of European Conference on Computer Vision (2014)
-
DeepPose: Human Pose Estimation via Deep Neural Networks
Alexander Toshev, Christian Szegedy
Computer Vision and Pattern Recognition (2014) (to appear)
-
Discovering Groups of People in Images
Wongun Choi, Yu-Wei Chao, Caroline Pantofaru, Silvio Savarese
European Conference on Computer Vision (ECCV) (2014)
-
Indoor Scene Understanding with Geometric and Semantic Contexts
Wongun Choi, Yu-Wei Chao, Caroline Pantofaru, Silvio Savarese
International Journal of Computer Vision (IJCV) (2014)
-
Large-Scale Object Classification Using Label Relation Graphs
Jia Deng, Nan Ding, Yangqing Jia, Andrea Frome, Kevin Murphy, Samy Bengio, Yuan Li, Hartmut Neven, Hartwig Adam
European Conference on Computer Vision (2014)
-
Large-scale Video Classification with Convolutional Neural Networks
Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, Li Fei-Fei
Proceedings of International Computer Vision and Pattern Recognition (CVPR 2014), IEEE
-
Learning 3D Part Detection from Sparsely Labeled Data
Ameesh Makadia, Mehmet Ersin Yumer
2nd International Conference on 3D Vision (2014)
-
Learning Fine-grained Image Similarity with Deep Ranking
Jiang Wang, Yang Song, Thomas Leung, Chuck Rosenberg, Jingbin Wang, James Philbin, Bo Chen, Ying Wu
CVPR'2014, IEEE
-
Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks
Ian Goodfellow, Yaroslav Bulatov, Julian Ibarz, Sacha Arnoud, Vinay Shet
ICLR 2014 (to appear)
-
Neural Networks and Neuroscience-Inspired Computer Vision
David Cox, Tom Dean
Current Biology, vol. 24 (2014), pp. 921-929
-
Marc'Aurelio Ranzato
Google Inc. (2014)
-
Mark D. Benjamin, Stephen DiVerdi, Adam Finkelstein
Proceedings of the Workshop on Non-Photorealistic Animation and Rendering, NPAR, ACM, New York, NY, USA (2014), pp. 13-20
-
RealPigment: Paint Compositing by Example
Jingwan Lu, Stephen DiVerdi, Willa Chen, Connelly Barnes, Adam Finkelstein
Proceedings of the Workshop on Non-Photorealistic Animation and Rendering, NPAR, ACM, New York, NY, USA (2014), pp. 21-30
-
Recognition of Complex Events: Exploiting Temporal Dynamics between Underlying Concepts
Subhabrata Bhattacharya, Mahdi M. Kalayeh, Rahul Sukthankar, Mubarak Shah
Proceedings of International Computer Vision and Pattern Recognition (CVPR 2014), IEEE
-
SUPER 4PCS Fast Global Pointcloud Registration via Smart Indexing
Nicolas Mellado, Dror Aiger, Niloy Mitra
Eurographics Symposium on Geometry Processing 2014
-
Scalable Object Detection using Deep Neural Networks
Dumitru Erhan, Christian Szegedy, Alexander Toshev, Dragomir Anguelov
Computer Vision and Pattern Recognition, IEEE (2014), pp. 2155-2162
-
Sinusoidal Interpolation Across Missing Data
W. Bastiaan Kleijn, Turaj Zakizadeh Shabestary, Jan Skoglund
International Workshop on Acoustic Signal Enhancement 2014 (IWAENC 2014), pp. 71-75
-
Temporal Synchronization of Multiple Audio Signals
Julius Kammerl, Neil Birkbeck, Sasi Inguva, Damien Kelly, Andy Crawford, Hugh Denman, Anil Kokaram, Caroline Pantofaru
Proceedings of the International Conference on Signal Processing (ICASSP), Florence, Italy (2014)
-
The Optical Mouse: Early Biomimetic Embedded Vision
Advances in Embedded Computer Vision, Springer (2014), pp. 3-22
-
Training Highly Multi-class Linear Classifiers
Maya R. Gupta, Samy Bengio, Jason Weston
Journal of Machine Learning Research (JMLR) (2014), pp. 1461-1492
-
Unsupervised Discovery of Object Classes with a Mobile Robot
Julian Mason, Bhaskara Marthi, Ronald Parr
ICRA 2014
-
Video Object Discovery and Co-segmentation with Extremely Weak Supervision
Le Wang, Gang Hua, Rahul Sukthankar, Jianru Xue, Nanning Zheng
Proceedings of European Conference on Computer Vision (2014)
-
Video Quality Assessment for Web Content Mirroring
Ye He, Kevin Fei, Gus Fernandez, Edward J. Delp
Imaging and Multimedia Analytics in a Web and Mobile World 2014, IS&T/SPIE Electronic Imaging, San Francisco, California, pp. 9027-11
-
Zero-Shot Learning by Convex Combination of Semantic Embeddings
Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg Corrado, Jeffrey Dean
International Conference on Learning Representations (2014)
-
3DNN: Viewpoint Invariant 3D Geometry Matching for Scene Understanding
Scott Satkin, Martial Hebert
Proceedings of the International Conference on Computer Vision (ICCV) (2013) (to appear)
-
A Butterfly Structured Design of The Hybrid Transform Coding Scheme
Jingning Han, Yaowu Xu, Debargha Mukherjee
Picture Coding Symposium, IEEE (2013), pp. 1-4
-
A Discriminative Model for Learning Semantic and Geometric Interactions in Indoor Scenes
Wongun Choi, Yu-Wei Chao, Caroline Pantofaru, Silvio Savarese
Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Scene Understanding Workshop (SUNw) (2013)
-
Accelerating defocus blur magnification
Florian Kriener, Thomas Binder, Manuel Wille
Proceedings SPIE Vol. 8667 (Multimedia Content and Mobile Devices), SPIE (2013)
-
Category-Independent Object-level Saliency Detection
International Conference on Computer Vision (2013)
-
DeViSE: A Deep Visual-Semantic Embedding Model
Andrea Frome, Greg Corrado, Jonathon Shlens, Samy Bengio, Jeffrey Dean, Marc’Aurelio Ranzato, Tomas Mikolov
Neural Information Processing Systems (NIPS) (2013)
-
Deep Neural Networks for Object Detection
Christian Szegedy, Alexander Toshev, Dumitru Erhan
Advances in Neural Information Processing Systems (2013)
-
Design of user interfaces for selective editing of digital photos on touchscreen devices
Thomas Binder, Meikel Steiding, Manuel Wille, Nils Kokemohr
Proceedings SPIE 8667 (Multimedia Content and Mobile Devices), SPIE (2013)
-
Discriminative Segment Annotation in Weakly Labeled Video
Kevin Tang, Rahul Sukthankar, Jay Yagnik, Li Fei-Fei
Proceedings of International Conference on Computer Vision and Pattern Recognition (CVPR 2013)
-
Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
Thomas Dean, Mark Ruzon, Mark Segal, Jonathon Shlens, Sudheendra Vijayanarasimhan, Jay Yagnik
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE Computer Society, Washington, DC, USA (2013)
-
Fast, Accurate Detection of 100,000 Object Classes on a Single Machine: Technical Supplement
Thomas Dean, Mark Ruzon, Mark Segal, Jonathon Shlens, Sudheendra Vijayanarasimhan, Jay Yagnik
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE Computer Society, Washington, DC, USA (2013)
-
HMM-based script identification for OCR
Dmitriy Genzel, Ashok Popat, Remco Teunen, Yasuhisa Fujii
Proceedings of the 4th International Workshop on Multilingual OCR, ACM, New York, NY, US (2013), 2:1-2:5
-
Handling Packet Loss in WebRTC
Stefan Holmer, Mikhal Shemer, Marco Paniconi
International Conference on Image Processing (ICIP 2013), IEEE, pp. 1860-1864
-
High-Resolution Global Maps of 21st-Century Forest Cover Change
Rebecca Moore, Matt Hancher, David Thau
Science, vol. 342 (2013), pp. 850-853
-
Image Annotation in Presence of Noisy Labels
Chandrashekhar V., Shailesh Kumar, C. V. Jawahar
International Conference on Pattern Recognition and Machine Intelligence (2013) (to appear)
-
Image Compression via Colorization Using Semi-Regular Color Samples
Chenguang Zhang, Hui Fang
Data Compression Conference (2013)
-
Joint Noise Level Estimation from Personal Photo Collections
YiChang Shih, Vivek Kwatra, Troy Chinen, Hui Fang, Sergey Ioffe
ICCV 2013 (to appear)
-
Learning Binary Codes for High Dimensional Data Using Bilinear Projections
Yunchao Gong, Sanjiv Kumar, Henry Rowley, Svetlana Lazebnik
IEEE Computer Vision and Pattern Recognition (2013)
-
Learning Multiple Non-Linear Sub-Spaces using K-RBMs
Siddhartha Chandra, Shailesh Kumar, C. V. Jawahar
Computer Vision and Pattern Recognition (2013)
-
Learning Part-based Templates from Large Collections of 3D Shapes
Vladimir Kim, Wilmot Li, Niloy Mitra, Siddhartha Chaudhuri, Stephen DiVerdi, Thomas Funkhouser
ACM Transactions on Graphics (TOG) - SIGGRAPH 2013 Conference Proceedings, vol. 32, no. 4 (2013), 70:1-70:12
-
Learning Query-Specific Distance Functions for Large-Scale Web Image Search
Yushi Jing, Michele Covell, David Tsai, James M. Rehg
IEEE Transactions on Multimedia, vol. 15 (2013), pp. 2022-2034
-
Modelling the Distortion Produced by Cochlear Compression
Roy D. Patterson, Timothy Ives, Thomas C. Walters, Richard F. Lyon
Basic Aspects of Hearing, Springer (2013), pp. 81-88
-
Random Grids: Fast Approximate Nearest Neighbors and Range Searching for Image Search
Dror Aiger, Efi Kokiopoulou, Ehud Rivlin
ICCV 2013
-
Rate-Distortion Optimization for Multichannel Audio Compression
Minyue Li, Jan Skoglund, W. Bastiaan Kleijn
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
-
RealBrush: Painting with Examples of Physical Media
Jingwan Lu, Connelly Barnes, Stephen DiVerdi, Adam Finkelstein
ACM Transactions on Graphics (TOG) - SIGGRAPH 2013 Conference Proceedings, vol. 32, no. 4 (2013), 117:1-117:12
-
Ivan Neulander, Toshi Kato, Kevin Beason
ACM, New York, NY, USA
-
Reporting Neighbors in High-Dimensional Euclidean Space
Dror Aiger, Haim Kaplan, Micha Sharir
SODA (2013)
-
Spatiotemporal Deformable Part Models for Action Detection
Yicong Tian, Rahul Sukthankar, Mubarak Shah
Proceedings of International Conference on Computer Vision and Pattern Recognition (CVPR 2013)
-
Street View Motion-from-Structure-from-Motion
Bryan Klingner, David Martin, James Roseborough
Proceedings of the International Conference on Computer Vision, IEEE (2013)
-
The Intervalgram: An Audio Feature for Large-Scale Cover-Song Recognition
Thomas C. Walters, David A. Ross, Richard F. Lyon
From Sounds to Music and Emotions: 9th International Symposium, CMMR 2012, London, UK, June 19-22, 2012, Revised Selected Papers, Springer Berlin Heidelberg (2013), pp. 197-213
-
The latest open-source video codec VP9 - An overview and preliminary results
Debargha Mukherjee, Jim Bankoski, Adrian Grange, Jingning Han, John Koleszar, Paul Wilkins, Yaowu Xu, Ronald S Bultje
Picture Coding Symposium (2013)
-
Tracking Large-Scale Video Remix in Real-World Events
Lexing Xie, Apostol Natsev, Xuming He, John R. Kender, Matthew L. Hill, John R. Smith
IEEE Transactions on Multimedia, vol. 15, no. 6 (2013), pp. 1244-1254
-
Understanding Indoor Scenes using 3D Geometric Phrases
Wongun Choi, Yu-Wei Chao, Caroline Pantofaru, Silvio Savarese
Proceedings of International Conference on Computer Vision and Pattern Recognition (CVPR 2013)
-
Using Web Co-occurrence Statistics for Improving Image Categorization
Samy Bengio, Jeffrey Dean, Dumitru Erhan, Eugene Ie, Quoc Le, Andrew Rabinovich, Jonathon Shlens, Yoram Singer
arXiv (2013)
-
Video Motion for Every Visible Point
Susanna Ricco, Carlo Tomasi
International Conference on Computer Vision (ICCV) (2013)
-
A QCQP Approach to Triangulation
Chris Aholt, Rekha Thomas, Sameer Agarwal
European Conference on Computer Vision, Springer Verlag (2012)
-
All Smiles: Automatic Photo Enhancement by Facial Expression Analysis
Rajvi Shah, Vivek Kwatra
Conference for Visual Media Production (CVMP 2012) [Best Paper]
-
Apparel silhouette attributes recognition
Wei Zhang, Emilio Antunez, Salih Gokturk, Baris Sumengen
Proceedings of the 2012 IEEE Workshop on the Applications of Computer Vision, IEEE Computer Society, Washington, DC, USA, pp. 489-496
-
Automatically Discovering Talented Musicians with Acoustic Analysis of YouTube Videos
Eric Nichols, Charles DuHadway, Hrishikesh Aradhye, Richard F. Lyon
Proceedings of the 2012 IEEE 12th International Conference on Data Mining (ICDM), IEEE Computer Society, Washington, DC, USA, pp. 559-565
-
Building Musically-relevant Audio Features through Multiple Timescale Representations
Philippe Hamel, Yoshua Bengio, Douglas Eck
Proceedings of the 13th International Society for Music Information Retrieval Conference, Porto, Portugal (2012)
-
Building high-level features using large scale unsupervised learning
Quoc Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg Corrado, Jeff Dean, Andrew Ng
International Conference in Machine Learning (2012)
-
Calibration-Free Rolling Shutter Removal
Matthias Grundmann, Vivek Kwatra, Daniel Castro, Irfan Essa
International Conference on Computational Photography [Best Paper], IEEE (2012)
-
Capturing Indoor Scenes with Smartphones
Aditya Sankar, Steve Seitz
Proc. UIST, 651 N. 34th St. (2012) (to appear)
-
Coherent image selection using a fast approximation to the generalized traveling salesman problem
Meng Wang, Prakash Ishwar, Janusz Konrad, Cenk Gazen, Rohit Saboo
Proceedings of the 20th ACM international conference on Multimedia, ACM, New York, NY, USA (2012), pp. 981-984
-
D-Nets: Beyond Patch-Based Image Descriptors
Felix von Hundelshausen, Rahul Sukthankar
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'12) (2012)
-
Efficient Closed-Form Solution to Generalized Boundary Detection
Marius Leordeanu, Rahul Sukthankar, Cristian Sminchisescu
Proceedings of European Conference on Computer Vision (ECCV'12) (2012)
-
Efficient model based single and double thresholding for real time recognition
Dror Aiger, Silvio Guimarães
ACCV Workshop on Detection and Tracking in Challenging Environments (2012)
-
Embedded Voxel Colouring with Adaptive Threshold Selection Using Globally Minimal Surfaces
Carlos Leung, Ben Appleton, Mitchell Buckley, Changming Sun
IJCV, vol. 99 (2012), pp. 215-231
-
General and Nested Wiberg Minimization
Dennis Strelow
Computer Vision and Pattern Recognition, IEEE (2012)
-
General and nested Wiberg minimization: L2 and maximum likelihood
Dennis Strelow
European Conference on Computer Vision, Springer (2012)
-
Improved Prediction of Nearly-Periodic Signals
Bastiaan Kleijn, Jan Skoglund
International Workshop on Acoustic Signal Enhancement 2012 (IWAENC2012)
-
Improving Book OCR by Adaptive Language and Image Models
Dar-Shyang Lee, Ray Smith
Proceedings of 2012 10th IAPR International Workshop on Document Analysis Systems, IEEE, pp. 115-119
-
Joint Image and Word Sense Discrimination For Image Retrieval
Aurelien Lucchi, Jason Weston
ECCV (2012)
-
Learning Hierarchical Bag of Words Using Naive Bayes Clustering
Siddhartha Chandra, Shailesh Kumar, C. V. Jawahar
Asian Conference on Computer Vision (2012), pp. 382-395
-
Measuring Noise Correlation for Improved Video Denoising
Anil Kokaram, Damien Kelly, Hugh Denman, Andrew Crawford
IEEE International Conference on Image Processing, IEEE, 1600 Amphitheatre Parkway (2012)
-
Measuring the Objectness of Image Windows
Bogdan Alexe, Thomas Deselaers, Vittorio Ferrari
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34/11 (2012), pp. 2189-2202
-
Mobile Music Modeling, Analysis and Recognition
Pavel Golik, Boulos Harb, Ananya Misra, Michael Riley, Alex Rudnick, Eugene Weinstein
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2012)
-
Model Recommendation for Action Recognition
Pyry Matikainen, Rahul Sukthankar, Martial Hebert
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'12) (2012)
-
Molli: Interactive Visualization for Exploratory Protein Analysis
Sara L. Su, Connor Gramazio, Megan Strait, Caitlin Crumm, Daniela Extrum-Fernandez, Matt Menke, Lenore Cowen
IEEE Computer Graphics & Applications, vol. 32 (2012), pp. 62-69
-
Multi-component Models for Object Detection
Chunhui Gu, Pablo Arbelaez, Yuanqing Lin, Kai Yu, Jitendra Malik
European Conference on Computer Vision, Springer (2012), Volume 4, 445-458
-
Multimedia Semantics: Interactions Between Content and Community
Hari Sundaram, Lexing Xie, Munmun De Choudhury, Yu-Ru Lin, Apostol Natsev
Proceedings of the IEEE, vol. 100, no. 9 (2012)
-
On Using Nearly-Independent Feature Families for High Precision and Confidence
Omid Madani, Manfred Georg, David Ross
Fourth Asian Machine Learning Conference, JMLR workshop and conference proceedings (2012), pp. 269-284
-
Photo Tours
Avanish Kushal, Ben Self, Yasutaka Furukawa, David Gallup, Carlos Hernandez, Brian Curless, Steve Seitz
3DimPVT 2012 (to appear)
-
Real-Time Human Pose Tracking from Range Data
Varun Ganapathi, Christian Plagemann, Daphne Koller, Sebastian Thrun
Proceedings of the European Conference on Computer Vision (ECCV) (2012)
-
Reconstructing the World's Museums
Jianxiong Xiao, Yasutaka Furukawa
European Conference on Computer Vision (2012) (to appear)
-
Refractive Height Fields from Single and Multiple Images
Qi Shan, Sameer Agarwal, Brian Curless
IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2012)
-
Repetition Maximization based Texture Rectification
Dror Aiger, Niloy Mitra, Daniel Cohen-Or
EUROGRAPHICS 2012
-
Scene Aligned Pooling for Complex Video Recognition
Liangliang Cao, Yadong Mu, Apostol Natsev, Shih-Fu Chang, Gang Hua, John R. Smith
ECCV (2012), pp. 688-701
-
Schematic Surface Reconstruction
Changchang Wu, Sameer Agarwal, Brian Curless, Steven M. Seitz
IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2012)
-
Semantic Segmentation Using Regions and Parts
Pablo Arbelaez, Bharath Hariharan, Chunhui Gu, Saurabh Gupta, Lubomir Bourdev, Jitendra Malik
Computer Vision and Pattern Recognition, IEEE Computer Society Washington, DC, USA (2012), pp. 3378-3385
-
Semi-Supervised Hashing for Large Scale Search
Jun Wang, Sanjiv Kumar, Shih-Fu Chang
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) (2012)
-
Shadow Removal for Aerial Imagery by Information Theoretic Intrinsic Image Analysis
Vivek Kwatra, Mei Han, Shengyang Dai
International Conference on Computational Photography, IEEE (2012)
-
Size Matters: Exhaustive Geometric Verification for Image Retrieval
Henrik Stewenius, Steinar H. Gunderson, Julien Pilet
12th European Conference on Computer Vision (ECCV), Springer (2012), pp. 674-687
-
Street view goes indoors: Automatic pose estimation from uncalibrated unordered spherical panoramas
Mohamed Aly, Jean-Yves Bouguet
Proceedings of the 2012 IEEE Workshop on the Applications of Computer Vision, IEEE Computer Society, Washington, DC, USA, pp. 1-8
-
Unsupervised Learning for Graph Matching
Marius Leordeanu, Rahul Sukthankar, Martial Hebert
International Journal of Computer Vision, vol. 96 (2012), pp. 28-45
-
ViSQOL: The Virtual Speech Quality Objective Listener
Andrew Hines, Jan Skoglund, Anil Kokaram, Naomi Harte
International Workshop on Acoustic Signal Enhancement 2012 (IWAENC2012)
-
Video Description Length Guided Constant Quality Video Coding with Bitrate Constraint
Lei Yang, Debargha Mukherjee, Dapeng Wu
Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on, IEEE, 2001 L Street, NW. Suite 700 Washington, DC 20036-4910 USA, pp. 366-371
-
Visibility Based Preconditioning for Bundle Adjustment
Avanish Kushal, Sameer Agarwal
IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2012)
-
Weakly Supervised Learning of Object Segmentations from Web-Scale Video
Glenn Hartmann, Matthias Grundmann, Judy Hoffman, David Tsai, Vivek Kwatra, Omid Madani, Sudheendra Vijayanarasimhan, Irfan Essa, James Rehg, Rahul Sukthankar
ECCV'12 Proceedings of the 12th international conference on Computer Vision - Volume Part I, Springer-Verlag, Berlin, Heidelberg (2012), pp. 198-208
-
A Hierarchical Conditional Random Field Model for Labeling and Segmenting Images of Street Scenes
Qixing Huang, Mei Han, Bo Wu, Sergey Ioffe
International Conference on Computer Vision and Pattern Recognition (2011)
-
Mechanics of Hearing (2011)
-
Aesthetics and Emotions in Images
Dhiraj Joshi, Ritendra Datta, Elena Fedorovskaya, Quang-Tuan Luong, James Z. Wang, Jia Li, Jiebo Luo
IEEE Signal Processing Magazine, vol. 28, no. 5 (2011), pp. 94-115
-
Steven R. Ness, Thomas Walters, Richard F. Lyon
Music Data Mining, CRC Press/Chapman Hall (2011)
-
Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths
Matthias Grundmann, Vivek Kwatra, Irfan Essa
IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011)
-
Automatic Language Identification in Music Videos with Low Level Audio and Visual Features
Vijay Chandrasekhar, Mehmet Emre Sargin, David A. Ross
Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2011)
-
Boosting Video Classification Using Cross-Video Signals
Mehmet Emre Sargin, Hrishikesh Aradhye
Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2011) (to appear)
-
Sameer Agarwal, Yasutaka Furukawa, Noah Snavely, Ian Simon, Brian Curless, Steven M. Seitz, Rick Szeliski
Communications of the ACM, vol. 54 (2011), pp. 105-112
-
Cascades of two-pole–two-zero asymmetric resonators are good models of peripheral auditory function
Journal of the Acoustical Society of America, vol. 130 (2011), pp. 3893-3904
-
Crowdsourcing Event Detection in YouTube Videos
Thomas Steiner, Ruben Verborgh, Rik Van de Walle, Michael Hausenblas, Joaquim Gabarro
Detection, Representation, and Exploitation of Events in the Semantic Web (DeRiVE 2011), Bonn, Germany
-
Discrete Point Based Signatures and Applications to Document Matching
Nemanja Spasojevic, Guillaume Poncin, Dan Bloomberg
ICIAP 2011
-
Discriminative Tag Learning on YouTube Videos with Latent Sub-tags
Weilong Yang, George Toderici
Computer Vision and Pattern Recognition, IEEE (2011)
-
Dynamic Stylized Shading Primitives
David Vanderhaeghe, Romain Vergne, Pascal Barla, William Baxter
Proc. Symposium on NonPhotorealistic Animation and Rendering (NPAR 2011), ACM
-
Ira Kemelmacher-Shlizerman, Eli Shechtman, Rahul Garg, Steven Seitz
ACM Trans. on Graphics (Proc. SIGGRAPH), vol. 30(4) (2011) (to appear)
-
Feature Seeding for Action Recognition
Pyry Matikainen, Rahul Sukthankar, Martial Hebert
International Conference on Computer Vision (ICCV) (2011)
-
Geometric Overpass Extraction from Vector Road Data and DSMs
Joshua Schpok
Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2011 (to appear)
-
Handling Label Noise in Video Classification via Multiple Instance Learning
Thomas Leung, Yang Song, John Zhang
ICCV'2011, IEEE
-
Image Saliency: From Local to Global Context
Meng Wang, Janusz Konrad, Prakash Ishwar, Yushi Jing, Henry Rowley
Proc. Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
-
Improving Video Classification via YouTube Video Co-Watch Data
John Zhang, Yang Song, Thomas Leung
ACM Workshop on Social and Behavioural Networked Media Access at ACM MM 2011, ACM
-
Kernelized Structural SVM Learning for Supervised Object Segmentation
Luca Bertelli, Tianli Yu, Diem Vu, Burak Gokturk
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition 2011
-
Large-Scale Image Annotation using Visual Synset
David Tsai, Yushi Jing, Henry Rowley, Yi Liu, Sergey Ioffe, James Rehg
Proc. International Conference on Computer Vision (ICCV) (2011)
-
Quoc V. Le, Will Zou, Serena Yeung, Andrew Y. Ng
Conference on Computer Vision and Pattern Recognition (2011)
-
Limits on the Application of Frequency-based Language Models to OCR
ICDAR, IEEE (2011), pp. 538-542
-
Changchang Wu, Sameer Agarwal, Brian Curless, Steven Seitz
Proc. IEEE Conf. on Computer Vision and Pattern Recognition (2011), pp. 3057-3064
-
Privacy protection and face recognition
Andrew Senior, Sharat Pankanti
Handbook of Face recognition, Springer, 236 Gray's Inn Road | Floor 6 London | WC1X 8HL | UK (2011), pp. 671-692
-
Reading Digits in Natural Images with Unsupervised Feature Learning
Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, Andrew Y. Ng
NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011
-
Sparse coding of auditory features for machine hearing in interference
Richard F. Lyon, Gal Chechik, Jay Ponte
Proc. ICASSP, IEEE (2011)
-
Summary of Opus listening test results
Christian Hoene, Jean-Marc Valin, Koen Vos, Jan Skoglund
IETF (2011)
-
Survey and Evaluation of Audio Fingerprinting Schemes for Mobile Query-By-Example Applications
Vijay Chandrasekhar, Matt Sharifi, David Ross
12th International Society for Music Information Retrieval Conference (ISMIR) (2011)
-
Technical Overview of VP8, an open source video codec for the web
Jim Bankoski, Paul Wilkins, Yaowu Xu
2011 International Workshop on Acoustics and Video Coding and Communication, IEEE, Barcelona, Spain (to appear)
-
The Power of Comparative Reasoning
Jay Yagnik, Dennis Strelow, David Ross, Ruei-Sung Lin
International Conference on Computer Vision, IEEE (2011)
-
Autumn Meeting of the Acoustical Society of Japan (2011), pp. 509-512
-
Visual and Semantic Similarity in ImageNet
Thomas Deselaers, Vittorio Ferrari
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011), pp. 1777-1784
-
Where's Waldo: Matching People in Images of Crowds
Rahul Garg, Deva Ramanan, Steven M. Seitz, Noah Snavely
Proc. IEEE Conf. on Computer Vision and Pattern Recognition (2011), pp. 1793-1800
-
YouTubeEvent: On Large-Scale Video Event Classification
Bingbing Ni, Yang Song, Ming Zhao
The 3rd International Workshop on Video Event Categorization, Tagging and Retrieval for Real-World Applications at IEEE ICCV'2011
-
A Large-Scale Taxonomic Classification System for Web-based Videos
Yang Song, Ming Zhao, Reto Strobl, John Zhang, Jay Yagnik
The 11th European Conference on Computer Vision (ECCV 2010)
-
Baselines for Image Annotation
Ameesh Makadia, Vladimir Pavlovic, Sanjiv Kumar
International Journal of Computer Vision (IJCV) (2010)
-
Beyond “Near-Duplicates”: Learning Hash Codes for Efficient Similar-Image Retrieval
Shumeet Baluja, Michele Covell
20th International Conference on Pattern Recognition 2010
-
Comparison of Clustering Approaches for Summarizing Large Populations of Images
Yushi Jing, Michele Covell, Henry A. Rowley
Proceedings ICME VCIDS, IEEE, Singapore (2010)
-
Discontinuous Seam-Carving for Video Retargeting
Matthias Grundmann, Vivek Kwatra, Mei Han, Irfan Essa
Computer Vision and Pattern Recognition (CVPR 2010)
-
Document Image Analysis (Chapter 18)
Dan Bloomberg, Luc Vincent
Mathematical morphology: theory and applications, ISTE-Wiley (2010), pp. 425-438
-
Efficient Hierarchical Graph-Based Video Segmentation
Matthias Grundmann, Vivek Kwatra, Mei Han, Irfan Essa
Computer Vision and Pattern Recognition (CVPR 2010)
-
Example-based Image Compression
Jing-Yu Cui, Saurabh Mathur, Michele Covell, Vivek Kwatra, Mei Han
International Conference on Image Processing (ICIP 2010)
-
Fast Covariance Computation and Dimensionality Reduction for Sub-Window Features in Images
European Conference on Computer Vision (ECCV 2010)
-
Feature Tracking for Wide-Baseline Image Retrieval
European Conference on Computer Vision (ECCV) (2010)
-
Google Street View: Capturing the World at Street Level
Dragomir Anguelov, Carole Dulong, Daniel Filip, Christian Frueh, Stéphane Lafon, Richard Lyon, Abhijit Ogale, Luc Vincent, Josh Weaver
Computer, vol. 43 (2010)
-
History and Future of Auditory Filter Models
Richard F. Lyon, Andreas G. Katsiamis, Emmanuel M. Drakakis
Proc. ISCAS, IEEE (2010), pp. 3809-3812
-
Improved Consistent Sampling, Weighted Minhash and L1 Sketching
ICDM (2010) (to appear)
-
Looking for Pieces of Needles in Millions of Haystacks: Finding Distorted Audio/Video Snippets
Michele Covell, Shumeet Baluja
International Workshop on Computer Vision (2010)
-
Machine Hearing: An Emerging Field
IEEE Signal Processing Magazine, vol. 27 (2010), pp. 131-139
-
Thomas Steiner, Michael Hausenblas
9th International Semantic Web Conference (ISWC 2010)
-
Semi-Supervised Hashing for Scalable Image Retrieval
Jun Wang, Sanjiv Kumar, Shih-Fu Chang
IEEE Conf on Computer Vision and Pattern Recognition (CVPR) (2010)
-
Sound Retrieval and Ranking Using Sparse Auditory Representations
Richard F. Lyon, Martin Rehn, Samy Bengio, Thomas C. Walters, Gal Chechik
Neural Computation, vol. 22 (2010), pp. 2390-2416
-
Table Detection in Heterogeneous Documents
Faisal Shafait, Ray Smith
Document Analysis Systems 2010, ACM International Conference Proceedings series
-
Taxonomic Classification for Web-based Videos
Yang Song, Ming Zhao, Jay Yagnik, Xiaoyun Wu
IEEE Conf on Computer Vision and Pattern Recognition (CVPR), IEEE (2010)
-
Video coding mode decision as a classification problem
Rashad Jillani, Urvang Joshi, Chiranjib Bhattacharya, Hari Kalva, RK Ramakrishnan
IS&T/SPIE Electronic Imaging, vol. 7543 (2010), 7543 - 7543 - 8
-
YouTubeCat: Learning to Categorize Wild Web Videos
Zheshen Wang, Ming Zhao, Yang Song, Sanjiv Kumar, Baoxin Li
IEEE Conf on Computer Vision and Pattern Recognition (CVPR) (2010)
-
A Biomimetic, 4.5 µW, 120+dB, Log-domain Cochlea Channel with AGC
Andreas G. Katsiamis, Emmanuel M. Drakakis, Richard F. Lyon
IEEE JSSC (Journal of Solid-State Circuits), vol. 44 (2009), pp. 1006-1022
-
Adapting the Tesseract Open Source OCR Engine for Multilingual OCR
Ray Smith, Daria Antonova, Dar-Shyang Lee
MOCR '09: Proceedings of the International Workshop on Multilingual OCR (2009)
-
Adaptive, selective, automatic tonal enhancement of faces
Hrishikesh Aradhye, George D. Toderici, Jay Yagnik
ACM Multimedia, ACM, New York, NY, USA (2009), pp. 677-680
-
Audiovisual Celebrity Recognition in Unconstrained Web Videos
Mehmet Emre Sargin, Hrishikesh Aradhye, Pedro Moreno, Ming Zhao
Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2009)
-
Automatic, Efficient, Temporally-Coherent Video Enhancement for Large Scale Applications
ACM Multimedia, ACM (2009), pp. 609-612
-
Combined Orientation and Script Detection using the Tesseract OCR Engine
Ranjith Unnikrishnan, Ray Smith
Workshop on Multilingual OCR (MOCR), Proc. 10th Intl. Conf. on Document Analysis and Recognition (ICDAR), (2009)
-
Computer Vision Interfaces for Interactive Art
Andrew Senior, Alejandro Jaimes
Human-Centric Interfaces for Ambient Intelligence, Elsevier (2009)
-
Efficient and Robust Music Identification with Weighted Finite-State Transducers
Mehryar Mohri, Pedro Moreno, Eugene Weinstein
IEEE Transactions on Audio, Speech, and Language Processing (2009) (to appear)
-
Flight patterns
Aaron Koblin
SIGGRAPH ASIA '09: ACM SIGGRAPH ASIA 2009 Art Gallery & Emerging Technologies: Adaptation, ACM, New York, NY, USA, pp. 29-29
-
Google Newspaper Search – Image Processing and Analysis Pipeline
Krishnendu Chaudhury, Ankur Jain, Sriram Thirthala, Vivek Sahasranaman, Shobhit Saxena, Selvam Mahalingam
10th International Conference on Document Analysis and Recognition, ICDAR 2009, pp. 621-625
-
Hybrid Page Layout Analysis via Tab-Stop Detection
Proceedings of the 10th International Conference on Document Analysis and Recognition, IEEE (2009)
-
Image Reconstruction in the Gigavision Camera
Feng Yang, Luciano Sbaiz, Edoardo Charbon, Sabine Susstrunk, Martin Vetterli
ICCV workshop OMNIVIS 2009
-
LSH Banding for Large-Scale Retrieval with Memory and Recall Constraints
Michele Covell, Shumeet Baluja
International Conference on Acoustics, Speech, and Signal Processing, IEEE (2009)
-
Large-scale Privacy Protection in Google Street View
Andrea Frome, German Cheung, Ahmad Abdulkader, Marco Zennaro, Bo Wu, Alessandro Bissacco, Hartwig Adam, Hartmut Neven, Luc Vincent
IEEE International Conference on Computer Vision (2009)
-
Low Cost Correction of OCR Errors Using Learning in a Multi-Engine Environment
Ahmad Abdulkader, Matthew R. Casey
Proceedings of the 10th International Conference on Document Analysis and Recognition, IEEE (2009)
-
Models for patch-based image restoration
Mithun Das Gupta, Shyamsundar Rajaram, Nemanja Petrovic, Thomas S. Huang
J. Image Video Process., vol. 2009 (2009), pp. 1-12
-
Jean-Francois Paiement, Yves Grandvalet, Samy Bengio
Connection Science, vol. 21 (2009), pp. 253-272
-
Privacy Protection in Video Surveillance
Springer (2009)
-
SD-VBS: The San Diego Vision Benchmark Suite
Sravanthi Kota Venkata, Ikkjin Ahn, Donghwan Jeon, Anshuman Gupta, Christopher Louie, Saturnino Garcia, Serge Belongie, Michael Bedford Taylor
IEEE Workload Characterization Symposium (2009), pp. 55-64
-
Shape-based Object Recognition in Videos Using 3D Synthetic Object Models
Alexander Toshev, Ameesh Makadia, Kostas Daniilidis
Computer Vision and Pattern Recognition (2009)
-
Softcuts: A Soft Edge Smoothness Prior for Color Image Super Resolution
Shengyang Dai, Mei Han, Wei Xu, Ying Wu, Yihong Gong, Aggelos K. Katsaggelos
IEEE Transactions on Image Processing (T-IP), vol. 18 (2009), pp. 969-981
-
Sound Ranking Using Auditory Sparse-Code Representations
Martin Rehn, Richard F. Lyon, Samy Bengio, Thomas C. Walters, Gal Chechik
ICML 2009 Workshop on Sparse Methods for Music Audio
-
State of the Art in Example-based Texture Synthesis
Li-Yi Wei, Sylvain Lefebvre, Vivek Kwatra, Greg Turk
Eurographics 2009, State of the Art Report, EG-STAR, Eurographics Association
-
Tour the World: building a web-scale landmark recognition engine
Yantao Zheng, Ming Zhao, Yang Song, Hartwig Adam, Ulrich Buddemeier, Alessandro Bissacco, Fernando Brucher, Tat-Seng Chua, Hartmut Neven
International Conference on Computer Vision and Pattern Recognition (CVPR) (2009)
-
Tree detection from aerial imagery
Lin Yang, Xiaqing Wu, Emil Praun, Xiaoxu Ma
Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Seattle, Washington (2009)
-
Visualizing Web Images via Google Image Swirl
Yushi Jing, Henry A. Rowley, Chuck Rosenberg, Jingbin Wang, Michele Covell
NIPS Workshop on Statistical Machine Learning for Visual Analytics (2009)
-
A New Baseline For Image Annotation
Ameesh Makadia, Vladimir Pavlovic, Sanjiv Kumar
European Conference on Computer Vision (ECCV) (2008)
-
Beyond Sliding Windows: Object Localization by Efficient Subwindow Search
Christoph H. Lampert, Matthew B. Blaschko, Thomas Hofmann
IEEE Computer Vision and Pattern Recognition (CVPR), Anchorage, AK (2008)
-
Coordinated Multi-Device Presentations: Ambient-Audio Identification
Michael Fink, Michele Covell, Shumeet Baluja
Encyclopedia of Wireless and Mobile Communications, Taylor & Francis (2008), pp. 274-285
-
Estimating the Spectral Reflectance of Natural Imagery Using Color Image Features
Josh Hyman, Mark Hansen, Eric Graham, Deborah Estrin
Workshop on Applications, Systems, and Algorithms for Image Sensing (2008)
-
Face Tracking and Recognition with Visual Constraints in Real-World Videos
Minyoung Kim, Sanjiv Kumar, Vladimir Pavlovic, Henry A. Rowley
IEEE Computer Vision and Pattern Recognition (CVPR) (2008)
-
Fluid in Video: Augmenting Real Video with Simulated Fluids
Vivek Kwatra, Philippos Mordohai, Rahul Narain, Sashi Kumar Penta, Mark Carlson, Marc Pollefeys, Ming C. Lin
Comput. Graph. Forum (Proc. Eurographics), vol. 27 (2008), pp. 487-496
-
Large Scale Learning and Recognition of Faces in Web Videos
Ming Zhao, Jay Yagnik, Hartwig Adam, David Bau
FG2008
-
Ameet Talwalkar, Sanjiv Kumar, Henry A. Rowley
Computer Vision and Pattern Recognition (CVPR) (2008)
-
Linear Time Maximally Stable Extremal Regions
David Nistér, Henrik Stewénius
Proc. 10th Europ. Conf. Comput. Vision (2008), pp. 183-196
-
Markovian Mixture Face Recognition with discriminative face alignment
Automatic Face and Gesture Recognition, IEEE (2008)
-
Mass Personalization: Social and Interactive Applications using Sound-Track Identification
Michael Fink, Michele Covell, Shumeet Baluja
Journal of Multimedia Tools and Applications, vol. 36 (2008), pp. 115-132
-
PageRank for Product Image Search
Yushi Jing, Shumeet Baluja
WWW-2008
-
Permutation Grouping: Intelligent Hash Function Design for Audio & Image Retrieval
Shumeet Baluja, Michele Covell, Sergey Ioffe
International Conference on Acoustics, Speech and Signal Processing (ICASSP-2008)
-
Reducing Photon Mapping Bandwidth by Query Reordering
Joshua Steinhurst, Greg Coombe, Anselmo Lastra
IEEE Transactions on Visualization and Computer Graphics, vol. 14 (2008)
-
Solving the label resolution problem in supervised video content classification
MIR '08: Proceeding of the 1st ACM international conference on Multimedia information retrieval, ACM, New York, NY, USA (2008), pp. 276-282
-
Stereo Matching with Color-weighted Correlation, Hierarchical Belief Propagation and Occlusion Handling
Qingxiong Yang, Liang Wang, Ruigang Yang, Henrik Stewénius, David Nistér
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) (2008)
-
Visual Synset: Towards a Higher-level Visual Representation
Yantao Zheng, Ming Zhao, Shi-Yong Neo, Tat-Seng Chua, Qi Tian
CVPR (2008)
-
VisualRank: Applying PageRank to Large-Scale Image Search
Yushi Jing, Shumeet Baluja
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30 (2008), pp. 1877-1890
-
Waveprint: Efficient Wavelet-Based Audio Fingerprinting
Shumeet Baluja, Michele Covell
Pattern Recognition (2008)
-
Web-scale Image Annotation
Jiakai Liu, Rong Hu, Meihong Wang, Yi Wang, Edward Chang
Pacific-Rim Conference on Multimedia (2008) (to appear)
-
An Overview of the Tesseract OCR Engine
Proc. Ninth Int. Conference on Document Analysis and Recognition (ICDAR), IEEE Computer Society (2007), pp. 629-633
-
Audio Fingerprinting: Combining Computer Vision & Data Stream Processing
Shumeet Baluja, Michele Covell
Proceedings of the 2007 International Conference on Acoustics, Speech, and Signal Processing
-
Automated Image Orientation Detection: A Scalable Boosting Approach
Pattern Analysis and Applications (2007)
-
Automatic Alignment of Large-scale Aerial Rasters to Road-maps
James Xiaqing Wu, Rodrigo Carceroni, Hui Fang, Steve Zelinka, Andrew Kirmse
ACM GIS 2007, ACM
-
Boosting Sex Identification Performance
Shumeet Baluja, Henry A. Rowley
International Journal of Computer Vision, vol. 71 (2007), pp. 111-119
-
Canonical Image Selection from the Web
Yushi Jing, Shumeet Baluja, Henry A. Rowley
ACM International Conference on Image and Video Retrieval (2007)
-
Classification of Weakly-Labeled Data with Partial Equivalence Relations
International Conference on Computer Vision (ICCV) (2007)
-
Detail Preserving Shape Deformation in Image Editing
Hui Fang, John C. Hart
Proc. SIGGRAPH 2007, ACM, San Diego, no. 12
-
Efficient Complete and Incomplete Path Openings and Closings
Hugues Talbot, Ben Appleton
Image and Vision Computing, vol. 25, no. 4 (2007), pp. 416-425
-
GRADE-IV: Visualizing Graphics Library Operations in an Executing Program
Hidehiko Abe, Takeo Igarashi
SIGGRAPH 2007 Posters, ACM, no. 118
-
Google Books: Making the public domain universally accessible
Adam Langley, Dan Bloomberg
Document Recognition and Retrieval XIV, SPIE (2007), 65000H1-65000H10
-
Imagers as sensors: Correlating plant CO2 uptake with digital visible-light imagery
Josh Hyman, Eric Graham, Mark Hansen, Deborah Estrin
Data Management for Sensor Networks (2007)
-
Known-Audio Detection Using Waveprint: Spectrogram Fingerprinting By Wavelet Hashing
Michele Covell, Shumeet Baluja
Proceedings of the 2007 International Conference on Acoustics, Speech, and Signal Processing
-
Music Identification with Weighted Finite-State Transducers
Eugene Weinstein, Pedro J. Moreno
Proceedings of the International Conference in Acoustics, Speech and Signal Processing (ICASSP) (2007)
-
Ordinal Regression Based Subpixel Shift Estimation for Video Super-Resolution
Mithun Das Gupta, Shyamsundar Rajaram, Thomas S. Huang, Nemanja Petrovic
EURASIP Journal on Advances in Signal Processing, vol. 85963 (2007)
-
Practical Gammatone-Like Filters for Auditory Modeling
Andreas G. Katsiamis, Emmanuel M. Drakakis, Richard F. Lyon
EURASIP Journal on Audio, Speech, and Music Processing, vol. 2007 (2007), pp. 12
-
Practical MythTV: Building a PVR and Media Center PC
Michael Still, Stewart Smith
Apress (2007), pp. 350
-
Raising Global Awareness with Google Earth
Imaging Notes, vol. 22, no. 2 (2007), pp. 24-29
-
Robust music identification, detection, and analysis
M. Mohri, Pedro J. Moreno, Eugene Weinstein
Proceedings of the International Conference on Music Information Retrieval (ISMIR) (2007)
-
Temporally Consistent Reconstruction from Multiple Video Streams using Enhanced Belief Propagation
E. Scott Larsen, Philippos Mordohai, Marc Pollefeys, Henry Fuchs
Eleventh IEEE International Conference on Computer Vision (2007)
-
Advertisement Detection and Replacement using Acoustic and Visual Repetition
Michele Covell, Shumeet Baluja, Michael Fink
Proceedings of the 2006 International Workshop on Multimedia Signal Processing, IEEE
-
Content Fingerprinting Using Wavelets
Shumeet Baluja, Michele Covell
Proceedings of the Conference of Visual Media Production, IET (2006)
-
Detecting Ads in Video Streams using Acoustic and Visual Cues
Michele Covell, Shumeet Baluja, Michael Fink
Computer Magazine (2006), pp. 135-137
-
Globally Minimal Surfaces by Continuous Maximal Flows
Ben Appleton, Hugues Talbot
IEEE Trans. Pattern Anal. Mach. Intell., vol. 28 (2006), pp. 106-118
-
Large Scale Image-Based Adult-Content Filtering
Henry A. Rowley, Yushi Jing, Shumeet Baluja
1st International Conference on Computer Vision Theory, Setúbal, Portugal (2006)
-
Query by Semantic Example
Nikhil Rasiwasia, Nuno Vasconcelos, Pedro J. Moreno
CIVR (2006), pp. 51-60
-
Social- and Interactive-Television Applications Based on Real-Time Ambient-Audio Identification
Michael Fink, Michele Covell, Shumeet Baluja
European Interactive TV Conference (Euro-ITV) (2006)
-
Time-Scale Modification for 3G-Telephony Video
Michele Covell, Sumit Roy, Bo Shen
Proceedings of the 2006 International Workshop on Multimedia Signal Processing, IEEE
-
Boosting Sex Identification Performance
Shumeet Baluja, Henry A. Rowley
Proceedings of the Seventeenth Innovative Applications of Artificial Intelligence Conference, AAAI (2005), pp. 1508-1513
-
Large Scale Performance Measurement of Content-Based Automated Image-Orientation Detection
Shumeet Baluja, Henry A. Rowley
International Conference on Image Processing, Genova, Italy (2005)
-
The Definitive Guide to ImageMagick
Michael Still
Apress, Berkeley, CA (2005), pp. 335
-
Efficient Face Orientation Discrimination
Shumeet Baluja, Mehran Sahami, Henry A. Rowley
International Conference on Image Processing (ICIP-2004)