[Academic Seminar] Transparent Composite Model for Large Scale Image/Video Processing

Posted by: System Administrator | Date posted: 2014-09-22

 

Academic Seminar

 

Title: Transparent Composite Model for Large Scale Image/Video Processing

 

Speaker: En-hui Yang (杨恩辉)

Canada Research Chair

Tenured Professor, University of Waterloo, Canada

Fellow of the Royal Society of Canada

Department of Electrical and Computer Engineering

University of Waterloo, Canada

 

Time: Friday, September 26, 9:00-10:00 a.m.

Venue: Lecture Hall, 1st floor, Shiing-Shen Building (省身楼)

 

All faculty and students are welcome!

 

 

Institute of Statistics

Chern Institute of Mathematics

College of Computer and Control Engineering

September 17, 2014


Abstract

With the wide availability of imaging and sensor devices, there has been tremendous growth in the volume of image/video data. It is estimated that at any given second, over 1 trillion photos are available online from web pages, social media, ad photos, etc. Similarly, according to the Cisco VNI Mobile 2012 forecast, global mobile data traffic will grow to 10.8 exabytes per month by 2016, and over 70% of mobile data will be video.

The volume, velocity, and variety of image/video data, including image/video data for road conditions and road traffic information, however, make the data challenging to transmit, store, and, more importantly, make sense of. To provide a stimulus for new methodologies and new ways of thinking to advance the fields of image/video coding and understanding, in this talk we will present a recently developed model, dubbed a transparent composite model (TCM), for transformed image/video data to facilitate large scale image/video processing. Specifically, to handle the heavy tail phenomenon commonly seen in Discrete Cosine Transform (DCT) coefficients of image/video data, a TCM first separates the tail of a sequence of DCT coefficients from the main body of the sequence. Then, a uniform distribution is used to model the tail while a different parametric distribution is used to model the main body. The separation boundary and other parameters of the TCM can be estimated via maximum likelihood (ML) estimation. Efficient online algorithms with exponentially fast convergence will also be presented for computing ML estimates of these parameters. Experimental results based on Kullback-Leibler divergence and the $\chi^2$ test show that for real-valued continuous AC coefficients, the TCM based on the truncated Laplacian (LPTCM) offers the best trade-off between modeling accuracy and complexity. For discrete DCT coefficients, the discrete TCM based on truncated geometric distributions (GMTCM) models AC coefficients more accurately than pure Laplacian models and generalized Gaussian models in the majority of cases, while having simplicity and practicality similar to those of pure Laplacian models. In addition, it is demonstrated that the GMTCM also exhibits a good capability for data reduction/feature extraction: DCT coefficients in the heavy tail identified by the GMTCM are truly outliers, and these outliers represent an outlier image revealing some unique global features of the image.
Applications of TCM to image/video coding, understanding, and management will also be discussed.
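
As a rough illustration of the body/tail idea in the abstract, the sketch below fits a truncated Laplacian body plus a uniform tail to a list of real-valued coefficients by maximum likelihood. It is only a sketch under stated assumptions, not the speaker's algorithm: the function names (`truncated_mean`, `ml_scale`, `fit_lptcm`, `demo`) are hypothetical, the boundary is restricted to observed magnitudes, and a brute-force search with bisection stands in for the efficient online algorithms mentioned in the talk.

```python
import math
import random
from itertools import accumulate


def truncated_mean(lam, yc):
    """Mean of |y| for a Laplacian with scale lam truncated to [-yc, yc]."""
    x = yc / lam
    # For very large x the correction term vanishes (and expm1 would overflow).
    return lam if x > 700.0 else lam - yc / math.expm1(x)


def ml_scale(mean_abs, yc, iters=60):
    """Bisection for the ML scale: solve truncated_mean(lam, yc) == mean_abs."""
    lo, hi = 1e-9 * yc, 1e3 * yc
    if mean_abs >= truncated_mean(hi, yc):  # body looks flat: cap the scale
        return hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid, yc) < mean_abs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_lptcm(samples):
    """Exhaustive ML search (assumed setup); returns (yc, lam, b, loglik).

    yc  = boundary between body and tail (searched over sample magnitudes)
    lam = scale of the truncated Laplacian body
    b   = probability of the body; the tail is uniform on yc < |y| <= a
    """
    mags = sorted(abs(y) for y in samples)
    n, a = len(mags), mags[-1]
    prefix = list(accumulate(mags))  # prefix[k] = sum of k+1 smallest magnitudes
    best = None
    for k in range(n - 1):  # body = mags[0..k], tail = everything larger
        yc = mags[k]
        if yc <= 0.0 or yc >= a:
            continue
        n1, n2 = k + 1, n - k - 1
        b = n1 / n  # ML estimate of the body probability
        lam = ml_scale(prefix[k] / n1, yc)
        body = (n1 * math.log(b) - prefix[k] / lam - n1 * math.log(2.0 * lam)
                - n1 * math.log(-math.expm1(-yc / lam)))
        tail = n2 * (math.log(1.0 - b) - math.log(2.0 * (a - yc)))
        if best is None or body + tail > best[3]:
            best = (yc, lam, b, body + tail)
    return best


def demo(seed=0):
    """Synthetic check: Laplacian body (scale 5) plus uniform outliers."""
    rng = random.Random(seed)
    body = [rng.choice((-1, 1)) * rng.expovariate(1 / 5.0) for _ in range(5000)]
    tail = [rng.choice((-1, 1)) * rng.uniform(40.0, 100.0) for _ in range(300)]
    return fit_lptcm(body + tail)
```

Calling `demo()` fits synthetic data whose outliers start at magnitude 40; the returned boundary should land near where the outliers begin and the scale near 5. Coefficients beyond the fitted boundary are the ones the abstract refers to as outliers.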


 

Short Bio:

 

En-hui Yang has been with the Dept. of Electrical and Computer Engineering, University of Waterloo, Ontario, Canada since June 1997, where he is now a Professor and Canada Research Chair in information theory and multimedia compression. He is the founding director of the Leitch-University of Waterloo multimedia communications lab, and a co-founder of SlipStream Data Inc. (now a subsidiary of BlackBerry (formerly Research In Motion)). He currently also serves as an Executive Council Member of China Overseas Exchange Association and an Overseas Advisor for the Overseas Chinese Affairs Office of the City of Shanghai, and serves on the Overseas Expert Advisory Committee for the Overseas Chinese Affairs Office of the State Council of China.

 

He served, among many other roles, as an Associate Editor for IEEE Transactions on Information Theory, a general co-chair of the 2008 IEEE International Symposium on Information Theory, the largest premier international conference on information theory in the world, a technical program vice-chair of the 2006 IEEE International Conference on Multimedia & Expo (ICME), the chair of the award committee for the 2004 Canadian Award in Telecommunications, a co-editor of the 2004 Special Issue of the IEEE Transactions on Information Theory, a co-chair of the 2003 US National Science Foundation (NSF) workshop on the interface of Information Theory and Computer Science, the purpose of which is to advise NSF about research directions and support in the interface area, and a co-chair of the 2003 Canadian Workshop on Information Theory.

 

An innovator and a pioneer in his fields, Dr. Yang is a Fellow of IEEE, a Fellow of the Canadian Academy of Engineering, and a Fellow of the Royal Society of Canada (The Academies of Arts, Humanities and Sciences of Canada). He is also a recipient of several research awards and honors, including the inaugural Ontario Premier’s Catalyst Award in 2007 for Innovator of the Year (proceeds of the award were donated entirely to the University of Waterloo to create an annual En-hui Yang Engineering Research Innovation Award, open to all Engineering faculty at Waterloo, to encourage and support research and innovation); the 2007 Ernest C. Manning Award of Distinction, one of Canada's most prestigious innovation prizes; the 2013 CPAC Professional Achievement Award; and the 2014 Padovani Lecture. Products based on his inventions and commercialized by SlipStream received the 2006 Ontario Global Traders Provincial Award. With over 210 papers and more than 200 patents/patent applications worldwide, his research work has had an impact on the daily life of hundreds of millions of people in over 170 countries through commercialized products, open-source video coding software, or video coding standards. In 2011, he was selected for inclusion in Canadian Who’s Who.