Egocentric (First-Person) Vision
Instructor
Giovanni Maria Farinella
Università di Catania
Italy
Brief Bio
Giovanni Maria Farinella obtained the degree in Computer Science (egregia cum laude) from the University of Catania, Italy, in 2004. He has been a founding member of the IPLAB Research Group at the University of Catania since 2005. He was awarded a Doctor of Philosophy (Computer Vision) by the University of Catania in 2008. He is currently a Full Professor at the Department of Mathematics and Computer Science, University of Catania, Italy. His research interests lie in the fields of Computer Vision, Pattern Recognition and Machine Learning, with a focus on First Person (Egocentric) Vision. He is an Associate Editor of the international journals IEEE Transactions on Pattern Analysis and Machine Intelligence, Pattern Recognition (Elsevier) and IET Computer Vision. He has served as Area Chair for CVPR 2020/21/22, ICCV 2017/19/21, ECCV 2020, BMVC 2020, WACV 2019 and ICPR 2018, and as Program Chair of ECCV 2022, ICIAP 2021 and VISAPP 2019/20/21/22/23. Giovanni Maria Farinella founded (in 2006) and currently directs the International Computer Vision Summer School. He also founded (in 2014) and currently directs the Medical Imaging Summer School. He is a member of the European Laboratory for Learning and Intelligent Systems (ELLIS), a Senior Member of the IEEE Computer Society, a Scientific Advisor of the NVIDIA AI Technology Centre (NVAITC), and a board member of the CINI Laboratory of Artificial Intelligence and Intelligent Systems (lead of the AI for Industry area since 2021). He was awarded the PAMI Mark Everingham Prize in 2017. In addition to his academic work, Giovanni's industrial experience includes scientific advisorship to various national and international companies and startups, as well as leadership as Founder and Chief Scientific Officer of Next Vision, a spinoff of the University of Catania.
Abstract
In the coming years, a huge number of images and videos related to our daily lives and acquired with wearable cameras will become available. The increasing use of Egocentric (First-Person) Cameras poses new challenges for the computer vision community, and offers the opportunity to build new applications with commercialization potential. This tutorial will give an overview of the advances in the field of Egocentric (First-Person) Vision. Challenges, applications and algorithms will be discussed, considering both the past and the recent literature.
Keywords
Computer Vision, Wearable Devices, Egocentric (First-Person) Vision, Learning to see, What Happens Next, Applications
Aims and Learning Objectives
The objective of this tutorial is to provide an overview of the latest advances in Computer Vision, considering the challenges and applications that arise in the context of Egocentric (First-Person) Vision. The attendees will become familiar with current and future devices and computer vision technologies, as well as with the current state-of-the-art algorithms.
Target Audience
This course is intended for those with a general computing background and an interest in image processing, computer vision and machine learning. Ph.D. students, post-docs, young researchers (both academic and industrial), senior researchers (both academic and industrial) and academic/industrial professionals will benefit from the general overview and the introduction to the most recent advances in the field.
Prerequisite Knowledge of Audience
Basic knowledge in the fields of Image Processing, Computer Vision, and Machine Learning.
Detailed Outline
- Introduction and Motivation
- Open Challenges
- State-of-the-Art Algorithms
- Applications and Opportunities
Bayesian and Quasi Monte Carlo Spherical Integration for Illumination Integrals
Instructors
Kadi Bouatouch
IRISA, University of Rennes 1
France
Brief Bio
Professor Kadi Bouatouch is an electronics and automatic systems engineer (ENSEM 1974). He was awarded a PhD in 1977 (University of Nancy 1, France) and a higher doctorate in computer science in the field of computer graphics in 1989 (University of Rennes 1, France). He has worked on global illumination, lighting simulation for complex environments, GPU-based rendering, HDR imaging and computer vision. He is currently Emeritus Professor at the University of Rennes 1 (France) and a researcher at IRISA Rennes (Institut de Recherche en Informatique et Systèmes Aléatoires). He was the head of the FRVSense team within IRISA. He has been a member of the program committee of several conferences and workshops and a reviewer for several Computer Graphics journals, such as The Visual Computer, ACM Transactions on Graphics, IEEE Computer Graphics and Applications, IEEE Transactions on Visualization and Computer Graphics and IEEE Transactions on Image Processing. He has also acted as a reviewer for many conferences and workshops, and has served as an external examiner (rapporteur) for several PhD theses and higher doctorates in France and abroad (USA, UK, Belgium, Cyprus, The Netherlands, Spain, Germany, Algeria, etc.). He was an associate editor of The Visual Computer journal.
Ricardo Marques
Department of Mathematics and Informatics, Universitat de Barcelona
Spain
Brief Bio
Ricardo Marques received his MSc degree in Computer Graphics and Distributed Parallel Computation from Universidade do Minho, Portugal (fall 2009), after which he worked as a researcher at the same university. He joined INRIA (Institut National de Recherche en Informatique et Automatique) and the FRVSense team as a PhD student in the fall of 2010, under the supervision of Kadi Bouatouch. His thesis work focused on spherical integration methods applied to light transport simulation. He defended his PhD thesis in the fall of 2013 and joined the Mimetic INRIA research team as a research engineer in 2014, where he worked in the field of crowd simulation. In the fall of 2015 he joined the Interactive Technologies Group (GTI) of Universitat Pompeu Fabra (UPF) in Barcelona as a post-doc. In August 2016 he received a Marie Curie Fellowship. In June 2020 he joined the Department of Mathematics and Informatics of Universitat de Barcelona as a tenure-eligible Lecturer, hired through the Serra Húnter Excellence Programme. Since then he has further broadened his research interests, which now also include computer vision and deep-learning methods.
Christian Bouville
IRISA
France
Abstract
The tutorial addresses two quadrature methods: Quasi Monte Carlo (QMC) and Bayesian Monte Carlo (BMC). These two approaches are applied to compute the shading integral in global illumination. First, we will show that Bayesian Monte Carlo can significantly outperform importance sampling Monte Carlo through a more effective use of the information produced by sampling. As for QMC, we will show that QMC methods exhibit a faster convergence rate than that of classic Monte Carlo methods. This feature has made QMC prevalent in image synthesis, where it is frequently used for approximating the value of spherical integrals (e.g., the shading integral). In this tutorial we present a strategy for producing high-quality QMC sampling patterns for spherical integration by resorting to spherical Fibonacci point sets.
Keywords
Quasi Monte Carlo, Bayesian Monte Carlo, Global illumination, sampling, rendering
Aims and Learning Objectives
To present QMC methods for global illumination in detail, and to introduce attendees to a newer approach, the BMC method.
Target Audience
The intended audiences are Ph.D. students and researchers in the field of realistic image synthesis or global illumination algorithms, or any person with a solid background in graphics and numerical techniques.
Prerequisite Knowledge of Audience
Basic knowledge of realistic image synthesis and global illumination algorithms, or a solid background in graphics and numerical techniques.
Detailed Outline
The Monte Carlo method has proved to be very powerful for coping with global illumination problems; however, it requires a large number of sampling operations. Previous work has shown that Bayesian Monte Carlo can significantly outperform importance sampling Monte Carlo through a more effective use of the information produced by sampling. The main goal of this part of the tutorial is to propose a generalized approach to Bayesian Monte Carlo in the context of global illumination rendering. In particular, we propose solutions to the problems of Bayesian quadrature computation, sample-set optimization and prior knowledge modeling. Our results show the benefits of our method even with a very simple parametrization of the prior model.
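To make the Bayesian quadrature idea concrete, here is a minimal one-dimensional sketch (not the tutorial's spherical formulation): the integrand is given a zero-mean Gaussian-process prior, and the posterior mean of the integral is z^T K^{-1} y, where z holds the kernel integrals. The function name, kernel choice and hyperparameters below are illustrative assumptions.

```python
import numpy as np
from scipy.special import erf

def bmc_estimate(xs, ys, ell=0.2, jitter=1e-8):
    """Bayesian Monte Carlo estimate of the integral of f over [0, 1] from
    samples (xs, ys), under a zero-mean GP prior with squared-exponential
    covariance k(x, x') = exp(-(x - x')^2 / (2 ell^2)). The posterior mean
    of the integral is z^T K^{-1} y, with z_i = integral of k(x, x_i) dx,
    which is available in closed form for this kernel."""
    X = np.asarray(xs, dtype=float)
    K = np.exp(-0.5 * ((X[:, None] - X[None, :]) / ell) ** 2)
    K += jitter * np.eye(len(X))                      # numerical stabilization
    s = ell * np.sqrt(2.0)
    z = ell * np.sqrt(np.pi / 2.0) * (erf((1.0 - X) / s) - erf(-X / s))
    weights = np.linalg.solve(K, z)                   # quadrature weights K^{-1} z
    return weights @ np.asarray(ys, dtype=float)

# Toy check on f(x) = sin(pi x), whose exact integral over [0, 1] is 2/pi.
xs = np.linspace(0.0, 1.0, 12)
print(bmc_estimate(xs, np.sin(np.pi * xs)))           # ~0.6366
```

Note how the sample positions enter the estimate through the weights, not just the values: this is the "more effective use of the information produced by sampling" mentioned above.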
The second part of the tutorial is concerned with Quasi-Monte Carlo.
Quasi-Monte Carlo (QMC) methods exhibit a faster convergence rate than that of classic Monte Carlo methods. This feature has made QMC prevalent in image synthesis, where it is frequently used for approximating the value of spherical integrals (e.g., the illumination integral). The common approach for generating QMC sampling patterns for spherical integration is to resort to unit square low-discrepancy sequences and map them to the hemisphere. However, such an approach is suboptimal, as these sequences do not account for the spherical topology and their discrepancy properties on the unit square are impaired by the spherical projection. In this tutorial we present a strategy for producing high-quality QMC sampling patterns for spherical integration by resorting to spherical Fibonacci point sets. We show that these patterns, when applied to illumination integrals, are very simple to generate and consistently outperform existing approaches, both in terms of root mean square error (RMSE) and image quality.
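As a flavor of how simple these patterns are to generate, below is a minimal sketch (assumptions: function name, point count and the uniform-density estimator are illustrative, not the tutorial's exact construction) that builds a spherical Fibonacci point set on the hemisphere and uses it as a QMC rule on a cosine-weighted integral whose exact value is pi.

```python
import numpy as np

def spherical_fibonacci_hemisphere(n):
    """Spherical Fibonacci point set on the upper hemisphere (z >= 0)."""
    golden = (1.0 + np.sqrt(5.0)) / 2.0
    i = np.arange(n)
    phi = 2.0 * np.pi * np.mod(i / golden, 1.0)   # azimuth driven by the golden ratio
    z = 1.0 - (i + 0.5) / n                       # evenly spaced cos(theta) in (0, 1)
    r = np.sqrt(np.maximum(0.0, 1.0 - z * z))
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

# Sanity check on a shading-style integral with a known answer: for constant
# incoming radiance L = 1, the cosine-weighted hemisphere integral equals pi.
# With uniform density 1/(2*pi), the estimator is (2*pi / n) * sum(cos(theta)).
pts = spherical_fibonacci_hemisphere(1024)
estimate = 2.0 * np.pi * pts[:, 2].mean()
print(estimate)  # ~3.1416
```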
Image Quality Assessment based on Machine Learning for the Special Case of Computer-generated Images
Instructor
Andre Bigand
ULCO
France
Brief Bio
André Bigand received the Ph.D. degree in 1993 from the University Paris 6 and the HDR degree in 2001 from the Université du Littoral of Calais (ULCO, France). He has been a senior associate professor at ULCO since 1993. His current research interests include uncertainty modelling and machine learning, with applications to image processing and synthesis (particularly noise modelling and filtering). He is currently with the LISIC Laboratory (ULCO). He has 33 years of experience in teaching and lecturing. He is a visiting professor at UL - Lebanese University, where he teaches "machine learning and pattern recognition" in the research master STIP.
Abstract
Unbiased global illumination methods based on stochastic techniques provide photorealistic images. They are, however, prone to noise that can only be reduced by increasing the number of computed samples. The problem of finding the number of samples required to ensure that most observers cannot perceive any noise is still open, since the ideal image is unknown. Image quality assessment is well understood for natural scene images, and this is summed up in the tutorial introduction. Image quality (or noise) evaluation of computer-generated images is slightly different, since image acquisition is different. In this tutorial we address this problem, focusing on the visual perception of noise. Rather than use known perceptual models, we investigate the use of machine learning approaches classically used in the Artificial Intelligence area as full-reference and reduced-reference metrics. We propose to use such approaches to build a model based on learning machines such as SVM, RVM, etc., able to predict which images exhibit perceptual noise. We also investigate the use of soft computing approaches based on fuzzy sets as a no-reference metric. Learning is performed on an example database built from experiments on noise perception with human users. These models can then be used in any progressive stochastic global illumination method in order to find the visual convergence threshold of different parts of any image.
This tutorial is structured as a half-day presentation (3 hours). The goal of this course is to make students familiar with the underlying techniques that make this possible (machine learning, soft computing).
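As a rough illustration of the learning-based metric described above, the sketch below trains an SVM to flag image blocks with perceptible noise. Everything here is a hypothetical stand-in: the per-block features, the synthetic "renderings" and the labels replace the perceptual-experiment database the tutorial actually relies on.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def block_features(img, block=32):
    """Toy per-block noise statistics (std and mean absolute gradients).
    A real system would use richer perceptual features."""
    h, w = img.shape
    feats = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            b = img[y:y + block, x:x + block]
            feats.append([b.std(),
                          np.abs(np.diff(b, axis=0)).mean(),
                          np.abs(np.diff(b, axis=1)).mean()])
    return np.asarray(feats)

# Stand-ins for a noisy early render and a visually converged one; in practice
# the labels (1 = observers perceive noise, 0 = converged) come from user studies.
rng = np.random.default_rng(0)
noisy = rng.normal(0.5, 0.15, (128, 128))
clean = np.full((128, 128), 0.5) + rng.normal(0.0, 0.01, (128, 128))
X = np.vstack([block_features(noisy), block_features(clean)])
y = np.array([1] * (len(X) // 2) + [0] * (len(X) // 2))

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X, y)
print(model.predict(block_features(clean))[:8])  # mostly 0: no perceptible noise
```

In a progressive renderer, such a predictor would be queried per region after each sampling pass, and sampling would stop where the prediction flips to "no perceptible noise".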
Keywords
Computer-generated Images, Quality Metrics, Machine Learning, Soft Computing
Natural Human-Computer-Interaction in Virtual and Augmented Reality
Instructor
Manuela Chessa
University of Genoa
Italy
Brief Bio
Manuela Chessa is an Assistant Professor at the Dept. of Informatics, Bioengineering, Robotics, and Systems Engineering of the University of Genova, Italy. She received her MSc in 2005 and her Ph.D. in 2009, under the supervision of Prof. S. P. Sabatini. Her research interests are focused on the study of biological and artificial vision systems, on the development of bioinspired models, and on natural human-machine interfaces based on virtual, augmented and mixed reality. She studies the use of novel sensing technologies (e.g. Microsoft Kinect, Leap Motion, Intel RealSense) and of visualization devices (e.g. 3D monitors, head-mounted displays, tablets) to develop natural interaction systems, always keeping human perception in mind. She has been involved in several national and international research projects. She is author or co-author of 55 peer-reviewed scientific papers, both in ISI journals and at international conferences, of 5 book chapters, and of 2 edited books.
Abstract
One of the goals of Human-Computer Interaction (HCI) is to obtain systems where people can act in a natural and intuitive way. In particular, the aim of Natural Human-Computer Interaction (NHCI) is to create new interactive frameworks that mimic real-life experience as much as possible. Nevertheless, the gap among computer vision, computer graphics, cognitive science, and behavioral and psychophysics studies still prevents the achievement of true NHCI. In this tutorial, I will review the past and recent literature on human-computer interaction systems, focusing on recent developments in the field. In particular, I will address the topics of misperception, visual fatigue and cybersickness in virtual and augmented reality scenarios, and I will discuss the open issues and possible ways to improve such systems.
Perception for Visualization: From Design to Evaluation
Instructor
Haim Levkowitz
University of Massachusetts, Lowell
United States
Brief Bio
Haim Levkowitz is the Chair of the Computer Science Department at the University of Massachusetts Lowell, in Lowell, MA, USA, where he has been a Faculty member since 1989. He is a two-time recipient of a US Fulbright Scholar Award to Brazil (August – December 2012 and August 2004 – January 2005). He was a Visiting Professor at ICMC — Instituto de Ciências Matemáticas e de Computação (The Institute of Mathematics and Computer Sciences) — at the University of São Paulo, São Carlos, SP, Brazil (August 2004 – August 2005; August 2012 – August 2013). He co-founded and was Co-Director of the Institute for Visualization and Perception Research (through 2012), and is now Director of the Human-Information Interaction Research Group. He is a world-renowned authority on visualization, perception, color, and their application in data mining and information retrieval. He is the author of “Color Theory and Modeling for Computer Graphics, Visualization, and Multimedia Applications” (Springer 1997) and co-editor of “Perceptual Issues in Visualization” (Springer 1995), as well as many papers on these subjects. He is also co-author/co-editor of "Writing Scientific Papers in English Successfully: Your Complete Roadmap" (E. Schuster, H. Levkowitz, and O.N. Oliveira Jr., eds.; paperback ISBN 978-8588533974, Kindle ISBN 8588533979; available on Amazon.com: http://www.amazon.com/Writing-Scientific-Papers-English-Successfully/dp/8588533979). He has more than 44 years of experience in teaching and lecturing, and has taught many tutorials and short courses in addition to regular academic courses. In addition to his academic career, Professor Levkowitz has had an active entrepreneurial career as Founder or Co-Founder, Chief Technology Officer, Scientific and Strategic Advisor, Director, and venture investor at a number of high-tech startups.
Abstract
What is the smallest sample I can show that will be perceived? What is the smallest sample I can show that will be perceived in color? Can I afford to use image compression? If yes, how much and what kind? Should I use a grayscale or another color scale to present data? How many gray levels do I absolutely need? What color scale should I use? How many bits for color do I need? Should I use 3D, stereo, texture, motion? If so, what kinds? And has my visualization been successful in meeting its goals and needs?
If you have ever designed a visualization, you probably have asked yourself (perhaps others) some of these questions; at least you should have.
Since visualization “consumers” are humans, the answers to these questions can only come from a thorough analysis and understanding of human perceptual capabilities and limitations, combined with the visualization's goals and needs.
This tutorial will teach you the basics of human perception and how to utilize them in the complete process of visualization: from design to evaluation.
Depth Video Enhancement
Instructor
Djamila Aouada
University of Luxembourg
Luxembourg
Brief Bio
Djamila Aouada received the State Engineering degree in electronics in 2005 from the École Nationale Polytechnique (ENP), Algiers, Algeria, and the Ph.D. degree in electrical engineering in 2009 from North Carolina State University (NCSU), Raleigh, NC. She is a Research Scientist at the Interdisciplinary Centre for Security, Reliability, and Trust (SnT) at the University of Luxembourg. Dr. Aouada has been leading the computer vision activities at the SnT since 2009. She has worked as a consultant for multiple renowned laboratories (Los Alamos National Laboratory, Alcatel-Lucent Bell Labs, and Mitsubishi Electric Research Labs). Her research interests span the areas of signal and image processing, computer vision, pattern recognition and data modelling. She is a co-recipient of two IEEE Best Paper Awards, and a member of IEEE, IEEE SPS, and IEEE WIE.
Abstract
3D sensing technologies have witnessed a revolution in recent years, making depth sensors cost-effective and part of accessible consumer electronics. Their ability to directly capture depth videos in real time has opened tremendous possibilities for multiple applications in computer vision. These sensors, however, suffer from high noise contamination, including missing and jagged measurements, and from low spatial resolution. In order to extract detailed 3D features from this type of data, dedicated data enhancement is required. This tutorial reviews the different approaches proposed in the literature, and focuses especially on strategies targeting dynamic depth scenes with non-rigid deformations.
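To give a concrete flavor of the multi-modal fusion family of approaches covered in the outline below, here is a minimal (and deliberately slow) sketch of joint bilateral upsampling, a classic technique that refines a low-resolution depth map using a registered high-resolution intensity image so that depth edges align with image edges. Function name, parameter values and the toy data are illustrative assumptions, not the tutorial's method.

```python
import numpy as np

def joint_bilateral_upsample(depth_lo, guide, scale, sigma_s=2.0, sigma_r=0.1):
    """Upsample depth_lo to the resolution of the intensity guide. Each output
    pixel averages nearby low-resolution depth samples, weighted by a spatial
    Gaussian and a range kernel on guide intensities. Direct, unoptimized."""
    H, W = guide.shape
    out = np.zeros((H, W))
    rad = int(2 * sigma_s)
    for y in range(H):
        for x in range(W):
            num = den = 0.0
            for dy in range(-rad, rad + 1):
                for dx in range(-rad, rad + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W:
                        d = depth_lo[yy // scale, xx // scale]  # nearest LR sample
                        w = (np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
                             * np.exp(-(guide[y, x] - guide[yy, xx]) ** 2
                                      / (2 * sigma_r ** 2)))
                        num += w * d
                        den += w
            out[y, x] = num / den
    return out

# Toy usage: 2x upsampling of a 16x16 depth map guided by a 32x32 image.
rng = np.random.default_rng(1)
guide = np.clip(rng.normal(0.5, 0.2, (32, 32)), 0.0, 1.0)
depth_lo = rng.uniform(0.5, 2.0, (16, 16))
print(joint_bilateral_upsample(depth_lo, guide, scale=2).shape)  # (32, 32)
```

The range kernel on the guide is what distinguishes this from plain interpolation: where the intensity image shows an edge, depth samples from the other side of the edge receive near-zero weight.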
Keywords
Depth enhancement, super-resolution, dynamic scene, fusion
Aims and Learning Objectives
The goal is to review existing approaches and show the potential of cost-effective 3D sensors.
Target Audience
Students, researchers and engineers interested in 3D sensing
Prerequisite Knowledge of Audience
No prerequisites are required. The tutorial should be accessible to all.
Detailed Outline
1. Motivation: cost-effective depth sensing
a. Working Principles
b. Data Properties
c. Limitations
2. Multi-modal fusion approaches
3. Learning-based approaches
4. Depth multi-frame super-resolution approaches
a. Background
b. Enhancement of static scenes
c. Enhancement of dynamic scenes
5. Conclusion:
a. Applications
b. Discussions
c. Questions