Real Time Hand Pose Estimation for HCI.

It is with pleasure that I invite everyone to another seminar of the eScience group. Our guest will be Dr. Philip Krejov, who recently completed his doctorate at the University of Surrey, England, and who will be visiting the department that day.

Date: 24/5/2016

Time: 14:00

Location: room 136, block A

Title: Real Time Hand Pose Estimation for HCI.

The aim of this presentation is to address the challenge of real-time pose estimation of the hand, specifically to determine the joint positions of a non-augmented hand. The methods discussed focus on the use of depth, localising the parts of the hand through efficient fitting of a kinematic model, and comprise four main contributions.

The first part presents an approach to Multi-touch(less) tracking, where the objective is to track the fingertips with a high degree of accuracy without sensor contact. Using a graph-based approach, the surface of the hand is modelled and extrema of the hand are located. The approach is efficient enough to allow collaborative interaction, resolving 4 hands simultaneously in real time.

The second contribution applies a Randomised Decision Forest (RDF) to the problem of pose estimation and presents a technique to identify regions of the hand, using features that sample depth. The RDF is an ensemble-based classifier that generalises to unseen data and can model expansive datasets, learning from over 70,000 pose examples. The approach is also demonstrated in the challenging application of American Sign Language (ASL) finger-spelling recognition.

The third contribution combines a machine learning approach with a model-based method to overcome the limitations of either technique in isolation. An RDF provides an initial segmentation, allowing surface constraints to be derived for a 3D model, which is subsequently fitted to the segmentation. This stage of global optimisation incorporates temporal information and enforces kinematic constraints. Using Rigid Body Dynamics for optimisation, invalid poses due to self-intersection and segmentation noise are resolved.

Accuracy of the approach is limited by the natural variance between users and the use of a generic hand model. The final contribution therefore proposes an approach to refine pose via cascaded linear regression, which samples the residual error between the depth and the model. This combination of techniques is demonstrated to provide state-of-the-art accuracy in real time, without the use of a GPU and without the requirement for model initialisation.

Short BIO:
Philip Krejov received a BEng (Hons) degree in Electronic Engineering from the University of Surrey, United Kingdom, in 2011, including an industrial placement at National Instruments. On graduation he was awarded the prize for best final year dissertation. He is currently a Research Fellow in the Centre for Vision, Speech and Signal Processing at the University of Surrey, having completed his PhD in January 2016. The focus of his work is Human Computer Interaction (HCI), with a specialisation in hand pose estimation. He has presented on many occasions, including a press conference held at the Royal Society, London. Philip has also published several international papers on hand pose estimation and novel methods for human computer interaction. His research has led to the development of two different approaches for estimating hand pose and has been demonstrated as a real-time user interaction system.

Video Object Tracking with Deep Siamese Networks

It is with pleasure that we invite everyone to another eScience
seminar. The talk will be given by Dr. Henrique Morimitsu, a former
student of our program who has just returned from a postdoc at INRIA
with Dr. Cordelia Schmid.

Title: Video Object Tracking with Deep Siamese Networks

Speaker: Henrique Morimitsu

Date: 25/05/2018

Time: 14:00

Location: Auditório Jacy Monteiro

Abstract: Video object tracking consists of following an object of
interest in a video sequence. Visual tracking is a challenging task
because the only available information is contained in a single
annotated frame. Recently, deep siamese network architectures have
been proposed to tackle the tracking problem, and they have achieved
high accuracy and speed in standard benchmarks. This talk will provide
an overview of the concept and motivations behind siamese networks and
how recent methods have applied them to video object tracking.

The presentation will be in Portuguese.

Towards More Robust Outdoor Computer Vision: A Case Study on Haze

This Friday, 29/6, we will have a talk by Dr. Zhangyang (Atlas)
Wang, who works on computer vision, multimedia, and machine
learning.

Title: Towards More Robust Outdoor Computer Vision: A Case Study on Haze

Abstract: While many sophisticated models are developed for visual
information processing, very few pay attention to their usability in
the presence of data quality degradations. Most successful models are
trained and evaluated on high-quality visual datasets. On the other
hand, the robustness of those computer vision models is often not
assured in degraded visual environments. Especially in outdoor
scenarios, low target resolution, occlusion, motion blur, missing
data, poor light, and bad weather conditions are all ubiquitous for
visual recognition in the wild. In this talk, I will use haze as a
case example to introduce our progress on handling non-standard
outdoor visual degradations using deep learning methods. I will first
go through single image dehazing, followed by video dehazing, and then
introduce our recent benchmarking and evaluation efforts.

Time: 14:00

Location: Room 144, Block B - IME - USP

Short Bio:
Dr. Zhangyang (Atlas) Wang is an Assistant Professor of Computer
Science and Engineering (CSE), at the Texas A&M University (TAMU).
During 2012-2016, he was a Ph.D. student in the Electrical and
Computer Engineering (ECE) Department, at the University of Illinois
at Urbana-Champaign (UIUC), working with Professor Thomas S. Huang.
Prior to that, he obtained the B.E. degree at the University of
Science and Technology of China (USTC), in 2012. He was a
research intern with Microsoft Research (summer 2015), Adobe Research
(summer 2014), and US Army Research Lab (summer 2013). Dr. Wang’s
research has been addressing machine learning, computer vision and
multimedia signal processing problems, as well as their
interdisciplinary applications, using advanced feature learning and
optimization techniques. He has co-authored over 40 papers, and
published several books and chapters. He has been granted 3 patents,
and has received around 20 research awards and scholarships. He served
as a guest editor for IEEE TNNLS and EURASIP JASP/JBSB; an Area Chair
for WACV 2019 and ICIP 2017; a TPC co-chair for ICCV AMFG 2017; a
special session co-chair for VCIP 2017; a tutorial organizer/speaker
in SIAM IS 2018, CVPR 2017 and ECCV 2016; a workshop organizer in IEEE
FG, IJCAI and SDM; and a regular reviewer or TPC member for over 40
top journals and conferences. His research has been covered by
worldwide media, such as BBC, Fortune, International Business Times,
TAMU news, and UIUC news & alumni magazine. More could be found at:
