Behavior and Gesture Analysis

Project-ASL Recognition

The first video demonstrates detection of a wh-question non-manual marker. The tracked face and head are shown on the left, while the right image shows the extracted spatial pyramid features. Red bars indicate that the system detects the wh non-manual marker, while blue bars indicate that it does not. In the second video we demonstrate that we are able to track eyebrow height (top right) and head pitch angle (bottom right) in an isolated utterance of a wh-question. The red graph line...


In this project we assess the natural movements of children with Autism Spectrum Disorders in classroom settings. We use touch screens, a motion capture system (Polhemus Liberty, sampling movement at 240 Hz), multiple cameras and all sorts of fun toys to engage the children. We have developed visual stimuli that encourage the children to perform designed tasks, and we are currently analyzing the data captured by the Polhemus system.

Project-Expression Recognition

Facial Expression Recognition and Intensity Estimation. Age Estimation and Human Action Recognition.


We address the problem of correcting an undesirable expression on a face photo by transferring local facial components, such as a smiling mouth, from another face photo of the same person which has the desired expression. Direct copying and blending using existing compositing tools results in semantically unnatural composites, since expression is a global effect and the local component in one expression is often incompatible with the shape and other components of the face in another...


Fatigue from chronic partial sleep deprivation, circadian misalignment (e.g., slam-shifts), and work overload (e.g., EVAs) is a risk factor for people driving a car or performing critical tasks. There is a need for techniques that objectively and unobtrusively identify the presence of fatigue on-line. Tracking slow eyelid closures is one of the most reliable ways to detect fatigue. We develop a system that tracks slow eyelid closures using a single camera. The eyes are modeled by a singe...
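
As a rough illustration of what detecting slow eyelid closures from a tracked signal could look like, the sketch below scans a per-frame eyelid-openness value and reports closures that last longer than an ordinary blink. It is not the system described above: the openness signal, the function name `find_slow_closures`, and both thresholds are hypothetical choices made for this example.

```python
def find_slow_closures(openness, fps, closed_below=0.2, min_seconds=0.5):
    """Return (start_frame, end_frame) pairs for long eyelid closures.

    openness     -- per-frame eyelid openness, assumed normalized to [0, 1]
    fps          -- camera frame rate in frames per second
    closed_below -- hypothetical threshold under which the eye counts as closed
    min_seconds  -- hypothetical minimum duration that separates a slow
                    closure from a normal blink
    end_frame is exclusive (the first frame where the eye is open again).
    """
    events = []
    start = None
    for i, value in enumerate(openness):
        if value < closed_below:
            if start is None:
                start = i  # the eye has just closed
        elif start is not None:
            # the eye has just reopened; keep the closure if it was long enough
            if (i - start) / fps >= min_seconds:
                events.append((start, i))
            start = None
    # handle a closure that is still in progress when the signal ends
    if start is not None and (len(openness) - start) / fps >= min_seconds:
        events.append((start, len(openness)))
    return events


# Synthetic signal: a 3-frame blink followed by a 20-frame closure at 30 fps.
signal = [1.0] * 10 + [0.1] * 3 + [1.0] * 10 + [0.05] * 20 + [1.0] * 5
closures = find_slow_closures(signal, fps=30)
print(closures)
```

With these assumed thresholds the short blink is ignored and only the longer closure is reported. In practice both thresholds would have to be tuned against labeled data.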


We demonstrate accurate face tracking and 3D head pose prediction (shown at top left as a 3D vector of pitch, yaw and tilt) while the face makes various facial expressions and out-of-plane rotations. The 79 tracked landmarks corresponding to the eyes, eyebrows, nose, mouth and face contour are shown as red dots.


This approach effectively models group activities based on social behavior analysis. Different from previous work that uses independent local features, this project explores the relationships between the current behavior state of a subject and its actions. Our method does not depend on human detection or segmentation, so it is robust to detection errors. Instead, tracked spatio-temporal interest points are able to provide a good estimation of modeling group interaction. SVM is used to find abnormal...


We investigate how the degree of interactional synchrony can signal whether trust is present, absent, increasing or declining. We propose an automated, data-driven and unobtrusive framework for deception detection and analysis in interrogation interviews from visual cues only. Our framework consists of face tracking, gesture detection, expression recognition, and synchrony estimation. This framework is able to automatically track gestures and expressions of both the subject and the...

Computer Graphics


We extend As-Rigid-As-Possible Surface Modeling by incorporating skeleton information. The extended method roughly preserves volume during deformation.


We consider the minimization of a smooth convex function regularized by the mixture of prior models. In this paper, we present two algorithms to effectively solve it, and apply the proposed algorithms to compressed MR image reconstruction and low-rank tensor completion.
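
For orientation, the problem described above can be written in a generic composite form. The notation below ($f$, $g_i$, $B_i$, $m$) is assumed for illustration and is not taken from the text:

```latex
\min_{x \in \mathbb{R}^n} \; F(x) \;=\; f(x) + \sum_{i=1}^{m} g_i(B_i x)
```

Here $f$ is the smooth convex term, each $g_i$ is a convex (possibly non-smooth) regularizer encoding one prior, and $B_i$ is a linear operator. In compressed MR image reconstruction, a typical choice would be a total-variation term together with an $\ell_1$ penalty on wavelet coefficients. Whether the two algorithms mentioned above use exactly this pair is an assumption here.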

Generative Adversarial Networks (GAN)


Deep models are usually trained in a two-phase paradigm, where data collection and network training are separated. This may not be efficient since data collection is blind to network training.


We present Optimal Transport GAN (OT-GAN), a variant of generative adversarial nets that minimizes a new metric measuring the distance between the generator distribution and the data distribution.

Medical Image Analysis


4D cardiac reconstruction using high resolution CT images captures complex anatomical features, such as the papillary muscles and the ventricular trabeculae.


4D cardiac reconstructions created from high-resolution CT images capture complex anatomical features. The blood flow is then simulated, and visualizations are generated to find problems.


The deformable contours and moving trajectories of cancer cells are tracked in computerized video time-lapse microscopy.


We propose a Sparse Shape Composition (SSC) model that handles three challenges of shape prior modeling in a unified framework.


The in-wall motion and the strain field of the left ventricle myocardium are reconstructed from tagged MRI.


We propose composite splitting algorithms and apply them to compressed MR image reconstruction.


Organ shape plays an important role in various clinical practices such as segmentation. Effective modeling of shape priors is challenging because: (1) shape variation is complex and cannot always be modeled by a parametric probability distribution; (2) a shape instance derived from image appearance cues (input shape) may have gross errors; and (3) local details of the input shape are difficult to preserve if they are not statistically significant in the training data. We propose Sparse Shape...


We introduce a tag separation method for better cardiac boundary segmentation and tag tracking.



This project addresses the problem of registering a sequence of images in a moving dynamic texture video.

Project-Image Retrieval

Recent image retrieval algorithms based on local features indexed by a vocabulary tree and holistic features indexed by compact hashing codes both demonstrate excellent scalability. However, their retrieval precision may vary dramatically among queries. This motivates us to investigate how to fuse the ordered retrieval sets given by multiple retrieval methods, to further enhance the retrieval precision.
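
To make the idea of fusing ordered retrieval sets concrete, here is a minimal sketch using reciprocal rank fusion, a generic rank-aggregation heuristic. It is swapped in purely for illustration and is not the fusion method of this project. The image ids, the two input rankings, and the constant `k` are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of image ids into a single ranking.

    Each list contributes 1 / (k + rank) to every id it contains, so images
    ranked highly by several methods accumulate the largest scores.
    k is a smoothing constant (an assumed default, not taken from the text).
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, image_id in enumerate(ranking, start=1):
            scores[image_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the output is deterministic.
    return sorted(scores, key=lambda image_id: (-scores[image_id], image_id))


# Hypothetical top-3 results from two retrieval methods for the same query.
vocab_tree_results = ["img3", "img1", "img7"]
hashing_results = ["img1", "img9", "img3"]

fused = reciprocal_rank_fusion([vocab_tree_results, hashing_results])
print(fused)
```

Images returned by both methods (`img1`, `img3`) rise above those returned by only one. A heuristic like this ignores how reliable each method is for a particular query, which is the per-query variation in precision that the paragraph above points to.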


Automatically assigning relevant text keywords to images is an important problem. We introduce a regularization-based feature selection algorithm that leverages both the sparsity and clustering properties of features, and incorporate it into the image annotation task.


This project addresses the issue of motion saliency detection in video sequences.


We propose a framework for obtaining transformation-invariant image sparse representation. We can simultaneously recover the sparse representation of a target image and the image plane transformation between the target and the model images.