Recovering human motion by visual analysis is a challenging computer vision research area with many potential applications. Model-based tracking approaches, and in particular particle filters, formulate the problem as a Bayesian inference task whose aim is to sequentially estimate the distribution of the parameters of a human body model over time. These approaches rely strongly on good dynamical and observation models to predict and update configurations of the human body according to measurements from the image data. However, it is very difficult to design observation models which robustly extract useful and reliable information from image sequences. This is especially challenging in monocular tracking, given that only one viewpoint of the scene is available. Therefore, to overcome these limitations, strong motion priors are needed to guide the exploration of the state space.
The work presented in this thesis aims to retrieve the 3D motion parameters of a human body model from incomplete and noisy measurements of a monocular image sequence. These measurements consist of the 2D positions of a reduced set of joints in the image plane. To this end, we present a novel action-specific model of human motion which is trained from several databases of real motion-captured performances of an action, and is used as a priori knowledge within a particle filtering scheme. Body postures are represented by means of a simple and compact stick figure model which uses direction cosines to represent the direction of body limbs in 3D Cartesian space. Then, for a given action, Principal Component Analysis is applied to the training data to perform dimensionality reduction over the highly correlated input data. Before the learning stage of the action model, the input motion performances are synchronized by means of a novel dense matching algorithm based on Dynamic Programming. The algorithm synchronizes all the motion sequences of the same action class, finding an optimal solution in real time.
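As an illustration of the posture representation and the dimensionality reduction step, the following is a minimal sketch assuming NumPy. It computes the direction cosines of a limb from its two end joints and applies a standard SVD-based Principal Component Analysis to a matrix of postures. The function names (`direction_cosines`, `pca_fit`, `pca_project`, `pca_reconstruct`) and the array layout are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np


def direction_cosines(joint_a, joint_b):
    """Unit vector pointing from joint_a to joint_b.

    Its three components are the cosines of the angles between the limb
    and the x, y and z axes, so their squares sum to one.
    """
    limb = np.asarray(joint_b, dtype=float) - np.asarray(joint_a, dtype=float)
    norm = np.linalg.norm(limb)
    if norm == 0.0:
        raise ValueError("joints coincide; limb direction is undefined")
    return limb / norm


def pca_fit(postures, n_components):
    """Fit PCA to postures of shape (n_frames, n_dims).

    Returns the mean posture, the principal directions as rows of a
    (n_components, n_dims) matrix, and the variance along each direction.
    """
    X = np.asarray(postures, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are the eigenvectors of the sample covariance matrix.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    variances = s ** 2 / (X.shape[0] - 1)
    return mean, vt[:n_components], variances[:n_components]


def pca_project(postures, mean, basis):
    """Map postures to their low-dimensional coefficients."""
    return (np.asarray(postures, dtype=float) - mean) @ basis.T


def pca_reconstruct(coeffs, mean, basis):
    """Map low-dimensional coefficients back to posture space."""
    return np.asarray(coeffs, dtype=float) @ basis + mean
```

In this sketch a posture would be the concatenation of the direction cosines of all limbs, so each training frame contributes one row to the matrix passed to `pca_fit`.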
Then, a probabilistic action model is learnt from the synchronized motion examples. This model captures the variability and temporal evolution of full-body motion within a particular action. Specifically, for each action, the parameters learnt are: a representative manifold for the action consisting of its mean performance, the standard deviation from the mean performance, the mean observed direction vectors from each motion subsequence of a given length, and the expected error at a given time instant.
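The first parameters listed above can be sketched as follows, assuming NumPy and a set of performances that have already been synchronized to a common number of frames. The mean performance and the standard deviation follow directly from the description. The direction vectors are computed here as normalized displacements of the mean performance over a window of frames, which is only one possible reading of the description and is an assumption of this example, as are the function names. The expected error term is not covered.

```python
import numpy as np


def learn_action_statistics(performances):
    """Per-frame mean and standard deviation of synchronized performances.

    performances has shape (n_performances, n_frames, n_dims).
    """
    P = np.asarray(performances, dtype=float)
    if P.ndim != 3:
        raise ValueError("expected shape (n_performances, n_frames, n_dims)")
    if P.shape[0] < 2:
        raise ValueError("at least two performances are needed")
    mean_performance = P.mean(axis=0)
    # Sample standard deviation across performances at each frame.
    std_performance = P.std(axis=0, ddof=1)
    return mean_performance, std_performance


def subsequence_directions(mean_performance, length):
    """Unit direction of motion over every window of `length` frames.

    Assumed interpretation: the direction is the normalized displacement
    between the first frame of a window and the frame `length` steps later.
    Windows with no displacement get a zero vector.
    """
    M = np.asarray(mean_performance, dtype=float)
    if not 1 <= length < M.shape[0]:
        raise ValueError("length must be in [1, n_frames - 1]")
    disp = M[length:] - M[:-length]
    norms = np.linalg.norm(disp, axis=1, keepdims=True)
    return np.divide(disp, norms, out=np.zeros_like(disp), where=norms > 0)
```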
Subsequently, the action-specific model is used as a priori knowledge about human motion, which improves the efficiency and robustness of the overall particle filtering tracking framework. First, the dynamic model guides the particles according to similar situations previously learnt. Then, the state space is constrained so that only feasible human postures are accepted as valid solutions at each time step. As a result, the state space is explored more efficiently, as the particle set covers the most probable body postures.
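The predict, constrain, update and resample cycle described above can be sketched with a generic bootstrap particle filter, assuming NumPy. This is not the tracker of the thesis: the additive dynamics, the Gaussian likelihood, the clipping of particles to a band around a learnt mean posture, and all names and default values (`particle_filter_step`, `observe`, `drift`, `max_dev`, the noise levels) are assumptions made for this example.

```python
import numpy as np


def particle_filter_step(particles, observation, observe, mean_pose, std_pose,
                         rng, drift=None, process_noise=0.05, obs_noise=0.1,
                         max_dev=3.0):
    """One step of a bootstrap particle filter with a constrained state space.

    particles: (n_particles, n_dims) current particle set.
    observation: measurement vector for the current frame.
    observe: function mapping one state vector to a predicted measurement.
    mean_pose, std_pose: (n_dims,) prior statistics used as constraints.
    Returns the resampled particles and the weighted mean estimate.
    """
    particles = np.asarray(particles, dtype=float)
    n, d = particles.shape
    step = np.zeros(d) if drift is None else np.asarray(drift, dtype=float)

    # Predict: move each particle along the prior drift and add noise.
    predicted = particles + step + rng.normal(0.0, process_noise, size=(n, d))

    # Constrain: keep particles within max_dev standard deviations of the
    # mean posture (a simple stand-in for a feasibility test).
    predicted = np.clip(predicted,
                        mean_pose - max_dev * std_pose,
                        mean_pose + max_dev * std_pose)

    # Update: Gaussian likelihood of the observation given each particle.
    residual = np.array([observe(p) for p in predicted]) - observation
    log_w = -0.5 * np.sum(residual ** 2, axis=1) / obs_noise ** 2
    weights = np.exp(log_w - log_w.max())
    weights /= weights.sum()
    estimate = weights @ predicted

    # Resample: systematic resampling proportional to the weights.
    positions = (rng.random() + np.arange(n)) / n
    idx = np.minimum(np.searchsorted(np.cumsum(weights), positions), n - 1)
    return predicted[idx], estimate
```

In a monocular setting, `observe` would project the 3D posture encoded by a particle onto the image plane and return the 2D positions of the observed joints, so that only those joints contribute to the weights.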
Finally, experiments are carried out using test sequences from three well-known databases. Results show that our tracking scheme is able to estimate the rough 3D configuration of a full-body model given only the 2D positions of a reduced set of joints. Separate tests on the sequence synchronization method and the subsequence probabilistic matching technique are also provided.