Sunday, April 7, 2013

Expectation-Maximization Algorithm


The expectation-maximization (EM) algorithm is a broadly applicable approach to the iterative computation of maximum likelihood (ML) estimates, useful in a variety of incomplete-data problems. In particular, the EM algorithm considerably simplifies the problem of fitting finite mixture models by ML, where mixture models are used to capture heterogeneity in cluster analysis and pattern recognition contexts.
The EM algorithm is an iterative method for finding maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved latent variables.
Maximum likelihood estimation is a general-purpose method with attractive properties, and likelihood-based inference is of central importance in statistical theory and data analysis. The EM algorithm is used to find maximum likelihood estimates of a model's parameters in cases where the likelihood equations cannot be solved directly. It has a number of appealing properties, including numerical stability, simplicity of implementation, and reliable convergence: each iteration is guaranteed not to decrease the likelihood, although the algorithm may stop at a local rather than a global maximum. There are also extensions of the EM algorithm for tackling complex problems in various data mining applications, and it is highly desirable that such extensions preserve its simplicity and stability. Finite mixture distributions, in particular, provide a flexible, mathematically based approach to modeling and clustering data observed on random phenomena.
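To make the "cannot be solved directly" point concrete, consider the incomplete-data log-likelihood of a K-component finite mixture with mixing proportions \pi_k and component densities f(x; \theta_k) (this notation is supplied here for illustration; it is not from the post):

\log L(\theta) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k \, f(x_i \mid \theta_k)

Because the sum over components sits inside the logarithm, setting the derivatives with respect to the parameters to zero gives no closed-form solution. EM sidesteps this by working with the complete-data log-likelihood, in which each observation's component label is treated as an unobserved latent variable.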
Each EM iteration alternates between two steps: an expectation (E) step, which forms the expectation of the complete-data log-likelihood evaluated at the current parameter estimates, and a maximization (M) step, which computes new parameter estimates by maximizing the expected log-likelihood found in the E step.
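In symbols, with Z denoting the latent variables and L_c the complete-data likelihood, the two steps at iteration t are (standard notation, stated here for illustration):

E-step: \; Q(\theta \mid \theta^{(t)}) = \mathbb{E}_{Z \mid x,\, \theta^{(t)}} \left[ \log L_c(\theta; x, Z) \right]

M-step: \; \theta^{(t+1)} = \arg\max_{\theta} Q(\theta \mid \theta^{(t)})

As a concrete sketch, the Python code below runs EM on the simplest finite mixture mentioned above, a two-component univariate Gaussian mixture. It is a minimal illustration under those assumptions, not an implementation from the reference; names such as em_gaussian_mixture are invented for this example.

import numpy as np

def norm_pdf(x, mu, var):
    # Univariate normal density.
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def em_gaussian_mixture(x, n_iter=200, tol=1e-8, seed=0):
    # EM for a two-component univariate Gaussian mixture (illustrative sketch).
    rng = np.random.default_rng(seed)
    n = x.size
    pi = 0.5                                   # mixing proportion of component 0
    mu = rng.choice(x, size=2, replace=False)  # crude initialization from the data
    var = np.full(2, x.var())
    ll_old = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities, i.e. posterior probabilities that each
        # observation belongs to component 0, under the current parameters.
        p0 = pi * norm_pdf(x, mu[0], var[0])
        p1 = (1.0 - pi) * norm_pdf(x, mu[1], var[1])
        ll = np.log(p0 + p1).sum()             # observed-data log-likelihood
        r = p0 / (p0 + p1)
        # M-step: weighted ML updates that maximize the expected
        # complete-data log-likelihood formed in the E step.
        n0, n1 = r.sum(), (1.0 - r).sum()
        pi = n0 / n
        mu = np.array([(r * x).sum() / n0, ((1.0 - r) * x).sum() / n1])
        var = np.array([(r * (x - mu[0]) ** 2).sum() / n0,
                        ((1.0 - r) * (x - mu[1]) ** 2).sum() / n1])
        if ll - ll_old < tol:                  # EM never decreases the likelihood
            break
        ll_old = ll
    return pi, mu, var

# Example: recover the components of a simulated two-Gaussian mixture.
rng = np.random.default_rng(42)
data = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 300)])
weights, means, variances = em_gaussian_mixture(data)

Because the Gaussian M-step has these closed-form weighted-average updates, each iteration is cheap, and the monotone behavior of the log-likelihood is what makes the simple convergence check above meaningful.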

Reference:
Wu, X., & Kumar, V. (Eds.) (2009). The Top Ten Algorithms in Data Mining. Chapman & Hall/CRC.
