数据降维和聚类中的若干问题研究 (Studies on Several Problems in Data Dimension Reduction and Clustering), English edition
Publication date: 2011 edition
Summary
A central research area in data mining and machine learning is probabilistic modeling, because it has a number of advantages over non-probabilistic methods. Given a probabilistic model, one can fit it using the maximum likelihood (ML) method or the variational Bayesian (VB) method. In the ML method, (1) many algorithms may converge very slowly, so computationally efficient algorithms are often desirable; and (2) choosing a suitable model is difficult even though many model selection criteria exist, so criteria with higher accuracy are desired. In the VB method, different priors may yield different performance, so studying how to choose a suitable prior is important. This book studies three sub-topics: modeling, estimation, and model selection for dimension reduction and clustering.
Table of contents
1 Introduction
1.1 PCA and Latent Variable Models
1.1.1 PCA
1.1.2 Latent Variable Models
1.1.3 FA and PPCA
1.2 Motivations and Contributions
1.3 Organization of the Book
2 ML Estimation for Factor Analysis: EM or non-EM
2.1 Introduction
2.2 FA Model and Three Estimation Algorithms
2.2.1 FA model
2.2.2 Lawley (1940)'s simple iteration algorithm
2.2.3 EM type algorithms
2.3 The ECME2 algorithm
2.3.1 The maximization in the first CM-step
2.3.2 The maximization in the second CM-step
2.3.3 Practical consideration
2.3.4 ECME2 vs. simple iteration algorithm
2.4 The CM algorithm
2.4.1 The maximization in the second CM-step
2.4.2 When will condition 1 be satisfied
2.4.3 Recursive computation of the matrix Bz
2.4.4 On the nature of stationary points
2.5 Simulations
2.5.1 Simulation Data
2.5.2 Performance Analysis
2.5.3 On different starting values
2.6 Conclusion and Future Work
2.7 Appendix
2.7.1 Proofs
2.7.2 Some Notes
3 Fast ML estimation for the Mixture of Factor Analyzers via an ECM Algorithm
3.1 Introduction
3.2 MFA model and an ECM algorithm
……
4 Mixture Model Selection: BIC or Hierarchical BIC
5 A Note on Variational Bayesian Factor Analysis
6 Bilinear Probabilistic Principal Component Analysis
7 Conclusions and discussions
References