Contents  7
Foreword  13
Preface  15

1  INTRODUCTION  18
   1. Research Issues on Learning in Computer Vision  19
   2. Overview of the Book  23
   3. Contributions  29

2  THEORY: PROBABILISTIC CLASSIFIERS  32
   1. Introduction  32
   2. Preliminaries and Notations  35
      2.1 Maximum Likelihood Classification  35
      2.2 Information Theory  36
      2.3 Inequalities  37
   3. Bayes Optimal Error and Entropy  37
   4. Analysis of Classification Error of Estimated (Mismatched) Distribution  44
      4.1 Hypothesis Testing Framework  45
      4.2 Classification Framework  47
   5. Density of Distributions  48
      5.1 Distributional Density  50
      5.2 Relating to Classification Error  54
   6. Complex Probabilistic Models and Small Sample Effects  57
   7. Summary  58

3  THEORY: GENERALIZATION BOUNDS  62
   1. Introduction  62
   2. Preliminaries  64
   3. A Margin Distribution Based Bound  66
      3.1 Proving the Margin Distribution Bound  66
   4. Analysis  74
      4.1 Comparison with Existing Bounds  76
   5. Summary  81

4  THEORY: SEMI-SUPERVISED LEARNING  82
   1. Introduction  82
   2. Properties of Classification  84
   3. Existing Literature  85
   4. Semi-supervised Learning Using Maximum Likelihood Estimation  87
   5. Asymptotic Properties of Maximum Likelihood Estimation with Labeled and Unlabeled Data  90
      5.1 Model Is Correct  93
      5.2 Model Is Incorrect  94
      5.3 Examples: Unlabeled Data Degrading Performance with Discrete and Continuous Variables  97
      5.4 Generating Examples: Performance Degradation with Univariate Distributions  100
      5.5 Distribution of Asymptotic Classification Error Bias  103
      5.6 Short Summary  105
   6. Learning with Finite Data  107
      6.1 Experiments with Artificial Data  108
      6.2 Can Unlabeled Data Help with Incorrect Models? Bias vs. Variance Effects and the Labeled-unlabeled Graphs  109
      6.3 Detecting When Unlabeled Data Do Not Change the Estimates  114
      6.4 Using Unlabeled Data to Detect Incorrect Modeling Assumptions  116
   7. Concluding Remarks  117

5  ALGORITHM: MAXIMUM LIKELIHOOD MINIMUM ENTROPY HMM  120
   1. Previous Work  120
   2. Mutual Information, Bayes Optimal Error, Entropy, and Conditional Probability  122
   3. Maximum Mutual Information HMMs  124
      3.1 Discrete Maximum Mutual Information HMMs  125
      3.2 Continuous Maximum Mutual Information HMMs  127
      3.3 Unsupervised Case  128
   4. Discussion  128
      4.1 Convexity  128
      4.2 Convergence  129
      4.3 Maximum A Posteriori View of Maximum Mutual Information HMMs  129
   5. Experimental Results  132
      5.1 Synthetic Discrete Supervised Data  132
      5.2 Speaker Detection  132
      5.3 Protein Data  134
      5.4 Real-time Emotion Data  134
   6. Summary  134

6  ALGORITHM: MARGIN DISTRIBUTION OPTIMIZATION  136
   1. Introduction  136
   2. A Margin Distribution Based Bound  137
   3. Existing Learning Algorithms  138
   4. The Margin Distribution Optimization (MDO) Algorithm  142
      4.1 Comparison with SVM and Boosting  143
      4.2 Computational Issues  143
   5. Experimental Evaluation  144
   6. Conclusions  145

7  ALGORITHM: LEARNING THE STRUCTURE OF BAYESIAN NETWORK CLASSIFIERS  146
   1. Introduction  146
   2. Bayesian Network Classifiers  147
      2.1 Naive Bayes Classifiers  149
      2.2 Tree-Augmented Naive Bayes Classifiers  150
   3. Switching between Models: Naive Bayes and TAN Classifiers  155
   4. Learning the Structure of Bayesian Network Classifiers: Existing Approaches  157
      4.1 Independence-based Methods  157
      4.2 Likelihood and Bayesian Score-based Methods  159
   5. Classification Driven Stochastic Structure Search  160
      5.1 Stochastic Structure Search Algorithm  160
      5.2 Adding VC Bound Factor to the Empirical Error Measure  162
   6. Experiments  163
      6.1 Results with Labeled Data  163
      6.2 Results with Labeled and Unlabeled Data  164
   7. Should Unlabeled Data Be Weighed Differently?  167
   8. Active Learning  168
   9. Concluding Remarks  170

8  APPLICATION: OFFICE ACTIVITY RECOGNITION  174
   1. Context-Sensitive Systems  174
   2. Towards Tractable and Robust Context Sensing  176
   3. Layered Hidden Markov Models (LHMMs)  177
      3.1 Approaches  178
      3.2 Decomposition per Temporal Granularity  179
   4. Implementation of SEER  181
      4.1 Feature Extraction and Selection in SEER  181
      4.2 Architecture of SEER  182
      4.3 Learning in SEER  183
      4.4 Classification in SEER  183
   5. Experiments  183
      5.1 Discussion  186
   6. Related Representations  187
   7. Summary  189

9  APPLICATION: MULTIMODAL EVENT DETECTION  192
   1. Fusion Models: A Review  193
   2. A Hierarchical Fusion Model  194
      2.1 Working of the Model  195
      2.2 The Duration Dependent Input Output Markov Model  196
   3. Experimental Setup, Features, and Results  199
   4. Summary  200

10  APPLICATION: FACIAL EXPRESSION RECOGNITION  204
   1. Introduction  204
   2. Human Emotion Research  206
      2.1 Affective Human-computer Interaction  206
      2.2 Theories of Emotion  207
      2.3 Facial Expression Recognition Studies  209
   3. Facial Expression Recognition System  214
      3.1 Face Tracking and Feature Extraction  214
      3.2 Bayesian Network Classifiers: Learning the “Structure” of the Facial Features  217
   4. Experimental Analysis  218
      4.1 Experimental Results with Labeled Data  221
         4.1.1 Person-dependent Tests  222
         4.1.2 Person-independent Tests  223
      4.2 Experiments with Labeled and Unlabeled Data  224
   5. Discussion  225

11  APPLICATION: BAYESIAN NETWORK CLASSIFIERS FOR FACE DETECTION  228
   1. Introduction  228
   2. Related Work  230
   3. Applying Bayesian Network Classifiers to Face Detection  234
   4. Experiments  235
   5. Discussion  239

References  242
Index  254