CS4442B & CS9542B: Artificial Intelligence II – Assignment #2

1 Generative Models, Naive Bayes Classifier [20 points]
Based on the table on page 13 of Lecture 5, compute the following:
(a) [5 points] p(Water = cool|Play = yes), p(Water = cool|Play = no)
(b) [5 points] p(Play = yes|Water = warm), p(Play = no|Water = warm)
(c) [5 points] p(Play = yes|Humid = high), p(Play = yes|Humid = normal)
(d) [5 points] p(Water = cool|Play = yes), p(Water = cool|Play = no) with Laplace smoothing
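The estimates above follow the same pattern: a maximum-likelihood conditional probability is a ratio of counts, and Laplace (add-one) smoothing adds 1 to the numerator and the number of feature values to the denominator. A minimal sketch, using hypothetical counts (substitute the actual counts from the table on page 13 of Lecture 5):

```python
def cond_prob(count_joint, count_class, num_values=0, smooth=False):
    """Estimate p(feature = v | class = c).

    count_joint : number of examples with feature = v AND class = c
    count_class : number of examples with class = c
    num_values  : number of distinct values the feature can take
    smooth      : if True, apply add-one (Laplace) smoothing
    """
    if smooth:
        return (count_joint + 1) / (count_class + num_values)
    return count_joint / count_class

# Hypothetical example: 3 of 9 "Play = yes" rows have Water = cool,
# and Water takes 2 values (warm, cool).
print(cond_prob(3, 9))                             # ML estimate: 3/9
print(cond_prob(3, 9, num_values=2, smooth=True))  # Laplace: (3+1)/(9+2)
```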
2 Kernels [30 points]
In this problem, we consider constructing new kernels by combining existing kernels. Recall that for
some function k(x, z) to be a kernel, we need to be able to write it as a dot product of vectors in some
high-dimensional feature space defined by φ:
k(x, z) = φ(x)^T φ(z)
Mercer’s theorem gives a necessary and sufficient condition for a function k to be a kernel function: its
corresponding kernel matrix K has to be symmetric and positive semidefinite.
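Mercer's condition can be probed numerically: build the kernel matrix on a finite sample and check symmetry and positive semidefiniteness via its eigenvalues. A passing check on a sample proves nothing in general, but a failure immediately yields a counterexample. A sketch (the RBF kernel used for illustration is a known valid kernel; the sample points are arbitrary):

```python
import numpy as np

def kernel_matrix(k, X):
    """Kernel matrix K[i, j] = k(X[i], X[j]) over the rows of X."""
    n = X.shape[0]
    return np.array([[k(X[i], X[j]) for j in range(n)] for i in range(n)])

def looks_like_kernel(K, tol=1e-8):
    """Mercer check on a finite sample: symmetric and PSD up to tolerance."""
    symmetric = np.allclose(K, K.T)
    psd = bool(np.all(np.linalg.eigvalsh((K + K.T) / 2) >= -tol))
    return symmetric and psd

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
rbf = lambda x, z: np.exp(-np.sum((x - z) ** 2))  # a known valid kernel
print(looks_like_kernel(kernel_matrix(rbf, X)))   # True
```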
Suppose that k1(x, z) and k2(x, z) are two valid kernels. For each of the cases below, state whether
k is also a valid kernel. If it is, prove it. If it is not, give a counterexample. You can use either Mercer’s
theorem, or the definition of a kernel as needed to prove it.
(a) [10 points] k(x, z) = k1(x, z)k2(x, z)
(b) [10 points] k(x, z) = f1(x)f1(z) + f2(x)f2(z), where f1, f2 : R^n → R are real-valued functions
(c) [10 points] k(x, z) = k1(x, z) / √(k1(x, x) k1(z, z)), where k1(x, x) > 0 for any x.
3 PCA and Eigenface [50 points]
For this exercise, you will use principal component analysis (PCA) to analyze face images in any programming language of your choice (e.g., Python/Matlab/R). The data set is faces.dat; each row represents an image (400 images), and each column represents a pixel (64 × 64 = 4096 pixels).
(a) [5 points] Display the 100th image.
(b) [5 points] Remove the mean of the images, and then display the 100th image.
(c) [10 points] Perform PCA on the mean-centered data matrix. You can either implement PCA yourself using eigenvalue decomposition of the sample covariance matrix (e.g., the eig function in Matlab), or use an existing machine learning toolbox (e.g., pca in Matlab). Sort the eigenvalues in descending order and plot them.
(d) [5 points] You will find the last (i.e., 400th) eigenvalue is 0. Explain why.
(e) [5 points] Based on the eigenvalues, determine the dimensionality of the data you want to keep
(i.e., how many principal components you want to keep), which accounts for most of the variance.
(f) [10 points] Display the top-5 leading eigenvectors (corresponding to the top-5 largest eigenvalues)
in 5 figures.
(g) [10 points] Display the reconstructed 100th image using 10, 100, 200, and 399 principal components, respectively. (Hint: in Lecture 10 (page 19), we learned that x̂ = v v^T x if we project x into 1-dimensional space using the 1st principal component. Reconstructing x̂ using the top-K principal components is a straightforward extension: x̂ = Σ_{k=1}^{K} v_k v_k^T x.)
Hint: in order to display the images and principal components (i.e., 3(a), 3(b), 3(f), 3(g)), you should
first reshape the 4096-dimensional vector into a 64 × 64 matrix (e.g., use reshape in Matlab). Then, in
Matlab, you can use the function imshow to display them.
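The steps above can be sketched in Python/NumPy (one of the languages the assignment allows). This assumes faces.dat loads with np.loadtxt as a plain-text 400 × 4096 matrix — adjust the loader to the actual file format. The display steps are guarded so the sketch only touches the data file if it is present:

```python
import os
import numpy as np

def pca(Xc):
    """Eigendecomposition of the sample covariance of mean-centered Xc.
    Returns eigenvalues and eigenvectors sorted in descending order."""
    C = Xc.T @ Xc / (Xc.shape[0] - 1)        # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort descending
    return eigvals[order], eigvecs[:, order]

def reconstruct(x, eigvecs, K):
    """x_hat = sum_{k=1}^{K} v_k v_k^T x, using the top-K eigenvectors."""
    V = eigvecs[:, :K]                       # (d, K)
    return V @ (V.T @ x)                     # project, then map back

if os.path.exists("faces.dat"):
    import matplotlib.pyplot as plt

    X = np.loadtxt("faces.dat")              # (400, 4096), one image per row
    show = lambda v: (plt.imshow(v.reshape(64, 64), cmap="gray"), plt.show())

    show(X[99])                              # (a) the 100th image (index 99)
    Xc = X - X.mean(axis=0)
    show(Xc[99])                             # (b) after removing the mean

    eigvals, eigvecs = pca(Xc)               # (c) eigenvalue spectrum
    plt.plot(eigvals); plt.show()

    for k in range(5):                       # (f) top-5 eigenvectors
        show(eigvecs[:, k])

    for K in (10, 100, 200, 399):            # (g) reconstructions
        show(reconstruct(Xc[99], eigvecs, K))
```

Using all principal components of a full-rank centered matrix reproduces the centered image exactly, which is a convenient sanity check for the reconstruction code.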