
# Intr to ML/PR Assignment 4


EECS4404/5327 Intr to ML/PR
Assignment 4

Note: This assignment is mainly a review of several basic generative models. You must
work individually. You must use the same mathematical notation as the textbook or lecture
slides to answer these questions. You must use this LaTeX template to write up your solutions.
Remember to fill in your information (name, student number, email) above. Handwritten
solutions are not accepted.


## Description

Exercise 1
Bayesian Decision Theory (20 marks)
(a) Assume that we are allowed to reject an input as unrecognizable in a pattern-classification
task. For an input x belonging to class ω, we can define a new loss function for any decision
rule g(x) as follows:
$$
l\big(\omega, g(x)\big) =
\begin{cases}
0 & : g(x) = \omega \\
1 & : g(x) \neq \omega \\
\lambda_r & : \text{rejection},
\end{cases}
$$
where λr ∈ (0, 1) is the loss incurred for choosing a rejection action. Derive the optimal decision rule for this three-way loss function.
(b) What would happen if we set λr > 1?
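1. Let $\omega_1, \dots, \omega_K$ denote the classes. Given $x$, the expected loss of deciding $g(x) = \omega_k$ is $\sum_{\omega \neq \omega_k} p(\omega \mid x) = 1 - p(\omega_k \mid x)$, while the expected loss of rejecting is $\lambda_r$. Minimizing the expected loss at every $x$ gives
$$
g^*(x) =
\begin{cases}
\arg\max_k \, p(\omega_k \mid x) & \text{if } \max_k p(\omega_k \mid x) \geq 1 - \lambda_r \\
\text{reject} & \text{otherwise},
\end{cases}
$$
where $p(\omega_k \mid x) \propto \Pr(\omega_k) \cdot p(x \mid \omega_k)$.
2. If $\lambda_r$ were set greater than 1, rejecting would always cost more than classifying, because the expected loss of the best class decision, $1 - \max_k p(\omega_k \mid x)$, never exceeds 1. The reject option would never be chosen, and the rule reduces to the usual maximum a posteriori rule $g^*(x) = \arg\max_k \Pr(\omega_k) \cdot p(x \mid \omega_k)$.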
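As an illustration, here is a minimal Python sketch of a minimum-risk rule under the three-way loss of part (a). It relies on the fact that classifying as the most probable class has expected loss 1 − p(ω|x), while rejecting has expected loss λr. The function name `decide_with_reject` and the example posteriors are hypothetical and only serve the illustration.

```python
def decide_with_reject(posteriors, reject_loss):
    """Return the index of the chosen class, or None to reject.

    posteriors  : list of class posteriors p(w_k | x), summing to 1
    reject_loss : the rejection loss (lambda_r in the exercise)
    """
    best = max(range(len(posteriors)), key=lambda k: posteriors[k])
    # Expected loss of picking the best class is 1 - p(best | x);
    # expected loss of rejecting is reject_loss. Take the cheaper action.
    if 1.0 - posteriors[best] <= reject_loss:
        return best
    return None


print(decide_with_reject([0.9, 0.1], reject_loss=0.2))    # confident input
print(decide_with_reject([0.55, 0.45], reject_loss=0.2))  # ambiguous input
print(decide_with_reject([0.55, 0.45], reject_loss=1.5))  # reject loss above 1
```

With a rejection loss above 1 the function can never return `None`, which mirrors part (b).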
Exercise 2
Gaussian Models (20 marks)
Derive the maximum likelihood estimation (MLE) for multivariate Gaussian models with a
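diagonal covariance matrix, i.e. $\mathcal{N}(x \mid \mu, \Sigma)$ with $x, \mu \in \mathbb{R}^d$ and
$$
\Sigma =
\begin{bmatrix}
\sigma_1 & & \\
 & \ddots & \\
 & & \sigma_d
\end{bmatrix}.
$$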
Show that the MLE of µ is the same as Eq. (11.3) on page 238 and that the MLE of {σ1, · · · , σd} equals
the diagonal elements in Eq. (11.4) on page 239.
Department of Electrical Engineering and Computer Science
York University EECS4404/5327 Intr to ML/PR (Winter 2021)
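1.
$$
|\Sigma| = \sigma_1 \cdot \sigma_2 \cdots \sigma_{d-1} \cdot \sigma_d
$$
$$
(x - \mu)^\top \Sigma^{-1} (x - \mu)
= \begin{bmatrix} x_1 - \mu_1 & \cdots & x_d - \mu_d \end{bmatrix}
\begin{bmatrix} \frac{1}{\sigma_1}(x_1 - \mu_1) \\ \vdots \\ \frac{1}{\sigma_d}(x_d - \mu_d) \end{bmatrix}
= \frac{1}{\sigma_1}(x_1 - \mu_1)^2 + \cdots + \frac{1}{\sigma_d}(x_d - \mu_d)^2
$$
$$
\begin{aligned}
p_{\mu,\Sigma}(x)
&= \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \, e^{-\frac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu)} \\
&= \frac{1}{(2\pi)^{d/2} (\sigma_1 \cdot \sigma_2 \cdots \sigma_{d-1} \cdot \sigma_d)^{1/2}} \, e^{-\frac{1}{2\sigma_1}(x_1 - \mu_1)^2 - \cdots - \frac{1}{2\sigma_d}(x_d - \mu_d)^2}
\end{aligned}
$$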
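At this point we can see that the density factorizes into a product of $d$ univariate Gaussians, one per dimension, with mean $\mu_i$ and variance $\sigma_i$. The log-likelihood of a training set $\{x^{(1)}, \dots, x^{(N)}\}$ therefore splits into $d$ independent terms, and setting its derivatives with respect to each $\mu_i$ and $\sigma_i$ to zero gives
$$
\mu_i = \frac{1}{N} \sum_{n=1}^{N} x_i^{(n)}, \qquad
\sigma_i = \frac{1}{N} \sum_{n=1}^{N} \big(x_i^{(n)} - \mu_i\big)^2 .
$$
These are the $i$-th component of the sample mean and the $i$-th diagonal element of the sample covariance matrix, so the MLE of µ and of {σ1, · · · , σd} agrees with the full-covariance result as the exercise requires.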
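The claim can also be checked numerically. The sketch below assumes NumPy is available and uses hypothetical synthetic data (500 samples in 3 dimensions). It computes the per-dimension mean and variance with a 1/N normalisation and compares the variances with the diagonal of the full sample covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: N = 500 samples in d = 3 dimensions.
X = rng.normal(loc=[1.0, -2.0, 0.5], scale=[1.0, 2.0, 0.5], size=(500, 3))

# Per-dimension MLE for the diagonal model.
mu = X.mean(axis=0)
sigma = ((X - mu) ** 2).mean(axis=0)   # variances, divided by N (not N - 1)

# Full-covariance MLE: sample covariance with the same 1/N normalisation.
centered = X - mu
full_cov = centered.T @ centered / X.shape[0]

# The diagonal-model variances should match the diagonal of the full estimate.
print(np.allclose(sigma, np.diag(full_cov)))
```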
Exercise 3
Gaussian Mixture Models (40 marks)
You will solve a simple binary classification problem (class A vs. class B) using simple multivariate Gaussian models as well as Gaussian mixture models (GMMs). Assume the two classes have equal
prior probabilities. Each observation is a three-dimensional (3D) feature vector. You can download the data set from: http://www.eecs.yorku.ca/~hj/MLF-gaussian-dataset.zip.
You will use several different methods to build such a classifier based on the provided training
set, and then the estimated models will be evaluated on the provided test set. You will have to
implement all training and test methods from scratch.
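As a rough starting point, a classifier with one multivariate Gaussian per class could be sketched as follows. This is an illustrative sketch rather than a reference solution: it assumes NumPy is allowed for array arithmetic, the function names (`fit_gaussian`, `log_density`, `classify`) are hypothetical, and the data are synthetic stand-ins for the provided files. With equal priors, the decision reduces to comparing class log-likelihoods.

```python
import numpy as np

def fit_gaussian(X, diagonal):
    """MLE of the mean and covariance (1/N normalisation) for one class."""
    mu = X.mean(axis=0)
    centered = X - mu
    cov = centered.T @ centered / X.shape[0]
    if diagonal:
        cov = np.diag(np.diag(cov))   # keep only the per-dimension variances
    return mu, cov

def log_density(X, mu, cov):
    """Row-wise log N(x | mu, cov)."""
    d = X.shape[1]
    diff = X - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def classify(X, model_a, model_b):
    """Equal priors: choose the class with the larger log-likelihood."""
    return np.where(log_density(X, *model_a) >= log_density(X, *model_b), "A", "B")

# Synthetic stand-in data (NOT the assignment data set).
rng = np.random.default_rng(0)
train_a = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
train_b = rng.normal(loc=4.0, scale=1.0, size=(200, 3))
test_x = np.vstack([rng.normal(0.0, 1.0, (50, 3)), rng.normal(4.0, 1.0, (50, 3))])
test_y = np.array(["A"] * 50 + ["B"] * 50)

for diagonal in (True, False):
    model_a = fit_gaussian(train_a, diagonal)
    model_b = fit_gaussian(train_b, diagonal)
    acc = (classify(test_x, model_a, model_b) == test_y).mean()
    print("diagonal" if diagonal else "full", "covariance accuracy:", acc)
```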
1. (10 marks) Build a simple classifier using multivariate Gaussian models. Each class is
modeled by a single 3D Gaussian distribution. You should consider the following structures for the covariance matrices:
• Each Gaussian uses a diagonal covariance matrix.
• Each Gaussian uses a full covariance matrix.
Use the provided training data to estimate the Gaussian mean vector and covariance
matrix for each class based on MLE. Report the classification accuracy of the MLE-trained
models as measured by the test set for each choice of the covariance matrix.
2. (30 marks) Improve the Gaussian classifier from the previous step by using a GMM to
model each class. You need to use the k-means clustering method to initialize all parameters in the GMMs, and then improve the GMMs based on the EM algorithm. Investigate
GMMs that have 2, 4, 8, or 16 Gaussian components, respectively. Determine the best
model configuration in terms of the number of Gaussian components and the covariance
matrix structure (diagonal vs. full) for this data set.
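The GMM training described in item 2 might be organised along these lines. The sketch assumes NumPy and makes several choices that the assignment does not specify and that are labelled here as assumptions: plain Lloyd's k-means with random initial centres, a fixed number of EM iterations, and a small `reg` term added to each covariance for numerical stability. All function names are hypothetical, and the data are synthetic.

```python
import numpy as np

def log_gauss(X, mu, cov):
    """Row-wise log N(x | mu, cov) for a (d, d) covariance matrix."""
    d = X.shape[1]
    diff = X - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def kmeans(X, k, rng, iters=20):
    """Plain Lloyd's k-means; returns a cluster index for every row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):        # empty clusters keep their old centre
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def component_log_probs(X, weights, means, covs):
    """(N, k) matrix of log w_j + log N(x_n | mu_j, cov_j)."""
    return np.stack([np.log(w) + log_gauss(X, m, c)
                     for w, m, c in zip(weights, means, covs)], axis=1)

def logsumexp_rows(a):
    """Numerically stable log(sum(exp(a))) along each row."""
    m = a.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True)))[:, 0]

def fit_gmm(X, k, diagonal=False, iters=50, reg=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialise responsibilities from the hard k-means assignment.
    resp = np.eye(k)[kmeans(X, k, rng)]
    history = []
    for _ in range(iters):
        # M-step: re-estimate weights, means and covariances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        covs = []
        for j in range(k):
            diff = X - means[j]
            cov = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
            covs.append(np.diag(np.diag(cov)) if diagonal else cov)
        # E-step: recompute responsibilities and record the log-likelihood.
        logp = component_log_probs(X, weights, means, covs)
        total = logsumexp_rows(logp)
        resp = np.exp(logp - total[:, None])
        history.append(total.sum())
    return weights, means, covs, history

# Synthetic two-cluster data (NOT the assignment data set).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (300, 3)), rng.normal(6.0, 1.0, (300, 3))])
weights, means, covs, history = fit_gmm(X, k=2, diagonal=True)
print(weights, history[0], history[-1])
```

To classify, one would fit one such GMM per class and, with equal priors, pick the class whose `logsumexp_rows(component_log_probs(...))` value is larger for each test vector. Repeating this for 2, 4, 8 and 16 components and for both covariance structures gives the comparison the item asks for.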
The CSV data format: all training samples are given in the file train-gaussian.csv, and all test
samples are given in the file test-gaussian.csv. Each line represents one labelled feature vector in the
following format:
y, x1, x2, x3,
where y ∈ {A, B} is the class label, and [x1 x2 x3] is the 3D feature vector.
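A minimal sketch of reading this format with Python's standard `csv` module is shown below. It assumes the files have no header row and tolerates spaces after commas and a trailing comma; the helper name `load_samples` and the two sample lines are hypothetical.

```python
import csv
import io

def load_samples(lines):
    """Parse lines of the form 'y, x1, x2, x3' into (labels, features)."""
    labels, features = [], []
    for row in csv.reader(lines, skipinitialspace=True):
        fields = [f.strip() for f in row if f.strip()]
        if not fields:
            continue                       # skip blank lines
        labels.append(fields[0])
        features.append([float(v) for v in fields[1:4]])
    return labels, features

# Hypothetical two-line sample standing in for train-gaussian.csv.
sample = io.StringIO("A, 0.5, -1.2, 3.0\nB, 2.1, 0.0, -0.7,\n")
labels, features = load_samples(sample)
print(labels)
print(features)
```

For the real files, the same function can be called on an open file handle, for example `with open("train-gaussian.csv", newline="") as f: labels, features = load_samples(f)`.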
You can use the method (item 4) in