Introductory Data Mining Exam

All Computational Questions
Time: 120 minutes | Total: 100 points
Show all work. Final answers must be boxed. Calculators allowed.

Question 1: RMSE Calculation (12 pts)

Using the model $$\hat{y} = 3.2 + 1.4X_1 - 0.9X_2 + 0.6X_3$$, compute the RMSE for the following new points:

| $X_1$ | $X_2$ | $X_3$ | $y$ (actual) |
|---|---|---|---|
| 2 | 6 | 0 | 1 |
| 3 | 5 | 2 | 4 |
| 0 | 3 | 0 | 0.3 |

a) (6 pts) Fill in the table below with $\hat{y}$ and $(y - \hat{y})^2$.

b) (6 pts) Compute the RMSE and box your answer.

| $X_1$ | $X_2$ | $X_3$ | $y$ | $\hat{y}$ | $(y - \hat{y})^2$ |
|---|---|---|---|---|---|
| 2 | 6 | 0 | 1 | ? | ? |
| 3 | 5 | 2 | 4 | ? | ? |
| 0 | 3 | 0 | 0.3 | ? | ? |
|   |   |   |   | Sum = | ? |

RMSE = $\sqrt{\frac{1}{3} \sum (y - \hat{y})^2}$ = ?

Question 2: Logistic Regression (14 pts)

Model: $$\log\left(\frac{P(Y=1)}{1-P(Y=1)}\right) = -1.8 + 0.7X_1 - 1.2X_2$$

a) (6 pts) Compute $P(Y=1)$ when $X_1 = 3$, $X_2 = 1$. Use $e \approx 2.718$.

b) (4 pts) Compute the odds $P(Y=1)/(1-P(Y=1))$.

c) (4 pts) By how much do the odds multiply when $X_1$ increases by 1 (holding $X_2$ fixed)?

Question 3: Lasso vs OLS (10 pts)

| Predictor | OLS Coef | Lasso Coef |
|---|---|---|
| $X_1$ | 0.45 | 0.42 |
| $X_2$ | 0.03 | 0.00 |
| $X_3$ | −0.71 | −0.68 |

a) (4 pts) Which variable was shrunk to zero by Lasso?

b) (6 pts) The Lasso objective adds $\lambda \sum | \beta_j |$. With $\lambda = 0.1$, compute the $L_1$ penalty term using the Lasso coefficients.

Question 4: K-Nearest Neighbors (12 pts)

The following 5 points are given with their coordinates and class labels:

| Point | $X_1$ | $X_2$ | Class |
|---|---|---|---|
| A | 1 | 1 | Red |
| B | 2 | 1 | Blue |
| C | 5 | 5 | Red |
| D | 6 | 5 | Blue |
| E | 3 | 3 | Red |

A new point is observed at: $P = (3, 3)$.

a) (6 pts) Compute the Euclidean distance from $P$ to each of the 5 points. Then, list the 3 nearest neighbors (in order of increasing distance).

b) (6 pts) Using $k=3$ and majority vote, predict the class of point $P$. Box your final answer.

Show your distance calculations below:

| Point | Distance to $P$ |
|---|---|
| A | ? |
| B | ? |
| C | ? |
| D | ? |
| E | ? |

Predicted class of $P$: ?

Question 5: K-Means Clustering (14 pts)

Assume $K=2$. At a certain iteration, cluster 1 has 25 observations with centroid at (2, 3) and cluster 2 has 20 observations with centroid at (8, 7).
New point: P = (5, 4)

a) (6 pts) Find the Euclidean distance from the new point to each centroid.

b) (4 pts) Assign the new point to the closer cluster.

c) (4 pts) After assigning P, can you compute the new centroids? If yes, give the coordinates (rounded to two decimals). If not, explain why.

Question 6: Hierarchical Clustering (10 pts)

The table below shows the pairwise Euclidean distances between points A, B, and C:

|   | A | B | C |
|---|---|---|---|
| A |   | 2 | 5 |
| B | 2 |   | 3 |
| C | 5 | 3 |   |

a) (6 pts) Using single linkage, which two points are merged first?

b) (4 pts) After the first merge, what is the single-linkage distance between the new cluster and the remaining point?

Question 7: Classification Tree – Entropy (12 pts)

Parent node: 60 Class 0, 40 Class 1 (total 100).

a) (4 pts) Compute the entropy of the parent node. Use natural logarithm (base $e$). Round to three decimals.

b) (8 pts) A split produces:
Left child: 50 Class 0, 10 Class 1
Right child: 10 Class 0, 30 Class 1
Compute the weighted entropy of the children (round to three decimals).

Question 8: SVM (10 pts)

Hyperplane: $w = (1, -1)$, $b = -3$. Decision function $f(x) = w \cdot x + b$.

a) (6 pts) Compute $f(3,3)$.

b) (4 pts) Predict the label (sign of $f$).

Question 9: Neural Network – Forward Pass (16 pts)

We train a neural network with two inputs $x_1=2$, $x_2=-1$ and a binary output ($Y=0$ or $Y=1$). The architecture has one hidden node with ReLU activation and two output nodes combined by a softmax layer.

Weights and biases:

| From | To | Weight | Bias (at target) |
|---|---|---|---|
| $x_1$ | Hidden | $w_1 = 0.5$ | $b_h = 0.3$ |
| $x_2$ | Hidden | $w_2 = -1.2$ |   |
| Hidden | $Y=1$ node | $w_{o1} = 1.5$ | $b_o = 0.0$ |
| Hidden | $Y=0$ node | $w_{o0} = 0.8$ |   |

Draw the neural network below with all weights and biases clearly labeled. Include input nodes, one hidden node (ReLU), and two output nodes (softmax). Show arrows and label each connection with its weight, and each node with its bias (if any).

[Draw your neural network diagram here]

a) (4 pts) Compute the pre-activation $z_h$ at the hidden node.

b) (3 pts) Compute the hidden node output after ReLU.

c) (4 pts) Compute the pre-softmax values $a_1$ and $a_0$ at the two output nodes.

d) (5 pts) Apply softmax to get $P(Y=1)$ and $P(Y=0)$. Round to three decimals and box both.

End of Exam

SOLUTIONS

Question 1: RMSE Calculation

a)

| $X_1$ | $X_2$ | $X_3$ | $y$ | $\hat{y}$ | $(y - \hat{y})^2$ |
|---|---|---|---|---|---|
| 2 | 6 | 0 | 1 | 0.6 | 0.16 |
| 3 | 5 | 2 | 4 | 4.1 | 0.01 |
| 0 | 3 | 0 | 0.3 | 0.5 | 0.04 |
|   |   |   |   | Sum = | 0.21 |

b) $$\text{RMSE}=\sqrt{\frac{0.21}{3}}=\sqrt{0.07}\approx \boxed{0.265}$$
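As a quick check (not part of the exam), the fitted values and RMSE can be reproduced with a few lines of Python using only the standard library; names here are illustrative:

```python
import math

# Fitted model from Question 1: y_hat = 3.2 + 1.4*x1 - 0.9*x2 + 0.6*x3
def predict(x1, x2, x3):
    return 3.2 + 1.4 * x1 - 0.9 * x2 + 0.6 * x3

# (x1, x2, x3, actual y) for the three new points
points = [(2, 6, 0, 1.0), (3, 5, 2, 4.0), (0, 3, 0, 0.3)]
sq_errors = [(y - predict(x1, x2, x3)) ** 2 for x1, x2, x3, y in points]
rmse = math.sqrt(sum(sq_errors) / len(points))
```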

Question 2: Logistic Regression

logit $= -1.8 + 0.7(3) - 1.2(1) = -0.9$

a) $$P(Y=1)=\frac{1}{1+e^{0.9}}\approx \boxed{0.289}$$

b) $$\text{Odds}=e^{-0.9}\approx \boxed{0.407}$$

c) $$e^{0.7}\approx \boxed{2.014}$$
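These three quantities can be verified with a short Python check (a sketch, not part of the exam):

```python
import math

# Logistic model from Question 2: logit = -1.8 + 0.7*x1 - 1.2*x2
logit = -1.8 + 0.7 * 3 - 1.2 * 1      # = -0.9
p = 1.0 / (1.0 + math.exp(-logit))    # P(Y=1)
odds = p / (1.0 - p)                  # equals exp(logit)
odds_ratio = math.exp(0.7)            # odds multiplier when x1 rises by 1
```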

Question 3: Lasso vs OLS

a) $\boxed{X_2}$ (its coefficient is shrunk exactly to zero)

b) $$0.1 \times \left(|0.42| + |0.00| + |-0.68|\right) = 0.1 \times 1.10 = \boxed{0.11}$$
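A one-line Python check of the $L_1$ penalty (illustrative only):

```python
# L1 penalty from Question 3: lambda * sum(|beta_j|) over the Lasso coefficients
lasso_coefs = [0.42, 0.00, -0.68]
lam = 0.1
penalty = lam * sum(abs(b) for b in lasso_coefs)
```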

Question 4: K-Nearest Neighbors

a) E: $0$, B: $\sqrt{5}\approx 2.236$, A: $\sqrt{8}\approx 2.828$, C: $\sqrt{8}\approx 2.828$, D: $\sqrt{13}\approx 3.606$. The three nearest are E, B, and then A or C (tied at $\approx 2.828$).

b) With $k=3$ the vote is Red (E), Blue (B), and Red (A or C, both Red), so the majority is $\boxed{\text{Red}}$ regardless of how the tie is broken.
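The distances and the majority vote can be reproduced in Python (a sketch; the tie between A and C is broken alphabetically here, which does not change the prediction):

```python
import math
from collections import Counter

# Training points from Question 4: name -> (x1, x2, class)
data = {"A": (1, 1, "Red"), "B": (2, 1, "Blue"), "C": (5, 5, "Red"),
        "D": (6, 5, "Blue"), "E": (3, 3, "Red")}
p = (3, 3)

# Sort by Euclidean distance; exact ties fall back to alphabetical order
ranked = sorted((math.dist(p, (x, y)), name, cls)
                for name, (x, y, cls) in data.items())
neighbors = ranked[:3]
prediction = Counter(cls for _, _, cls in neighbors).most_common(1)[0][0]
```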

Question 5: K-Means Clustering

a) $d_1=\sqrt{(5-2)^2+(4-3)^2}=\sqrt{10}\approx 3.162$, $d_2=\sqrt{(5-8)^2+(4-7)^2}=\sqrt{18}\approx 4.243$

b) $d_1 < d_2$, so P is assigned to $\boxed{\text{Cluster 1}}$.

c) Yes. Cluster 1's new centroid is $\left(\frac{25\cdot 2+5}{26},\ \frac{25\cdot 3+4}{26}\right)\approx \boxed{(2.12,\ 3.04)}$; cluster 2's centroid is unchanged at $(8, 7)$.
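A short Python check of the assignment and the size-weighted centroid update (illustrative variable names):

```python
import math

# Question 5 setup: centroids, cluster sizes, and the new point
c1, n1 = (2.0, 3.0), 25
c2, n2 = (8.0, 7.0), 20
p = (5.0, 4.0)

d1, d2 = math.dist(p, c1), math.dist(p, c2)
assigned = 1 if d1 < d2 else 2
# After P joins cluster 1, its centroid is the size-weighted average
new_c1 = tuple((n1 * c + x) / (n1 + 1) for c, x in zip(c1, p))
```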

Question 6: Hierarchical Clustering

a) $\boxed{\text{A and B}}$ (smallest pairwise distance, $d(A,B)=2$)

b) $d(\{A,B\},\ C)=\min\big(d(A,C),\ d(B,C)\big)=\min(5, 3)=\boxed{3}$
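The first single-linkage merge and the resulting cluster-to-point distance can be checked in Python (a sketch over the three given distances):

```python
# Question 6 distances; single linkage merges the closest pair first
d = {("A", "B"): 2, ("A", "C"): 5, ("B", "C"): 3}
first_merge = min(d, key=d.get)

# Single-linkage distance from the merged cluster to C: min over cross pairs
cluster = set(first_merge)
rest = {"A", "B", "C"} - cluster
link = min(d[tuple(sorted((a, b)))] for a in cluster for b in rest)
```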

Question 7: Classification Tree – Entropy (Base $e$)

a) $$H = -0.6 \ln 0.6 - 0.4 \ln 0.4 \approx \boxed{0.673}$$

b) Left: $H_L = -\frac{5}{6}\ln\frac{5}{6} - \frac{1}{6}\ln\frac{1}{6} \approx 0.451$; Right: $H_R = -0.25\ln 0.25 - 0.75\ln 0.75 \approx 0.562$; Weighted: $0.6(0.451) + 0.4(0.562) \approx \boxed{0.495}$
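The entropies can be reproduced with a small Python helper (natural log, as the question specifies):

```python
import math

def entropy(counts):
    """Entropy in nats (natural log) of a node with the given class counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

h_parent = entropy([60, 40])
h_left, h_right = entropy([50, 10]), entropy([10, 30])
h_weighted = (60 / 100) * h_left + (40 / 100) * h_right
```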

Question 8: SVM

a) $f(3,3) = 1\cdot 3 + (-1)\cdot 3 - 3 = \boxed{-3}$

b) $\operatorname{sign}(-3) = \boxed{-1}$
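A minimal Python check of the decision function (a sketch; the convention for $f(x)=0$ is arbitrary and does not arise here):

```python
# Question 8: linear decision function f(x) = w . x + b
w, b = (1, -1), -3

def f(x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

score = f((3, 3))
label = 1 if score > 0 else -1   # predicted label is the sign of f
```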

Question 9: Neural Network – Forward Pass (Softmax Output)

Neural network diagram (for reference): inputs $x_1 = 2$ and $x_2 = -1$ feed one ReLU hidden node $h$ (weights $w_1 = 0.5$, $w_2 = -1.2$; bias $b_h = 0.3$), which feeds the softmax output nodes $Y=1$ (weight $w_{o1} = 1.5$) and $Y=0$ (weight $w_{o0} = 0.8$), each with bias $b_o = 0.0$.

a) $z_h = 0.5\cdot 2 + (-1.2)\cdot(-1) + 0.3 = \boxed{2.5}$

b) $\text{ReLU}(2.5) = \boxed{2.5}$

c)
$a_1 = 1.5·2.5 + 0.0 = 3.75$
$a_0 = 0.8·2.5 + 0.0 = 2.0$

d)
$e^{3.75} \approx 42.52$, $e^{2.0} \approx 7.39$
Sum = 42.52 + 7.39 = 49.91
$P(Y=1) = 42.52 / 49.91 \approx 0.852$
$P(Y=0) = 7.39 / 49.91 \approx 0.148$

$\boxed{P(Y=1) = 0.852}, \quad \boxed{P(Y=0) = 0.148}$
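The full forward pass can be verified in Python (a sketch using the weights and biases from the table):

```python
import math

# Question 9 forward pass: inputs, weights, and biases from the table
x1, x2 = 2, -1
w1, w2, b_h = 0.5, -1.2, 0.3
w_o1, w_o0, b_o = 1.5, 0.8, 0.0

z_h = w1 * x1 + w2 * x2 + b_h            # pre-activation at the hidden node
h = max(0.0, z_h)                        # ReLU
a1, a0 = w_o1 * h + b_o, w_o0 * h + b_o  # pre-softmax output values

# Softmax over the two output nodes
e1, e0 = math.exp(a1), math.exp(a0)
p1, p0 = e1 / (e1 + e0), e0 / (e1 + e0)
```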
End of Solutions