Support Vector Machine (SVM): Hand‑Calculated Example + Theory
Linear hard‑margin SVM – step‑by‑step + equal axes + full rationale
Step 1 – Dataset
| Point | \(\mathbf{x}_i\) | \(y_i\) |
|---|---|---|
| 1 | (1,1) | +1 |
| 2 | (2,2) | +1 |
| 3 | (0,0) | -1 |
| 4 | (-1,-1) | -1 |
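For readers who want to follow along in code, a minimal NumPy encoding of this dataset might look like the following (the later sketches redefine the same arrays so each one runs on its own):

```python
import numpy as np

# The four training points and their labels from the table above.
X = np.array([[1.0, 1.0],     # point 1, class +1
              [2.0, 2.0],     # point 2, class +1
              [0.0, 0.0],     # point 3, class -1
              [-1.0, -1.0]])  # point 4, class -1
y = np.array([1, 1, -1, -1])
```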
Step 2 – Objective
\[ \text{minimize } \frac{1}{2} \|\mathbf{w}\|^2
\quad \text{subject to} \quad
y_i (\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1 \quad \forall i \]
Step 3 – Hyperplane Guess
\[ x_1 + x_2 - 1 = 0 \quad\Rightarrow\quad
\mathbf{w} = (1,1),\ b = -1 \]
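Plugging each point into the constraint confirms that this guess is feasible:

\[
\begin{aligned}
y_1(\mathbf{w}\cdot\mathbf{x}_1 + b) &= +1\,(1 + 1 - 1) = 1 \\
y_2(\mathbf{w}\cdot\mathbf{x}_2 + b) &= +1\,(2 + 2 - 1) = 3 \\
y_3(\mathbf{w}\cdot\mathbf{x}_3 + b) &= -1\,(0 + 0 - 1) = 1 \\
y_4(\mathbf{w}\cdot\mathbf{x}_4 + b) &= -1\,(-1 - 1 - 1) = 3
\end{aligned}
\]

All four values are at least 1, and only points 1 and 3 meet the bound with equality, which is exactly what Step 4 uses.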
Step 4 – Support Vectors
- \(\mathbf{x}_1 = (1,1),\ y_1 = +1\) → support vector
- \(\mathbf{x}_3 = (0,0),\ y_3 = -1\) → support vector
Step 5 – Dual Solution
\[ \alpha_1 = \alpha_3 = 1 \quad\Rightarrow\quad
\mathbf{w} = (1,1),\ b = -1 \]
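For completeness, the hard-margin dual being solved here is

\[
\max_{\boldsymbol{\alpha} \geq 0}\ \sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j\, \mathbf{x}_i \cdot \mathbf{x}_j
\quad \text{subject to} \quad \sum_i \alpha_i y_i = 0.
\]

With only \(\alpha_1, \alpha_3\) non-zero, the equality constraint forces \(\alpha_1 = \alpha_3 = \alpha\); since \(\mathbf{x}_1 \cdot \mathbf{x}_1 = 2\) and \(\mathbf{x}_3 = (0,0)\), the objective reduces to \(2\alpha - \alpha^2\), maximized at \(\alpha = 1\). Then \(\mathbf{w} = \sum_i \alpha_i y_i \mathbf{x}_i = (1,1)\) and \(b = y_1 - \mathbf{w}\cdot\mathbf{x}_1 = 1 - 2 = -1\).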
Step 6 – Margin
\[ \|\mathbf{w}\| = \sqrt{2},\quad
\text{Margin} = \frac{2}{\sqrt{2}} = \sqrt{2} \]
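As a cross-check, each support vector sits at distance \(1/\sqrt{2}\) from the hyperplane:

\[
\frac{|\mathbf{w}\cdot\mathbf{x}_1 + b|}{\|\mathbf{w}\|} = \frac{|1+1-1|}{\sqrt{2}} = \frac{1}{\sqrt{2}},
\qquad
\frac{|\mathbf{w}\cdot\mathbf{x}_3 + b|}{\|\mathbf{w}\|} = \frac{|0+0-1|}{\sqrt{2}} = \frac{1}{\sqrt{2}},
\]

so the total width between the two margin planes is \(2/\sqrt{2} = \sqrt{2}\).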
Step 7 – Plot (Equal Axes)
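A minimal matplotlib sketch of this plot, assuming the NumPy arrays from Step 1 (variable names are illustrative), might look like:

```python
import numpy as np
import matplotlib.pyplot as plt

X = np.array([[1, 1], [2, 2], [0, 0], [-1, -1]], dtype=float)
y = np.array([1, 1, -1, -1])

fig, ax = plt.subplots()
ax.scatter(X[y == 1, 0], X[y == 1, 1], marker="o", label="class +1")
ax.scatter(X[y == -1, 0], X[y == -1, 1], marker="s", label="class -1")

# Decision boundary x1 + x2 - 1 = 0 and margin planes x1 + x2 = 0, x1 + x2 = 2.
x1 = np.linspace(-2, 3, 100)
ax.plot(x1, 1 - x1, "k-", label="decision boundary")
ax.plot(x1, -x1, "k--", label="margin")
ax.plot(x1, 2 - x1, "k--")

ax.set_aspect("equal")  # equal axes so the margin width is not visually distorted
ax.legend()
plt.show()
```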
Step 8 – Test Point
Point \((1.5, 0)\):
\[ \mathbf{w} \cdot (1.5, 0) + b = 1.5 - 1 = 0.5 > 0 \quad\Rightarrow\quad +1 \]
Final Answer
Hyperplane: \(x_1 + x_2 - 1 = 0\)
\(\mathbf{w} = (1,1),\ b = -1\)
Support vectors: \((1,1)\) and \((0,0)\)
Margin: \(\sqrt{2}\)
Test point \((1.5, 0)\) → class +1
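These results can be sanity-checked numerically, e.g. with scikit-learn's `SVC` using a linear kernel and a large `C` to approximate the hard margin (a verification sketch, not part of the hand calculation):

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 1], [2, 2], [0, 0], [-1, -1]], dtype=float)
y = np.array([1, 1, -1, -1])

clf = SVC(kernel="linear", C=1e6)  # a very large C approximates the hard margin
clf.fit(X, y)

print(clf.coef_)                   # expected close to [[1. 1.]]
print(clf.intercept_)              # expected close to [-1.]
print(clf.support_vectors_)        # expected (0,0) and (1,1), in some order
print(clf.predict([[1.5, 0.0]]))   # expected [1]
```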
Rationale Behind the Hand‑Calculated Example
Why this dataset? All four points lie on the diagonal \(x_2 = x_1\), with the positive class up the diagonal and the negative class at and below the origin → intuitive geometry, simple algebra.
Hard‑margin: Perfect separation allows strict constraints → dual is a 2‑variable QP, solvable by hand.
Support vectors: Only \((1,1)\) and \((0,0)\) lie on margin planes → others have \(\alpha_i = 0\).
Dual: the constraint \(\sum_i \alpha_i y_i = 0\) forces \(\alpha_1 = \alpha_3 = \alpha\); the objective reduces to \(2\alpha - \alpha^2\), maximized at \(\alpha = 1\).
Margin: \(\|\mathbf{w}\| = \sqrt{2} \Rightarrow\) width \(\sqrt{2}\), matches distance between \(x_1 + x_2 = 0\) and \(x_1 + x_2 = 2\).
General Rationale Behind SVM for Classification
Core Idea: Find the maximum‑margin hyperplane that separates classes. Why maximum margin?
- Geometric intuition: The decision boundary farthest from both classes → most robust to noise and small perturbations of the data.
- Statistical learning theory: Larger margin → tighter generalization-error bounds (Vapnik–Chervonenkis theory).
- Optimization: Maximizing the margin = minimizing \(\|\mathbf{w}\|^2\) → convex quadratic program → unique global solution (see the margin formula after this list).
- Reference: https://en.wikipedia.org/wiki/Support_vector_machine
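The link between the margin and \(\|\mathbf{w}\|\) is the standard point-to-hyperplane distance:

\[
d(\mathbf{x}) = \frac{|\mathbf{w}\cdot\mathbf{x} + b|}{\|\mathbf{w}\|}.
\]

With the canonical scaling \(y_i(\mathbf{w}\cdot\mathbf{x}_i + b) \geq 1\), the closest points on each side are at distance \(1/\|\mathbf{w}\|\), so the margin is \(2/\|\mathbf{w}\|\) and maximizing it is the same as minimizing \(\|\mathbf{w}\|^2\).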
Key Components
| Concept | Purpose |
|---|---|
| Support Vectors | Only points on the margin matter; the solution is sparse, and every other point has \(\alpha_i = 0\). |
| Dual Form | Depends only on dot products \(\mathbf{x}_i \cdot \mathbf{x}_j\) → enables the kernel trick for non-linear data. |
| Hard vs Soft Margin | Hard: assumes perfect separation. Soft (C-parameter): allows some margin violations for robustness (see the objective below the table). |
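For reference, the standard soft-margin primal trades margin width against slack variables \(\xi_i\):

\[
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \ \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_i \xi_i
\quad \text{subject to} \quad y_i(\mathbf{w}\cdot\mathbf{x}_i + b) \geq 1 - \xi_i,\ \ \xi_i \geq 0.
\]

A large \(C\) penalizes violations heavily (approaching the hard margin); a small \(C\) tolerates them in exchange for a wider margin.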
Why SVM Works So Well
- Regularization via margin: Controls model complexity without explicit feature selection.
- Kernel flexibility: Maps data to a high-dimensional space where classes become separable (see the sketch after this list).
- Theoretical guarantees: Bounds on test error based on margin and support vector count.
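To illustrate kernel flexibility, here is a short scikit-learn sketch on an XOR-style toy dataset (illustrative data, not from the worked example): a linear kernel cannot classify all four points correctly, while an RBF kernel can.

```python
import numpy as np
from sklearn.svm import SVC

# XOR-style data: no single line separates the two classes.
X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
y = np.array([1, 1, -1, -1])

linear = SVC(kernel="linear", C=10.0).fit(X, y)
rbf = SVC(kernel="rbf", gamma=2.0, C=10.0).fit(X, y)

print(linear.score(X, y))  # below 1.0: no linear boundary fits XOR
print(rbf.score(X, y))     # expected 1.0: the RBF feature map makes the classes separable
```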
In one sentence:
SVM finds the safest possible decision boundary by pushing it as far from the data as possible, using only the most critical points (support vectors) to define it.