Support Vector Machine (SVM): Hand‑Calculated Example + Theory

Linear hard‑margin SVM – step‑by‑step + equal axes + full rationale

Step 1 – Dataset

| Point | \(\mathbf{x}_i\) | \(y_i\) |
| --- | --- | --- |
| 1 | \((1, 1)\) | \(+1\) |
| 2 | \((2, 2)\) | \(+1\) |
| 3 | \((0, 0)\) | \(-1\) |
| 4 | \((-1, -1)\) | \(-1\) |
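
For readers who want to follow along in code, here is the same dataset as NumPy arrays (a minimal sketch; the names `X` and `y` are my own, not from the original):

```python
import numpy as np

# Training points (rows) and labels from Step 1
X = np.array([[ 1,  1],
              [ 2,  2],
              [ 0,  0],
              [-1, -1]], dtype=float)
y = np.array([+1, +1, -1, -1])
```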

Step 2 – Objective

\[ \text{minimize } \frac{1}{2} \|\mathbf{w}\|^2 \quad \text{subject to} \quad y_i (\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1 \quad \forall i \]

Step 3 – Hyperplane Guess

\[ x_1 + x_2 - 1 = 0 \quad\Rightarrow\quad \mathbf{w} = (1,1),\ b = -1 \]
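
A quick numerical check of this guess (a NumPy sketch; the variable names are mine). It evaluates the functional margin \(y_i (\mathbf{w} \cdot \mathbf{x}_i + b)\) for every point:

```python
import numpy as np

X = np.array([[1, 1], [2, 2], [0, 0], [-1, -1]], dtype=float)
y = np.array([1, 1, -1, -1])

w = np.array([1.0, 1.0])  # guessed weight vector
b = -1.0                  # guessed bias

# Functional margins y_i * (w . x_i + b); all must be >= 1 for a valid hard margin
margins = y * (X @ w + b)
print(margins)  # expected: [1. 3. 1. 3.]
```

The points whose margin equals exactly 1 are the ones that turn out to be support vectors in Step 4.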

Step 4 – Support Vectors

Plugging each point into \(y_i (\mathbf{w} \cdot \mathbf{x}_i + b)\) gives 1, 3, 1, 3. The points \((1,1)\) and \((0,0)\) satisfy the constraint with equality, so they lie on the margin hyperplanes \(x_1 + x_2 = 2\) and \(x_1 + x_2 = 0\) and are the support vectors; the other two points lie strictly outside the margin.

Step 5 – Dual Solution

\[ \alpha_1 = \alpha_3 = 1 \quad\Rightarrow\quad \mathbf{w} = (1,1),\ b = -1 \]
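
For completeness, here is a sketch of how the standard hard-margin dual reduces for this dataset. With only \(\alpha_1\) and \(\alpha_3\) nonzero, the constraint \(\sum_i \alpha_i y_i = 0\) forces \(\alpha_1 = \alpha_3 = \alpha\), and since \(\mathbf{x}_3 = (0,0)\) only \(\mathbf{x}_1\) contributes to the quadratic term:

\[ \max_{\alpha \ge 0}\ \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j\, \mathbf{x}_i \cdot \mathbf{x}_j \;=\; \max_{\alpha \ge 0}\ 2\alpha - \alpha^2 \quad\Rightarrow\quad \alpha = 1 \]
\[ \mathbf{w} = \sum_i \alpha_i y_i \mathbf{x}_i = (1,1), \qquad b = y_1 - \mathbf{w} \cdot \mathbf{x}_1 = 1 - 2 = -1 \]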

Step 6 – Margin

\[ \|\mathbf{w}\| = \sqrt{2},\quad \text{Margin} = \frac{2}{\sqrt{2}} = \sqrt{2} \]
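
As a sanity check, the distance from each support vector to the hyperplane \(x_1 + x_2 - 1 = 0\) is \(1 / \|\mathbf{w}\|\), and the full margin width is twice that:

\[ \text{dist}\big((1,1)\big) = \frac{|1 + 1 - 1|}{\sqrt{2}} = \frac{1}{\sqrt{2}}, \qquad \text{dist}\big((0,0)\big) = \frac{|0 + 0 - 1|}{\sqrt{2}} = \frac{1}{\sqrt{2}}, \qquad \text{width} = \frac{2}{\sqrt{2}} = \sqrt{2} \]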

Step 7 – Plot (Equal Axes)
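
If you want to reproduce the plot, here is a minimal matplotlib sketch (my own code, not from the original) showing the points, the decision boundary, and the two margin lines with equal axes:

```python
import numpy as np
import matplotlib.pyplot as plt

# Dataset from Step 1
pos = np.array([[1, 1], [2, 2]])    # y = +1
neg = np.array([[0, 0], [-1, -1]])  # y = -1

fig, ax = plt.subplots()
ax.scatter(pos[:, 0], pos[:, 1], marker='o', label='+1')
ax.scatter(neg[:, 0], neg[:, 1], marker='s', label='-1')

# Decision boundary x1 + x2 = 1 and margin lines x1 + x2 = 0, x1 + x2 = 2
x1 = np.linspace(-2, 3, 100)
ax.plot(x1, 1 - x1, 'k-', label='x1 + x2 = 1')   # hyperplane
ax.plot(x1, 0 - x1, 'k--', label='x1 + x2 = 0')  # margin (negative side)
ax.plot(x1, 2 - x1, 'k--', label='x1 + x2 = 2')  # margin (positive side)

ax.set_aspect('equal')  # equal axes so the margin width is not visually distorted
ax.legend()
plt.show()
```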

Step 8 – Test Point

Point \((1.5, 0)\):

\[ \mathbf{w} \cdot (1.5, 0) + b = 1.5 - 1 = 0.5 > 0 \quad\Rightarrow\quad +1 \]
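
The same check in code (a sketch; the helper name `classify` is hypothetical):

```python
import numpy as np

w = np.array([1.0, 1.0])
b = -1.0

def classify(x):
    """Return +1 or -1 according to the sign of w . x + b."""
    return int(np.sign(w @ np.asarray(x, dtype=float) + b))

print(classify((1.5, 0)))  # 1.5 - 1 = 0.5 > 0, so +1
```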

Final Answer
Hyperplane: \(x_1 + x_2 - 1 = 0\)
\(\mathbf{w} = (1,1),\ b = -1\)
Support vectors: \((1,1)\) and \((0,0)\)
Margin: \(\sqrt{2}\)
Test point \((1.5, 0)\) → class +1
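
One way to double-check the whole hand calculation is to fit a linear SVM with a very large C, so the soft margin approximates the hard margin. This uses scikit-learn, which the original example does not rely on, so treat it as an optional verification sketch:

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 1], [2, 2], [0, 0], [-1, -1]], dtype=float)
y = np.array([1, 1, -1, -1])

# Very large C approximates a hard-margin SVM on separable data
clf = SVC(kernel='linear', C=1e6)
clf.fit(X, y)

print(clf.coef_)                # approximately [[1. 1.]]
print(clf.intercept_)           # approximately [-1.]
print(clf.support_vectors_)     # (0, 0) and (1, 1); order may vary
print(clf.predict([[1.5, 0]]))  # [1]
```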

Rationale Behind the Hand‑Calculated Example

Why this dataset? Four collinear points on the diagonal \(x_2 = x_1\), two per class on opposite sides of the midpoint \((0.5, 0.5)\) → intuitive geometry, simple algebra, and margin lines that can be read off by inspection.

Hard‑margin: Perfect separation allows strict constraints → dual is a 2‑variable QP, solvable by hand.

Support vectors: Only \((1,1)\) and \((0,0)\) lie on margin planes → others have \(\alpha_i = 0\).

Dual: the constraint \(\sum_i \alpha_i y_i = 0\) forces \(\alpha_1 = \alpha_3 = \alpha\); the objective reduces to \(2\alpha - \alpha^2\), maximized at \(\alpha = 1\).

Margin: \(\|\mathbf{w}\| = \sqrt{2} \Rightarrow\) width \(\sqrt{2}\), matches distance between \(x_1 + x_2 = 0\) and \(x_1 + x_2 = 2\).

General Rationale Behind SVM for Classification

Core Idea: Find the maximum‑margin hyperplane that separates the classes. Why maximum margin? A wider margin leaves more room for noise in unseen data, so the classifier tends to generalize better.

Key Components

| Concept | Purpose |
| --- | --- |
| Support vectors | Only the points on the margin matter; SVM is sparse and ignores the non‑support data. |
| Dual form | Depends only on dot products \(\mathbf{x}_i \cdot \mathbf{x}_j\) → enables the kernel trick for non‑linear data. |
| Hard vs soft margin | Hard: assumes perfect separation. Soft (C‑parameter): allows some misclassification for robustness. |
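
To make the last two rows concrete, here is a small sketch using scikit-learn (my choice of library, not part of the original example): an XOR-style dataset that no straight line can separate, handled by a soft-margin SVM with an RBF kernel.

```python
import numpy as np
from sklearn.svm import SVC

# XOR-style data: not linearly separable in the original feature space
X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
y = np.array([-1, -1, 1, 1])

# kernel='rbf' applies the kernel trick; C controls the soft-margin trade-off
clf = SVC(kernel='rbf', C=10.0, gamma=1.0)
clf.fit(X, y)
print(clf.predict(X))  # typically [-1 -1  1  1]
```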

Why SVM Works So Well

In one sentence:
SVM finds the safest possible decision boundary by pushing it as far from the data as possible, using only the most critical points (support vectors) to define it.