Mason Wang

SVM

uses hinge loss

minimize max(0, 1 - y * f(x))

max(0, 1 - y_i (w * x_i + b))

1 - y_i y_hat_i, where y_hat_i = w * x_i + b is the score

if y_i is 1, this is 1 - score

if it is -1, this is 1 + score
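
a minimal sketch of the loss above in plain Python. The names `score` and `hinge_loss`, the toy `w`, `b`, `xs`, `ys`, and the choice to average over examples are all assumptions for illustration, not from these notes or any library.

```python
def score(w, x, b):
    """Linear score y_hat = w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def hinge_loss(w, b, xs, ys):
    """Mean of max(0, 1 - y_i * (w . x_i + b)); labels must be +1 or -1."""
    losses = [max(0.0, 1.0 - y * score(w, x, b)) for x, y in zip(xs, ys)]
    return sum(losses) / len(losses)


# Toy data, made up for illustration (assumption, not from the notes)
w, b = [1.0, -1.0], 0.0
xs = [[2.0, 0.0], [0.0, 2.0], [0.5, 0.0]]  # scores: 2.0, -2.0, 0.5
ys = [1, -1, 1]

# First two examples are past the margin (loss 0); the third is correct
# but inside the margin (loss 1 - 0.5 = 0.5).
mean_loss = hinge_loss(w, b, xs, ys)
print(mean_loss)
```

only the third example contributes: its score has the right sign but is below 1.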

“1” is the margin

with no margin, when GT is positive, y_hat_i wants to be POSITIVE (- score will be negative)

with no margin, when GT is negative, y_hat_i wants to be NEGATIVE (+ score will be negative)

with the margin, when y_i is positive the loss is 1 - score, so there is NO LOSS if we are super correct (y_hat_i >= 1)

with the margin, when y_i is negative the loss is 1 + score, so there is NO LOSS if score <= -1 (we are super correct, beyond the margin). If the score is -0.5, there is still a loss (0.5)!
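
a quick check of the margin cases, as a sketch. The helper name `hinge` and the sample scores are assumptions picked for illustration.

```python
def hinge(y, s):
    """Hinge loss for label y in {+1, -1} and raw score s."""
    return max(0.0, 1.0 - y * s)


# Negative ground truth (y = -1): loss is 1 + score, clipped at 0.
for s in [-2.0, -1.0, -0.5, 0.5]:
    print(f"y=-1  score={s:+.1f}  loss={hinge(-1, s):.1f}")

# Positive ground truth (y = +1): loss is 1 - score, clipped at 0.
for s in [2.0, 1.0, 0.5, -0.5]:
    print(f"y=+1  score={s:+.1f}  loss={hinge(+1, s):.1f}")
```

a score of -0.5 with a negative label is classified correctly but still pays 0.5, because it sits inside the margin.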

see linear classifier notes

Last Reviewed: 10/28/2025