A supervized learning problem - learning a mapping from input variables to output variables - may be seen as a conditional probability estimation problem. For instance, consider a toy example below:
Patient ($x^{(i)}$) | Age | Sex (M=0, W=1) | Outcome ($y^{(i)}= \text{Cancer}$) |
---|---|---|---|
1 | 55 | 0 | 1 |
2 | 65 | 1 | 0 |
Our training data set consists of two patients, or two examples ($x^{(i)}, y^{(i)}), i = 1, 2$. Where $x^i$ are features from the variables: Age and Sex, and $y^i$ is the binary outcome: 1 if a patient has cancer, and 0 otherwise. Our prediction problem
My ground truth
Learn $f_{\Theta}(x)$ to approximate $p(y=\text{Cancer}|x)$