# #1: Neural nets and function approximation

## Estimating conditional probabilities

A supervised learning problem - learning a mapping from input variables to output variables - may be seen as a conditional probability estimation problem. For instance, consider the toy example below:

| Patient ($x^{(i)}$) | Age | Sex (M=0, W=1) | Outcome ($y^{(i)} = \text{Cancer}$) |
|---|---|---|---|
| 1 | 55 | 0 | 1 |
| 2 | 65 | 1 | 0 |

Our training data set consists of two patients, i.e., two examples $(x^{(i)}, y^{(i)})$, $i = 1, 2$, where $x^{(i)}$ holds the features Age and Sex, and $y^{(i)}$ is the binary outcome: 1 if the patient has cancer, and 0 otherwise. Our prediction problem:

• I want to predict the probability of cancer given data, i.e., $p(y=\text{Cancer}|x)$
• My ground truth:
  • $p(y=\text{Cancer}|x^{(1)} = \{\text{Age} = 55, \text{Sex} = 0\}) = 1$
  • $p(y=\text{Cancer}|x^{(2)} = \{\text{Age} = 65, \text{Sex} = 1\}) = 0$
• I can approximate this conditional probability $p(y=\text{Cancer}|x)$ with a function $f_{\Theta}(x)$ parameterized by $\Theta$

Learn $f_{\Theta}(x)$ to approximate $p(y=\text{Cancer}|x)$
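
To make the idea of learning $f_{\Theta}(x)$ concrete, here is a minimal sketch on the two-patient toy data. The notes do not fix a form for $f_{\Theta}$; this sketch assumes the simplest one, a single sigmoid unit (logistic regression), with $\Theta$ consisting of two weights and a bias. The feature standardization, learning rate, step count, and the names `f_theta`, `theta`, and `lr` are also assumptions made for illustration.

```python
import numpy as np

# Toy data from the table: columns are Age and Sex (M=0, W=1).
X = np.array([[55.0, 0.0],
              [65.0, 1.0]])
y = np.array([1.0, 0.0])  # 1 = cancer, 0 = no cancer

# Standardize each feature (an assumption of this sketch) so that
# plain gradient descent is well behaved despite the different scales.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def f_theta(x, theta):
    """Assumed model: f_theta(x) = sigmoid(w . x + b), with theta = (w, b)."""
    w, b = theta[:-1], theta[-1]
    return sigmoid(x @ w + b)


theta = np.zeros(3)  # two weights and one bias, all starting at zero
lr = 0.5             # learning rate (hypothetical choice)

for _ in range(2000):
    p = f_theta(X_std, theta)
    # Gradient of the mean cross-entropy loss with respect to w and b.
    grad_w = X_std.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    theta -= lr * np.concatenate([grad_w, [grad_b]])

# Estimated p(y = Cancer | x) for patient 1 and patient 2.
probs = f_theta(X_std, theta)
print(probs)
```

With only two perfectly separable examples, the fitted probabilities are pushed toward the ground-truth values of 1 and 0. That is memorization of the training set, not evidence that the function would generalize to new patients.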