The main focus is to give you a working set of concepts that you can apply in practice. Each lecture is divided into two parts: theory (~70-90 min) and a hands-on session (~45-60 min). The traditional lecture part provides a conceptual framework, and the subsequent hands-on session solidifies that knowledge. Bring your laptop to follow along, or watch the screen.
function approximation, neural networks, supervised learning, activation functions
We will see how the supervised learning task can be seen as conditional probability estimation, and how neural networks can be used to estimate these probabilities.
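As a minimal sketch of the conditional-probability view, assume a classifier whose last layer produces one raw score (logit) per class. A softmax turns those scores into an estimate of p(y | x). The logits below are made up for illustration and do not come from a trained network.

```python
import math

def softmax(logits):
    """Turn raw network outputs (logits) into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits produced by a network for one input x and three classes.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)  # estimate of p(y | x) for y = 0, 1, 2
predicted_class = probs.index(max(probs))
```

The predicted class is simply the one with the highest estimated conditional probability.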
maximum likelihood principle, regularization, cross-entropy loss, KL divergence, bias/variance trade-off, cross-validation, hyperparameter tuning
How do we train neural networks to learn complex functions that we can use as our prediction models? Which optimization problem do we need to solve in order to guarantee a good prediction model? What are the best practices, what are the pitfalls, and what can go wrong when we train neural networks? Ultimately, how do we measure how good our prediction model is?
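One way to make the link between maximum likelihood and the cross-entropy loss concrete: for classification, the cross-entropy loss is the average negative log-likelihood of the true labels under the model. The sketch below uses made-up predicted probabilities; `cross_entropy` is a hypothetical helper name, not a library function.

```python
import math

def cross_entropy(probs, labels):
    """Average negative log-likelihood of the true labels under the model."""
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)

# Hypothetical predicted class probabilities for three examples, two classes.
confident_right = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
uninformative = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
labels = [0, 1, 0]

loss_good = cross_entropy(confident_right, labels)
loss_flat = cross_entropy(uninformative, labels)
```

A model that puts more probability on the correct labels gets a lower loss, and the uninformative two-class model scores log(2).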
backpropagation, convolutional neural network, supervised vs. unsupervised, self-supervised, matrix factorization
How does the backpropagation algorithm work? How do we find the optimal parameters for our prediction model? We then switch gears to convolutional neural networks and review convolutional and pooling layers. What is the difference between supervised and unsupervised learning in the context of deep learning? How do we learn tasks with neural networks?
convolution, skip-connection, transfer learning, segmentation, upconvolution, cardiac applications
Why have convolutional neural networks gained so much attention, and why are they so good at learning features from images? We will review some of the features of different CNN architectures that are responsible for their widespread adoption in different domains. Indeed, CNNs are used to solve various prediction problems, from diagnosis to image segmentation. A review of CNN architectures for the diagnosis of cardiovascular diseases will conclude this lecture.
language models, n-grams, word2vec, matrix factorization, point-wise mutual information
Today we will talk about language models and how we can use neural networks to understand text and extract global linguistic properties. We will talk about the widely acclaimed word2vec model, which embeds words into a dense vector space, and view it as a matrix factorization problem. This connection to matrix factorization is becoming more important and may give us a better understanding of how neural networks process and extract features from text.
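One well-known reading of the matrix factorization view is that skip-gram with negative sampling implicitly factorizes a (shifted) point-wise mutual information matrix. As a rough sketch of the quantity involved, the code below computes PMI from co-occurrence counts. The toy corpus, the window size, and the helper name `pmi` are assumptions for illustration only.

```python
import math
from collections import Counter

# Toy corpus (hypothetical); real models use far larger text collections.
corpus = [
    "the heart pumps blood".split(),
    "the heart beats fast".split(),
    "the lung moves air".split(),
]
window = 1  # count neighbors at distance 1 on each side

# Count ordered (word, context) pairs inside the window.
pair_counts = Counter()
for sentence in corpus:
    for i, word in enumerate(sentence):
        lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pair_counts[(word, sentence[j])] += 1

total = sum(pair_counts.values())

# Marginal counts; the pair counts are symmetric, so one table serves both roles.
word_counts = Counter()
for (w, _), n in pair_counts.items():
    word_counts[w] += n

def pmi(w, c):
    """PMI(w, c) = log(p(w, c) / (p(w) * p(c))); -inf if the pair never co-occurs."""
    joint = pair_counts[(w, c)] / total
    if joint == 0:
        return float("-inf")
    return math.log(joint / ((word_counts[w] / total) * (word_counts[c] / total)))
```

Pairs that co-occur more often than chance would predict get a positive PMI, and pairs that never co-occur get negative infinity, which is one reason practical variants clip or shift the matrix.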
adjacency matrix, DeepWalk, knowledge graph embeddings, link prediction, random walks, matrix factorization
Today we will see how (biological) knowledge can be encoded as networks or graphs, and how neural networks can be used to obtain numeric representations of nodes in a graph. These representations will approximate the structure of the graph, so that we can predict new structural properties of graphs and therefore infer new knowledge. Most importantly, we will see how the mature theoretical framework of matrix factorization is used to connect word and graph embeddings.
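To hint at how word and graph embeddings connect, here is a minimal sketch of the random-walk step used by DeepWalk-style methods: short walks over the graph play the role of sentences, which can then be fed to a word2vec-style model. The toy graph, the walk length, and the number of walks per node are arbitrary assumptions.

```python
import random

# Hypothetical toy graph as an adjacency list (undirected).
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

def random_walk(graph, start, length, rng):
    """Return a walk of up to `length` nodes, moving to a uniformly random neighbor each step."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Three walks of five nodes starting from every node in the graph.
walks = [random_walk(graph, node, 5, rng) for node in graph for _ in range(3)]
```

Each walk is a sequence of node names in which consecutive nodes are connected by an edge, so nodes that are close in the graph tend to appear in the same "sentence".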
unsupervised learning, feature learning, dimensionality reduction, autoencoders, variational autoencoders, generative adversarial networks
Unsupervised learning is the holy grail of learning systems: automatically learning patterns from training data and understanding the underlying structure or, even better, the process that generated them. Today we will see how neural networks can be used to learn data distributions.
saliency maps, class model visualizations, guided backprop, class activation maps, gradient-weighted class activation maps
Small changes in the input can have big repercussions on the class prediction. Rate of change is the central idea in calculus, and partial derivatives play an immense role in neural network training. It turns out that gradients can also be used to explain the decisions of neural networks. Moreover, we can modify the original backpropagation algorithm to alter the way error information propagates back to the inputs, in order to come up with compelling visualizations that shed light on our black boxes (CNNs).
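The idea that gradients indicate which inputs matter can be sketched without any deep learning framework. The code below approximates the gradient of a class score with respect to each input by central finite differences; in practice saliency maps are computed with backpropagation instead. The `model` function and its weights are a made-up stand-in for a trained classifier.

```python
import math

def model(x):
    """Hypothetical stand-in for a classifier: returns a class score in (0, 1)."""
    weights = [3.0, -0.5, 0.0]
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def saliency(f, x, h=1e-5):
    """Approximate |d f / d x_i| for every input feature with central differences."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grads.append(abs(f(up) - f(down)) / (2 * h))
    return grads

x = [0.2, 0.4, 0.9]
scores = saliency(model, x)
most_influential = scores.index(max(scores))
```

The first feature receives the largest saliency because a small change in it moves the score the most, while the third feature, which the stand-in model ignores, gets a saliency of zero.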
interpretable machine learning, partial dependence plots, individual conditional expectation, accumulated local effects, local interpretable model-agnostic explanations
Originally, this lecture was supposed to be on alternative ways of training neural networks; however, this has been changed. Instead, we continue our discussion of explainable AI (XAI). In Lecture 8 we familiarized ourselves with XAI techniques specific to CNNs and images. Today we will discover XAI techniques that are model-agnostic, that is, flexible enough to explain the predictions of any machine learning or deep learning model.
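As an illustration of what model-agnostic means, here is a minimal sketch of a partial dependence computation: it only needs a callable that maps a feature row to a prediction, so any model could sit behind it. The `black_box` function and the tiny dataset are made up, and `partial_dependence` is a hypothetical helper, not a library API.

```python
def partial_dependence(predict, data, feature, grid):
    """For each grid value v, set column `feature` to v in every row and average the predictions."""
    curve = []
    for v in grid:
        preds = []
        for row in data:
            modified = list(row)  # copy so the original data stays untouched
            modified[feature] = v
            preds.append(predict(modified))
        curve.append(sum(preds) / len(preds))
    return curve

# Hypothetical black-box model: any callable mapping a feature row to a number.
def black_box(row):
    return 2.0 * row[0] + row[1] ** 2

data = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
pd_curve = partial_dependence(black_box, data, feature=0, grid=[0.0, 1.0, 2.0])
```

Here the curve rises by 2 for every unit increase of feature 0, which recovers that feature's average effect on the prediction without looking inside the model.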