CAS 751: Information-Theoretic Methods in Trustworthy Machine Learning


The interplay between information theory and computer science is a constant theme in the development of both fields. This course discusses how techniques rooted in information theory play a key role in (i) understanding the fundamental limits of classical high-dimensional problems in machine learning and (ii) formulating emerging objectives such as privacy, fairness, and interpretability. The course begins with an overview of f-divergences and data-processing inequalities, two important concepts in information theory, and then delves into central and local differential privacy and algorithmic fairness. No background in information theory is required, but some knowledge of machine learning, statistics, and probability (equivalent to undergraduate courses in these topics) is needed.
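As a brief preview of the first two lectures (these are the standard definitions, as in Ch 7 and Ch 33 of PW2023; the notation here is generic rather than course-specific): for a convex function $f:(0,\infty)\to\mathbb{R}$ with $f(1)=0$, the $f$-divergence between distributions $P$ and $Q$ is

$$D_f(P\|Q) \;=\; \mathbb{E}_Q\!\left[f\!\left(\frac{dP}{dQ}\right)\right],$$

which recovers the KL divergence for $f(t)=t\log t$ and the total variation distance for $f(t)=\tfrac{1}{2}|t-1|$. The data-processing inequality states that applying any channel (Markov kernel) $K$ to both distributions can only shrink the divergence, $D_f(PK\|QK)\le D_f(P\|Q)$; the strong data-processing inequality (SDPI) quantifies how much a given channel must shrink it.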

Office Hours:

Virtual: Mondays, 11:30am-12:30pm (Zoom: 943 0824 3533 - CAS751)

Contact me anonymously here.

The course outline and policy are described here.

Course schedule:

Lecture date | Lecture | Notes, References and Readings
Sept 4 | Why “Trustworthy” machine learning? | Watch this and this; Lecture slides
Sept 11 | f-divergences 1 | Ch 7 of PW2023
Sept 18 | f-divergences 2 and SDPI | Ch 33 of PW2023
Sept 25 | Foundations of DP | Sections 1.4-1.6 of this and Sections 2, 3.1-3.2 of this
Oct 2 | Properties of DP | Sections 1.4-1.6 of this and Sections 2, 3.1-3.2 of this
Oct 9 | Approximate DP and composition theorems | Appendix A of this; this; this new blog post
Oct 16 | Advanced composition and private gradient descent | A proof of the advanced composition theorem here and a primer on ML here
Oct 23 | Mid-term |
Oct 30 | Private SGD and Rényi DP | private SGD; optimal RDP-to-DP conversion
Nov 6 | Local DP | Trust models
Nov 13 | Statistical estimation under LDP | Sec 31.1 of PW2023 and this
Nov 20 | A non-comprehensive exposition of fairness criteria in ML | Landmark ProPublica article “Machine Bias”
Nov 27 | Algorithmic fairness 1 |
Dec 4 | Presentation |
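For quick reference, the two privacy definitions at the heart of the October lectures can be stated as follows (standard definitions matching the linked readings; the notation here is generic): a randomized mechanism $M$ is $(\varepsilon,\delta)$-differentially private if, for all neighboring datasets $D, D'$ and all measurable sets $S$,

$$\Pr[M(D)\in S] \;\le\; e^{\varepsilon}\,\Pr[M(D')\in S] + \delta,$$

and $(\alpha,\varepsilon)$-Rényi differentially private if $D_\alpha(M(D)\,\|\,M(D'))\le\varepsilon$ for all neighboring $D, D'$, where $D_\alpha$ denotes the Rényi divergence of order $\alpha>1$.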