Responsible AI: Privacy, Fairness, and Robustness Seminar
Course Description
This seminar-style course examines the ethical dimensions of Artificial Intelligence (AI), with a particular focus on the intersection of privacy, fairness, and robustness. The course is structured around reading, discussing, and critically analyzing seminal and state-of-the-art papers in the field. Participants will engage in intellectual discourse to understand the challenges, methodologies, and emerging trends related to responsible AI. The course is designed for graduate students with a strong background in machine learning, statistics, and optimization.
Course Objectives
- Critically assess and discuss the literature on privacy, fairness, and robustness in AI.
- Identify challenges and propose potential solutions for responsible AI.
- Foster interdisciplinary discussions to explore the ethical dimensions of AI.
- Engage in a deep intellectual exploration of the field through paper discussions and presentations.
Prerequisites
- Basic understanding of machine learning.
- Basic understanding of optimization.
Syllabus
This calendar is tentative and subject to change.
| Date | Topic | Subtopic | Papers | Presenting |
|------|-------|----------|--------|------------|
| Wed Jan 17 | Intro to class | | class slides | Fioretto |
| Mon Jan 22 | Intro to class | Safety and Alignment | class slides | Fioretto |
| Wed Jan 24 | Intro to class | Privacy (settings and attacks) | class slides | Fioretto |
| Mon Jan 29 | Intro to class | Privacy (cont.) | class slides | Fioretto |
| Wed Jan 31 | Intro to class | Privacy and Fairness | class slides | Fioretto |
| Mon Feb 5 | Fairness | Intro and bias sources | [1]–[4] | Group 1 |
| Wed Feb 7 | Fairness | Statistical measures | [5]–[8] | Group 2 |
| Mon Feb 12 | Fairness | Tradeoffs | [9]–[12] | Group 3 |
| Wed Feb 14 | Fairness | LLMs: Toxicity and Bias | [13]–[16] | Group 4 |
| Mon Feb 19 | Fairness | LLMs: Fairness | [17]–[19] | Group 5 |
| Wed Feb 21 | Fairness | Policy aspects | [20]–[22] | Group 6 |
| Mon Feb 26 | No class (AAAI) | | | |
| Wed Feb 28 | Safety | Distribution shift | [23]–[25] | Group 1 |
| Mon Mar 4 | Spring break | | | |
| Wed Mar 6 | Spring break | | | |
| Mon Mar 11 | Safety | Poisoning | [26]–[29] | Group 2 |
| Wed Mar 13 | Safety | Adversarial Robustness | [30]–[34] | Group 3 |
| Mon Mar 18 | Safety | Adversarial Robustness | [35]–[39] | Group 4 |
| Wed Mar 20 | Safety | LLMs: Prompt injection | [40]–[45] | Group 5 |
| Mon Mar 25 | Safety | LLMs: Jailbreaking | [46]–[50] | Group 6 |
| Wed Mar 27 | Privacy | Differential Privacy 1 | [51]–[55] | Group 1 |
| Mon Apr 1 | Privacy | Differential Privacy 2 | [56]–[58] | Group 2 |
| Wed Apr 3 | Privacy | Differentially Private ML | [59]–[61] | Group 3 |
| Mon Apr 8 | Privacy | Auditing and Membership inference | [62]–[65] | Group 4 |
| Wed Apr 10 | Privacy | Privacy and Fairness | [66]–[69] | Group 5 |
| Mon Apr 15 | Privacy | LLMs: Privacy in LLMs | [70]–[73] | Group 6 |
| Wed Apr 17 | Evaluation | Model cards | [74]–[77] | Group 1 |
| Mon Apr 22 | Evaluation | LLMs: Evaluation | [78]–[82] | Group 2 |
| Wed Apr 24 | Unlearning | Unlearning 1 | [83]–[86] | Group 3 |
| Mon Apr 29 | Unlearning | LLMs: Targeted unlearning | [87]–[90] | Group 4 |
Bibliography
- [1] Fairness and Machine Learning, Ch. 1. S. Barocas, M. Hardt, A. Narayanan, 2023.
- [2] Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights. The White House, 2016.
- [3] Big Data’s Disparate Impact. S. Barocas, A. Selbst, 2014.
- [4] Semantics derived automatically from language corpora contain human-like biases. A. Caliskan, J.J. Bryson, A. Narayanan, 2017.
- [5] Fairness and Machine Learning, Ch. 3. S. Barocas, M. Hardt, A. Narayanan, 2023.
- [6] Fairness Through Awareness. C. Dwork, M. Hardt, T. Pitassi, O. Reingold, R. Zemel, 2011.
- [7] Learning Fair Representations. R. Zemel, Y. Wu, K. Swersky, T. Pitassi, C. Dwork, 2013.
- [8] Equality of Opportunity in Supervised Learning. M. Hardt, E. Price, N. Srebro, 2016.
- [9] Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. A. Chouldechova, 2016.
- [10] Algorithmic decision making and the cost of fairness. S. Corbett-Davies, E. Pierson, A. Feller, S. Goel, A. Huq, 2017.
- [11] Inherent Trade-Offs in the Fair Determination of Risk Scores. J. Kleinberg, S. Mullainathan, M. Raghavan, 2017.
- [12] On the (im)possibility of fairness. S.A. Friedler, C. Scheidegger, S. Venkatasubramanian, 2017.
- [13] On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? E.M. Bender, T. Gebru, A. McMillan-Major, S. Shmitchell, 2021.
- [14] RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models. S. Gehman, S. Gururangan, M. Sap, Y. Choi, N.A. Smith, 2020.
- [15] OPT: Open Pre-trained Transformer Language Models. Zhang et al., 2022.
- [16] StereoSet: Measuring stereotypical bias in pretrained language models. M. Nadeem, A. Bethke, S. Reddy, 2021.
Assessment
Each group will be assessed through the following activities:
- Paper Summaries (blogging): 33.3%
- Presentation: 33.3%
- Discussion Lead: 33.3%
1. Paper Summaries (Blogging) – 33.3%
Objective: To develop the ability to critically analyze and summarize AI research papers in a clear and accessible manner.
Expectations:
- Each group will review all papers from the provided list and may propose additional ones for approval.
- Summaries should be written in Markdown format (supporting images and formulas) and committed to the course’s GitHub repository.
- The summary should include the following sections: Introduction and Motivations, Methods, Key Findings, and Critical Analysis (see the sketch after this list).
- The Critical Analysis section should evaluate the strengths, weaknesses, potential biases, and ethical considerations of the paper.
- Summaries must be submitted four days prior to the presentation for review and potential feedback.
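For concreteness, here is a minimal sketch of how a summary file might be laid out. The four section headings are the ones required above; the title line, inline-formula example, and figure path are illustrative assumptions, not course requirements:

```markdown
# Summary: <Paper Title> (<Authors>, <Year>)

## Introduction and Motivations
<!-- What problem does the paper address, and why does it matter? -->

## Methods
<!-- The paper's approach. Formulas can be written in LaTeX, e.g., $\epsilon$-differential privacy. -->

## Key Findings
<!-- Main results and takeaways. -->

## Critical Analysis
<!-- Strengths, weaknesses, potential biases, and ethical considerations. -->

<!-- Images are supported too: -->
![Optional supporting figure](figures/example.png)
```

One such file per paper, committed to the course’s GitHub repository four days before the presentation, satisfies the submission expectations above.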
Assessment Criteria:
- Clarity and coherence of the written summary.
- Depth of critical analysis and understanding of the paper’s content.
- Proper use of formatting and adherence to submission guidelines.
- Timeliness of submission.
2. Presentation – 33.3%
Objective: To enhance students’ ability to communicate complex AI concepts and engage in public speaking.
Expectations:
- 45-minute presentation per group.
- Presentations can include slides, code demonstrations, videos, or other creative methods.
- The presentation should cover the key aspects of the paper, including its contribution to responsible AI.
- A critical evaluation of the paper is essential, including discussing its limitations and implications.
- Preparation of thought-provoking questions to stimulate audience engagement.
Assessment Criteria:
- Effectiveness of communication and presentation skills.
- Accuracy and depth of content presented.
- Creativity and engagement in the presentation method.
- Ability to provoke thoughtful discussion through prepared questions.
3. Discussion Lead – 33.3%
Objective: To cultivate skills in leading intellectual discourse and fostering collaborative learning.
Expectations:
- 30-minute discussion session following the presentation.
- Groups should prepare and facilitate a discussion based on their presentation.
- Use of supplementary materials (e.g., videos, code snippets) to enrich the discussion is encouraged.
- The discussion should actively engage the audience with questions, encouraging diverse viewpoints and a deeper understanding of the topic.
Assessment Criteria:
- Ability to foster an inclusive and constructive discussion.
- Relevance and depth of prepared questions and discussion points.
- Engagement level of the audience during the discussion.
- Use of supplementary materials to enhance understanding.
General Notes:
- All group members are expected to contribute to every component, with two to three members taking the lead on each of the three components.
- Peer evaluation within groups may be used to ensure fair contribution.
Recommended Reading
- A curated list of papers will be provided at the start of the course.
Groups
| Group | Members |
|-------|---------|
| Group 1 | Lei Gong, Archit Uniyal, Luke Benham, Chien-Chen Huang, Stuart Paine |
| Group 2 | Saswat Das, Wenqian Ye, Benny Bigler-Wang, Parker Hutchinson, Linyun Wei, Zhiyang Yuan |
| Group 3 | Nibir Mandal, Guangzhi Xiong, Neh Joshi, Sree Esshaan Mahajan, Esshaan Mahajan |
| Group 4 | Sarvin Motamen, Parth Kandharkar, Ellery Yu, Hongyan Wu, Kefan Song |
| Group 5 | Mati Ur Rehman, Jeffrey Chen, Candace Chen, Kaylee Liu, Robert Bao |
| Group 6 | Stephanie Schoch, Aidan Hesselroth, Joseph Moretto, Jonathan McGee, ShiHe Wang |
Instructor
Ferdinando Fioretto
Assistant Professor in Computer Science
University of Virginia
This syllabus is subject to change to meet the learning needs of the course participants.