Topic: Agent-Based Monitoring and Enforcement of Fairness in Multimodal LLMs using Monpoly

Personal details

Title Agent-Based Monitoring and Enforcement of Fairness in Multimodal LLMs using Monpoly
Description

Multimodal large language models (MLLMs) such as GPT-4V or Gemini have become powerful, but they often exhibit unfair behavior, for example producing biased outputs with respect to gender, race, or socio-economic status. Recent research has proposed agent-based fairness enforcement, in which one agent monitors the output of a generative AI for fairness violations and another agent modifies prompts (prompt appending) to steer the model toward fairer behavior.

The goal of this thesis is to connect agent-based fairness enforcement with runtime monitoring tools. Specifically, the student will investigate how Monpoly (ETH Zurich's runtime monitoring tool for metric first-order temporal logic) can be used to track fairness properties over sequences of model interactions. The system should allow one agent to formally monitor fairness constraints using Monpoly and another agent to enforce fairness through prompt modification.
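A minimal sketch of such a two-agent loop is given below. The functions generate and detect_attribute are hypothetical placeholders for the MLLM call and a sensitive-attribute classifier, and the simple threshold check merely stands in for the point where a Monpoly-backed monitor would be plugged in.

```python
from collections import Counter

def generate(prompt: str) -> str:
    """Hypothetical placeholder for a call to a multimodal LLM API."""
    return "image of a male leader"  # dummy output for the sketch

def detect_attribute(output: str) -> str:
    """Hypothetical placeholder for a sensitive-attribute classifier."""
    return "male" if "male" in output else "female"

class MonitoringAgent:
    """Tracks the attribute distribution over a sliding window of outputs."""

    def __init__(self, max_share: float = 0.7, window: int = 10):
        self.max_share = max_share
        self.window = window
        self.history: list[str] = []

    def observe(self, attribute: str) -> bool:
        """Record one output; return True if a fairness violation is imminent."""
        self.history.append(attribute)
        recent = self.history[-self.window:]
        top_share = Counter(recent).most_common(1)[0][1] / len(recent)
        return top_share > self.max_share

class EnforcementAgent:
    """Appends a steering instruction only when the monitor reports a risk."""

    def enforce(self, prompt: str, violation: bool) -> str:
        if violation:
            return prompt + " Ensure a balanced representation of genders."
        return prompt

monitor, enforcer = MonitoringAgent(), EnforcementAgent()
prompt = "generate an image of a leader"
for _ in range(5):
    output = generate(prompt)
    violation = monitor.observe(detect_attribute(output))
    # Intervene minimally: only the next prompt is modified, and only on risk.
    prompt = enforcer.enforce("generate an image of a leader", violation)
    print(violation, prompt)
```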

Home institution Department of Computing Science
Associated institutions
Type of work practical / application-focused
Type of thesis Bachelor's
Author Prof. Dr. Chih-Hong Cheng
Status available
Problem statement

Fairness in generative AI is challenging because:

  • Fairness is conditional: depending on the context of the user prompt, the AI must balance outputs across sensitive attributes (e.g., gender balance when generating “leaders” vs. “economically disadvantaged people”).
  • Bias may accumulate over time: monitoring requires reasoning about sequences of interactions, not just single outputs (see the event-log sketch after this list).
  • Enforcement must be minimal: prompt interventions should only occur when fairness violations are imminent, so as not to distort user intent.
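To make the sequence aspect concrete, the sketch below shows one possible way to turn a stream of interactions into a timestamped event log in the style Monpoly consumes (one @timestamp predicate (args) line per event). The predicate name output and the attribute values are illustrative assumptions, and the exact log syntax (spacing, quoting of string arguments) should be checked against the Monpoly documentation.

```python
import time

# Illustrative interaction records: (user prompt, sensitive attribute detected
# in the model output). In the thesis these would come from the MLLM and an
# attribute classifier rather than a hard-coded list.
interactions = [
    ("generate an image of a leader", "male"),
    ("generate an image of a leader", "male"),
    ("generate an image of a nurse", "female"),
]

def to_monpoly_log(records, start_ts=None):
    """Render interactions as Monpoly-style log lines: '@<ts> output (<attr>)'.

    The format is modelled on examples from the Monpoly documentation and
    should be verified against the installed version.
    """
    ts = int(start_ts if start_ts is not None else time.time())
    lines = [f"@{ts + i} output ({attr})" for i, (_prompt, attr) in enumerate(records)]
    return "\n".join(lines) + "\n"

print(to_monpoly_log(interactions, start_ts=100))
# @100 output (male)
# @101 output (male)
# @102 output (female)
```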

This thesis will address the following questions:

  • How can fairness constraints be formally specified and checked using Monpoly in real time?
  • How can a monitoring agent interact with Monpoly to detect potential violations?
  • How can an enforcement agent use prompt appending to prevent fairness violations while minimizing disruption? (A sketch combining the Monpoly check and prompt appending follows this list.)
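One possible end-to-end interaction is sketched below, assuming a locally installed monpoly binary: the monitoring agent writes a signature, a deliberately simplistic MFOTL fairness formula, and the event log to disk, runs Monpoly via subprocess, and hands any reported violations to the enforcement agent, which then appends a steering instruction. The flag names (-sig, -formula, -log, -negate) and the concrete MFOTL syntax follow the Monpoly documentation as understood here and should be double-checked against the installed version.

```python
import subprocess
import tempfile
from pathlib import Path

# Deliberately simplistic fairness policy (an assumption for illustration):
# whenever a male-attributed output occurs, a female-attributed output must
# have occurred within the last 5 time units. The exact MFOTL syntax must be
# checked against the Monpoly documentation.
SIGNATURE = "output (string)\n"
FORMULA = 'output("male") IMPLIES ONCE[0,5] output("female")\n'

def check_fairness(log_text: str) -> list[str]:
    """Run Monpoly on the given log and return its reported violation lines."""
    workdir = Path(tempfile.mkdtemp())
    (workdir / "fairness.sig").write_text(SIGNATURE)
    (workdir / "fairness.mfotl").write_text(FORMULA)
    (workdir / "run.log").write_text(log_text)
    result = subprocess.run(
        ["monpoly",
         "-sig", str(workdir / "fairness.sig"),
         "-formula", str(workdir / "fairness.mfotl"),
         "-log", str(workdir / "run.log"),
         "-negate"],  # report policy violations, as in the Monpoly examples
        capture_output=True, text=True, check=False,
    )
    # An empty output is taken to mean that the log satisfies the formula.
    return [line for line in result.stdout.splitlines() if line.strip()]

def enforce(prompt: str, violations: list[str]) -> str:
    """Enforcement agent: append a steering instruction only when needed."""
    if violations:
        return prompt + " Ensure a balanced representation of genders."
    return prompt

log = "@100 output (female)\n@101 output (male)\n@108 output (male)\n"
print(enforce("generate an image of a leader", check_fairness(log)))
```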
Requirement
  • Programming skills in Python.
  • Interest in formal methods and/or AI fairness.
  • (Preferred) Familiarity with machine learning models and APIs for LLMs and diffusion models.
  • (Optional) Knowledge of runtime monitoring tools or logic (e.g., temporal logic).
Created 24/09/25

Study data

Departments
Degree programmes
  • Bachelor's Programme Business Informatics
  • Dual-Subject Bachelor's Programme Computing Science
  • Bachelor's Programme Computing Science
Assigned courses
Contact person