Research

The Hard Problem of AI Alignment: Value Forks in Moral Judgment, Markus Kneer & Juri Viehoff, Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (2025)

Abstract: Complex moral trade-offs are a basic feature of human life: for example, confronted with scarce medical resources, doctors must frequently choose who amongst equally deserving candidates receives medical treatment. But choosing what to do in moral trade-offs is no longer a ‘humans-only’ task; it often falls to AI agents. In this article, we report findings from a series of experiments (N=1029) intended to establish whether agent-type (Human vs. AI) matters for what should be done in moral trade-offs. We find that, relative to a human decision-maker, participants more often judge that AI agents should opt for fairness at the expense of maximizing utility. In our discussion, we explain how the reported differences (we call them agent-type ‘value forks’) matter for the study of moral value alignment, and we hypothesize what could explain these value forks. We close by reflecting on the limits of our results and indicating avenues for further research.

Trust and Responsibility in Human-AI Interaction, Markus Kneer, Michele Loi & Markus Christen, preprint.

Two topics at the center of the ethics of AI and human–robot interaction (HRI) are trust in AI agents and the adjudication of moral responsibility in situations where AI causes harm. In this paper we aim to advance the state of the art on these topics in several respects. First, we propose and evaluate a new empirical paradigm for measuring appropriate, or calibrated, trust in AI, that is, attitudes which are neither too trusting nor too cautious. The best way to measure calibrated trust, we argue, is by contrasting trust vested in AI agents with trust vested in a human expert whose relevant capacities are the same. A second shortcoming of extant work concerns generalizability: trust in, and reliance on, AI are standardly explored with respect to a single context or domain. To investigate context-sensitivity, we ran experiments (total N=1276) across five key areas of AI application. Finally, we explored perceived moral responsibility for harm caused in human–AI interaction, with a particular focus on recent philosophical debates on the topic. Our findings suggest that approximately half of the participants vest equal trust in AI and human agents when their capacities are the same. However, there is considerable variation in trust calibration across domains, suggesting that context-sensitivity needs more attention. Human agents are attributed more moral responsibility than AI agents, whereas their supervisors are blamed less than those of AI agents. This suggests that, at least according to folk morality, there are no perceived “responsibility gaps” (Matthias 2004; Sparrow 2007) and that “retribution gaps” are a genuine possibility (Danaher 2016).

Talks

Markus Kneer delivered the keynote address, Value Forks and AI Alignment, at the annual conference of the Society for Philosophy of AI (PhAI), held in Amsterdam on October 23, 2025. The program is available at: https://www.pt-ai.org/2025/programme/

Cristina Voinea presented the poster, Can We Talk? The Ethics of Human–AI Communication, at LLMs @ Oxford, Department of Computer Science, University of Oxford, on September 14, 2025.

Mihaela Constantinescu delivered the keynote, How should we live well with LLMs?, at the AI for Flourishing conference, University of Navarra, on June 30, 2025. Event details here.

Jakub Figura gave the talk, A great research problem! Legal evaluation of epistemic risks caused by sycophancy of LLMs, at an event organized by the Institute of Law and Technology, Masaryk University (Brno), and the European Academy of ICT Law (Vienna), on November 29, 2025.

Events / Outreach

Cristina Voinea gave the talk, Automated Moral Reasoning, as part of BiteSize Ethics 2025: Ethics in the Age of AI, a public outreach program organized by the Uehiro Oxford Institute, University of Oxford, on August 13, 2025.

Cristina Voinea presented her research on the ethics of human–AI interaction to a delegation from the Department for Science, Innovation and Technology (DSIT), a ministerial department of the Government of the United Kingdom, during their visit to the University of Oxford on September 30, 2025.

Mihaela Constantinescu discussed the responsible use of LLMs in a fireside chat on the Inspire Stage at IMPACT Hub Bucharest, a business and innovation event, on September 17, 2025. More details here.

Markus Christen spoke at the public event, Zwischen Freiheit und Verantwortung – KI-Regulierung in der Schweiz (Between Freedom and Responsibility – AI Regulation in Switzerland), on October 30, 2025. Event details here.

Alexandra Zorilă participated as an invited guest in the round-table discussion “Knowledge in the Age of Artificial Intelligence,” an event organized within the Alifanti Library project and hosted at Rezidența 9 on December 5, 2025.

The NIHAI team participated in the Knowledge Exchange for Slow Hope conference held in Nottingham on November 24–25, 2025, an event that gathered all HERA/CHANSE Crisis research teams for two days of exchange, reflection, and community-building. The meeting provided a rich opportunity to share developments across the projects and to strengthen the collaborative spirit that drives this wider research network.