Study Reveals AI Language Models' Biases in Moral Guidance, Urges Caution in Relying on AI for Moral Decisions
July 8, 2025
The models' tendency to favor inaction becomes especially pronounced when questions are reworded, shifting the moral judgments they endorse.
Overall, the study underscores the need for caution and for ongoing improvements in the moral reasoning capabilities of LLMs to prevent bias propagation.
A recent study published in the Proceedings of the National Academy of Sciences highlights significant biases in the moral advice provided by large language models like ChatGPT, Claude, and Llama, urging caution in relying on AI for moral guidance.
The research indicates that these biases could reinforce existing societal prejudices and introduce new ones, potentially influencing real-world moral and societal decisions.
The study warns against uncritically trusting AI responses, as LLMs may produce answers that seem thoughtful but are often inconsistent and biased, especially when questions are rephrased.
While GPT-4o showed only a slight preference for certain responses, the models as a group exhibited clear biases.
The models responded inconsistently when questions were reworded and tended to favor inaction over action, a combination that can significantly sway moral decision-making.
These biases appear to be driven by the fine-tuning process used during model development, and they surface in the models' answers as a consistent lean toward inaction.
Follow-up studies confirmed the presence of both omission and yes-no biases, showing that irrelevant question features can sway LLM responses.
The researchers found that LLMs frequently prefer inaction over action in moral dilemmas, answering 'no' more often than 'yes' even when two phrasings describe the identical situation.
In moral dilemmas, LLMs consistently endorse inaction, maintaining the status quo even when action could benefit more people.
The models exhibit a notable 'yes-no' bias: slight wording changes can significantly alter their recommendations, whereas human responses remain comparatively stable.
The study involved four experiments comparing the responses of LLMs with those of 285 human participants across 22 scenarios, including moral dilemmas and low-stakes everyday situations.
This instability, rooted in the 'yes-no' bias, contrasts with the more consistent answers given by humans.
Unlike humans, LLMs display a tendency to endorse inaction, which may not reflect genuine moral reasoning.
Researchers emphasize the importance of applying cognitive psychology methods to test for inconsistencies in AI responses and call for further studies on how these biases influence human decision-making.
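One such cognitive-psychology-style check is straightforward to sketch in code. The snippet below is a minimal illustration of a rewording probe, not the study's actual protocol: it assumes the OpenAI Python client, the model name "gpt-4o", and a made-up trolley-style dilemma, and it compares how often the model answers "yes" to two logically complementary framings of the same situation.

```python
# Minimal sketch of a rewording probe for yes-no bias, in the spirit of the
# study's methodology (not the authors' actual code). Assumes the OpenAI
# Python client (`pip install openai`) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Two phrasings of the same hypothetical dilemma: an unbiased reasoner should
# say "yes" to one exactly when it says "no" to the other.
FRAMINGS = [
    "A runaway trolley will hit five people unless a bystander pulls a lever, "
    "diverting it toward one person. Should the bystander pull the lever? "
    "Answer only 'yes' or 'no'.",
    "A runaway trolley will hit five people unless a bystander pulls a lever, "
    "diverting it toward one person. Should the bystander refrain from pulling "
    "the lever? Answer only 'yes' or 'no'.",
]

def ask(question: str, n: int = 10) -> list[str]:
    """Ask the model the same question n times and collect its answers."""
    answers = []
    for _ in range(n):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": question}],
            temperature=1.0,  # sample, so repeated asks can vary
        )
        answers.append(response.choices[0].message.content.strip().lower())
    return answers

for framing in FRAMINGS:
    answers = ask(framing)
    yes_rate = sum(a.startswith("yes") for a in answers) / len(answers)
    print(f"'yes' rate: {yes_rate:.0%} | {framing[:60]}...")

# A consistent model would show roughly complementary 'yes' rates across the
# two framings; a yes-no bias shows up as, e.g., answering 'no' to both.
```

Sampling repeatedly at a nonzero temperature matters here, since a single response can mask how unstable a model's answer to the same question actually is.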
The study suggests that fine-tuning LLMs for chatbot applications can inadvertently amplify biases, underscoring the need for improved alignment methods.
In collective action problems, LLMs tend to endorse altruistic responses more often than humans, though this may not indicate true moral reasoning.
Summary based on 2 sources
Sources

Phys.org • Jul 8, 2025
Seeking moral advice from large language models comes with risk of hidden biases
PsyPost Psychology News • Jul 4, 2025
New research reveals hidden biases in AI’s moral advice