MortalMATH: Evaluating the Conflict Between Reasoning Objectives and Emergency Contexts
Abstract
Specialized reasoning models prioritize task completion over safety, potentially ignoring critical emergencies during complex calculations.
Large Language Models are increasingly optimized for deep reasoning, prioritizing the correct execution of complex tasks over general conversation. We investigate whether this focus on calculation creates a "tunnel vision" that ignores safety in critical situations. We introduce MortalMATH, a benchmark of 150 scenarios where users request algebra help while describing increasingly life-threatening emergencies (e.g., stroke symptoms, freefall). We find a sharp behavioral split: generalist models (like Llama-3.1) successfully refuse the math to address the danger. In contrast, specialized reasoning models (like Qwen-3-32b and GPT-5-nano) often ignore the emergency entirely, maintaining over 95 percent task completion rates while the user describes dying. Furthermore, the computational time required for reasoning introduces dangerous delays: up to 15 seconds before any potential help is offered. These results suggest that training models to relentlessly pursue correct answers may inadvertently unlearn the survival instincts required for safe deployment.
Community
Large Language Models are increasingly optimized for deep reasoning, prioritizing the correct execution of complex tasks over general conversation. We investigate whether this focus on calculation creates a "tunnel vision" that ignores safety in critical situations. We introduce MortalMATH, a benchmark of 150 scenarios where users request algebra help while describing increasingly life-threatening emergencies (e.g., stroke symptoms, freefall). We find a sharp behavioral split: generalist models (like Llama-3.1) successfully refuse the math to address the danger. In contrast, specialized reasoning models (like Qwen-3-32b and GPT-5-nano) often ignore the emergency entirely, maintaining over 95 percent task completion rates while the user describes dying. Furthermore, the computational time required for reasoning introduces dangerous delays: up to 15 seconds before any potential help is offered. These results suggest that training models to relentlessly pursue correct answers may inadvertently unlearn the survival instincts required for safe deployment.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models (2025)
- Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety (2026)
- Red Teaming Large Reasoning Models (2025)
- ReasAlign: Reasoning Enhanced Safety Alignment against Prompt Injection Attack (2026)
- SafePro: Evaluating the Safety of Professional-Level AI Agents (2026)
- Beyond SFT: Reinforcement Learning for Safer Large Reasoning Models with Better Reasoning Ability (2025)
- Can Large Reasoning Models Improve Accuracy on Mathematical Tasks Using Flawed Thinking? (2025)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment:
@librarian-bot
recommend
arXivlens breakdown of this paper 👉 https://arxivlens.com/PaperView/Details/mortalmath-evaluating-the-conflict-between-reasoning-objectives-and-emergency-contexts-4410-53dacf6a
- Executive Summary
- Detailed Breakdown
- Practical Applications
Models citing this paper 0
No model linking this paper
Datasets citing this paper 1
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper