SanDRA: Safe Large-Language-Model-Based
Decision Making for Automated Vehicles
Using Reachability Analysis

Technical University of Munich

Equal Contribution, Corresponding author

The first framework for integrating LLMs safely into decision making for automated vehicles through
reachability analysis, combining the strengths of machine learning and formal methods.

Abstract

Large language models have been widely applied to knowledge-driven decision-making for automated vehicles due to their strong generalization and reasoning capabilities. However, the safety of the resulting decisions cannot be ensured due to possible hallucinations and the lack of integrated vehicle dynamics. To address this issue, we propose SanDRA, the first safe large-language-model-based decision-making framework for automated vehicles using reachability analysis. Our approach starts with a comprehensive description of the driving scenario to prompt large language models to generate and rank feasible driving actions. These actions are translated into temporal logic formulas that incorporate formalized traffic rules, and are subsequently integrated into reachability analysis to eliminate unsafe actions. We validate our approach in both open-loop and closed-loop driving environments using off-the-shelf and fine-tuned large language models, showing that it can provide provably safe and, where possible, legally compliant driving actions, even under high-density traffic conditions. To ensure transparency and facilitate future research, all code and experimental setups are publicly available at commonroad.github.io/SanDRA.

SanDRA overview diagram

Our tool SanDRA takes planning and environmental information as inputs and processes them to generate a structured description of the current scenario. The description is then used to prompt the LLM to produce a ranked sequence of longitudinal and lateral action pairs, ordered from best to worst. The actions and traffic rules are converted into formulas in linear temporal logic over finite traces (LTLf) and combined by conjunction; reachability analysis is then applied to the combined formula to verify the safety of the resulting behavior.
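The ranking-then-verification step described above can be sketched as a simple loop. This is a minimal illustration and not the SanDRA implementation: the function name `first_verified_action`, the `is_safe` callback, and the choice to return the highest-ranked action pair that passes verification are assumptions made for this example. In the real pipeline, the callback would stand for the LTLf-constrained reachability check.

```python
from typing import Callable, Iterable, Optional, Tuple

# An action pair as described in the text: (longitudinal, lateral).
ActionPair = Tuple[str, str]


def first_verified_action(
    ranked: Iterable[ActionPair],
    is_safe: Callable[[ActionPair], bool],
) -> Optional[ActionPair]:
    """Return the best-ranked action pair that the verifier accepts.

    `ranked` is assumed to be ordered from best to worst, as produced by
    the LLM. `is_safe` is a hypothetical stand-in for the reachability
    check. Returns None if every candidate is rejected; what happens in
    that case is outside the scope of this sketch.
    """
    for pair in ranked:
        if is_safe(pair):
            return pair
    return None


# Usage with a stub verifier that rejects one action pair.
ranked_actions = [
    ("ACCELERATE", "RIGHT-LANE"),
    ("DECELERATE", "FOLLOW-LANE"),
]
rejected = {("ACCELERATE", "RIGHT-LANE")}
chosen = first_verified_action(ranked_actions, lambda p: p not in rejected)
print(chosen)
```

Keeping the verifier behind a callback separates the LLM's proposal from the formal check, so the ranking can come from any model while the safety decision stays with the verifier.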

Verification of CommonRoad scenarios

Please refer to the example in Sec. IV.A of the paper.

1. The reachable sets corresponding to the actions ACCELERATE and RIGHT-LANE end up empty, so these actions are considered unsafe.

2. The reachable sets for the actions DECELERATE and FOLLOW-LANE are safe, based on the most likely predictions of other obstacles.

3. The reachable sets for the actions DECELERATE and FOLLOW-LANE are safe, based on the set-based predictions of other obstacles.
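The idea behind the three cases above, an action being rejected when its reachable set becomes empty, can be illustrated with a toy one-dimensional example. This is only a sketch under strong assumptions: it uses a simple interval over-approximation along a single lane instead of the reachability analysis used in the paper, it ignores lateral actions and traffic rules, and all numbers (acceleration ranges, initial states, safety margin, the lead vehicle's braking limit) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Interval:
    lo: float
    hi: float


# Hypothetical acceleration ranges in m/s^2 for two longitudinal actions.
ACTION_ACCEL = {
    "ACCELERATE": (0.5, 3.0),
    "DECELERATE": (-6.0, -0.5),
}


def lead_rear_most_likely(t: float, p0: float = 30.0, v0: float = 5.0) -> float:
    """Rear position of a lead vehicle predicted at constant velocity."""
    return p0 + v0 * t


def lead_rear_set_based(t: float, p0: float = 30.0, v0: float = 5.0,
                        brake: float = 8.0) -> float:
    """Smallest possible rear position if the lead vehicle may brake fully."""
    t_eff = min(t, v0 / brake)  # the lead vehicle stops and does not reverse
    return p0 + v0 * t_eff - 0.5 * brake * t_eff ** 2


def reachable_positions(action: str, lead_rear: Callable[[float], float],
                        p0: float = 0.0, v0: float = 15.0, dt: float = 0.5,
                        steps: int = 6, margin: float = 2.0) -> Optional[Interval]:
    """Propagate position and velocity intervals; return None once empty."""
    a_lo, a_hi = ACTION_ACCEL[action]
    pos, vel = Interval(p0, p0), Interval(v0, v0)
    for k in range(1, steps + 1):
        # Displacement bounds over one step; the ego vehicle never reverses.
        d_lo = max(0.0, vel.lo * dt + 0.5 * a_lo * dt ** 2)
        d_hi = max(0.0, vel.hi * dt + 0.5 * a_hi * dt ** 2)
        pos = Interval(pos.lo + d_lo, pos.hi + d_hi)
        vel = Interval(max(0.0, vel.lo + a_lo * dt), max(0.0, vel.hi + a_hi * dt))
        # Remove positions that overlap the lead vehicle plus a margin.
        limit = lead_rear(k * dt) - margin
        if pos.lo > limit:
            return None  # empty set: the action cannot be carried out safely
        pos.hi = min(pos.hi, limit)  # velocity is not tightened (over-approximation)
    return pos


for action in ("ACCELERATE", "DECELERATE"):
    for name, pred in (("most likely", lead_rear_most_likely),
                       ("set-based", lead_rear_set_based)):
        result = reachable_positions(action, pred)
        print(action, name, "empty" if result is None else "non-empty")
```

With these made-up numbers, accelerating toward a slower lead vehicle empties the set under both predictions, while decelerating leaves it non-empty even when the lead vehicle is allowed to brake as hard as possible.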

Closed-loop simulation

SanDRA without rules

SanDRA with rules

The closed-loop driving task in Highway-env under different settings. The ego car is controlled by LLM-based agents, and the decision-making process is fully text-based. Our approach improves the safety of the agent by verifying actions before execution and by incorporating formalized traffic rules, enabling the vehicle to reliably follow regulations and safely navigate complex scenarios.

BibTeX

@misc{lin2025sandra,
    title={SanDRA: Safe Large-Language-Model-Based Decision Making for Automated Vehicles Using Reachability Analysis}, 
    author={Yuanfei Lin and Sebastian Illing and Matthias Althoff},
    year={2025},
    archivePrefix={arXiv},
    primaryClass={cs.RO}
}