Hybrid Inspection and Task-Based Access Control in Zero-Trust Agentic AI
Abstract
Authorizing Large Language Model (LLM)-driven agents to dynamically invoke tools and access protected resources introduces significant security risks, and these risks grow dramatically as agents engage in multi-turn conversations and scale toward distributed collaboration. A compromised or malicious agentic application can tamper with tool calls, falsify results, or request permissions beyond the scope of the subject's intended tasks; such behavior can go unnoticed under current delegated authorization flows, which lack visibility into the original subject's intent.
In light of this, we make the following contributions toward Continuous Agent Semantic Authorization (CASA). First, we propose a hybrid runtime enforcement model that combines deterministic and semantic controls, enabled by a zero-trust interception layer. Five deterministic controls enforce structural and data-integrity guarantees over the message flow, while a semantic inspection layer evaluates whether tool-call choices align with the tasks commissioned to the agent. Second, unlike prior Task-Based Access Control (TBAC) techniques that operate on single-turn interactions, we decompose the semantic layer into two stages: i) a task-extraction step that distills the subject's objectives from multi-turn conversations at the interception layer, and ii) a task-tool semantic matching step at the authorization server that evaluates whether the requested tools are appropriate for the extracted tasks. Third, we extend the ASTRA dataset introduced in our prior work by generating novel conversation-tool datasets with multi-turn interactions containing relevant and irrelevant tool calls for a given task. Finally, we provide the first experimental results for TBAC under multi-turn conversations.
This repository contains the open-source datasets introduced in the paper for task-tool matching in multi-turn conversational settings. Building on the ASTRA dataset from our prior work, we generate two novel conversation-tool datasets using complementary strategies.
Datasets
- ASTRA (Curated Base): The foundational single-turn task-tool dataset from our prior work, curated using a multi-stage pipeline combining AI consensus filtering and human verification. Samples are labeled as Relevant-seed, Relevant, Irrelevant, or Irrelevant-external.
- Augmented Conversations: For each base sample, eight conversation variants are generated, one per combination of three binary dimensions (user style, assistant style, tool selection), followed by an LLM-evaluated objective-alignment review and deterministic quality filters.
- Multi-Agent Conversations: Three LLM-powered agents (User, Assistant, Tool Simulator) interact with controlled information asymmetry to produce realistic multi-turn conversations, generating both relevant and irrelevant tool-call samples (see the generation-loop sketch below).
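To make the multi-agent strategy concrete, the sketch below shows a minimal turn-taking loop with controlled information asymmetry: only the User agent sees the full task, while the Assistant sees only the conversation and the tool catalog. This is a simplified illustration, not the paper's generation pipeline; the `call_llm` callable, the prompts, and the `<tool_call>` marker are hypothetical placeholders.

```python
from typing import Callable

def generate_conversation(
    task: str,
    tools: list[str],
    call_llm: Callable[[str, list[dict]], str],
    max_turns: int = 6,
) -> list[dict]:
    """Simulate a User / Assistant / Tool-Simulator interaction.

    Information asymmetry: only the User prompt contains the full task;
    the Assistant prompt contains only the tool catalog.
    """
    user_sys = f"You are a user pursuing this task: {task}. Reveal it gradually."
    asst_sys = f"You are an assistant with these tools: {tools}. Emit <tool_call> to invoke one."
    conversation: list[dict] = []
    for _ in range(max_turns):
        # User turn, then Assistant turn.
        conversation.append({"role": "user", "content": call_llm(user_sys, conversation)})
        reply = call_llm(asst_sys, conversation)
        conversation.append({"role": "assistant", "content": reply})
        if "<tool_call>" in reply:  # hypothetical tool-call marker
            # Tool Simulator turn: fabricate a plausible tool result.
            result = call_llm("You simulate plausible tool outputs.", conversation)
            conversation.append({"role": "tool", "content": result})
    return conversation
```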
Data Overview
- ASTRA: 1,136 validation / 935 test samples.
- Augmented: 3,956 validation / 3,242 test samples.
- Multi-Agent: 2,272 validation / 1,870 test samples.
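For convenience, a minimal loading sketch follows. It assumes the splits are exported as JSONL files with one record per line; the file paths and record fields are hypothetical placeholders, not necessarily the repository's actual layout.

```python
import json

def load_split(path: str) -> list[dict]:
    """Read one JSONL split into a list of records."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

# Hypothetical file paths; adjust to the repository's actual layout.
astra_val = load_split("astra/validation.jsonl")              # 1,136 samples
augmented_val = load_split("augmented/validation.jsonl")      # 3,956 samples
multi_agent_val = load_split("multi_agent/validation.jsonl")  # 2,272 samples

# Each record is expected to pair a conversation with a candidate tool
# call and a relevance label (field names are assumptions).
print(len(astra_val))
```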
Approach
We propose a hybrid runtime enforcement model with two complementary control categories. Five deterministic checks enforce structural and data-integrity guarantees: tool definition integrity, request authorization verification, action alignment validation, parameter integrity enforcement, and data fidelity verification. These catch attacks such as tool-definition poisoning, unauthorized tool execution, tool call swaps, parameter tampering, and result falsification.
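As a rough illustration of how such checks can be enforced deterministically, the sketch below implements two of them: tool-definition integrity via digest pinning, and parameter integrity via schema whitelisting. It is a minimal sketch with hypothetical function and field names, not the enforcement layer described in the paper.

```python
import hashlib
import json

def tool_definition_digest(tool_def: dict) -> str:
    """Canonical SHA-256 digest of a tool definition, pinned at registration time."""
    canonical = json.dumps(tool_def, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def check_tool_integrity(tool_def: dict, pinned_digest: str) -> bool:
    """Tool-definition integrity: reject poisoned or altered definitions."""
    return tool_definition_digest(tool_def) == pinned_digest

def check_parameter_integrity(call_args: dict, schema: dict) -> bool:
    """Parameter integrity: only schema-declared arguments are allowed."""
    allowed = set(schema.get("properties", {}))
    return set(call_args) <= allowed
```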
A semantic inspection layer extends Task-Based Access Control (TBAC) to multi-turn conversations through two stages: (i) conversation task extraction, which distills the subject's objective from the full conversation history at a zero-trust interception layer, and (ii) task-tool matching at the authorization server, which evaluates whether the requested tools are appropriate for the extracted tasks. We evaluate the semantic layer using four frontier LLMs: GPT-5.2, GPT-5.4, Claude Sonnet 4.6, and Claude Opus 4.6.
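The two stages can be pictured as in the sketch below: task extraction runs over the full conversation history at the interception layer, and a separate relevance judgment runs per tool request at the authorization server. The prompts and the `call_llm` callable are hypothetical placeholders; the paper's actual prompting and model configuration are not reproduced here.

```python
from typing import Callable

def extract_tasks(conversation: list[dict], call_llm: Callable[[str], str]) -> str:
    """Stage (i): distill the subject's objectives from the full history
    (run at the zero-trust interception layer)."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in conversation)
    return call_llm("List the user's underlying tasks in this conversation:\n" + transcript)

def match_task_tool(tasks: str, tool_request: dict, call_llm: Callable[[str], str]) -> bool:
    """Stage (ii): judge task-tool relevance (run at the authorization server)."""
    verdict = call_llm(
        f"Tasks: {tasks}\n"
        f"Requested tool: {tool_request['name']} "
        f"with arguments {tool_request.get('arguments', {})}\n"
        "Answer exactly RELEVANT or IRRELEVANT."
    )
    return verdict.strip().upper().startswith("RELEVANT")
```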
The five deterministic checks (enabled by steps X1-X7) catch structural violations, the semantic checks (steps Y1 and Z2) assess whether the requested tools align with the extracted intent, and tool access is enforced through enhanced delegated authorization flows (steps Z1-Z4).
Results
We evaluated the semantic TBAC layer with four frontier LLMs across three dataset settings: single-turn (ASTRA), augmented multi-turn, and multi-agent multi-turn, measuring accuracy, precision, recall, F1 score, false positive rate (FPR), and false negative rate (FNR).
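For reference, these metrics follow their standard definitions over the confusion-matrix counts. The helper below assumes nonzero denominators and treats blocked irrelevant tool calls as the positive class; the paper's class convention may differ.

```python
def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix metrics (assumes nonzero denominators)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),  # false positive rate
        "fnr": fn / (fn + tp),  # false negative rate
    }
```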
On single-turn data, all models achieved strong task-tool matching performance. In the multi-turn settings, the augmented conversations posed a moderate challenge, while the multi-agent conversations, which feature more naturalistic and varied interaction patterns, proved more difficult, particularly for distinguishing irrelevant tool requests from relevant ones.
Across all settings, the results demonstrate that semantic TBAC is a viable runtime control for agentic systems, while also highlighting the need for further research as conversation complexity and realism increase.
We hope that these datasets and the hybrid enforcement framework will serve as valuable resources for future research in securing agentic AI systems. If you make use of this work in your research, please cite our paper:
BibTeX
@misc{helou2026hybridinspectiontaskbasedaccess,
      title={Hybrid Inspection and Task-Based Access Control in Zero-Trust Agentic AI},
      author={Majed El Helou and Benjamin Ryder and Chiara Troiani and Jean Diaconu and Hervé Muyal and Marcelo Yannuzzi},
      year={2026},
      eprint={2605.02682},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2605.02682},
}