Engineering
22/01/2026
AI in Aviation Operations: Where It Helps, Where It Hides Risk, and Why Human-in-the-Loop Isn’t Optional

The pitch for AI in aviation operations is easy to make. Contract terms analyzed in seconds. Pricing anomalies flagged automatically. Demand predicted before the season starts. Email threads summarized so coordinators spend less time reading and more time deciding.

Some of this is real. Some of it is premature. And some of it is actively dangerous when deployed in environments where accuracy isn't a nice-to-have — it's a regulatory and commercial obligation.

This post is an attempt at an honest map of the territory: where AI genuinely adds value in aviation operations, where the risks are real and underappreciated, and why "human-in-the-loop" has become a buzzword precisely because the concept keeps getting ignored.

Where AI adds value — and the conditions that make it safe

Document processing and extraction. Aviation operations generate a significant volume of structured documents: contracts, service orders, compliance certificates, supplier agreements. AI is genuinely good at extracting information from these — parsing terms, identifying relevant clauses, surfacing key data points. The condition that makes this safe: a human reviews the extraction before it's acted on. AI-assisted extraction with human confirmation is powerful. AI extraction feeding directly into operational systems, without review, is a different risk profile.
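
As a rough sketch of where that boundary sits (the names ExtractedTerm, ExtractionResult, and apply_to_contract_system are hypothetical, not taken from any particular product), the extraction output can carry its review state with it, and nothing reaches an operational system until every field has a named reviewer:

```python
from dataclasses import dataclass, field

@dataclass
class ExtractedTerm:
    field_name: str                  # e.g. "volume_tier_discount"
    value: str                       # value the model pulled from the document
    source_excerpt: str              # text span the value was taken from
    confirmed_by: str | None = None  # reviewer identity; None until reviewed

@dataclass
class ExtractionResult:
    document_id: str
    terms: list[ExtractedTerm] = field(default_factory=list)

def apply_to_contract_system(result: ExtractionResult) -> None:
    """Refuse to write anything downstream until every term has a named reviewer."""
    unconfirmed = [t.field_name for t in result.terms if t.confirmed_by is None]
    if unconfirmed:
        raise ValueError(f"Unreviewed terms, not applying: {unconfirmed}")
    # confirmed terms would be written to the operational system here
```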

Pattern detection and anomaly flagging. Pricing data that looks inconsistent. Contract terms that deviate from a standard template. Invoices that don't match the agreed service schedule. AI can surface these for human review faster than manual checking allows. The value isn't in the AI making a judgment — it's in the AI ensuring that a human doesn't have to manually sift through hundreds of records to find the handful that need attention.
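
A minimal sketch of that division of labour, assuming invoice records with a service_id and amount and a table of agreed amounts (the field names and tolerance are assumptions), might look like this:

```python
# Illustrative only: surface invoices that deviate from the agreed amount so a
# human reviews the handful that need attention rather than every record.
def flag_invoice_anomalies(invoices, agreed_amounts, tolerance=0.05):
    """Return invoices whose amount differs from the agreed amount by more than `tolerance`."""
    flagged = []
    for inv in invoices:
        agreed = agreed_amounts.get(inv["service_id"])
        if agreed is None:
            flagged.append({**inv, "reason": "no agreed amount on file"})
        elif abs(inv["amount"] - agreed) / agreed > tolerance:
            flagged.append({**inv, "reason": f"deviates from agreed amount by more than {tolerance:.0%}"})
    return flagged
```

The point is not the threshold; it is that the output is a shortlist for a person, not an action.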

Summarization and communication assistance. Summarizing long email threads, drafting routine correspondence, generating first-pass responses to standard queries. For high-volume communication workflows, AI can reduce the cognitive load on coordinators dealing with operational noise. The risk here is lower than in decision-making contexts, but it still exists: a summary that drops a critical detail creates false confidence.

Demand and capacity support. Historical booking patterns, seasonal demand signals, supplier availability trends — AI can process these and produce useful inputs for planning decisions. The key word is inputs. Planning decisions in aviation operations carry consequences that require human judgment and accountability.

Where the risks are real

Compliance is not a domain for probabilistic output. AI systems produce outputs based on probability — the most likely answer given the training data and the context. In most applications, "most likely correct" is good enough. In RSB-compliant SAF tracking, regulatory contract terms, or audit-trail requirements, "most likely correct" is not a standard. Either the record is accurate or it isn't. Either the credit is traceable or it isn't.

This doesn't mean AI has no role in compliance workflows — it means the role needs to be clearly scoped. AI that flags potential compliance issues for human review is appropriate. AI that makes compliance determinations autonomously is not.

Pricing errors compound. In fuel operations specifically, a pricing model that generates a slightly wrong output — misapplying a supplier discount, miscalculating a volume tier — can produce quotes that go out to clients, get accepted, and create margin problems that are only visible after the fact. The speed advantage of AI-generated pricing disappears quickly when the cost of errors is factored in. Guard rails, validation, and human review before quotes are sent are not optional overhead — they're the mechanism that makes speed safe.
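
One way to picture those guard rails (QuoteDraft, the floor and ceiling parameters, and the review queue are illustrative assumptions, not a description of any specific system) is validation that runs before a draft quote can even enter human review, and a review gate that runs before anything is sent:

```python
from dataclasses import dataclass

@dataclass
class QuoteDraft:
    client: str
    volume_litres: float
    unit_price: float        # price proposed by the model

def validate_quote(draft: QuoteDraft, floor_price: float, ceiling_price: float) -> list[str]:
    """Hard checks that run before a draft reaches human review; an empty list means it may proceed."""
    issues = []
    if draft.volume_litres <= 0:
        issues.append("volume must be positive")
    if draft.unit_price < floor_price:
        issues.append(f"unit price {draft.unit_price} is below the contractual floor {floor_price}")
    if draft.unit_price > ceiling_price:
        issues.append(f"unit price {draft.unit_price} is above the ceiling {ceiling_price}")
    return issues

def process_draft(draft: QuoteDraft, floor_price: float, ceiling_price: float) -> str:
    issues = validate_quote(draft, floor_price, ceiling_price)
    if issues:
        return "rejected before review: " + "; ".join(issues)
    return "queued for human review; nothing is sent to the client until a reviewer approves"
```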

The explainability problem in regulated environments. When a client asks why they were quoted a particular rate, or when a regulator asks for the basis of a compliance determination, "the model said so" is not an acceptable answer. Regulated aviation operations require decisions that can be traced, explained, and defended. This is structurally difficult with certain classes of AI — not impossible to work around, but a real constraint that needs to be addressed explicitly.

Trust calibration failure. One of the most consistent risks we observe in AI deployments isn't a technical failure — it's a human one. Teams that work with AI-generated outputs long enough tend to develop calibrated trust, which is healthy. But some teams develop over-trust: they stop interrogating outputs that seem plausible because the AI has been right before. In a domain where the cost of a missed error is high, this is a serious risk. The system design needs to actively work against it — not rely on users maintaining appropriate skepticism indefinitely.

Human-in-the-loop: what it actually means

The term has become so common that it risks losing meaning. It's worth being specific.

Human-in-the-loop doesn't mean "a human approved it by clicking through." It means a human with the relevant expertise reviewed the AI's output, understood the basis for it, identified what could be wrong, and made a judgment before any action was taken.

In practice, this has design implications:

The review step must be designed to enable review, not merely to record that one happened. A confirmation screen that shows a summary and asks for approval isn't a review step — it's a click-through. A review step shows the inputs, the reasoning, and the confidence level. It gives the reviewer enough information to disagree intelligently.
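
Concretely, a review payload might carry the pieces named above rather than a bare summary. The structure below is a sketch and the field names are assumptions:

```python
from dataclasses import dataclass

@dataclass
class ReviewItem:
    proposed_action: str   # what the system wants to do
    inputs: dict           # the data the output was derived from
    reasoning: str         # the system's account of how it got there
    confidence: float      # 0.0 to 1.0, shown to the reviewer rather than hidden

def render_for_reviewer(item: ReviewItem) -> str:
    """Show enough for the reviewer to disagree intelligently, not just to click through."""
    lines = [
        f"Proposed action: {item.proposed_action}",
        f"Confidence: {item.confidence:.0%}",
        "Inputs:",
        *[f"  {key}: {value}" for key, value in item.inputs.items()],
        f"Reasoning: {item.reasoning}",
    ]
    return "\n".join(lines)
```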

The AI should expose uncertainty, not suppress it. A system that presents outputs with uniform confidence — that doesn't distinguish between "I'm very confident in this because the contract terms are clear" and "I'm uncertain because this case doesn't match patterns I've seen before" — is a system that makes appropriate skepticism harder. Good AI integration surfaces uncertainty explicitly.
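
Once uncertainty is surfaced, it can also drive routing. The thresholds and queue names below are assumptions, included only to show the shape of the rule:

```python
def route_output(confidence: float, compliance_relevant: bool) -> str:
    """Decide how much scrutiny an output gets; nothing here is acted on without a human."""
    if compliance_relevant:
        return "mandatory expert review"     # never acted on automatically
    if confidence < 0.7:
        return "flag as uncertain and route to senior review"
    return "standard human review"
```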

Accountability needs to stay with humans. This is partly a regulatory point — in aviation, accountability for operational decisions sits with people, not systems. But it's also a practical point about organizational behavior. When accountability is diffuse or unclear, errors get repeated. When a named person is responsible for reviewing an AI output before it becomes action, errors are more likely to be caught and the lessons are more likely to stick.

What this means for teams evaluating AI tools

The questions worth asking are less about capability and more about integration:

  • What happens when the AI is wrong? Is the error detectable before it has consequences?
  • Who reviews the output, with what information, and with what mandate to push back?
  • Are there categories of decision where AI output should not be acted on without a second review — not because the AI is bad, but because the stakes require it?
  • How does the system behave at the boundary of its training — in novel situations, edge cases, new supplier terms that don't match existing patterns?

Aviation operations are high-stakes, contract-heavy, and increasingly compliance-driven. AI tools that accelerate work in this environment are valuable. AI tools that create the appearance of rigor without the substance of it are a liability.

The teams that get this right treat AI as infrastructure for human judgment — not a replacement for it. The systems they build make the human review step easier, faster, and better-informed. The humans doing that review stay in the loop because the loop is designed to keep them there.

That's not a limitation of AI. It's what responsible deployment looks like.