The pursuit of truly autonomous systems has captured the imagination of engineers, researchers, and technologists for decades. Yet the gap between what we aspire to build and what we can safely deploy remains substantial.
On June 18, 2026, the AIAA Journal Seminar Series hosted a keynote panel titled “Advances in Intelligent Machines and Autonomy,” bringing together three leading experts to address this challenge: Martial Hebert of Carnegie Mellon University, Michael S. Francis of United Technologies Research Center and Sikorsky, and Ella Atkins of Virginia Tech.
Their discussion explored the critical research needs and gaps that will shape the future of autonomous systems. As I listened and later reflected, several ideas emerged that could fundamentally shift how we think about certifying learning-enabled systems. Here are my key takeaways from the seminar.
Meet the Panelists
Martial Hebert of Carnegie Mellon University is one of the world’s leading roboticists and Dean of Carnegie Mellon’s School of Computer Science. A native of France with a doctorate in computer science from the University of Paris, Hebert joined the Robotics Institute in 1984, just five years after it was founded, and became part of the Autonomous Land Vehicles program, a precursor to today’s research on self-driving vehicles. For decades, he has led major research programs in autonomous systems for ground and air vehicles, with major contributions in perception, environment understanding, and human interaction.
Michael S. Francis, Col., USAF, Ret., is an aerospace technologist with nearly 50 years of experience in advanced research and technology programs. Francis is known for his pioneering work in unmanned air systems development, including the original Unmanned Combat Air Vehicle and Micro Air Vehicle programs while serving at DARPA in the early 1990s. At United Technologies Research Center, he created and led its initiative in Autonomous and Intelligent Systems. From 2011 to 2015, he served as Program Executive for Optionally Piloted and Autonomous Systems at Sikorsky Aircraft, where he guided Sikorsky’s R&D program that led to the MATRIX autonomous technology suite and Sikorsky’s Autonomy Research Aircraft.
Ella Atkins of Virginia Tech is the Fred D. Durham Professor and Head of the Kevin T. Crofton Aerospace and Ocean Engineering Department. She holds B.S. and M.S. degrees in Aeronautics and Astronautics from MIT and M.S. and Ph.D. degrees in Computer Science and Engineering from the University of Michigan. An AIAA Fellow and private pilot, Atkins has pursued research in AI-enabled autonomy and control to support resilience and contingency management in crewed and uncrewed aerospace applications. She also serves as Editor-in-Chief of the AIAA Journal of Aerospace Information Systems.
Learning Systems Already in the Sky: Why Certification Matters
One of the most striking realizations from the seminar was this: passenger aviation operates under one of the strictest certification regimes for safety-critical systems, yet it already depends every day on adaptive learning agents. We call them human pilots.
Human pilots are not certified through first-principles mathematical proofs of their decision-making. Instead, aviation has developed a rigorous but pragmatic framework: staged training, supervised experience, checkrides, recurrent proficiency requirements, operational limits, and human oversight. Pilots learn, adapt, and respond to new aircraft, weather conditions, failures, and unforeseen situations. Yet we have built a system that qualifies them, monitors them, and trusts them with hundreds of lives.
This does not mean that machine-learning systems can or should be certified exactly like pilots. Personnel qualification is not the same thing as software certification. But the analogy is still useful. Human pilot certification gives us a mature example of how to qualify adaptive decision-makers in high-stakes environments through progressive validation, demonstrated competency, bounded authority, and continuing oversight.
That may be one conceptual path forward for AI certification in aviation. The challenge is not inventing assurance from scratch. It is learning which parts of our existing aviation safety culture can be adapted to systems that learn from data, operate under uncertainty, and may behave differently outside their training distributions.
Where Autonomy Works Now: Deep Space as a Natural Proving Ground
While aviation presents tight certification constraints, there is an entire domain where autonomy is not just useful but necessary: deep space operations.
The latency problem is unforgiving. When commanding a rover on Mars or maneuvering a spacecraft during entry, descent, and landing, humans on Earth simply cannot send commands and receive feedback at the rate needed. The one-way communication delay between Earth and Mars varies with planetary geometry, ranging from several minutes to more than twenty minutes. Real-time human control is impossible.
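To put rough numbers on that delay, here is a minimal sketch of the light-time arithmetic. The two distances are approximate, assumed values for the closest and farthest Earth-Mars separations (about 54.6 and 401 million kilometers). Actual figures vary from year to year, and the function and constant names are my own.

```python
# One-way signal delay is simply distance divided by the speed of light.
SPEED_OF_LIGHT_KM_S = 299_792.458  # exact by definition

# Approximate Earth-Mars separations (assumed, rounded values).
CLOSEST_KM = 54.6e6
FARTHEST_KM = 401.0e6


def one_way_delay_minutes(distance_km: float) -> float:
    """Return the one-way light-time in minutes for a given distance."""
    return distance_km / SPEED_OF_LIGHT_KM_S / 60.0


if __name__ == "__main__":
    for label, km in (("closest", CLOSEST_KM), ("farthest", FARTHEST_KM)):
        print(f"{label}: {one_way_delay_minutes(km):.1f} min one way")
```

With these inputs the one-way delay works out to roughly 3 minutes at the closest separation and roughly 22 minutes at the farthest, so a command-and-response round trip takes anywhere from about 6 to about 45 minutes.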
Mars Entry, Descent, and Landing: The “Seven Minutes of Terror”
Consider the Mars Science Laboratory, which delivered the Curiosity rover to Mars in August 2012. During entry, descent, and landing, the roughly 900-kilogram rover had to decelerate from nearly 13,000 miles per hour through the thin Martian atmosphere to zero velocity, then be gently lowered to the surface by a sky crane descent stage. The entire process took about seven minutes.
Because of the communication delay, NASA engineers could not intervene. The spacecraft had to execute its entry, descent, and landing sequence onboard. Curiosity’s EDL involved six vehicle configurations, 76 pyrotechnic devices, and roughly 500,000 lines of flight software, all coordinated with almost no margin for error.
More recently, NASA’s Perseverance rover used terrain-relative navigation during its 2021 landing in Jezero Crater. By comparing onboard imagery with preloaded maps of the landing area, the spacecraft could estimate its position during descent and autonomously divert toward a safer landing site. Earlier Mars landers and rovers also relied on autonomous entry, descent, and landing sequences because Earth-based operators could not intervene in real time, but Perseverance represented a major step forward in onboard terrain-relative hazard avoidance.
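The core idea of matching a descent image against a stored map can be illustrated with a toy example. This is only a sketch of the general map-matching concept, not the algorithm Perseverance actually flew: it slides a small patch over a tiny grid and picks the offset with the lowest sum of squared differences. The map, the patch, and all names are hypothetical.

```python
# Toy map matching: slide a small descent image over a larger stored map
# and pick the offset with the smallest sum of squared differences (SSD).


def best_match_offset(terrain_map, image):
    """Return (row, col) of the map window that best matches the image."""
    map_rows, map_cols = len(terrain_map), len(terrain_map[0])
    img_rows, img_cols = len(image), len(image[0])
    best_offset, best_score = None, float("inf")
    for r in range(map_rows - img_rows + 1):
        for c in range(map_cols - img_cols + 1):
            score = sum(
                (terrain_map[r + i][c + j] - image[i][j]) ** 2
                for i in range(img_rows)
                for j in range(img_cols)
            )
            if score < best_score:
                best_offset, best_score = (r, c), score
    return best_offset


# Hypothetical 5x5 brightness map and a 2x2 camera patch cut from it.
TERRAIN_MAP = [
    [1, 2, 3, 4, 5],
    [2, 9, 7, 1, 3],
    [4, 6, 8, 2, 0],
    [3, 1, 5, 9, 4],
    [0, 2, 6, 3, 7],
]
CAMERA_PATCH = [
    [8, 2],
    [5, 9],
]

estimated_position = best_match_offset(TERRAIN_MAP, CAMERA_PATCH)
```

Here the patch was cut from row 2, column 2 of the map, so the search recovers that offset. A flight system has to cope with scale, rotation, lighting, and timing constraints that this sketch ignores entirely.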
Orbital Maneuvers and Planetary Flybys
Beyond Mars, spacecraft have relied for decades on onboard autonomy for command execution, attitude control, fault protection, and time-critical operations. The Voyager missions used gravity-assist flybys planned by mission navigators, while onboard systems executed stored command sequences and maintained spacecraft pointing and health without real-time human intervention. More recently, missions such as MESSENGER used carefully planned Earth, Venus, and Mercury flybys, with onboard systems supporting the precise execution and protection of the spacecraft during complex mission phases.
Deep space teaches us something crucial: autonomy thrives where humans cannot intervene quickly enough, where timescales are too short for ground-in-the-loop control, and where delay itself can become catastrophic. We have been learning how to build trustworthy autonomous systems in this domain for decades, proving core principles across planetary and deep-space missions.
As we tackle certification challenges in aviation and other domains, we should ask: what can we learn from deep-space autonomy? Which principles transfer? Which do not? And how do we distinguish between autonomy that executes bounded, well-designed functions and autonomy that exercises broader judgment?
The Hybrid Approach: Physics Meets Learning
One of the most promising insights from the panel centered on a powerful idea: what if we could combine explicit, physics-grounded reasoning with data-driven machine learning in a single hybrid system?
One panelist highlighted machine-learning approaches built with physical constraints embedded directly into the system, such as a hearing tool that respects the laws of acoustics. This may be where autonomy is easiest to accept. When learning systems are constrained by the underlying physics of the problem, those constraints can act as guardrails, reducing the likelihood of nonsensical or unsafe behavior while still allowing the system to learn and adapt.
This is particularly compelling because it suggests a middle path. We do not have to choose between classical physics-based control, which is often interpretable but limited, and pure data-driven learning, which can be flexible but opaque. Instead, we can build systems where:
- Physics provides foundational constraints and safety boundaries
- Machine learning optimizes performance within those boundaries
- The combination may become more trustworthy than either approach alone
Researchers who can merge explicit reasoning with data-driven methods may unlock a new generation of systems where autonomy is not just powerful but genuinely certifiable.
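One simple way to picture the "physics as guardrail" idea is to clamp a learned command to an envelope derived from first principles. The sketch below is my own illustration, not something presented by the panel. It uses the level coordinated turn relation n = 1 / cos(phi) to bound bank angle by a load-factor limit, and `learned_bank_command` is a hypothetical stand-in for a trained model.

```python
import math


def max_bank_angle_rad(max_load_factor: float) -> float:
    """Largest bank angle for a level coordinated turn: n = 1 / cos(phi)."""
    if max_load_factor < 1.0:
        raise ValueError("level flight needs a load factor of at least 1")
    return math.acos(1.0 / max_load_factor)


def learned_bank_command(heading_error_rad: float) -> float:
    """Stand-in for a trained model (hypothetical): may ask for too much."""
    return 3.0 * heading_error_rad


def bounded_bank_command(heading_error_rad: float, max_load_factor: float) -> float:
    """Clamp the learned command to the physics-derived envelope."""
    limit = max_bank_angle_rad(max_load_factor)
    proposed = learned_bank_command(heading_error_rad)
    return max(-limit, min(limit, proposed))
```

With a load-factor limit of 2, the bound is 60 degrees of bank. The learned component is free to choose anything inside that range, and anything outside it is clipped regardless of what the model proposes.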
A Historical Precedent: When Autonomy Became Essential
Another detail from the panel discussion that stayed with me was the historical example of aircraft designed with relaxed or neutral static stability. These aircraft would not naturally return to straight and level flight if disturbed. The most famous example is the F-16 Fighting Falcon.
The YF-16 prototype was intentionally designed with relaxed static stability, which allowed it to respond more quickly and aggressively to control inputs. The production F-16 that followed became one of the first operational fighters designed around relaxed static stability and fly-by-wire control. That combination made possible what fighter designers had long wanted: a highly maneuverable airplane that could be aerodynamically unstable yet controllable through its flight control system.
The key point is that such a design would have been extremely difficult, if not impossible, for a human pilot to manage through mechanical controls alone. The aircraft relied on a fly-by-wire system to perform continuous stabilization while the pilot commanded the desired maneuver.
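A toy simulation shows why continuous feedback matters for an unstable vehicle. This is a minimal sketch under strong assumptions: a one-state linear model x' = a*x + u with a > 0, integrated with simple Euler steps. The numbers are illustrative and are not F-16 data. With no feedback a small disturbance grows, and with proportional feedback the state settles to whatever the pilot commands.

```python
def simulate(gain: float, pilot_cmd: float = 0.0, a: float = 2.0,
             x0: float = 0.01, dt: float = 0.01, steps: int = 500) -> float:
    """Euler-integrate x' = a*x + u with u = pilot_cmd - gain*x.

    a > 0 makes the bare system unstable (illustrative value only).
    Returns the state after steps * dt seconds.
    """
    x = x0
    for _ in range(steps):
        u = pilot_cmd - gain * x  # feedback fights the divergence
        x += (a * x + u) * dt
    return x


open_loop = simulate(gain=0.0)                    # no stabilization
closed_loop = simulate(gain=6.0)                  # feedback only
commanded = simulate(gain=6.0, pilot_cmd=0.4)     # feedback plus pilot input
```

With these illustrative values, the open-loop state grows by several orders of magnitude in five simulated seconds, the closed-loop state decays toward zero, and the commanded case settles near pilot_cmd / (gain - a), which is 0.1 here.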
Early testing showed how much depended on tuning that system correctly. During high-speed taxi testing, the YF-16 began to oscillate severely. Parts of the aircraft scraped the runway, throwing sparks, and General Dynamics test pilot Phil Oestricher decided the safest course of action was to take off rather than continue the test on the ground. The aircraft’s electronic flight control system had proved too sensitive to pilot input, and engineers later refined it so it could better manage the high-frequency stabilization work automatically.
This is a powerful historical precedent. We built automated control systems not because we wanted them philosophically, but because human workload demands made them necessary. The F-16 showed that automation could free humans to do what they do best: strategic thinking, decision-making, and high-level control, while machines handled the fast, relentless corrections needed for continuous stabilization.
This pattern suggests something important for modern autonomy. The goal should not always be to remove humans from the loop. Often, the better goal is to distribute tasks intelligently. Let machines handle what they are good at: rapid, precise, repetitive control. Let humans handle judgment, creativity, mission goals, and oversight. That is when autonomy becomes not just acceptable but essential.
The Limits of AI in Safety-Critical Systems
Perhaps the most sobering insight from the panel was that current AI methods have serious limitations in safety-critical applications.
For systems operating in high-stakes environments, including medical robotics, autonomous flight in civilian airspace, and military applications, the current state of AI methods presents serious assurance challenges. Unlike traditional engineering approaches with well-understood requirements, traceability, safety margins, and failure modes, machine-learning models can fail in unexpected ways. They may perform well in training and test conditions but degrade under distribution shift, adversarial conditions, sensor anomalies, or edge cases that were poorly represented in the training data.
The panelists did not dismiss AI’s potential. Rather, they called for an honest assessment of where it can and cannot be responsibly deployed. For safety-critical functions, traditional, verifiable methods remain essential, at least for now. AI may be valuable, but it must be bounded, monitored, and integrated into systems with robust safety architectures.
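As a rough illustration of what "bounded and monitored" could look like in software, the sketch below wraps a learned component in a runtime check. It is my own simplified example, not a method described by the panel: the monitor only records the per-feature range seen in training data, which is a crude proxy for detecting distribution shift, and all class and function names are hypothetical.

```python
from typing import Callable, Sequence


class EnvelopeMonitor:
    """Flags inputs outside the per-feature range seen in training data."""

    def __init__(self, training_inputs: Sequence[Sequence[float]],
                 margin: float = 0.0) -> None:
        columns = list(zip(*training_inputs))
        self.lows = [min(col) - margin for col in columns]
        self.highs = [max(col) + margin for col in columns]

    def in_envelope(self, x: Sequence[float]) -> bool:
        """True only if every feature lies inside its recorded range."""
        return len(x) == len(self.lows) and all(
            lo <= v <= hi for v, lo, hi in zip(x, self.lows, self.highs)
        )


def guarded_output(x: Sequence[float], monitor: EnvelopeMonitor,
                   learned: Callable, fallback: Callable):
    """Use the learned component only inside the monitored envelope."""
    return learned(x) if monitor.in_envelope(x) else fallback(x)
```

The design choice is that the fallback, presumably a conventionally verified controller, keeps authority whenever the input looks unfamiliar, so the learned component never acts outside conditions it was trained on.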
The Trust Problem
Building trust in AI methods is perhaps the central challenge facing the autonomous systems community. Three closely related issues emerged from the discussion.
1. Establishing Trust Through Transparency
How do engineers, operators, and regulators gain confidence in AI-driven decisions? This requires moving beyond black-box performance metrics toward interpretable, explainable, and inspectable systems. The research community is actively pursuing this goal, but it has not yet been fully achieved.
2. Certification of Autonomous Capabilities
Current certification processes were largely developed for systems with explicit requirements, traceable implementations, and bounded behavior. Learning-enabled systems challenge that framework. How do we certify systems trained on data? How do we account for emergent behavior, distribution shift, and non-obvious failure modes? This question has no easy answer, yet regulators and safety-critical industries urgently need solutions.
3. The Evolutionary Path Forward
The panelists posed an intriguing question: what traits must AI methods develop to render trustworthy, acceptable judgment in complex autonomous missions? This is not just a technical problem. It is also a question about how humans and machines interact, how we define acceptable risk, and how we build systems that are transparent about their own limitations.
The Research Agenda Ahead
The seminar revealed several critical gaps in current research:
- Verification and validation methods that scale to real-world autonomous systems
- Human-machine teaming models that leverage the strengths of both
- Interpretability frameworks that do not sacrifice performance for transparency
- Standards and certification pathways adapted to data-driven systems
- Failure-mode analysis for machine-learning systems operating in unpredictable environments
- Runtime monitoring and bounded autonomy that keep learning-enabled systems inside acceptable operational envelopes
- Hybrid physics-and-learning architectures that combine adaptability with domain constraints
What This Means for the Future
Listening to the panel, I came away with a clearer sense of where the real challenges, and real opportunities, lie.
The certification challenge is not insurmountable. We already have mature ways to qualify adaptive human operators in high-stakes environments. We do it with pilots every day. That model is not directly transferable to AI, but it offers useful principles: staged training, demonstrated competency, bounded authority, recurrent evaluation, and operational oversight.
There are domains where autonomy is not just acceptable but necessary. Deep space has used onboard autonomy safely and effectively for decades because physics leaves no alternative. We should study those successes carefully, while being precise about what kind of autonomy was involved: bounded onboard execution, guidance, control, fault protection, and, increasingly, hazard avoidance.
The hybrid approach, physics plus learning, may be one key. Rather than choosing between classical control and machine learning, we can build systems where physics provides constraints and learning optimizes performance within them.
And we have historical precedent. When the workload on humans became too great, when the task required faster, more precise, more tireless control than humans could provide, we built automated systems. That is not a flaw in the argument for autonomy. It is one of the strongest arguments for it. Autonomy is most compelling when it solves a real workload, latency, precision, or safety problem.
The panelists left the audience with a clear message: the next phase of autonomous systems development requires both humility and ambition. Humility about what current AI can safely do, and ambition to solve the remaining technical, regulatory, and human factors challenges.
Rather than debating whether AI will eventually power autonomous systems, the more useful conversation is about how, where, and under what constraints. We need to understand the operating context, build trust incrementally, learn from domains where autonomy already works, and maintain human oversight where it matters most.
For researchers, engineers, practitioners, and policymakers working on autonomous systems, the path forward is becoming clearer. We need to:
- Study successful autonomy in deep space and apply those lessons carefully
- Develop certification methods inspired by, but not identical to, human pilot qualification
- Invest in hybrid approaches that combine physics, control theory, and learning
- Recognize that autonomy is most justified when it solves genuine human workload, latency, or safety problems
- Build systems that are bounded, monitorable, and honest about their limitations
The future of autonomous systems is not about removing humans or proving that machines can do everything. It is about distributing work intelligently: letting machines do what they are uniquely good at while humans provide oversight, judgment, mission intent, and creativity.
That is not just safer. It is smarter.
Watch the full discussion here:
https://cassyni.com/events/MC2xFoyFyrKeiZDJmkrgMb/seminar/slides
The AIAA Journal Seminar Series continues to convene thought leaders to address the challenges shaping aerospace, autonomy, and intelligent systems. For more information on future seminars and research initiatives, visit the AIAA website.