Picture this: It’s a hot, dry summer’s night. A lit cigarette butt; a spark. Suddenly, a rapidly spreading fire is making its way through the neighborhood.

The smoke and heat are too dense for firefighters to navigate, so they deploy a team of drones to map the fire and find any people who may be trapped. The drones calculate an exploration plan and fly into one of the buildings to begin their task. But as they scurry through the building, unexpected challenges occur: Some of their sensors are obscured by smoke. Others run their batteries down and return to home base. And others are forever lost in the fire, destroyed by falling objects.

An immediate question arises: Despite many of the drones being lost, can those that remain still scan the blaze? As the now-diminished team works through the area, another challenge is also apparent: the fire has created an environment they’ve never encountered before. How do the drones recognize where they are? That they are not visiting the same place again and again? And, importantly, how do they ignore misinformation — one apartment door looks just like any other; corridors across floors seem the same — as they stitch together their picture of the blaze?

This is where the work of LIDS postdoctoral researcher Vasileios Tzoumas comes in. Together with his colleagues (LIDS professors Luca Carlone and Ali Jadbabaie, and University of Pennsylvania professor George J. Pappas, among others), Vasileios has developed seminal algorithms for resiliency against Denial of Service (DoS) failures (as in drones destroyed by falling objects) and for robustness against outliers (as in misinformation that results in inaccurate mapping).

In a broad sense, Vasileios is working toward a vision of resilient robots and other cyber-physical systems (CPS). CPS are physical systems, such as drones and self-driving cars, that are deeply intertwined with the software that controls them. Communicating between software, sensors, and actuators (components that physically move parts of the machine, e.g., opening a valve), these systems sense, process, and interact with the physical world, and they are doing so in increasingly sophisticated ways.

Vasileios, for his part, focuses on the resource-constrained tasks of search and rescue, navigation, and surveillance. Though his results are primarily relevant to the field of control and robotics, they have found applications in statistical learning and operations research as well. In his research, Vasileios builds on fundamental methods in control theory and discrete (combinatorial) optimization. “When we design heterogeneous teams of robots, different combinations of failures have different effects; some failures can be more devastating than others,” he says. “To be resilient, we need to identify the worst combinations among all possible. And doing this in real-time, it’s hard.”
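To get a feel for why identifying the worst combination of failures is so demanding, consider a minimal sketch of the underlying combinatorial problem. This is not Vasileios’s algorithm; the drone names, coverage numbers, and helper functions below are invented purely for illustration, and the brute-force search stands in for the naive approach his methods are designed to avoid.

```python
# Toy illustration (not the actual LIDS algorithm): brute-force search for the
# worst-case combination of drone failures in a small exploration team.
# All names and numbers are invented for illustration.
from itertools import combinations

# Hypothetical coverage contribution (in explored rooms) of each drone's plan.
coverage = {"drone_A": 9, "drone_B": 7, "drone_C": 7, "drone_D": 4, "drone_E": 3}

def team_coverage(active_drones):
    """Coverage of the surviving team (simplified: contributions just add up)."""
    return sum(coverage[d] for d in active_drones)

def worst_case_failures(drones, num_failures):
    """Exhaustively check every combination of num_failures lost drones
    and return the one that leaves the least coverage behind."""
    worst_combo, worst_value = None, float("inf")
    for lost in combinations(drones, num_failures):
        survivors = [d for d in drones if d not in lost]
        value = team_coverage(survivors)
        if value < worst_value:
            worst_combo, worst_value = lost, value
    return worst_combo, worst_value

lost, remaining = worst_case_failures(list(coverage), num_failures=2)
print(f"Worst 2-drone loss: {lost}, remaining coverage: {remaining}")
```

Even in this toy setting, the number of failure combinations grows as n-choose-k with the team size, which is why exhaustive checking quickly becomes infeasible and why real-time resilience calls for smarter combinatorial methods.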

Vasileios is an animated speaker, his passion for his work coming through as he explains the importance of his research: Today, as robots and other CPS are put to work in a variety of failure-prone situations — such as search and rescue in disaster zones or self-driving vehicles in crowded cities — resilient autonomy will be increasingly necessary.

In addition to resiliency against failures and robustness against outliers, Vasileios says that resilience against attacks (coordinated misinformation) is the third domain relevant to his research. All three need to be integrated into a “resilient autonomous systems intelligence,” he says. Though the need for this kind of systems intelligence will certainly increase in the future, he argues that recent cyber-attacks, such as the Wi-Fi-based attack against self-driving cars, or the Stuxnet computer worm against sensors and actuators in nuclear facilities, have demonstrated the need for resilient autonomy right now. Resilient autonomy has also been recognized as an important issue at the national level: the National Institute of Standards and Technology has laid out resiliency frameworks as part of public policy, and the National Academy of Engineering named resiliency (as security) against attacks and DoS failures one of the 14 Grand Challenges for Engineering in the 21st Century.

In recent papers, Vasileios and his colleagues addressed several of these resilient autonomy challenges. For example, in a paper on resilient exploration against DoS failures, the algorithm they developed plans trajectories for each drone to maximize the explored area in a way that also withstands multiple drone failures. Their paper shows the algorithm is more efficient than a brute-force method, and as fast as established methods that ignore the possibility of failures. The algorithm was tested in small-scale drone deployments and in larger-scale computer simulations. In another paper, Vasileios and his colleagues present a general-purpose algorithm suitable for outlier-robust mapping, among other applications. The algorithm was tested on several benchmark datasets to evaluate its use in real-world scenarios. “Ensuring real-time performance is of utmost importance, as one plans to move from theory to practice,” Vasileios says. Looking to the future, Vasileios plans to explore resilient autonomous navigation against cyber-attacks.
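The trade-off that such work navigates can be sketched with a toy comparison of the two baselines mentioned above. Neither planner below is the published algorithm, and the trajectories, cell sets, and team size are invented assumptions: one planner greedily maximizes coverage while ignoring failures, the other brute-forces the team with the best worst-case coverage after losses.

```python
# Hypothetical comparison of two baselines (not the paper's algorithm):
# a fast greedy planner that ignores failures, and a slow brute-force planner
# that maximizes the worst-case area left after k drones are lost.
from itertools import combinations

# Each candidate trajectory covers a set of grid cells in the burning building.
candidate_trajectories = {
    "traj_1": {1, 2, 3, 4},
    "traj_2": {3, 4, 5},
    "traj_3": {6, 7},
    "traj_4": {1, 6, 8},
    "traj_5": {8, 9},
}

def covered(plan):
    """Total number of distinct cells explored by a set of trajectories."""
    cells = set()
    for t in plan:
        cells |= candidate_trajectories[t]
    return len(cells)

def worst_case(plan, k):
    """Coverage remaining after the k most damaging drones are removed."""
    return min(covered([t for t in plan if t not in lost])
               for lost in combinations(plan, k))

def greedy_plan(team_size):
    """Fast baseline: pick trajectories by marginal coverage, ignoring failures."""
    plan = []
    for _ in range(team_size):
        best = max(set(candidate_trajectories) - set(plan),
                   key=lambda t: covered(plan + [t]))
        plan.append(best)
    return plan

def brute_force_resilient_plan(team_size, k):
    """Slow baseline: try every team, keep the best worst-case coverage."""
    return max(combinations(candidate_trajectories, team_size),
               key=lambda plan: worst_case(list(plan), k))

greedy = greedy_plan(team_size=3)
resilient = brute_force_resilient_plan(team_size=3, k=1)
print("greedy plan:", greedy, "worst-case coverage:", worst_case(greedy, 1))
print("brute-force plan:", resilient, "worst-case coverage:", worst_case(list(resilient), 1))
```

As the paragraph above describes, the published algorithm aims for the best of both baselines: it avoids the cost of brute force while still planning for the failures that a failure-agnostic method ignores.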

“Resiliency comes down to being able to sustain losses in the long run, and yet cross the finish line, making in the middle as many steps of recovery as necessary,” he says. “Similar to a marathon run.” Fittingly, Vasileios is a marathon runner, hailing from the marathon’s birthplace, Greece, where he began his research career at the National Technical University of Athens, before heading to the University of Pennsylvania for a PhD in electrical and systems engineering. He can be found trotting along the Charles River or pounding the pavements of Cambridge; currently, he is training for an ultra-marathon in Arizona next year.

In his telling, his curiosity about control and resilient autonomy stems from his interest in human behavior. We make decisions every day based on the information we collect and the plans we develop, he says. However, we all have limited time and information with which to develop our plans and make decisions. In other words, there are fundamental limitations on our capacity to solve a problem: the time available, the quality of our information, and, as a result, the quality of our plan. All of this translates to autonomous machines: they need to plan in real time; they need to reject misinformation; and, in the end, they need to ensure they are following an effective plan.

It is this human element that underpins Vasileios’s research: “Ultimately, my vision is about resilient machines that can protect themselves, and the people around them,” he says.