Against the odds, LIDS PhD student Heng Yang (who goes by Hank) has had a rather good year. For graduate students who spent much of 2020 working and attending classes remotely, life during the pandemic’s early months was a hectic, monotonous hum of research. When you’re immersed in your work, “seven days a week look alike, and this is especially the case during the pandemic,” Hank said.

To help break up the monotony, Hank started 2021 by co-organizing the 26th annual LIDS Student Conference, hosted virtually for the first time so that the LIDS community could stay connected and inspired even during the pandemic. Amid student talks on control, machine learning methods, and other topics, the conference team also organized a closing musical performance to help students feel a sense of belonging from afar. Not long after, an algorithm Hank developed — together with his supervisor, Professor Luca Carlone, Vasileios Tzoumas (a LIDS postdoc at the time), and LIDS graduate student Pasquale Antonante — was integrated into a MATLAB suite of navigation tools that companies use for commercial and industrial robotics systems.

Then, Hank presented his work at international conferences on robotics and computer vision. He honed his communication skills with a three-minute MIT Research Slam video. He has several papers in press and is due to complete his PhD at the end of the 2021-2022 academic year. And in the coming months he will hit the job-talk circuit.

How did the earnest, energetic Hank get here? He quips that his career so far has been a "random walk."

Hank grew up in China's Jiangsu Province, just north of Shanghai. He graduated with top honors from the prestigious Tsinghua University, where he explored the seemingly disparate topics of automotive engineering and the mechanics of how honeybees drink.

When he came to MIT for his graduate work, he brought this same spirit of intellectual curiosity, but was unclear about what direction to follow. “Five or ten years ago, I didn’t know what I wanted to do. I wanted to see different things. I wanted to explore,” he explains. And while he was able to do this with much success at MIT, he also feels “there was some luck or fortune involved in finding this path.”

It was while researching medical imaging methods for his master’s degree that Hank took a course with Professor Russ Tedrake and became interested in how to combine theory with practical robotics applications. That led him to LIDS, where theory and practice rub shoulders daily.

At LIDS, Hank has very happily found a home in SPARKLab, the research group led by his supervisor, Professor Luca Carlone. Here, Hank and his colleagues work on algorithms for robot perception, among other projects.

For a robot like a self-driving car, perception means sensing the external world and using hardware and algorithms to build an internal model of its surroundings.

As a basic example, a self-driving car takes a snapshot of an oncoming car, then feeds the image through a neural network that detects keypoints in the 2-dimensional image — door handles, wheels, headlights — and matches them to its prior knowledge of what cars look like in 3D.

That allows the system to figure out, and more importantly track, whether the oncoming car is hurtling toward you, or simply passing by. And it has to do all this in a split second.
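To make this concrete, here is a rough sketch of that 2D-to-3D step, not Hank's actual system: it uses OpenCV's standard Perspective-n-Point solver, and all the keypoints, camera parameters, and poses below are invented purely for illustration.

```python
import numpy as np
import cv2

# Hypothetical 3D keypoints on a canonical car model (meters): wheels and
# headlights, standing in for the system's prior about what cars look like.
model_3d = np.array([
    [-0.8, 0.0, 0.0], [0.8, 0.0, 0.0],   # rear wheels
    [-0.8, 0.0, 2.6], [0.8, 0.0, 2.6],   # front wheels
    [-0.7, 0.6, 3.0], [0.7, 0.6, 3.0],   # headlights
])

# Made-up pinhole camera intrinsics (focal lengths and principal point).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Simulate what a keypoint detector might report: project the model through
# a known ground-truth pose, as if the car were 10 meters ahead.
rvec_true, tvec_true = np.zeros(3), np.array([0.0, -0.5, 10.0])
pts_2d, _ = cv2.projectPoints(model_3d, rvec_true, tvec_true, K, None)

# Perspective-n-Point: recover the car's 3D pose from the 2D-3D matches.
ok, rvec, tvec = cv2.solvePnP(model_3d, pts_2d.reshape(-1, 2), K, None)
print(ok, tvec.ravel())   # translation close to (0, -0.5, 10)
```

Tracking then amounts to repeating this estimate on every frame and watching how the recovered pose changes, which is how the system tells an approaching car from a passing one.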

But current keypoint detection methods can produce many outliers, like mistakenly identifying part of a tree as part of a car. So how can you make the system robust against a large fraction of these outliers? How does the robot know that its output — "Oncoming Toyota sedan at 50 yards" — is correct?

Together with colleagues, Hank developed the graduated non-convexity (GNC) algorithm. Rather than trusting every detected keypoint, GNC starts from an easy, smoothed-out version of the matching problem, solves it, and then gradually sharpens the problem back toward its original form, down-weighting suspicious keypoints at each step until it converges on a solution that the outliers cannot corrupt.
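As a minimal sketch of the GNC idea, here is a toy version on a much simpler problem, fitting a line to data contaminated by outliers, using truncated-least-squares weight updates in the style described in the GNC literature; this is an illustration under simplifying assumptions, not the published implementation.

```python
import numpy as np

def gnc_tls_line_fit(x, y, eps=0.1, mu_update=1.4, max_iters=100):
    """Robustly fit y = a*x + b under a truncated least squares (TLS) loss
    via graduated non-convexity: start from a nearly convex surrogate of
    the loss and gradually sharpen it, down-weighting suspected outliers."""
    A = np.stack([x, np.ones_like(x)], axis=1)
    theta = np.linalg.lstsq(A, y, rcond=None)[0]   # plain least-squares start
    r2 = np.maximum((A @ theta - y) ** 2, 1e-12)   # squared residuals
    mu = eps**2 / max(2.0 * r2.max() - eps**2, 1e-12)  # near-convex surrogate
    w = np.ones_like(y)                            # initially trust all points
    for _ in range(max_iters):
        # Variable update: weighted least squares with current inlier weights.
        Aw = A * w[:, None]
        theta = np.linalg.solve(Aw.T @ A, Aw.T @ y)
        r2 = np.maximum((A @ theta - y) ** 2, 1e-12)
        # Weight update for the TLS surrogate: 1 for clear inliers, 0 for
        # clear outliers, and a smooth in-between value otherwise.
        w = np.clip(np.sqrt(eps**2 * mu * (mu + 1.0) / r2) - mu, 0.0, 1.0)
        if mu > 1e6:        # surrogate is now essentially the true TLS loss
            break
        mu *= mu_update     # "graduate" toward the original non-convex loss
    return theta, w

# Demo: 40 of 100 points are grossly corrupted, yet the fit stays accurate.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 100)
y = 2.0 * x + 0.5 + rng.normal(0.0, 0.01, 100)
y[:40] += rng.uniform(1.0, 5.0, 40)
theta, w = gnc_tls_line_fit(x, y)
print(theta)   # close to [2.0, 0.5]; w is near zero on the corrupted points
```

The published algorithm applies this same graduated scheme to perception problems like matching a 2D image against a 3D model, rather than line fitting.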

He is also designing algorithms that can be certified to be correct under certain conditions. Used in tandem with the GNC algorithm, these provide an extra level of certainty when dealing with noisy, complex scenes such as moving traffic: you can say that your solution is not only optimal but also safe. In the future, perhaps, such algorithms will become part of how neural networks are trained to sense and identify keypoints.

So what’s next for Hank?

In the next few years, he says, he hopes to work on the problem of active perception.

To understand active perception, start by imagining a robot arm that needs to grab an object that's only partially visible.

Humans do this automatically: if we’re reaching for a document under a pile of books, or a mug amid glasses in a cabinet, we move our hands and bodies to lift the books or move the glasses and pick up the item. We use our actions to help us perceive better.

A robotic system would need to use its sensors and algorithms to identify and track the object, and move itself to reach out and pick it up. If the object is moving, things get even more complicated.

Vasileios Tzoumas, now an assistant professor at the University of Michigan, says: "Believe it or not, there is almost no work in this domain, so it's the right time for researchers to start on this problem."

Hank seeks out problems that have real-world applications, and keeps improving on his solutions, says Vasileios. “He has the capacity to start working on a hard problem and, within a year or two, produce a robust concrete solution with almost no weaknesses.” Most importantly, he adds, Hank is always generous and willing to help others.

Hank may joke that his career has been a random walk. But random walks often reveal real underlying patterns, such as the trajectory of a thoughtful and curious rising star.