Jennifer Tang is an avid ice hockey fan who loves both playing and watching the sport. A Ph.D. candidate in MIT’s Electrical Engineering and Computer Science Department and a member of the MIT Women’s Club Ice Hockey team, she was thrilled to have the opportunity to intern in the summer of 2018 with the Boston Bruins, a team that went on to reach the 2018-2019 Stanley Cup Finals.
While there, she found that there is significant overlap between learning to play better hockey and her research on information theory and probability theory. This is because the kinds of questions that coaches want to ask about their players’ strengths and weaknesses (both identifying and addressing them) can be effectively examined using the same techniques that Jennifer uses to investigate online machine learning.
After a childhood spent first in Texas and then California, Jennifer attended Princeton for her undergraduate degree before coming to MIT and LIDS. She knew that she wanted to stay in school until she’d completed her doctorate rather than taking a job first, so she took a cue from her undergraduate research, which was focused on information theory, and applied to programs with that in mind. When she arrived at LIDS, she started working with Prof. Yury Polyanskiy. “He has very good insights into the research and good tips about what’s important,” she says. Together they’ve been focusing on the theory behind online machine learning, a method of machine learning that is applied to sequential or chronological data, in which the best predictor of future data is updated as each new data point becomes available (think: stock price prediction). “Right now learning problems are popular,” Jennifer says. “We’re looking at the question of online learning, which is a study where you, as the statistician or the person trying to make inferences about the data, have observations at every point in time. Given all the observations you see, you are trying to guess what the next observation might be.” The guesses that Jennifer’s research examines are those that give a probability distribution over all the possible outcomes.
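To make the idea of online prediction concrete, here is a minimal sketch in Python. It is not Jennifer’s method: it uses Laplace’s classic add-one rule as a stand-in predictor, and the shot history and the function name `online_goal_probabilities` are invented for illustration. Before each shot the predictor states a probability of a goal, then it sees the outcome and updates. The running score is the log loss, a standard way to grade probabilistic guesses.

```python
import math

def online_goal_probabilities(outcomes):
    """Predict P(goal) before each shot, then update on the observed outcome.

    Uses Laplace's add-one rule, an illustrative choice that is not taken
    from the article. Returns the predictions and the total log loss in bits.
    """
    goals, shots = 0, 0
    predictions = []
    log_loss_bits = 0.0
    for outcome in outcomes:  # 1 means a goal, 0 means no goal
        # The guess is made BEFORE the outcome of this shot is revealed.
        p_goal = (goals + 1) / (shots + 2)
        predictions.append(p_goal)
        # Penalize the guess by how little probability it gave to what happened.
        p_observed = p_goal if outcome == 1 else 1.0 - p_goal
        log_loss_bits += -math.log2(p_observed)
        # Only now does the predictor learn from the new observation.
        goals += outcome
        shots += 1
    return predictions, log_loss_bits

# Hypothetical shot history, made up for this example.
shot_history = [0, 0, 1, 0, 0, 0, 1, 0]
predictions, total_loss = online_goal_probabilities(shot_history)
for shot_number, p in enumerate(predictions, start=1):
    print(f"shot {shot_number}: predicted P(goal) = {p:.3f}")
print(f"total log loss: {total_loss:.2f} bits")
```

The first guess is 0.5 because nothing has been observed yet; later guesses drift toward the observed scoring rate. A lower total log loss means the sequence of guesses matched the data better.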
Probability looks at how mathematically likely it is that a certain thing will happen in a certain set of circumstances. Why you might want to give a guess in terms of probabilities of all the possible outcomes is simple to explain through Jennifer’s favorite lens of ice hockey. “In hockey,” she says, “there are some benchmarks that you want to be able to estimate pretty well. In particular something you might want to know is, when a certain player takes a shot, what is the probability that the shot will score a goal?”
Most of the research and papers currently available on learning problems focus on predicting specific outcomes. In the hockey problem, this would mean predicting whether a specific shot will score a goal or not. However, this prediction is not necessarily the most useful for evaluating players.
“There’s a lot of randomness that happens in hockey — it might be that the player made a shot that was very good, but maybe the goalie got lucky and saved it, or maybe the player made a poor shot but it got through. But from the perspective of someone trying to evaluate the player, a good benchmark is one that summarizes a player’s ability to score without any random factors like luck of the goalie. Using probability accomplishes this.” The benefits of this sort of analysis are potentially huge for coaches and players, in addition to being of interest to a researcher, for what they reveal about the likelihood of a specific outcome.
Another way of understanding this is the difference between predicting whether a flipped coin will land on tails (a yes or no answer) and giving a specific percentage likelihood that it lands on heads or tails. “We’re trying to [recreate] a probability distribution as opposed to making a certain definite decision on something.”
Jennifer’s research focuses on approaching learning problems from an information theory perspective. “There are tools developed for information theory which can be used for prediction. Some of these tools come from data compression: the problem of figuring out how to compress a file so that instead of storing a million bytes on your computer, you can store it in something smaller, like a thousand bytes,” she says. “Information theory is not just about compression. It is also about communication. A classic information theory problem may ask the question: When trying to transmit data from a sender to a receiver, what’s the best information rate that you can send at? We can actually use techniques from these kinds of information theory problems to solve machine learning problems.” Machine learning uses algorithms and statistical models to get computer systems to complete tasks for which it’s difficult to give clear, detailed instructions. Instead, the systems have to use patterns to infer what should happen, and then carry out that specific task.
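One way to see why compression tools carry over to prediction is the standard information theory fact that an ideal code spends about -log2(p) bits on a symbol it believed had probability p. The short Python sketch below illustrates that fact only; the data and the function name `ideal_code_length_bits` are assumptions made up for the example, not part of Jennifer’s work.

```python
import math

def ideal_code_length_bits(sequence, p_one):
    """Bits an ideal code would spend if it believed P(symbol == 1) = p_one."""
    return sum(-math.log2(p_one if s == 1 else 1.0 - p_one) for s in sequence)

# Hypothetical binary data in which ones are rare (10 out of 100 symbols).
sequence = [0] * 90 + [1] * 10

# A model with no idea what comes next needs one full bit per symbol.
naive_bits = ideal_code_length_bits(sequence, 0.5)
# A model that predicts the true frequency of ones needs far fewer bits.
informed_bits = ideal_code_length_bits(sequence, 0.1)

print(f"uninformed model: {naive_bits:.1f} bits")
print(f"informed model:   {informed_bits:.1f} bits")
```

The better the model predicts the data, the shorter the compressed file, which is why a good compressor and a good probabilistic predictor are closely related.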
Because Jennifer is in the middle of her research, there are still many possibilities for where her work will go in the future. She relies on her LIDS office mates for discussion about various problems as they arise. “[They] can help clarify if there’s a part [of the work] that they know well that I don’t know well,” she says. She hopes that her outcomes will one day prove helpful in understanding logistic regression, which is often used in machine learning because it models how different variables can change the probability of outcomes. To return to the hockey example, she says, “If you observe a lot of data about the characteristics of shots hockey players take, such as where on the ice [the shot] was taken from, the angle from the net, and the distance from the net, you could perhaps answer the question of how much better a closer shot is compared to a farther one, or how much better a shot from the center of the ice is than a shot from a wide angle. There’s some underlying model that governs the probability a shot will go into the net depending on different features of the shot. We don’t know what that model is, but we would like to be able to predict that as well as possible.” The applications for this kind of analysis have significant potential, for hockey and far beyond.
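The hockey example can be sketched as a small logistic regression in Python. Everything specific here is an assumption for illustration: the shots are synthetic, the “true” model that generates them is invented (and scores far more often than real hockey), the features are reduced to distance and angle in arbitrary units, and the fit uses plain gradient descent rather than any technique from Jennifer’s research.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(features, labels, learning_rate=0.5, epochs=400):
    """Fit a logistic regression by batch gradient descent on the log loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Error is predicted probability minus the observed outcome.
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

# Invented "true" model, used only to generate synthetic shots.
random.seed(0)
TRUE_BIAS, TRUE_W_DISTANCE, TRUE_W_ANGLE = 1.0, -2.0, -1.5
shots, goals = [], []
for _ in range(1000):
    distance = random.uniform(0.0, 2.0)  # arbitrary scaled units
    angle = random.uniform(0.0, 1.5)     # radians away from the center line
    p_goal = sigmoid(TRUE_BIAS + TRUE_W_DISTANCE * distance + TRUE_W_ANGLE * angle)
    shots.append((distance, angle))
    goals.append(1 if random.random() < p_goal else 0)

weights, bias = fit_logistic(shots, goals)

def predict(distance, angle):
    """Estimated probability that a shot with these features scores."""
    return sigmoid(bias + weights[0] * distance + weights[1] * angle)

print(f"close, central shot: {predict(0.2, 0.0):.2f}")
print(f"far, central shot:   {predict(1.8, 0.0):.2f}")
print(f"mid, wide-angle shot: {predict(1.0, 1.2):.2f}")
```

The fitted weights play the role of the unknown underlying model in the quote: comparing `predict` at two distances or two angles gives an estimate of how much better one kind of shot is than another.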