Will Alexa, the assistant in your home, and Siri, the assistant in your pocket, soon be joined by a research assistant called “Penny”?
“I think we’re going to stick with ‘Penn AI,’” says Jason Moore, PhD, director of the Institute for Biomedical Informatics, when asked about naming the technology his team is developing. But it is something like an assistant for medical researchers — first at Penn, then in the biomedical research world beyond — to bring the buzzed-about tools of artificial intelligence (AI) into the regular toolbox for investigators who aren’t computer programmers.
To give you a sense of what this is and why it matters, it may be helpful to back up and ask a basic question. Even though stories about AI seem to be everywhere lately, the term is surprisingly hard to pin down. Last year on this blog, Moore brought us an introduction to five myths of artificial intelligence. That told us what AI isn’t. But now that a tool is nearly here for the broader Penn community, how do we think about what it actually is?
What Do We Mean by ‘AI’?
Definitions of AI tend to vary. There’s even a popular adage that says, in effect, AI is whatever we haven’t yet gotten computers to do without following our explicit direction. But as far as Moore is concerned, AI is a computer program that can make decisions. It’s built with a type of computer programming called machine-learning algorithms. Sometimes the terms “AI” and “machine learning” are used interchangeably, but Moore distinguishes them in a way that is useful for understanding what he’s up to with the new Penn AI tool.
Here’s one way of thinking about that distinction: Suppose I walked into a kindergarten classroom, handed out a big pile of alphabet blocks of various sizes, and instructed the children to sort the blocks in some logical way. One child might sort blocks in alphabetical order and another might group them by color. Kids can do this easily because human intelligence is built for this kind of task and we already know these categories.
But computers couldn’t do this task without being explicitly programmed with information about the alphabet, colors, or other categories — until machine learning algorithms came along. Machine learning is a kind of programming that equips computers to start from data inputs that haven’t been categorized (such as our colorful alphabet blocks) and to learn from experience, just as human brains do. But these algorithms can also see beyond human biases and expectations; where we might naturally sort blocks by color and letter order because we see these as salient categories, a machine-learning tool could be just as likely to identify differences between blocks based on their wood-grain texture.
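The block-sorting idea can be sketched with a tiny clustering routine — k-means, a standard unsupervised-learning algorithm, used here purely for illustration (the "blocks," their features, and the groupings are invented, not anything from Penn AI). The algorithm is never told what the categories are; it discovers groupings from the data alone:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Cluster 2-D points into k groups with no predefined categories."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            groups[i].append(p)
        # Move each center to the mean of the points assigned to it.
        for i, g in enumerate(groups):
            if g:
                centers[i] = (sum(p[0] for p in g) / len(g),
                              sum(p[1] for p in g) / len(g))
    return groups

# Toy "blocks" described by two made-up features (say, size and hue).
# There are two natural clumps, and the algorithm finds them on its own.
blocks = [(1.0, 0.1), (1.2, 0.2), (0.9, 0.15),
          (5.0, 4.8), (5.2, 5.1), (4.9, 5.0)]
groups = kmeans(blocks, k=2)
print([len(g) for g in groups])  # prints [3, 3]
```

Nothing in the code names the groups or says what makes a block a block — which is also why such a method can just as easily latch onto wood-grain texture as onto color.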
“One of the goals of machine learning and AI is to not make any assumptions about the patterns and let the computer figure that out,” Moore says. “Once you start making assumptions, you're limiting the scope of your analysis and what can be found. You're limiting your scientific questions.”
But the kindergartener-versus-machine competition is just getting started. Now let’s talk about Moore’s definition of AI.
“[Machine learning] is really a computer algorithm learning the pattern in the data,” Moore says — discovering colors, shapes, sizes, and textures. “I see AI as a little bit higher level than that. The way I think about AI is how humans solve problems.”
So let’s ask our kindergarten class to invent and play some games. Kids can still do this easily. A child who wants to build a big tower of blocks will know to sort the blocks by size. The precocious child who wants to practice spelling words will pick the alphabetical sorting algorithm (though she probably wouldn’t call it that). Some kindergarteners might invent completely new games with completely new methods of sorting that don’t fit any of these categories. But traditional, non-intelligent computers don’t select the best algorithm for a given purpose. They also don’t invent new purposes for themselves.
“That's been the major motivating factor for my work,” Moore says. “How can we get a computer to do that? How can we get a computer to look at data the way a human does and choose analytic approaches the way a human would? That's the way we've designed Penn AI.”
Introducing Penn AI
Moore and his team are building Penn AI as a user-friendly tool to find complex patterns in big data, both clinical data and research data, to gain insights that might be missed using traditional statistical approaches. Penn AI isn’t necessarily limited by the expectations that a researcher has going into the analysis: If some combination of five different pieces of clinical data all interact to produce a clinical outcome like the risk of experiencing heart failure or sepsis, a clinician or researcher wouldn’t necessarily notice that pattern, and regular statistical analysis methods aren’t good at finding patterns this complex. But a machine-learning algorithm can find those kinds of patterns.
When it launches around the end of this year for the Penn community, the mobile-friendly, web-based Penn AI interface will allow researchers to upload their data sets and select from a variety of different machine-learning tools to analyze their data. Or, users can allow Penn AI to consult its own memory of past go-arounds — it keeps records of all of the analyses it has done along the way — and pick the analysis tool that it has learned works best for similar data sets.
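The "memory" idea can be sketched in a few lines. To be clear, this is a hypothetical illustration, not Penn AI's actual design: the dataset traits, algorithm names, and scores below are all invented. The sketch logs which algorithm scored best on past data sets and, for a new data set, recommends the winner from the most similar past case:

```python
past_runs = [
    # (n_rows, n_features, best_algorithm, accuracy) — invented records
    (200,   10, "decision tree",       0.81),
    (5000,  40, "gradient boosting",   0.88),
    (150,    8, "logistic regression", 0.79),
    (8000,  60, "gradient boosting",   0.90),
]

def recommend(n_rows, n_features):
    """Suggest the best-scoring algorithm from the most similar past data set."""
    def distance(run):
        rows, feats, _, _ = run
        # Compare data set shapes on a relative scale so big and small
        # data sets are judged fairly.
        return (abs(rows - n_rows) / max(rows, n_rows)
                + abs(feats - n_features) / max(feats, n_features))
    nearest = min(past_runs, key=distance)
    return nearest[2]

print(recommend(6000, 50))  # prints gradient boosting
```

A real system would describe data sets far more richly and keep learning as new analyses accumulate, but the principle — let accumulated experience, not a human guess, drive the choice of algorithm — is the same.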
It’s a vital tool for the biomedical community because, even as health care and biomedical research are amassing complex big data sets, most medical researchers lack the expertise necessary to conduct analyses using AI. Plus, most commercial AI tools are proprietary and opaque. You can’t bust open the black box to understand how the AI software arrived at a given conclusion. “That's tough for clinicians,” Moore says. “If you're going to take clinical data, analyze it with a piece of software that you have no clue what it's doing, and then take the result and use that in clinical practice, that's a huge leap of faith.”
For this reason, Penn AI will be transparent to the user and will allow users to work backward; if Penn AI generates a model that seems to work — such as one that predicts a cancer diagnosis from protein data in a blood sample — users can look at how the model was built and determine what underlying biological mechanisms might make sense of that finding, or potentially discard the finding if its flaws are evident at that stage. If the mechanism seems plausible, they can then return to the lab and conduct directed experiments to either rule out or confirm and better explain that mechanism.
Moore hopes that, before too long, Penn AI will help make AI and machine-learning analyses part of the regular routine for analyzing data in biomedical science.
“If it's easy, then it becomes routine,” he says. “There was a time when statistics was really hard and not accessible, and people started making user-friendly statistics packages, and now everybody can do statistics. You can do it in Excel. It's user friendly, it's easy.”