In a recent experiment, researchers used large language models to translate brain activity into words.
Translating Brain Activity into Words
- The use of functional magnetic resonance imaging (fMRI) to collect brain activity data from participants
- The development of a model mapping the relationship between brain activity and the semantic features of phrases
- The unique aspect of this language decoder: it does not rely on implants
In this study, researchers used functional magnetic resonance imaging (fMRI) to collect brain activity data from three participants while they listened to 16 hours of narrative stories. The goal of the experiment was to train a model that could accurately map the relationship between brain activity and the semantic features of phrases, in order to better understand how the brain responds to different linguistic stimuli. The research was led by Jerry Tang and Alexander Huth.
Consider the thoughts that swirl around in your mind, from the tasteless joke you wisely kept to yourself at dinner to the unvoiced impressions of your best friend’s new partner. Now imagine if someone could eavesdrop on those thoughts. On Monday, researchers from the University of Texas at Austin took another step in that direction. In a study published in the journal Nature Neuroscience, they described an AI capable of translating the private thoughts of human subjects by analyzing fMRI scans, which measure the flow of blood to different regions of the brain.
While researchers have previously developed language-decoding methods to pick up attempted speech from people who have lost the ability to speak, and to allow paralyzed individuals to write by merely thinking of writing, this new language decoder is unique because it does not rely on implants. In the study, it was able to turn a person’s imagined speech into actual speech, and when subjects were shown silent films, it could generate relatively accurate descriptions of what was happening onscreen.
According to Alexander Huth, a neuroscientist at the university who helped lead the research, “This isn’t just a language stimulus. We’re getting at meaning, something about the idea of what’s happening. And the fact that that’s possible is very exciting.”
Large Language Models and Brain Activity
- The role of large language models such as GPT-4 and Google’s Bard in predicting the brain’s response to language
- How context embeddings can capture the semantic features of phrases and predict brain activity
- The potential implications of this for understanding how the brain processes language
The study focused on three participants who listened to “The Moth” and other narrative podcasts for a total of 16 hours over several days while an fMRI scanner recorded the blood oxygenation levels in parts of their brains. The researchers then used a large language model to match patterns in brain activity to the words and phrases that the participants had heard.
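The matching step described above can be sketched as an encoding model: a regression that predicts each brain voxel's response from the embedding features of the words being heard. The sketch below is a minimal illustration with simulated data; the dimensions, noise level, and ridge penalty are all hypothetical stand-ins, not the study's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 1,000 time points, 256 embedding features,
# 500 voxels (a real fMRI dataset is far larger).
n_time, n_feat, n_vox = 1000, 256, 500

# X: context-embedding features of the words heard at each time point.
# Y: simulated BOLD responses generated from a hidden linear mapping.
X = rng.standard_normal((n_time, n_feat))
true_W = rng.standard_normal((n_feat, n_vox))
Y = X @ true_W + 0.1 * rng.standard_normal((n_time, n_vox))

# Ridge regression: W = (X^T X + lambda*I)^{-1} X^T Y
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(n_feat), X.T @ Y)

# Predict held-out brain activity from new stimulus embeddings.
X_test = rng.standard_normal((100, n_feat))
Y_pred = X_test @ W
```

Once such a forward model is fit, a decoder can work in the opposite direction, scoring candidate word sequences by how well their predicted brain responses match the recorded ones.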
Large language models such as OpenAI’s GPT-4 and Google’s Bard are designed to predict the next word in a sentence or phrase using maps of how words relate to one another, known as context embeddings. Dr. Huth found that these context embeddings could be used to predict how the brain responds to language, because they capture the semantic features of phrases. In essence, the brain’s activity can be considered an encrypted signal that language models can decipher.
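The prediction mechanism can be illustrated in miniature: score every candidate word by the similarity of its embedding to a vector summarizing the context, then turn the scores into probabilities. The four-word vocabulary and hand-picked vectors below are toy assumptions; real models learn embeddings with thousands of dimensions from vast text corpora.

```python
import numpy as np

# Toy vocabulary with hypothetical 4-dimensional embedding vectors.
vocab = ["dog", "barked", "meowed", "loudly"]
embeddings = np.array([
    [1.0, 0.2, 0.0, 0.1],   # dog
    [0.9, 0.8, 0.1, 0.2],   # barked
    [0.1, 0.8, 0.9, 0.2],   # meowed
    [0.1, 0.1, 0.2, 0.9],   # loudly
])

def predict_next(context_vector):
    # Score each candidate word by similarity to the context vector,
    # then normalize the scores into probabilities (softmax).
    scores = embeddings @ context_vector
    probs = np.exp(scores) / np.exp(scores).sum()
    return vocab[int(np.argmax(probs))], probs

# A made-up context embedding summarizing "the dog ..." points in a
# direction closer to "barked" than to "meowed" or "loudly".
context = np.array([0.8, 0.6, 0.0, 0.1])
word, probs = predict_next(context)
```

The study's insight is that the same embedding features that drive this prediction also predict measured brain responses to the corresponding phrases.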
Testing the Decoder’s Accuracy and Future Applications
- The accuracy of the decoder in paraphrasing transcripts and capturing the essence of unspoken stories
- The decoding of silent animated movies and potential applications for film and media industries
- Future implications for this technology in fields such as medicine and communication
To test this theory, Dr. Huth and his team used an AI system to translate fMRI images of participants’ brains into words and phrases. They then tested the decoder’s accuracy by having participants listen to new recordings and comparing the decoded transcript to the actual one. The decoder was able to paraphrase the transcript, preserving the meaning of the passage, even though almost every word was out of place.
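That distinction, meaning preserved while individual words land in the wrong places, can be captured with two simple scores: exact position-by-position word accuracy versus overall word overlap. The sentences below are invented illustrations, not actual transcripts from the study.

```python
# Hypothetical reference transcript and decoded paraphrase.
reference = "she saw the dog run across the busy street".split()
decoded = "the dog she saw ran across a crowded street quickly".split()

# Fraction of positions where the exact word matches (very low
# for a paraphrase, since word order differs).
positional = sum(r == d for r, d in zip(reference, decoded)) / len(reference)

# Fraction of distinct reference words appearing anywhere in the
# decoded text (higher, since much of the content is preserved).
overlap = len(set(reference) & set(decoded)) / len(set(reference))
```

A paraphrasing decoder scores poorly on the positional metric yet well on overlap, which is why the researchers judged the output by whether it preserved the gist rather than the exact wording.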
During the fMRI scans, the participants were asked to silently imagine telling a story, which was then repeated aloud for reference. The decoding model captured the essence of the unspoken version as well. Additionally, when the participants watched a brief, silent animated movie, the language model decoded a rough synopsis of what they were viewing based on their brain activity.