Zyphra Announces ZUNA, First Brain EEG-Based LLM
Startup Zyphra has announced ZUNA, a model it claims is the first Large Language Model based on brain electroencephalography (EEG) data. Details are limited, but the development suggests a novel approach: training AI directly on neural signals. The announcement was part of a broader roundup of recent AI industry news.
- ZUNA is a 380-million-parameter masked diffusion autoencoder trained to reconstruct, denoise, and upsample EEG signals. Its primary function is not direct thought-to-text translation but rather to clean and standardize noisy EEG data, which is a foundational step for future brain-computer interface (BCI) applications.
- The model was trained on a large-scale corpus of approximately 2 million channel-hours of EEG data aggregated from 208 publicly available datasets. This large and diverse dataset allows ZUNA to learn generalizable representations of brain signals, outperforming traditional interpolation methods for reconstructing missing data, especially when significant portions of the signal are lost.
- The choice of a diffusion autoencoder architecture is particularly suited for the continuous nature of EEG signals. This generative model learns to reverse a process of gradually adding noise to the data, which is effective for reconstructing the original, clean signal from noisy or incomplete inputs. The "masked" aspect of the training, where up to 90% of channels were dropped, forced the model to learn deep correlations between different parts of the brain's electrical activity.
- While Zyphra's announcement is novel, other major tech companies are also exploring non-invasive BCI. Meta AI, for instance, has a project called Brain2Qwerty which decodes sentences from brain activity with up to 80% accuracy using magnetoencephalography (MEG), a technique related to EEG. However, Meta's approach currently requires large, expensive MEG machines in controlled lab environments.
- Google has also researched non-invasive BCIs that combine EEG with other methods like near-infrared spectroscopy (NIRS) to interpret neural activity for controlling devices. Their focus has been on improving accuracy and ease of use for potential applications in assistive technology.
- ZUNA is released under a permissive Apache 2.0 open-source license, with the model weights and inference code available on Hugging Face and GitHub. This allows researchers and developers to integrate it into their own workflows and build upon the model, fostering further innovation in the BCI field.
- Zyphra has a history of releasing open-source models before ZUNA, including the ZAYA and Zamba series of language models. Their work often focuses on efficient model architectures, which is a key consideration for deploying complex models in real-world applications.
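
The masked-channel training and the "gradually adding noise" process described above can be illustrated with a minimal Python sketch. It is not Zyphra's code. The function names `mask_channels` and `add_noise` are hypothetical, the data is synthetic sine waves rather than real EEG, and the noising step assumes the standard DDPM-style forward process, which may differ from the schedule ZUNA actually uses.

```python
import math
import random


def mask_channels(eeg, drop_fraction, rng):
    """Zero out a random subset of channels (hypothetical helper).

    eeg is a list of channels, each a list of float samples.
    Returns (masked copy, sorted indices of the dropped channels).
    """
    n_channels = len(eeg)
    n_drop = int(round(drop_fraction * n_channels))
    dropped = sorted(rng.sample(range(n_channels), n_drop))
    dropped_set = set(dropped)
    masked = [
        [0.0] * len(channel) if i in dropped_set else list(channel)
        for i, channel in enumerate(eeg)
    ]
    return masked, dropped


def add_noise(eeg, alpha_bar, rng):
    """Apply one forward-diffusion noising step (assumed DDPM form).

    x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps, eps ~ N(0, 1).
    alpha_bar = 1.0 leaves the signal untouched; values near 0.0
    replace it almost entirely with Gaussian noise.
    """
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [
        [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in channel]
        for channel in eeg
    ]


# Toy "recording": 10 channels of 8 samples each (synthetic, not real EEG).
rng = random.Random(0)
clean = [[math.sin(0.5 * t + 0.3 * c) for t in range(8)] for c in range(10)]

# Drop 90% of the channels, matching the upper bound quoted in the article.
masked, dropped = mask_channels(clean, drop_fraction=0.9, rng=rng)

# Corrupt what remains with Gaussian noise at an arbitrary noise level.
noisy = add_noise(masked, alpha_bar=0.5, rng=rng)

print(f"dropped {len(dropped)} of {len(clean)} channels: {dropped}")
```

In a setup like this, a model would receive `noisy` as input and be trained to recover `clean`, so it would have to infer the dropped channels from the few that remain. How ZUNA combines masking and noising internally is not stated in the announcement.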