SETI sparks Machine Learning to sift Big Data
The SETI Institute’s mission is to explore, understand and explain the origin and nature of life in the universe. A central element of the Institute’s operations is the Allen Telescope Array (ATA), located at the Hat Creek Radio Observatory in California. This phased array observatory combines over 40 radio dishes to look for faint signals that may betray the presence of intelligent extraterrestrial life. Making sense of the data produced by this installation demands advanced analytics that can scale to handle the ATA data stream of over 60 gigabits per second.
100 Million Radio Events
The IBM jStart team has joined with the SETI Institute to develop a Spark application to analyze the 100 million radio events detected by the ATA over several years. The complex nature of the data demands sophisticated mathematical models to tease out faint signals, and machine learning algorithms to separate terrestrial interference from true signals of interest. These requirements are well suited to the scalable in-memory capabilities offered by Apache Spark, especially when combined with the big data capabilities of IBM Cloud Data Services.
This application uses the IPython Notebook service on Apache Spark, deployed on IBM Cloud Data Services (CDS). The ATA data will be loaded into the CDS object store in a format that facilitates signal processing and experimentation. Data scientists will innovate, explore and refine their analytic methodologies using interactive IPython notebooks. Together, these notebooks will form a self-documenting repository of ATA signal processing research that can be searched, referenced, shared and improved upon collaboratively.
“It is both exciting and promising to be working with IBM to better sift some of our large data sets, and possibly tease out interesting signals,” said Dr. Seth Shostak, Senior Astronomer and Director of the Center for SETI Research. IBM is also pleased to be working with the SETI Institute on one of mankind’s most central questions, and the IBM jStart team is looking forward to seeing new insights emerge from this exciting Spark project.