Load Your Data Into a Jupyter Notebook

You’ve heard the flashy statistics about big data: more than 2.5 quintillion bytes of data are created every day, and more data has been created in the last two years alone than in all previous years combined (IBM). Here’s another one to add to the list: 99.5% of newly created data is never analyzed (MIT). Only half a […]

Spark on z/OS and Jupyter: fast, flexible analysis of mainframe data

Many enterprises need to expand data processing access to more users without impacting mission-critical transactional application environments. The trending approach to this problem is to move the data from these systems of record to a data warehouse. Moving data-at-rest to a mirrored data repository for analytics can yield costly side effects such as expensive migration workloads, data concurrency and […]

jStart Spark Data Analysis Projects for Clients

During the IBM InterConnect 2016 conference, Scott Laningham asked me about the mission of the IBM jStart team and about our team’s Spark data analytics projects with various clients such as SolutionInc and USA Cycling Women’s Team Pursuit. Why Spark? Spark provides data analysts, data scientists, and even line of business users the ability to find new patterns in data, […]

Kafka and Spark Streaming for IoT at the Vacation Resort

In the Internet of Things at the Vacation Resort Experience blog post, we described the high-level architecture and business objectives of the Jabil/IBM jStart Proof of Concept (POC). This post delves more deeply into the solution architecture and reviews a few of the lessons learned from our first experience with Kafka and Spark Streaming. (System Component Diagram) Training Phase: As part […]

Using Notebooks and Spark on Bluemix

In this demonstration we use the IBM Apache Spark and IBM Cloudant Bluemix services to process and persist data from the Meetup RSVP stream. On the backend, the IBM Apache Spark service uses the Spark Kernel. The Spark Kernel provides an interface that allows clients to interact with a Spark cluster. Clients can send libraries […]

Apache Spark – Utilizing Access Point Wi-Fi Data

Intro: Did you know that Wi-Fi routers used within your home or outside in public are capable of collecting a wealth of information about your mobile devices even if you never actually sign in and connect to the Internet? Wi-Fi is ubiquitous in today’s world, and cell phones and other mobile devices are almost always either passively or actively probing […]

SETI sparks Machine Learning to sift Big Data

The SETI Institute’s mission is to explore, understand and explain the origin and nature of life in the universe. A central element of the Institute’s operations is the Allen Telescope Array (ATA), located at the Hat Creek Radio Observatory in California. This phased array observatory combines over 40 radio dishes to look for faint signals that may betray the presence […]
