Load Your Data Into a Jupyter Notebook

You’ve heard all the flashy statistics about big data, like how more than 2.5 quintillion bytes of data are created every day and how more data has been created in the last two years alone than in all previous years combined (IBM). Here’s another one to add to the list: 99.5% of newly created data is never analyzed (MIT). Only half a […]

Spark on z/OS and Jupyter: fast, flexible analysis of mainframe data

Many enterprises need to expand data processing access to users without impacting mission-critical transactional application environments. The trending approach to this problem is to move the data from these systems of record to a data warehouse. Moving data-at-rest to a mirrored data repository for analytics can yield costly side effects such as expensive migration workloads, data concurrency and […]

jStart Spark Data Analysis Projects for Clients

During the IBM InterConnect 2016 conference, Scott Laningham asked me about the mission of the IBM jStart team and about our team’s Spark data analytics projects with various clients such as SolutionInc and USA Cycling Women’s Team Pursuit. Why Spark? Spark provides data analysts, data scientists, and even line-of-business users the ability to find new patterns in data, […]

Unleashing Exploration on Enterprise Data

Enterprise customers have huge investments in transactional data systems, yet they struggle to provide their users with flexible and timely exploratory access to this data. One solution to this problem is to let these users run Jupyter Notebooks and Apache Spark natively on z/OS to federate analytics across business-critical data as well as external […]

Kafka and Spark Streaming for IoT at the Vacation Resort

In the Internet of Things at the Vacation Resort Experience blog post, we described the high-level architecture and business objectives of the Jabil/IBM jStart Proof of Concept (POC). This post delves more deeply into the solution architecture and reviews a few of the lessons learned from our first experience with Kafka and Spark Streaming technologies. System Component Diagram Training Phase As part […]

Why is IBM involved with Apache Spark?

How Emerging Technologies Works Within IBM, and How It Led to Spark. Recently, IBM announced its IBM Spark initiative, detailing how the company would move forward with the open-source compute cluster framework, including the creation of a Spark Technology Center based in San Francisco, California. This initiative, one of IBM’s largest, already includes over a dozen IBM labs, as well […]
