jStart Spark Data Analysis Projects for Clients

During the IBM InterConnect 2016 conference, Scott Laningham asked me about the mission of the IBM jStart team and about our team’s Spark data analytics projects with clients such as SolutionInc and the USA Cycling Women’s Team Pursuit. Why Spark? Spark gives data analysts, data scientists, and even line-of-business users the ability to find new patterns in data, […]

Read more

Powered By Jupyter: A Survey of the Project Ecosystem

Project Jupyter has a large and growing developer community, one that both includes and extends beyond the Jupyter org on GitHub. In this post, we’ll take a walk through the wonderful things people are building based on Jupyter technology today. The Jupyter Notebook is the most well-known application in the Jupyter ecosystem. It is a web-based environment for combining text, […]

Read more

Powering applications with a Notebook Microservice

Previously, we learned how to create a microservice from a notebook using the Jupyter kernel gateway. We learned how to annotate existing cells and how to generate a new notebook with code cells ready to be filled in. Now let’s look at this from the other direction: breaking down a problem into the microservices needed to implement its solution. For this post, we’ll solve […]

Read more

Notebook Microservice And Swagger

In previous posts we learned how to create a microservice in a notebook using the Jupyter kernel gateway. This will be the foundation for today’s post, where we will create a notebook microservice with Swagger, a set of tools for representing REST APIs. With this approach, notebook authors can create and deploy APIs that are easy to consume by other developers. There […]

Read more

Jupyter Notebooks as RESTful Microservices

“Data science enables the creation of data products.” – Mike Loukides in What is data science? Data products take on many forms. Web articles, dashboard applications, and cloud services are all common vehicles for delivering value from data. Tools that help produce artifacts such as these are a necessary part of any data mining methodology. In the Project Jupyter ecosystem, many […]

Read more

Case Study: Delivering Transportation Insights using Jupyter Notebooks, Interactive Dashboards, and Apache Spark

Our IBM Cloud Emerging Technologies team recently worked with Executive Transportation Group (ETG) to analyze executive car service trips in New York City. ETG wanted to simulate potential changes to its driver dispatch algorithm, and assess the impact of those changes on its operations. The goal was to identify changes that might increase efficiency and have a positive impact on […]

Read more

Unleashing Exploration on Enterprise Data

Enterprise customers have huge investments in transactional data systems, yet they struggle to provide their users with flexible and timely exploratory access to this data. One solution to this problem is to empower these users with the ability to use Jupyter Notebooks and Apache Spark running natively on z/OS to federate analytics across business critical data as well as external […]

Read more

Climatology Analyst

Recently, there has been a great deal of discussion in the news around climate change and global warming. The National Climatic Data Center within the National Oceanic and Atmospheric Administration (NOAA) monitors weather stations from around the world. Since NOAA data is publicly available, it is a good data source for demonstrating how the citizen analyst can access large public […]

Read more

Strata Hadoop World Conference Update

At the Strata Hadoop World Conference held last month in Singapore, Rod Smith, our VP of Emerging Technologies, presented a keynote expanding on the newest driving business force: real-time digital business transformation. As businesses and customers demand quicker insights to recognize threats and opportunities for their businesses, the challenge for data scientists focuses on how to leverage historical […]

Read more

Using Notebooks and Spark on Bluemix

In this demonstration we are going to use the IBM Apache Spark and IBM Cloudant Bluemix services to process and persist data from the Meetup RSVP stream. On the backend, the IBM Apache Spark service will be using the Spark Kernel. The Spark Kernel provides an interface that allows clients to interact with a Spark cluster. Clients can send libraries […]

Read more