SPARKing Universal Data Access
Spark Ecosystem Growth
Spark’s rapid move into the mainstream was underscored at the third annual Spark Summit in San Francisco last month. With more than two thousand attendees, the event was standing-room-only, a clear sign that interest is outpacing the space available for a fast-growing audience. The sessions delivered deeper expertise across more business applications, working with larger data sets at low latency, while building shared knowledge within the Spark community. Developers, enterprises, academics, and analysts are together contributing to the industrialization of Spark. This influx of effort will bring many improvements to the platform, including reductions in the time and cost of processing and summarizing data at scale. Spark’s growth benefits a host of industries in the ‘Insight Economy’.
IBM Fans the Flames
Universal access to all data will drive intelligence into critical applications. Beth Smith, GM of IBM Analytics, predicts that this will be the Decade of Data and Analytics. To accelerate the adoption of Spark, IBM has dedicated a new Spark Technology Center in San Francisco and committed to training one million data scientists and data engineers worldwide on Spark technology. IBM is also donating intellectual property to the Spark community: SystemML, a declarative large-scale machine learning (ML) system that supports the flexible specification of ML algorithms and the automatic generation of hybrid runtime plans, ranging from single-node, in-memory computations to distributed computations on MapReduce or Spark.
jStart Exploring Spark with Clients
IBM Emerging Technologies has been involved with Spark for a number of years, as a founding member of UC Berkeley’s AMPLab and as a Spark champion within IBM. The team is working with select clients on data analysis Proofs of Concept (POCs) that leverage Spark technology. In collaboration with the Cloud Services Team, it is building and deploying models. The team’s Spark skills include creating and setting up Spark clusters of various sizes, importing and exploring datasets, building out pipelines using IPython Notebooks, analyzing workloads, tuning environments, exploiting libraries, and more. If your organization could benefit from Spark exploration, the IBM jStart Team can help you start small and grow fast.