Big Data Maturation

This past week I attended CiscoLive! 2015 as EMC’s Big Data “expert”. The conference validated that every business, in every industry, is collecting data from new sources and leveraging next generation analytics to improve the customer experience, deliver new products and services, and deliver them much more efficiently. Improving technology is accelerating the adoption of Big Data solutions. We can now deploy and embed cost effective data collection sensors in even the most traditional commodity devices, such as light bulbs. Cisco demonstrated impressive higher capacity, more reliable wireless networks this week. I previously discussed the new EMC storage technology that enables the ingestion and analysis of data much faster and more cost effectively here.

Although the technology needed for Big Data is getting better, it needs to become significantly easier to use and deploy to keep pace with business demands. Two main industry challenges were very evident this week:

  • Analytics tools are difficult to use
  • Data fabrics are hard to implement

The lack of resources that can build the data models and algorithms to make data available and actionable came up in many of the conversations I had with both product teams and users. Many are focused on increasing the training and education capacity needed to develop analysts who can use the new analytics tools available today. I think we also need to make the tools themselves much easier to use. Compare the complexity of the tools analysts are using for Big Data projects to traditional query languages like SQL and the mature analytics tools built on top of them.

Big Data analysts need to be experienced with programming languages such as Java, Python, and R. Data access is a polyglot combination of often disparate low-level APIs and formats, requiring transformation of the data before it can be processed. Some of the complexity of the new tools is a result of new capabilities that will mature and simplify over time. Next generation analytics tools such as Splunk and Tableau are promising, but I think we need to be less accepting of the poor usability of many Big Data analytics tools. Analysts and data scientists need to be able to focus more on designing data models and algorithms, and less on building unique solutions requiring a lot of application programming.


The second challenge for the industry is to greatly simplify Big Data infrastructure deployment. Today it can take an organization several months to install and configure the IT infrastructure, data fabric, and analytics tools. Consider all the Big Data tools and products that have to be deployed to create a Big Data Lake. There is still no consensus IT infrastructure model; for example, some fundamental attributes are still being debated, including:

  • Storage - external array vs. commodity direct attached server storage
  • Compute – bare metal vs. hypervisor 

I think a standard architecture will emerge soon. It will follow the same path as other new technology paradigms of the past, such as server virtualization. In the mid-2000s it was challenging and time consuming to deploy VMware virtualization due to some of the same issues. Eventually the industry settled on a common architecture that storage, server, and network vendors optimized around, and adoption accelerated rapidly.

At the data fabric layer, Hadoop has become widely accepted. Although it is based on an Apache open source project, there are multiple commercial distributions (Hortonworks, Cloudera, MapR, PivotalHD), and it is not easy to deploy the same workload across all of them. Foundations such as the Open Data Platform (ODP) have been formed by the industry to facilitate collaboration on this issue. The growth in the number of participants in the ODP initiative demonstrates that the industry recognizes the problem.

The early results of using Big Data to improve the customer experience, create new products and services, and deliver them more efficiently have been promising. Our ability to address the complexity of the tools and infrastructure will determine how fast the benefits of Big Data solutions are realized.


The New IT - We either disrupt ourselves, or we will get disrupted.

This week I attended CiscoLive! 2015. John Chambers delivered the opening keynote, asserting that businesses either disrupt themselves or their competitors will disrupt them. He cited examples of companies like Uber that have disrupted mature industries and obsoleted successful businesses. Not only has a company like Uber disrupted its industry, it has also grown the market, with five times more rides taken. By improving the customer experience, many more people are leveraging car services rather than driving themselves. John cited several examples of transformations happening at Cisco, including his succession as CEO by Chuck Robbins.

The “disrupt or be disrupted” theme continued throughout his presentation and the entire conference. John cited a recent survey that found 87% of CEOs believe it is imperative they become a digital business, but only 7% actually have a digital plan today. John believes that only 30% of digital transitions will be successful. This is obviously a great opportunity for IT to lead the development and execution of your company’s digital transition.

I believe leading a company’s digital transformation will require a new type of IT. Certainly IT will need to be much more knowledgeable about its business and market. For example, how many of today’s IT organizations can answer the following?

  • What products and services will customers pay us to deliver?
  • What are our costs to deliver those products and services?
  • What do we need to build? What should we buy to deliver our products and services?

I think this is one of the reasons so many of today’s successful CIOs, such as Cisco’s Rebecca Jacoby and Intel’s Diane Bryant, are running major parts of their companies’ businesses. I believe much of our future IT leadership will come from lines of business, and many IT leaders will grow their careers by taking on line-of-business responsibilities.

Certainly new technology will also be required. New applications will be consumed via mobile devices, and we will be leveraging and collecting data from new sources to improve customer experience and outcomes. New software development languages and frameworks such as Node.js and Rails will allow these applications to be developed faster. Application deployment will be highly automated, further accelerating the delivery of services to our customers. Today, there are about 15,000 new applications created weekly. The pace of change enabled by digitization is accelerating, and disruption by a competitor will be swift.

IT organizations in digital businesses will not be measured only by the reliability and cost efficiency of the services they deliver; they will need to create and drive the implementation of new digitally delivered business processes, products, and services. This is a very exciting time for IT professionals. Digitization will create new job opportunities and roles that will have a direct impact on the success of your company.


CiscoLive! – It’s About Business Outcomes

Next week I will be attending one of the IT industry's largest conferences, CiscoLive! As a major partner of Cisco, EMC will have a major presence, with a number of our subject matter experts in attendance. I am particularly excited about this year's event. In addition to all the changes happening in the industry, Cisco will be transitioning to a new leadership team led by Chuck Robbins. This conference has traditionally been technology heavy and a premier training and education event for IT practitioners. I'm sure this year will also bring a ton of new technology announcements, specifically around networking and converged infrastructure.

I do think this year we will see more emphasis in the announcements on business outcomes. Every business needs to be digitally enabled to compete going forward. According to the American Enterprise Institute, 89% of the Fortune 500 companies from 1955 do not exist today. No business wants to be the next Kodak or Blockbuster and become irrelevant in the new digital economy. EMC released a study at EMC World a few weeks back that identified the five imperatives most mentioned by business leaders. They are:

  • Predictively spot new opportunities
  • Deliver Personal Experiences
  • Innovate in an Agile Way
  • Operate in Real-Time
  • Demonstrate Transparency and Trust

We then asked these business leaders about the relative importance of these imperatives (green thumbs), and two rose to the top. First, spotting new opportunities, and doing so predictively, requires new analytics capabilities. The second is innovating in agile ways across the entire business, which requires IT to develop new software written and deployed continuously. When we asked the same business leaders about their IT's state of readiness (red thumbs), many did not feel their IT organizations were ready to meet these challenges. These requirements are coming from the business, but IT needs to lead these initiatives. The measure of success for IT has changed from improving an organization's efficiency to leading the success of these new business imperatives.

To be successful, IT will certainly rely on new technologies. Successful companies such as General Electric and Amazon are focused on building mobile, Big Data, and social applications. These applications rely on new development paradigms such as Agile, new development frameworks like Rails and Node.js, and application platforms such as Cloud Foundry, and they will run on cloud infrastructures. Over 90% of all net new applications developed since the beginning of 2014 were built for cloud delivery. Clearly, many IT organizations will need new skill sets and knowledge of these new technologies.

CiscoLive! is one of the major industry conferences where new technologies are introduced and IT practitioners can meet with subject matter experts. This is a great opportunity for practitioners to update their skills and prepare for the new roles IT will need. I expect to see more new technologies delivered in as-a-service and appliance form factors this year. IT needs to be able to quickly lead the development of new products and services to meet these new business imperatives.

To hear more about the future of IT organizations, please plan to attend Rebecca Jacoby's keynote session on Wednesday, where she will be talking with a panel of IT leaders, including EMC Global CTO John Roese, about the future IT organization. Information on watching via webcast is available here. In addition to the keynote, John will be meeting with the EMCElect and CiscoChampions at the EMC booth on Wednesday afternoon at 3pm. More information on the meetup is available on the EMC Community Network here, along with a complete list of EMC activities at CiscoLive! here.


MongoDB World Review

This week I attended MongoDB’s annual user conference, MongoDB World. Many of the attendees were developers focused on mobile, web, and analytics application development. During the first-day keynote, Docker CEO Ben Golub said, “The most interesting people are doing the most interesting things with Open Source technology”. For many of the developers I work with today, open source tools like MongoDB are fundamental to their strategy.

Since MongoDB announced version 3.0 just four months ago, and the acquisition of WiredTiger less than a year ago, I was interested in the community feedback. I found a good number of users had deployed MongoDB 3.0 or at least been testing it. Feedback was very positive, with many seeing significant performance improvements from the WiredTiger engine over MongoDB’s default MMAPv1 engine. Management of MongoDB is also much easier with the new Ops Manager.
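For readers evaluating the upgrade, the storage engine is selected when mongod starts; a minimal configuration sketch is below (the data path and cache size are placeholders to adapt to your environment):

```yaml
# mongod.conf — opting into the WiredTiger engine (MongoDB 3.0+)
storage:
  dbPath: /var/lib/mongodb       # placeholder data directory
  engine: wiredTiger             # 3.0 still defaults to mmapv1
  wiredTiger:
    engineConfig:
      cacheSizeGB: 8             # size the cache to your working set
    collectionConfig:
      blockCompressor: snappy    # on-disk block compression
```

Note that an existing MMAPv1 data directory cannot simply be opened by WiredTiger; data must be migrated, for example via mongodump/mongorestore or a rolling re-sync of replica set members.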

During the breakout sessions and discussions in the solutions expo, I met many customers from some very large companies planning to migrate more database workloads to MongoDB. Typically, MongoDB has been used successfully for a year or two, primarily for new applications. As legacy mission critical applications are updated, many developers I talked to are planning to use MongoDB as the database platform for structured data that was previously stored in Oracle. Many of the new applications will combine structured and unstructured content (e.g., documents, video, …). Many of these developers were looking for enterprise grade storage and data protection infrastructure solutions in preparation for scaling their MongoDB environments.

During the second-day keynote, MongoDB founder and CTO Eliot Horowitz introduced planned features for the 3.2 release coming later this year. Eliot commented that the WiredTiger acquisition accelerated the product roadmap significantly. The most interesting 3.2 features to me were:

  • Data-at-rest encryption: this function comes from the WiredTiger acquisition. It will be configured as a separate storage engine, which will provide flexibility. It will be interesting to see the resource requirements on the server, and whether it will be more efficient to have the function performed by an external storage array for larger deployments.
  • Document Validation: MongoDB will validate that documents match the intended schema at the time they are written.
  • Dynamic Lookups: sounded like SQL left outer joins but Eliot assured the audience it was much different…
  • BI connectors: this will make it easier for analysts to use the most popular analytics tools such as Tableau, Business Objects, and Cognos. Many open source analytics tools are available that interface with MongoDB but require analysts to learn new tools.

MongoDB continues to embrace its open source roots and now has over 9 million downloads. Given all the changes in MongoDB’s management team and the acquisition of WiredTiger over the past year, it was impressive to see the passion of the developer community, with more workloads being targeted for MongoDB.


Extending Scale of Next Generation Databases using Flash Storage

Next generation applications are leveraging new database technologies such as Cassandra, GemfireXD, and MongoDB. These next generation database products use memory to greatly improve processing speed. Many next generation applications, such as real time trading applications, work with relatively small data sets using these new database technologies. As more enterprise and mission critical workloads have started using these databases, new types of IT infrastructure architectures are emerging to support them. Initially, direct attached server storage (DAS) was used to persistently store the data and act as the cold repository behind the fast server memory. As more data has to be serviced from the DAS, performance of the database and application degrades rapidly. Server PCIe flash card technology was used to try to address the need for persistent storage and performance, but the storage capacity of these cards was limited and they are operationally expensive to deploy and maintain. At EMC World this year I wrote about the Memory Centric Architecture prototype using MongoDB that we demonstrated. The prototype demonstrated the ability to leverage flash storage as an extension of memory.

This week is MongoDB World. MongoDB is one of the leading new memory centric database solutions and has been partnering with EMC on a number of projects to create optimized storage solutions that sustain performance at the linear scale modern mobile, web, and Big Data applications require. In addition to the Memory Centric Architecture project, we recently released a joint validated architecture using our new all flash storage array, XtremIO, to provide persistent data storage and consistently high performance with sub-millisecond response times. Leveraging an enterprise storage array like EMC’s XtremIO enables applications using MongoDB to access much larger data sets, up to petabyte scale, while maintaining the high level of performance expected. The XtremIO scale-out architecture mirrors MongoDB’s node scale-out and pluggable storage architecture, allowing customers to start small with just a few nodes and scale capacity and performance linearly. The tested solution and results are available here.

In addition to the benefits of data scale with performance, EMC’s customers expect our solutions to provide data services that improve cost efficiency, protection, and security. XtremIO provides these services, and this test solution validates them with MongoDB. In this test we saw a 22:1 reduction in data storage with MongoDB. This data reduction was achieved using XtremIO’s standard always-on inline de-duplication and compression, combined with our thin provisioning service. This results in significant capital cost savings, in addition to reducing data center space and environmental (power and cooling) operational costs compared to traditional direct attached server storage architectures.
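To put a 22:1 ratio in perspective, a quick back-of-the-envelope calculation (the 10 TB figure is an illustrative assumption, not from the test):

```python
# Logical capacity implied by a data reduction ratio (simple arithmetic).
physical_tb = 10          # illustrative physical flash capacity
reduction_ratio = 22      # 22:1 observed in the MongoDB test
logical_tb = physical_tb * reduction_ratio
print(logical_tb)         # 220 TB of logical data fit in 10 TB of flash
```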

We tested the XtremIO in-memory snapshot functionality with MongoDB and documented the steps necessary to rapidly create clones of the MongoDB database for backup protection or to accelerate application testing. We also validated the XtremIO data-at-rest encryption functionality with MongoDB. For many customers and application use cases today, encryption of data at rest is required, and this allows customers to meet their regulatory data governance requirements at no additional cost.

I am very excited to be attending MongoDB World this week, and it is great to see the industry leveraging enterprise IT solutions for next generation databases like MongoDB. I believe this will accelerate the adoption of these technologies for enterprise mission critical use cases. The validation of MongoDB’s ability to leverage the linear storage capacity and performance scalability, data reduction, protection, and security provided by the leading all flash enterprise storage solution will allow our joint customers to deploy MongoDB based applications with confidence. There will be a great session reviewing this solution on Tuesday (6/2) afternoon. A complete summary of everything EMC will be doing at MongoDB World is available here.


Next Evolution in Converged Infrastructure

Traditional IT

For years, organizations have architected custom IT infrastructures optimized for their application workloads. IT organizations establish standards for server, storage, and network products and spend even more resources installing, configuring, and integrating these products in custom configurations. This practice has many benefits, including IT infrastructures highly optimized for specific application workloads. These solutions tended to be reliable, resilient, and long lasting. The problem is that these customized architectures are very expensive to maintain and cannot be quickly reconfigured for new demands.

Operational cost, lack of agility, and speed are the enemies of enterprise IT today. Many application workloads no longer require customized IT infrastructure architectures. Converged Infrastructure (CI) solutions help reduce the time to deploy as well as the operational costs to maintain IT infrastructure. Converged infrastructure uses software to simplify configuration and improve agility to meet new application needs. EMC and Cisco created the converged infrastructure market in 2009 with the formation of VCE and the Vblock product set. This market is continuing to grow rapidly, with VCE sales doubling in 2014. As the CI market has matured, it has been extended beyond support for just general-purpose workloads. Today there are many types of CI solutions optimized for a specific type of application workload. There is no one architecture optimized for all the different types of application workloads.


On Monday at EMC World you will be hearing more about new application and infrastructure services we are offering as embedded software on our VMAX3 storage array. By embedding these services within the storage array, you can optimize performance and significantly reduce hardware footprint.

We are continuing to enhance both our Vblock and VSPEX CI solutions with the latest hardware and new software enabled functionality like NSX.

In February we released our first common modular building block CI solution with VMware, called VSPEX Blue. In addition to the modular, commodity hardware architecture, we included software enabled enterprise grade backup, disaster recovery replication, and cloud archive storage. Sales for this product are strong. I previously blogged about VSPEX Blue here if you want more details.

This week at EMC World we will be reviewing the next generation of converged infrastructure based on rack scale architectures. Rack scale architectures allow you to scale compute, network, and storage capacity independently. This will greatly increase scalability for larger workloads and consolidation of more workloads. The key innovation required for rack scale CI solutions is automated hardware management and orchestration. EMC has developed hardware management and orchestration (M&O) software that interfaces with bare metal hardware and presents an abstracted interface northbound to higher layer M&O applications. At EMC World we will demonstrate four key functions we have developed to support our rack scale CI product development:

  • Discovery – update master element repository with raw hardware capacity details
  • Description – report on raw assets available
  • Provision – provision and configure operating system
  • Deploy – automate provisioning of infrastructure blue prints

** click on function to access recorded demos by my colleague Tom Capirchio
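A minimal sketch of how those four functions might compose, assuming a hypothetical element repository and node model (none of these names are EMC's actual API):

```python
# Hypothetical sketch of the four M&O functions (discover, describe,
# provision, deploy); all names and structures are illustrative, not EMC's API.

inventory = {}  # master element repository: node id -> raw capacity details

def discover(nodes):
    """Discovery: update the master element repository with raw hardware."""
    for node in nodes:
        inventory[node["id"]] = {"cores": node["cores"], "mem_gb": node["mem_gb"]}

def describe():
    """Description: report on the raw assets available."""
    return {
        "nodes": len(inventory),
        "cores": sum(n["cores"] for n in inventory.values()),
        "mem_gb": sum(n["mem_gb"] for n in inventory.values()),
    }

def provision(node_id, os_image):
    """Provision: install and configure an operating system on one node."""
    inventory[node_id]["os"] = os_image
    return node_id

def deploy(blueprint):
    """Deploy: automate provisioning of an infrastructure blueprint."""
    return [provision(node_id, blueprint["os"]) for node_id in blueprint["nodes"]]

discover([{"id": "r1-n1", "cores": 24, "mem_gb": 256},
          {"id": "r1-n2", "cores": 24, "mem_gb": 256}])
report = describe()
deployed = deploy({"os": "linux", "nodes": ["r1-n1", "r1-n2"]})
```

The point of the layering is that higher level M&O applications only see the abstracted repository and blueprints, never the vendor-specific hardware details underneath.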

This architecture enables the discovery and management of compute (processor cores, memory), network capacity, and storage as pools of element capacity. Systems with a variety of operating systems and compute, storage, and network capacities can be created from these pools. As the demos show, the resource pools can be created from a variety of hardware, from different manufacturers, and with different configurations.

Automation of the hardware discovery, description, provisioning, and deployment of infrastructure blueprints is critical to enabling rack scale CI systems. EMC believes this is an industry challenge. We are planning to use an open source structure to make this architecture and initial work available to the industry to accelerate collaboration. We believe this will result in the creation of standard API interfaces and accelerate the maturation of rack scale converged infrastructure solutions.

As you can see, there is a lot of development in the converged infrastructure market. Starting in 2009 with the formation of VCE, EMC has been an innovator and market leader in converged infrastructure. We see rack scale converged infrastructure as the next evolution, and hardware management and orchestration automation is critical to it. If you are attending EMC World this week, stop by the Innovations @EMC booth (#156) in the solutions pavilion. We would love to hear your feedback.


EMCWorld – Memory Centric Architecture Prototype

Application architectures have long relied on processing data in the memory of the server, close to the CPU, to maximize performance. The challenge has always been that server memory is limited in capacity and expensive. Over the last few years, server memory density has improved and server memory is now measured in terabytes, but with the proliferation of first virtualization and now containerization, many more applications have to share that memory. When the operating system runs out of memory it swaps data to disk, which significantly increases IO latency.

As the data management industry leader, EMC has been working on this problem. First we improved the performance of our storage arrays with flash storage media and introduced automated data placement based on data access patterns. These technologies significantly improved database performance and allowed much greater virtual server density. A few years ago we introduced PCIe flash cards, added to servers to keep a cache of frequently used data close to server memory. Again this had a significant performance impact for many database workloads, but proved difficult for IT teams to manage. We also realized that the storage capacity of these cards was limited and still being outpaced by many applications, and that for many next generation analytics applications the frequency-based data placement algorithms were not optimal.

We believe application architectures' IO appetite is insatiable and the next evolution will require:

  • Server memory tiering
  • Transition of memory management from the operating system to the application

Today we are seeing a new set of high performance persistent storage media with memory-like (NVRAM) performance. These media come at multiple performance, capacity, and cost levels; as you would expect, the higher performance media generally have lower capacity and higher cost. We believe IT infrastructures will expose different tiers of memory pools, characterized by performance and capacity, to applications. Applications such as MongoDB will leverage these memory pool APIs to maximize performance.

Below is a prototype my colleague Ken Taylor has created, showing:

  • Multiple storage media types as an extension of memory
  • A new memory pool API to allow applications to control data placement
  • MongoDB version using the new memory API
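A toy sketch of the idea behind such a memory pool API, with hypothetical tier names and capacities (this is not the prototype's actual interface): allocations land on the fastest tier with room and spill down as tiers fill.

```python
# Toy sketch of application-controlled memory tiering (illustrative only;
# tier names and capacities are hypothetical, not the prototype's API).

class TieredPool:
    def __init__(self, tiers):
        # tiers: list of (name, capacity_gb), ordered fastest first
        self.tiers = [{"name": name, "free_gb": cap} for name, cap in tiers]

    def allocate(self, size_gb):
        """Place data on the fastest tier that still has free capacity."""
        for tier in self.tiers:
            if tier["free_gb"] >= size_gb:
                tier["free_gb"] -= size_gb
                return tier["name"]
        raise MemoryError("no tier has capacity for this allocation")

pool = TieredPool([("dram", 1), ("nvram", 4), ("flash", 100)])
placements = [pool.allocate(1) for _ in range(3)]  # spills from DRAM to NVRAM
```

The design point is that placement decisions move from the operating system's paging logic to the application, which knows which data structures actually need the fastest tier.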

 

If you are interested in this solution Ken will be at the EMC World Solutions Expo (booth #156) to meet with you. Ken is looking for application developers, and application vendors to extend this prototype for additional applications.


EMC 2 TIERS™ Solution Prototype

Customers are accelerating their investment in the development of new applications that ingest and analyze large amounts of data quickly. We will be demonstrating a tiered storage architecture prototype, 2 TIERS™, at the EMC World Solutions Exchange at booth #156. We will be demonstrating this architecture for data and compute intensive applications in Life Sciences, Oil and Gas, SDN WAN Video Distribution, and Weather Sciences. These applications generally use a POSIX or NFS I/O interface, or more recently HDFS. They require high bandwidth and low latency IO, in addition to needing to store massive amounts of data.

We have found that it is more and more challenging for a conventional storage array to provide both the performance and capacity scale required by these types of applications, at an affordable cost. As a result, new rack scale storage solutions are emerging to provide a high performance storage tier (hot edge) and cost effective capacity tier (capacity core).

We first demonstrated our 2 TIERS™ storage solution prototype in December at the Big Moves Forum. We have continued to refine the architecture and expanded support for traditional client-server workloads in addition to next generation analytics workloads. One key requirement of our solution architecture is the ability to scale the Hot Edge and Capacity Core independently based on application need. Typically we see the capacity needed in the performance tier in the hundreds of terabytes range; this capacity is very high performance and expensive. For the Capacity Core we are designing for ranges of hundreds of petabytes. The Capacity Core requires only internet-speed performance and low cost per GB, preferably leveraging data efficiency services such as de-duplication and compression.

We are uniquely addressing a major characteristic of these applications: they require high performance for their metadata, not just their data. A significant challenge with current Hierarchical Storage Management (HSM) based tiering is hot edge space exhaustion, as the number of links to the cold data increases over time. 2 TIERS™ ensures that consistent subsets of both the metadata and data are tiered between the hot edge and the capacity core. All namespaces associated with currently running jobs need to fit in the Fast Tier. After a job ends, its subset of the namespace is evicted from the Fast Tier if no other job is sharing any part of it.
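That eviction rule can be modeled in a few lines; this is an illustrative sketch of the policy as described, not the 2 TIERS™ implementation, and the paths are hypothetical:

```python
# Toy model of the namespace tiering policy (illustrative only).

fast_tier = set()   # namespace entries currently staged in the hot edge
running = {}        # job id -> the namespace subset that job uses

def start_job(job_id, namespace):
    """Stage the job's metadata and data namespace subset into the fast tier."""
    running[job_id] = set(namespace)
    fast_tier.update(namespace)

def end_job(job_id):
    """Evict the job's subset unless another running job shares part of it."""
    finished = running.pop(job_id)
    still_needed = set().union(*running.values()) if running else set()
    fast_tier.difference_update(finished - still_needed)

start_job("j1", ["/data/a", "/data/shared"])
start_job("j2", ["/data/b", "/data/shared"])
end_job("j1")  # "/data/a" is evicted; "/data/shared" stays while j2 runs
```

Tying residency to running jobs, rather than to per-file heat as in classic HSM, is what keeps the hot edge from filling up with stale links over time.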

We have architected the 2 TIERS™ storage solution to not require specialized hardware, which maximizes flexibility and minimizes cost. The architecture can still take advantage of specialized hardware: for example, a cluster of server-based flash cards pooled and presented by ScaleIO could provide the hot edge capacity, and even better performance and capacity can be achieved using purpose built storage appliances such as EMC DSSD. This solution will easily incorporate the next generation of storage systems regardless of their specific hardware.

We are seeing great performance and cost benefits of our 2 TIERS™ storage solution prototype for IO and capacity intensive applications. We will have a complete 2 TIERS™ prototype system running applications built by our EMC Center of Excellence teams from Skolkovo (Russia), Rio (Brazil), and Cork (Ireland) at our EMC World Solutions Exchange booth (#156) along with our subject matter experts from our CTO Fast Data Group.

 


EMC Office of CTO @EMCWorld


EMC World 2015 gets started next week. We have the perfect theme for this year's EMC World: Redefine.next. Transformation is occurring all across IT. Based on the latest customer feedback collected by our CTO Ambassadors, EMC customers are accelerating their investment in mobile, big data, and web based applications while focusing on reducing the costs of operating traditional client-server applications. Most customers are developing these applications using open source platforms. IT infrastructure is evolving to meet these new application demands, with accelerating adoption of flash storage, in-memory databases, and HDFS data lakes.

The EMC Office of the CTO is at the center of many of EMC's product and solution transformations and has its biggest and most visible role ever planned for EMC World. We are starting the week by hosting our Technical Advisory Board meeting, where we regularly meet with industry and subject matter experts to review EMC's technology strategy and hear their feedback. I will write more about the Technical Advisory Board in an upcoming post.

EMC's Global CTO, John Roese, will be joining Pivotal CEO Paul Maritz during the general session keynote to talk about the impact of open source software on enterprise IT. This year the general session keynotes will be webcast live, so you can hear all about the exciting product announcements and EMC's point of view; you can register for the webcasts here. Immediately following, John will host a customer roundtable with thought leaders from some of EMC's biggest customers and partners. After the roundtable, John will meet with EMCElect and our CTO Ambassadors to talk technology and answer questions. Please RSVP for the EMCElect meeting here. Finally, at 3pm on Tuesday, John will be interviewed on EMC TV.

One of the major reasons so many technologists attend EMC World is the breakout sessions. We have six great breakout sessions scheduled. I have been fortunate to participate in the dry run preparations for these sessions, and the content is awesome. Make sure you add these breakouts to your calendar.

| Session ID | Title | Day | Start Time | End Time | Track |
| --- | --- | --- | --- | --- | --- |
| octoST.01 | Understanding How The Future Of CI Will Change How You Consume Technology | Mon | 1:30pm | 2:30pm | IT Leadership |
| octoST.02 | Fiber, Fabrics & Flash: The Future Of Storage Networking | Tue | 4:30pm | 5:30pm | IT Leadership |
| octoST.03 | Flash: The Myth, The Media, The Magic | Tue | 8:30am | 9:30am | IT Leadership |
| octoTT.01 | OpenStack @ EMC: A Holistic Primer On The Stack You Need For Tomorrow, Today | Mon | 8:30am | 9:30am | Technology |
| octoTT.01 | OpenStack @ EMC: A Holistic Primer On The Stack You Need For Tomorrow, Today | Tue | 12:00pm | 1:00pm | Technology |
| octoTT.02 | A Disaggregated World: CI For The Future | Tue | 8:30am | 9:30am | Technology |
| octoTT.02 | A Disaggregated World: CI For The Future | Wed | 12:00pm | 1:00pm | Technology |
| octoTT.03 | Hyperscale Infrastructure Is Moving To A Simplified Data Architecture: The 2 TIERS Model | Mon | 1:30pm | 2:30pm | Technology |
| octoTT.03 | Hyperscale Infrastructure Is Moving To A Simplified Data Architecture: The 2 TIERS Model | Wed | 8:30am | 9:30am | Technology |

This year we will be hosting a booth in the Solutions Expo. Our theme is Innovation @EMC. We will be featuring three of our most interesting Advanced Development projects:

  • Two Tiers storage solution
  • Provisioning and orchestration of bare metal infrastructure
  • Memory Centric Architecture


We will have our subject matter expert technologists available at the booth (#156) to review these projects and discuss EMC's approach to these challenges. All three of these project teams are interested in engaging directly with customers to collaborate on the next phases and to pilot their work with real world workloads. When you visit our booth, our staff will have Live Long and Innovate (#llai) logo t-shirts and press-on tattoos for you to show off your innovation spirit. I will have additional blog posts on each of these projects later this week.

As you can see, there are a lot of exciting activities planned for EMC World 2015. With the tremendous transformation our customers are experiencing, the EMC Office of the CTO has never been more relevant. I have several more blog posts planned for this week with more details on the Advanced Development projects that will be featured in our booth, and I will be blogging next week during the show as well.


Using Big Data to Optimize IT Infrastructure Operations

In 2015 we will generate more new digital content than ever before. According to the most recent EMC Digital Universe study, the amount of digital content we are managing will grow by 10x by the end of this decade. As a result, IT infrastructures are continuing to expand and become more complex, making it more difficult, if not impossible, to continue to use manual processes to maintain IT security and optimize workload placement.

Cybersecurity has been a major public concern after numerous recent data breaches suffered by companies including JPMorgan Chase & Co., Target Corp., Sony Pictures Entertainment, and most recently health insurer Anthem. A new approach is needed, and as a major enterprise technology provider, EMC is where our customers look for help.

Two of the leading researchers in IT analytics are Yael Villa, PhD, and Alon Kaufman, PhD. Next week they will be presenting next generation IT analytics solutions at Mobile World Congress. Next generation IT security and workload placement leverage Big Data technology. I had the opportunity to talk with Yael and Alon this week.

What are the major challenges for Cloud Service Providers?

Yael/Alon: As the size and complexity of infrastructures continue to increase at an exponential rate, cloud service providers face the overwhelming and unprecedented challenge of capturing, managing, processing, and analyzing huge amounts of operational data in order to secure, optimize, and reduce the cost of their services. Every operator is searching for new ways to increase revenues and profits during a time of stagnant growth in the industry. A new approach is needed. Cloud Service Providers are racing to take advantage of new data analytics technologies to deliver services faster, more securely, and at lower cost. Most operators conduct analytics programs that enable them to use their internal data to boost the efficiency of their networks, segment customers, and drive profitability, with some success. But the potential of big data poses a different challenge: how to combine much larger amounts of information to increase revenues and profits across the entire value chain, from network operations to product development to marketing, sales, and customer service, and even to monetize the data itself.

Some examples of use cases that can be addressed by Big Data and Data Science approaches:

  • Optimizing routing and quality of service by analyzing network traffic in real time
  • Analyzing call data records in real time to identify fraudulent behavior immediately
  • Allowing call center reps to flexibly and profitably modify subscriber calling plans immediately
  • Tailoring marketing campaigns to individual customers using location-based and social networking technologies
  • Using insights into customer behavior and usage to develop new products and services
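To make the fraud-detection use case above concrete, here is a small illustrative sketch, not an EMC or RSA algorithm: flag a subscriber whose consecutive call data records imply physically impossible travel between call locations, a classic SIM-cloning signal. All field names and thresholds are invented for illustration.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def flag_impossible_travel(records, max_kmh=900):
    """records: (subscriber, hour, lat, lon) tuples sorted by time.
    Returns subscribers whose implied travel speed exceeds max_kmh."""
    last = {}          # subscriber -> (hour, lat, lon) of previous call
    suspects = set()
    for sub, hour, lat, lon in records:
        if sub in last:
            prev_hour, plat, plon = last[sub]
            elapsed = hour - prev_hour
            if elapsed > 0 and haversine_km(plat, plon, lat, lon) / elapsed > max_kmh:
                suspects.add(sub)
        last[sub] = (hour, lat, lon)
    return suspects

cdrs = [("alice", 0, 40.7, -74.0),   # New York
        ("bob",   0, 51.5, -0.1),    # London
        ("alice", 1, 34.0, -118.2),  # Los Angeles one hour later: impossible
        ("bob",   9, 48.9, 2.35)]    # Paris nine hours later: plausible
print(flag_impossible_travel(cdrs))  # {'alice'}
```

A production system would run this logic over a streaming platform rather than a Python list, but the per-record state it keeps (the last known location per subscriber) is the same.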

The cloud service providers that integrate Big Data and Data Science into their operations will be the most successful.

Why is the EMC Federation a unique partner for Cloud Service Providers?

Yael/Alon: The EMC Federation is a unique partner for cloud service providers, with many years of experience providing the storage, identity management, virtualization, and management products that power many of the services available today. The massive IT infrastructure scale supported by cloud service providers presents unique challenges compared to traditional enterprise customers. As our industry transitions to more mobile, Big Data, and web applications, the demand for the IT services that support these applications will only grow. Few IT providers have our knowledge and history of success.

The EMC Federation understands the challenges cloud service providers are facing and has been adjusting our product portfolio with new software defined infrastructure and next generation analytics tools from Pivotal. In 2015, we are bringing these products together to deliver complete solutions. Our Security Analytics Suite is one example of an EMC Federation solution. We are leveraging sophisticated RSA security algorithms in combination with the Pivotal Big Data analytics suite to analyze the massive amounts of data generated by your infrastructure in near real time and identify behavioral anomalies. Automation reduces the risk and cost of securing the next generation of services customers are demanding.
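The core idea behind behavioral anomaly detection of this kind can be sketched in a few lines. This is a hypothetical simplification, not the RSA/Pivotal implementation: compare each time window's event count against a rolling baseline and flag windows that deviate by several standard deviations.

```python
from statistics import mean, stdev

def anomalous_windows(event_counts, baseline=10, threshold=3.0):
    """Return indices of time windows whose event count is more than
    `threshold` standard deviations above the preceding `baseline`
    windows. A real system would stream this per user, host, or service."""
    flagged = []
    for i in range(baseline, len(event_counts)):
        history = event_counts[i - baseline:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and (event_counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Steady login rates per window, then a sudden spike
# (e.g. a credential-stuffing burst) at the final window.
counts = [50, 52, 48, 51, 49, 50, 53, 47, 52, 50, 300]
print(anomalous_windows(counts))  # [10]
```

Production systems replace the z-score with far more sophisticated models, but the pattern, learn a baseline of normal behavior and alert on deviations, is the same one described above.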

Big Data analytics is not just limited to improving security. We are leveraging these same architectures to optimize workload placement based on application requirements and the current infrastructure load. The ability for a cloud service provider to move workloads around to guarantee performance at the lowest cost is critical for many next generation solutions. Next generation IT infrastructures will be highly automated, using Big Data and Data Science to reduce risk, improve service availability, and reduce cost.
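A minimal sketch of automated workload placement, purely illustrative and not an EMC product algorithm: place each workload, largest first, on the host with the most free capacity that still satisfies its CPU and memory requirements. Host and workload names are invented.

```python
def place_workloads(hosts, workloads):
    """hosts: {name: {"cpu": free_cores, "mem": free_gb}}
    workloads: [{"name": ..., "cpu": ..., "mem": ...}]
    Returns {workload_name: host_name}, or None if nothing fits."""
    placement = {}
    # Place the biggest workloads first (a common bin-packing heuristic).
    for w in sorted(workloads, key=lambda w: -(w["cpu"] + w["mem"])):
        fits = [h for h, free in hosts.items()
                if free["cpu"] >= w["cpu"] and free["mem"] >= w["mem"]]
        if not fits:
            placement[w["name"]] = None
            continue
        # Prefer the host with the most remaining headroom.
        best = max(fits, key=lambda h: hosts[h]["cpu"] + hosts[h]["mem"])
        hosts[best]["cpu"] -= w["cpu"]
        hosts[best]["mem"] -= w["mem"]
        placement[w["name"]] = best
    return placement

hosts = {"h1": {"cpu": 8, "mem": 32}, "h2": {"cpu": 4, "mem": 16}}
jobs = [{"name": "db", "cpu": 4, "mem": 16},
        {"name": "web", "cpu": 2, "mem": 4},
        {"name": "batch", "cpu": 6, "mem": 24}]
print(place_workloads(hosts, jobs))
# {'batch': 'h1', 'db': 'h2', 'web': 'h1'}
```

Real placement engines weigh far more signals (observed rather than declared demand, affinity rules, cost), which is where the Big Data architectures discussed above come in: they supply the live telemetry the placement decision consumes.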


It is very exciting to see the EMC Federation product portfolio being leveraged to provide differentiated solutions for cloud service providers. I believe Big Data and Data Science are going to be key to creating the next leap in IT automation. The EMC Federation is one of the few partners that offers the breadth of products, experience, and expertise the cloud service provider industry needs. I am looking forward to Yael and Alon's presentation at MWC on Wednesday, March 4th at 2pm, and to further discussions on this topic.