In 2014, Big Data & Analytics was set to attract considerable interest from organizations. In 2015, Big Data & Analytics technologies and techniques continue to spread within public and private organizations, which are entering a phase of massive experimentation. Digitalization now reaches all kinds of data, while the economic crisis pushes organizations to turn as much of that data as possible into high-value business information.
Organizations enter this experimental phase often without waiting for all the strategic, organizational and technological questions raised by Big Data to be resolved. They look for empirical yet relevant use cases so that, even if an experiment fails, valuable knowledge can be drawn from it and recorded.
The growth of data volumes inside the enterprise raises storage and processing infrastructure issues. However, the most important problems are how to align applications (i.e. algorithm structure) with massively parallel data processing, and how to govern data. Data mining algorithms, used in conjunction with parallelization techniques from the High Performance Computing (HPC) domain, are expected to address the first issue. Data governance is harder to solve because it involves business processes and the organization of operational teams. The main governance questions are how to locate the right data and how to determine its current and future value.
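To make the idea of aligning algorithm structure with parallel processing concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a method taken from any particular platform: the names `Partial`, `summarize`, `merge` and `parallel_mean_variance` are hypothetical, and a thread pool stands in for what would be a process pool or a cluster in a real deployment. The point is that the statistic is restructured into per-chunk summaries plus an associative merge, so each chunk can be processed independently.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import NamedTuple, Sequence, Tuple


class Partial(NamedTuple):
    """Summary of one data chunk: count, sum, and sum of squares."""
    n: int
    total: float
    total_sq: float


def summarize(chunk: Sequence[float]) -> Partial:
    # Map step: each worker only needs its own chunk.
    return Partial(len(chunk), sum(chunk), sum(x * x for x in chunk))


def merge(a: Partial, b: Partial) -> Partial:
    # Reduce step: associative, so partial results can be combined in any order.
    return Partial(a.n + b.n, a.total + b.total, a.total_sq + b.total_sq)


def parallel_mean_variance(data: Sequence[float], n_chunks: int = 4) -> Tuple[float, float]:
    """Return (mean, population variance) computed from per-chunk summaries."""
    if not data:
        raise ValueError("data must not be empty")
    size = -(-len(data) // n_chunks)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # Assumption: a thread pool is used only to show the structure; swap in a
    # process pool or a distributed executor for real CPU-bound workloads.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(summarize, chunks))
    s = reduce(merge, partials, Partial(0, 0.0, 0.0))
    mean = s.total / s.n
    variance = s.total_sq / s.n - mean * mean
    return mean, variance
```

Calling `parallel_mean_variance([1, 2, 3, 4, 5, 6, 7, 8])` splits the list into four chunks, summarizes each one independently, and merges the summaries. The sum-of-squares formula is chosen for brevity; it can lose precision on large or badly scaled data, where a numerically stable merge would be preferable.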
Finding value in information enables new business models to emerge. Established enterprises and new pure players develop services based on data aggregation, and some vendors offer data acquisition, data enrichment and data analysis.
In some business sectors (e.g. the pharmaceutical industry and research), organizations collaborate by sharing more and more data in order to enrich their respective offers and services. Current and future developments in research and Big Data & Analytics call for new models of research computing: a sustainable, collaborative, elastic, distributed model that promises to overcome legacy barriers and open new avenues in the research sciences.
For example, the model of Arizona State University integrates the following foundational components:
- Research centers without walls, which are collaboration-centric and based on hybrid Cloud technology
- Open Big Data frameworks that support global metadata management for digital curation of genomics data
- Dedicated bandwidth, free of policy or capacity restrictions, supporting research collaboration among organizations
- OpenFlow-based network components and software-defined networking, with dedicated bandwidth, supporting research collaboration
- Programmable and pluggable architectures for the next generation of research computing
- Efficient and effective touch-less network management, free from the bottlenecks of legacy hardware-defined networking
- Research-as-a-Service workload-based provisioning for holistically defined scientific challenges
Together, these components form the Next Generation Cyber Capability (NGCC) Big Data & Analytics platform, a catalyst for open innovation.