Data-centric computing is an emerging concept that has relevance in information architecture and data center design. It describes an information system where data is stored independently of the applications, so that applications can be upgraded without costly and complicated data migration. This is a radical shift in information systems that will be needed to address organizational needs for storing, retrieving, moving and processing exponentially growing data sets.[1]
Traditional information system architectures are based on an application-centric mindset. In this model, applications were installed, kept relatively static, updated infrequently, and used a fixed set of compute, storage, and networking elements to cope with a relatively small set of structured data.[2]
This approach functioned well for decades, but over the past decade, data growth, particularly unstructured data growth, has put new pressures on organizations, information architectures and data center infrastructure. Ninety percent of new data is unstructured and, according to a 2018 report, 59% of organizations manage over 10 billion files and objects[3] spread over large numbers of servers and storage nodes. Organizations are struggling to cope with exponential data growth while seeking better approaches to extracting insights from that data using services including Big Data analytics and machine learning. However, existing architectures are not built to address service requirements at petabyte scale and beyond without significant performance limits.[4]
Traditional architectures fail to fully store, retrieve, move and utilize that data because of limitations in hardware infrastructure as well as in application-centric systems design, development, and management.[5]
Data-centric workloads
There are two problems data-centric computing aims to address.
Data-centric computing
Data-centric computing is an approach that merges innovative hardware and software to treat data, not applications, as the permanent source of value.[8] Data-centric computing aims to rethink both hardware and software to extract as much value as possible from existing and new data sources. It increases agility by prioritizing data transfer and data computation over static application performance and resilience.
Data-centric hardware and software
To meet the goals of data-centric computing, data center hardware infrastructure will evolve to address massive scale, rapid growth, the need for very high performance data movement, and extensive calculation requirements.
On the software side, data-centric computing accelerates the disappearance of traditional static applications.[12] Applications become short-lived, constantly added, updated, or removed as algorithms come and go. Software is redesigned to conduct analysis on all available data instead of subsets. Microservices visit data, conduct calculations and report their results at speeds beyond those of conventional approaches.
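The idea of short-lived services visiting long-lived data can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from any particular product or from the cited sources: the `DataStore` class, its `append`, `scan` and `visit` methods, the `temp_c` field, and the two example services are all hypothetical names.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, float]


class DataStore:
    """Hypothetical long-lived data owner; it knows nothing about any application."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, record: Record) -> None:
        # Store a copy so callers cannot mutate the stored data afterwards.
        self._records.append(dict(record))

    def scan(self) -> Iterable[Record]:
        # Hand out copies of every record: services see all data, not a subset.
        for record in self._records:
            yield dict(record)

    def visit(self, service: Callable[[Iterable[Record]], float]) -> float:
        # A service "visits" the data, computes a result, and is then discarded.
        return service(self.scan())


def mean_temperature(records: Iterable[Record]) -> float:
    """Short-lived service (illustrative): average of the 'temp_c' field."""
    values = [r["temp_c"] for r in records if "temp_c" in r]
    return sum(values) / len(values) if values else 0.0


def max_temperature(records: Iterable[Record]) -> float:
    """A second service added later; the stored data is not migrated or changed."""
    values = [r["temp_c"] for r in records if "temp_c" in r]
    return max(values) if values else 0.0


store = DataStore()
for reading in (20.0, 22.0, 24.0):
    store.append({"temp_c": reading})

mean_result = store.visit(mean_temperature)
max_result = store.visit(max_temperature)
print(mean_result, max_result)
```

In this sketch the store outlives both services: either function can be replaced or removed without touching the records, which is the property the data-centric approach emphasizes. A real system would add persistence, access control and distribution across nodes, none of which is modeled here.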