Why data virtualisation is critical to a successful decentralised data strategy

by Ravi Shankar, Senior VP, Denodo

It is important for organisations to gain data accessibility and not just data availability. That is, the people who rely on data must be able to access what they need at a moment’s notice to be competitive and effective in today’s fast-moving business environment.

For years, we have been trying to democratise data, making it more accessible for analysis so as to deliver more value. Yet centralised data architectures that consolidate data in a single place, such as a data warehouse or data lake, tend to hinder easy access to data.

One limitation of a centralised data architecture is inflexibility: a single central repository cannot accommodate the needs of every department within a large organisation. Another limitation is slow data provisioning, since it takes time to centralise data from multiple sources.

Zhamak Dehghani, Director of Emerging Technologies at Thoughtworks, recently introduced the idea of a decentralised data infrastructure called a 'data mesh' to resolve these problems. A data mesh is designed to move organisations from monolithic architectures such as data warehouses and data lakes to decentralised architectures.

In a data mesh architecture, organisational units called ‘domains’ are responsible for managing and exposing their own data to the rest of the organisation. The key benefit is that this approach cuts down on the processing required for data since domains have a better understanding of how their data should be used. Additionally, the approach offers domains the autonomy to use the best tools for their requirements.

While data mesh is a promising new architecture, the decentralised approach to data management may introduce challenges like data silos and data duplication. This is where data virtualisation comes in. Data virtualisation is a modern data integration and data management technology that complements and strengthens the data mesh concept, without requiring legacy systems or hardware to be replaced.

Data virtualisation allows for unified data access, unlike extract, transform and load (ETL) processes and other batch-oriented approaches that centralised and decentralised architectures currently require. Data virtualisation adds an enterprise-wide layer above an organisation’s diverse data sources. Data consumers simply query the data virtualisation layer, which retrieves the necessary data from the underlying sources automatically. This shields consumers from the complexities of where the data is stored and how it is accessed.
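To make the idea concrete, here is a minimal Python sketch of a virtual layer sitting above two sources. It is an illustration only and not how the Denodo Platform or any other product works: the VirtualLayer class, the view names and the sample records are all hypothetical, with an in-memory SQLite table standing in for an operational database and a plain list standing in for a SaaS application.

```python
import sqlite3
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class VirtualLayer:
    """Hypothetical virtual layer: it stores how to fetch data, never the data."""

    def __init__(self) -> None:
        self._views: Dict[str, Callable[[], Iterable[Row]]] = {}

    def register(self, view_name: str, fetch: Callable[[], Iterable[Row]]) -> None:
        # Only the fetch function (metadata about access) is kept here.
        self._views[view_name] = fetch

    def query(self, view_name: str, **filters: object) -> List[Row]:
        # Data is pulled from the source at query time and filtered on the fly.
        rows = self._views[view_name]()
        return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


def fetch_orders(conn: sqlite3.Connection) -> List[Row]:
    cur = conn.execute("SELECT id, customer, total FROM orders ORDER BY id")
    cols = [c[0] for c in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]


# Source 1: an in-memory SQLite table standing in for an operational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "acme", 120.0), (2, "globex", 75.5)],
)

# Source 2: a plain list standing in for records held in a SaaS application.
CRM_RECORDS: List[Row] = [
    {"customer": "acme", "region": "APAC"},
    {"customer": "globex", "region": "EMEA"},
]

layer = VirtualLayer()
layer.register("orders", lambda: fetch_orders(conn))
layer.register("customers", lambda: list(CRM_RECORDS))

# Consumers use one interface and never touch the sources directly.
acme_orders = layer.query("orders", customer="acme")
acme_profile = layer.query("customers", customer="acme")
```

Because nothing is copied into the layer, a row added to the source table after registration shows up in the next query without any batch load.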

While data virtualisation layers do not contain actual data, they store the metadata that domains need to access data across multiple sources. This also enables organisations to automate role-based security and data governance protocols across the organisation from a single point of control.
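As a rough sketch of what a single point of control can look like, the Python snippet below keeps all access rules in one policy table and applies them to rows on their way out to the consumer. The POLICIES structure, the role names and the column names are assumptions made for illustration; a real platform would manage such rules through its own governance tooling.

```python
from typing import Dict, List, Set

Row = Dict[str, object]

# Hypothetical central policy table: view -> role -> columns that role may see.
POLICIES: Dict[str, Dict[str, Set[str]]] = {
    "customers": {
        "analyst": {"customer", "region"},
        "finance": {"customer", "region", "credit_limit"},
    }
}


def apply_policy(view: str, role: str, rows: List[Row]) -> List[Row]:
    """Return rows with only the columns the given role is allowed to read."""
    allowed = POLICIES.get(view, {}).get(role)
    if allowed is None:
        # Roles without an entry for the view are denied outright.
        raise PermissionError(f"role {role!r} may not read view {view!r}")
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]


rows: List[Row] = [{"customer": "acme", "region": "APAC", "credit_limit": 50000}]

analyst_view = apply_policy("customers", "analyst", rows)
finance_view = apply_policy("customers", "finance", rows)
```

Changing a rule in the one policy table changes what every consumer of that view can see, regardless of which underlying source holds the data.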

Data virtualisation enables domains to create and implement virtual models quickly from any data source, without having to understand the complexities of the sources that feed them. By minimising replication, it also speeds up the creation of multiple versions of data products.

For example, the Denodo Platform provides “out of the box” data products that support features like data lineage tracking, self-documentation, change impact analysis, identity management and single sign-on (SSO) – which help to simplify and speed up the development of data products. 

These data products can then be made accessible via a flexible array of methods such as SQL, REST, OData, GraphQL, or MDX – and because developers do not need to write any code, the data products can be easily and automatically published in an organisation-wide data product catalogue. By centrally storing metadata, data virtualisation layers provide all the necessary ingredients for full-featured, comprehensive catalogues to an organisation’s data assets, organised by domain.
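The following Python sketch shows how centrally held metadata could be turned into a catalogue organised by domain. It is an assumption-laden illustration: the DataProduct fields, the product names and the helper functions are hypothetical and do not reflect any vendor's catalogue format.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical metadata record for one data product."""
    name: str
    domain: str
    description: str
    interfaces: Tuple[str, ...]  # e.g. ("SQL", "REST")


def build_catalogue(products: Iterable[DataProduct]) -> Dict[str, List[DataProduct]]:
    """Group product metadata by owning domain, sorted for stable browsing."""
    catalogue: Dict[str, List[DataProduct]] = defaultdict(list)
    for product in sorted(products, key=lambda p: (p.domain, p.name)):
        catalogue[product.domain].append(product)
    return dict(catalogue)


def find_by_interface(catalogue: Dict[str, List[DataProduct]], interface: str) -> List[str]:
    """List the names of products that expose the given access method."""
    return sorted(
        p.name
        for products in catalogue.values()
        for p in products
        if interface in p.interfaces
    )


products = [
    DataProduct("customer_360", "sales", "Unified customer view", ("SQL", "REST")),
    DataProduct("invoice_summary", "finance", "Monthly invoice totals", ("SQL", "OData")),
    DataProduct("pipeline_forecast", "sales", "Quarterly sales forecast", ("GraphQL",)),
]

catalogue = build_catalogue(products)
sql_products = find_by_interface(catalogue, "SQL")
```

Since the catalogue is derived from metadata alone, it can be rebuilt whenever a domain publishes or retires a product, without touching the underlying data.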

Another essential benefit of data virtualisation is that it allows domains to autonomously select and independently scale the data sources that best complement their products and suit their specific needs. For instance, many business units already have data analytics systems in place, which they can reuse with ease. The business units will also be able to reuse applications tailored for software-as-a-service (SaaS) use, and operate their own data marts. Organisations can also leverage data virtualisation to prevent conflicts with other internal processes and ensure adequate performance.

However, it is important to point out that data virtualisation does not replace monolithic repositories like data warehouses and data lakes; these data repositories remain dominant sources for certain data products. In such cases, having a data virtualisation layer on top of physical data repositories ensures that data products are still accessed through the virtual layer and are still governed by the same protocols applicable to the rest of the data mesh.
