Data Virtualization

Data virtualization is any approach to data management that allows an application to retrieve and manipulate data without requiring technical details about the data, such as how it is formatted or where it is physically located.

Unlike the traditional extract, transform, load (“ETL”) process, the data remains in place, and consumers are given real-time access to it in the source system. This reduces the risk of data errors and avoids the workload of moving data that may never be used.

Unlike data federation, it does not attempt to impose a single data model on the data, which may be heterogeneous. The technology also supports writing transaction data updates back to the source systems.

To resolve differences in source and consumer formats and semantics, various abstraction and transformation techniques are used.
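As a rough illustration of resolving format and semantic differences at query time, the sketch below maps rows from two sources onto one consumer-facing schema. The source names (CRM_ROWS, BILLING_ROWS), field names, date formats, and units are all invented for the example and are not taken from any particular product.

```python
from datetime import datetime

# Hypothetical source records: the two systems disagree on field names,
# date formats, and currency units.
CRM_ROWS = [{"cust_id": "17", "signup": "2021-03-05", "revenue_usd": 1200.0}]
BILLING_ROWS = [{"CustomerNo": 17, "SignupDate": "05/03/2021", "RevenueCents": 45000}]


def from_crm(row):
    """Map a CRM row onto the consumer-facing schema."""
    return {
        "customer_id": int(row["cust_id"]),
        "signup_date": datetime.strptime(row["signup"], "%Y-%m-%d").date(),
        "revenue": float(row["revenue_usd"]),
    }


def from_billing(row):
    """Map a billing row onto the same schema (day-first dates, cents)."""
    return {
        "customer_id": int(row["CustomerNo"]),
        "signup_date": datetime.strptime(row["SignupDate"], "%d/%m/%Y").date(),
        "revenue": row["RevenueCents"] / 100,
    }


def virtual_customers():
    """Yield rows in the common schema; the source data is not copied or stored."""
    for row in CRM_ROWS:
        yield from_crm(row)
    for row in BILLING_ROWS:
        yield from_billing(row)
```

Because the mapping runs when a consumer iterates over virtual_customers(), any change in a source is visible on the next request, and the sources keep their own formats.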

Data virtualization, as both a concept and a class of software, is a subset of data integration and is commonly used within business intelligence, service-oriented architecture data services, cloud computing, enterprise search, and master data management.


Features

Data virtualization software is an enabling technology that provides some or all of the following capabilities:

  • Abstraction – Abstract the technical aspects of stored data, such as location, storage structure, API, access language, and storage technology.
  • Virtualized Data Access – Connect to different data sources and make them accessible from a common logical data access point.
  • Transformation – Transform, reformat, and improve the quality of source data for consumer use.
  • Data Federation – Combine result sets from across multiple source systems.
  • Data Delivery – Publish result sets as views and/or data services that are executed by client applications or users when requested.

Data virtualization software may also include functions for development, operation, and/or management.
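The capabilities above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how any data virtualization product works: an in-memory SQLite database and a CSV string stand in for two real source systems, and the names SOURCES, query, and spend_by_customer are hypothetical.

```python
import csv
import io
import sqlite3

# Source 1: a relational database (in-memory SQLite stands in for a real system).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 20.0), (2, 5.5), (1, 4.5)])

# Source 2: a CSV export (a string stands in for a remote file).
CUSTOMERS_CSV = "id,name\n1,Ada\n2,Grace\n"


def read_orders():
    """Fetch order rows from the database at request time."""
    cur = db.execute("SELECT customer_id, amount FROM orders")
    return [{"customer_id": c, "amount": a} for c, a in cur]


def read_customers():
    """Parse customer rows from the CSV at request time."""
    reader = csv.DictReader(io.StringIO(CUSTOMERS_CSV))
    return [{"customer_id": int(r["id"]), "name": r["name"]} for r in reader]


# Abstraction: consumers refer to logical names, not to storage details.
SOURCES = {"orders": read_orders, "customers": read_customers}


def query(name):
    """Virtualized data access: one logical entry point for every source."""
    return SOURCES[name]()


def spend_by_customer():
    """Federation and delivery: combine both sources into one result set."""
    names = {c["customer_id"]: c["name"] for c in query("customers")}
    totals = {}
    for order in query("orders"):
        key = names[order["customer_id"]]
        totals[key] = totals.get(key, 0.0) + order["amount"]
    return totals
```

Calling spend_by_customer() reads both sources at that moment, so a row inserted into the orders table afterwards shows up in the next call without any copy or reload step.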

Benefits

  • Reduce risk of data errors
  • Reduce system workload by not moving data around
  • Provide faster, real-time access to data
  • Significantly reduce development and support time
  • Increase governance and reduce risk through the use of policies
  • Reduce data storage required