Recently, International Data Corporation (IDC) released the Report on the Evolution Trends of Data Governance in the AI Era, naming KeenData a representative enterprise leading the upgrade of data governance in the AI era. Against the backdrop of rapidly developing Generative AI (GenAI) technology and profound change in the data governance field, KeenData has built an end-to-end governance system covering data standards, quality, and security by deeply integrating data engineering with governance. The system delivers full-lifecycle governance and security-compliance assurance from model construction to AI deployment, setting an industry benchmark for the shift from passive, reactive governance to proactive, real-time governance.
The Rise of GenAI: New Challenges and Opportunities for Data Governance
In recent years, the emergence of GenAI has led enterprises to place far greater importance on data democratization. Traditional deep learning and machine learning solutions rely mainly on internal enterprise data for model construction and training, whereas GenAI markedly improves model outputs by combining external foundation models with an enterprise's own business data.
IDC's GenAI ARC Survey shows that 83% of organizations agree that GenAI models grounded in an enterprise's own business data bring significant advantages. Data, however, is one of the key ingredients of any GenAI outcome: if the data used for training and inference is inaccurate, low-quality, insufficient in volume, or only loosely relevant to the target problem, the resulting model may be useless for decision-making or business support. To ensure data accuracy, security, and privacy protection, enterprises must re-examine their data governance efforts and align them with their AI objectives.
Data governance in the AI era is presenting three key evolutionary trends, forcing enterprises to restructure their data management systems:
1. From Passive Follow-Up to Proactive Planning
In the past, data governance was mostly run as a standalone project for centralized data management and control, following a passive, lagging control model. Disconnected from group business and R&D processes, it was difficult to optimize dynamically with intelligent technologies.
In the AI era, within real business scenarios, data governance should be a proactive, real-time, and adaptive process. It must take "the implementation of data governance and the release of data value" as its core business objective, creating value agilely and efficiently inside business workflows. Governance is not a phased task but a continuous daily activity spanning the entire data lifecycle.
Therefore, data governance urgently needs to evolve from the traditional passive and lagging model to a new governance model that is proactive, real-time, and deeply integrated with data engineering.
2. From Static Management to Real-Time Response
Enterprises can use AI to break the passive, fragmented state of traditional data governance by embedding capabilities such as access control, pipeline connection, data merging, and active metadata exploration directly into data engineering. When a data source changes, downstream algorithm models are automatically triggered to adjust; reinforcement learning is used to predict data errors in advance and to automatically merge anomalous or similar records. At the same time, management links such as data standards, master data management, data quality, and data asset catalogs are strengthened, connecting adaptively to data sources and data engineering through AI-driven active exploration.
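The "data source change triggers downstream adjustment" pattern above can be sketched as a small publish/subscribe mechanism over active metadata. This is a minimal illustration, not KeenData's implementation; the `SourceTable` and `adjust_model` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class SourceTable:
    """Active-metadata record for one data source."""
    name: str
    columns: tuple
    subscribers: list = field(default_factory=list)  # downstream callbacks

    def register(self, callback: Callable) -> None:
        self.subscribers.append(callback)

    def update_schema(self, new_columns: tuple) -> None:
        """On a schema change, notify every downstream pipeline/model."""
        if new_columns != self.columns:
            old = self.columns
            self.columns = new_columns
            for cb in self.subscribers:
                cb(self.name, old, new_columns)

# Hypothetical downstream hook: record which columns a model must absorb.
adjustments = []
def adjust_model(source, old_cols, new_cols):
    added = set(new_cols) - set(old_cols)
    adjustments.append((source, sorted(added)))

orders = SourceTable("orders", ("id", "amount"))
orders.register(adjust_model)
orders.update_schema(("id", "amount", "currency"))
print(adjustments)  # [('orders', ['currency'])]
```

In a production system the callback would enqueue a retraining or feature-refresh job rather than run inline, but the governance point is the same: schema drift is detected at the metadata layer and propagated automatically instead of being discovered downstream.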
Ultimately, enterprises need to upgrade from traditional static and passive information management systems to a new type of automated and systematic data governance capability deeply integrated with data engineering, building an enhanced data asset management system with real-time response capabilities.
3. From Single Structured Governance to Unified Multimodal Control
Enterprise data governance will gradually shift from the traditional model of controlling data to results-oriented, agile daily operation, and ultimately to an autonomous intelligent system. In traditional business scenarios, enterprises relied mostly on control-oriented governance of structured data alone; as artificial intelligence adoption deepens, they increasingly favor an agile governance model for multimodal data. To complete the transformation toward digital and intelligent business, enterprises therefore need data quality that efficiently serves business endpoints and front ends, releasing the full value of data through unified multimodal data control.
In addition, the deepening of the DataOps concept and the spread of the "data as a product" mindset have left enterprises in urgent need of end-to-end, traceable, automated data governance tools that close the loop between data production and consumption.
A New Paradigm of Data Governance for the AI Era
Faced with the opportunities and challenges brought by GenAI, and with the new data-management dilemmas that accompany the spread of artificial intelligence, traditional data governance norms and standards can no longer meet the need. Data governance must shift from a passive, lagging model to a real-time, proactive paradigm built for the AI era. As a leading enterprise-level Data&AI technology provider, KeenData has built the independently controllable KeenData Lakehouse, an integrated data intelligence platform based on cloud-native technology. By deeply integrating data engineering with governance, it has constructed an end-to-end governance system covering data standards, quality, and security, realizing full-lifecycle governance from model construction to AI deployment. It also provides solid security-compliance guarantees, driving the upgrade of data governance from passive response to proactive, real-time governance, and delivering a data governance platform (Keen Governance) adapted to the needs of the AI era.
Data Quality
The platform supports built-in rules, custom rules, and composite quality rules, and provides standalone quality-inspection nodes that integrate with offline development. Quality is managed before, during, and after data processing, with the ability to block pipelines on quality failures. After a data quality task runs, the platform generates quality assessment reports, traces quality problems to their source, and supports viewing and exporting dirty data.
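The rule-check-then-block flow described above can be sketched as follows. This is a simplified illustration under assumed semantics, not the platform's actual logic; the rule names, `block_threshold`, and record layout are invented for the example.

```python
# Hypothetical rule set: each rule is (name, predicate over one record).
RULES = [
    ("amount_non_negative", lambda r: float(r["amount"]) >= 0),
    ("customer_id_present", lambda r: bool(r["customer_id"].strip())),
]

def run_quality_check(records, block_threshold=0.0):
    """Apply every rule to every record.

    Returns (passed, clean, dirty). If the dirty-record ratio exceeds
    block_threshold, passed is False so a downstream task can be blocked.
    Dirty records keep the list of rules they failed, so they can be
    reviewed and exported for traceability.
    """
    clean, dirty = [], []
    for rec in records:
        failed = [name for name, pred in RULES if not pred(rec)]
        (dirty if failed else clean).append({**rec, "failed_rules": failed})
    ratio = len(dirty) / len(records) if records else 0.0
    return ratio <= block_threshold, clean, dirty

records = [
    {"customer_id": "C1", "amount": "10.5"},
    {"customer_id": "",   "amount": "-3"},
]
passed, clean, dirty = run_quality_check(records)
print(passed, len(dirty))  # False 1
```

A real inspection node would also persist the dirty rows (e.g. to a quarantine table or CSV export) and attach the run to an assessment report; the blocking decision itself reduces to the boolean returned here.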
Data Security
It provides data classification and grading, sensitive data identification, and desensitization capabilities. Using a customizable sensitive-data rule library, it runs full-volume or sampled scans across databases and tables. Identified sensitive data can be protected with multiple desensitization rules, such as hash desensitization and masking, ensuring the safe use of sensitive data and the accurate, orderly supply of high-quality data for business decision-making.
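The identify-then-desensitize step can be sketched with a tiny rule library mapping patterns to strategies. The patterns, rule names, and masking format below are assumptions for illustration only; a real rule library would be far richer and configurable.

```python
import hashlib
import re

# Hypothetical rule library: rule name -> (pattern, desensitization strategy).
SENSITIVE_RULES = {
    "phone": (re.compile(r"\b1\d{10}\b"), "mask"),   # 11-digit mobile number
    "email": (re.compile(r"[\w.]+@[\w.]+"), "hash"),
}

def mask(value: str) -> str:
    """Masking: keep the first 3 and last 4 characters, star the middle."""
    return value[:3] + "*" * (len(value) - 7) + value[-4:]

def hash_value(value: str) -> str:
    """Hash desensitization: one-way SHA-256 digest, truncated for display."""
    return hashlib.sha256(value.encode()).hexdigest()[:12]

def desensitize(text: str) -> str:
    """Scan the text with every rule and rewrite each match in place."""
    for name, (pattern, strategy) in SENSITIVE_RULES.items():
        fn = mask if strategy == "mask" else hash_value
        text = pattern.sub(lambda m: fn(m.group()), text)
    return text

row = "contact: alice@example.com, phone: 13812345678"
print(desensitize(row))  # phone becomes 138****5678, email becomes a digest
```

Masking preserves partial readability for operators, while hashing is one-way but deterministic, so the same value still joins consistently across tables after desensitization.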
Data Standards Platform
As the core hub for implementing and managing enterprise data standards, the platform covers the entire standard lifecycle, including creation, release, review, retrieval, mapping, and audit task management. It enforces standard specifications during modeling, development, and post-launch operation, enabling timely detection of whether data is implemented in accordance with the standards. It establishes a unified understanding of data across the group, provides standardized constraints, ensures compliance at the source with international, national, and industry regulations as well as business application requirements, guarantees standardized data production, and reduces the cost of subsequent data application and processing. Its core functions include full-lifecycle standard management (standard catalogs, standard statistics, visual standard creation, reference, update, release, review, audit, and abolition) and standard mapping with data auditing (data warehouse auditing, business database auditing, and audit task management), together with comprehensive audit reports.
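The lifecycle stages named above (creation, review, release, update, abolition) can be modeled as a small state machine that rejects out-of-order transitions. The state and action names here are illustrative assumptions, not the platform's actual workflow vocabulary.

```python
# Hypothetical lifecycle for a data standard:
# draft -> in_review -> released -> (revise -> draft ...) -> abolished
TRANSITIONS = {
    "draft":     {"submit": "in_review"},
    "in_review": {"approve": "released", "reject": "draft"},
    "released":  {"revise": "draft", "abolish": "abolished"},
    "abolished": {},  # terminal state
}

class DataStandard:
    def __init__(self, name: str):
        self.name = name
        self.state = "draft"
        self.history = ["draft"]  # audit trail of states

    def apply(self, action: str) -> None:
        """Perform a lifecycle action, refusing any illegal transition."""
        allowed = TRANSITIONS[self.state]
        if action not in allowed:
            raise ValueError(f"'{action}' not allowed from '{self.state}'")
        self.state = allowed[action]
        self.history.append(self.state)

std = DataStandard("customer_id_format")
for action in ("submit", "approve"):
    std.apply(action)
print(std.state)  # released
```

Keeping the transition table explicit gives the audit side of the platform something to check against: the `history` list is exactly the evidence an audit report needs to show a standard passed review before release.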
Looking ahead, KeenData will leverage its integrated Data&AI platform to help enterprises build high-quality data governance systems, deeply integrate AI and large-model capabilities, and accurately adapt to diverse business scenarios. Ultimately, KeenData aims to create a data infrastructure oriented to the AI era and capable of supporting future development, laying a solid foundation for enterprises to continuously unlock data value and achieve digital transformation.
