Data monitoring is the daily check of data files or inputs against data validation and quality control rules. It confirms information about incoming data files and ensures that data entering the data warehouse is of high quality and meets set standards for formatting and consistency. Proactive checks of new data mean that operators and administrators are notified of data quality issues with relevant feedback, so timely corrections and decisions can be made.
Data management platforms accomplish data monitoring by running checks on scheduled inputs. The platform knows which files to expect from which sources and when, and it checks the content and stability of each file to highlight outliers. Data monitoring allows detailed validation and data cleansing on the files entering the data platform to remove bad data and guard against corruption of the golden copy of the data.
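As an illustration of a check on scheduled inputs, the sketch below compares a schedule of expected files with the files that have actually arrived. It is a minimal example under stated assumptions, not how any particular platform implements the check: the `ExpectedFile` type, the `find_missing` function, and the source and file names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class ExpectedFile:
    source: str    # hypothetical source name
    filename: str  # hypothetical file name
    due_by: time   # latest acceptable arrival time each day


def find_missing(expected, arrivals, now):
    """Return expected files that are past due and have not arrived.

    `arrivals` maps (source, filename) to the datetime the file was received.
    """
    missing = []
    for item in expected:
        arrived = (item.source, item.filename) in arrivals
        if not arrived and now.time() > item.due_by:
            missing.append(item)
    return missing


# Hypothetical schedule: two sources, each delivering one file per day.
schedule = [
    ExpectedFile("vendor_a", "prices.csv", time(6, 0)),
    ExpectedFile("vendor_b", "trades.csv", time(7, 30)),
]
received = {("vendor_a", "prices.csv"): datetime(2024, 1, 2, 5, 45)}

late = find_missing(schedule, received, now=datetime(2024, 1, 2, 8, 0))
print([f.filename for f in late])  # ['trades.csv']
```

A real schedule would also need to handle time zones, holidays and files that arrive more than once a day, which this sketch leaves out.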
Businesses and public institutions with significant data analysis operations rely on high-quality data. More important still is knowing whether your data is up to date and knowing the quality of the data that you are sending to downstream systems. Data monitoring gives a real-time view of the state of your data warehouse. The temporal aspect of your data is crucial: reference data must align with transactional data, because if the two are out of step then operational, financial, logistical, governance and regulatory activities will be affected. Transactional data is valuable for business analytics because it allows processes to be analysed to maximize the efficiency and efficacy of business operations. Data monitoring allows time-sensitive data to be processed promptly, which supports accurate business intelligence and creates competitive advantage.
The analysis of monitoring data aims to determine whether characteristics of the monitored data have deteriorated over time. Data monitoring statistics are created by checking each file against its expected content and measuring the deviation from that content, subject to thresholds. Statistics can be created per data file or per data source, and they give users important information when approving data that falls outside the thresholds. Based on the results of the automated validation and data cleansing workflow, statistics can show the quality of the data in terms of the error percentage, the number of duplicates and the percentage of the data that ends up in the golden copy. What matters is how data quality changes over time. Visualising the error percentage against the expected error rate for the data source, based on previous cleaning cycles, can help users make the best decisions about current data files.
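The statistics described above can be sketched in a few lines. This is an illustrative example only: the function names, the choice of the historical mean as the "expected error rate" and the `tolerance_pct` threshold are assumptions made for the sketch, not details taken from any specific platform.

```python
def quality_stats(total_rows, error_rows, duplicate_rows, golden_rows):
    """Summarise one cleaning cycle for a data file."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return {
        "error_pct": 100.0 * error_rows / total_rows,
        "duplicate_count": duplicate_rows,
        "golden_copy_pct": 100.0 * golden_rows / total_rows,
    }


def needs_approval(error_pct, history, tolerance_pct=1.0):
    """Flag a file whose error percentage exceeds the expected rate.

    Assumption: the expected rate is the mean error percentage of previous
    cleaning cycles for the same source, and `tolerance_pct` is a
    hypothetical threshold added on top of it.
    """
    if not history:
        raise ValueError("history must contain at least one previous cycle")
    expected = sum(history) / len(history)
    return error_pct > expected + tolerance_pct, expected


# Hypothetical file: 1000 rows, 50 errors, 20 duplicates, 930 rows accepted.
stats = quality_stats(1000, 50, 20, 930)
flagged, expected = needs_approval(stats["error_pct"], history=[1.0, 2.0, 3.0])
print(stats["error_pct"], expected, flagged)  # 5.0 2.0 True
```

Here the file's 5% error rate is well above the 2% seen in earlier cycles, so it would be held for a user to approve or reject.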
Data Monitoring and SLAs
Data monitoring can include many types of indicators based on values specified in Service Level Agreements (SLAs). A high-level overview of the data monitoring process is very useful for seeing where in the process the data validation and data cleansing did not meet SLA values. When many data sources are monitored, a further level of information is given for each data source, which allows data operations to rectify data quality issues by going back to the vendor for an update or a new set of data that corrects the inaccuracies.
The Finworks Data Platform comes with preloaded workflow templates which automate data type validation, data range validation, data versioning and historisation. These standard templates are used to create onboarding workflows and ensure optimal latency. More complex workflows in the platform support full upstream source proxying, with duplicate processing.
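To make data type validation and data range validation concrete, here is a minimal generic sketch. It does not show the Finworks workflow templates, whose configuration is not described here; the `validate_row` function, the rule format and the field names are hypothetical.

```python
def validate_row(row, rules):
    """Return a list of error messages for one record.

    `rules` maps a field name to (expected_type, minimum, maximum).
    A minimum or maximum of None means that bound is not checked.
    """
    errors = []
    for field, (expected_type, lo, hi) in rules.items():
        raw = row.get(field)
        try:
            # Data type validation: the raw value must convert cleanly.
            value = expected_type(raw)
        except (TypeError, ValueError):
            errors.append(f"{field}: expected {expected_type.__name__}, got {raw!r}")
            continue
        # Data range validation: the converted value must be within bounds.
        if lo is not None and value < lo:
            errors.append(f"{field}: {value} below minimum {lo}")
        if hi is not None and value > hi:
            errors.append(f"{field}: {value} above maximum {hi}")
    return errors


# Hypothetical rules for a file with a price and a quantity column.
rules = {"price": (float, 0.0, 10000.0), "quantity": (int, 1, None)}

print(validate_row({"price": "12.5", "quantity": "3"}, rules))  # []
print(validate_row({"price": "abc", "quantity": "0"}, rules))
```

The second call reports one type error for `price` and one range error for `quantity`, which is the kind of per-record feedback that feeds error percentages per file.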
The Finworks Data Platform presents users with timely feedback on a dashboard that allows them to easily visualise the stages in the data monitoring and data validation process. For the Reception phase, the dashboard shows files that did not arrive. For the Capture phase, it shows files that arrived but whose initial overall content check deviates from what is expected, so a manual override is required. For the Data Cleansing phase, it shows 1) whether the files can be cleansed or not and 2) data quality errors found during cleansing which are above thresholds based on agreed SLAs. In addition to a comprehensive dashboard, the Finworks Data Platform provides automatic notifications, log files and other sources of usable information to report on data quality.
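The phases described above can be read as an ordered series of checks on each file. The sketch below is one hypothetical way to model that ordering; the `Status` values, the `classify` function and its parameters are assumptions for illustration and do not describe the platform's internal logic.

```python
from enum import Enum


class Status(Enum):
    NOT_RECEIVED = "Reception: file did not arrive"
    NEEDS_OVERRIDE = "Capture: content check deviates, manual override needed"
    NOT_CLEANSABLE = "Data Cleansing: file cannot be cleansed"
    ABOVE_THRESHOLD = "Data Cleansing: errors above SLA threshold"
    OK = "All phases passed"


def classify(arrived, content_check_passed, cleansable, error_pct, sla_error_pct):
    """Return the first phase at which a file needs attention."""
    if not arrived:
        return Status.NOT_RECEIVED
    if not content_check_passed:
        return Status.NEEDS_OVERRIDE
    if not cleansable:
        return Status.NOT_CLEANSABLE
    if error_pct > sla_error_pct:
        return Status.ABOVE_THRESHOLD
    return Status.OK


# Hypothetical file that arrived and was cleansed, but with too many errors.
status = classify(True, True, True, error_pct=4.2, sla_error_pct=2.0)
print(status.value)  # Data Cleansing: errors above SLA threshold
```

Checking the phases in order means a file is reported at the earliest stage where it failed, which matches how an operator would work through the process.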