Cloud Data Migration Testing & Master Data Management for US Pharmaceutical company
Overview.
Client data warehouse was integrated with many data sources/vendors for data ingestion. Data comes as data files in different formats such as JASON, XML, parquet files and comma delimited text files, from different vendor systems like CRM, Axtria, IQVIA files, Kronos, Allcare, MDM, DMD and many more. Coforge implemented a robust data integration framework to seamlessly ingest and process diverse data formats from multiple vendor systems into the client's data warehouse for unified analytics and insights.
The client was developing a new data warehouse in Snowflake. Multiple pipelines were created in Azure DevOps for migrating data into Snowflake.
Testing these migrations had the following challenges.
Testing these migrations had the following challenges.
Manual testing took high effort and was prone to errors.
Sample based testing was done on huge volume of data leading to data leaks.
Solution
Coforge analyzed the input data files and the new data warehouse structure in Snowflake and created integrated scalable & reusable automation test scripts.
The test automation framework was created in PySpark. The key features of the framework are:
Automated all input file type validations, across all layers.
Data completeness checks using checksums for counts and totals.
Record wise data comparison testing between input files and Snowflake database.
Integrated with Azure DevOps pipeline for automated execution.
Automated the report validation process for each dashboard using QVDs (QlikView Data) and Excel functions.
The Impact
70% reduction in manual test execution effort
100% Data validation coverage for all file types
Streamlined data validation and testing processes, ensuring robustness, scalability, and accuracy in the client's data warehouse operations.