As more organizations deliver digital services through the cloud, information systems are commonly assumed to be centrally deployed and managed. However, emerging data security restrictions and compliance requirements for handling personally identifiable data in IT systems are significantly affecting system design.
Many countries have enacted legislation that requires personal data to be kept within geographic boundaries. These regulations have increased the cost, complexity, and legal implications of transferring personal data across borders. In recent customer engagements, we have come across the following instances:
Almost all data-processing IT systems face the above issues while meeting business and legislative requirements. Cloud-based solutions that work with PII data also need to support rapidly evolving data storage and processing requirements. The combination of the above-mentioned regulatory, technical, and compliance elements requires robust and secure data hosting and processing solutions that accommodate customer preferences.
The scope of this white paper is limited to outlining cloud-based reference solutions for the geo-localization of personally identifiable data. For illustration purposes, Azure Cloud service references will be used. For an exact implementation, the solution requirements will be informed by other data requirements and policies (e.g., privacy, security, and archiving).
The document outlines the requirements identified by a group of architects and a few customer representatives at Coforge, along with their potential solution options. Operational cost implications are considered for solutions that offer increased flexibility of data storage and/or security of data in transit. The document also presents cloud-based solutions and recommends the most preferred solution, subject to conditions.
Local Data Residency of Personally Identifiable Information (PII)
For any given customer, it must be possible to physically store all PII data in a geographical area, whether that is a specific continent, country, or region within a country, and whether this storage is legally mandated (Data Localization laws), legally encouraged (Data Sovereignty laws), or simply the choice of the client (Data Residency).
Subject to customer-specific requirements, the following types of data meet the definition of personal data:
Personal data definition: Personal data is any information that relates to an identified or identifiable living individual. Different pieces of information that, when collected together, can lead to the identification of a particular person also constitute personal data.
Privacy laws around the globe can vary in their definition of PII/Personal Data. For example, GDPR defines personal data as any information relating to an identified or identifiable natural person, whereas the California Consumer Privacy Act (CCPA) defines personal information as “information that identifies, relates to, describes, is reasonably capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.” In addition to the more intuitive data (name, email address, social security number), PII under CCPA includes inferences drawn to create a profile about a consumer reflecting the consumer’s preferences, characteristics, psychological trends, predispositions, behavior, attitudes, intelligence, abilities, and aptitudes.
In Canada, the Personal Information Protection and Electronic Documents Act (PIPEDA) defines personal information as any factual or subjective information, recorded or not, about an identifiable individual. This includes opinions, evaluations, records of a dispute between a consumer and a merchant, and intentions (e.g., intentions to acquire goods or services, or to change jobs).
China’s Personal Information Protection Law (PIPL) defines PII much as the GDPR does, though its definition of sensitive data is much broader, including "information that once leaked or abused may cause damage to personal reputation or seriously endanger personal and property safety" as well as race, nationality, religion, biometric information, health, financial account, personal whereabouts, and other information.
These differences in definition can create challenges in the development of a globally applicable policy on personal data localization; therefore, the solution must provide for geo-specific adaptation.
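One way to picture geo-specific adaptation is a per-jurisdiction policy that decides which fields of a record count as PII. The sketch below is illustrative only: the jurisdiction keys, the field names, and the `split_record` helper are assumptions made for this example, and the field sets are simplified from the definitions quoted above rather than a statement of what any law requires.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Tuple

# Fields treated as PII everywhere in this sketch (hypothetical names).
BASE_PII: FrozenSet[str] = frozenset({"name", "email", "national_id"})


@dataclass(frozen=True)
class PiiPolicy:
    """Which additional fields a jurisdiction treats as personal data."""
    jurisdiction: str
    extra_fields: FrozenSet[str] = field(default_factory=frozenset)

    def pii_fields(self) -> FrozenSet[str]:
        return BASE_PII | self.extra_fields


# Simplified, assumed mappings; a real policy would come from legal review.
POLICIES: Dict[str, PiiPolicy] = {
    "EU": PiiPolicy("EU"),
    "US-CA": PiiPolicy("US-CA", frozenset({"inferred_profile"})),
    "CA": PiiPolicy("CA", frozenset({"opinions", "purchase_intentions"})),
    "CN": PiiPolicy("CN", frozenset({"biometrics", "health", "whereabouts"})),
}


def split_record(record: dict, jurisdiction: str) -> Tuple[dict, dict]:
    """Split a record into (pii, non_pii) under the jurisdiction's policy."""
    pii_fields = POLICIES[jurisdiction].pii_fields()
    pii = {k: v for k, v in record.items() if k in pii_fields}
    non_pii = {k: v for k, v in record.items() if k not in pii_fields}
    return pii, non_pii
```

With this shape, the same record can be partitioned differently per jurisdiction: a field such as `opinions` would be routed to the PII side under the "CA" policy but not under the "EU" policy as configured here.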
Traditional architecture approaches to designing a cloud-based or on-premises system rely on centralized services, applications, and databases, whereas the new privacy and compliance requirements call for architects to think differently. A balance between a centralized, cost-efficient solution and a decentralized, privacy-compliant solution must be struck with the nature of the system and the budget in mind.
The solution largely depends on the amount of engineering effort budgeted for the initiative and the current state of the system. There are typically more options for an IT system that is still being designed or built than for an existing system that is being adapted to accommodate data localization requirements.
The following are a few architecture options that our internal practice teams have explored and applied. Each is relevant in its own context, depending on the customer, project, and other circumstances.
Solution Overview
All services are hosted in a central environment. The central environment acts as the ‘default’ environment and other environments are created when required for serving a new region.
Benefits
Constraints
Recommended scenarios
Option 2: Decentralized system with locally hosted services and data
Solution Overview
All services and all data are hosted and processed locally. The central environment acts as the ‘default’ environment, and other environments are created when required to serve a region.
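The default-plus-regional arrangement described above could be resolved at request time roughly as follows. This is a minimal sketch, not a prescribed implementation: `Environment`, `EnvironmentRegistry`, and the placeholder URLs are hypothetical names introduced for illustration and do not correspond to any Azure API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Environment:
    name: str
    base_url: str  # placeholder endpoint, not a real service


class EnvironmentRegistry:
    """Maps a region to its local environment, falling back to the default."""

    def __init__(self, default: Environment) -> None:
        self._default = default
        self._by_region: Dict[str, Environment] = {}

    def register(self, region: str, env: Environment) -> None:
        # Called when a new regional environment is created.
        self._by_region[region.lower()] = env

    def resolve(self, region: Optional[str]) -> Environment:
        # Regions without a dedicated environment use the default one.
        if region is None:
            return self._default
        return self._by_region.get(region.lower(), self._default)


registry = EnvironmentRegistry(Environment("central", "https://central.example.invalid"))
registry.register("IN", Environment("india", "https://in.example.invalid"))
```

Note that the fallback to the default environment is only appropriate for regions with no localization obligation; where local hosting is mandated, a stricter variant might raise an error instead of falling back.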
Benefits
Constraints
Recommended scenarios
Option 3: Central system with geo-located PII data and all other data held centrally
Solution Overview
All services and non-PII data will be hosted centrally. PII data and repositories will be hosted regionally. The services will fetch PII data on demand and will cache the reference data.
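The split described above, where PII is fetched on demand from a regional repository while reference data is cached centrally, can be sketched as below. The class names, the in-memory stores, and the `load_reference` callback are all assumptions made for this example; the sketch also assumes that reference data contains no personal data and is therefore safe to cache outside the region.

```python
from typing import Callable, Dict


class RegionalPiiStore:
    """Stand-in for a PII repository hosted in one region (in-memory here)."""

    def __init__(self, region: str) -> None:
        self.region = region
        self._rows: Dict[str, dict] = {}

    def put(self, customer_id: str, pii: dict) -> None:
        self._rows[customer_id] = dict(pii)

    def get(self, customer_id: str) -> dict:
        return dict(self._rows[customer_id])


class CentralService:
    """Central service: holds non-PII data, fetches PII from the home region."""

    def __init__(
        self,
        stores: Dict[str, RegionalPiiStore],
        home_region: Dict[str, str],
        load_reference: Callable[[str], str],
    ) -> None:
        self._stores = stores
        self._home_region = home_region        # customer id -> region (non-PII)
        self._load_reference = load_reference  # loader for reference data
        self._reference_cache: Dict[str, str] = {}

    def reference(self, key: str) -> str:
        # Reference data is assumed non-PII, so it is cached centrally.
        if key not in self._reference_cache:
            self._reference_cache[key] = self._load_reference(key)
        return self._reference_cache[key]

    def order_summary(self, customer_id: str, currency_code: str) -> dict:
        region = self._home_region[customer_id]
        pii = self._stores[region].get(customer_id)  # on demand, never cached
        return {
            "customer_name": pii["name"],
            "served_from": region,
            "currency": self.reference(currency_code),
        }
```

In this shape the central service never persists PII: each call to `order_summary` reads it from the customer's home region and discards it after building the response, while repeated reference lookups are served from the central cache.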
Benefits
Constraints
Recommended scenarios
When designing modern digital service delivery platforms, compliance requirements such as GDPR and the rules governing PII must be considered to ensure the longevity of the solution. Customer and legislative demands increasingly encourage privacy-by-design principles. Modern architecture patterns such as microservices are already compatible with the approaches described above, since they readily allow data partitioning and the separation of data processing from data storage.