Anonymization vs. Tokenization: Exploring Use Cases and Benefits
How do you protect sensitive data while still putting it to work? Privacy regulations place strict controls on how personal information can be accessed and shared, but you also can’t let your business grind to a halt.
Two technologies that can help are tokenization and anonymization. While they’re both designed to protect sensitive information from prying eyes, they work differently and meet different requirements.
Tokenization protects sensitive data (credit card numbers, social security numbers, etc.) while giving front-line staff the information they need to do their jobs. A call center is a classic scenario—customers need assistance with transactions or inquiries about their account, but certain information must be off limits.
- Staff can perform transactions and queries without viewing sensitive data
- Stolen tokens cannot be “cracked” to obtain the original value
- Removing sensitive data from the production server reduces the risk of a breach
- Removing sensitive data can take the production server out of scope for compliance audits
How Data Tokenization Works
Tokenization replaces sensitive data with substitute values called tokens. The mapping between tokens and the original data is maintained in a separate, encrypted token vault kept outside the production environment. When an application needs the actual value, the token is looked up in the vault and mapped back to it.
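As a minimal sketch of this flow, the hypothetical `TokenVault` class below stands in for a real vault (which would be an encrypted datastore outside the production environment); the class name and in-memory dictionaries are illustrative assumptions, not a specific product’s API.

```python
import secrets

class TokenVault:
    """Hypothetical in-memory stand-in for an encrypted token vault."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so every instance of the same
        # original value maps to the same token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        # The token is random, so it cannot be "cracked" back to the
        # original value without access to the vault.
        token = secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only the vault can map a token back to the real value.
        return self._token_to_value[token]

vault = TokenVault()
t = vault.tokenize("4111-1111-1111-1111")
assert vault.detokenize(t) == "4111-1111-1111-1111"
assert vault.tokenize("4111-1111-1111-1111") == t  # consistent mapping
```

Production systems store and pass around only `t`; the card number itself never touches the production database.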
Tokens can take various forms. For example, a token can retain the format of the original data while revealing only the last few digits, and every instance of the same original value can be mapped to the same token.
Common Use Cases
- In a retail setting, tokens are often used to represent credit card numbers. Tokens reside on a retailer’s system while the actual numbers are stored on a payment network.
- Customer service staff at banks, hospitals, and government agencies often request the last four digits of a social security number to confirm identity. A token can display these values while masking the other digits with an “X” or asterisk.
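The last-four-digits pattern from the second use case can be sketched as a simple masking function; `mask_ssn` and its parameters are illustrative names, not part of any particular tokenization product.

```python
def mask_ssn(ssn: str, keep_last: int = 4, mask_char: str = "X") -> str:
    """Mask all but the last `keep_last` digits, preserving separators."""
    remaining = sum(c.isdigit() for c in ssn)  # digits left to process
    out = []
    for c in ssn:
        if c.isdigit():
            # Keep the digit only once we reach the final `keep_last`.
            out.append(c if remaining <= keep_last else mask_char)
            remaining -= 1
        else:
            out.append(c)  # keep dashes and other formatting as-is
    return "".join(out)

print(mask_ssn("123-45-6789"))  # XXX-XX-6789
```

Because the format is preserved, staff can read the masked value back to a customer naturally while never seeing the full number.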
Anonymization is designed to make it impossible (or extremely impractical) to connect personal data to an identifiable person. Organizations can then use, publish, and share that data without requiring permission.
- Permanently replaces sensitive data with substitute values
- Various methods are available (masking, scrambling, etc.)
- Not suited to production environments, where the original data is required
- Non-production servers holding only anonymized data can fall outside compliance scope
How Anonymization Works
Anonymization permanently replaces sensitive data with a substitute value—it’s a form of data tokenization without the token vault. As with standard tokenization, substitute values can take various formats. (For example, “Jeff” could be replaced by “Helga,” or by a random combination of characters.)
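Two of the methods mentioned above, substitution and scrambling, can be sketched as below. The function names and the substitute-name pool are illustrative assumptions; note there is no vault and no reverse mapping, so the replacement is permanent.

```python
import random

# Hypothetical pool of realistic-looking substitute names.
SUBSTITUTE_NAMES = ["Helga", "Marco", "Priya", "Tomas"]

def anonymize_name(name: str) -> str:
    # Substitution: replace the real name with a plausible stand-in.
    # Nothing records which original mapped to which substitute.
    return random.choice(SUBSTITUTE_NAMES)

def scramble_digits(value: str) -> str:
    # Scrambling: replace every digit with a random digit while
    # keeping the original format (dashes, length) intact.
    return "".join(str(random.randint(0, 9)) if c.isdigit() else c
                   for c in value)

print(anonymize_name("Jeff"))          # e.g. "Helga"
print(scramble_digits("123-45-6789"))  # e.g. "804-27-3316"
```

Because the output keeps a realistic format, anonymized records still behave like real data in testing and analytics.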
Common Use Cases
- Internal environments such as software development or testing that have to work with realistic data
- Sharing reporting data with external entities that are not authorized to view sensitive information
- Healthcare analytics such as population studies where specific patients must not be identifiable
How to Get Started
Tokenization and anonymization can be implemented in different ways depending on the environment that needs to be protected. Many organizations use both to meet different business objectives as part of their overall privacy and security strategy.
Finally, it’s important to recognize that privacy protection and operational efficiency are not mutually exclusive. Effective protection can be integrated into your processes to safeguard sensitive data while extracting maximum value from it. A partner like Syncsort, with deep understanding of privacy requirements in big data environments, can help you determine the right solution and bring it to life.
You can learn much more about these important data-protection technologies as well as the pros and cons of each by reading our eBook.