Build Data Loss Prevention (DLP) policies

To use Data Loss Prevention (DLP) tools within Cloudflare Zero Trust, you first need to define your DLP profiles. DLP profiles are objects that combine dictionaries, predefined detections, and custom logic, and that you can reference as selectors in your Gateway policies.

Configure a DLP profile

You may either use DLP profiles predefined by Cloudflare, or create your own custom profiles based on regular expressions (regex), predefined detection entries, and DLP datasets.

Configure a predefined profile

  1. In Zero Trust, go to DLP > DLP Profiles.
  2. Choose a predefined profile and select Configure.
  3. Enable one or more Detection entries according to your preferences. The DLP Profile matches using the OR logical operator — if multiple entries are enabled, your data needs to match only one of the entries.
  4. Select Save profile.
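
The OR behavior described in step 3 can be sketched in a few lines. This is only an illustration of the matching logic, not how Cloudflare implements it; the entry names and patterns below are hypothetical stand-ins for whatever detection entries you enable.

```python
import re

# Hypothetical detection entries (assumed for illustration only).
ENABLED_ENTRIES = {
    "Employee ID": r"EMP[0-9]{6}",
    "Project code": r"PRJ-[A-Z]{3}",
}


def matching_entries(text: str) -> list[str]:
    """Return the names of all enabled entries whose pattern appears in text."""
    return [
        name
        for name, pattern in ENABLED_ENTRIES.items()
        if re.search(pattern, text)
    ]


def profile_matches(text: str) -> bool:
    """OR logic: the profile matches as soon as any single entry matches."""
    return bool(matching_entries(text))
```

Under this model, enabling more entries can only widen what the profile matches, which is worth keeping in mind when you tune for false positives.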

Build a custom profile

  1. In Zero Trust, go to DLP > DLP Profiles.

  2. Select Create profile.

  3. Enter a name and optional description for the profile.

  4. Add custom or existing detection entries.

    Add a custom entry

    1. Select Add custom entry and give it a name.

    2. In Value, enter a regular expression (or regex) that defines the text pattern you want to detect. For example, test\d\d will detect the word test followed by two digits.

      • Regular expressions use Rust regex syntax. We recommend validating your regex with Rustexp.
      • DLP detects UTF-8 characters, which can be up to 4 bytes each. Custom text pattern detections are limited to 1024 bytes in length.
      • DLP does not support regular expressions with + or * operators because they are prone to exceeding the length limit. For example, the regex pattern a+ can match an unbounded number of a characters. We recommend using a{min,max} instead, such as a{1,1024}.
    3. To save the detection entry, select Done.

    Add existing entries

    Existing entries include predefined detection entries and DLP datasets.

    1. Select Add existing entries.
    2. Choose which entries you want to add, then select Confirm.
    3. To save the detection entry, select Done.
  5. (Optional) Configure Advanced settings for the profile.

  6. Select Save profile.
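
Before pasting a pattern into a custom entry, you may want to pre-check it against the constraints listed in step 4. The following is a minimal sketch, not a Cloudflare tool: both function names are hypothetical, the scan only looks for unescaped + or * outside character classes, and it ignores edge cases such as a literal ] at the start of a class. Python is used for illustration only, so still validate the final pattern with a Rust regex tester.

```python
def unbounded_quantifiers(pattern: str) -> list[int]:
    """Return the indexes of unescaped + or * operators outside character classes."""
    hits = []
    escaped = False   # previous character was a backslash
    in_class = False  # currently inside [...]
    for i, ch in enumerate(pattern):
        if escaped:
            escaped = False
        elif ch == "\\":
            escaped = True
        elif in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch in "+*":
            hits.append(i)
    return hits


def utf8_byte_length(text: str) -> int:
    """Length of text in UTF-8 bytes, the unit used by the 1024-byte limit."""
    return len(text.encode("utf-8"))
```

For example, `unbounded_quantifiers("a+")` reports the operator at index 1, while `unbounded_quantifiers("a{1,1024}")` returns an empty list. `utf8_byte_length` shows why character counts can mislead: a single emoji occupies 4 bytes.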

Build effective DLP profiles

For many Cloudflare users, Zero Trust is one of the only measures for preventing the loss of sensitive data. For other users, Zero Trust may be one of the early in-line measures in a complex Internet and SaaS app security strategy. Whichever model you most resemble, developing effective and appropriate DLP policies and practices starts with first-principles definitions.

Define your sensitive data

Existing data patterns

If your organization is most concerned about general data patterns that fit existing classifications such as personally identifiable information (PII), protected health information (PHI), financial information, or source code, we recommend using the default predefined profiles.

To better match your organization's needs, you can also combine an existing library of detections with a custom string detection or dataset. For example:

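| Selector    | Operator | Value                   | Logic | Action |
| ----------- | -------- | ----------------------- | ----- | ------ |
| DLP Profile | in       | Credentials and Secrets | Or    | Block  |
| DLP Profile | in       | AWS Key Dataset         |       |        |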

Assorted data patterns

If your data patterns take many different forms and contexts, consider building a custom profile using one or more regexes.

For example, you can use a custom expression to detect when your users share product SKUs in the format CF1234-56789:

  1. Build a custom profile with the following custom entry:

    | Detection entry name | Value                 |
    | -------------------- | --------------------- |
    | Product SKUs         | CF[0-9]{1,4}-[0-9]{5} |
  2. Create an HTTP policy with the following expressions:

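    | Selector    | Operator      | Value                      | Logic | Action |
    | ----------- | ------------- | -------------------------- | ----- | ------ |
    | DLP Profile | in            | Product SKUs               | And   | Block  |
    | User Email  | matches regex | [a-z0-9]{0,15}@example.com |       |        |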
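
To see how the two expressions in this example combine, the sketch below tests both patterns against sample values. It rests on assumptions: the function name `would_block` is hypothetical, both patterns are applied as unanchored searches (Gateway's actual anchoring behavior is not stated here), and Python's `re` module stands in for the Rust regex engine, which accepts the same syntax for these two patterns.

```python
import re

# Patterns copied from the example above.
SKU_PATTERN = re.compile(r"CF[0-9]{1,4}-[0-9]{5}")
# Note: the unescaped "." in this pattern matches any character.
EMAIL_PATTERN = re.compile(r"[a-z0-9]{0,15}@example.com")


def would_block(body: str, user_email: str) -> bool:
    """And logic: block only when the body contains a SKU and the email matches."""
    has_sku = SKU_PATTERN.search(body) is not None
    email_ok = EMAIL_PATTERN.search(user_email) is not None
    return has_sku and email_ok
```

With these assumptions, a request from `alice@example.com` containing `CF1234-56789` would be blocked, while the same body sent by a user on another domain, or a body without a SKU, would not.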

DLP datasets

If your data is a distinct dataset you have defined, you can build a profile by uploading it for use with Exact Data Match or Custom Wordlist. The two dataset types differ in several key ways:

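|                 | Exact Data Match                                           | Custom Wordlist                                                     |
| --------------- | ---------------------------------------------------------- | ------------------------------------------------------------------- |
| Encryption      | Hashed and compared to encrypted traffic                   | Stored as plaintext                                                 |
| Payload logging | Matches redacted in logs                                   | Matches appear in logs                                              |
| Usage           | PII (such as names, addresses, and credit card numbers)    | Non-sensitive data (such as intellectual property and SKU numbers)  |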

We recommend using Exact Data Match for highly sensitive datasets and Custom Wordlists for lists of keywords.

As your datasets change and grow, we recommend building a pipeline to update the data source in Cloudflare Zero Trust. For more information, contact your account team.

Microsoft Information Protection (MIP) labels

If your data already uses a Microsoft Information Protection (MIP) labeling schema, Cloudflare can detect those labels in transit automatically. To get started, connect your Microsoft 365 account with a CASB integration. Cloudflare will automatically import your existing MIP definitions into Zero Trust. You can then use the MIP definitions to build DLP profiles for use in Gateway policies.

For more information, refer to Integration profiles.

Build DLP policies

The best way to start applying data loss prevention to your traffic, minimize the chance of false positives, and collect actionable data is to begin with what you already know about your sensitive data. Rather than building policies to detect sensitive data like SSNs or financial information across all of your traffic, start by building policies that target both sensitive data types and destinations that are known data sources or points of high risk. These sources can be inside or outside your organization.

Example

Many organizations want to detect and log financial information egressing from user devices to critical SaaS applications. To limit the risk of false positives and to filter out logging noise, Cloudflare recommends building your first series of policies to specify both target data and target destination. For example, you can block financial information from being sent to AI chatbots, such as ChatGPT and Gemini:

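| Selector           | Operator | Value                   | Logic | Action |
| ------------------ | -------- | ----------------------- | ----- | ------ |
| DLP Profile        | in       | Financial Information   | And   | Block  |
| Content Categories | in       | Artificial Intelligence |       |        |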
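
The effect of scoping a policy to both data and destination can be illustrated with a small model. Everything in this sketch is assumed for illustration: the `Request` class, the hostnames, and the sample traffic are hypothetical, and the profile and category names are taken from the table above. It compares a data-only policy with the data-and-destination policy on the same requests.

```python
from dataclasses import dataclass


@dataclass
class Request:
    host: str
    content_categories: set[str]
    matched_profiles: set[str]


def data_only(req: Request) -> bool:
    """Match any request that triggers the Financial Information profile."""
    return "Financial Information" in req.matched_profiles


def data_and_destination(req: Request) -> bool:
    """Match only when the profile triggers and the destination is an AI service."""
    return data_only(req) and "Artificial Intelligence" in req.content_categories


# Hypothetical sample traffic.
TRAFFIC = [
    Request("chat.ai.example", {"Artificial Intelligence"}, {"Financial Information"}),
    Request("bank.example", {"Finance"}, {"Financial Information"}),
    Request("intranet.example", {"Business"}, {"Financial Information"}),
    Request("chat.ai.example", {"Artificial Intelligence"}, set()),
]

broad_hits = [r.host for r in TRAFFIC if data_only(r)]
scoped_hits = [r.host for r in TRAFFIC if data_and_destination(r)]
```

In this sample, the data-only policy matches three requests, including traffic to a bank and an intranet site where financial data is expected, while the scoped policy matches only the single request to the AI destination.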

Once you have analyzed the flow and magnitude of data from the known sources, you can begin focusing on more specialized or explicit datasets for more generalized sources. You may want to allow sources that are known internal locations where sensitive data is intentionally transferred.

After you have reviewed the logs and evaluated the rate of false positives for both types of policies, you can experiment more broadly with data loss prevention policies.