Skip to content

Matchers

Matchers are the building blocks of custom tags. Each matcher targets a different attribute of data access to check a certain condition. If the condition is true, the matcher creates the specified tag, which will be available when policy rules are evaluated. For example, to tag queries that access one or more databases in a data store, use the db_name matcher.

Database Matcher

Use this matcher to identify access to certain databases.

For example, the following will create the financial_data tag when querying any of the sales, finance or fin databases.

tags:
  - display_name: "Finance databases"
    tag: financial_data
    matcher_id: db_name
    matcher_data:
      match:
        - sales
        - finance
        - fin

To automatically create tags on every database that is accessed, add the {{db_name}} pattern to the tag name. For example, the following will create the data_sales tag when the sales database is accessed:

tags:
  - display_name: "Tag all DBs"
    tag: data_{{db_name}}
    matcher_id: db_name

Note: All tags generated for database objects will appear in lower case form

Schema Matcher

Use this matcher to identify access to certain schemas.

For example, the following will create the financial_data tag when querying the sales schema in any database or the finance schema in the db1 database:

tags:
  - display_name: "Finance schemas"
    tag: financial_data
    matcher_id: schema_name
    matcher_data:
      match:
        - sales
        - db1.finance

To automatically create tags on every schema that is accessed, add the {{schema_name}} pattern to the tag name. For example, the following will create the data_sales tag when the sales schema is accessed, in any database:

tags:
  - display_name: "Tag all schemas"
    tag: data_{{schema_name}}
    matcher_id: schema_name

Note: All tags generated for schema objects will appear in lower case form

Table Matcher

Use this matcher to identify access to certain tables.

For example, the following will create the financial_data tag when querying:

  • The deals table in any database and schema
  • The quotes table in the public schema in any database
  • The assets table in the shared schema in the finance database
tags:
  - display_name: "Finance tables"
    tag: financial_data
    matcher_id: table_name
    matcher_data:
      match:
        - deals
        - public.quotes
        - finance.shared.assets

To automatically create tags on every table that is accessed, add the {{table_name}} pattern to the tag name. For example, the following will create the table_plays tag when the plays table is accessed, in any schema or database:

tags:
  - display_name: "Tag all tables"
    tag: table_{{table_name}}
    matcher_id: table_name

Note: All tags generated for table objects will appear in lower case form

Column Matcher

Use this matcher to identify access to certain columns.

For example, the following will create the financial_data tag when querying:

  • The deal_size column in any database and schema
  • The revenue column in any database and schema
tags:
  - display_name: "Finance tables"
    tag: financial_data
    matcher_id: column_name
    matcher_data:
      match:
        - deal_size
        - revenue

Note: All tags generated for column objects will appear in lower case form

Result Set Matcher

Use this matcher to identify specific data elements in any field of the result set. For example, to detect result sets that contain the string EU or eu use the following tag:

tags:
  - display_name: "EU data"
    tag: data_eu
    matcher_id: column_data_contain
    matcher_data:
      match:
        - EU
        - eu

Query Filter Matcher

Use this tag to identify queries that contain a WHERE clause with a specific value. For example, assume a projects table contains information about multiple projects, identified by a project_id column. To identity queries that filter based on project_id values, use the following custom tag definition:

tags:
  - display_name: "Filtered by project"
    tag: "filtered_by_{{tagger_alias}}"
    matcher_id: "columns_exact_filter"
    matcher_data:
      match:
        - table: projects
          column: project_id
      tagger_alias:
        flying_cars: 1
        laser_swords: 12
        ceo: [1,12]

The custom tag definition will generate tags based on the filter values applied in the query. The actual filter values in the query must be a subset of the tags defined in the tagger_alias section. For the above example:

  • select * from projects → No filter by project_id applies, no tag will be generated
  • select * from projects where project_id=1filtered_by_flying_cars, filtered_by_ceo
  • select * from projects where project_id=12filtered_by_flying_laser_swords, filtered_by_ceo
  • select * from projects where project_id in (1, 12)filtered_by_ceo
  • select * from projects where project_id in (1, 14) → No tag will be generated

The match section can be modified to apply to any table with a given column by omitting the table parameter:

      match:
        - column: project_id

The match section can be used to specify a json path within the data:

      match:
        - table: projects
          column: project_id
          json_path: "*['metadata'][0]['internal_id']"

Row Count Matcher

Use this matcher to tag any result set which exceeds a threshold, the below example will trigger a tag response_larger_than_expected if the result contains more than 50 rows.

tags:
  - display_name: "large datasets"
    tag: response_larger_than_expected
    matcher_id: result_set_row_count
    matcher_data:
      criteria:
        - '> 50'

Wildcard Query Matcher

Use this matcher to identify queries which are accessing all columns in a given table(s) (i.e. wildcard query)

tags:
  - display_name: "Detect wildcard queries on sensitive tables"
    tag: "wildcard_query"
    matcher_id: wildcard_tables
    matcher_data:
     match:
      - table: customer_data
      - table: customer_purchases

The match section can be modified to apply to any table by omitting the table parameter:

      match:

Client IP Matcher

Use this matcher to identify queries by their client IP.

tags:
  - display_name: "tag clients from China or Hong Kong"
    tag: client_ip_from_{{tagger_alias}}
    matcher_id: client_ip_acl
    matcher_data:
      acl:
      - cidr:
          - 1.1.0.0/16
          - 1.2.4.0/24
          - 1.2.3.4/32
        tag: china
      - cidr:
          - 1.32.205.0/24
          - 102.165.36.0/23
        tag: hong_kong

The custom tag definition will generate a tag based on the first matched ACL CIDR. For the above example:

  • client IP == 1.1.1.1client_ip_from_china
  • client IP == 1.2.3.4client_ip_from_china
  • client IP == 102.165.36.10client_ip_from_hong_kong
  • client IP == 2.3.4.5 → None of the ACL CIDRs are match the client ip, no tag will be generated