Why You Should Be Using Public Data for Compliance


While a working knowledge of your organization’s internal data is, of course, essential to meeting even the most basic of regulatory requirements, proprietary data alone is insufficient for achieving cost-effective compliance, let alone identifying indicators of truly suspicious activity.

Relying solely on internal data is like a real estate agent determining the value of a home based on the builder, materials, and design features without taking the neighborhood into account. A state-of-the-art home is suddenly worth much less if it’s in an unsafe or otherwise undesirable area. Context is important. This is where public data comes into play.

What is public data, exactly?

In the simplest terms, public data is data about the public domain: information about the people, places, and things that make up the world around us. While a number of existing compliance processes rely heavily on public data from a traditional perspective — criminal backgrounds, negative news — there are troves of public datasets like corporate registrations and property tax forms that remain largely underutilized by financial services organizations.

It is no coincidence that the bulk of public data consists of information created by force of regulation. Individual and business licenses and permits, violations and sanctions, incorporation filings, and the like are all values that entities must report under the regulatory regime. Why not use this to support your organization’s efforts to meet regulatory requirements? This type of typically lesser-known structured public data, or alternative data as it’s often called in the hedge fund world, provides a wealth of general knowledge about your customers. When connected, these public datasets paint a holistic picture of entities that is valuable not only to compliance personnel but also to the business side of an organization.

Public data provides context

Public data anchors proprietary data to the dynamic nature of the world, allowing you to root decision-making in the day-to-day reality of the setting in which you’re operating. It enables contextualization of customer profiles and activity, offering valuable signal and supplementary intelligence for Know Your Customer (KYC) processes, anti-money laundering (AML) investigations, sanctions screening, and identification of politically exposed persons (PEPs).

Placing proprietary data augmented with public data in the hands of experienced compliance professionals provides well-tuned insight into the identities and activities of the people and organizations with whom you do business—insight that enables you to work more effectively and identify more suspicious activity. With public data at work, organizations are able to escape the limitations of rigid, event-based compliance and adopt a more contextual, model-driven approach that places customer entities as the focal point of decision making.

Let public data do the driving

Public data serves as a vital resource to validate and augment your organization’s proprietary data. Connecting public data with your internal data enables an entity-linked view of your business and surfaces rich customer profiles. With pre-identified customer records and relationships, investigators only have to ask questions of data once, eliminating the time-consuming process of combing through disconnected datasets.

Public data that’s readily accessible to investigators facilitates better-informed decision-making at scale and significantly accelerates investigation time, making possible more proactive and efficient analysis with fewer resources. By automating the delivery of data to investigators, our compliance solution cut a client’s average investigation time down from many hours to minutes. That translates directly to sizable annual cost savings.

While there are a number of opportunities for financial services organizations at the intersection of public and private data, public data’s role in powering entity-driven compliance takes shape in three core functions:


Confirm what you know is accurate and up to date

Is your customer real? Is she who you think she is? How thin is her file? Public data serves as an official source of truth against which you can compare your internal data. Looking at proprietary data alongside public data enables you to identify inconsistencies – some of which could help pinpoint voids in your own data.

Data containing the addresses of corporations in the US is a common avenue for verification. The combination of entity name and address provides a very strong unique identifier, one that doesn’t reveal private information like a customer’s Social Security number. Datasets such as corporate registrations, liquor licenses, tax liens, and OSHA violations provide additional data points that are useful for both AML/sanctions and risk-decision verification. For investigators, public data provides additional points of comparison to determine whether entities mentioned in two separate alerts are the same, or whether a person mentioned in one alert is the same as a person on a sanctions or terror watch list. Intelligent matching can significantly reduce the number of false positives and duplicative investigative work. Where risk-based verification is concerned, these additional data points can help determine if a potential customer is a real person or organization, or if they are located at an existing address.
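To illustrate how a name-and-address pair can act as a matching key, here is a minimal Python sketch. It is not a description of any particular product's matching logic: the function names, the record layout, and the short suffix and abbreviation tables are all assumptions made for the example, and real-world matching would need far richer normalization and fuzzy comparison.

```python
import re

# Hypothetical lookup tables for illustration; a production list would be much longer.
_SUFFIXES = {"inc", "incorporated", "llc", "corp", "corporation", "co", "ltd"}
_ABBREVIATIONS = {"street": "st", "avenue": "ave", "suite": "ste", "road": "rd"}


def _tokens(text):
    """Lowercase the text, replace punctuation with spaces, and split into tokens."""
    return re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()


def match_key(name, address):
    """Build a normalized (name, address) key that can be compared across datasets."""
    name_tokens = [t for t in _tokens(name) if t not in _SUFFIXES]
    addr_tokens = [_ABBREVIATIONS.get(t, t) for t in _tokens(address)]
    return (" ".join(name_tokens), " ".join(addr_tokens))


def same_entity(internal, public):
    """Return True when two records share the same normalized name and address."""
    return match_key(internal["name"], internal["address"]) == match_key(
        public["name"], public["address"]
    )


internal_record = {"name": "Acme Widgets, Inc.", "address": "12 Main Street, Suite 4"}
public_record = {"name": "ACME WIDGETS INC", "address": "12 Main St Ste 4"}
is_match = same_entity(internal_record, public_record)
```

Exact comparison of normalized keys is the simplest possible choice. It only catches formatting differences, so misspellings or reordered words would still require approximate matching.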


Proactively update customer profiles

Often, organizations only update customer profiles on periodic time intervals: a KYC refresh may take place every five years. During that time period it’s likely the customer will forge new relationships, change addresses, or change jobs (or hire new employees, if the customer is a company). Public data enables you to ground your own data — be it demographics, transactional activity, or relationships with other entities — in the daily external reality of where a person or company is and what they’re doing. In other words, it enables you to match internal data with the demography or relationships you see in the public domain. For example, a customer may submit new information to the government as the result of a license filing. If you are continuously ingesting public data, you very well may catch a new address (or other key information) long before a customer would share it with you, or notice a discrepancy to trigger a necessary customer touchpoint. Access to this data provides two key benefits:

Prioritization: you can proactively reach out to clients (who have given you a reason to do so) if you are under the impression that the public data is more accurate than your current records. In addition to ensuring your data is up-to-date, this also enables you to service customers better and more accurately. For example, with an updated address or phone number you would be able to successfully send your customer a letter to notify them of potential fraud.

Improved de-duplication: with knowledge of additional attributes, such as all possible addresses for an entity, your sanctions screening will be much more accurate.
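The proactive update described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes public filings have already been linked to a customer identifier, and the field names (customer_id, address, last_reviewed, filed_on) are hypothetical rather than taken from any real schema.

```python
from datetime import date


def find_stale_profiles(profiles, filings):
    """Flag customers whose newest public filing postdates the last KYC review
    and lists a different address than the one on file."""
    # Keep only the most recent filing per customer.
    latest = {}
    for filing in filings:
        current = latest.get(filing["customer_id"])
        if current is None or filing["filed_on"] > current["filed_on"]:
            latest[filing["customer_id"]] = filing

    flagged = []
    for profile in profiles:
        filing = latest.get(profile["customer_id"])
        if filing is None or filing["filed_on"] <= profile["last_reviewed"]:
            continue
        # Compare addresses case-insensitively, ignoring surrounding whitespace.
        if filing["address"].strip().lower() != profile["address"].strip().lower():
            flagged.append((profile["customer_id"], profile["address"], filing["address"]))
    return flagged


profiles = [
    {"customer_id": "C1", "address": "12 Main St", "last_reviewed": date(2015, 3, 1)},
    {"customer_id": "C2", "address": "8 Oak Ave", "last_reviewed": date(2016, 6, 1)},
]
filings = [
    {"customer_id": "C1", "address": "40 Pine Rd", "filed_on": date(2017, 1, 15)},
    {"customer_id": "C2", "address": "8 oak ave", "filed_on": date(2017, 2, 1)},
]
stale = find_stale_profiles(profiles, filings)
```

Each flagged entry is a candidate for a customer touchpoint rather than an automatic update, since the public record is not guaranteed to be the more accurate of the two.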


Identify what you don't know

Proprietary customer data is only a piece of the story: an often static collection of disconnected data points that by and large lack the context of any relationships existing outside the walls of your organization — or within your organization, for that matter.

Linking internal and external data makes it possible to uncover additional addresses, aliases, and connections an analyst may never have found through traditional public record search applications.

Public data also makes it possible to identify relationships within proprietary data that you would not be able to identify otherwise. For example, machine-learned models trained on public data to classify types like names, addresses, or phone numbers can be applied to similar proprietary data to identify those same types. Public data provides a giant corpus upon which to train and refine models for use in the private domain.

Patterns of relationships between businesses found in Enigma public data and external data reports map to patterns found in your internal data, creating a discernible web of a customer’s activity. Visibility into a customer’s overall world presence could provide valuable insight to investigations into individuals or companies operating fraudulently, for example. It also becomes possible to automatically cross-reference second-order relationships, such as a subject’s employer organization or chains of business, with AML case histories and government watch lists.
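As a rough illustration of second-order cross-referencing, the Python sketch below walks a small relationship graph outward from a subject and reports any watch-listed entity within two hops. The graph structure, entity names, and the watchlist_hits function are all hypothetical; the sketch assumes that relationships have already been extracted and linked.

```python
from collections import deque


def watchlist_hits(graph, subject, watchlist, max_hops=2):
    """Breadth-first search from the subject, returning (entity, hops) pairs
    for every watch-listed entity reachable within max_hops relationships."""
    seen = {subject}
    queue = deque([(subject, 0)])
    hits = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # Do not expand beyond the hop limit.
        for neighbor in graph.get(node, ()):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if neighbor in watchlist:
                hits.append((neighbor, hops + 1))
            queue.append((neighbor, hops + 1))
    return hits


# Hypothetical relationships: subject -> employer -> affiliated business -> further affiliate.
graph = {
    "Jane Doe": ["Acme Co"],
    "Acme Co": ["Shell Ltd", "Jane Doe"],
    "Shell Ltd": ["Far Corp"],
}
watchlist = {"Shell Ltd", "Far Corp"}
hits = watchlist_hits(graph, "Jane Doe", watchlist)
```

Here the subject's employer is not on the list, but a business connected to that employer is, which a search limited to the subject alone would miss. Far Corp sits three hops away and is excluded by the default limit.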

Connecting data to provide a more robust profile of customer entities and their traces across the public data landscape enables you to extract substantially more value from data. With linked public and private data, you have the ability to answer a question or solve a problem that a single dataset could not.

Perhaps most importantly, exposure to a greater volume of signals creates new intelligence that enables your data to push workflows forward and drive business operations horizontally across your enterprise.

When integrated into key business processes and investigative workflows, public data provides continuous opportunities to extract new value from your existing data. It lays a foundation for creating an enterprise intelligence asset that drives value across multiple regulatory and business streams.

Enigma helps financial services organizations put public data to work

Enigma operationalizes internal and external data to prioritize decision-making. Our entity-driven solution augments existing data resources with a vast public data collection to build rich customer profiles and uncover addresses, aliases, and connections not visible through traditional public record search engines. Enigma’s flexible technologies, automated data acquisition, and advanced linking capabilities create a repurposable entity knowledge base that increases visibility, improves accuracy, and enables multiple regulatory and business streams to build upon previous compliance work.

Enigma data is trusted, verified, and true to (official) source. Our library of 100,000 public datasets and billions of records enables us to provide a rich, entity-linked view of the people and organizations with whom you do business. Our repository contains key datasets such as the Office of Foreign Assets Control (OFAC) Specially Designated Nationals (SDN) and Sectoral Sanctions Identifications (SSI) lists, as well as data from sources including the US Departments of Commerce (DOC) and Health and Human Services (HHS), the Central Intelligence Agency, the Bureau of Industry & Security, Her Majesty's Treasury, and the United Nations Office on Drugs and Crime.