4 Hacks: Minimize Risk in Unstructured Communications
As an Attivio Solutions Architect, I often help companies customize (a.k.a. hack) our Cognitive Search and Insight technology to meet their unique search demands. It’s worth sharing some of the most common hacks in the risk space.
Hack #1: “I have tons of unstructured communications in my enterprise. Isn’t flagging all variations of a potentially risky situation time-consuming and expensive?”
Attivio has a highly flexible rules engine that takes a query-based approach to surveying content to flag outliers, suspicious parameter combinations, and other threatening scenarios in near-real-time. Any and all content and metadata coming from a transaction ecosystem can be queried as part of ingestion. This allows us to apply rules, tags, and customizable scores as additional metadata.
More importantly, the Attivio rules engine can be updated dynamically. Unlike regular expressions, the Attivio query language can recognize many variations of the same situation without requiring an explicit definition for each one. We take care of stemming, lemmatization, synonyms, phonetic similarity, and acronyms for you.
In the field, we have enabled businesses to reduce their rules footprint by over 50%.
Additionally, these rules and labels can be associated with risk scores and aggregated onto the records. This way, the investigation tools and case management systems have an immediate line of sight into the relative risk scores of different content and can easily prioritize the records with the highest scores, the most severe violations, or some other subset.
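To make the idea concrete, here is a minimal Python sketch of rule-based tagging and score aggregation. It is not Attivio's query language or API: the suffix stripping and synonym table are crude stand-ins for real stemming and synonym handling, and every rule name, term, score, and sample message is a hypothetical chosen for illustration.

```python
import re

# Hypothetical synonym table, keyed on stemmed forms (an assumption for this sketch).
SYNONYMS = {"manipulat": "fix", "rig": "fix"}
SUFFIXES = ("ing", "ed", "es", "e", "s")


def normalize(token):
    """Crude stand-in for stemming plus synonym mapping."""
    token = token.lower()
    for suffix in SUFFIXES:
        # Strip at most one suffix, and only if at least 3 characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            token = token[: -len(suffix)]
            break
    return SYNONYMS.get(token, token)


def terms_in(text):
    """Return the set of normalized terms found in a piece of text."""
    return {normalize(t) for t in re.findall(r"[a-z0-9]+", text.lower())}


# Hypothetical rules: each lists already-normalized terms that must all be present.
RULES = [
    {"label": "possible_rate_fixing", "terms": {"fix", "libor"}, "score": 80},
    {"label": "off_channel", "terms": {"whatsapp"}, "score": 30},
]


def tag(record):
    """Attach matching rule labels and an aggregated risk score as new metadata."""
    found = terms_in(record["text"])
    hits = [rule for rule in RULES if rule["terms"] <= found]
    return {
        **record,
        "labels": [rule["label"] for rule in hits],
        "risk_score": sum(rule["score"] for rule in hits),
    }


# Made-up sample messages.
records = [
    {"id": 1, "text": "Can you get the Libor fixed lower today?"},
    {"id": 2, "text": "They are manipulating LIBOR again, call me on WhatsApp"},
    {"id": 3, "text": "Lunch at noon?"},
]

# Highest aggregated score first, as a case management queue might order them.
ranked = sorted((tag(r) for r in records), key=lambda r: r["risk_score"], reverse=True)
for r in ranked:
    print(r["id"], r["risk_score"], r["labels"])
```

Note that one rule covers "fixed" and "manipulating" without a separate pattern for each, which is the footprint reduction described above in miniature.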
Hack #2: “Our risk and compliance teams rely on both forensic investigation and advanced analytics, but are also tasked with reducing our footprint. How can we support both analyses from one system?”
Attivio relies on a universal, schema-less index that facilitates query-time relational search. This means that any investigation can use both natural language search and ANSI SQL-92 connectivity via ODBC/JDBC. One index and one platform can power all risk and compliance analyses.
Hack #3: “I need to reduce my false positives by mashing up unstructured data with my transaction records for better context, but I don’t know where to start.”
Attivio creates structure around the unstructured: we can transform emails, chats, PDFs, and hundreds of other file types into key-value pairs when indexing. In addition, we have powerful capabilities for creating enhanced and enriched metadata during ingestion based on the business’s own rules, regulations, business dictionaries, and any other reference data deemed critical. For example, Attivio can extract tickers, product names, employees, currencies, key terms (e.g., “Libor”), and phrases (e.g., “hit at 93,” “bull steepening”).
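As a rough illustration of dictionary-driven enrichment, the Python sketch below turns a free-text chat line into key-value metadata. It is a simplified stand-in rather than Attivio's extraction pipeline: the REFERENCE dictionaries, the field names, and the sample message are all assumptions made up for this example.

```python
import re

# Hypothetical business dictionaries; a real deployment would load these
# from the firm's own reference data.
REFERENCE = {
    "ticker": ["AAPL", "MSFT"],
    "currency": ["USD", "EUR", "GBP"],
    "key_term": ["Libor"],
    "phrase": ["hit at 93", "bull steepening"],
}


def extract(text):
    """Return a dict of field name -> dictionary entries found in the text."""
    fields = {}
    for field, entries in REFERENCE.items():
        matches = [
            entry
            for entry in entries
            # Whole-word, case-insensitive match on each dictionary entry.
            if re.search(rf"\b{re.escape(entry)}\b", text, re.IGNORECASE)
        ]
        if matches:
            fields[field] = matches
    return fields


# Made-up sample message.
chat = "Sold 5mm USD of AAPL, hit at 93. LIBOR looks soft."
metadata = extract(chat)
print(metadata)
```

Once fields like these exist on every record, they can serve as the join keys discussed next.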
What this means is that the idea of “join” is no longer a SQL-only concept. Attivio unlocks a new universe of keys between sources and provides query operators to do the heavy lifting.
Let’s take a simple example: I need to join an hour’s worth of transaction records to chat data wherever a trader said “received fix.”
With Attivio, all of the following can be powered by a single query, or even built in the GUI, and surfaced to an investigator in seconds:
1. Find the chats that contain the relevant phrase
2. Extract and correlate the banker information with the chat record
3. Join the transaction data to the chat data if and only if:
a. The transaction happens around the time of the key phrase, say +/- 15 minutes
b. The transaction involves selling Libors
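The steps above can be sketched in plain Python as a time-window join. This is only an illustration of the logic, not Attivio query syntax: the field names (person, side, product), the choice to join on the person, and all sample records are assumptions invented for the example.

```python
from datetime import datetime, timedelta

PHRASE = "received fix"
WINDOW = timedelta(minutes=15)

# Made-up sample data; field names are assumptions for this sketch.
chats = [
    {"chat_id": "c1", "person": "p_smith", "time": datetime(2016, 5, 2, 10, 0),
     "text": "ok, received fix, going ahead"},
    {"chat_id": "c2", "person": "p_jones", "time": datetime(2016, 5, 2, 10, 5),
     "text": "lunch?"},
]
transactions = [
    {"txn_id": "x1", "person": "p_smith", "time": datetime(2016, 5, 2, 10, 10),
     "side": "sell", "product": "Libor"},
    {"txn_id": "x2", "person": "p_smith", "time": datetime(2016, 5, 2, 10, 40),
     "side": "sell", "product": "Libor"},
    {"txn_id": "x3", "person": "p_smith", "time": datetime(2016, 5, 2, 9, 50),
     "side": "buy", "product": "Libor"},
    {"txn_id": "x4", "person": "p_jones", "time": datetime(2016, 5, 2, 10, 6),
     "side": "sell", "product": "Libor"},
]


def join_chats_to_transactions(chats, transactions):
    """Pair each flagged chat with the same person's nearby Libor sales."""
    # Step 1: keep only chats containing the key phrase.
    flagged = [c for c in chats if PHRASE in c["text"].lower()]
    pairs = []
    for chat in flagged:
        for txn in transactions:
            same_person = txn["person"] == chat["person"]       # step 2
            in_window = abs(txn["time"] - chat["time"]) <= WINDOW  # step 3a
            is_libor_sale = txn["side"] == "sell" and txn["product"] == "Libor"  # step 3b
            if same_person and in_window and is_libor_sale:
                pairs.append((chat["chat_id"], txn["txn_id"]))
    return pairs


matches = join_chats_to_transactions(chats, transactions)
print(matches)
```

In this sample only x1 joins to c1: x2 falls outside the 15-minute window, x3 is a purchase, and x4 belongs to a person whose chat never mentions the phrase.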
Hack #4: “Our team is great at sorting false positives from true positives. How can we capture that intelligence and make it work for us?”
In addition to industry-leading text analytics, Attivio also provides robust capabilities for document classification via supervised machine learning. Our classifier technology leverages training sets built up from the business’s own expert feedback to help automate and accelerate decision-making processes.
For example, let’s say the business is monitoring violation X, and the risk team has found a subset of true positives and false positives. These two sets of documents can seed a new classifier that creates a “true” and “false” model for violation X. Training the classifier takes only a few moments. It can then be dynamically added to the ingestion process to pre-sort new appearances of violation X into “likely true” and “likely false” buckets, each accompanied by a confidence score. The business can use these additional labels and scores for prioritization, routing, or other risk-management operations.
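To show the general shape of this workflow, here is a small Python sketch that seeds a classifier from reviewed examples and returns a bucket plus a confidence score. It uses a textbook multinomial naive Bayes model with add-one smoothing, chosen only because it fits in a few lines; the source does not say which algorithm Attivio's classifier uses, and the training sentences are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, labeled_docs):
        for text, label in labeled_docs:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocab.add(word)

    def classify(self, text):
        """Return (best_label, confidence), with confidence in (0, 1]."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Convert log scores to normalized probabilities (stable softmax).
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        best = max(weights, key=weights.get)
        return best, weights[best] / sum(weights.values())


# Made-up analyst feedback for a hypothetical "violation X".
feedback = [
    ("keep the fix quiet and move the rate before the print", "likely true"),
    ("push the rate lower, nobody needs to know", "likely true"),
    ("IT will fix the printer before lunch", "likely false"),
    ("please fix the typo in the rate sheet", "likely false"),
]

model = NaiveBayes()
model.train(feedback)

label, confidence = model.classify("move the rate lower before anyone knows")
print(label, round(confidence, 2))
```

With only four training documents the confidence values are not meaningful in themselves; a usable model would need a much larger set of reviewed true and false positives.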
You can create as many classifiers as you need and place them at any point in the ingestion process to add metadata, route documents to different processing stages, or apply different sets of rules as the business deems necessary.