It Begins with Data

Blog | 14 Jun 2021

Governance, Risk, and Compliance leaders
recognize that shifting from a rules-based to a model-based approach is
necessary to keep pace with the reality that financial crime
is increasingly complex.

Financial firms continue to expand their digital footprint across
all products and channels, making the risks to their ecosystems and customers
exponentially more complex and diverse. The mindset must change too: firms need
to consider how the financial industry will respond when a financial crime is
attempted. Will it continue to be reactive
or invest in technologies to become more proactive? This shift does not come
without costs; however, failing to change and invest also comes with
costs. Traditional rules-based systems are reactive by design and
slower to respond. To be effective, these systems have to cast a wider net of
behaviors to catch potentially suspicious activity, which increases the number of
false positives and the cost of investigations. That is why a new and
improved approach is required. To achieve this, you have to start with better
data to ultimately improve the output and deliver the results necessary to
effectively manage incidents of financial crime.

To start with better data, you have to find a faster and more efficient way to clean it and load it into your updated analytics. For any of this to be possible, your data ingestion needs to step away from legacy ETL frameworks and focus equally on the ingestion and the transformation of data. Ideally, an enhanced methodology is needed here to avoid the complex mapping, transformation, and data-cleansing requirements that currently exist. “The availability, integrity, reliability, and completeness of data will influence the design, creation, and the ongoing viability of AML models throughout the end-to-end model life cycle.” (Chuck Subrt, Senior Analyst – Aite Group). Data ingestion is the first step in integrating a new technology or strategy, and it can be the kiss of death for a project if not handled correctly and on time.

Tackling the Challenges

Data Ingestion: Challenges and Solutions

Volume and Speed
Challenge: Today’s data volumes are effectively limitless. Unfortunately, large volumes of data tend to break ingestion and subsequent processing pipelines, clogging up the ingestion process. With complexity and volume comes more processing time to ingest and analyze data streams, especially in real time.
Solution: Ingestion systems need to become more cloud-native. Applications need to be able to ingest millions of records per second from multiple data sources, including financial transactions and data from external sources, to compare, aggregate, and report on varieties of data. In addition, systems need to scale with the data.

Diversity
Challenge: Structured, unstructured, labeled, or streaming data is now available in different formats. While structured data tends to be easier to process, unstructured data is more complicated and requires dedicated processing power.
Solution: Utilize a dynamic data model; the systems need to adapt to the data instead of the data being formatted for the system. This leans on the concept of a schema-less data set. The next generation of compliance systems needs to be designed to assume the data will be diverse, not clean, and prone to duplicates.

Expense
Challenge: The maintenance of IT support resources and infrastructure makes the data ingestion process highly expensive.
Solution: With the introduction of cloud-native technologies, the underlying infrastructure needs to be designed to use lower-cost commodity hardware and have the ability to scale out based on processing needs. This shifts the ingestion steps closer to the same tech stack as tomorrow’s processing stack, adopting a network of microservices that handle discrete tasks in the ingestion and transformation stages of the data.

Security
Challenge: Structured data is now the more significant issue; it represents the bulk of the content and is proving much more challenging to configure. This data is more secure but less flexible and is therefore more challenging for enterprise-level systems.
Solution: The creation of unstructured and semi-structured data is surpassing that of structured data. New laws and regulations force a deeper look into how data is shared, processed, stored, and applied. Tokenization and service-to-service encryption are being adopted as part of a zero-trust model. Services that ingest and transform the data are becoming security-aware and authenticating with each other at the data layer.
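
To make the "diverse, not clean, and prone to duplicates" assumption concrete, here is a minimal Python sketch of schema-less ingestion with duplicate removal. It is an illustration only, not the platform's implementation: the function names `fingerprint` and `ingest`, the lowercase key normalization, and the sample records are all assumptions introduced for this example.

```python
import hashlib
import json
from typing import Any, Iterable, Iterator


def fingerprint(record: dict[str, Any]) -> str:
    """Stable content hash: key order and key case do not change the result."""
    normalized = {k.strip().lower(): v for k, v in record.items()}
    payload = json.dumps(normalized, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def ingest(records: Iterable[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield each distinct record once, keeping whatever fields it carries."""
    seen: set[str] = set()
    for record in records:
        key = fingerprint(record)
        if key in seen:
            continue  # exact duplicate of an earlier record
        seen.add(key)
        # No fixed schema is enforced: unknown fields pass through unchanged.
        yield {k.strip().lower(): v for k, v in record.items()}


# Hypothetical input: the second record repeats the first with different
# key order and case; the third has a different shape altogether.
raw = [
    {"Account": "A-1", "Amount": 250.0},
    {"amount": 250.0, "account": "A-1"},
    {"account": "A-2", "memo": "wire transfer"},
]
clean = list(ingest(raw))
```

The system adapts to the record rather than rejecting it: the third record is accepted even though it has no amount field, while the second is dropped as a duplicate of the first.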

Faster Data Ingestion

Our platform is segmented into five
architectural layers – ingestion, transformation, data, machine
learning, and response. Each layer has a contract between its
function and its interface. These contracts govern how one layer's modules
or functions call another's, and they are agreed upon by an
orchestrating provider.
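
One way to picture a layer contract is as a shared interface that an orchestrator relies on. The sketch below is an assumption-laden illustration, not the platform's actual design: the `Layer` protocol, its `process` method, the `Orchestrator` class, and the two toy layers are hypothetical names, and only two of the five layers are shown.

```python
from typing import Any, Protocol

Record = dict[str, Any]


class Layer(Protocol):
    """Hypothetical contract: every layer accepts and returns a list of records."""

    name: str

    def process(self, records: list[Record]) -> list[Record]: ...


class Ingestion:
    name = "ingestion"

    def process(self, records: list[Record]) -> list[Record]:
        # Pass along only non-empty records received from outside.
        return [r for r in records if r]


class Transformation:
    name = "transformation"

    def process(self, records: list[Record]) -> list[Record]:
        # Normalize field names so downstream layers see one convention.
        return [{k.lower(): v for k, v in r.items()} for r in records]


class Orchestrator:
    """Calls layers in order; each layer sees only the previous layer's output."""

    def __init__(self, layers: list[Layer]) -> None:
        self.layers = layers

    def run(self, records: list[Record]) -> list[Record]:
        for layer in self.layers:
            records = layer.process(records)
        return records


pipeline = Orchestrator([Ingestion(), Transformation()])
result = pipeline.run([{"Amount": 10}, {}])
```

Because each layer depends only on the contract, a layer can be replaced or scaled independently as long as its `process` method keeps the same shape.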

Each layer has a specific role. For
example, the ingestion layer provides a secure process for external data to be
intentionally transferred into our transformation layer. Data is ingested
according to the customer’s specification, either at scheduled
times or on demand. Once data arrives through our application
programming interface, the transformation layer discovers, cleans, and
prepares it to be processed and mapped to our dynamic data store through
machine-learning-assisted data transformers. Our machine-learning-assisted
mapper discovers data and ensures that it is classified and cataloged for
historical and management functions. Once discovered, data sources are
mapped to our dynamic data model, creating a repository of source data
mappings and thus accelerating our feature engineering.
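
The idea of a repository of source data mappings can be sketched in a few lines of Python. This is a simplified stand-in under stated assumptions: a plain alias lookup replaces the machine-learning-assisted mapper described above, and the canonical field names, the `ALIASES` table, and the `core_banking` source are all hypothetical.

```python
from typing import Any

# Hypothetical canonical fields and the source field names that map onto them.
ALIASES = {
    "account_id": {"account", "acct", "acct_no", "account_id"},
    "amount": {"amount", "amt", "txn_amount"},
}

# Repository of discovered mappings: source name -> {source field: canonical field}
mapping_repository: dict[str, dict[str, str]] = {}


def discover_mapping(source: str, sample: dict[str, Any]) -> dict[str, str]:
    """Match a sample record's fields to canonical fields and store the result."""
    mapping: dict[str, str] = {}
    for field in sample:
        for canonical, aliases in ALIASES.items():
            if field.strip().lower() in aliases:
                mapping[field] = canonical
    mapping_repository[source] = mapping
    return mapping


def apply_mapping(source: str, record: dict[str, Any]) -> dict[str, Any]:
    """Rename known fields; keep unmapped fields under their original names."""
    mapping = mapping_repository[source]
    return {mapping.get(k, k): v for k, v in record.items()}


# Discover once from a sample record, then reuse the stored mapping.
discover_mapping("core_banking", {"Acct_No": "A-1", "AMT": 99.5, "branch": "NYC"})
mapped = apply_mapping("core_banking", {"Acct_No": "A-2", "AMT": 12.0, "branch": "LDN"})
```

Storing the mapping per source means the discovery step runs once, and every later record from that source is translated by a simple lookup.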

Data Management in the New World

Production platforms are gone. Instead, we have designed our platform to empower data and IT professionals to control how we integrate and evolve with your ever-growing data needs. Using our dynamic data model, we enable data ingestion in any format. Proprietary algorithms are then applied to transform your data into our data model. The end result is a shorter implementation and a lighter load on your resources.