Blog | June 14, 2021
It Begins with Data
BY Warren Zafrin
Governance, risk, and compliance leaders recognize that shifting from a rules-based to a models-based approach is necessary to keep pace with increasingly complex financial crime.
Financial firms continue to expand their digital footprint across all products and channels, which makes the threats to their ecosystems and customers more complex and diverse. The mindset must change too: the question is not only what happens when a financial crime is attempted, but how the financial industry will respond. Will it continue to be reactive, or will it invest in technologies to become more proactive? This shift does not come without costs; however, failing to change and invest also comes with costs. Traditional rules-based systems are, by design, reactive and slow to respond. To be effective, they have to cast a wider net of behaviors to catch potentially suspicious activity, which increases the number of false positives and the cost of investigations. That is why a new approach is required. To achieve it, you have to start with better data to improve the output and deliver the results needed to manage incidents of financial crime effectively.
To start with better data, you have to find a faster and more efficient way to clean it and load it into your updated analytics. For any of this to be possible, your data ingestion needs to step away from legacy ETL frameworks and focus equally on ingestion and transformation of data. Ideally, an enhanced methodology avoids the complex mapping, transformation, and data-cleansing requirements that currently exist. “The availability, integrity, reliability, and completeness of data will influence the design, creation, and the ongoing viability of AML models throughout the end-to-end model life cycle.” (Chuck Subrt, Senior Analyst – Aite Group). This is the first step in integrating a new technology or strategy, and it can be the kiss of death for a project if not handled correctly and on time.
Tackling the Challenges
|Volume and Speed||Today’s data volumes are effectively limitless. Unfortunately, large volumes of data tend to break ingestion and subsequent processing pipelines, clogging up the ingestion process. With complexity and volume come longer processing times to ingest and analyze data streams, especially in real time.||Ingestion systems need to become more cloud-native. Applications need to be able to ingest millions of records per second from multiple data sources, including financial transactions and data from external sources, to compare, aggregate, and report on varieties of data. In addition, systems need to scale with the data.|
|Diversity||Structured, unstructured, labeled, and streaming data is now available in many different formats. While structured data tends to be easier to process, unstructured data is more complicated and requires specialized processing.||Utilize a dynamic data model; the systems need to adapt to the data instead of the data being formatted for the system. This leans on the concept of a schema-less data set. The next generation of compliance systems needs to be designed on the assumption that the data will be diverse, not clean, and prone to duplicates.|
|Expense||The maintenance of IT support resources and infrastructure makes the data ingestion process highly expensive.||With the introduction of cloud-native technologies, the underlying infrastructure needs to be designed to use lower-cost commodity hardware and have the ability to scale out based on processing needs. This shifts the ingestion steps closer to the same tech stack as tomorrow’s processing stack, adopting a network of microservices that handle discrete tasks in the ingestion and transformation stages of the data.|
|Security||Structured data is now the more significant issue; it represents the bulk of the content and is proving much more challenging to configure. This data is more secure but less flexible and therefore more challenging for enterprise-level systems.||The creation of unstructured and semi-structured data is surpassing that of structured data. New laws and regulations force a deeper look into how data is shared, processed, stored, and applied. Tokenization and service-to-service encryption are moving toward a zero-trust model. Services that ingest and transform the data are becoming security-aware and authenticating between themselves at the data layer.|
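To make the Diversity and Security rows above more concrete, here is a minimal Python sketch of records from different sources being adapted to one shape, deduplicated, and tokenized. It is an illustration under stated assumptions, not a description of any particular product: the field aliases, the `TOKEN_KEY` secret, and the function names are all hypothetical, and keyed hashing (HMAC-SHA256) is used as one simple form of tokenization.

```python
import hashlib
import hmac
import json

# Hypothetical secret for illustration; a real system would fetch it from a key manager.
TOKEN_KEY = b"example-only-key"

# Hypothetical aliases seen across sources, mapped to one canonical field name.
ALIASES = {"acct": "account", "account_no": "account", "amt": "amount", "value": "amount"}
SENSITIVE = {"account"}


def tokenize(value: str) -> str:
    """Replace a sensitive value with a keyed, non-reversible token."""
    return hmac.new(TOKEN_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def normalize(record: dict) -> dict:
    """Rename known aliases, keep unknown fields as-is, tokenize sensitive fields."""
    out = {}
    for key, value in record.items():
        name = key.strip().lower()
        name = ALIASES.get(name, name)
        if isinstance(value, str):
            value = value.strip()
        if name == "amount":
            value = float(value)
        if name in SENSITIVE:
            value = tokenize(str(value))
        out[name] = value
    return out


def fingerprint(record: dict) -> str:
    """Stable hash of a normalized record, used to detect exact duplicates."""
    canonical = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ingest(records):
    """Normalize each record and drop exact duplicates, preserving arrival order."""
    seen, result = set(), []
    for raw in records:
        rec = normalize(raw)
        fp = fingerprint(rec)
        if fp not in seen:
            seen.add(fp)
            result.append(rec)
    return result


raw_feed = [
    {"acct": "12345", "amt": "250.00", "channel": "wire"},
    {"Account_No": "12345", "value": 250, "channel": "wire"},  # same event, other source
    {"acct": "67890", "amt": "19.99", "memo": "card present"},  # extra field is kept
]
clean = ingest(raw_feed)
```

The first two records collapse into one because they normalize to the same content, while the unexpected `memo` field on the third record is carried through rather than rejected. That is the sense in which the system adapts to the data.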
Faster Data Ingestion
Our platform is segmented into five architectural layers: ingestion, transformation, data, machine learning, and response. Each layer exposes its function through a contracted interface. These contracts define how a module or function in one solution layer calls another, and they are agreed upon by an orchestrating provider.
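The idea of layers bound by interface contracts and coordinated by an orchestrator can be sketched in Python as follows. This is a hypothetical illustration only: the `Layer` protocol, the `Orchestrator` class, and the stub functions are assumptions made for the example, and the scoring rule is a placeholder rather than a real model.

```python
from typing import Any, Callable, Protocol


class Layer(Protocol):
    """Hypothetical contract: every layer has a name and a single entry point."""

    name: str

    def process(self, payload: Any) -> Any: ...


class FunctionLayer:
    """Adapts a plain function to the Layer contract."""

    def __init__(self, name: str, fn: Callable[[Any], Any]) -> None:
        self.name = name
        self._fn = fn

    def process(self, payload: Any) -> Any:
        return self._fn(payload)


class Orchestrator:
    """Calls the layers in order; layers never call each other directly."""

    def __init__(self, layers: list[Layer]) -> None:
        self.layers = layers

    def run(self, payload: Any) -> tuple[Any, list[str]]:
        trace = []
        for layer in self.layers:
            payload = layer.process(payload)
            trace.append(layer.name)
        return payload, trace


store: list[dict] = []  # stand-in for the data layer's storage


def ingest_stub(text: str) -> list[str]:
    return [line for line in text.splitlines() if line.strip()]


def transform_stub(lines: list[str]) -> list[dict]:
    return [{"amount": float(line)} for line in lines]


def data_stub(rows: list[dict]) -> list[dict]:
    store.extend(rows)
    return rows


def score_stub(rows: list[dict]) -> list[dict]:
    # Placeholder score, not a trained model: larger amounts score higher.
    return [{**row, "score": min(row["amount"] / 10_000, 1.0)} for row in rows]


def respond_stub(rows: list[dict]) -> list[dict]:
    return [row for row in rows if row["score"] >= 0.5]


pipeline = Orchestrator([
    FunctionLayer("ingestion", ingest_stub),
    FunctionLayer("transformation", transform_stub),
    FunctionLayer("data", data_stub),
    FunctionLayer("machine learning", score_stub),
    FunctionLayer("response", respond_stub),
])
alerts, trace = pipeline.run("120.50\n9000\n\n7500\n")
```

Because each layer only has to honor the `process` contract, any one of them can be replaced without touching the others, which is the practical benefit of agreeing on interfaces up front.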
Each layer has a specific role. For example, the ingestion layer provides a secure process for external data to be intentionally transferred into our transformation layer. Data is ingested according to the customer’s specification, either at scheduled times or on demand. Once data arrives through our application programming interface, the transformation layer discovers, cleans, and prepares it to be processed and mapped to our dynamic data store through machine-learning-assisted data transformers. Our machine-learning-assisted mapper discovers the data and ensures that it is classified and cataloged for historical and management functions. Once discovered, data sources are mapped to our dynamic data model, creating a repository of source data mappings and thus accelerating our feature engineering.
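As a rough illustration of assisted mapping with a reusable mapping repository, the Python sketch below suggests a target field for each source column and stores the result per source. It is not the platform's mapper: plain string similarity from the standard library's `difflib` stands in for the machine learning step, and `CANONICAL_FIELDS`, `suggest_mapping`, and `MappingRepository` are hypothetical names invented for the example.

```python
import difflib

# Hypothetical canonical fields of the target data model.
CANONICAL_FIELDS = ["account_number", "amount", "booking_date", "customer_name"]


def suggest_mapping(source_fields, cutoff=0.6):
    """Suggest a canonical field for each source field, or None if nothing is close."""
    mapping = {}
    for field in source_fields:
        key = field.strip().lower().replace(" ", "_").replace("-", "_")
        matches = difflib.get_close_matches(key, CANONICAL_FIELDS, n=1, cutoff=cutoff)
        mapping[field] = matches[0] if matches else None
    return mapping


class MappingRepository:
    """Stores reviewed mappings per source so later loads can reuse them."""

    def __init__(self):
        self._by_source = {}

    def save(self, source, mapping):
        self._by_source[source] = dict(mapping)

    def get(self, source):
        return self._by_source.get(source)

    def apply(self, source, record):
        """Rename a record's fields using the stored mapping; unmapped fields are dropped."""
        mapping = self._by_source[source]
        return {mapping[k]: v for k, v in record.items() if mapping.get(k)}


columns = ["Account Number", "Amounts", "booking-date", "Cust Name", "zz_flag"]
suggested = suggest_mapping(columns)

repo = MappingRepository()
repo.save("core_banking", suggested)  # in practice a person would review this first
mapped = repo.apply(
    "core_banking",
    {"Account Number": "12345", "Amounts": "10.5", "zz_flag": "x"},
)
```

Columns with no close match, such as `zz_flag` here, come back as `None` so they can be routed to a person for review instead of being guessed at.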
Data Management in the New World
Production platforms are gone. Instead, we have designed our platform to empower data and IT professionals to control how we integrate and evolve with your ever-growing data needs. Utilizing our dynamic data model, we enable data ingestion in any format. Proprietary algorithms are then applied to transform your data into our data model. The end result is a shorter implementation and a lighter load on your resources.
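A small Python sketch of format-tolerant ingestion is shown below. It covers only three text formats (a JSON document, JSON Lines, and CSV with a header row) and is an assumption-laden illustration, not the proprietary transformation described above; `parse_payload` is a hypothetical helper name.

```python
import csv
import io
import json


def parse_payload(text: str) -> list[dict]:
    """Parse JSON, JSON Lines, or CSV text into a list of dict records."""
    stripped = text.strip()
    if not stripped:
        return []

    # 1) A single JSON document: an array of objects, or one object.
    try:
        doc = json.loads(stripped)
    except json.JSONDecodeError:
        doc = None
    if isinstance(doc, list):
        return [row for row in doc if isinstance(row, dict)]
    if isinstance(doc, dict):
        return [doc]

    # 2) JSON Lines: one JSON object per non-empty line.
    lines = [line for line in stripped.splitlines() if line.strip()]
    try:
        rows = [json.loads(line) for line in lines]
        if all(isinstance(row, dict) for row in rows):
            return rows
    except json.JSONDecodeError:
        pass

    # 3) Fall back to CSV, treating the first line as the header.
    return [dict(row) for row in csv.DictReader(io.StringIO(stripped))]


from_json = parse_payload('[{"account": "1", "amount": 5}]')
from_jsonl = parse_payload('{"account": "1"}\n{"account": "2"}\n')
from_csv = parse_payload("account,amount\n12345,250.00\n")
```

Whatever the input format, the caller receives the same shape (a list of dictionaries), so the downstream transformation step does not need to know where the data came from.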