Kenya continues to face a structural problem that impedes how the financial sector interacts with the private sector. On one hand, the financial services sector is highly formal; on the other, the private sector is highly informal. The end result is undercapitalisation for the private sector or higher non-performing loans (NPLs) for the banks. A significant contributing factor is the financial sector's never-ending struggle to sufficiently distinguish business from personal expenses, which impedes its identification of opportunities for growth and expansion. This challenge is compounded by ambiguous and incomplete data for these segments, which does not allow for robust interpretation.
Consequently, credit risk and underwriting methods have evolved into a complex yet fragile ecosystem that leverages simple financial behavioural data and statistical methods, with these segments peripheral to it. Innovation within these bounds also took an all too familiar path: it strengthened these systems for conventional consumer behaviours while reinforcing the exclusion of segments with atypical financial usage patterns. Small scale producers and agriculture MSEs, the ‘invisibles’, fall here, with atypical financial behavioural patterns and ambiguous data. Critically, these exclusion effects have now also been embedded within the regulatory and orchestration dimensions, beyond data sources and analytic methods. These systems make minimal use of alternative data and could benefit from an expanded scope geared towards collecting such data. In a credit access ecosystem deeply entrenched in algorithms that use only conventional, well-orchestrated asset and financial relationship behavioural data to configure the great majority of lending, what options do we have to change this?
A useful and likely successful path towards figuring out a solution for credit exclusion is to approach the following questions in parallel:
This path should allow for the establishment of a validated process (by industry standards) for qualifying and integrating alternative data into risk processes for credit, underwriting, and oversight. A robust process should allow Kenya’s financial sector players (FSPs) to offer credit at low cost, at a price point that is affordable for small scale producers, women and MSEs, with improved originations.
The goal is that, through an understanding of the borrower, FSPs can identify atypical financial and non-financial behavioural data patterns that can be leveraged for credit analytics and underwriting. The diagram below explains this further.

This is the first step required for effective identification and incorporation of new data for excluded segments. Within the 5Cs of credit framework, cash flow data has already demonstrated its predictive insight across numerous markets and economies, and where data demonstrate such insight, they are acceptable. What remains is an understanding of the trade-offs between predictive insight, coverage, data stability and accessibility, as these are likely to affect risk assessment scores, each with significant departures on credit files. The CIS ValiData tool might be useful where these segments enjoy self-reporting mechanisms and when new attributes of CRB data are included in scorecards. Cash flow data sitting within bank accounts and/or digital wallets (e.g. Pochi la biashara), merchant or business pay bill and till numbers, etc. can answer this question. Available cash can also be approximated from ‘agri-income’ estimates. Data here can be varied and might include soil and crop health indices, pest and weed assessments, yield forecasting, etc. Numerous lending and research ecosystems have evaluated and proven the value of these data sources. Importantly, while there can be a wealth of insight in this data, the core insight is a summary calculation of available cash: total income less total expenses. This is the ‘minimum viable data’ (MVD).
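As a rough illustration of the MVD calculation described above, the sketch below sums inflows and outflows per month and reports available cash as total income less total expenses. The `Transaction` record, the month format, the sign convention (positive for inflows, negative for outflows) and the sample figures are all assumptions made for the example; they are not drawn from any FSP, CRB or wallet data format.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Transaction:
    """Hypothetical cash flow record: positive amounts are inflows, negative are outflows."""
    month: str     # assumed "YYYY-MM" label
    amount: float  # amount in KES


def available_cash_by_month(transactions: Iterable[Transaction]) -> dict[str, float]:
    """Return available cash per month, computed as total income less total expenses."""
    totals: dict[str, tuple[float, float]] = {}
    for tx in transactions:
        income, expenses = totals.get(tx.month, (0.0, 0.0))
        if tx.amount >= 0:
            income += tx.amount
        else:
            expenses += -tx.amount
        totals[tx.month] = (income, expenses)
    return {month: income - expenses for month, (income, expenses) in totals.items()}


# Invented sample data, for illustration only.
sample = [
    Transaction("2024-01", 5000.0),   # till number receipts
    Transaction("2024-01", -3200.0),  # input purchases
    Transaction("2024-02", 4000.0),
    Transaction("2024-02", -4500.0),
]

print(available_cash_by_month(sample))
# {'2024-01': 1800.0, '2024-02': -500.0}
```

A monthly summary of this kind is only one possible shape for the MVD; the aggregation period and the treatment of mixed business and personal transactions would need to be agreed as part of the validation process.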
Helpfully, modelling innovations such as machine learning techniques allow substantial predictive insight to be gleaned from such data and even from sparser datasets. Predictive performance of scores has been observed to improve by as much as 10-15%. Furthermore, explainability and transparency needs can readily be retrofitted within a machine learning approach. Leveraging the MVD along with these new analytic techniques can deliver highly effective credit risk ranking tools. Consequently, the real work to be done is to solve the issue of data maturity and accessibility. For maturity, a scalable data collection and validation process for alternative data is preferred. There is also an opportunity for financial sector regulators to work with CRBs and financial service providers to put in place measures that expand coverage to collect alternative data. This will allow for full scale integration into production-level credit risk assessments.
Technology has evolved at a rapid pace, particularly in the arena of mobile platforms and infrastructure. Mobile and app-based solutions provide simple, sustainable, cost-efficient engagement for data available from an FI population. Orchestration via consumer-permissioned data access using mobile technology removes the need for large-scale technology infrastructure that is costly to develop and even more costly to maintain. As with any development, this approach will still require solutions for identity and fraud. Biometric and AI techniques can contribute to solving these concerns. A further benefit is that the platform offers substantial flexibility, at efficient development cost, for ingesting further data sources as they become beneficial.
Accessing cash flow data through data aggregators with consumer permission has become standard functionality in many markets. Regulatory guidance should be fully defined and established to ensure the consumer comprehends and truly permits data access. This regulatory infrastructure will also serve incremental data sources as they are integrated within the expanding ecosystem. While Kenya continues to make tremendous strides, any use of non-CRB data by banks is still likely to trigger regulatory scrutiny that delays deployment of the resulting solutions. Similarly, because regulators have limited experience with machine learning techniques, models built on them are likely to be flagged for review, which often results in delays. Additionally, ensuring that credit risk analytics continue to enable safe and sound lending practices is paramount for robust model governance. Particular consideration must be given to the risks of including consumer-permissioned data, such as adverse data selection and disparate impact and treatment concerns.
Way forward
So, can alternative data resolve exclusion? The answer is yes. With expanded coverage and improved, equal access to credit information that includes alternative data, supported by data access, orchestration and regulatory foresight, FSPs will be better positioned to extend credit to excluded segments at affordable price points and to scale at low cost.