Voice is the mainstay of many mobile telecommunications service providers and, alongside SMS, generates the bulk of their revenues. However, voice services have come under siege with the growth of over-the-top (OTT) service providers, who offer similar services at a fraction of the cost.
As a channel of communication, voice does not lend itself easily to innovation, and as a result, it has lagged behind data and USSD. This could change, thanks to an interesting quirk in how we talk.
The human voice carries a unique signature. Every human being in the world has a distinct intonation, accent, pronunciation and other vocal qualities, which makes the voice an ideal candidate for use in authentication and verification. With growth in the fintech space driven by the opportunity that financial inclusion for the underbanked presents, it's curious that we haven't seen voice biometrics adopted as a service differentiator.
Currently, leading fintech products in the microcredit and transaction processing space require access to a personal terminal such as a phone or card to transact.
In March of this year, Google opened up its Cloud Speech API, which covers over 80 languages, to third-party developers. This move will shake up how voice APIs are used and is likely to challenge the dominance of Nuance, which has close ties to IBM and is at the core of many popular voice-linked services.
My imagined application of this technology goes beyond transcription and processing. Mashed up with the acoustic fingerprinting technologies that power services such as Shazam and SoundHound, voice recognition APIs can change the service experience for money transfer, banking and commerce as we know it.
The simplified step-by-step play would look like this. First, the mobile consumer would be asked to opt in and sign up for the new voice-based identification service, which promises flexibility and added security when performing critical financial transactions. Part of the onboarding process would be a simple sampling of the user’s voice as they read out a phrase or set of numbers in a language of their choosing.
Secondly, day-to-day conversations would be randomly sampled, in snippets of no more than 5 seconds each, to continuously build out a finely tuned signature that is encrypted in storage.
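To make the idea of a continuously refined signature concrete, here is a minimal Python sketch. It assumes each sampled snippet has already been turned into a fixed-length list of numbers (an "embedding") by some speech-processing step that is not shown; the `VoiceSignature` class, its fields and the running-average approach are all hypothetical illustrations, not a description of any real product. Encryption of the stored signature is left out.

```python
from dataclasses import dataclass, field
from typing import List

MAX_SAMPLE_SECONDS = 5.0  # cap on each sampled snippet, as proposed above


@dataclass
class VoiceSignature:
    """Hypothetical voiceprint kept as a running average of sample embeddings."""
    mean: List[float] = field(default_factory=list)
    count: int = 0

    def update(self, embedding: List[float], duration_s: float) -> bool:
        """Fold one sample into the signature; skip samples that are too long."""
        if duration_s > MAX_SAMPLE_SECONDS:
            return False
        if self.count == 0:
            self.mean = list(embedding)
        else:
            if len(embedding) != len(self.mean):
                raise ValueError("embedding length does not match signature")
            n = self.count + 1
            # Incremental mean: each new sample nudges the stored signature.
            self.mean = [m + (x - m) / n for m, x in zip(self.mean, embedding)]
        self.count += 1
        return True
```

Each accepted snippet shifts the stored average slightly, so the signature keeps adapting as more speech is sampled, while anything longer than the 5-second cap is ignored.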
Thirdly, the checkout or fulfillment process can take three forms: one where there is a stored PIN, one where a one-time transaction code is presented, and one where a personalized random phrase is given based on the language preferences captured during onboarding.
For the stored PIN, the verification call will simply ask the consumer to repeat the code sent to their phone. The same could be done using a one-time code presented on screen at a PoS terminal or on the vendor's mobile phone: the verification call will ask the user to read out the code. For the random phrase, the consumer may be asked to repeat a word or number set.
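A rough Python sketch of the check behind such a verification call follows. It assumes the spoken code has already been transcribed to text and that both the enrolled voice and the live sample are available as fixed-length lists of numbers; the function names, the cosine-similarity comparison and the 0.8 threshold are illustrative assumptions rather than a tested design.

```python
import math
import secrets
from typing import List


def issue_code(digits: int = 6) -> str:
    """Generate a random one-time numeric code for the caller to read out."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify(expected_code: str, spoken_code: str,
           enrolled: List[float], sample: List[float],
           threshold: float = 0.8) -> bool:
    """Approve only if the code is right AND the voice is close to the enrolled one."""
    # Constant-time comparison of the transcribed code with the issued one.
    code_ok = secrets.compare_digest(expected_code.encode(), spoken_code.encode())
    # Assumed threshold; a real system would tune this on actual data.
    voice_ok = cosine_similarity(enrolled, sample) >= threshold
    return code_ok and voice_ok
```

The point of requiring both checks is that knowing the code alone is not enough: a caller who reads out the right digits in the wrong voice is still rejected.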
The verification call does not need to be received on the consumer's own handset. Imagine a scenario where your phone is out of charge or has just been stolen, perhaps along with your wallet or purse. Here is the underpinning value: even if you knew my PIN, one-time access code or the phrase that I have been served, it would be near impossible to replicate my exact voice print, with the nuances of accent and the other qualities mentioned earlier.
With the dynamics of authentication and verification streamlined and unshackled from personal terminals and devices, we can now start to re-imagine financial services that are truly universally accessible.