Imagine reducing call times and overall costs while not only meeting but exceeding your customers’ needs, with a genuine increase in the quality of your customer experience.
Every person’s throat, palate and vocal cords are unique, and it is these elements that primarily affect the air being expelled from our lungs and out of our mouths when we speak. Starting with the raw audio waveform generated when we speak, much of which is beyond the human audible range, our voice biometric system processes the waveform to extract key features that are unique to each speaker. From these features a statistical model and a voice signature are built for each person enrolled on the system. When speech audio is subsequently compared, e.g. when authenticating a person against their previously enrolled voice signature, the same process is applied and a similarity measure of the key features is obtained, the value of which indicates a pass or a fail. There are many different features that can be extracted from a speech signal, and speech scientists class them from low level to high level.
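The enrol-then-compare flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how the ValidSoft engine works: feature extraction is taken as already done, each utterance is assumed to arrive as a plain list of numbers, cosine similarity stands in for the similarity measure, and the names `build_signature`, `cosine_similarity`, `verify` and the threshold of 0.8 are all hypothetical.

```python
import math
from typing import List, Sequence


def build_signature(enrolment_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average several per-utterance feature vectors into one signature.

    Assumption: a simple mean stands in for the statistical model.
    """
    if not enrolment_vectors:
        raise ValueError("at least one enrolment vector is required")
    length = len(enrolment_vectors[0])
    if any(len(vector) != length for vector in enrolment_vectors):
        raise ValueError("all enrolment vectors must have the same length")
    count = len(enrolment_vectors)
    return [sum(vector[i] for vector in enrolment_vectors) / count
            for i in range(length)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a silent or empty vector matches nothing
    return dot / (norm_a * norm_b)


def verify(signature: Sequence[float], candidate: Sequence[float],
           threshold: float = 0.8) -> bool:
    """Pass when the similarity reaches the (hypothetical) threshold."""
    return cosine_similarity(signature, candidate) >= threshold
```

In a real deployment the threshold would be tuned to the application, trading off false accepts against false rejects.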
Voice biometrics differs from ‘speech recognition’ in that “voice biometrics knows who you are but not what you’re saying” and so is used to authenticate an individual’s identity, whereas “speech recognition knows what you’re saying but not who you are” and thus is used for voice command applications. It is not unusual for the two technologies to be used in tandem but it is important to understand the distinction between them.
When someone mimics another individual’s voice, they copy high-level language mannerisms that are sufficient to fool human hearing. Our voice biometrics system exploits the full range of features in the voice, many of which are outside the human audible range and all of which are shaped by the speaker’s unique vocal tract. In effect, it’s easy to copy the way a person speaks (accent, mannerisms), and the human ear might well think that an impersonator sounds very much like the person they’re mimicking, but it’s impossible to alter the way speech is produced (the effect of the vocal tract). Thus our voice biometric engine will notice that fundamental features in the voice are not the same.
Biometric systems, including voice, are highly accurate, but no pure biometric solution, regardless of modality, can ever achieve 100% accuracy across all scenarios. Typical accuracy rates for stand-alone voice biometric engines are in the region of 98% to 99% over traditional telephony transmission, approaching 100% where high-definition audio is available, i.e. over Wi-Fi, 3G or 4G. This is why ValidSoft recommends that its voice biometric engine be used in conjunction with its user authentication platform, enabling the system to utilise multiple authentication factors and contextual data to deliver a highly secure and user-friendly experience that’s tuned to the specific application.
Using a recording device to play back another person’s voice is known as a ‘replay attack’. Our voice biometric technology uses a number of different processes to detect such attacks. These range from detecting ‘machine noise’ from the recording and playback device(s), to identifying waveform clipping and compression, spotting missing high and low frequencies that are outside of the human audible range, detecting identical utterances, and applying various ‘liveness detection’ methods.
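Two of the checks named above, waveform clipping and identical utterance detection, are simple enough to sketch. The Python below is a toy illustration under stated assumptions rather than the detection logic of any real product: samples are assumed to be floats normalised to the range -1.0 to 1.0, and the function names, the 1% margin, the 5% clipping limit and the tolerance are all hypothetical.

```python
from typing import Sequence


def clipped_fraction(samples: Sequence[float], full_scale: float = 1.0,
                     margin: float = 0.01) -> float:
    """Return the share of samples sitting at or near full scale."""
    if not samples:
        return 0.0
    limit = full_scale * (1.0 - margin)
    clipped = sum(1 for sample in samples if abs(sample) >= limit)
    return clipped / len(samples)


def looks_identical(a: Sequence[float], b: Sequence[float],
                    tolerance: float = 1e-3) -> bool:
    """Flag two utterances whose waveforms match sample for sample.

    A live speaker never repeats a phrase in exactly the same way, so a
    near-perfect match with an earlier utterance suggests a recording.
    """
    if not a or len(a) != len(b):
        return False
    return max(abs(x - y) for x, y in zip(a, b)) <= tolerance


def replay_suspected(samples: Sequence[float],
                     previous_utterances: Sequence[Sequence[float]],
                     max_clipped: float = 0.05) -> bool:
    """Combine both checks; either one is enough to raise suspicion."""
    if clipped_fraction(samples) > max_clipped:
        return True
    return any(looks_identical(samples, previous)
               for previous in previous_utterances)
```

A production system would compare utterances in a way that tolerates re-encoding and time shifts; the sample-by-sample comparison here only conveys the idea.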
Multi-factor authentication is the means by which our partners layer together a number of different authentication methods and contextual data, some visible and some invisible, to deliver a stronger authentication solution. Simply entering a password, an example of single-factor authentication, has been proven to be a weak solution. A company may therefore continue to use a password but couple it with a biometric. Using a biometric, such as voice, together with other contextual data (such as device ID, user profile data and transaction information), you can deliver secure authentication that is also user-friendly.
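One simple way to picture this layering is a weighted combination of per-factor scores. The sketch below is purely illustrative and makes several assumptions: each factor is assumed to report a score between 0.0 and 1.0, the factor names, weights and threshold are hypothetical, and a weighted average is just one of many possible ways to fuse factors.

```python
from typing import Dict


def combined_score(factor_scores: Dict[str, float],
                   weights: Dict[str, float]) -> float:
    """Return the weighted average of the factor scores.

    A factor that has a weight but no score counts as 0.0.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weight * factor_scores.get(name, 0.0)
                   for name, weight in weights.items())
    return weighted / total_weight


def authenticate(factor_scores: Dict[str, float], weights: Dict[str, float],
                 threshold: float = 0.75) -> bool:
    """Accept the user when the combined score reaches the threshold."""
    return combined_score(factor_scores, weights) >= threshold


# Hypothetical weighting: voice counts most, then password, then device ID.
WEIGHTS = {"voice": 0.5, "password": 0.3, "device_id": 0.2}
```

With these weights, a correct password alone is not enough to pass: a weak voice match from an unknown device keeps the combined score below the threshold.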