Case Study

Eliminating foreign accent bias with video-recorded interviews

We used deep learning to improve job applicant videos for an online recruiter, deciphering foreign accents and eliminating bias in the process.

The Client

A technology-enabling company operating in the HR space, facilitating and automating the interview process.

Candidates go through an automated video interview and submit the recording to the company. The video recording of the interview is transcribed and the transcript is anonymized. The transcripts of all candidates are then analyzed and evaluated by HR professionals to determine which candidates to invite to an interview.

The Challenge

The client platform integrates with state-of-the-art speech recognition engines from Google and IBM to automatically transcribe interviews. As one would expect, the quality of a transcript depends greatly on the quality of the recording, which is a function of recording equipment and ambient noise. However, it turns out that existing speech-to-text solutions cannot reliably process speech with non-North American accents. Many applicants on the platform speak fluent English with an accent, and this impedes the client’s ability to automate transcription.

The Solution

Crater Labs proposed a deep learning solution in which generative adversarial networks stylize the voices of speakers of foreign-accented English, effectively making them sound like speakers of North American-accented English. In addition to advancing the client’s long-term transcription goals, the AI developed by Crater Labs has delivered an immediate benefit by removing background noise from videos, making it easier for human transcribers to hear and understand the speakers.

IP Generated

Crater Labs developed a multi-modal generative adversarial network (GAN) that combines audio with a specialized attention layer, which uses computer vision to analyze mouth shape and anticipate the corresponding speech sounds. The network can currently de-noise videos by removing background noise that cannot be correlated with the sounds anticipated from mouth movement.
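
To make the idea concrete, here is a minimal sketch of how an audio-visual attention layer could gate a spectrogram. It is not Crater Labs' implementation: the function names, the feature shapes, the scaled dot-product attention, and the sigmoid mask are all assumptions chosen for illustration, and the weights are random rather than trained. It covers only the data flow of a generator-side masking step, with no discriminator or adversarial training.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_mask(spec, mouth, w_q, w_k, w_v, w_out):
    """Hypothetical audio-visual attention step.

    spec:  (T_audio, F) magnitude spectrogram frames
    mouth: (T_video, D) mouth-shape features from a vision model (assumed)
    Returns a mask of shape (T_audio, F) with values in [0, 1] and the
    attention weights of shape (T_audio, T_video).
    """
    q = spec @ w_q                     # audio frames as queries
    k = mouth @ w_k                    # video frames as keys
    v = mouth @ w_v                    # video frames as values
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)    # each audio frame attends over video
    context = attn @ v                 # visual context per audio frame
    mask = 1.0 / (1.0 + np.exp(-(context @ w_out)))  # sigmoid gate
    return mask, attn

def denoise(spec, mask):
    # Attenuate time-frequency bins the visual context does not support.
    return spec * mask

# Toy example with random data and untrained weights (illustration only).
rng = np.random.default_rng(0)
T_a, T_v, F, D, H = 100, 25, 64, 20, 16
spec = np.abs(rng.normal(size=(T_a, F)))
mouth = rng.normal(size=(T_v, D))
w_q = 0.1 * rng.normal(size=(F, H))
w_k = 0.1 * rng.normal(size=(D, H))
w_v = 0.1 * rng.normal(size=(D, H))
w_out = 0.1 * rng.normal(size=(H, F))

mask, attn = cross_modal_mask(spec, mouth, w_q, w_k, w_v, w_out)
cleaned = denoise(spec, mask)
print(mask.shape, attn.shape, cleaned.shape)
```

In a trained system, the mask would be learned so that bins consistent with the speaker's mouth movement pass through and uncorrelated background energy is suppressed. Here the weights are random, so the output only demonstrates the shapes and the flow of information between the two modalities.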

Benefits & ROI

This project generated a return on investment of more than 100% from Canadian investment tax credits alone. That figure does not account for the savings generated by the developed technology or the potential value of the resulting IP.
