Eliminating foreign accent bias with video-recorded interviews
We used deep learning to improve job applicant videos for an online recruiter, making foreign-accented speech easier to transcribe and eliminating accent bias in the process.


The Client
A technology-enabling company operating in the HR space, facilitating and automating the interview process.
Candidates go through an automated video interview and submit the recording to the company. The recording is transcribed and the transcript is anonymized. HR professionals then analyze and evaluate the anonymized transcripts of all candidates to determine whom to invite to an interview.
The Challenge
The client platform integrates with state-of-the-art speech recognition engines from Google and IBM to automatically transcribe interviews. As one would expect, the quality of a transcript depends heavily on the recording's quality, which is a function of the recording equipment and ambient noise. However, existing speech-to-text solutions also perform poorly on English spoken with a non-North American accent. Many applicants on the platform speak fluent English with such an accent, and this impedes the client's ability to automate transcription.
The Solution
Crater Labs proposed a deep learning solution in which generative adversarial networks stylize the voices of speakers with foreign-accented English, effectively making them sound like North American-accented English speakers. In addition to advancing the client's long-term transcription goals, the AI Crater Labs developed has delivered an immediate benefit by removing background noise from videos, making it easier for human transcribers to hear and understand the speakers.
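The production system is not described in detail here; as a rough illustration of the approach, the sketch below shows how a GAN for accent style transfer over mel-spectrograms might be set up in PyTorch. The layer sizes, losses, and tensor shapes are assumptions made for illustration only.

```python
# Illustrative sketch only: a minimal GAN for accent style transfer on
# mel-spectrogram frames. All dimensions and losses are hypothetical.
import torch
import torch.nn as nn

N_MELS = 80  # assumed mel-spectrogram resolution

class Generator(nn.Module):
    """Maps source-accent mel frames to target-accent mel frames."""
    def __init__(self, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(N_MELS, hidden, num_layers=2,
                          batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, N_MELS)

    def forward(self, mel):            # mel: (batch, time, N_MELS)
        h, _ = self.rnn(mel)
        return self.out(h)             # converted mel, same shape as input

class Discriminator(nn.Module):
    """Scores whether a mel sequence sounds like the target accent."""
    def __init__(self, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(N_MELS, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)

    def forward(self, mel):
        _, h = self.rnn(mel)           # h: (1, batch, hidden)
        return self.score(h[-1])       # (batch, 1) real/fake logit

if __name__ == "__main__":
    G, D = Generator(), Discriminator()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()

    src = torch.randn(4, 120, N_MELS)   # placeholder source-accent batch
    tgt = torch.randn(4, 120, N_MELS)   # placeholder target-accent batch

    # Discriminator step: real target-accent speech vs. converted speech.
    fake = G(src).detach()
    loss_d = bce(D(tgt), torch.ones(4, 1)) + bce(D(fake), torch.zeros(4, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: fool the discriminator while staying close to the
    # input content (an L1 term stands in for the content-preservation
    # losses a real voice-conversion GAN would use, e.g. cycle losses).
    converted = G(src)
    loss_g = bce(D(converted), torch.ones(4, 1)) \
             + nn.functional.l1_loss(converted, src)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```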
IP Generated
Crater Labs developed a multi-modal generative adversarial network (GAN) that combines the audio signal with a specialized attention layer that uses computer vision to analyze mouth shape and the speech it anticipates. The network can currently de-noise videos by removing background noise that cannot be correlated with the speech anticipated from the speaker's mouth movements.
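The description above is high level; the sketch below is one plausible reading of it in PyTorch, in which spectrogram frames cross-attend to mouth-shape features and the attended output drives a soft mask that suppresses energy not explained by lip movement. The dimensions, the visual front end, and the masking head are all hypothetical, not the actual Crater Labs architecture.

```python
# Minimal sketch: audio frames cross-attend to mouth-motion features, and
# the result produces a per-frequency mask over the noisy spectrogram.
# All shapes and modules are assumptions for illustration only.
import torch
import torch.nn as nn

class AudioVisualDenoiser(nn.Module):
    def __init__(self, n_freq=257, visual_dim=128, d_model=256, heads=4):
        super().__init__()
        self.audio_proj = nn.Linear(n_freq, d_model)       # spectrogram frame -> d_model
        self.visual_proj = nn.Linear(visual_dim, d_model)  # mouth-region embedding -> d_model
        self.cross_attn = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.mask_head = nn.Sequential(nn.Linear(d_model, n_freq), nn.Sigmoid())

    def forward(self, spec, mouth):
        # spec:  (batch, audio_frames, n_freq)    magnitude spectrogram
        # mouth: (batch, video_frames, visual_dim) per-frame mouth-shape features
        q = self.audio_proj(spec)
        kv = self.visual_proj(mouth)
        attended, _ = self.cross_attn(q, kv, kv)   # audio attends to lip motion
        mask = self.mask_head(attended)            # values in [0, 1] per frequency bin
        return spec * mask                         # keep energy correlated with speech

if __name__ == "__main__":
    model = AudioVisualDenoiser()
    noisy_spec = torch.rand(2, 300, 257)   # placeholder noisy spectrogram
    mouth_feats = torch.rand(2, 75, 128)   # placeholder visual features (e.g. 25 fps)
    clean_est = model(noisy_spec, mouth_feats)
    print(clean_est.shape)                 # torch.Size([2, 300, 257])
```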
Benefits & ROI
This project generated more than 100% return on investment from Canadian investment tax credits alone. This figure does not account for the savings generated by the developed technology or the potential value of the resulting IP.