Twenty-First Century Captioning Technology, Metrics and Usability

“[I dream of] an invention that will throw the words spoken directly under the screen”
Emil S. Ladner, Jr. American Annals of the Deaf, 1931 
Gallaudet Alumnus, Teacher at CSD Berkeley


Captioned video is essential for the 36 million Americans who are deaf or hard of hearing. Access to captioned video has a direct impact on participation in society.

In the twenty-first century, video is everywhere: entertainment, news, political engagement, government, schools, postsecondary education, at-home learning, social engagement, and much more. However, captioning has not kept up with the shift from broadcast TV to video that can be produced by anyone. The technology and processes for creating captions are fundamentally the same as in the 1980s and 1990s, and do not serve the needs of consumers today.


Today, our personal devices have high-quality screens and can support customized captions. At the same time, automatic speech recognition has great potential to improve both the quality and the availability of captions for us. We are in the middle of a disruptive transition to captions that can be viewed anywhere, anytime. These new technologies produce different types of caption errors than human captioning techniques, which have evolved over 40 years. As a result, many consumers have been frustrated.

With these new technologies, it is critical to understand how caption errors impact consumers who rely on captioned video. We need a way to measure whether the captions on a video are good enough for consumers. We also need to understand how modern consumer electronics could support better caption usability and viewer experiences.

The Twenty-First Century Captioning Usability & Metrics project has two goals to support this technology transition: first, to develop consumer-focused metrics for caption quality; second, to improve caption usability on all devices.

We have formed a deaf-led five-year research partnership between Gallaudet University, Rochester Institute of Technology and AppTek to achieve these goals. Our approach embraces the perspective of a diverse range of stakeholders, including consumers, caption providers, broadcasters, and other video distributors. 

Gallaudet University campus. Photographed by Yuqing Deng