How Will We Meet the 2006 Captioning Requirements?
Editor: I’ve recently been involved in discussions of captioning issues with some of our local TV stations, and thinking about captioning from their perspective has raised a LARGE red flag! How are we going to meet the demand for captioners in January 2006, when FCC regulations require that 100% of new programming (with a few exceptions) be captioned? This situation is NOT similar to previous increases in captioning requirements (from 25% to 50% to 75%), as this article explains.
The good news is that beginning in January 2006, 100% of new television programming (with a few exceptions) will be required to be captioned. This means that you should be able to turn on your TV set, select any channel, and find the program captioned (provided it was produced after January 2006). This includes all local news, weather, and sports (no more teleprompter captioning that displays only what the reporters read from the teleprompter); it also includes all sporting events, local interest programming, city council coverage, etc. And of course it includes all emergency programming, the captioning of which has been the subject of several FCC complaints from across the country in the past year.
The bad news is that there are not nearly enough real time captioners available to provide the required services. We would have a problem if the increase in required captioning were only 33% (which is the relative increase from the current 75% requirement to the new 100% requirement). But the increase in required captioning is really much more than 33%, as we’ll explain.
Pending Captioner Shortage
Thankfully the days of having to watch a particular TV program because it’s the only one on that’s captioned are past. But as you select your TV viewing over the next couple of weeks, notice what is captioned and what isn’t. Most of the captioned programming is the material that comes from the networks; most of the new programming that isn’t captioned is locally produced. So the 2006 captioning requirement really means that all the local stations will have to caption the programming that they are producing.
During prime time the vast majority of stations are broadcasting network programming that is captioned by the networks. At 8PM, most of the stations meet the captioning requirement because of the efforts of a handful of captioners working the network programming.
But what about the 6PM local news, or the afternoon coverage of the city council meeting? Each local station will have to provide captioning for that programming. So how many stations are simultaneously broadcasting locally generated programming? If we consider the evening local news, we can estimate that roughly one-third of the stations in the country are broadcasting local news at the same time (based on the three major time zones in the US). According to Census Bureau data there are 1,937 TV networks and stations in the US (http://www.census.gov/epcd/cbp/view/us01.txt), which means that roughly 650 of them will require captioning services at the same time.
So how many real time captioners are there? It’s tough to get a good estimate, but the commonly accepted number is between 300 (http://www.ncraonline.org/ppa/fed_initiative/testim-harkin.shtml) and 500 (http://www.ncicap.org/SallyBennett.asp). The clear conclusion is that it’s impossible to meet the January 2006 captioning demand with the current supply of captioners.
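For readers who want to check the arithmetic, here is a rough sketch of the estimate in Python. It uses only the figures quoted above. The function name and the one-captioner-per-station default are my own assumptions for illustration, not industry figures, and the result is a back-of-the-envelope number rather than a forecast.

```python
import math

# Figures quoted in the article
stations_total = 1937                       # TV networks and stations in the US
time_zone_groups = 3                        # the article's three major time zones
captioners_low, captioners_high = 300, 500  # commonly accepted range of captioners

# Stations airing local news at the same time (about one-third of the total)
simultaneous = stations_total / time_zone_groups


def shortfall(captioners, stations_needing_captions, stations_per_captioner=1):
    """Return how many more captioners would be needed.

    Assumption (hypothetical): each captioner covers one station at a time
    unless stations_per_captioner says otherwise.
    """
    needed = math.ceil(stations_needing_captions / stations_per_captioner)
    return max(0, needed - captioners)


print("Stations needing captions at once:", round(simultaneous))
print("Shortfall with 500 captioners:", shortfall(captioners_high, simultaneous))
print("Shortfall with 300 captioners:", shortfall(captioners_low, simultaneous))
```

Even at the high end of the captioner estimate, and even if every captioner in the country worked the same local news hour, the sketch leaves well over a hundred stations uncovered.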
It’s also impossible to train new captioners using traditional methods in the remaining time. January 2006 is 16 months away, but it takes three to five years for most new captioners to complete their training, and only a tiny fraction of them are sufficiently skilled to caption real time television programming (http://www.gwsra.com/CaptionArticle.html).
So What’s the Solution?
So, is there no way out of this situation? Are we destined to forgo required captioning on some programming because of a shortage of service providers? Possibly. But there is a potential solution emerging using voice recognition technology.
It’s long been a dream of the hearing loss community to have a computer program that listens to a voice and produces a text transcript. The ideal program would be able to transcribe whatever anyone says with 100% accuracy. (Such a program would be “speaker-independent”, because it would work for all speakers.) Some “experts” have been predicting that such a program is “just around the corner” for many years. But this seems to be one of those intractable problems that continues to defy the best efforts of a bunch of smart people who are working on it.
But that doesn’t mean that a judicious application of voice recognition technology to the television captioning issue isn’t worth consideration. It only means that solving the problem isn’t ridiculously easy!
While there are currently no voice recognition programs that provide sufficient accuracy when applied as a speaker-independent solution, there are programs that can meet television captioning requirements when the program is trained to a particular speaker.
A Real Time Voice Recognition Application
This technology is being used every day by the CapTel telephone system (http://www.captionedtelephone.com/). Here’s how that system works:
A person with hearing loss calls a person with normal hearing using a CapTel phone. Behind the scenes, the CapTel phone dials in to the CapTel call center, where a trained CapTel operator assists with the call. The person with hearing loss speaks to the hearing person in a normal fashion. The hearing person also speaks to the person with hearing loss in a normal fashion. So far, it’s just like a normal phone conversation.
The difference is that the CapTel operator is in the loop. She revoices everything the hearing person says into a voice recognition system that is trained specifically to her voice. That system converts her words to text and transmits the text over the phone line to a small display on the CapTel phone. There is a short delay between the time something is said by the hearing person and the time the text shows up on the CapTel display, but it’s short enough that the text of a missed name or number usually appears before the conversation stalls.
That’s exactly the technology that can be used to provide television captioning!
Is It Really That Easy?
Well, we don’t really know. Schools that teach voice captioning are just getting started, so there’s not a solid track record to compare to the traditional (steno machine) method. Anecdotal information indicates that people can become proficient in this technology in about six months, rather than the several years required using the steno method. So the possibility is there.
Furthermore, a complete novice can use voice recognition to produce a useful output after just a couple of hours of training! I bought IBM’s ViaVoice a few years ago and trained on it for no more than two hours before using it to caption a local ALDA (www.alda.org) meeting. I wasn’t always able to keep up with the speaker word-for-word, and my accuracy rate was probably closer to 90% than the desired 98% or 99% for television captioning. But it was a whole lot better than nothing!
I encourage our local television stations to be proactive in preventing a captioning debacle in January 2006. You can bet that members of the hearing loss community will be watching local programming on January 1 and filing FCC complaints against stations that ignore the new requirements. I’m predicting that there will be thousands of complaints!
I also think that individual stations can take some pretty simple steps to avoid being the subject of these complaints.
One obvious solution is to have each of the on-camera folks take a couple of hours to train a voice recognition system to their voice. Because television personalities tend to speak slowly and clearly, they are natural candidates for voice recognition technology. The station engineers can feed the program audio into the voice recognition software, which will automatically generate captions that can be fed to the caption encoder. The initial accuracy may be only in the 90% range. But the voice recognition programs include very nice ways of identifying and correcting mistakes, and the program soon learns the appropriate text to produce for a particular vocal sequence. Accuracy should rapidly increase.
A second solution is to hire or develop voice captioners in house. As I write this in August 2004 there is plenty of time to get people trained to provide voice captioning services beginning in January 2006 (or even before). This approach has the disadvantage of inserting another person in the loop to revoice what the on-camera people say – much like the CapTel phone strategy. An advantage of this approach is that a person dedicated to providing voice captioning will improve in speed and accuracy faster than a news anchor who views captioning as just one more thing to be concerned about.
I want to be clear about this proposal. I am NOT advocating that a station provide an inferior captioning product if a better one is available. If a station is able to hire people to accommodate their entire captioning needs after January 2006, so much the better. But the pending captioner shortage potentially affects every station in the country. Those that proactively embrace a backup plan NOW will be among the fortunate few who are able to provide serviceable captioning for ALL new programming beginning in January 2006.