Once a video is transcribed, the transcript appears in a collapsible window below the player. All of the text is visible to search engines, which should help drive more search traffic to individual videos, and it is also time-stamped, so you can click on any sentence and the video will jump to that point. Anytime somebody copies and pastes a portion of the transcript into a blog or other site, a link back to that point in the video is included as well. The startup previously tried building a Flash wrapper for the YouTube player. It has since completely reworked its technology into what it now calls the SpeakerBar, which is more of a transcript plug-in that detects any video on your site that has a matching plug-in. SpeakerText works with video players from YouTube, Brightcove, and Blip.tv, and there is also a WordPress plug-in.
Below is a video explaining how it works, with a SpeakerBar underneath. Click on any sentence to jump to that part of the video.
SpeakerText uses a combination of speech-to-text software, natural language processing, and crowdsourced human labor to create each transcript. Video publishers submit the videos they want transcribed. The videos get a rough first pass from Sphinx-4, open source speech-to-text software developed at Carnegie Mellon University, where co-founder Matt Swanson studied artificial intelligence. (The other founders are CEO Matt Mireles and Tyler Kieft.) The videos are then broken up into 5- to 8-second chunks, which are distributed to human transcribers via Mechanical Turk.
The humans correct the text and punctuation in a digital assembly line, going through their micro-tasks quickly and efficiently. Different workers get ranked based on their work history, which helps in the assignment process. The transcribed video chunks are then pulled back together and reassembled into the complete video, with speech recognition software aligning the text to the video and adding time stamps. Natural language processing software is then used to determine where sentences begin and end, and to create meta tags for more SEO goodness.
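The split-and-reassemble step described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not SpeakerText's actual code: the `Chunk` type, the function names, and the fixed 6-second chunk length are all hypothetical, and the real service varies chunk length between 5 and 8 seconds and uses speech recognition to align text to the video.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    start: float    # seconds from the start of the video
    end: float      # seconds from the start of the video
    text: str = ""  # filled in once a human has corrected the chunk


def split_into_chunks(duration, chunk_len=6.0):
    """Cut a video of `duration` seconds into consecutive chunks.

    Assumption: a fixed chunk length, so the final chunk may be shorter.
    """
    chunks = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_len, duration)
        chunks.append(Chunk(start, end))
        start = end
    return chunks


def reassemble(chunks):
    """Order corrected chunks by start time and return (timestamp, text) pairs."""
    ordered = sorted(chunks, key=lambda c: c.start)
    return [(c.start, c.text) for c in ordered if c.text]


# Chunks can come back from workers in any order; sorting restores the timeline.
chunks = split_into_chunks(20.0)
for chunk, words in zip(chunks, ["Hello and", "welcome to", "the demo", "video."]):
    chunk.text = words
transcript = reassemble(list(reversed(chunks)))
```

Because each chunk carries its own start time, the order in which workers return their micro-tasks does not matter when the transcript is put back together.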
This entire assembly-line process is designed with feedback loops so that it gets better and more automated over time. The service starts at $20 a month for the SpeakerBar, plus $2 per minute for the transcriptions. That is competitive with other transcription services, which seem to start in the $3 to $5 per minute range, and you also get the SpeakerBar. The lower SpeakerText can get its rates, the broader its appeal will be.
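To see how the pricing compares, here is a quick back-of-the-envelope calculation using only the figures quoted above. The function names are made up for illustration, and the competitor is assumed to charge a flat $3 per minute with no monthly fee.

```python
def speakertext_cost(minutes, base_fee=20.0, per_minute=2.0):
    """Monthly cost: SpeakerBar fee plus the per-minute transcription rate."""
    return base_fee + per_minute * minutes


def flat_rate_cost(minutes, per_minute=3.0):
    """Assumed competitor: a flat per-minute rate and no monthly fee."""
    return per_minute * minutes


# At 30 minutes of video a month: $80 versus $90.
print(speakertext_cost(30), flat_rate_cost(30))
```

Under those assumptions, the two come out even at 20 minutes of video a month ($60 either way), and SpeakerText is cheaper beyond that.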
Authors: Erick Schonfeld
 








