
Language the next frontier of artificial intelligence

By: INDICS Operator  Updated: 2018-08-06 10:05:47

Source: China Daily

In the near future, news anchors may be replaced by virtual anchors without viewers even noticing, thanks to artificial intelligence techniques including lip motion synthesis, speech synthesis, joint modeling of audio and video, and deep learning.



The virtual anchor was launched by Sogou — China's fourth-largest internet company by users after Baidu, Alibaba and Tencent. Wang Xiaochuan, chief executive officer of Sogou, showed how the world's first virtual anchor works during the RISE technology conference in Hong Kong earlier this month.

After training and more than an hour of machine computation on video and audio material from a news anchor, a synthesized news video was played during Wang's speech at the conference, entitled "The next frontier of artificial intelligence".

"Lip-reading recognition involved in the virtual anchor can be widely applied in other scenarios too," said Wang. The machine can tell what a person is saying by recognizing lip movements, without recording the voice.

Wang gave another demonstration, this one combining personalized speech synthesis with emotional transference. Using 14 minutes of training data from Wang's voice and the score of the recent popular song My Skateboard Shoes, a synthetic version of the song was produced in Wang's voice, keeping the original melody along with his tone and speaking style.

"Language is the future of AI," Wang said.

Beijing-based Sogou, which went public on the New York Stock Exchange in November last year, is China's second-largest search engine after Baidu. It also operates Sogou Input Method and Sogou browser. "As a search company, we're good at AI because we are clear about application scenarios and the input and output of information," said Wang.

According to Sogou, its mobile keyboard processed an average of 280 million voice requests per day. "We've already achieved 98 percent of the Chinese speech recognition rate. Meanwhile, as China's largest voice input engine, Sogou Input Method helps us collect a huge amount of corpus and user behaviors," Wang said.

Wang explained in his speech that there are two aspects of language in AI: natural interaction, which allows free communication between people and machines through images and voice, and knowledge calculation, which covers conversation, question answering and translation.

He believes natural interaction and knowledge calculation could make communication without language barriers possible, although challenges remain.

"How intelligent the machine could be is what we are thinking about," Wang said. Sogou is developing assistant dialogue systems, which allow machines to tailor their responses to different people.

This year, the company remains focused on language-centered AI technologies, such as translation, voice and computer vision.

Sogou, whose NYSE listing raised $585 million, reported a 26 percent year-on-year increase in profit to $94.4 million for the first quarter of this year, its first fiscal year as a listed company. The result was above market expectations.

Sogou has forecast revenues of between $296 million and $305 million for the second quarter.

Among the services provided by Sogou, search business generates more than 80 percent of the revenue, followed by sales of smart hardware products and internet value-added services.

"Based on Sogou's input method and its search engine, we're expanding our capacity in natural interaction and knowledge computing," said Wang.

In January this year, Sogou came out with the Smart Recording Translator supporting recording, transcription, translation and interpretation of both real-time and recorded conversations.

The company followed it up two months later with the launch of a portable travel translator, an AI-powered device capable of offline translation in 24 languages, image-based translation, and background noise suppression. Exclusively available on JD.com, the units sold out in the first round of orders, with total sales exceeding 10 million yuan.

"Focusing on language, we will launch four AI-powered devices this year. And, we plan to expand our unique content in key verticals like healthcare and laws to make the machines more intelligent," Wang said.


