
Method for Developing Phonetically Rich and Balanced Lexical Corpus

Potential Commercialised
Reg. ID : 16960

Description

The various embodiments of the present invention provide a method for developing a phonetically rich and balanced lexical corpus. The method comprises accumulating a plurality of sentences from a target language through a plurality of data sources. Sentences collected from web sources are raw: they contain unstructured data such as duplicated sentences, alien words, special characters, and boilerplate. At least one sentence is selected from the accumulated sentences. The selected sentences are phonetically rich and balanced while keeping the database size relatively small. A plurality of selected sentences is evaluated to create a balanced database. The result of the method is a phonetically rich and balanced lexical corpus that is only a fraction of the size of the initial lexical database.
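As a rough illustration of the pipeline described above, the following Python sketch cleans raw sentences and then greedily picks a small subset that covers every observed unit. It is not the patented method: the function names, the allowed-character pattern, and the use of character bigrams as a stand-in for phonetic units are all assumptions made for this example. A real system would map words to phonemes with a pronunciation lexicon and would also check that unit frequencies are balanced, which this sketch does not do.

```python
import re


def clean_sentences(raw_sentences, allowed=r"[a-z' ]+"):
    """Normalise whitespace and case, drop duplicates, and drop any sentence
    containing characters outside `allowed` (an assumed filter for alien
    words, special characters, and boilerplate)."""
    pattern = re.compile(allowed)
    seen = set()
    cleaned = []
    for sentence in raw_sentences:
        sentence = " ".join(sentence.lower().split())
        if not sentence or sentence in seen:
            continue
        if not pattern.fullmatch(sentence):
            continue
        seen.add(sentence)
        cleaned.append(sentence)
    return cleaned


def units(sentence):
    """Hypothetical stand-in for phonetic units: character bigrams inside words."""
    return {w[i:i + 2] for w in sentence.split() for i in range(len(w) - 1)}


def select_rich_subset(sentences):
    """Greedy set cover: repeatedly take the sentence that adds the most
    units not yet covered, until every unit seen in the input is covered."""
    target = set().union(*(units(s) for s in sentences))
    covered = set()
    chosen = []
    remaining = list(sentences)
    while covered != target:
        best = max(remaining, key=lambda s: len(units(s) - covered))
        chosen.append(best)
        covered |= units(best)
        remaining.remove(best)
    return chosen


raw = ["the cat sat", "the cat sat", "a cat", "Click here >>", "the hat"]
cleaned = clean_sentences(raw)
subset = select_rich_subset(cleaned)
```

In this toy run the selected subset is smaller than the cleaned input yet still contains every bigram the input contains, which mirrors the idea of a corpus that is only a fraction of the size of the initial database.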

Contact Person/Inventor

Name: Um Centre Of Innovation And Enterprise (Umcie)
Email: umcie@um.edu.my
Contact Phone: 013-2250151 / 03-79677351

