Restricted domain Malay speech synthesizer using syntax-prosody representation

Sabrina Tiun, Rosni Abdullah, Tang Enya Kong

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

A restricted-domain speech application needs a synthesizer whose output quality is as high as that of the 'slot-filler' approach but that retains at least a minimal degree of the flexibility of a 'genuine' speech synthesizer. In this study, we therefore propose an alternative approach to building a speech synthesizer for restricted-domain speech applications. Our approach uses the word as the primary synthesis unit and represents the speech corpus as syntax-prosody tree structures. Speech is synthesized by constructing a syntax-prosody tree for the target input sentence. The tree is constructed by adapting an example-based syntactic parsing approach, and the concatenation of the synthesis units taken from the nodes of the constructed tree forms the synthesized utterance. For evaluation, we conducted a subjective Mean Opinion Score (MOS) test comparing our speech synthesizer with natural speech and two other Malay TTS systems. Based on ANOVA and t-test analyses, we found that the overall MOS scores were: our speech synthesizer output, sound B (mean = 3.34, sd = 1.10); the two other Malay TTS systems, C (mean = 1.95, sd = 0.72) and D (mean = 1.80, sd = 1.04); and natural speech, A (mean = 4.71, sd = 0.21). We conclude that our Malay speech synthesizer sounded more natural, easier to listen to, more pleasant, and more fluent than the other two Malay TTS systems. As expected, the recorded speech was perceived as more natural than the output of our Malay speech synthesizer.
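The concatenation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the `collect_units` and `synthesize` functions, the example sentence, and the toy sample values are all hypothetical, and the example-based parsing that builds the tree is assumed to have already run. The sketch only shows how word units attached to the leaves of a tree could be joined in left-to-right order.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A node in a hypothetical syntax-prosody tree."""
    label: str                           # syntactic category, e.g. "S", "NP", "V"
    word: Optional[str] = None           # set on leaf nodes only
    unit: Optional[List[float]] = None   # recorded word samples (leaf nodes only)
    children: List["Node"] = field(default_factory=list)


def collect_units(node: Node) -> List[Node]:
    """Return the leaf nodes of the tree in left-to-right order."""
    if not node.children:
        return [node]
    leaves: List[Node] = []
    for child in node.children:
        leaves.extend(collect_units(child))
    return leaves


def synthesize(root: Node) -> List[float]:
    """Concatenate the word units found at the leaves of the tree."""
    samples: List[float] = []
    for leaf in collect_units(root):
        if leaf.unit is None:
            raise ValueError(f"no recorded unit for word {leaf.word!r}")
        samples.extend(leaf.unit)
    return samples


# Assumed example: a tree for the made-up sentence "saya makan nasi",
# with toy sample values standing in for recorded word waveforms.
tree = Node("S", children=[
    Node("NP", children=[Node("N", word="saya", unit=[0.1, 0.2])]),
    Node("VP", children=[
        Node("V", word="makan", unit=[0.3]),
        Node("NP", children=[Node("N", word="nasi", unit=[0.4, 0.5])]),
    ]),
])

utterance = synthesize(tree)
```

In this sketch the prosodic information is implicit in which recorded unit is attached to each leaf; how the units are selected and matched to tree positions is left to the method described in the article.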

Original language: English
Pages (from-to): 1961-1969
Number of pages: 9
Journal: Journal of Computer Science
Volume: 8
Issue number: 12
DOI: 10.3844/jcssp.2012.1961.1969
Publication status: Published - 2012

Keywords

  • Malay speech synthesis
  • Restricted domain speech synthesis
  • Syntax-prosody representation

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

Restricted domain Malay speech synthesizer using syntax-prosody representation. / Tiun, Sabrina; Abdullah, Rosni; Kong, Tang Enya.

In: Journal of Computer Science, Vol. 8, No. 12, 2012, p. 1961-1969.

@article{45c645566e624741abf0e6bd0b56fefc,
title = "Restricted domain {Malay} speech synthesizer using syntax-prosody representation",
keywords = "Malay speech synthesis, Restricted domain speech synthesis, Syntax-prosody representation",
author = "Sabrina Tiun and Rosni Abdullah and Kong, {Tang Enya}",
year = "2012",
doi = "10.3844/jcssp.2012.1961.1969",
language = "English",
volume = "8",
pages = "1961--1969",
journal = "Journal of Computer Science",
issn = "1549-3636",
publisher = "Science Publications",
number = "12",

}

TY - JOUR

T1 - Restricted domain Malay speech synthesizer using syntax-prosody representation

AU - Tiun, Sabrina

AU - Abdullah, Rosni

AU - Kong, Tang Enya

PY - 2012

Y1 - 2012

KW - Malay speech synthesis

KW - Restricted domain speech synthesis

KW - Syntax-prosody representation

UR - http://www.scopus.com/inward/record.url?scp=84880098602&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84880098602&partnerID=8YFLogxK

U2 - 10.3844/jcssp.2012.1961.1969

DO - 10.3844/jcssp.2012.1961.1969

M3 - Article

AN - SCOPUS:84880098602

VL - 8

SP - 1961

EP - 1969

JO - Journal of Computer Science

JF - Journal of Computer Science

SN - 1549-3636

IS - 12

ER -