Chinese Prosodic Transcription (CHIPROT) and the Prediction of Prosodic Structure
Hana Třísková ( 廖敏 )

Abstract: This paper introduces a new prosodic transcription called CHIPROT (based on Hanyu Pinyin), primarily designed for L2 teaching purposes. It is a tool for transcribing Chinese utterances and dialogues of colloquial style delivered at a natural tempo.① CHIPROT may also be used to indicate the prominence patterns of short phrases (2–3 syllables) called “minimodules”, or “phonetic chunks”, which draw on the notion of formulaic language. The features labelled by CHIPROT are (1) the degree of prominence of particular syllables (ma, non-bold mā, bold mā, MĀ), and (2) phrasing (prosodic phrases and prosodic words). CHIPROT was initially inspired by the system of Professor O. Švarný (1920–2011). However, it is based on a different phonological analysis of Chinese stress, whose crucial notions are a normal syllable and a weakened syllable. The current paper argues that some features of prosodic structure (particularly syllable weakening) can be predicted, and that these predictions can be built into the transcription procedure. CHIPROT is graphically simple and highly intuitive. It has been tested in pedagogic practice. The final version, presented in this article, has already been systematically applied in a recently published textbook (Třísková, 2021). CHIPROT can be used by teachers and compilers of pedagogic materials and may also be useful for linguists engaged in research on connected speech.
Keywords: Mandarin; Standard Chinese; phonetics and phonology; prosody; prosodic transcription; teaching Chinese as a second language

① In this article, I use the term Chinese for Standard Chinese of the putonghua type, or Mandarin.
160 韵 律 语 法 研 究 第 九 辑

1. Rationale
When considering the sound structure of a given language, most literate laypeople
probably know that vowels and consonants exist. However, they may have
only a vague idea of the features that stretch over vowels and consonants. Usually,
they are not acquainted with the term “prosody”, let alone the term “suprasegmental
features”. Learning that prosody comprises properties such as stress, pauses, intonation,
rhythm, tempo or loudness, they still tend to think that prosodic features are less
important than vowels and consonants, which make up the words and are reflected in the
script. Unfortunately, the same mostly holds for L2 teachers. Chinese language teachers
are no exception: their main suprasegmental concerns are the four lexical tones, tone
sandhi, the neutral tone ( 轻声 ), and disyllabic tone combinations.① Explanations
of the features of connected speech are limited. The same holds for textbooks.

① Important topics proposed for teaching Chinese pronunciation are suggested in Třísková
(2017a) in English, and Třísková (2017b) in Chinese.

Although audio files are included as an integral part of most current textbooks,
clarification of various prosodic phenomena students may hear there (e.g. undershooting
of tonal targets) is mostly missing. In addition, except in early lessons (the number
depends on the particular textbook), sentences are usually presented in Chinese
characters, while Hanyu Pinyin is placed elsewhere. In classroom teaching, teachers are
usually satisfied if students read the sentences character by character, with full tones
(“scripted speech”). Their main pedagogic goal is proper character recognition, good
initials, finals, and tones. As a result, the utterance sounds like
a sequence of isolated words mechanically lined up like beads on a string,
deprived of utterance-level prosody.
However, utterances in natural L1 speech are more than strings of fully pronounced
words. It is not advisable to neglect prosody in L2 (Chinese) teaching, as prosodic
features have vital functions in oral communication. One result of neglecting prosodic
features may be that students speak like robots. Their speech often has no rhythm,
wrongly placed breaks, no changes of syllable prominence, erroneous intonation
patterns, no reflection of information structure, etc. To help students speak more
naturally from the early stages of learning, we can:
1. Give them basic instruction on the prosodic features of spoken Chinese.
2. Provide the characters hand in hand with Hanyu Pinyin notation of sentences
wherever needed and for as long as needed.
3. Furnish Hanyu Pinyin with certain graphic marks rendering major prosodic
features: stress, and grouping (phrasing). This may be called prosodic transcription.
In this paper, I will introduce my proposal for prosodic transcription, called
CHIPROT (Chinese Prosodic Transcription), which was primarily designed for
language teaching purposes. Its tentative versions were presented in Hong Kong in June
2018 (my talk at the Chinese University of Hong Kong) and in Beijing in May 2019
[my talks at the Institute of Linguistics of the Chinese Academy of Social Sciences,
at Beijing Language and Culture University (hereafter referred to as “BLCU”), and
at Capital Normal University]. The final version of CHIPROT was introduced at two
online conferences in 2021 (ICPG-7 invited speech, and CASLAR-6 workshop).①

2. Initial Remarks About CHIPROT


2.1 Sources of Inspiration

Previous systems of prosodic annotation developed for the Chinese language
include Mandarin ToBI and Chinese ToBI (C-ToBI). The ToBI (Tones and Break
Indices) system was originally developed by Janet Pierrehumbert and Mary Beckman
in the early 1990s to transcribe intonation, accent, and prosodic boundaries in American
English. It was conceived within the Autosegmental-Metrical Phonology framework
(Silverman et al., 1992; Beckman & Ayers, 1994). Later on, ToBI systems were designed
for other languages. Research on the Chinese ToBI systems was carried out at the Ohio
State University (Mary Beckman, Marjorie Chan). Mandarin ToBI was proposed in
1999 (Peng et al., 2005: 230, 250). Yet another system, C-ToBI (Chinese ToBI), was
developed in the Institute of Linguistics of the Chinese Academy of Social Sciences.
The first version appeared in 1996, the third in 2002; see Li (2002) and Li & Zu et al.
(2007).
However, to the best of my knowledge, no attempt has yet been made to develop
a comprehensive system for practical Chinese language teaching. The only exception
seems to be the prosodic transcription of Professor Oldřich Švarný ( 史瓦尔尼 ,
1920–2011), Czech linguist, phonetician, and my teacher. Švarný developed a rather
sophisticated non-electronic transcription system over the course of several decades. He
introduced his concept in two articles (Švarný, 1991a, b; Švarný’s articles published in
English and German can be found in the volume Uher & Slaměníková, 2019). Švarný’s
transcription system is based on Hanyu Pinyin. It annotates two major prosodic features:
degree of stress for particular syllables (six degrees altogether), and phrasing (prosodic
phrases and prosodic words). Švarný’s prosodic analysis and his transcription system

① ICPG-7 is the 7th International Conference on Prosodic Grammar, organized by the Faculty of
Linguistic Sciences, BLCU, April 17–18, 2021; CASLAR-6 is the 6th International Conference
on Chinese as a Second Language Research, organized by the George Washington University,
Washington D.C., USA, July 30–August 1, 2021.

were applied in two large non-electronic corpora (subsequently digitized). Both corpora
were recorded by a native Beijing speaker and prosodically annotated afterwards. To date, they
remain a unique source of material.
The larger corpus is related to a voluminous dictionary of Chinese morphemes,
titled A Learning Dictionary of Modern Chinese (in Czech; Švarný, 1998–2000). Next
to grammatical analysis of individual morphemes, it comprises about 16,000 example
sentences illustrating the usage of particular morphemes in context. The sentences were
recorded over the course of several months in 1969 (Švarný employed a single speaker:
Beijing-born Mrs. Tang Yunling Rusková 唐云凌 ). The recordings were prosodically
annotated by Švarný over the following six years (the work was finished in 1976).

Figure 1 Sample of Švarný’s transcription: seven example sentences illustrating the use of the
verb yǒu 有 (A Learning Dictionary of Modern Chinese, 1998–2000: 71). Note that this
version of Švarný’s transcription is slightly different from the version shown in Figure 2.

A smaller and newer corpus forms part of a university textbook, Grammar of
Spoken Chinese in Examples (in Czech; Švarný et al., 1991–1993). The textbook
comprises 260 paragraphs treating particular phenomena of Chinese grammar. These
phenomena are illustrated by several thousand example sentences. The sentences
were recorded in 1990–1991 (by Mrs. Tang Yunling Rusková again) and subsequently
prosodically annotated.

Figure 2 Sample of Švarný’s transcription: example sentences illustrating the use of the verb
yǒu 有 “to have” (Grammar of Spoken Chinese in Examples, Švarný et al., 1991–1993: 46)

Švarný’s transcription system served as a major source of inspiration for CHIPROT.
However, my analysis of syllable prominence is conceived differently (see below). Also,
the graphic form of the two systems is different. Švarný was severely limited by his use
of a typewriter, while CHIPROT may take advantage of electronic fonts and graphics.
While developing CHIPROT, I used Švarný’s second corpus (Švarný et al., 1991–1993).
I compared his transcripts of numerous sentences with my own transcripts. As noted
above, the corpus was recorded by a single native speaker. Mrs. Tang Yunling, being a
rather lively and spontaneous person, recorded the sentences in quite an easy, natural
manner far from slow classroom speech. However, she had lived in Prague, outside of
her native country, for a large part of her life. This fact certainly had a bearing on her
pronunciation, which may have become fossilized in some respects over the years. Thus,
I also had to validate the wider applicability of CHIPROT using other speech materials. I
have transcribed a number of dialogues from several textbooks: HSK Standard Course 1
(Jiang, 2014), Integrated Chinese (Liu et al., 2017), and the Czech Textbook of Chinese
Conversation (Uher et al., 2007, 2016). I selected the samples carefully, choosing
natural-tempo audio recordings and avoiding slow classroom speech. The results
indicate that CHIPROT can be used to transcribe current colloquial Mandarin without
any problems.

2.2 CHIPROT Design and Purpose

CHIPROT marks two prosodic features: degree of prominence of particular
syllables, and prosodic phrasing. Intonation is not annotated, since it would
overburden the transcription. My concept works with only two basic intonation patterns:
falling and non-falling (Třísková, 2021). These are generally predictable from grammar
(falling pattern for finished statements, question-word questions, alternative questions,
and A-not-A questions; non-falling pattern for non-final prosodic phrases, particle ma
吗 questions, and grammatically unmarked questions). A double question mark can
possibly be used in the last two cases, e.g. Nǐ-qù-ma?? 你去吗? Nǐ-qù?? 你去? Tone
sandhi is reflected in CHIPROT, e.g. Ní-hǎo ( 你好 ).

Example of CHIPROT:
这辆汽车是我们的第一辆汽车。“This car is our first car.”
plain Hanyu Pinyin: Zhè liàng qìchē shì wǒmende dì yī liàng qìchē.
CHIPROT transcription: Zhè-liàng qìchē // shì-wǒmende dì-YÍ-liàng qìchē.

CHIPROT prosodic transcription was designed to be as iconic, user-friendly, and
intuitive as possible, thus requiring no complicated instructions for readers. However,
in order to decipher the graphics and read the utterances adequately, readers require
preliminary basic tuition on prosodic features.① This comprises, for example, explaining:
• The means of manifestation of syllable prominence (such as expanding/
compressing pitch range, lengthening/shortening syllable duration, and segmental
reductions in unstressed syllables) and their interplay with the four tones.
• The means of manifestation of prosodic boundaries, besides silent pause (especially
the final lengthening at the end of prosodic phrases and finished utterances).
• Basic intonation patterns (falling and non-falling) and their interplay with the
four tones.
After receiving such instruction, students may be able to read transcribed
utterances with relative ease as they listen to the audio (or even without the audio later

① Of course, solid knowledge of the basics, such as proper pronunciation of disyllabic tone
combinations, or proper pronunciation of T0, is tacitly expected.

on). Transcribed utterances, in combination with audio recordings, can serve as a good
basis for practice and for subsequent work on removing errors with the help of a teacher.
Thus, natural speech production can be learned. More advanced students can use the
same resources to practice speech perception ( 听力 ): they may listen to the audio
recordings and attempt to transcribe the utterances themselves, using CHIPROT. Such
practice may help them become aware of various prosodic features of connected speech
they had not previously noticed.
My major ambition is to offer CHIPROT to teachers and compilers of pedagogic
materials. They can use it to transcribe common colloquial sentences/dialogues
according to the audio recordings. Of course, mastering the transcription procedure
inevitably comprises getting acquainted with the prosodic features of connected Chinese,
and with the essential principles of CHIPROT. Attaining a certain degree of practical
experience with transcription is yet another necessary condition for a satisfactory
outcome.

2.3 CHIPROT Utilization to Date

The CHIPROT system has been tested in teaching practice for several years. Since
2017, I have been using it in my courses on Chinese prosody (Charles University in
Prague) for second-semester students. Introducing the system at this level seems to be
most appropriate: students already know the basics (the initials, the finals, the tones,
basic vocabulary and grammar, simple sentences and phrases). At the same time, they
have not developed fossilized errors. After finishing the course, I always distribute a
questionnaire asking students to describe their impressions. Students’ feedback on the
course in general and CHIPROT in particular has been very positive.①
Further, CHIPROT has already been systematically applied in structured teaching
material. I used it to transcribe about 80 example sentences and short dialogues in the
textbook Speak Chinese with Ease: Prosody of Colloquial Chinese (Třísková, 2021,

① The feedback from the year 2023: “CHIPROT is intuitive. It makes sense and gives me an
insight into the principles.” (P.M.) “For me CHIPROT is an unbeatable transcription. It is easy,
logical and intelligible.” (P.T.) “I am fully satisfied with CHIPROT. It is better than Pinyin.” (L.J.)
“CHIPROT is great, it definitely eases reading.” (A.M.K.)

in print in Czech; the English translation is in progress; the forthcoming textbook was
introduced at the CASLAR-6 conference in 2021). These sentences and dialogues
with audio recordings, illustrating various prosodic phenomena, were selected from
the above-mentioned textbook Grammar of Spoken Chinese in Examples (Švarný et
al., 1991–1993). Švarný’s prosodically annotated corpus is a rich resource offering
examples of prosodic phenomena in many different contexts. However, it is structured
according to grammatical topics. Thus, the examples of particular prosodic phenomena
had to be laboriously retrieved from the corpus.
In what follows I will describe the CHIPROT annotation conventions used to mark
syllable prominence and prosodic units.

3. Syllable Prominence
CHIPROT assumes four degrees of syllable prominence. The theoretical basis
of my prominence concept can be found in Třísková (2020). The crucial notions are
a normal syllable, and a weakened syllable. That is, I view the weakening of normal
syllables as a major issue in examining Chinese stress. Note that except for the highest
degree (emphasis/contrastive stress), I do without the term “stress” in my prominence
scale. However, for convenience, I use the common term “stressed syllable” when
speaking of a phonetically salient (relatively prominent) syllable. Similarly, I use the
term “unstressed syllable” for non-prominent, phonetically weak syllables. After all,
stress is a relative matter. As Feng Shengli points out in his ICPG-7 paper:
   Stress is not a sound, it is a relationship ( 重音不是“音”,重音是关系 )…
Stress (“重”) only exists in relation to non-stress (“轻”)… If we are looking for
stress, we should not look for a stressed item, but for a relative prominence ( 如果
找重音,不是看哪儿重,而是看哪儿有“相对凸显”的关系 ). As for whether
this relationship is reached by enhancing [a particular syllable], or by other means,
it is secondary ( 至于该关系是用“加重”或其他手段来表现,则是第二位的 )
(Feng, 2021: 7).
Feng is speaking of “other means” of expressing relative prominence. We may infer
that syllable weakening may be the most important of these other means. For instance,
an iambic pattern may be attained either by keeping the first syllable fully pronounced
and enhancing the second syllable, or by weakening the first syllable and keeping the
second syllable fully pronounced (without enhancing its prominence). We may add that
the second solution is “cheaper” in terms of articulatory effort and thus may be preferred
by speakers. Feng’s approach to stress is perfectly in line with my own.
What is a meaningful number of stress degrees? Švarný assumed six categories,
which might be too many. His categories were in a way abstract constructs arising from
the combination of two features: 1. Degree of tone fullness; 2. Presence/absence of
stress; see Třísková (2011).
C-ToBI and Mandarin ToBI suggest four degrees of stress. Mandarin ToBI accepts
the following “stress levels” in the stress tier (Peng et al., 2005: 255):
S3: syllable with fully realized lexical tone
S2: syllable with substantial tone reduction, e.g. undershooting of tonal target with
duration reduction
S1: syllable that has lost its lexical tonal specification, e.g. in a weakly-stressed
position
S0: syllable with lexical neutral tone, i.e., inherently unstressed syllable
Like C-ToBI and Mandarin ToBI, CHIPROT also proposes four degrees of
prominence. However, they are conceived differently (see section 3.1). The additional
“top-prominence” syllable is more prominent than S3. On the other hand, S2 and S1 are
collapsed into the “weakened syllable” degree. The reasons for collapsing weak-tone
syllables and neutralized syllables are explained in section 3.5.
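
The correspondence between the two scales can be written down as a small lookup table. The Python sketch below is only an illustration: the names `TOBI_TO_CHIPROT` and `chiprot_degree` are hypothetical, and pairing S3 with the normal syllable and S0 with the toneless syllable is an inference from the comparison above rather than a rule stated by either system. Top prominence has no Mandarin ToBI counterpart and is therefore absent from the table.

```python
# Hypothetical mapping from Mandarin ToBI stress levels (Peng et al., 2005)
# to CHIPROT prominence degrees. S2 and S1 collapse into "weakened";
# S3 -> "normal" and S0 -> "toneless" are assumed correspondences.
TOBI_TO_CHIPROT = {
    "S3": "normal",
    "S2": "weakened",
    "S1": "weakened",
    "S0": "toneless",
}


def chiprot_degree(tobi_level: str) -> str:
    """Return the CHIPROT degree assumed to correspond to a ToBI stress level."""
    return TOBI_TO_CHIPROT[tobi_level]
```

The mapping is many-to-one, so it cannot be inverted: a CHIPROT weakened syllable does not say whether a ToBI annotator would have chosen S2 or S1.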

3.1 Degrees of Syllable Prominence and Their Graphic Form

CHIPROT is based on Hanyu Pinyin. To express syllable prominence, it makes do
with just two graphic distinctions: bold vs. non-bold letters, and lower-case vs.
upper-case letters. As regards the fonts, note that the current form of CHIPROT uses
italics. Giving Hanyu Pinyin in italics is a common practice in texts printed in languages
that use the Latin alphabet. This practice helps readers identify Hanyu Pinyin words at
first sight and distinguish them from the surrounding text. However, Chinese textbooks
and pedagogic materials do not follow this custom; they usually use some
sort of sans-serif typeface.
The four degrees of syllable prominence are marked as follows:①
MĀ (upper case): top-prominence syllable (the most prominent syllable of a prosodic phrase/
utterance; see section 3.6)
tentative Chinese term: 强音节
mā (bold, with tone mark): normal syllable (ordinary syllable with full tone; see section 3.3)
tentative Chinese term: 常音节
mā (non-bold, with tone mark): weakened syllable (the tone is either weakened, or even
completely neutralized; see section 3.5)
tentative Chinese term: 弱音节
ma (non-bold, no tone mark): toneless syllable (morpheme without a lexical tone; see section 3.4)
tentative Chinese term: 无调音节 / 无调语素
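
The four degrees reduce to two binary choices (bold or not, upper case or not) plus the presence of a tone mark. A minimal Python sketch of that encoding follows. The `Degree` enum, the `render` function, and the use of HTML `<b>` tags are illustrative assumptions, and setting the top-prominence syllable in bold as well as upper case is assumed rather than stated above.

```python
from enum import Enum


class Degree(Enum):
    TOP = "top-prominence"
    NORMAL = "normal"
    WEAKENED = "weakened"
    TONELESS = "toneless"


def render(syllable: str, degree: Degree) -> str:
    """Render one Pinyin syllable (with its tone mark, if any) as HTML."""
    if degree is Degree.TOP:
        # Assumption: top prominence is shown in upper case and bold.
        return f"<b>{syllable.upper()}</b>"
    if degree is Degree.NORMAL:
        # Normal syllables are bold and keep their tone mark.
        return f"<b>{syllable}</b>"
    # Weakened and toneless syllables are both non-bold; they differ only in
    # whether the input carries a tone mark, which is left untouched here.
    return syllable
```

For instance, `render("mā", Degree.TOP)` gives `<b>MĀ</b>`, while `render("mā", Degree.WEAKENED)` returns the syllable unchanged.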

3.2 Marking Syllable Prominence: Main Principles

• A minority of Chinese morphemes are toneless, i.e., they do not have a lexical
tone (de 的 , le 了 , ba 吧 , etc.). I call them 无调语素 . They are “unstressed” by default,
as they have no lexical tone which would give them the potential to become “stressed”.
• An absolute majority of Chinese morphemes are tonal, i.e., they have a lexical
tone. I call them 有调语素 . Lexical tone gives them the potential to become prominent,
“stressed”. This potential may or may not be exploited in connected speech. Tonal
morphemes may either realize as normal syllables, as weakened syllables, or as
enhanced syllables.
• Chinese tonal morphemes generally strive to be realized with full, perceptible
tones in connected speech, because tones distinguish lexical meanings of morphemes.
• Quite a few tonal morphemes/syllables may become weakened in connected
speech ( 弱音节 ). Their tone is either weak yet still perceivable ( 弱调音节 ) or completely

① Most of the Chinese terms appearing below emerged from extensive discussions with
Professor Cao Wen at BLCU in October 2011.

neutralized ( 失调音节 ). Syllable weakening does not happen without a reason.
• Syllables which do not have any perceivable tone are called atonic syllables ( 不带调音节 ).
Their pitch is dependent on the tone of the preceding tonal syllable (see,
e.g. Lee & Zee, 2014: 375). Atonic syllables comprise neutralized syllables (which have
lost their tone in connected speech), and toneless morphemes/syllables (which never
had a lexical tone). In Švarný’s corpus around 30% of syllables were atonic. About half
of them were toneless syllables and the other half were neutralized syllables (Švarný,
1991a: 210).
• In connected speech some tonal morphemes/syllables may be prosodically
enhanced, becoming rather prominent ( 强音节 ). Usually, there is only one such syllable
in the utterance/prosodic phrase. The word which carries it is called a nucleus in the
present study. Quite often these cases can be predicted – from grammar, information
structure, pragmatics, etc.
Predictions of the prominence degree of particular syllables can be built into the
transcription procedure as the first step, making the process faster and more theoretically
justified. Thus, a draft of a transcript can be prepared even before listening to the audio.
This draft may serve as a basis for further corrections of the transcript, based on careful
listening and examination of the utterance by means of software for speech analysis
such as Praat.
Below we take a closer look at particular degrees of prominence: normal syllables
(section 3.3), toneless syllables (section 3.4), weakened syllables (section 3.5), and top-
prominence syllables (section 3.6).

3.3 Normal Syllables (mā)

I call syllables realized with ordinary full tone normal syllables 常音节 . Cf. Chao
Yuan Ren’s “normal stress” 正常重音 (Chao, 1968: 35). See also Třísková (2020: 82-83).
Normal syllables are not overly prominent (“stressed”). They just carry full distinguishable
tone. In CHIPROT, normal syllables ( 常音节 ) are printed in bold, carrying a tone mark
(mā). They may be viewed as a default form of tonal morphemes. As a starting point,
we may assume that all tonal morphemes would be realized as fully pronounced normal
syllables in connected speech (pedagogic practice often stops at this point). Accordingly, the
first step of the transcription procedure is to set all tonal syllables in bold
type, each with its tone mark. For instance, in the following sentence the only syllable
that must not be in bold is the toneless lexical suffix zi 子 :
Zhuōzi shàng yǒu sān běn shū. 桌子上有三本书。
“There are three books on the table.”
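
This first step is mechanical enough to sketch in code. The Python example below assumes the utterance has already been split into Pinyin syllables, and its function names are hypothetical. It labels a syllable as normal if it carries one of the four tone diacritics and as toneless otherwise, which yields a draft to be corrected later by listening.

```python
import unicodedata

# Combining marks used for the four Pinyin tones: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def has_tone_mark(syllable: str) -> bool:
    """True if the syllable carries a tone diacritic (the diaeresis of ü is ignored)."""
    decomposed = unicodedata.normalize("NFD", syllable)
    return any(ch in TONE_MARKS for ch in decomposed)


def first_pass(syllables: list[str]) -> list[tuple[str, str]]:
    """Draft labels: every tonal syllable starts as 'normal', the rest as 'toneless'."""
    return [(s, "normal" if has_tone_mark(s) else "toneless") for s in syllables]


# Pre-segmented syllables of the example sentence above.
draft = first_pass(["Zhuō", "zi", "shàng", "yǒu", "sān", "běn", "shū"])
```

In this draft only `zi` is labelled toneless. Weakened and top-prominence syllables are not predicted at this stage; they come from the later steps of the procedure.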
Sometimes it is hard to find any difference in prominence in neighbouring tonal
syllables, as may be the case for the word huāpíngr 花瓶儿 “vase” in the following
utterance:
Bǎ-huāpíngr // fàng-zài ZHUŌzi-shàng. 把花瓶儿放在桌子上。
“Put the vase on the table.”
Yet this does not pose a problem, because the adjacency of normal syllables is
regarded as acceptable/common/natural. It is not viewed as some sort of stress clash.
Thus, CHIPROT liberates the transcriber from the enforced pursuit of stressed syllables
in cases where phonetic material does not offer clear support for such an evaluation.

3.4 Toneless Syllables (ma)

Chinese has a number of toneless morphemes which do not have a lexical tone ( 无调语素 ).
This group is fully predictable from the lexicon – the syllable carries no tone mark
in dictionaries. In CHIPROT, toneless syllables are printed in non-bold type, carrying
no tone mark (ma). Toneless morphemes are “unstressed” by default – their weak
realization is basically predictable. Sometimes they may be prolonged in final position,
i.e., at the end of a prosodic phrase or utterance (the well-known phenomenon of phrase-
final lengthening is more or less universal in languages). Their loudness may also be
non-negligible. They may even display pitch movement. Thus, such syllables may
sometimes sound rather conspicuous. However, the roots of this sort of conspicuousness
do not lie in prominence (“stress”) structure. Rather, such syllables serve as carriers of
emotional or pragmatic meanings.
We shall distinguish two major groups of toneless items: monosyllabic toneless
function words, and second syllables in some types of disyllabic words.

3.4.1 Monosyllabic Toneless Function Words (the Clitics)


structural particles: de 的 , de 得 , de 地
verb aspect particles: le 了 , zhe 着 , guo 过
sentence-final particles: ma 吗 , ne 呢 , a 啊 , le 了
Particles are always “unstressed” (although they may become rather conspicuous
at the end of an utterance). They are always tightly attached to the preceding word. I
borrow the general term “clitics” for this group of Chinese function words.
3.4.2 The Second Syllable in T+T0 Disyllabic Words
While the majority of disyllabic Chinese words have both syllables tonal (e.g.
xuéxiào 学校 “school”), there are also words whose second syllable is inherently toneless,
thus carrying no tone mark in dictionaries, e.g. fu 腐 in the word dòufu 豆腐 “bean
curd”. The normative dictionary Xiandai Hanyu Cidian (abbreviated XHC) prints these
words with a dot between the syllables and no lexical tone mark on the second syllable:
dòu·fu.
This group of words is far from homogeneous. Below are the major cases:
• words with lexical suffixes, e.g. zi 子 as in háizi 孩子 “child”, tou 头 as in gútou
骨头 “bone”, me 么 as in shénme 什么 “what”, men 们 as in rénmen 人们 .
• names of relatives created by reduplication: māma 妈妈 “mom”, gēge 哥哥
“elder brother”
• words with disyllabic morphemes (rare): pútao 葡萄 “grapes”, bóli 玻璃 “glass”
• other nouns, verbs, and adjectives: shìqing 事情 “matter”, péngyou 朋友 “friend”,
míngbai 明白 “to realize”, gàosu 告诉 “to tell”, róngyi 容易 “easy”, piàoliang
漂亮 “beautiful”
In some cases, minimal pairs can be found (Li, 1981):
对头 duìtou “enemy” (XHC duì·tou) duìtóu “correct” (XHC duìtóu)
东西 dōngxi “thing” (XHC dōng·xi) dōngxī “east and west” (XHC dōngxī)
地道 dìdao “genuine” (XHC dì·dao) dìdào “tunnel” (XHC dìdào)
Note that I do not place in the T+T0 group those cases which cannot be retrieved
in the XHC dictionary with a toneless second syllable. If a disyllabic item is not present
in XHC, I keep a tone mark on the second syllable. I view it as a weakened syllable.
Examples are:
• reduplicated monosyllabic nouns: tiāntiān 天天 “every day”
• reduplicated monosyllabic verbs: kànkàn 看看 “take a look” (see section 3.5.4)
• verbs with direction complements: zuòxià 坐下 “sit down” (see section 3.5.5)
• verbs with some resultative complements: kànjiàn 看见 “spot” (see section 3.5.6)
• verbs with complements expressing a short action: zuòyíhuìr 坐一会儿 “sit for a
while”
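
The rule for a weakly pronounced second syllable is in effect a dictionary lookup. In the Python sketch below, `XHC_TONELESS_SECOND` is a tiny hand-made stand-in for the XHC entries printed with a dot, filled only with words cited in this section. It is not an interface to any real dictionary data, and the function name is hypothetical.

```python
# Hypothetical stand-in for XHC entries whose second syllable is toneless.
XHC_TONELESS_SECOND = {"豆腐", "孩子", "朋友", "漂亮"}


def label_weak_second_syllable(word: str) -> str:
    """Label the weakly pronounced second syllable of a disyllabic item."""
    if word in XHC_TONELESS_SECOND:
        return "toneless"  # T+T0 word: no tone mark on the second syllable
    return "weakened"      # tone mark kept even though the syllable is weak
```

Keying the set by characters alone is a simplification: pairs such as 东西 dōngxi and dōngxī share their characters, so a fuller version would have to key on the XHC reading as well.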

3.5 Weakened Syllables (mā), Commonly Weakened Morphemes

Besides toneless words/morphemes/syllables, connected speech delivered at
a natural tempo contains quite a few underlyingly tonal syllables which become
weakened to a greater or lesser degree in particular contexts ( 弱音节 ). They are shorter
and their tone becomes less conspicuous (as a result of a compressed pitch range and
short duration), or sometimes even completely neutralized. The segments of weakened
syllables may have their articulatory targets undershot, and some segments can even
completely disappear in rapid casual speech. The overall number of weakened syllables
in an utterance may vary according to speech style, communication situation, speech
rate, the individual speaker, his/her dialectal background, etc. The largest occurrence
of weakened syllables can be observed in fast colloquial speech used in everyday
communication in Beijing-type Mandarin.
Note that in weakened syllables I do mark the underlying lexical tone, even if
the syllable becomes completely neutralized – for instance, in kànkàn 看看 , dǎkāi 打开 ,
zhèlǐ 这里 , huílái 回来 , tīngbùdǒng 听不懂 , qiǎokèlì 巧克力 . There are several
reasons for this:
1. The tone may reappear in other contexts (e.g. dǎkāi 打开 vs. dǎbùkāi 打不开 ).
2. There is no need for the transcriber to decide whether the tone is “still there” or
“completely gone”. In fact, this is not so important – it may be regarded only as a sort of
phonetic detail in most cases. After all, there is no clear border between a weakened tone
and a neutralized tone. As stated above, the degree of phonetic reduction is a continuum.
On the other hand, what really matters is the presence or absence of the lexical tone –
this is a fundamental phonological feature reflected in dictionaries.
3. Students may appreciate the information about the underlying tone.①
In general, phonetic weakening is a result of semantic weakening (Liang, 2003).
Both phonetic and semantic weakening form a continuum. Let us look at the weakening
continuum now.
• At one extreme point of the weakening continuum there are morphemes that
became purely formal in the course of time, losing their original meaning completely
(e.g. the structural particle de 的 , or the lexical suffix zi 子 ). Such extreme semantic
weakening finally resulted in permanent loss of tone. Permanent loss of meaning/
tone is thus a product of language change. The morpheme was gradually turned into a
toneless morpheme, entering the lexicon as such.
• Next there are morphemes/words which have an underlying lexical tone, yet are
obligatorily pronounced as atonic in specific grammatical contexts. An example is the
negative bù 不 in potential forms of verbs – for example, tīngbùdǒng 听不懂 “be unable
to understand”. Bù 不 has a lexical tone; however, it must be pronounced as atonic in
this context. Another case is the second syllable in reduplicated monosyllabic verbs. An
example of this is kànkàn 看看 “take a look”. The second item should be pronounced as
atonic, although it still has an underlying T4.
• In some cases, we may observe “only” a strong inclination to weak pronunciation
in particular contexts, yet weakening is not obligatory (see below, commonly weakened
morphemes). If such morphemes/words are weak, they may or may not be completely
atonic. Typical examples are monosyllabic personal pronouns, such as tā 他 . The syntactic
function of such pronouns also matters, adding further variety to the weakening
continuum. If tā 他 functions as a subject, standing at the beginning of the utterance, it
may often be weakened, yet it keeps at least the remnants of T1. Consider, for example,
the sentence: Tā bù rènshi wǒ. 他不认识我。“He does not know me.” On the other
hand, if tā 他 functions as an object, it will be prone to completely atonic pronunciation:
Wǒ bù rènshi tā. 我不认识他。“I do not know him.” There may be contexts where tā
他 restores its full tone or even becomes emphasized: Bù shì wǒ, shì tā! 不是我,是他!
“It is not me, it is him!”

① I admit that the presence of a tone mark on fully neutralized syllables may sometimes be
confusing. Yet I decided that the above arguments for the use of a tone mark on such syllables are
sufficiently compelling.

• At the other extreme of the weakening continuum there are content words. They
weaken their meaning/tone(s) only occasionally. This happens, for instance, when
the word is repeated and has no substantial semantic importance in the given context.
Yet such a word tends to keep remnants of tone(s), resisting complete neutralization
(though that may certainly happen in fast, sloppy speech). Content words are least prone
to prosodic weakening, representing the least conspicuous and least frequent/stable/
predictable cases.
In some Chinese monosyllabic words/morphemes with lexical tone, the inclination
to become weakened in connected speech is higher than in other words/morphemes.
They may be weakened rather frequently (in some cases even obligatorily), yet most of
them may occasionally gain prominence and even become strikingly prominent. I will
tentatively call these items commonly weakened morphemes (CWMs).①
Speaking of “commonly weakened morphemes”, the meaning of the word
“commonly” needs to be explained. Importantly, the sources/motivations for this
“common” weakening are not accidental or arbitrary. Weakening is mostly rule-
governed (Třísková, 2020). Thus, many cases of weakening can be predicted – from
grammar, phonology, lexicon, information structure, or pragmatics.
The members of the CWM group share one important feature: a general inclination
to become weakened. This entitles us to establish them as a specific group worth
investigating. Nevertheless, the CWM group is rather heterogeneous. The members
of the group display different grammatical properties, including different degrees of
freeness-boundness. Clearly, the same phonetic surface form (weak/neutralized tone)
may have different sources, rooted in different linguistic levels. We may also observe a
different degree of inclination to become weakened (this may even hold for members of
the same subgroup, such as the cliticoids; see section 3.5.1).

① Chinese morphemes (yǔsù 语素 ) can be either free, representing a monosyllabic word, or
bound, being a part of a compound word. Similarly, a CWM can either be an independent word (such
as tā 他 “he”), or a bound morpheme (such as xiān in xīnxiān 新鲜 “fresh”).

Phonetically, the CWM group (monosyllabic tonal function words in particular)
is rather complicated, as the morpheme/word in question may comprise at least two
possible phonetic forms: strong and weak (cf. English “words with weak forms”, see
Třísková, 2016: 131; 2017c). This has pedagogic consequences – namely, that students
have to master both forms. These phonetic forms may sound quite different. A fully
pronounced tā 他 with noticeable aspiration, an open vowel, long syllable duration,
and a clear, high T1 ([thaː]1) sounds quite dissimilar from a weakened tā 他 with slight
or even no aspiration, a centralized vowel, short syllable duration, and a neutralized tone
([tə]).
The CWM group definitely deserves attention (Třísková, 2016; 2017c; 2020),
as the low degree of prominence of its members is rather common. What is more,
the prosodic behavior of these items displays a good deal of predictability. Thus, a
deeper investigation of the CWM group inventory and the properties of individual
members may help us understand some important principles of natural speech rhythm
in colloquial Mandarin. Importantly, predictions can also be built into the CHIPROT
transcription procedure. This is precisely the reason why I deal with the CWM group in
detail in this article.
An overview of the main cases/subgroups of commonly weakened items follows
below. There are 14 groups altogether, addressed in sections 3.5.1 to 3.5.14. I proceed
from the level of morphemes and words to the levels of phrase and finished utterance.
3.5.1 Monosyllabic Tonal Function Words (the Cliticoids)
Monosyllabic tonal function words tend to become weakened in connected speech.
However, they may occasionally gain prominence if emphasized. For example, the
adverb dōu 都 “all” may be weakened, if it only has a formal meaning, while in other
contexts it may retain its full semantic content, being rather prominent. Compare:
Lián TĀ-dōu lái-le. 连他都来了。 dōu [dɔ] “Even he came.” (construction lián-dōu)
Tāmen DŌU lái-le. 他们都来了。 dōu [doʊ] “All of them came.” (emphasized dōu)
I call this group of words the cliticoids 类附着词 (this term was coined in Třísková,
2016: 134; 2017c: 34; 2020: 94). They are:
• three personal pronouns wǒ 我 “I”, nǐ 你 “you”, tā 他 “he” / 她 “she”
• classifiers, measure words: gè 个 , běn 本 , jiàn 件 , tiáo 条 , xiē 些 …
• monosyllabic prepositions: zài 在 , gěi 给 , dào 到 , bǎ 把 …
• monosyllabic postpositions: shàng 上 , xià 下 , lǐ 里
• the verbs shì 是 , zài 在 , existential yǒu 有
• monosyllabic conjunctions: gēn 跟 , hé 和 …
• formal adverbs: jiù 就 , hěn 很 , dōu 都 , yě 也
• monosyllabic modal verbs: huì 会 , xiǎng 想 , yào 要 …
• the verbs lái 来 , qù 去 with another verb
Note that not all monosyllabic tonal function words belong to the cliticoids (e.g.
zhè 这 “this”). Further, there are some Chinese function words which are disyllabic
(e.g. the personal pronoun wǒmen 我们 “we”, the postposition lǐmiàn 里面 “inside”, the
conjunction kěshi 可是 “but”). These also do not belong to the cliticoids.
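
The inventory above lends itself to a simple lookup table. The following Python sketch is only an illustration and not part of CHIPROT: it holds just the example items listed above (the lists are open-ended, as the ellipses indicate), and the names `CLITICOIDS` and `cliticoid_classes` are hypothetical.

```python
# Illustrative sketch only: the inventory holds just the example items
# named in the text; the real lists are open-ended.
CLITICOIDS = {
    "personal pronoun": {"wǒ", "nǐ", "tā"},
    "classifier / measure word": {"gè", "běn", "jiàn", "tiáo", "xiē"},
    "preposition": {"zài", "gěi", "dào", "bǎ"},
    "postposition": {"shàng", "xià", "lǐ"},
    "verb (shì, zài, existential yǒu)": {"shì", "zài", "yǒu"},
    "conjunction": {"gēn", "hé"},
    "formal adverb": {"jiù", "hěn", "dōu", "yě"},
    "modal verb": {"huì", "xiǎng", "yào"},
    "lái / qù with another verb": {"lái", "qù"},
}


def cliticoid_classes(syllable):
    """Return every cliticoid class that lists the given pinyin syllable."""
    return [name for name, items in CLITICOIDS.items() if syllable in items]


print(cliticoid_classes("zài"))  # listed both as a preposition and as a verb
print(cliticoid_classes("zhè"))  # empty list: zhè 这 is not a cliticoid
```

Because the keys are bare pinyin syllables, homophones are not distinguished; a practical tool would presumably need the Chinese character and the word class of each token as well.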
The last group – the verbs lái 来 , qù 去 – has been newly added to the group of
the cliticoids (Třísková, 2020: 95 only lists 8 groups). These verbs may be used with
another verb (before or after it), meaning “to go to”, “to be about to”, “in order to”. In
this case, they regularly become weakened. They often do not need to be translated at
all, being mainly formal. For instance:
Wǒmen-liǎ lái-TǍOlùn-tǎolùn-ba. 我们俩来讨论讨论吧。
“Let us discuss it.”

Wǒ-yòng hǎo-HUÀ qù-quàn-tā //, tā-yě-bù-TĪNG. 我用好话去劝他,他也不听。
“I used nice words to persuade him, but he still would not listen.”

Tā-dào-pùzi mǎi-RÒU-qù-le. 他到铺子买肉去了。
“He went to the store to buy meat.”

As mentioned above, the group of cliticoids is not homogeneous. The degree of
semantic weakness, or “functionalness”, is not identical in all cliticoids. Consequently,
the degree of willingness to become weakened is not identical either. Take the classifier
gè 个 for example. Its affiliation to the category of function words could hardly
be questioned. It may be emphasized only rarely. On the other hand, modal verbs, the
adverbs such as dōu 都 , etc., may retain some degree of prominence more often. Their
affiliation to the cliticoid group may perhaps raise some questions. For the present
classification, the major criterion is consistent fading of the semantic content of the item
(and subsequent phonetic weakening) in many/most contexts.
3.5.2 Two Neighboring Monosyllabic Tonal Function Words
Sometimes two cliticoids (monosyllabic tonal function words) occur together. This
often happens at the beginning of a sentence. The examples are:
bǎ-tā 把他 “him”
gěi-tā 给他 “to him”
tā-zài 他在 “he at”
nǐ-jiù 你就 “you then”
jiù-shì 就是 “then is”
tā-hěn 他很 “he very”
Usually, both items form a disyllabic prosodic word. The first FW (function word)
receives weak prominence, while the second FW is completely atonic. The result is an
inconspicuous trochee, where the first item is just slightly prominent. I neglect this in
transcription in order not to overburden the CHIPROT graphics, writing both items as non-
bold (and of course with a tone mark). For instance, bǎ-tā 把他 in the following utterance:
Bǎ-tā jiào-dào WǑ-zhèr-lái. 把他叫到我这儿来。
“Call him to me.”
Only if the first item sounds clearly prominent do I put it in bold (here the syllable bǎ):
Bǎ-tā jiào-dào WǑ-zhèr-lái.
Note that in some prosodic words of this type the first FW has no grammatical
relationship with the second FW, e.g. tā-hěn 他很 “he very”. How can such a prosodic
word be formed? The requirements of rhythm may sometimes override the grammar,
causing a word to break away from its grammatical mate and “desert” to the preceding
monosyllabic word, saving it from standing alone. Monosyllabic prosodic words are
generally undesirable. Further, prosodic words of similar length are preferred over
extremes, i.e., very short or very long prosodic words. Thus, 1+3 is conveniently turned
into 2+2 in the following utterance:
Tā-hěn CŌNGming. 他很聪明。
“He is (very) clever.”
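
The regrouping of 1+3 into 2+2 described above can be sketched as a small procedure. This is a toy rule under stated assumptions, not the author's algorithm: the thresholds (a one-syllable group followed by a group of three or more syllables that begins with a monosyllabic word) and the names `syllables` and `rebalance` are chosen for this illustration only.

```python
def syllables(group):
    """Total number of syllables in a group of (word, syllable_count) pairs."""
    return sum(count for _, count in group)


def rebalance(groups):
    """Let a lone monosyllable attract the first word of the next group.

    `groups` is a list of prosodic-word candidates that follow the grammar.
    The 1 / 3+ thresholds are an assumption made for this sketch.
    """
    result = [list(group) for group in groups]  # work on a copy
    for i in range(len(result) - 1):
        left, right = result[i], result[i + 1]
        if syllables(left) == 1 and syllables(right) >= 3 and right[0][1] == 1:
            left.append(right.pop(0))
    return result


grammatical = [[("Tā", 1)], [("hěn", 1), ("CŌNGming", 2)]]  # 1+3
rhythmic = rebalance(grammatical)                            # 2+2
print(" ".join("-".join(word for word, _ in group) for group in rhythmic))
# prints: Tā-hěn CŌNGming
```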
3.5.3 Words Favoring the Trochee Pattern
The majority of the Chinese lexicon consists of disyllabic words. Consequently,
accentuation of disyllabic words represents a major issue in the process of evaluating
the prominence of particular syllables in the speech flow. A small part of the disyllabic
lexicon consists of words with a toneless second syllable, which have a stable
trochee pattern (e.g. háizi 孩子 “child”; see section 3.4.2). However, the majority
of the disyllabic lexicon consists of words with two tonal syllables (e.g. huǒchē 火
车 “train”). For general discussion about their accentuation, see section 3.5.13. In the
present section we shall be concerned just with one specific group.
Some Chinese disyllabic words with two tonal syllables inherently favor the
trochee pattern in most contexts, even prepausally (Wang & Chu, 2008). The
pronunciation of the second syllable is most often completely atonic (that is, the pattern
is 重 轻 , while 重 中 is less common). There may be several dozen such words. The
normative dictionary Xiandai Hanyu Cidian (XHC) marks them by a dot between the
two syllables and a tone mark on the second syllable:
Word    XHC:          my system:    Gloss
做法    zuò·fǎ        zuòfǎ         “method”
这里    zhè·lǐ        zhèlǐ         “here”
新鲜    xīn·xiān      xīnxiān       “fresh”
因为    yīn·wèi       yīnwèi        “because”
底细    dǐ·xì         dǐxì          “details”
刚刚    gāng·gāng     gānggāng      “just”
力量    lì·liàng      lìliàng       “strength”
Note that Beijing speakers regularly pronounce such words with the neutral tone
on the second syllable, see the dictionary Beijinghua Qingsheng Cihui (Zhang, 1957).
For words with a stable neutral tone on the second syllable (dòufu 豆腐 ) see section
3.4.2.
There are many other words which may actually belong to this group, although
XHC does not recognize them as such. That is, they are printed without a dot between
both syllables, e.g. cuòwù 错误 “mistake”, sùdù 速度 “speed”, yuànwàng 愿望 “wish”
(Wang, 2016: 32, Třísková, 2020: 93).
3.5.4 Second Syllable in Many 3–4 Syllabic Words
Accentuation of 3–4 syllabic words (which represent a rather small proportion
of the Chinese lexicon) is relatively stable. In most of them, the second syllable is
pronounced in the neutral tone. The last syllable tends to be the most prominent.
• huǒchēzhàn 火车站 “train station”
• shuǐmòhuà 水墨画 “ink painting”
• búxiùgāng 不锈钢 “stainless steel”
• qiǎokèlì 巧克力 “chocolate”
• Xīshuāngbǎnnà 西双版纳 “the region Xishuangbanna”
• zībénzhǔyì 资本主义 “capitalism”
3.5.5 Second Syllable in Reduplicated Monosyllabic Verbs
Both monosyllabic and disyllabic Chinese verbs may be reduplicated to express a
short, finished action. If a monosyllabic verb is reduplicated, the second syllable should
be pronounced in the neutral tone:
• kànkàn 看看 “take a look”
• shuōshuō 说说 “talk about”
• tīngtīng 听听 “listen”
• chángcháng 尝尝 “taste”
The numeral yī 一 “one” may be inserted between both components: kànyīkàn 看
一看 . The pronunciation of yī is also atonic.
3.5.6 Directional Complements
Directional complements attached to a Chinese verb describe the direction of an
action (up, down, away from the speaker, towards the speaker, etc.). The complement
may either be monosyllabic (e.g. lái 来 in huílái 回来 ) or disyllabic (e.g. chūlái 出来 in
kànchūlái 看出来 ). Directional complements should be pronounced in the neutral tone.
This holds both for monosyllabic and disyllabic directional complements:
• huílái 回来 “come back here”
• huíqù 回去 “go back”
• kūqǐlái 哭起来 “start crying”
• zǒuchūqù 走出去 “walk out”
• kànchūlái 看出来 “figure out”
If a verb with a directional complement is turned into a potential form, the complement
gains prominence (see section 3.5.8).
3.5.7 Some Resultative Complements
Chinese resultative complements indicate the result of an action. Most resultative
complements retain the original meaning of the morpheme, being fully pronounced, e.g.
hǎo 好 (zuòhǎo 做好 “to accomplish successfully”), wán 完 (shuōwán 说完 “to finish
speaking”), cuò 错 (xiěcuò 写错 “to write wrongly”), bǎo 饱 (chībǎo 吃饱 “to eat until
full”). However, some resultative complements are exceptions to this rule. They have
already become formalized and should be pronounced in the neutral tone: jiàn 见 , zhù
住 , dào 到 , zháo 着 , diào 掉 , sǐ 死 , kāi 开 (Lin, 1957: 71).
• kànjiàn 看见 “see, notice”
• jìzhù 记住 “remember”
• xiǎngdào 想到 “give a thought”
• shuìzháo 睡着 “fall asleep”
• chīdiào 吃掉 “eat completely”
• qìsǐ 气死 “be angry to death”
• dǎkāi 打开 “open”
If a verb with a resultative complement is turned into a potential form, the
complement is always prominent (see section 3.5.8).
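
The distinction between fully pronounced and formalized resultative complements can be expressed as a set lookup. The sketch below is an assumption-laden illustration: the set contains only the seven complements listed above, and the labels "full" and "atonic" as well as the name `resultative_pattern` are shorthand invented here, not CHIPROT notation.

```python
# The seven formalized complements listed in the text (Lin, 1957: 71).
FORMALIZED_COMPLEMENTS = {"jiàn", "zhù", "dào", "zháo", "diào", "sǐ", "kāi"}


def resultative_pattern(verb, complement):
    """Return (syllable, prominence) pairs for verb + resultative complement.

    Only the plain (non-potential) form is covered; the labels are
    shorthand assumed for this sketch.
    """
    weak = complement in FORMALIZED_COMPLEMENTS
    return [(verb, "full"), (complement, "atonic" if weak else "full")]


print(resultative_pattern("kàn", "jiàn"))  # complement predicted as atonic
print(resultative_pattern("shuō", "wán"))  # complement keeps its full tone
```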
3.5.8 The Negative bù 不 in Potential Forms of Verbs
In verbs with directional or resultative complements (e.g. chūlái 出来 “to come out”)
the negative bù 不 may be inserted between the verb and the complement, indicating
the impossibility of accomplishing the action (chūbùlái 出不来 “be unable to come
out”). The negative bù 不 should be pronounced in the neutral tone. On the other hand,
the directional/resultative complement becomes fully pronounced, restoring its original tone
(e.g. chūlái vs. chūbùlái).
• kànbújiàn 看不见 “be unable to see”
• zǒubúdòng 走不动 “be unable to walk”
• nábúzhù 拿不住 “be unable to grasp”
• dǎbùkāi 打不开 “be unable to open”
• chībùliǎo 吃不了 “be unable to eat everything”
If the toneless structural particle de 得 is inserted instead of bù 不 , the meaning
is reversed: “to be able to accomplish the action”. The prominence pattern remains the
same as above:
• kàndejiàn 看得见 “be able to see”
• dǎdekāi 打得开 “be able to open”
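
The prominence pattern of potential forms is regular enough to be generated mechanically. The following sketch assumes a monosyllabic complement; the labels "full" and "atonic" and the function names are hypothetical. The infix is spelled bú before a fourth-tone syllable, following the spelling of the examples above.

```python
FOURTH_TONE_VOWELS = set("àèìòùǜ")


def is_fourth_tone(syllable):
    """True if the pinyin syllable carries a fourth-tone mark."""
    return any(ch in FOURTH_TONE_VOWELS for ch in syllable.lower())


def potential_form(verb, complement, negative=True):
    """Return (syllable, prominence) pairs for a potential form.

    Assumes a monosyllabic complement. The inserted bù / de is atonic,
    while the complement restores its full tone.
    """
    if negative:
        infix = "bú" if is_fourth_tone(complement) else "bù"
    else:
        infix = "de"
    return [(verb, "full"), (infix, "atonic"), (complement, "full")]


print(potential_form("kàn", "jiàn"))                # kàn + bú + jiàn
print(potential_form("dǎ", "kāi"))                  # dǎ + bù + kāi
print(potential_form("dǎ", "kāi", negative=False))  # dǎ + de + kāi
```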
3.5.9 Question Words Used as Indefinite or Relative Pronouns
Question words such as shénme 什么 “what”, shéi 谁 “who”, shéide 谁的 “whose”,
wèishénme 为什么 “why”, etc., if used in their original questioning function, are usually
the most salient item in the utterance (see section 3.6.5). However, they may be used as
indefinite pronouns or relative pronouns in other contexts. In this case they become
weakened, although they may retain some degree of prominence and the remnants of
tone (or sometimes even full tone).
For instance, the question word shénme 什么 “what” can be used as an indefinite
pronoun “anything, something”:
Ní-MǍI-shénme //, wǒ-CHĪ-shénme. 你买什么,我吃什么。
“I will eat anything you buy.”
The question word jǐ 几 “how many” may be used in the meaning “a couple of”:
Wǒmen jiā-lǐ zhí-yǒu jǐ-zhī-YÁNG //, MÉI-yǒu-niú. 我们家里只有几只羊,没有牛。
“Our family only has a couple of sheep, we do not have cows.”
The question word duōshǎo 多少 “how many/much” can be used in a negative
sentence, meaning “not much”:
Wǒmen-MÉI-yǒu duōshao-qián. 我们没有多少钱。
“We do not have much money.”
3.5.10 Other Tonal Morphemes Which Tend to Become Weakened


There may be yet other cases of tonal morphemes/words which tend to become
weakened in connected speech, for instance:
• the prefix dì 第 (dì-yī 第一 “the first”, dì-èr 第二 “the second”...)
• yīxià 一下 “at once” after the verb (kànyīxià 看一下 “take a glance”)
• …de shíhou 的时候 “when” (tā lái de shíhou 他来的时候 “when he came”)
3.5.11 The Negative bù 不 in A-not-A Questions
A-not-A questions (see section 3.6.4) represent one type of Chinese question. The
verb/adjective is repeated, while the negative bù 不 is placed between the two items.
If the verb/adjective is monosyllabic, the negative bù 不 should be pronounced in
the neutral tone. The whole structure forms one prosodic word:
• lái-bù-lái? 来不来? “come or not?”
• chī-bù-chī? 吃不吃? “eat or not?”
• qù-bú-qù? 去不去? “go or not?”
• kàn-bú-kàn? 看不看? “look or not?”
• hǎo-bù-hǎo? 好不好? “good or not?”
Note that if the verb is disyllabic, the rhythmic pattern changes: the negative bù
不 assumes a certain prominence, standing as the first item of a new prosodic word.
The repeated verb is rather weak – for instance, Ní-XǏhuan bù-xǐhuan? 你喜欢不喜欢?
“Do you like it or not?” See section 3.6.4.
3.5.12 Monosyllabic Verbs Followed by an Object
A monosyllabic verb may become weakened if it is followed by an object:
xiě 写 “write”
Nǐ-hái-xiě bóKÈ-ma? 你还写博客吗?
“Do you still update your blog?”

3.5.13 The Second Syllable in Non-final Disyllabic Tonal Words


Before dealing with this topic let us make some general remarks about accentuation
patterns in Chinese disyllabic tonal words. Examining “word stress” (词重音) in Chinese
words consisting of two tonal syllables (such as huǒchē 火车 “train”), linguists have not
reached any considerable consensus so far. I assume two underlying patterns (Třísková,
2020: 92):
1. the spondee pattern (“equal-stress pattern”, 重 重 , 等 重 , 轻 重 不 分 ), with
the iamb pattern ( 中重 , 右重 ) as a variant. Note that I do not recognize any need to
establish the iamb as an independent pattern. I view the difference between the spondee
and the iamb as a phonetic detail. The iamb pattern is mostly induced by the prepausal
position (i.e., a post-lexical factor).
2. the trochee pattern ( 重轻 , 左重 ), regardless of the degree of second syllable
weakening (it may be either atonic or weakened, yet it is atonic in most cases).
This solution is quite similar to the analysis of disyllabic stress patterns in Beijing
Mandarin presented in Wang & Feng (2006). They describe two patterns: 左重 (trochee)
and 右重 (iamb). While the 左重 (trochee) pattern always has a weaker second syllable,
in the 右重 (iamb) pattern the weaker first syllable is not a rule (that is, both syllables
may sometimes have equal prominence). Regarding lexical stress, the authors recognize
only one underlying pattern: trochee, or 左重 , which includes 轻声词 (neutral-tone words) and 带调左重词
(left-stressed words whose second syllable carries a tone). All other disyllabic words are argued not to have lexical stress at all ( 不是
左重的双音节词没有词重音 ). They may either be realized as 右重 (iamb) or have
both syllables of equal prominence ( 其左右音节可以看作轻重不分或差不多 ). The
authors finally conclude: “[Only] the trochee pattern can be viewed as lexical stress; all
other stress patterns are induced by factors coming from elsewhere than the lexicon.”
( 左重为词汇重音,非左重形式由词汇以外因素决定 ) Most recently, Feng Shengli
(Feng, 2021: 7) has proposed “a new definition of lexical stress in colloquial speech
style”, taking into account speech style and word frequency, which influence the actual
surface stress pattern of a word. He claims that it is impossible to find a solution for
Chinese lexical stress without taking these factors into account. Feng challenges current
theories of lexical stress, seeing problems in the very understanding of what “stress” as
such is. Feng points out again that stress is a relationship ( 重音是关系而不是音体 ).
He wonders whether the Chinese colloquial speech style has word stress at all.
Regarding the distribution of the realization/surface patterns of disyllabic tonal
words, I generally distinguish two major situations:
– Some words favor the trochee pattern. These were treated in section 3.5.3.
– The majority of Chinese disyllabic tonal words may have more or less variable
accentuation. Disyllabic words with two tonal syllables often keep full tones on both
syllables in connected speech. Sometimes it is hard to tell which syllable is more
prominent. To put it another way, the word assumes a spondee pattern ( 等重 , 重重 ).
Occasionally, the second syllable may be more prominent, the word thus assuming the
iambic pattern ( 中重 , 右重 ). This may happen especially before a pause, as a result of
final lengthening. The spondee (/iamb) pattern may be viewed as a default pattern
of disyllabic tonal words (except for those treated in section 3.5.3, such as 做法 zuò·fǎ
“method”).
Yet in some situations this pattern may be modified. Under the pressure of rhythm
and syntactic/information structure, the second syllable may be pronounced with a weak
or even neutral tone, the whole word assuming the trochee pattern ( 重轻 ).
yīgòng 一共 “altogether”
Yígòng DUŌshao-qián? 一共多少钱?
“How much is the total?”

Weakening of the second syllable is dependent on both context and inherent
properties of the word, such as its inner morphological structure. Speech style and
frequency of the word in speech matter as well (Feng, 2021). The more common and
colloquial the word, the stronger the inclination to resort to the trochee pattern. The
more preserved or perceived the original meaning of the morphemes, and the more
formal the speech style, the stronger the inclination to keep the default spondee pattern
with full pronunciation of both morphemes (or, in other words, the lower the inclination
to weaken the second syllable). For instance, in the word zhīdào 知 道 “to know”
the original meaning of both morphemes (“to know” and “way”) is hardly perceived
nowadays. The pronunciation of the second morpheme is most often atonic. XHC writes
the word as zhī·dào.①

① Note that if the verb zhīdào 知道 is preceded by the negative bù 不 , the pattern is changed:
zhīdào, but bú-zhīdào.
The context, as said above, is an important factor determining the actual word
accentuation. If the word is in prepausal position and/or focused, it tends to keep the
original full prominence of the second syllable. On the other hand, we can observe that
when the word in question is not followed by a break – that is, if “something follows”
– weakening of the second syllable often occurs, though definitely not always (Wang &
Chu, 2008: 143). Let us give a few more examples of non-final disyllabic tonal words
realized as trochees:
hángkōng 航空 “aviation”
Shì-HÁNGkōng-xìn-ma? 是航空信吗?
“Is it an airmail letter?”

jīdàn 鸡蛋 “eggs”
Jīdàn HÉN-hǎochī. 鸡蛋很好吃。
“The eggs are (very) tasty.”

fàngxīn 放心 “relieved”
Wǒ-fàngxīn-DUŌ-le. 我放心多了。
“I was greatly relieved.”
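
The positional tendency illustrated by these examples can be summarized in a few lines of code. The sketch below is deliberately crude and rests on assumptions: it encodes tendencies, not rules (weakening in non-final position occurs often, though definitely not always); it leaves focus out of consideration; and the function name and return labels are invented for this illustration.

```python
def predicted_pattern(favors_trochee, prepausal):
    """Guess the surface pattern of a disyllabic word with two tonal syllables.

    favors_trochee: the word belongs to the group of section 3.5.3.
    prepausal: the word stands directly before a break.
    """
    if favors_trochee:
        return "trochee"           # stable trochee, e.g. zuòfǎ
    if prepausal:
        return "spondee/iamb"      # default pattern is kept before a pause
    return "trochee possible"      # "something follows": weakening is common


print(predicted_pattern(False, False))  # jīdàn in Jīdàn HÉN-hǎochī
print(predicted_pattern(False, True))   # the same word before a pause
```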
This phenomenon is, among other things, undoubtedly related to syntax, as prosodic
structure and syntactic structure are to a large extent interrelated. As Feng (2019a,
2019b) points out, the interaction between syntax and prosody is bidirectional: prosody
not only constrains syntactic structures, but also activates syntactic operations. These
phenomena are also treated in the textbook (Feng & Wang, 2018). Concerning the
relationship between grammar and prosody, see also Lin (1962) and Švarný & Uher
(2014).
It must be pointed out that Chinese disyllabic tonal words are not equally ready to
surrender to such pressures and shift to a trochee pattern. The phenomenon of variable
stress patterns in disyllabic words was analysed by Oldřich Švarný back in the 1970s
(Švarný, 1974). Švarný’s description of such variability was based on a large corpus of
utterances (Švarný, 1998–2000; see section 2.1). He collected numerous tokens of the
same word type, examining their prominence patterns in various contexts.① By means
of descriptive statistics he established seven “accentuation types” of disyllabic words,
based on their degree of willingness to weaken the second syllable, i.e., their inclination
for the trochee pattern.② For instance, the word dòufu 豆腐 “bean curd” belongs to the
extreme type (1), with a 100% inclination for trochee. The word zuòfǎ 做法 “method”
belongs to the next type (2), with a strong inclination for trochee. The following four types (3,
4, 5, 6) display different degrees of variability. The last type (7) includes words such as
lǎoshī 老师 “teacher”, whose willingness to be realized as a trochee is very low. Note
that Švarný was concerned with “non-stress” in disyllabic words, instead of “stress”; cf.
the notion of “ 以轻显重 ”, mentioned in Feng (2021: 14).
Švarný did not explore the conditions and contexts of second syllable weakening in
detail. He only observed the tendency for iambs to occur at the end of a rhythmic unit,
and for trochees to occur at the beginning or inside a rhythmic unit (Švarný, 1998–2000:
xxxviii). This topic is examined, for example, in Wang & Chu (2008). Elucidating
the conditions for weakening of the second syllable in disyllabic words in connected
speech is a task which remains for future research. In any case, Švarný’s early analysis
of accentuation patterns in disyllabic words was well ahead of its time. Feng's current
analysis seems to have much in common with Švarný: “The typical form of
lexical stress in colloquial speech style is atonic or weak pronunciation [of the second
syllable]; these two have nothing to do with ‘stress’ or ‘enhancement’” ( 口语词重音的
典型形式是“轻声”或“轻读”,二者均非“重音”或“加重”) (Feng, 2021: 7).
Important treatments of lexical stress can be found for instance in Kratochvil (1974);
Yin (1982); Wang et al. (2003).
As we know, the issue of “word stress” in Chinese remains unresolved to date.
Variability of accentuation in disyllabic words may be viewed as proof of the non-
existence of word stress in Chinese across the lexicon (Třísková, 2020: 87–94).

① Note that Švarný never used a computer.
② The number of accentuation types was later reduced to five in the textbook Švarný (1991–1993)
(vol. I., p. 176).

3.5.14 Contextual Weakening of Full Words


A content (autosemantic/lexical/full) word may become phonetically weakened in
speech if it is not semantically important in the given context. This sort of weakening is
related to information structure and pragmatics.
• A content word may become weakened if repeated:
qìchē 汽车 “the car”
Zhè-liàng qìchē // shì-wǒmende dì-YÍ-liàng qìchē. 这辆汽车是我们的第一辆汽车。
“This car is our first car.”
• Content words following the focus tend to be weakened (post-focus weakening):
lái 来 “to come”
Tāmen DŌU-lái-le. 他们都来了。
“All of them came.”
This phenomenon may seem quite natural and not hard to understand. It can be
found in many languages. However, it turns out that it is rather difficult to grasp in
pedagogical practice. Students may be reluctant to weaken tones in content words, even
though the information structure requires it. They tend to believe that the meaning of the
whole utterance could be damaged if the tones are not conspicuous enough. In fact, the
opposite is true: if the content word is not weakened in a particular context, the utterance
may sound awkward. While the meanings of single words are clear, the communicative
meaning of the whole message may be blurred or even hard to comprehend. The second
occurrence of the word qìchē 汽车 (the first sentence) may serve as an example. If it is
pronounced just as fully as the first qìchē, the utterance sounds odd.
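
Weakening of a repeated content word is one of the more mechanical cases and can be sketched as follows. The content/function tags are supplied by hand and are an assumption of this illustration, as is the name `mark_repeats`; only exact repetition within one utterance is detected, and post-focus weakening is not modelled.

```python
def mark_repeats(words):
    """Flag each repeated content word as a candidate for weakening.

    `words` is a list of (pinyin, is_content_word) pairs.
    Returns a list of (pinyin, weakened) pairs.
    """
    seen = set()
    result = []
    for pinyin, is_content in words:
        key = pinyin.lower()
        result.append((pinyin, is_content and key in seen))
        if is_content:
            seen.add(key)
    return result


# Zhè-liàng qìchē // shì-wǒmende dì-YÍ-liàng qìchē. (tags assigned by hand)
utterance = [("zhè", False), ("liàng", False), ("qìchē", True),
             ("shì", False), ("wǒmende", False), ("dì", False),
             ("yī", True), ("liàng", False), ("qìchē", True)]

for pinyin, weakened in mark_repeats(utterance):
    print(pinyin, "(weakened)" if weakened else "")
```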

3.6 Top-prominence Syllable (MĀ)

The most prominent syllable of an utterance/prosodic phrase usually belongs to
a content word. Function words bear such prominence less frequently. Theoretically,
only tonal function words can be “stressed” in Chinese. In our analysis, the word which
carries the most prominent syllable is called the nucleus.
Can the most prominent syllable in an utterance/prosodic phrase be predicted? I
shall treat the following situations: default nucleus, emphasis, particle ma 吗 questions,
A-not-A questions, question-word questions, and alternative questions.
3.6.1 Default Nucleus
In so-called “neutral” speech① without any special emphasis (broad focus), the greatest
prominence seems to rest on the last full word of the utterance:
Huǒchē-shàng rén-hěn DUŌ. 火车上人很多。
“There are a lot of people on the train.”

Xiáojiě //, wǒ-yào jì-fēng-XÌN. 小姐,我要寄封信。
“Miss, I want to send a letter.”
The default nucleus may not be particularly salient. In this case duō and xìn may
suffice, instead of DUŌ and XÌN.
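
The default-nucleus rule (greatest prominence on the last full word) can be sketched as a backward search. The content/function tagging is supplied by hand and is an assumption of this illustration; the name `default_nucleus` is hypothetical, and emphasis or question type would of course override the result.

```python
def default_nucleus(words):
    """Return the index of the last content word, or None if there is none.

    `words` is a list of (pinyin, is_content_word) pairs.
    """
    for index in range(len(words) - 1, -1, -1):
        if words[index][1]:
            return index
    return None


# Huǒchē-shàng rén-hěn DUŌ. (tags assigned by hand)
utterance = [("huǒchē", True), ("shàng", False), ("rén", True),
             ("hěn", False), ("duō", True)]

nucleus = default_nucleus(utterance)
print(utterance[nucleus][0].upper())  # prints: DUŌ
```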
3.6.2 Emphasis
Emphasized words bear the greatest prominence in the utterance or prosodic
phrase. Emphasis is fully determined by the will of the speaker and can be found
basically anywhere in the utterance. The items following the emphasized word tend to
be weakened (post-focus weakening).
When two items are contrasted (contrastive stress), both items are usually emphasized:
Tā-xǐhuan chī-YÚ //, bù-xǐhuan chī-RÒU. 他喜欢吃鱼,不喜欢吃肉。
“He likes to eat fish; he does not like to eat meat.”

Negatives, such as bù 不 , méi 没 , bié 别 , are often emphasized:
Wǒmen-MÉI-yǒu duōshao-qián. 我们没有多少钱。
“We do not have much money.”
Non-neutral, emotionally charged words often attract emphasis, e.g. piān 偏
“provocatively”:
- Nǐ-BIÉ-kàn! 你别看! “Do not look at it!”
- Wǒ-PIĀN-kàn! 我偏看! “I’m looking whether you like it or not!”

① In fact, no such thing as “neutral” speech exists in real life. I use this common term only for
the sake of convenience.

A function word may occasionally be emphasized if it is tonal. One example is the
personal pronoun tā 他 :
– SHÉI bù-xǐhuan chī-ròu? 谁不喜欢吃肉? “Who does not like to eat meat?”
– TĀ bù-xǐhuan chī-ròu. 他不喜欢吃肉。“He does not like to eat meat.”

Note that in emotionally charged, expressive speech there may be more emphasized
items in one prosodic phrase. Yet this situation is not so common.
3.6.3 Particle ma 吗 Questions (Polarity Questions)
Polarity questions are those which offer a choice between two possibilities,
expecting either a positive or a negative answer. Because the answer is typically (though
not always) either YES or NO, they are often called yes/no questions. In Chinese,
polarity questions are those comprising the particle ma 吗 . Grammatically unmarked
questions also belong here. In such questions, the most salient item is quite naturally the
item the speaker is asking about. This item will probably carry the nucleus. For instance:
Nǐ-shēnti HǍO-ma? 你身体好吗?
“How are you?” (Are you in good health?)

Shì-HÁNGkōng-xìn-ma? 是航空信吗?
“Is it an airmail letter?”
3.6.4 A-not-A Questions (Affirmative-negative Questions)
A-not-A questions use the affirmative and negative forms of the predicate.
If the verb/adjective is monosyllabic, the first item is the most prominent, while
the pronunciation of the negative bù 不 occurring between both items is atonic. The
three items form a prosodic word:
Tāng RÈ-bú-rè? 汤热不热?
“Is the soup hot or not?”
If the verb/adjective is disyllabic, the rhythmic pattern changes: the negative bù
不 assumes a certain prominence, standing as the first item of a new prosodic word.
The repeated verb is rather weak:
Ní-XǏhuan bù-xǐhuan? 你喜欢不喜欢?
“Do you like it or not?”
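
The two A-not-A patterns can be generated from the syllables of the predicate. The sketch below follows the notation of the examples above (upper case for the top-prominence syllable, hyphens inside a prosodic word, a space between prosodic words); the function names are hypothetical, and only mono- and disyllabic predicates with the negative bù 不 are covered.

```python
FOURTH_TONE_VOWELS = set("àèìòùǜ")


def is_fourth_tone(syllable):
    """True if the pinyin syllable carries a fourth-tone mark."""
    return any(ch in FOURTH_TONE_VOWELS for ch in syllable.lower())


def a_not_a(syllables):
    """Transcribe an A-not-A predicate given as a list of pinyin syllables."""
    word = "".join(syllables)
    top = syllables[0].upper() + "".join(syllables[1:])
    # bù is spelled bú before a fourth tone, as in the examples above
    bu = "bú" if is_fourth_tone(syllables[0]) else "bù"
    if len(syllables) == 1:
        return f"{top}-{bu}-{word}"  # the three items form one prosodic word
    return f"{top} {bu}-{word}"      # bù opens a new prosodic word


print(a_not_a(["rè"]))          # prints: RÈ-bú-rè
print(a_not_a(["xǐ", "huan"]))  # prints: XǏhuan bù-xǐhuan
```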
3.6.5 Question-word Questions


Question-word questions, as the name suggests, contain a question word: who,
where, when, etc. Thus, they are frequently called “wh-questions” in English. When
asking such questions, the speaker is seeking some specific information.
The question word is usually the most prominent item of the utterance:
Tā-SHÉNme-shíhou-lái? 他什么时候来?
“When will he come?”
However, in some contexts, the speaker may emphasize the item he/she is asking
about. This is the case in the following question:
XIĀNGJIĀO zěnme-mài? 香蕉怎么卖?
“How much are bananas?”
Bananas may be placed on a vendor’s stall among other fruits and vegetables.
While asking about the price is expected in such a situation, clear identification of the
item “bananas” is crucial for the speaker to get the right answer. Note that both syllables
in the word xiāngjiāo are marked as top-prominence syllables. The adjacency of such
syllables (like the adjacency of normal syllables) is perfectly acceptable.
3.6.6 Alternative Questions
The speaker offers the hearer two alternatives to choose from using the conjunction
háishi 还是 “or”. Usually, both items are equally prominent:
Nǐ-zhù-ZHÈR //, háishi-NÀR? 你住这儿,还是那儿?
“Do you live here, or there?”
When the choice is between the positive and negative forms of the same thing, the
first item is more salient than the second. The negative, e.g. bù 不 , becomes prominent:
Ní-rènwei zhè-HǍO //, háishi BÙ-hǎo? 你认为这好,还是不好?
“Do you think this is good, or not good?”
If the items to choose from are whole phrases composed of several words/
morphemes, the salient syllables are chosen according to the actual objects of choice:
Tā-shì MĚIguó-rén //, háishi YĪNGguó-rén? 他是美国人,还是英国人?
“Is he American, or English?”

4. Phrasing (Prosodic Units)


Regarding prosodic units, Mandarin ToBI and Chinese ToBI have quite complicated
prosodic hierarchies.
Mandarin ToBI assumes the following prosodic units: minor prosodic phrase,
major prosodic phrase, breath group, and prosodic group. The break indices are:
0 (reduced syllable boundary), 1 (normal syllable boundary), 2 (minor-phrase
boundary), 3 (major-phrase boundary), 4 (breath group boundary), and 5 (prosodic
group boundary).
Chinese ToBI assumes the following prosodic units: prosodic word (PW), minor
prosodic phrase (MIP), major prosodic phrase (MAP), and intonation group (IG).
CHIPROT (like Švarný) assumes prosodic units of three levels:
– prosodic word (PW): words forming one PW are connected by a dash (-)
– prosodic phrase (PPh): the boundary of a non-final PPh is marked by a double slash (//)
– finished utterance: the boundary is marked by a sentence-final punctuation mark (. ? !)
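Because the three levels are marked by plain characters, a transcript line can be split mechanically. The following Python sketch illustrates this; the function name `parse_chiprot` and the nested-list output are assumptions made for the example, not part of the CHIPROT specification.

```python
def parse_chiprot(utterance):
    """Split one finished utterance into prosodic phrases, prosodic words,
    and dash-connected items (returned as nested lists)."""
    # the sentence-final punctuation mark closes the finished utterance
    body = utterance.strip().rstrip(".?!")
    phrases = []
    for chunk in body.split("//"):          # // marks a non-final PPh boundary
        # orthographic commas carry no prosodic information here
        words = chunk.replace(",", " ").split()
        if words:
            # a dash connects the items of one prosodic word
            phrases.append([word.split("-") for word in words])
    return phrases

print(parse_chiprot("Nǐ-zhù-ZHÈR //, háishi-NÀR?"))
# [[['Nǐ', 'zhù', 'ZHÈR']], [['háishi', 'NÀR']]]
```

Syllables written together without a dash (e.g. huǒchēZHÀN) stay in one item, since the notation itself does not separate them.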

4.1 Prosodic Word

A prosodic word (PW) is usually, though not always, composed of several lexical
items. Most often they are grammatically related to each other. However, in rapid speech
a function word may “desert” to the preceding prosodic word (see section 3.5.2). Below
I will review the major structural types of prosodic words.
4.1.1 Single Word
Prosodic words composed of a single word are rather rare. Especially if the word is
monosyllabic (and/or a function word), it seldom stands as a prosodic word. Disyllabic
content words are better able to stand as prosodic words:
Zhè-shì YǏzi. 这是椅子。
“This is a chair.”
4.1.2 Content Word with Attached Function Word(s)
Most commonly, a prosodic word is formed by a content word with function
word(s) attached to it (before, after, or both). In the utterance below we can find two
function words attached to content words: the preposition bǎ 把 (a proclitic), and the
post-verbally placed preposition zài 在 (an enclitic):
Bǎ-xíngli cún-zài huǒchēZHÀN. 把行李存在火车站。
“Store your luggage at the train station.”
A function word contained in a prosodic word does not always have a grammatical
relationship with its neighbor (see section 3.5.2). For instance, the adverb jiù 就 in the
following example grammatically belongs to the following verb:
Wó-ZǍO-jiù kànguo Hóng-Lóu-Mèng. 我早就看过《红楼梦》。
“I read (the novel) Dream of the Red Chamber a long time ago.”
Note that “unstressed” function words cannot stand alone. In connected speech
they have to join some other word to form a prosodic word together. This phenomenon
has important pedagogical consequences.
4.1.3 Two Content Words
Many prosodic words comprise two content words, such as xué 学 “learn” and
zhōngwén 中文 “Chinese” in the following example:
NĚIxiē xuésheng xué-zhōngwén? 哪些学生学中文?
“Which students learn Chinese?”
Sometimes function word(s) may be added, e.g. the personal pronouns tā 他 “he”,
wǒ 我 “I” in the following example (the second prosodic phrase):
Wǒ-ZHĪdào // tā-bù-xǐhuan-wǒ. 我知道他不喜欢我。
“I know that he does not like me.”
Sometimes function word(s) may be inserted between two content words. This is
the case of the unstressed word jǐ 几 meaning “couple of” in the following example:
Zhè-běnr cídián-lǐ // SHÁO-jǐ-yè. 这本儿词典里少几页。
“There are a few pages missing in this dictionary.”
4.1.4 Two Function Words
Some prosodic words are composed of only two function words (usually standing
at the beginning of an utterance or prosodic phrase). These cases have already been
treated in section 3.5.2.
4.1.5 In-between Structures
Some structures (morphemic sequences) cannot be found in a dictionary, so there
are reasons to deny them the status of an independent word. Yet the relationship of
their components may be extremely tight, so they stand as one PW. Major examples are
reduplicated monosyllabic verbs/adjectives, and also verbs with directional, resultative,
and other complements, including potential forms of verbs. These structures should
enter a prosodic word as a whole, without falling apart (e.g. lái-de-zǎo 来得早 “to come
early”).
Note that if the complement is not monosyllabic, it is joined more loosely and can
stand as a separate prosodic word:
Zhè-zhǒng diǎnxin // zuò-de BÙ-hǎo-chī. 这种点心做得不好吃。
“This dessert is not tasty.”
4.1.6 Words with Particles
Verb aspect particles le 了, zhe 着, guo 过, structural particles de 得, de 的, de
地, and sentence-final particles such as ma 吗, ne 呢, le 了, ba 吧 are toneless. They
are always tightly attached to the preceding word, forming a prosodic word. An example
is the particle ma 吗 in the following utterance:
Zhè-shì NǏde-ma? 这是你的吗?
“Is this yours?”
4.1.7 Idioms
Various types of idioms, stereotyped phrases, frequent collocations, etc., usually
form one prosodic word. For instance:
Ài //, Wáng-lǎoshī //, háo-jiǔ-bú-JIÀN! 哎,王老师,好久不见!
“Oh, teacher Wang, we have not seen each other for a long time!”
4.1.8 A-not-A Questions
An A-not-A question forms a prosodic word if the item asked about is monosyllabic:
Tāng RÈ-bú-rè? 汤热不热?
“Is the soup hot or not?”
If the verb/adjective is disyllabic, the patterning is different (see section 3.6.4).

4.2 Prosodic Phrase

Prosodic words join to form larger units: prosodic phrases (PPh). A prosodic phrase
may sometimes contain just one prosodic word. More often there are two or three (rarely
more) prosodic words in one prosodic phrase. In this section, we shall be concerned
with non-final prosodic phrases (i.e., phrases that are not the last one in the utterance).
A PPh boundary may occur after a non-final clause (4.2.1), after a prepositional phrase
(4.2.2), after a longer noun phrase standing utterance-initially (4.2.3), after particular
items in enumerations, and (less frequently) after a predicate followed by a longer
noun phrase. A hearer can detect the boundary using several signals, usually occurring
in combination: non-falling intonation pattern, slight final lengthening, and less often
a silent pause. Note that there may or may not be a comma in the orthography (e.g. a
longer noun phrase standing as a subject is not followed by a comma).
4.2.1 Non-final Clause
Zhèr-yǒu YǏzi //, nàr-yǒu ZHUŌzi. 这儿有椅子,那儿有桌子。
“Here is a chair, and there is a table.”
4.2.2 Prepositional Phrase
Bǎ-huāpíngr // fàng-zài ZHUŌzi-shàng. 把花瓶儿放在桌子上。
“Put the vase on the table.”
4.2.3 Preverbal Noun Phrase
A subject, a time/place expression, or an utterance-initial object may be
followed by a notable prosodic boundary if it is long.
Nèi-jí-běnr SHŪ // wǒ-dōu kànWÁN-le. 那几本书我都看完了。
“I have already read all of those books.”

4.3 Finished Utterance

A prosodic boundary occurring at the end of finished utterances is indicated by
sentence-final punctuation. The acoustic signals of the boundary are more conspicuous
than for non-final prosodic phrases:
– a falling intonation pattern for statements, question-word questions, alternative
questions, and A-not-A questions; a non-falling intonation pattern for particle ma 吗
questions and grammatically unmarked questions
– noticeable final lengthening
– a silent pause (more likely than after a non-final prosodic phrase)
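The first of these signals depends only on the utterance type, so it can be written as a simple lookup. The sketch below is illustrative: the type labels are taken from the list above, while the names `FALLING`, `NON_FALLING`, and `final_intonation` are hypothetical.

```python
# Utterance types as listed above; the string labels are an assumption of this sketch.
FALLING = {"statement", "question-word question",
           "alternative question", "A-not-A question"}
NON_FALLING = {"particle ma question", "grammatically unmarked question"}

def final_intonation(utterance_type):
    """Return the expected utterance-final intonation pattern."""
    if utterance_type in FALLING:
        return "falling"
    if utterance_type in NON_FALLING:
        return "non-falling"
    raise ValueError(f"unknown utterance type: {utterance_type!r}")

print(final_intonation("A-not-A question"))      # falling
print(final_intonation("particle ma question"))  # non-falling
```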

5. CHIPROT Cookbook
In the previous sections, I have tried to demonstrate that many features of prosodic
structure can be predicted. In this section, I describe the CHIPROT
transcription procedure, which involves certain predictions. I have chosen four sentences, (A),
(B), (C), and (D), as examples to clarify the procedure. It has five steps (or six if we
include step /0/). Steps /2/, /3/, and /4/ are predictions.

Step /0/
The sentence is jotted down or already available in plain Hanyu Pinyin (in italics).
Tonal syllables carry tone marks, toneless syllables carry no tone mark: mā, ma.
(A) Zhuōzi shàng yǒu sān běn shū. 桌子上有三本书。
“There are three books on the table.”

(B) Shì hángkōng xìn ma? 是航空信吗?
“Is it an airmail letter?”

(C) Ní mǎi shénme, wǒ chī shénme. 你买什么,我吃什么。
“I will eat anything you buy.”

(D) Bǎ tā jiào dào wǒ zhèr lái. 把他叫到我这儿来。
“Call him to me.”

Step /1/
All tonal syllables will be put in bold type (they of course carry a tone mark): mā.
This can be easily done by putting the whole sentence in bold type and then unbolding
toneless syllables (see section 3.4). Typically there are very few or even no toneless
syllables in a sentence.

(A) Zhuōzi shàng yǒu sān běn shū. 桌子上有三本书。
The only toneless syllable is the lexical suffix zi 子.

(B) Shì hángkōng xìn ma? 是航空信吗?
The only toneless syllable is the question particle ma 吗.

(C) Ní mǎi shénme, wǒ chī shénme. 你买什么,我吃什么。
There are two toneless syllables: two occurrences of the suffix me 么.

(D) Bǎ tā jiào dào wǒ zhèr lái. 把他叫到我这儿来。
There is no toneless syllable in this sentence.
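Since step /1/ only asks whether a syllable carries a tone mark, it is easy to check mechanically. The Python sketch below is an illustration under assumptions: the input is already segmented into syllables (plain Pinyin does not mark syllable boundaries inside a word), and the names `TONE_MARKS`, `is_tonal`, and `toneless_syllables` are hypothetical.

```python
import unicodedata

# Combining marks used for the four tones in Hanyu Pinyin:
# macron (1st), acute (2nd), caron (3rd), grave (4th).
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}

def is_tonal(syllable):
    """True if the Pinyin syllable carries a tone mark."""
    # NFD separates base letters from their combining marks
    decomposed = unicodedata.normalize("NFD", syllable)
    return any(ch in TONE_MARKS for ch in decomposed)

def toneless_syllables(syllables):
    """Return the syllables that would be left unbolded in step /1/."""
    return [s for s in syllables if not is_tonal(s)]

sentence_a = ["zhuō", "zi", "shàng", "yǒu", "sān", "běn", "shū"]
sentence_d = ["bǎ", "tā", "jiào", "dào", "wǒ", "zhèr", "lái"]
print(toneless_syllables(sentence_a))  # ['zi']
print(toneless_syllables(sentence_d))  # []
```

The diaeresis of ü is not in `TONE_MARKS`, so a bare ü is not mistaken for a tone mark.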

Step /2/
Tonal words/morphemes that are predicted to be weakened will be unbolded:
mā. Many of them will be cliticoids (see section 3.5.1). Note that normal syllables may
neighbor each other (see section 3.3).

(A) Zhuōzi shàng yǒu sān běn shū. 桌子上有三本书。
The following items can be predicted as weakened: the postposition shàng 上, the
verb yǒu 有, and the classifier běn 本. All of them belong to the cliticoids.

(B) Shì hángkōng xìn ma? 是航空信吗?
Only one item can be predicted as weakened at this point: the verb shì 是, belonging
to the cliticoids. In the phrase hángkōng xìn 航空信 “airmail letter” we have to
wait for step /4/ to decide on the prominence of particular syllables. The reason is that the
weakening/enhancement of some syllables may be rooted in the information structure.

(C) Ní mǎi shénme, wǒ chī shénme. 你买什么,我吃什么。
The following items can be predicted as weakened: the personal pronouns nǐ 你,
wǒ 我 (the cliticoids) and the question word shénme 什么 used as a relative pronoun
(3.5.9).

(D) Bǎ tā jiào dào wǒ zhèr lái. 把他叫到我这儿来。
The following items can be predicted as weakened: the preposition bǎ 把, the personal
pronoun tā 他, the preposition dào 到, and the personal pronoun wǒ 我. All of them
belong to the cliticoids. The word zhèr 这儿 is a place word, thus we keep it as a normal
syllable at this point. The verb lái 来 functions as a directional complement here, being
just a formal indicator of the direction towards the speaker. Thus, it will be predicted as
weak.
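The prediction in step /2/ can be approximated by a lexicon lookup plus a few role tags. The sketch below is only a rough illustration: `CLITICOIDS` contains just the items named in examples (A) to (D), not the full inventory of section 3.5.1, and the role labels and the function name `predict_weakened` are hypothetical. Keying the lexicon by character ignores the fact that the same character may also occur as a fully stressed content word.

```python
# Illustrative lexicon: only the cliticoids named in examples (A)-(D).
CLITICOIDS = {"上", "有", "本", "是", "你", "我", "他", "把", "到"}
# Hypothetical role tags for items whose weakening depends on their function.
WEAK_ROLES = {"directional complement", "relative pronoun"}

def predict_weakened(tokens):
    """tokens: list of (pinyin, hanzi, role) triples; role may be None.
    Returns (pinyin, weakened) pairs."""
    return [(pinyin, hanzi in CLITICOIDS or role in WEAK_ROLES)
            for pinyin, hanzi, role in tokens]

sentence_d = [
    ("bǎ", "把", None), ("tā", "他", None), ("jiào", "叫", None),
    ("dào", "到", None), ("wǒ", "我", None), ("zhèr", "这儿", None),
    ("lái", "来", "directional complement"),
]
print([p for p, weak in predict_weakened(sentence_d) if weak])
# ['bǎ', 'tā', 'dào', 'wǒ', 'lái']
```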

Step /3/
We mark the phrasing (prosodic words and prosodic phrases) using the symbols - and //.
The words which would presumably be tightly bound in speech, forming a
prosodic word, will be connected by a dash, e.g. sān-běn-shū 三本书 . Remember that
toneless and weakened items cannot stand alone. The most frequent weak, unstressed
items are the clitics (3.4.1) and cliticoids (3.5.1).
Short utterances usually stand as a single prosodic phrase. Its boundary is already
marked by a sentence-final punctuation mark. Longer utterances may be composed of
two or, less commonly, three or more prosodic phrases. The boundary of the non-final
prosodic phrase will be marked by a double slash (//). With respect to decisions about
prosodic boundaries of non-final prosodic phrases, the relevant factors to consider were
outlined in section 4.2.

(A) Zhuōzi-shàng-yǒu sān-běn-shū. 桌子上有三本书。
The weak postposition shàng 上 must be tightly attached to the preceding noun.
The verb yǒu 有 has a close grammatical relationship with the following noun phrase
sān běn shū 三本书. However, in rapid speech yǒu 有 will most probably “desert” to
the preceding word zhuōzi 桌子 under the pressure of rhythm, forming a prosodic
word with it. The noun phrase sān běn shū 三本书 forms a rather typical prosodic
word composed of a numeral, a classifier, and a noun.

(B) Shì-hángkōng-xìn-ma? 是航空信吗?
The copula shì 是 is usually “unstressed”, thus it cannot stand alone. In this sentence,
it would join the following noun phrase, hángkōng xìn 航空信, as a proclitic. Shì could
certainly be prominent if it is emphasized by the speaker (then it could possibly even
stand alone as a prosodic word). However, this is hard to determine without knowing
the previous context and hearing the audio, so at this point we presume shì is a weak
proclitic. The toneless question particle ma 吗 has no other choice but to be attached
to the preceding word. The resulting prosodic word, forming a prosodic phrase and a
finished utterance at the same time, is rather long (five syllables). This long prosodic
word could fall apart into two prosodic words if the speaker hesitates and inserts a break
after the verb shì 是. This may be manifested by perceptible lengthening and (in the case of
strong hesitation) by a silent pause. If shì is emphasized, it could also possibly stand
alone as a prosodic word, as mentioned above.

(C) Ní-mǎi-shénme //, wǒ-chī-shénme. 你买什么,我吃什么。
This utterance is composed of two clauses, and thus most probably of two prosodic
phrases divided by a break. The prosodic boundary would be manifested by a final
lengthening of the syllable me 么, and possibly by a silent pause. The personal pronouns
nǐ 你, wǒ 我 will be tightly attached to their verbs as proclitics.

(D) Bǎ-tā jiào-dào wǒ-zhèr-lái. 把他叫到我这儿来。
Two cliticoids bǎ 把, tā 他 at the beginning of the utterance will most probably
form a disyllabic prosodic word. The preposition dào 到 is placed after the verb in this
sentence. Its pronunciation is typically atonic in such a position, tightly joining the
preceding verb as an enclitic. The expression wǒ zhèr 我这儿 is a set phrase “here
where I am”, thus both items must be tightly joined. The last item lái 来 is formal and
weak, so it joins the preceding item.
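Deciding which neighbor a weak item joins requires the grammatical and rhythmic judgment described above, but the resulting grouping can be rendered and checked mechanically. The sketch below assumes that the grouping has already been chosen by the transcriber; the function name `render_phrasing` and the (text, weak) item format are hypothetical. It enforces only one constraint from the text: a single weak or toneless item cannot stand alone as a prosodic word.

```python
def render_phrasing(phrases, final_mark="."):
    """phrases: list of prosodic phrases; each phrase is a list of prosodic
    words; each prosodic word is a list of (text, weak) items."""
    rendered = []
    for phrase in phrases:
        words = []
        for word in phrase:
            # a lone weak item must join some other word
            if len(word) == 1 and word[0][1]:
                raise ValueError(f"weak item {word[0][0]!r} cannot stand alone")
            words.append("-".join(text for text, _ in word))
        rendered.append(" ".join(words))
    # orthographic commas are not reproduced in this sketch
    return " // ".join(rendered) + final_mark

sentence_d = [[
    [("Bǎ", True), ("tā", True)],
    [("jiào", False), ("dào", True)],
    [("wǒ", True), ("zhèr", False), ("lái", True)],
]]
print(render_phrasing(sentence_d))  # Bǎ-tā jiào-dào wǒ-zhèr-lái.
```

A prosodic word made of two weak items, such as Bǎ-tā, passes the check, in line with section 4.1.4.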

Step /4/
We look for the items which are presumably the most prominent in the utterance:
the words carrying emphasis, contrastive stress, default nucleus, etc. (3.6). Pertinent
syllables will be capitalized: MĀ.

(A) Zhuōzi-shàng-yǒu sān-běn-SHŪ. 桌子上有三本书。
If there is no concrete context suggesting that the speaker wishes to emphasize
some particular word (e.g. zhuōzi 桌子, shàng 上, sān 三), this utterance will have a
default nucleus on the last content word shū 书.

(B) Shì-HÁNGkōng-xìn-ma? 是航空信吗?
In this particle ma 吗 question, the speaker is obviously wondering whether this
letter will be sent by airmail or as ordinary mail (píngxìn 平信). Thus, in the phrase
hángkōng xìn 航空信 “airmail letter” we can expect the word hángkōng 航空 to be the
most prominent. Because it is not prepausal, its accentuation pattern will probably be a
trochee (3.5.13). The word xìn 信 is not semantically important in the given context and
will undergo post-focus weakening.

(C) Ní-MǍI-shénme //, wǒ-CHĪ-shénme. 你买什么,我吃什么。
The two verbs mǎi 买 “buy” and chī 吃 “eat” are semantically the most important,
thus they will probably be the most prominent items of the utterance. Both relative
pronouns shénme 什么 (3.5.9) can be expected not just to be weakened, but even to be
completely atonic here (post-focus weakening).

(D) Bǎ-tā jiào-dào WǑ-zhèr-lái. 把他叫到我这儿来。
The listener is urged to take somebody to the speaker. Thus the most important
word in this imperative sentence is clearly the personal pronoun wǒ 我 in the expression
wǒ zhèr 我这儿 “here where I am”. The morpheme zhèr 这儿 and the morpheme lái 来
are only supplementary and formal. Their weak, atonic pronunciation is supported by
their post-focal position.
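Of the four cases above, only the default nucleus in (A) follows from position alone; (B), (C), and (D) depend on information structure and cannot be derived this way. A minimal sketch of the default case follows. The function name `mark_default_nucleus` and the (text, weak) item format are hypothetical, and the sketch capitalizes the entire last non-weak item, which matches the examples only when that item is monosyllabic, as shū is in (A).

```python
def mark_default_nucleus(words):
    """words: list of prosodic words, each a list of (text, weak) items.
    Capitalizes the last non-weak item and returns the transcript line
    without sentence-final punctuation."""
    last = None
    for wi, word in enumerate(words):
        for ii, (_, weak) in enumerate(word):
            if not weak:
                last = (wi, ii)          # remember the latest content item
    out = []
    for wi, word in enumerate(words):
        items = [text.upper() if (wi, ii) == last else text
                 for ii, (text, _) in enumerate(word)]
        out.append("-".join(items))
    return " ".join(out)

sentence_a = [
    [("Zhuōzi", False), ("shàng", True), ("yǒu", True)],
    [("sān", False), ("běn", True), ("shū", False)],
]
print(mark_default_nucleus(sentence_a) + ".")  # Zhuōzi-shàng-yǒu sān-běn-SHŪ.
```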

Step /5/
Listen to the audio and make corrections. I have tried to show that many
prosodic features can be predicted without hearing the audio recordings because they are
rule-governed to a large extent. However, our predictions may certainly be imperfect.
Speech tempo, speech style, individual habits of the speaker, specific information
structure or pragmatic context, etc. may influence the surface prosodic form and make
some of our predictions wrong. Careful listening to the audio is thus the last step, which
gives the transcript the final touch. Even then, the prominence of particular
syllables or the phrasing may remain questionable at some points. In such
cases speech analysis software (such as PRAAT) would be needed to support our final
assessments. Such unclear cases should not be frequent.

6. Minimodules
I have shown how the CHIPROT transcription can be used to transcribe whole
utterances. It may also be used to indicate the prominence structure of commonly used
short phrases such as:

Example       Characters   Pattern        Prominence
shù-shàng     树上         trochee        ●•
ní-hǎo        你好         iamb           •●
zhè-běn-shū   这本书       cretic         ●•●
gěi-bàba      给爸爸       amphibrach     •●•
wūzi-lǐ       屋子里       dactyl         ●••
zài-Běijīng   在北京       bacchius       •●●
xuéxiào-lǐ    学校里       antibacchius   ●●•
I call these brief, two- or three-syllable sequences minimodules, or phonetic
chunks. They draw on the notion of formulaic language (Třísková, 2017c) and can be
efficiently used in pedagogic practice. The labels for prominence patterns are borrowed
from the verse meters of Ancient Greek poetry. Note that minimodules do not need to employ
the highest degree of prominence (MĀ), since most of them are not finished utterances.
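The correspondence between prominence patterns and meter labels is a fixed mapping, which the following sketch spells out. Only the seven patterns listed above are included; the names `FEET` and `foot_name` are hypothetical.

```python
# Prominence patterns (True = prominent ●, False = weak •) and their labels,
# limited to the seven minimodule types listed above.
FEET = {
    (True, False): "trochee",
    (False, True): "iamb",
    (True, False, True): "cretic",
    (False, True, False): "amphibrach",
    (True, False, False): "dactyl",
    (False, True, True): "bacchius",
    (True, True, False): "antibacchius",
}

def foot_name(pattern):
    """pattern: string of ● and • symbols; returns the label or None."""
    symbols = pattern.replace(" ", "")   # tolerate stray spaces
    key = tuple(ch == "●" for ch in symbols)
    return FEET.get(key)

print(foot_name("●•"))   # trochee
print(foot_name("•●●"))  # bacchius
```

Patterns outside the list (for instance two prominent syllables in a row) return `None` rather than a guessed label.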

7. Conclusion
The CHIPROT transcription was, above all, designed as a pedagogic tool. It may
aid those who are studying Chinese as a second/foreign language and struggling with
the prosodic form of utterances. The aim is to help learners speak with more ease,
fluency, and naturalness. Language teachers may try CHIPROT out here and there. They
may find it useful to exploit some of its features while preparing teaching materials and
handouts. Students can experiment with CHIPROT. They may, for example, find it
useful to draw up some transcripts related to particular lessons (the annotation procedure
is relatively easy, user-friendly, and computer-friendly; the system does not contain
any unusual graphic marks, complicated conventions, etc.). However, my long-term
objective is to encourage writers of pedagogic materials to incorporate CHIPROT into
their texts. This would, of course, take a good deal of prosodic knowledge and practical
transcription skill. Indeed, this is the process I have been through myself, discovering
various flaws, drawbacks, and traps in the system and improving it step by step.
Linguists engaged in research on connected speech may also find CHIPROT
useful. It may help them discover major prosodic rules and tendencies while analysing
the prosodic form of Chinese utterances anchored in real communication contexts –
instead of artificial, fabricated sentences pronounced in isolation. As the transcription
procedure can be executed in several clear steps, it may perhaps even be automated to
some extent. The necessary software, if designed, could be used to process larger sets of
speech data such as spoken language corpora.
CHIPROT certainly may have its shortcomings or points which escaped my notice. Yet
I trust that its final version represents a rather consistent, theory-based, and robust system.
Its occasional blind spots or lurking problems may be successfully solved in the course of
time. Feedback from future users of the CHIPROT system may greatly help to polish it. Any
comments or criticisms would certainly be welcome.

References

Beckman M E, Ayers G M. 1994. Guidelines for ToBI Labeling. Ohio State University. [Link]
[Link]/tobi/[Link].
Chao Y R ( 赵元任 ). 1968. A Grammar of Spoken Chinese. Berkeley and Los Angeles: University of California
Press.
Feng S L ( 冯胜利 ). 2019a. Prosodic Syntax in Chinese: History and Changes. New York: Routledge.
Feng S L ( 冯胜利 ). 2019b. Prosodic Syntax in Chinese: Theory and Facts. New York: Routledge.
Feng S L ( 冯胜利 ). 2021. 韵律语体语法与汉语的词重音 (Prosody of stylistic-register grammar and lexical
stress in Chinese). Paper presented at the 7th International Conference on Prosodic Grammar (ICPG-7),
Tianjin.
Feng S L ( 冯胜利 ), Wang L J ( 王丽娟 ). 2018. 汉语韵律语法教程 (A Course of Prosodic Grammar).
Beijing: Peking University Press.
Jiang L P ( 姜丽萍 ). 2014. HSK 标准教程 1 (HSK Standard Course 1). Beijing: Beijing Language and Culture
University Press.
Kratochvil P. 1974. Stress shift mechanism and its role in Peking dialect. Modern Asian Studies, 8.4: 433-458.
Lee W-S, Zee E. 2014. Chinese phonetics. In: Huang C-T J, Li Y-H A, Simpson A. The Handbook of Chinese
Linguistics. Oxford: Wiley Blackwell, 369-399.
Li A J ( 李爱军 ). 2002. Chinese prosody and prosodic labeling of spontaneous speech. Proceedings of Speech
Prosody, Aix-en-Provence, 39-46.
Li A J, Zu Y Q. 2007. Corpus design and annotation for speech synthesis and recognition. In: Lee C H, et al.
Advances in Chinese Spoken Language Processing. Hong Kong: World Scientific, 263-268.
Li W M ( 厉为民 ). 1981. 试论轻声和重音 (Discussion on the neutral tone and stress). 中国语文 (Studies of
the Chinese Language), 1: 35-40.
Li Z Q ( 李智强 ). 2018. 汉语语音系的与教学研究 (Studies in Acquisition and Teaching of Mandarin
Chinese Phonetics). Beijing: Beijing Language and Culture University Press.
Liang L ( 梁磊 ). 2003. 声调与重音——汉语轻声的再认识 (Tone and stress: the Chinese neutral tone
revisited). 第六届全国现代语音学学术会议论文集 ( 上 ) (Proceedings of the 6th National Conference
on Modern Phonetics, 1). Tianjin: Nankai University: 192-197.
Lin T ( 林焘 ). 1957. 现代汉语补足语里的轻音现象所反映出来的语法和语义问题 (Grammatical and
semantic problems related to the non-stress phenomenon in modern Chinese complements). 北京大学学
报 ( 人文科学 ) (Journal of Peking University; Philosophy and Social Sciences), 9: 61-74.
Lin T ( 林焘 ). 1962. 现代汉语轻音和句法结构的关系 (The relationship between non-stress and grammatical
structure in Modern Chinese). 中国语文 (Studies of the Chinese Language), 7: 301-311.
Liu Y H, et al. 2017. Integrated Chinese (4th ed.). Boston: Cheng & Tsui Company.
Peng S-H, Chan M K M, Tseng C-Y, et al. 2005. Towards a Pan-Mandarin system for prosodic transcription.
In: Jun S-A. Prosodic Typology: The Phonology of Intonation and Phrasing. Oxford: Oxford University
Press, 230-270.
Silverman K, Beckman M, Pitrelli J, et al. 1992. ToBI: a standard for labeling English prosody. Proceedings of
the 1992 International Conference on Spoken Language Processing (ICSLP 92), 867-870.
Švarny O. 1974. Variability of tone prominence in Chinese (Pekinese). Asian and African Languages in Social
Context. Dissertationes Orientales (34). Praha: Academia, 127-186.
Švarny O. 1991a. The functioning of the prosodic features in Chinese (Pekinese). Archiv Orientální, 59.2:
208-216.
Švarny O. 1991b. Prosodic features in Chinese (Pekinese): prosodic transcription and statistical tables. Archiv
Orientální, 59.3: 234-254.
Švarny O. 1998-2000. Učební Slovník Jazyka Čínského, I-IV (A Learning Dictionary of Modern Chinese, I-IV).
Olomouc: Palacky University.
Švarny O, et al. 1991-1993. Gramatika Hovorové Čínštiny v Příkladech, I-IV (A Grammar of Spoken Chinese
in Examples, I-IV). Bratislava: Komensky University.
Švarny O, Uher D. 2014. Prozodická Gramatika Čínštiny (A Prosodic Grammar of Chinese). Olomouc:
Palacky University.
Třísková H. 2011. Prozodická transkripce čínštiny O. Švarného: čtyři historické verze (O. Švarny´s prosodic
transcription of Chinese: four subsequent versions). Nový Orient, 66.4: 45-50.
Třísková H. 2016. De-stressed words in Mandarin: drawing parallel with English. In: Tao H Y. Integrating
Chinese Linguistic Research and Language Teaching and Learning. Amsterdam/Philadelphia: John
Benjamins Publishing Company, 121-144.
Třísková H. 2017a. Acquiring and teaching Chinese pronunciation. In: Kecskes I. Explorations into Chinese
as a Second Language. Cham: Springer International Publishing, 3-30.
Třísková H. 2017b. 普通话语音教学探究 (A Journey through the teaching of the sounds of Standard Chinese).
In: College of Arts, Capital Normal University ( 首都师范大学文学院 ). 燕京论坛 2014 (Yanjing
Forum 2014). Beijing: Social Sciences Academic Press, 243-279.
Třísková H. 2017c. De-stress in Mandarin: clitics, cliticoids, and phonetic chunks. In: Kecskes I, Sun C F. Key
Issues in Chinese as a Second Language Research. New York & London: Routledge, 29-56.
Třísková H. 2020. Is the glass half-full, or half-empty? The alternative concept of stress in Mandarin Chinese
(玻璃杯半满抑或半空?汉语重音的另类观). 韵律语法研究 (Studies in Prosodic Grammar) (第四辑),
2019 (2): 64-105. Beijing: Beijing Language and Culture University Press.
Třísková H. 2021. Mluvte čínsky hezky: prozodie hovorové čínštiny (Speak Chinese with Ease: Prosody of
Colloquial Chinese). Praha: Academia Publishing House.
Uher D, et al. 2007. Učebnice čínské konverzace I (Textbook of Chinese Conversation I). Praha: Leda.
Uher D, et al. 2016. Učebnice čínské konverzace II (Textbook of Chinese Conversation II). Praha: Leda.
Uher D, Slaměníková T. 2019. Oldřich Švarný: Prosodia Linguae Sinensis. Special issue of the journal Far
East, 9.1: 74-106. Available at [Link]
archiv/Dalny_vychod_IX_1_2019_e_verze.pdf.
Wang Y J ( 王韫佳 ). 2016. 轻声规范和教学琐议 (A Discussion of the Standardization and Teaching of the
Neutral Tone). 国际汉语教学研究 (Journal of International Chinese Teaching), 2: 26-35.
Wang Y J ( 王韫佳 ), Chu M ( 初敏 ). 2008. 关于普通话词重音的若干问题 (Some problems related to
lexical stress in putonghua). 中国语音学报 (Chinese Journal of Phonetics), 1: 141-147. Beijing: The
Commercial Press.
Wang Y J ( 王韫佳 ), Chu M ( 初敏 ), He L ( 贺琳 ), et al. 2003. 连续话语中双音节韵律词的重音感知 (The
perception of stress in disyllabic prosodic words in connected Chinese). 声学学报 (Acta Acoustica),
28.6: 534-539.
Wang Z J ( 王志洁 ), Feng S L ( 冯胜利 ). 2006. 声调对比法与北京话双音组的重音类型 (Tonal contrast
and disyllabic stress patterns in Beijing Mandarin). 语言科学 (Linguistic Sciences), 5.1: 3-22.
Yin Z Y ( 殷作炎 ). 1982. 关于普通话双音常用词轻重音的初步考察 (A preliminary study on stress patterns
of common disyllabic words in putonghua). 中国语文 (Studies of the Chinese Language), 3: 168-173.
Zhang X R ( 张洵如 ). 1957. 北京话轻声词汇 (A Dictionary of Neutral-tone Words in Pekinese). Beijing:
Zhonghua Book Company.

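Chinese Prosodic Transcription (CHIPROT) and the Prediction of Prosodic Structure
(汉语韵律标注(CHIPROT)与韵律结构的预测)
Hana Třísková (廖敏)
Oriental Institute, the Czech Academy of Sciences (捷克科学院东方研究所), Prague
triskova@[Link]

Abstract (translated from the Chinese): This paper introduces a new prosodic transcription
called CHIPROT (based on Hanyu Pinyin), originally designed for second-language teaching.
It is a tool for transcribing colloquial Putonghua utterances delivered at a natural tempo.
CHIPROT may also be used to mark the prominence patterns of “minimodules”, or phonetic
chunks (phrases of 2–3 syllables). Minimodules draw on the notion of formulaic language.
CHIPROT marks the following features: (1) the degree of prominence of individual syllables
(ma, mā, mā, MĀ); (2) phrasing (prosodic words and prosodic phrases). The system was
constructed under the inspiration of Professor Švarný's system and of Mandarin ToBI and
C-ToBI, on the basis of a different phonetic analysis of Chinese stress. The core notions
of CHIPROT are the “normal syllable” and the “weakened syllable”. The author argues that
many features of prosodic structure (especially syllable weakening) can be predicted, and
that these predictions can be used in the transcription procedure. CHIPROT is graphically
iconic and intuitive. It has been tested in teaching practice. The final version presented
in this paper has already been applied in a recently published textbook. Not only teachers
and writers of teaching materials find CHIPROT handy; linguists engaged in research on
connected speech will also find it useful.

Keywords: Chinese; Putonghua; phonetics and phonology; prosody; prosodic description;
teaching Chinese as a second language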
