ERA

Permanent link (DOI): https://doi.org/10.7939/R3ZG3H

Communities

This file is in the following communities:

Linguistics, Department of

Collections

This file is in the following collections:

Research Publications (Linguistics)

The Wenzhou Spoken Corpus (Open Access)

Descriptions

Author or creator
Newman, John
Lin, Jingxia
Butler, Terry
Zhang, Eric
Subject/Keyword
Wu language
Dialectology
Corpora (linguistics)
Type of item
Journal Article (Published)
Language
English
Place
China
Description
The creation of the Wenzhou Spoken Corpus, an online searchable corpus of a modern Chinese dialect, presents a number of challenges that are of interest to the corpus linguistic community. We review issues involved with collection of spoken data, its transcription and markup, as well as the functionality of the search tools. The transcription makes use of Chinese characters as well as IPA symbols for Wenzhou colloquial forms not conventionally represented by characters. XML was adopted as the standard for the basic format of files, with file searches expressed in XPath form. The search tools provide the usual options of restricting searches by age, gender, etc., and yield concordances and tables of collocates. Though the collection of data for the corpus was ‘opportunistic’ in some ways, and so not ideally balanced or representative, it is nevertheless proving to be a valuable tool for corpus-based research on Wenzhou.
Date created
2007
DOI
doi:10.7939/R3ZG3H
Rights
© 2007 Edinburgh University Press
Citation for previous publication
Newman, J. et al. (2007). The Wenzhou Spoken Corpus. Corpora, 2(1), 97-109.

File Details

Date Uploaded
Date Modified
2014-05-01T02:30:37.573+00:00
Audit Status
Audits have not yet been run on this file.
Characterization
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 611767 bytes
Last modified: 2015-10-12T13:14:19-06:00
Filename: Corpora_2_2007_97.pdf
Original checksum: a6a76549b49291f85a289baddd032e1e
Well formed: true
Valid: true
File title: The Wenzhou Spoken Corpus
File author: Mr Davies
Page count: 13