
490 People - Thai Speech Data by Mobile Phone_Guiding
- 490 people
- 15.7 hours
- 50 sentences for each person
Datatang is certified under the ISO27001 Information Security Management System and the ISO9001 Quality Management System.


Data Introduction
Thai speech data (guiding) is collected from 490 native Thai speakers and recorded in a quiet environment. The recordings are rich in content, covering multiple categories such as in-car scenes, smart home, and speech assistant. Each speaker reads 50 sentences. The valid volume is 15 hours. All texts are manually transcribed with high accuracy.
Data Specification
- Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
- Recording environment: quiet indoor environment, without echo
- Recording content (read speech): smart car; smart home; speech assistant
- Speakers: 490 Thai speakers, 58% of whom are female
- Device: Android mobile phone, iPhone
- Language: Thai
- Transcription content: text, time point of speech data, 4 noise symbols, 5 special identifiers
- Accuracy rate: 95% (the accuracy rate of noise symbols and other identifiers is not included)
- Application scenarios: speech recognition, voiceprint recognition
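A quick way to confirm that a delivered audio file matches the stated format (16 kHz, 16-bit, uncompressed, mono) is to read its header. The following is a minimal sketch using only the Python standard library; the function names `check_wav_format` and `make_silent_wav` are hypothetical helpers introduced for illustration and are not part of any Datatang tooling.

```python
import io
import wave

# Expected values taken from the format entry of the specification.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit = 2 bytes per sample
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(source):
    """Return a list of mismatches against the spec (empty if the file conforms).

    `source` may be a file path or a binary file-like object.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
        if wav.getcomptype() != "NONE":
            problems.append(f"compression type is {wav.getcomptype()}")
    return problems


def make_silent_wav(rate, width, channels, seconds=0.1):
    """Build an in-memory silent WAV file, used here only to exercise the check."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (int(rate * seconds) * width * channels))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(check_wav_format(make_silent_wav(16000, 2, 1)))  # conforming file
    print(check_wav_format(make_silent_wav(8000, 2, 2)))   # wrong rate and channels
```

In practice, `check_wav_format` would be called with the path of each delivered file; the in-memory files above only stand in for real recordings.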
Sample
- บริเวณใกล้ๆมีห้างสรรพสินค้าอะไรบ้าง
- หนึ่งอาคารที่เป็นสัญลักษณ์เด่นๆในพื้นที่บริเวณนี้มีอะไรบ้าง
- เปลี่ยนเพลง[[lipsmack]]
- พรุ่งนี้ฉงชิ่งฝนจะตกไหม
- เดือนนี้ใช้ไฟไปเท่าไหร่[[lipsmack]] แล้ว
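
The sample transcripts embed noise annotations inline, for example `[[lipsmack]]`. For speech recognition training it is often useful to separate the spoken text from these annotations. The sketch below assumes that every noise symbol and special identifier uses the same double-square-bracket notation seen in the samples; the full tag inventory is not listed here, so that is an assumption, and `split_transcript` is a hypothetical helper name.

```python
import re

# Assumption: all annotations look like [[name]], as in the [[lipsmack]] samples.
ANNOTATION = re.compile(r"\[\[([^\[\]]+)\]\]")


def split_transcript(line):
    """Return (clean_text, tags) for one transcript line.

    Tags are removed without inserting a space, since Thai is written without
    spaces between words; any whitespace already present is collapsed.
    """
    tags = ANNOTATION.findall(line)
    text = ANNOTATION.sub("", line)
    text = " ".join(text.split())
    return text, tags


if __name__ == "__main__":
    for sample in ["เปลี่ยนเพลง[[lipsmack]]", "พรุ่งนี้ฉงชิ่งฝนจะตกไหม"]:
        print(split_transcript(sample))
```

Keeping the extracted tags alongside the clean text makes it possible to count noise events per utterance or to filter out noisy recordings later.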