831 Hours - British English Speech Data by Mobile Phone
831 hours of British English speech data recorded over mobile phones by 1,651 native British speakers. The recording content covers categories such as generic, interactive, in-car, and smart home. The texts are manually proofread to ensure high accuracy. The database matches the Android and iOS systems.
English, UK, Mobile phone, Reading
6,968 People 48,310 Images Cross-age Faces Collection Data
The data includes indoor and outdoor scenes. The subjects are Chinese and include both females and males. For most people the age span is at least 10 years; only a few people (130) have an age span of less than 10 years. For each person, at least 6 frontal images were collected. The data can be used for tasks such as cross-age face recognition.
Several images for one person, Different scenes, Different ages
1,003 People - Emotional Video Data
The data covers multiple races, indoor scenes, age groups, languages, and emotions (11 types of facial emotions and 15 types of inner emotions). For each sentence in each video, the emotion types (both facial and inner), start and end times, and text transcription are annotated. This dataset can be used for tasks such as emotion recognition and sentiment analysis.
Multiple races, Multiple indoor scenes, Multiple age groups, Multiple languages, Multiple emotions
370 Hours - Malay Speech Data by Mobile Phone
675 native Malaysian speakers with authentic accents participated in the recording. The recording script was designed by linguists and covers a wide range of topics, including generic, interactive, on-board, and home. The text is manually proofread to high accuracy. The data matches mainstream Android and Apple phones. The dataset can be applied to automatic speech recognition and machine translation scenarios.
Malay, Malaysia, Cellphone, Reading
502 Hours - Chinese Speaking English Speech Data by Mobile Phone
1,279 Chinese speakers from major dialect regions participated in the recording, which reflects the characteristic accent of Chinese speakers of English. The recording script covers many categories, such as spoken English, speeches, and human-computer interaction; it is rich in content, broad in domain, and phonemically balanced. The data can be used to improve the performance of automatic speech recognition systems on English spoken by Chinese speakers.
English, China, Cellphone, Reading
19.46 Hours - American English Speech Synthesis Corpus-Female
Female American English audio data: 19,841 sentences totaling 19.46 hours. It was recorded by native American English speakers with an authentic accent and a sweet voice. The phoneme coverage is balanced. Professional phoneticians participated in the annotation. The corpus closely matches the research and development needs of speech synthesis.
American English, TTS
Female audio data of adults imitating children: 6,599 sentences totaling 6.78 hours. It was recorded by native Chinese speakers with an authentic accent and a sweet voice. The phoneme coverage is balanced. Professional phoneticians participated in the annotation. The corpus closely matches the research and development needs of speech synthesis.
TTS