
1,000 Hours - European Portuguese Speech Data by Mobile Phone
- 2,000 people
- Speakers local to Portugal
- 16kHz, 16bit, wav
Datatang is certified to ISO27001 (Information Security Management System) and ISO9001 (Quality Management System).


Data Introduction
The dataset contains speech from 2,000 native Portuguese speakers with authentic accents. The recording text was designed by professional language experts and covers multiple categories, including general-purpose, interactive, in-vehicle, and household command content. Recordings were made in a quiet environment without echo, using mainstream Android phones and iPhones. The transcriptions were produced manually with a high accuracy rate.
Data Specification
- Format: 16kHz, 16bit, uncompressed wav, mono channel
- Recording environment: quiet indoor environment, low background noise, without echo
- Recording text: oral category; news category; human-machine interaction category; smart home command and control category; in-car command and control category; numbers
- Demographics: 2,000 speakers in total. Male and female speakers each account for 50% ±5%. By age, 60% of speakers are 18-25, 35% are 26-45, and 5% are 46-60, with each share allowed to float by 5%
- Device: mobile phones (Android and iPhone)
- Language: Portuguese
- Application scenario: speech recognition, voiceprint recognition
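The format line above (16kHz, 16bit, uncompressed, mono wav) can be checked against a file's header. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav_format` and the example file name are hypothetical and are not part of any Datatang tooling.

```python
import wave

# Values taken from the stated specification: 16kHz, 16bit, mono.
EXPECTED = {"framerate": 16000, "sampwidth": 2, "nchannels": 1}


def check_wav_format(path):
    """Return a list of mismatches between a WAV header and the stated spec.

    An empty list means the header matches 16 kHz, 16-bit (2 bytes), mono,
    uncompressed PCM.
    """
    problems = []
    with wave.open(path, "rb") as wf:
        actual = {
            "framerate": wf.getframerate(),
            "sampwidth": wf.getsampwidth(),  # bytes per sample, so 2 == 16 bit
            "nchannels": wf.getnchannels(),
        }
        comptype = wf.getcomptype()

    # The wave module reports "NONE" for uncompressed PCM data.
    if comptype != "NONE":
        problems.append(f"comptype: expected NONE, got {comptype}")

    for key, want in EXPECTED.items():
        if actual[key] != want:
            problems.append(f"{key}: expected {want}, got {actual[key]}")
    return problems
```

For example, `check_wav_format("sample_0001.wav")` (a hypothetical file name) would return an empty list for a conforming recording and one message per mismatched field otherwise. This inspects the header only; it does not assess noise level or echo.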
Sample
- O extintor de incêndio do carro está quebrado o que devo fazer?
- Não é um caso para eu concordar, Steve.
- Proprietários em pé de guerra por causa da requisição de casas para arrendamento temporário
- Ficaria muito agradecido se você mudasse a música What I Want When How!
- Diminir a potência do aspirador.