Our off-the-shelf datasets cover 800TB of image and video data, 200,000 hours of speech data, and 2 billion pieces of text data, all ready to go.
200,000 hours of speech recognition data, recorded with a variety of professional equipment, covering diverse scenes and multiple languages.
We offer an extensive volume of datasets covering different fields such as computer vision, speech recognition, and NLP. All the datasets have clear copyright.
Our “Human-in-the-loop” intelligent data labeling platform enables semi-automatic labeling and improves efficiency by up to 3-4 times.
As the world’s leading AI data service provider, we have provided work opportunities for over 80,000 people from more than 50 countries and regions.
We have obtained ISO 27701 and ISO 27001 information security management system certifications. At Datatang, we are committed to ensuring the security of your data.
Security and Compliance
Datatang has supported us in various CV and speech recognition research projects for years. We truly appreciate the prompt turnaround, great parallel project management skills, and high-quality data that Datatang has provided over the years.
We’re making considerable progress with our algorithmic development thanks to Datatang’s ready-to-go datasets, which really help us catch up on the project. I would recommend Datatang’s datasets and service to anyone who needs reliable training data.
Training data is a very important component of ML development, but data labeling is quite labor-intensive. With Datatang’s well-designed platform, annotation service, and extraordinary project management, we are able to put more focus on improving algorithms and doing what we are good at.