Incorporating diversity into data collection and the training of artificial intelligence models is crucial for unbiased, fair, and accurate results. This was the case for a global technology company that needed to build a dataset from a virtual assistant’s phone query logs to train its machine learning model.
How We Do It
- Crowd Recruitment: Ran intensive marketing campaigns across multiple acquisition channels, incentivized referrals, and designed targeted, personalized messaging to recruit participants from 17 distinct demographic groups.
- Crowdsourced Data Collection: Combined automated and manual tools to ensure that only new, unique queries were collected.
- Data Quality: Ran manual and automated checks on all submitted data.
- Data Management & Security: Handled all captured data according to the appropriate privacy and security guidelines.
- 17 different demographic groups covered
- 78.5% Participant Approval Rate
- 9,700 TaskVerse sign-ups through organic and marketing efforts
- Developed a custom-built PII detection tool
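As an illustration of the automated side of collecting only new, unique queries, here is a minimal Python sketch. It is not the tooling used in this project: the function names `normalize_query` and `filter_new_unique`, and the normalization rules (case folding, punctuation and whitespace removal), are assumptions chosen for the example.

```python
import re
import unicodedata


def normalize_query(text: str) -> str:
    """Reduce a query to a comparison key (hypothetical rules, for illustration)."""
    # Unify Unicode forms and ignore letter case.
    text = unicodedata.normalize("NFKC", text).casefold()
    # Replace punctuation with spaces so "call mom." matches "call mom".
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace.
    return " ".join(text.split())


def filter_new_unique(submissions, seen=None):
    """Return submissions whose normalized form has not been seen before.

    `seen` is a set of normalized keys from earlier collection rounds;
    it is updated in place so later batches are checked against it too.
    """
    seen = set() if seen is None else seen
    accepted = []
    for query in submissions:
        key = normalize_query(query)
        if key and key not in seen:  # skip empty and already-seen queries
            seen.add(key)
            accepted.append(query)
    return accepted
```

An exact-match filter like this only catches trivial variants; paraphrased duplicates would still need the manual review described above.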
To learn how to improve consistency and reduce bias in your machine learning models, download our case study Virtual Assistant Data Collection for a Global Technology Company.