Research and catalog machine learning training, validation, and testing data sets, including video, images, and audio
Guide the creation and unification of data acquisition tools (crawl, synthesize, capture)
Guide (uniform) approaches to metadata quality processes and tools (crowdsourcing, manual labeling, etc.)
Work with IT and back-end engineering on compatibility of data tools and methods with back-end infrastructure
Guide the *** ion of data sets (training, validation, testing) in compliance with product requirements
Fill specific data needs, working with our image and audio acquisition teams
Source and negotiate the licensing and/or acquisition of data sets
Guide and evangelize PII policy (GDPR, CCPA, EU-US Privacy Shield…)
What skills are needed?
BS or MSc degree in a relevant field with at least 2 years of relevant industry experience
Experience sourcing or creating data sets for machine learning training or testing
Understands common intellectual property business fundamentals and common terms and conditions, such as fields of use, licensing rights, indemnification, etc.
Detail-oriented, with strong analytical and troubleshooting skills
Self-motivated and focused
Comfortable collaborating with geographically dispersed teams
Excellent written and spoken communication skills
Conversant in machine learning fundamentals
A strong drive to solve problems and disrupt the status quo
Understanding that data security and policy are a critical part of data curation