
Project closed!

Machine Learning Data Architect

·      Architect and integrate data versioning platforms like DVC — similar to Git for code
·      Architect and develop international MAM and DAM data platforms and integrations for digital and other ML assets
·      Architect, integrate and implement data transfer software like Aspera or Signiant
·      Architect, integrate and implement workflow management with Airflow, Luigi or Jenkins
·      Develop API data connections with NAS, RAID, cloud and HDFS data stores
·      Develop application REST API and SOAP web service integrations
·      Script SQL and NoSQL queries for acquiring, retrieving and augmenting data assets and metadata from NAS, DynamoDB, MongoDB, Snowflake, S3 and Blob storage
·      Python scripting for workflows, jobs, ETLs and machine learning execution
·      Architect and integrate enterprise NAS storage
·      Familiarity with PyTorch and/or TensorFlow frameworks
·      Creation, population and maintenance of XML and JSON data schemas
·      Develop and automate third-party application REST APIs
·      Scripting of runtime languages like Node.js
·      Architect and implement Kubernetes and Docker integrations
·      Architect and support core GPU ML infrastructure and data integrations
·      Architect and support data backups, replication and failover
·      Architect and implement message queuing and process flows via Kafka, RabbitMQ
·      Architect and implement Linux servers, load balancers, and server farms
·      Architect and implement solutions to monitor, performance tune, alert and troubleshoot GPU, ML data augmentation, ML processes and data flow jobs
·      Architect and implement AWS cloud infrastructure like Lambda functions, Kinesis, EMR, S3 and S3 Glacier, and/or Azure Functions and Azure Blob Storage
·      Expertise in normalized and denormalized data schemas and when to use each
·      Expertise in structured, semi-structured, and unstructured data storage and pipelines
·      Architect and implement Apache solutions (e.g., Hadoop, Hive, Spark, Flume, Airflow, Flink)
·      Expertise in PII data tagging, workflows and categorization (GDPR, CCPA)
·      Expertise in general data security, encryption, retention and access workflow
·      Implementation of distributed computing platforms
·      Experience using Atlassian tools like Jira and Confluence, plus Bitbucket and Git for version control.
·      Familiar with development methodologies such as Agile/Kanban

What skills are needed?
  • BS or MSc degree in a relevant field with at least 3 years relevant industry experience
  • Experience as a data architect designing and implementing enterprise data platforms
  • Experience as a hands-on developer with Python, SQL, Linux scripting
  • Detail orientation with strong analytical and troubleshooting skills
  • Self-motivated and focused
  • Comfortable collaborating with geographically dispersed teams
  • Excellent written and spoken communication skills
  • Conversant in machine learning fundamentals
  • Take pride in finding ways to engineer things better, faster, and more reliably, with a focus on automation


6 months