Job#: 2028235
Job Description:
Apex Systems is working with our client to find multiple Machine Learning Engineers. In this role, you will help maintain existing models in production. Although you will work with the Data Science team, this is an Engineering role.
This is a fully remote position, but you will need to be able to work Pacific Time (PST) hours.
Responsibilities:
- Build and maintain scalable infrastructure for machine learning model & pipeline deployment, including containerization & orchestration.
- Develop and maintain scalable & secure REST APIs for serving multiple machine learning models to various users.
- Collaborate with data scientists and software engineers to ensure seamless integration of ML models into our systems.
- Design and optimize data pipelines, data storage, and data processing systems to support the training and inference processes of machine learning models.
- Build and maintain data and model dashboards to monitor model performance and health in production environments.
- Collaborate with cross-functional teams to identify and address data quality, data governance, and security considerations in the context of ML operations.
- Monitor model performance and health in production environments, establishing and maintaining appropriate monitoring and alerting mechanisms.
Must-Have/Required:
- Bachelor's degree in Computer Science, Data Science, or a related field. A Master's or Ph.D. degree is a plus.
- 5+ years of hands-on experience in ML operations, ML engineering, or related roles.
- Experience with AWS & Databricks cloud platforms, specifically AWS SageMaker, AWS JumpStart, & AWS Bedrock.
- Experience with REST API development and AWS networking protocols.
- Solid understanding of infrastructure components and technologies, including containerization (e.g., Docker) and CI/CD pipelines.
- Strong knowledge of software engineering principles and best practices, including version control, code review, and testing.
- Excellent problem-solving skills, with the ability to analyze complex issues and provide innovative solutions in a fast-paced environment.
- Strong communication and collaboration skills, with the ability to work effectively with cross-functional teams and stakeholders.
Preferred/Nice to Have:
- Familiarity with load balancing, EKS (Kubernetes), & the latest ML model serving techniques (e.g., NVIDIA Triton).
- Familiarity with the Hugging Face Diffusers Library.