Salary: $140,000 - $210,000
Who Are We
Voxel is building the future of Computer Vision and Machine Learning for operations, risk, and safety. We use computer vision and AI to enable existing security cameras to automatically detect hazards and high-risk activities, keep people safe, and drive operational efficiencies. Our technology addresses the key cost drivers for workers’ compensation, general liability, and property damage, which cost employers over $500 billion annually.
Job Overview
Voxel is seeking a highly skilled and experienced Senior ML Infrastructure professional to join our team. This role will focus on designing and implementing systems to support Voxel's ML development. The ideal candidate will have a strong background in software engineering, extensive experience working with distributed systems, and a deep understanding of infrastructure and ML systems.
What You'll Do
- Build and maintain cloud infrastructure and distributed systems for MLOps.
- Build out an internal training framework to make research with large distributed models easy and fun.
- Design and develop systems to support Voxel's ML development, with a focus on computer vision applications.
- Provide technical guidance, mentorship, and project management support.
- Build distributed systems and pipelines for data management, ensuring scalability, reliability, and performance.
- Demonstrate a deep understanding of DevOps practices and apply them to ML operations.
- Utilize technologies such as Kubernetes and various CI/CD systems to optimize infrastructure deployment and management.
- Handle the complexities of designing and implementing ML infrastructure systems in a dynamic and fast-paced startup environment.
- Contribute to the overall impact of Voxel's AI-powered workplace safety solutions by delivering robust and efficient ML infrastructure.
Qualifications
Must-Haves
- Bachelor's degree in Computer Science or a related field.
- 5+ years of experience in software engineering, with a focus on infrastructure design.
- Experience designing large, highly available distributed systems with Kubernetes.
- Proven experience in designing complex systems and strong software engineering skills.
- Strong understanding of, and experience with, distributed systems such as Apache Spark and Ray, as well as infrastructure design.
- Proficiency in containerization technologies like Docker and orchestration platforms like Kubernetes.
- Demonstrated expertise in DevOps practices, with a focus on ML.
- Experience with ML tools that researchers love.
Nice-to-Haves
- Knowledge of advanced ML operations techniques, such as model deployment and monitoring.
- Experience with pipeline automation and data management in ML workflows.
- Previous experience building ML systems and working on ML-adjacent teams.