What you'll do...
Position: Senior Software Engineer
Job Location: 10500 NE 8th Street, Bellevue, WA 98004
Duties:
- Create and maintain Python Software Development Kits (SDKs) for internal use.
- Ensure SDKs are well-documented for easy integration and usage by AI Engineers and cross-functional teams.
- Regularly update and improve SDKs to align with evolving project requirements and industry best practices.
- Develop Triton client code for use in image recognition.
- Integrate Triton client into ML pipelines for seamless execution of inference tasks.
- Collaborate with AI Engineers to optimize and fine-tune Triton client code for improved model performance.
- Design and implement NVIDIA DALI (Data Loading Library) Models for preprocessing and postprocessing of image datasets.
- Continuously optimize DALI models to improve performance and adapt to changing data requirements.
- Develop Docker images for executing NVIDIA Triton Inference Server.
- Conduct research on state-of-the-art vector databases, focusing on platforms such as Qdrant, to understand their architecture, capabilities, and potential applications.
- Design and implement a custom vector database solution tailored to the organization's requirements.
- Develop ROS nodes to facilitate the interaction between ML models and robotic systems.
- Implement custom ROS messages and services to enable the exchange of data and commands between the ML components and the robotic framework on NVIDIA AGX.
- Maintain existing Kubeflow pipelines and regularly improve and update pipeline Software Development Kits (SDKs) through internal tooling.
- Ensure containerized environments support easy deployment and execution of Triton models.
- Implement best practices for containerization, including dependency management and security considerations.
- Create and deploy small services on Google Cloud Platform (GCP) to support various ML tasks.
- Utilize GCP services such as Cloud Functions, Cloud Run, or App Engine for scalable and efficient microservices architecture.
- Document the usage and integration of developed tools, SDKs, Triton clients, and Docker images.
- Track system metrics such as GPU, CPU, memory, and network usage during training or other data processing tasks through Weights and Biases.
- Maintain code bases on GitHub and add necessary documentation to repos for usage.
- Ensure code builds are up and running for any production or dev GitHub repos.
- Debug issues with CI/CD pipelines across Jenkins builds.
- Use ML libraries including PyTorch, TensorFlow, and ONNX to interact with and query ML models.
- Implement tools to utilize data science frameworks including Pandas, NumPy, and SciPy for algorithms and calculations.
- Create Weights and Biases (WandB) artifacts of Qdrant database snapshots from the data stored in FiftyOne, and use the snapshots for image recognition with the Triton client.
- Containerize Kubeflow components in Docker containers to allow easy dependency resolution and create lightweight standalone packages.
- Carry out unit testing of components, integration testing of pipelines, and regression testing of automated systems to prevent failures in production.
Minimum education and experience required: Master's degree or the equivalent in Computer Science, Engineering (any) or related field and 2 years of experience in a large-scale enterprise software development environment; OR Bachelor's degree or the equivalent in Computer Science, Engineering (any) or related field and 5 years of experience in a large-scale enterprise software development environment.
Skills required: Experience designing and developing microservices on cloud computing platforms including GCP or AWS. Experience with object-oriented programming languages including Python or Java. Experience dashboarding using tools including Grafana or CloudWatch. Experience using cloud computing tools for containerization including Kubernetes. Experience working with relational and NoSQL databases including RDS or Postgres. Experience developing packages for internal use. Experience maintaining code bases on GitHub and adding necessary documentation to repos for usage. Experience debugging issues with CI/CD pipelines. Experience with unit and integration testing. Experience designing client-server architecture for simulations. Employer will accept any amount of experience with the required skills.
Salary Range: $190,486/year to $216,000/year. Additional compensation includes annual or quarterly performance incentives. Additional compensation for certain positions may also include: Regional Pay Zone (RPZ) (based on location) and Stock equity incentives.
Benefits: At Walmart, we offer competitive pay as well as performance-based incentive awards and other great benefits for a happier mind, body, and wallet. Health benefits include medical, vision and dental coverage. Financial benefits include 401(k), stock purchase and company-paid life insurance. Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty and voting. Other benefits include short-term and long-term disability, education assistance with 100% company-paid college degrees, company discounts, military service pay, adoption expense reimbursement, and more.
Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to a specific plan or program terms. For information about benefits and eligibility, see One.Walmart.com.
Wal-Mart is an Equal Opportunity Employer.