Site Reliability Engineer II - CTJ - Top Secret
Microsoft is seeking a Site Reliability Engineer II (SRE) to join our Silver Infrastructure and Sovereign Operations team. This pivotal role involves defining operations for new, existing and emerging environments. We are looking for a candidate who thrives on solving complex issues, has a clear vision, and possesses the ability to execute end-to-end programs effectively.
As a Site Reliability Engineer II, you will be instrumental in defining operating models for deploying and managing systems within sovereign and air-gapped environments. This role offers the unique opportunity to collaborate with engineers dedicated to enabling a wide range of Azure services for both internal and external customers in highly secured and regulated industries. The systems, processes, and frameworks you develop will be essential in meeting the stringent security policy and assurance requirements of our diverse customer base in the public and private sectors.
If you are passionate about operational excellence and have a track record of success in similar environments, we encourage you to apply and help shape the future of our operations.
Minimum/Required Qualifications:
- 4+ years technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration
- OR Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration
Other Requirements:
- The successful candidate must have an active U.S. Government Top Secret Security Clearance. The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. Failure to obtain or maintain the appropriate clearance and/or customer screening requirements may result in employment action up to and including termination.
- Clearance Verification: This position requires successful verification of the stated security clearance to meet federal government customer requirements. You will be asked to provide clearance verification information prior to an offer of employment.
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
- Criminal Justice Information Services: This position requires passing a background check conducted through the CJIS criminal justice information system by authorized local, state, and/or federal agencies and across multiple states. This role requires candidates to maintain CJIS screening eligibility.
- Citizenship & Citizenship Verification: This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports U.S. federal, state, and/or local government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law.
Preferred/Additional Qualifications:
- 3+ years of experience with PowerShell, C#, or C++.
- Experience working on large-scale distributed services with on-call responsibilities.
- Ability to build and influence broadly towards common goals and priorities.
- Ownership for end-to-end project lifecycle with solid project management and communication skills.
- Experience applying SRE principles in a large production environment.
Responsibilities:
- Defines and develops standardized, repeatable, scalable solutions to guarantee quality and efficient operations. Drives the design, optimization, efficiency, and reliability of service management.
- Communicates on a deeply technical level with software engineers, project management, and operations teams to optimize products, improve infrastructure, reduce manual toil, and evolve services.
- Drives efforts to collect, classify, and analyze data on a range of metrics. Drives the refinement of products through data analytics and makes informed decisions in engineering products through data integration.
- Drives efforts to integrate instrumentation for gathering telemetry data on system behavior such as performance, reliability, availability, and usage.
- Applies debugging tools and examines logs, telemetry, and other data sources to verify assumptions, writing and developing code proactively before issues occur and reactively as they occur.
- Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor system/product/service for degradation, downtime, or interruptions.
- Meets on-call responsibilities periodically to support 24x7 operations.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.