Description

Microsoft is betting on Artificial Intelligence (AI) as the next growth opportunity for the company. OpenAI, Mistral, and other Large Language Model (LLM) driven innovations are happening throughout the industry. Azure AI is focused on building a platform that makes it easy for both first-party Microsoft teams and third-party customers to build cutting-edge applications on top of these large language models.

The Back Plane team in Azure Machine Learning is looking for a Senior Software Engineer who loves to build scalable, highly available, and secure microservices that run in Kubernetes. The infrastructure team focuses on managing a large fleet of Azure Kubernetes Service (AKS) clusters that represents the control plane for Azure AI.

The team focuses on:

  1. Managing Kubernetes cluster live site and deployments at scale.

  2. Securing control plane assets against malicious attacks and unauthorized access using industry-standard tools and frameworks.

  3. Automating monitors and critical alerts using best-in-class observability tools such as Azure Monitor, Prometheus, Azure Data Explorer, and Grafana.

  4. Automating CI/CD deployments using YAML builds and releases.

  5. Creating, managing, and optimizing Kubernetes clusters, including hands-on work with GPU support in Kubernetes.

For the Azure ML platform, we build tools to increase the observability of the applications running in the Kubernetes clusters, improve the speed, security, and reliability of our deployments, secure our supply chain and services, and debug production with ease. We use the best of open source, such as Prometheus, Grafana, and NGINX, and build solutions that enable Azure ML to deliver a global service that handles large-scale ML training and inferencing workloads.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

As a Senior Software Engineer on our team, you will drive the design, development, and support of the platform that powers Azure Machine Learning. You’ll work across teams to help make the whole organization successful. Your responsibilities, and the expertise you will bring to them, include the following:

  • Extensive experience with Kubernetes cluster creation, management, and optimization.

  • Expert-level knowledge of Kubernetes security and compliance.

  • Extensive experience developing Infrastructure as Code (IaC) and Configuration as Code (CaC) solutions using tools like YAML, Bicep, and Helm.

  • Solid understanding of Kubernetes (k8s) networking for inter-pod, intra-pod, and external communications across the control and data planes.

  • Write clean and concise code with unit tests.

  • Design, implement, and support new features as well as extend existing systems.

  • Investigate live site issues and implement and deploy fixes.

  • Participate in an on-call rotation.

  • Security Configuration and Compliance: Configure, update, and maintain security tools used for endpoint security, log collection and reporting, and vulnerability and compliance scanning. Harden systems in line with best practices and benchmarks, and remediate vulnerabilities reported by the security team.

Other

Embody our culture and values

Qualifications

Required Qualifications:

  • Bachelor’s Degree in Computer Science or a related technical discipline AND 4+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.

  • 4+ years of experience with object-oriented design fundamentals.

  • 4+ years of experience with coding in one of C#, Python, Go, Rust, Java, C or C++.

Other Requirements

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

  • Experience with improving service operations or engineering fundamentals.

  • Strong collaboration skills: a team player who strives to make a difference.

  • Understanding of microservices architecture, K8s, NGINX, observability (logs, metrics, etc.), and network-layer protocols is a plus.

  • Certifications: one or more of Certified Kubernetes Application Developer (CKAD) or Certified Kubernetes Administrator (CKA).

Software Engineering IC4 – The typical base pay range for this role across the U.S. is USD $117,200 – $229,200 per year. A different range applies to specific work locations within the San Francisco Bay area and New York City metropolitan area; the base pay range for this role in those locations is USD $153,600 – $250,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

Microsoft will accept applications for the role until September 10, 2024.

#aiplatform, #azureml, #Infrastructure, #backplane

Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html).