Principal Site Reliability Engineer-December 2024
Vancouver
Dec 26, 2024
About Principal Site Reliability Engineer

  What is Viva Engage?

  Viva Engage is the industry-defining social network for the enterprise. We provide a platform for millions of employees, including those from 85% of Fortune 500 companies, to build community and culture, share knowledge, and connect with their leaders and each other.

  Why Viva Engage?

  Acquired by Microsoft in 2012, Viva Engage combines the benefits of a startup (rapid innovation, cutting-edge technology, outsized individual impact) with the advantages of working for one of the most successful software companies in the world. We believe in mission-driven work, and in this post-COVID world our platform has become more indispensable than ever because it fosters connection and a sense of belonging among remote teams. #VivaEngage

  You will have:

  Autonomy and freedom to innovate

  Choice of the best of open source and Microsoft-internal technology

  The ability to experiment, A/B test, and make data-driven decisions

  Opportunity for outsized impact as part of a small but mighty team on a rapidly growing product that is needed now more than ever

  As a Principal Site Reliability Engineer in Viva Engage, you will have two critical accountabilities:

  The first is driving efforts to fully embrace site reliability engineering principles while building critical infrastructure, optimizing existing systems, and eliminating toil. You will lead efforts that combine software and systems engineering to build, scale, and operate the large-scale conversation platform that powers Viva Engage experiences.

  The second is improving overall reliability for Viva Engage. This means guiding and influencing peers to develop missing capabilities, and driving changes to our culture and processes so that reliability becomes a critical aspect of how we work. We have grown rapidly to become a critical workload for many of the world’s largest organizations, and we are looking for you to help us get to the next level.

  Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

  Responsibilities

  Develop and execute on the observability and telemetry strategy

  Own the telemetry and monitoring infrastructure

  Continually seek deeper insights into the performance, reliability & scalability of our systems

  Improve service reliability for the entire Yammer team by reducing mean time to recovery (MTTR)

  Help all of Yammer prevent service incidents altogether

  Qualifications

  Required/Minimum Qualifications

  8+ years technical experience in software engineering, network engineering, or systems administration

  OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 5+ years technical experience in software engineering, network engineering, or systems administration

  OR Master's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration

  OR Doctorate Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration.

  6+ years of experience building large scale distributed systems.

  6+ years of experience in a Site Reliability Engineering role building and operating systems with world-class reliability at huge scale

  Preferred Qualifications/Attributes

  Knowledge of log and metrics pipelines (ELK stack or cloud services)

  Troubleshooting skills and the ability to trace a request through the entire stack

  Microservices development, deployment, and monitoring

  Curiosity about reliability and performance at all levels of the stack

  Experience with large datasets and data migrations

  Cloud automation on Azure, AWS, or GCP

  Site Reliability Engineering IC5 - The typical base pay range for this role across Canada is CAD $132,800 - CAD $247,200 per year.

  Find additional pay information here:

  https://careers.microsoft.com/v2/global/en/canada-pay-information.html

  Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html).
