If you are looking for a high-impact Site Reliability role with a global leader in digital transformation, EPAM is the perfect next step in your career! As an EPAMer, you’ll have the opportunity to work with a supportive team on a variety of interesting projects for some of the biggest brands in the world. Ready to get started? Apply now!
Req.#578148855
RESPONSIBILITIES
Function as program management point person for L2 support, ensuring that the team's work runs smoothly and as expected; create a program to drive value through SRE activities that reduce the need for L2 support over time
Lead development teams through architectural reviews and recommendations
Define what it means for a service to be available and develop, monitor, and alert on SLIs/SLOs
Define, track, and enforce error budgets
Review code instrumentation with development teams and ensure necessary dashboards are created to monitor SLIs/SLOs/SLAs
Establish, test, and tune alerting for varying tiers of applications
Document and maintain runbooks and procedures; automate as much as possible
Plan and execute periodic Disaster Recovery exercises including both tabletop and simulated failures (fault injection)
Perform periodic load and scalability testing to establish baselines, detect drift, and inform capacity planning
Design and implement peak readiness reviews for anticipated high-volume times
Lead weekly operational state reviews covering performance trends, anomalies, errors, and other availability events with SREs, product owners, and development teams
Participate in quarterly business and operational reviews aligning on roadmaps, development velocity, efficiency, growth trends, etc.
Socialize SRE culture across teams within the organization to publicize the value of SRE; mentor and train other engineers in proactive reliability decision-making and planning
REQUIREMENTS
5+ years of Site Reliability Engineering (SRE) experience
3+ years as team lead or SRE champion
Experience functioning as program management point person for L2 support
Experience creating a program that drives value through SRE activities to reduce the need for L2 support over time
Bachelor's degree in Computer Science, similar technical field of study, or equivalent practical experience
Proven experience in troubleshooting, mitigating, and resolving issues in a distributed system
Strong communication and collaboration skills for varying groups of stakeholders
Self-motivated, with the ability to prioritize effectively among competing demands
Experience with implementing SRE practices for services and applications deployed in production in the cloud
Must understand most SRE concepts, including SLI/SLO/SLA, Error Budget, MTTD/MTTR/MTBF, Toil, Capacity Planning, Observability, Monitoring/Alerting, Release Engineering, and Incident Management/Blameless Post-Mortems
BENEFITS
Medical, Dental and Vision Insurance (Subsidized)
Health Savings Account
Flexible Spending Accounts (Healthcare, Dependent Care, Commuter)
Short-Term and Long-Term Disability (Company Provided)
Life and AD&D Insurance (Company Provided)
Employee Assistance Program
Unlimited access to LinkedIn learning solutions
Matched 401(k) Retirement Savings Plan
Paid Time Off – the employee will be eligible to accrue 15-25 paid days, depending on specific level and tenure with EPAM (accrual eligibility may change over time)
Paid Holidays – nine (9) total per year
Legal Plan and Identity Theft Protection
Accident Insurance
Employee Discounts
Pet Insurance
Employee Stock Purchase Program
If otherwise eligible, participation in the discretionary annual bonus program
If otherwise eligible and hired into a qualifying level, participation in the discretionary Long-Term Incentive (LTI) Program
ABOUT EPAM
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
ADDITIONAL
This posting includes a good faith range of the salary EPAM would reasonably expect to pay the selected candidate. The range provided reflects base salary only. Individual compensation offers within the range are based on a variety of factors, including, but not limited to: geographic location, experience, credentials, education, training; the demand for the role; and overall business and labor market considerations. Most candidates are hired at a salary within the range disclosed. Salary range: $85k–$155k. In addition, the details highlighted in this job posting above are a general description of all other expected benefits and compensation for the position.
Applications will be accepted on a rolling basis
EPAM Systems, Inc. is an equal opportunity employer. We recognize the value of diversity and inclusion in creating success for our customers, business partners, shareholders, employees and communities. We are committed to recruiting, hiring, developing and promoting employees without discrimination. As a global employer, this commitment includes complying with all laws in the countries in which we operate. Nevertheless, we believe equal employment practices should not be limited to what the law requires. Equal opportunity and inclusion are essential to motivate, empower and recognize the best in everyone.
At EPAM, employment actions are based on individual qualifications, without regard to race, color, religion, creed, gender, pregnancy status, sexual orientation, gender identity, gender expression, marital or familial status, national origin, ancestry, genetics, age, disability status, veteran status, citizenship status when otherwise legally able to work, or any other characteristic protected by law.