Job Description
Sr Engineer: DevOps/Site Reliability Engineering (SRE)
We are looking for a talented, motivated, and experienced Cloud DevOps and Site Reliability Engineer (SRE). As part of the IoT (Internet of Things) team, you will work on the next generation of IoT products. The role includes researching and analyzing software solutions for these products, developing prototypes and proofs of concept to evaluate both incremental and radically new architectures, and developing software components that are scalable, available, and secure.
Cloud engineers within Qualcomm are dedicated to building, evolving, and operating rapidly changing, secure and resilient systems at scale. Our engineers participate in the entire service lifecycle from design through the development and testing process to production support and operations. Our practice is to bring the best-of-show development techniques to operational tasks.
Responsibilities
Serve as an advocate for quality practices including the development of automated testing to improve business processes.
Act as a critical part of a multi-team effort to deliver, manage and maintain configuration automation to meet business needs.
Create and maintain configuration standards for software and infrastructure.
Manage CI & CD tools and pipelines as a partner to development and QA teams.
Develop and socialize operational standards for teams throughout engineering.
Recommend, develop and implement system enhancements that improve performance and reliability, including installation, upgrading/patching, monitoring, problem resolution, configuration management and security.
Oversee critical incidents and major system escalations from initiation to resolution.
Create mechanisms/architectures that enable fault tolerance and rapid recovery from failure.
Participate in a rotating on-call escalation service.
Perform capacity planning and chaos engineering.
Communicate clearly, both verbally and in writing.
Qualifications
Bachelor’s degree in a technical field, or equivalent experience
4-7 years’ experience in an operational environment preferred
Technical Requirements
Experience with Linux operating systems in production and development environments
Experience in network and server engineering
Experience with automation/configuration management such as Ansible, Chef, Puppet or equivalent
Experience with workflow data pipeline management services such as Airflow and/or Luigi
Expertise in the latest cloud compute, load balancing and scaling, storage, networking, security, and virtualization technologies with cloud providers such as Azure (preferred), AWS and/or Google
Demonstrated experience installing, operating and troubleshooting a variety of open-source technologies
Experience with relational and non-relational databases
Practical experience developing software or meeting operational needs with code and scripting (Bash, Python, Perl, Ruby and/or Java)
Experience with software quality principles and associated tools for testing and analysis.
Knowledge of CI & CD practices and supporting tools (Jenkins, Bamboo, or similar)
Experience with IaC Technologies such as Terraform, CloudFormation or Pulumi
Experience with PaaS technologies such as containers, container orchestration and scheduling, service registration / discovery and monitoring (Docker, Kubernetes, etc.)
Load, scalability, systems, or performance testing experience
Observability and monitoring expertise, including the ability to dissect data to find the root cause of system and infrastructure issues
EEO Employer: Qualcomm is an equal opportunity employer; all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or any other protected classification