Summary:
Meta is seeking a Production Systems Engineer to join our Release to Production (RTP) team in Dublin. Our servers and data centers are the foundation upon which our rapidly scaling infrastructure operates efficiently to deliver our innovative services. The RTP team is responsible for the end-to-end Hardware Lifecycle of all Meta servers, from exploration and development to production health. RTP Engineers work closely with Production Engineering teams, Enterprise Networking, Hardware Designers, Networking Teams, Manufacturers, Vendors, Datacenter Operation teams and New Product Introduction teams to ensure the smooth operation of systems across the planet.
We encounter problems from the very smallest of scales (errors occurring at the microscopic scale, within single registers of a CPU) up to the very largest: deploying solutions to our entire millions-strong fleet. We look for people with curiosity and drive, who want to tackle the hardest problems in the domain.
Typically we will hire engineers from backgrounds such as Site Reliability Engineer (SRE), Software Engineer, Systems Engineer, Systems Development Engineer, DevOps Engineer, Systems Administrator, or similar. You will have a demonstrated ability to drive projects to successful business outcomes. Your previous experience will always be less important than demonstrated problem-solving abilities and attitude.
Required Skills:
Production Systems Engineer Responsibilities:
Build and develop tooling solutions to automate business-critical processes in service of managing the health of the Meta production fleet
Troubleshoot, diagnose and root-cause system failures, working with key partners to identify and deliver solutions
Proactively identify opportunities to fix or enhance tooling, hardware and processes
Build subject matter expertise in one or more of the specialist areas covered by the RTP team in Dublin: Firmware Deployment, Edge/CDN hardware, or Silicon Sustaining
Minimum Qualifications:
Bachelor's degree in Computer Science, related technical discipline, or equivalent work experience
4+ years of experience coding in a higher-level language (Python, PHP, Java, Go, Rust, C++)
Experience building, maintaining and debugging production services or platforms, usually (but not necessarily) in a Linux/Unix environment
Knowledge of server architecture and components across Compute/Storage/AI Systems/Networking
Scientific approach to troubleshooting, root-cause analysis and investigation
Good communication skills, able to collaborate easily with others
Industry: Internet