Systems Reliability Engineer
1 month ago
About Us
At Cloudflare, we have our eyes set on an ambitious goal: to help build a better Internet. Today the company runs one of the world’s largest networks that powers approximately 25 million Internet properties, for customers ranging from individual bloggers to SMBs to Fortune 500 companies. Cloudflare protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare all have web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was named to Entrepreneur Magazine’s Top Company Cultures list and ranked among the World’s Most Innovative Companies by Fast Company.
We realize people do not fit into neat boxes. We are looking for curious and empathetic individuals who are committed to developing themselves and learning new skills, and we are ready to help you do that. We cannot complete our mission without building a diverse and inclusive team. We hire the best people based on an evaluation of their potential and support them throughout their time at Cloudflare. Come join us!
Production Engineering is responsible for the world’s most reliable, observable, performant, and safe network ecosystem. Our customers rely on our products and systems to safely modify, troubleshoot, and release products without external impact.
Our external customers rely on us to provide seamless and predictable incident, traffic, and policy management, resulting in the fastest and safest network services in the world.
We are accountable for the overall performance of internal and external facing services, guiding our product teams to optimal configurations and maximum efficiency. From the moment that a packet enters the Cloudflare ecosystem, we know exactly what its expected purpose and behavior are, and we are capable of determining and exposing anomalous behavior.
The Cloudflare network makes it possible to solve challenges at massive scale and efficiency which would be impossible for almost any other organization.
In this role, you can expect to:
- Develop Software: Design, write, and deliver software that improves Cloudflare's Edge platform
- Work on large scale systems: Scale and evolve systems through software and automation to improve reliability and velocity
- Maintain and manage distributed systems: Manage and be part of the on-call rotation that supports the largest distributed edge system in the world.
- Document, Propose and Implement: Collaborate with other engineers to design and implement scalable solutions that support our growing user base.
- Guide and mentor: Participate in the constant cycle of knowledge sharing and mentoring.
- Optimize and Automate: Research and introduce cutting-edge technologies. Develop and maintain sustainable tools that work on an extremely large scale.
- Open Source: Contribute to open-source
We are growing quickly and focused on building an extraordinary company. This is a systems reliability engineering role and a superb opportunity to join a high-performing team, support Cloudflare’s mission, and help build a better Internet.
You will build services and APIs to constantly improve availability, performance and uptime.
You may be a good fit for our team if you have:
- Proficiency in distributed Linux/Unix environments
- Proficiency in high-level programming (e.g., Golang, Python)
- Proficiency in configuration management (e.g., Saltstack, Chef, Puppet, Ansible)
- Proficiency in networking protocols at Layers 3-7 of the OSI model
- Experience in performance analysis, debugging, and troubleshooting
- Experience in SQL databases (e.g., Postgres, MySQL)
- Experience as part of a rotation that tends to high-priority reliability objectives
- Experience in load balancing and reverse proxies (e.g., Nginx)
- Familiarity with Key/Value stores (e.g., Redis)
- Familiarity with Internetworking and BGP
- Exquisite written and verbal communication skills
- Strong bias for action
- Experience with continuous integration and delivery (CI/CD)
- Experience working in a 24/7/365 service environment
- Experience with high-bandwidth transit Internetworking and routing
- Passion for tooling and automation