Site Reliability Engineer

2 weeks ago

Singapore Thought Machine Full time

**General information**:

- Job Title: Site Reliability Engineer
- City: Singapore
- Country: Singapore
- Division: Engineering

**Description**:

Thought Machine’s mission is bold - to properly and permanently rid the world’s banks of legacy technology. To achieve this, we have developed the foundations of modern banking through core and payments technology which run natively in the cloud. What we are attempting is hard and means we need great people working together to build great technology.

We have grown rapidly in the past few years - growing our team to more than 550 individuals across offices in London, New York, Singapore and Sydney. We have raised more than $500m in funding and are now valued at $2.7bn. Our investors include Temasek, Standard Chartered Ventures, Molten Ventures, Eurazeo, Intesa Sanpaolo, Nyca Partners, JPMorgan Chase Strategic Investments, and more.

We have created a culture enabling our team to produce the best work in the industry, ensuring we have fun along the way. We're regularly cited as having a fantastic workplace culture and have been recognised by Sifted magazine as having one of the highest Glassdoor ratings for a UK fintech company and the most generous employee share package in the industry. We've been named in the IDC list of top 100 fintechs, and the Singapore HR Awards awarded us Gold and Silver for our workplace culture and employee experience.

We are spinning up a new regional SaaS platform team responsible for providing a world-class SaaS offering, by continuously improving and maintaining our SaaS platform. The team will be geographically distributed across our two main hubs: UK, SG.

Joining this team is an excellent opportunity to get exposure to how mission-critical systems are run in production. You will be part of a team that owns the system end-to-end and have a deeper understanding of exactly how our clients use the system (for example by extracting usage analytics).

The team will own the platform end-to-end, making use of existing infrastructure, improving core Terraform modules, as well as developing operators, tooling and additional infrastructure where appropriate. They will also be responsible for L2 support (for client-initiated support requests) and L1 (for alerting-based incidents). Support will be provided during working hours, with a follow-the-sun model and handovers happening between the 3 regions.

Definition and development of the SaaS roadmap is another critical responsibility of this team. Alongside the Product Management function, they will define technical requirements and features, and implement them with the goal of offering an excellent SaaS experience to our clients.

**Duties**
- Provision SaaS environments as new clients are onboarded.
- Be part of the on-call rota (during business hours), responsible for resolving alerts generated by proactive monitoring and working closely with CANs to provide L2 support for client-initiated support requests.
- Define and implement the feature roadmap to improve the SaaS platform, for example by implementing self-service functionality, exposing metrics to clients, improving automation and self-healing properties of the system.
- Improve the scalability, security and performance of the SaaS platform by implementing automated compliance and controls, testing different Kafka and DB setups (e.g. Aurora vs RDS) and running load tests at every level of the stack.
- Implement and regularly test DR strategies to ensure the highest level of resilience and fault tolerance of the platform.

**Requirements**:
**Essential**
- Strong background in Linux/Unix administration, e.g. Ubuntu, Debian
- A strong background in at least one of Go, Python or Java
- A strong background in one of the following: database administration, Kafka, observability tools (such as Prometheus or Zipkin) or infrastructure automation.
- Experience with AWS or GCP
- Experience with, or knowledge of, container orchestration tools, e.g. Kubernetes

**Desirable**
- Experience in supporting production systems
- Experience with automation/configuration management, e.g. Terraform, Puppet, Chef, Ansible

**Benefits**:

- Highly competitive salary
- Bonus incentive
- Healthcare
- 25 days holiday and public holidays
- Competitive maternity and paternity leave
- $1,500 SGD per year flexible spend benefit
- All the latest tech you need
- A talented and experienced team as your colleagues
- An environment where we encourage learning and progress

Site Reliability Engineer

4 days ago

Singapore Beijing Foreign Enterprise Management Consultants Co.,Ltd. Full time

On behalf of Huawei, a world-renowned information and communication technology company, we are seeking passionate and talented individuals to join our team as Site Reliability Engineer. Job...
Site Reliability Engineer

24 minutes ago

Singapore ETEAM WORKFORCE PTE. LTD. Full time

Position: Site Reliability Engineer (SRE). Work Mode: Onsite/Hybrid. Timing: 9am to 6pm. Duration: 1 Year (Highly extendable). Salary: 6018 SGD. Work Location: Robinson Road, Singapore. About the Role: We are looking for a seasoned Site Reliability Engineer (SRE) with 5+ years of experience to join our Platform Engineering team. This role is ideal for someone...
Site Reliability Engineer

1 day ago

Singapore JJ Consulting Services Full time

Our client is a fast-growing company in Singapore that is seeking to recruit a Site Reliability Engineer. **Site Reliability Engineer** **Key Roles & Responsibilities** - Providing ancillary support of Enterprise-Grade Products and solutions at customers' sites - Ironing out deployment issues or challenges that our customers may face - Responsible for...
Site Reliability Engineer

7 days ago

Singapore NodeFlair Full time

**Job Summary**: **Salary**: S$11,500 - S$16,500 / Monthly. **Job Type** **Seniority**: Senior. **Years of Experience**: At least 7 years. **Tech Stacks**: Microsoft, Puppet, Java, Ansible, Python. **This is Adyen** Adyen provides payments, data, and financial products in a single solution for customers like Meta, Uber, H&M, and Microsoft - making us the...
Site Reliability Engineer

4 days ago

Singapore HCLTech Full time

We are seeking a highly experienced Site Reliability Engineer (SRE) with 10 years of expertise in building, managing, and optimizing reliable, scalable, and secure systems. This role requires strong proficiency in end-to-end SRE practices across multi-cloud, hybrid cloud, and on-premises data center environments. The ideal candidate will drive automation,...
Site Reliability Engineer

24 minutes ago

Singapore Salt Talent Search Pte Ltd Full time

SALT is hiring a Site Reliability Engineer for a global technology client in Singapore on a 12-month, renewable contract assignment. Responsibilities: Reliability Engineering: Define and implement SLIs, SLOs, and error budgets to measure and improve service reliability. Cloud Infrastructure: Design, deploy, and manage infrastructure on Google Cloud Platform...
Site Reliability Engineer

2 weeks ago

Singapore Abaxx Commodity Futures Exchange and Clearinghouse Full time

Site Reliability Engineer - Networking. We are seeking a competent candidate to join our Infrastructure Team in its mission of building and operating a MAS-regulated marketplace and clearing house. This role is ideal for someone with a strong foundation in AWS services, infrastructure as code, and cloud security, who is passionate about building scalable,...
Site Reliability Engineer

5 days ago

Singapore DT One Full time

Site Reliability Engineer role at DT One. Keeping more people, more connected, more often. DT One was founded with the aim of providing mobile carriers with the infrastructure and services they need to help migrant workers stay in touch with their family and friends back home. Today, we operate a leading global network for mobile top-up solutions, innovative...
Site Reliability Engineer

2 weeks ago

Singapore NetEase Games Full time

Overview: Join to apply for the Site Reliability Engineer role at NetEase Games. As a leading internet technology company based in China, NetEase provides premium online services centered around content creation and operates a broad gaming ecosystem. Job Description: Site Reliability Engineering (SRE) refers to using software engineering methods to manage...