Staff Site Reliability Engineer, Platform

3 weeks ago


Singapore, Singapore GEMINI Full time

Department: Platform

Our Platform organization’s purpose is to enable Gemini to scale effectively and empower our engineering teams to focus on building innovative financial products and experiences for individuals around the world. Platform focuses on building a scalable and secure foundational platform that enables Engineering to deploy, validate, and operate their services in production, improving service resiliency, increasing organizational efficiency by reducing operational toil, and increasing system efficiency through architectural evolution.

The Site Reliability Engineering team engages directly with our other engineering teams: onboarding them onto our platform systems, reviewing and recommending design and architectural decisions, and guiding them on how to implement the tooling provided by the larger Platform organization so that systems can scale and react to changing conditions, with continuous improvement loops.

The Role: Staff Site Reliability Engineer

You will be an integral part of leading Gemini’s engineering teams towards modern DevOps practices, both by developing and providing modern automation and operational tooling and by working cross-functionally across Gemini’s engineering teams to influence and shape our development practices and culture.

Responsibilities:

  • Provide primary operational support and engineering for various Gemini services
  • Improve reliability, quality, and time-to-market across all Gemini services and offerings
  • Guide engineering teams onto the various supported services provided by Platform
  • Run ongoing performance evaluations and improvements for Gemini systems
  • Provide architecture recommendations and engage as part of the SDLC
  • Create “Production-ready Scorecards” to evaluate the health of systems pre-launch
  • Implement and teach monitoring, alerting, and automated resolution best practices
  • Define SLIs and SLOs with Engineering teams
  • Educate and guide Engineering teams on reliability and resiliency best practices, such as statelessness, chaos testing, and blue/green deployments
  • Design, build, and maintain operational tooling and automation that streamline processes and enhance system reliability

Qualifications:

  • 7+ years using monitoring, alerting, and automation tooling to understand and remediate performance and health issues in systems at scale
  • Good knowledge of cloud technology providers such as AWS, GCP, or Azure
  • Expert in an infrastructure-as-code environment (Terraform), developing automated solutions to solve support and operational issues
  • Experience as a Technical Leader within a team, helping evaluate and make tech decisions for the team
  • Expert working with containerization such as Nomad, EKS (k8s), Docker, etc.
  • Expert working with Configuration Management such as Ansible, Chef, Puppet
  • Proficient writing scripts or CLI tools that help increase Developer Productivity in high-level languages like Python, Go, etc.
  • Expert analyzing system and application performance, identifying bottlenecks, and recommending architectural or systemic improvements
  • Experience working with Engineering teams, teaching, training, and mentoring on how to implement best-practice technical solutions

It Pays to Work Here

We take a holistic approach to compensation at Gemini, which includes:

  • Comprehensive health plans covered at 100% for employees and dependents
  • Long-term incentive in the form of a new hire equity grant
  • Paid Parental Leave
  • Up to 14 paid vacation days (in addition to public/bank holidays)
