Compute Grid Site Reliability Engineer - AVP
2 weeks ago
Purpose of the role
To apply software engineering techniques, automation, and best practices in incident response to ensure the reliability, availability, and scalability of systems, platforms, and technology.
Accountabilities
- Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.
- Resolution and analysis of, and response to, system outages and disruptions, and implementation of measures to prevent similar incidents from recurring.
- Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.
- Monitoring and optimisation of system performance and resource usage, identification and resolution of bottlenecks, and implementation of best practices for performance tuning.
- Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and close work with other teams to ensure smooth and efficient operations.
- Staying informed of industry technology trends and innovations, and actively contributing to the organization's technology communities to foster a culture of technical excellence and growth.
Assistant Vice President Expectations
- Advise and influence decision making, contribute to policy development and take responsibility for operational effectiveness. Collaborate closely with other functions/business divisions.
- Lead a team performing complex tasks, using well developed professional knowledge and skills to deliver on work that impacts the whole business function. Set objectives and coach employees in pursuit of those objectives, appraise performance relative to objectives, and determine reward outcomes.
- If the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an environment for colleagues to thrive and deliver to a consistently excellent standard. The four LEAD behaviours are: L – Listen and be authentic, E – Energise and inspire, A – Align across the enterprise, D – Develop others.
- OR for an individual contributor, they will lead collaborative assignments and guide team members through structured assignments, and identify the need for the inclusion of other areas of specialisation to complete assignments. They will identify new directions for assignments and/or projects, identifying a combination of cross functional methodologies or practices to meet required outcomes.
- Consult on complex issues, providing advice to People Leaders to support the resolution of escalated issues.
- Identify ways to mitigate risk and develop new policies/procedures in support of the control and governance agenda.
- Take ownership for managing risk and strengthening controls in relation to the work done.
- Perform work that is closely related to that of other areas, which requires understanding of how areas coordinate and contribute to the achievement of the objectives of the organisation sub-function.
- Collaborate with other areas of work, for business aligned support areas to keep up to speed with business activity and the business strategy.
- Engage in complex analysis of data from multiple internal and external sources of information, such as procedures and practices (in other areas, teams, companies, etc.), to solve problems creatively and effectively.
- Communicate complex information. 'Complex' information could include sensitive information or information that is difficult to communicate because of its content or its audience.
- Influence or convince stakeholders to achieve outcomes.
All colleagues will be expected to demonstrate the Barclays Values of Respect, Integrity, Service, Excellence and Stewardship – our moral compass, helping us do what we believe is right. They will also be expected to demonstrate the Barclays Mindset – to Empower, Challenge and Drive – the operating manual for how we behave.
Join us in the role of Compute Grid Site Reliability Engineer - AVP in Singapore. The Compute Grid team is responsible for building and maintaining the bank's distributed super-computer, which runs the bank's compute-intensive workloads. The system harnesses CPU capacity sourced from on-prem and public cloud. The team's mission statement is: "To provide a stable platform for the distributed execution of computation tasks at the lowest possible price". In this role, you will work to continuously improve the Compute Grid service, operating within the team's EngOps framework (a mix of SRE & DevOps), taking part in support, operations, engineering, and development work on rotation.
To be successful in the role, you must have
Essential Skills/Basic Qualifications
- Strong verbal and written communication skills.
- Strong technical aptitude and can-do attitude along with good problem-solving skills.
- Experience in Windows/Unix Systems Administration.
- PowerShell and Python scripting.
Desirable skills/Preferred Qualifications
- Experience with High Performance Computing software such as IBM Symphony and Tibco/DataSynapse GridServer.
- Experience with Microsoft Azure & AWS.
- Experience using Splunk.
- Experience in DevOps tooling (Git, Chef, Jenkins, Terrafor