Sr. Site Reliability Engineer

1 week ago


Singapore Visa Full time

Company Description

Visa is a world leader in digital payments, facilitating more than 215 billion payments transactions between consumers, merchants, financial institutions and government entities across more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable and secure payments network, enabling individuals, businesses and economies to thrive.

When you join Visa, you join a culture of purpose and belonging, where your growth is a priority, your identity is embraced, and the work you do matters. We believe that economies that include everyone everywhere uplift everyone everywhere. Your work will have a direct impact on billions of people around the world, helping unlock financial access to enable the future of money movement.

**Join Visa: A Network Working for Everyone.**

**Job Description**:
Product Reliability Engineering (PRE) is part of Visa's technology organization. The division is responsible for maintaining and supporting Visa's data assets and provides support for value-added products and services to drive innovation for our partners and clients, within Visa and globally. The Product Reliability Engineering Big Data Platform Team, which is part of PRE, supports open source Big Data and Kafka clusters at Visa.

As a Senior Big Data Engineer, you will be responsible for monitoring, troubleshooting, automating and continuously developing software tools to improve the availability and resiliency of open source Big Data platforms at Visa. In this hands-on role, you will administer open source big data platforms, ensuring their performance and reliability and increasing their operational efficiency.

Key Responsibilities:
Perform Big Data administration and engineering activities on multiple open source Hadoop, Kafka, HBase and Spark clusters.
Apply strong troubleshooting and debugging skills.
Work across teams: build and maintain relationships with customer teams, the user community, architects, and engineering teams, and work jointly on key deliverables to ensure production scalability and stability.
Perform effective root cause analysis of major production incidents and develop learning documentation.
Identify and implement high-availability (HA) solutions for services with a single point of failure (SPOF).
Plan and perform capacity expansions and upgrades in a timely manner, avoiding scaling issues and bugs.
Automate repetitive tasks to reduce manual effort and avoid human error.
Tune alerting and set up observability to proactively identify issues and performance problems.
Work closely with L-3 teams to review new use cases and cluster hardening techniques for building a robust and reliable platform.
Leverage DevOps tools, disciplines (incident, problem and change management) and standards in day-to-day operations.
Ensure the Hadoop platform can effectively meet performance and SLA requirements.
Perform security remediation, automation and self-healing as required.

This is a hybrid position. Hybrid employees can alternate between working remotely and working in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

**Qualifications**:
Basic Qualifications:
2+ years of relevant work experience and a Bachelor's degree, OR 5+ years of relevant work experience.
Hands-on experience working as a Hadoop system engineer managing Hadoop platforms.
Experience in building, managing and tuning the performance of Hadoop platforms.
Extensive knowledge of the Hadoop ecosystem, such as ZooKeeper, HDFS, YARN, Hive and Spark.
Excellent Shell and Python programming skills for automating repetitive DevOps tasks.
Understanding of security tools such as Kerberos and Ranger.
Experience with the Hortonworks or open source distribution is preferred.
Knowledge of Kafka, HBase and Kubernetes is a plus.
Understanding of Linux, networking, CPU, memory and storage.
Knowledge of Java and Python is good to have.
Excellent interpersonal, verbal, and written communication skills.
This position is not ideal for a Hadoop developer.

Additional Information

Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.


