Senior Site Reliability Engineer

3 weeks ago


Singapore Visa Full time

Company Description

Visa is a world leader in digital payments, facilitating more than 215 billion payments transactions between consumers, merchants, financial institutions and government entities across more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable and secure payments network, enabling individuals, businesses and economies to thrive.

When you join Visa, you join a culture of purpose and belonging - where your growth is priority, your identity is embraced, and the work you do matters. We believe that economies that include everyone everywhere, uplift everyone everywhere. Your work will have a direct impact on billions of people around the world - helping unlock financial access to enable the future of money movement.

Join Visa: A Network Working for Everyone.



Product Reliability Engineering (PRE) is part of Visa's technology organization. The division is responsible for maintaining and supporting Visa's data assets and provides support for value-added products and services to drive innovation for our partners and clients, within Visa and globally. The Product Reliability Engineering Big Data Platform Team, part of PRE, supports open source Big Data and Kafka clusters at Visa.

As a Senior Big Data Engineer, you will be responsible for monitoring, troubleshooting, automating, and continuously developing software tools to improve the availability and resiliency of open source Big Data platforms at Visa. In this hands-on role, you will administer open source big data platforms, ensuring their performance and reliability and increasing their operational efficiency.

Key Responsibilities:

Perform Big Data administration and engineering activities on multiple open-source Hadoop, Kafka, HBase, and Spark clusters.

Apply strong troubleshooting and debugging skills.

Collaborate across teams: build and maintain relationships with customer teams, the user community, architects, and engineering teams, and work jointly on key deliverables to ensure production scalability and stability.

Perform effective root cause analysis of major production incidents and develop learning documentation.

Identify and implement high-availability (HA) solutions for services with a single point of failure (SPOF).

Plan and perform capacity expansions and upgrades in a timely manner, avoiding scaling issues and bugs.

Automate repetitive tasks to reduce manual effort and avoid human error.

Tune alerting and set up observability to proactively identify issues and performance problems.

Work closely with L-3 teams to review new use cases and cluster hardening techniques for building robust and reliable platforms.

Leverage DevOps tools, disciplines (incident, problem, and change management), and standards in day-to-day operations.

Ensure the Hadoop platform can effectively meet performance and SLA requirements.

Perform security remediation, automation, and self-healing as required.

This is a hybrid position. Hybrid employees can alternate between working remotely and working in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Qualifications

Basic Qualifications
2+ years of relevant work experience and a Bachelor's degree, OR 5+ years of relevant work experience

Preferred Qualifications
3 or more years of work experience with a Bachelor's Degree or more than 2 years of work experience with an Advanced Degree (e.g. Masters, MBA, JD, MD)
Hands-on experience working as a Hadoop system engineer managing Hadoop platforms.
Experience in building, managing, and tuning the performance of Hadoop platforms.
Extensive knowledge of the Hadoop ecosystem, such as ZooKeeper, HDFS, YARN, Hive, and Spark.
Excellent Shell and Python programming skills for automating repetitive DevOps tasks.
Ability to perform administration and engineering activities on a data streaming platform such as Kafka or an equivalent technology.
Understanding of security tools like Kerberos and Ranger.
Experience with the Hortonworks distribution, open source, or Confluent Kafka is preferred.
Hands-on experience debugging Hadoop issues in both the platform and applications.
Knowledge of Kafka, HBase, and Kubernetes is a plus.
Understanding of Linux, networking, CPU, memory, and storage.
Knowledge of Java and Python is good to have.
Excellent interpersonal, verbal, and written communication skills.
This position is not ideal for a Hadoop developer.

Please Note: Due to the COVID-19 pandemic and the evolving visa/travel restrictions in place, we are currently only able to extend offers to candidates with the right to work in Singapore. We are keeping the situation under close review and will adjust accordingly should the restrictive measures be lifted.

Additional Information

Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.
