Reliability Engineering Specialist

Advanced Micro Devices, Singapore, Full time

**Reliability Engineering Specialist**:

- Singapore, Singapore
- Engineering
- 66974

**Job Description**:
**WHAT YOU DO AT AMD CHANGES EVERYTHING**
We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences - the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

AMD together we advance_

**THE ROLE**:

Join a dynamic global team dedicated to advanced reliability testing of the module and system boards in AMD's cutting-edge products. Collaborate closely with cross-functional teams across the AMD Global Operations & Quality and Data Center organizations on accelerator-product system setup and reliability testing.

**KEY RESPONSIBILITIES**:

- System-level setup and testing:
  - Plan, execute, and optimize system-level setups for accelerator products, including server rack and system configurations.
  - Ensure seamless integration and functionality of server systems with advanced cooling solutions and environmental management systems.
  - Validate and maintain reliability test scripts for automated and manual testing processes.
- Reliability assessment and testing:
  - Conduct comprehensive reliability assessments of accelerator systems, focusing on mechanical, thermal, and electrical stress factors.
  - Design and implement environmental stress tests to simulate data center conditions, including operational stress, thermal cycling, and signal and power integrity.
  - Evaluate material interactions and their impact on product reliability, ensuring robustness in diverse operating environments.
  - Analyze results to identify potential reliability risks and areas for design improvement.
- Functional testing and fault isolation:
  - Perform detailed functional testing to evaluate system performance under various operational conditions.
  - Identify, isolate, and troubleshoot faults using advanced diagnostic tools and methodologies.
- Failure analysis and reporting:
  - Perform root cause analysis for identified reliability failures and develop corrective actions for design and process enhancement.
  - Collaborate with cross-functional teams to conduct root cause analysis of reliability testing failures.
- Collaboration and documentation:
  - Work closely with design, manufacturing, and quality teams to align reliability goals with overall product requirements.
  - Generate comprehensive reports detailing reliability test results, analysis, and recommendations.
  - Maintain meticulous records of testing methodologies and outcomes for future reference and continuous improvement initiatives.
- Mentorship:
  - Mentor junior engineers, providing guidance in both technical domains and professional skill development to foster growth and team success.

**PREFERRED EXPERIENCE**:

- Knowledge of reliability engineering principles, product lifecycle, standards, and best practices in high-performance computing environments.
- Proven experience in system-level setup and testing for accelerator products or similar technologies.
- Proficiency in developing and executing reliability test scripts and protocols.
- Familiarity with data center environmental management, server rack/system configurations, and integrated cooling solutions.
- Strong understanding of environmental stress factors, including thermal, mechanical, and electrical stresses, in server systems (integration levels L6 to L10).
- Expertise in failure analysis techniques, including root cause analysis and fault isolation methodologies.
- Excellent written and verbal communication skills for clear reporting and collaboration.
- Strong analytical and problem-solving skills.
- Experience with reliability testing tools, simulation software and statistical tools is an added advantage.
- Knowledge of project and risk management is an added advantage.
- Self-starter, able to drive tasks to completion independently.
- Ability to structure and execute complex analysis, draw insights, and communicate summary conclusions/recommendations to senior management and AMD customers/partners.
- Ability to network, build relationships, and collaborate to drive effective decision-making across multiple functions and levels within AMD.

**ACADEMIC CREDENTIALS**:

- Bachelor’s or Master’s degree in Electrical/Electronics Engineering (EE) or a related field.

**LOCATION**:

- Singapore
