
System Reliability Expert
1 week ago
This position involves working on cutting-edge products with a dynamic global team. You will collaborate closely with cross-functional teams to optimize system-level setups for accelerator products, including server rack and system configurations.
Main Responsibilities:
- System-Level Setup and Testing:
  - Plan and execute system-level setups for accelerator products.
  - Ensure seamless integration of server systems with advanced cooling solutions and environmental management systems.
  - Validate and maintain reliability test scripts for automated and manual testing processes.
- Reliability Assessment and Testing:
  - Conduct comprehensive reliability assessments of accelerator systems.
  - Design and implement environmental stress tests to simulate data center conditions.
  - Evaluate material interactions and their impact on product reliability.
  - Analyze results to identify potential reliability risks and areas for design improvement.
- Functional Testing and Fault Isolation:
  - Perform detailed functional testing to evaluate system performance under various operational conditions.
  - Identify, isolate, and troubleshoot faults using advanced diagnostic tools and methodologies.
- Failure Analysis and Reporting:
  - Perform root cause analysis for identified reliability failures and develop corrective actions.
  - Collaborate with cross-functional teams to conduct root cause analysis of reliability testing failures.
- Collaboration and Documentation:
  - Work closely with design, manufacturing, and quality teams to align reliability goals with overall product requirements.
  - Generate comprehensive reports detailing reliability test results, analysis, and recommendations.
  - Maintain meticulous records of testing methodologies and outcomes for future reference and continuous improvement initiatives.
- Mentorship:
  - Mentor junior engineers, providing guidance in both technical domains and professional skill development to foster growth and team success.
Required Skills and Qualifications:
- Technical Knowledge:
  - Reliability engineering principles and standards in high-performance computing environments.
  - Knowledge of the product lifecycle and of data center environmental management systems.
  - Understanding of thermal, mechanical, and electrical stresses in server systems.
- Professional Skills:
  - Excellent written and verbal communication skills.
  - Strong analytical, problem-solving, and collaboration skills.
Preferred Experience:
- Knowledge of project and risk management is an added advantage.
- Self-starter with the ability to independently drive tasks to completion.
Others:
- Education:
  - Bachelor's or Master's degree in Electrical/Electronics Engineering (EE) or a related field.
Location:
Singapore
Skills:
Cycling
Manual Testing
Budget Management
Ubuntu
Root Cause Analysis
Reliability
Administration Management
Reliability Engineering
Infrastructure Architecture
RedHat
Technical Consultation
Environmental Management Systems
Technical Engineering
Failure Analysis