DevOps Engineer – AI Infrastructure
2 days ago
About the Role
Raydian Cloud is seeking a forward-thinking DevOps Engineer to help build and scale infrastructure that powers cutting-edge AI workloads. You'll work at the intersection of cloud-native technologies and Artificial Intelligence operations (AIOps), enabling high-performance, secure, and automated environments for AI development and deployment. Your expertise in Infrastructure as Code and Kubernetes will be critical in supporting scalable AI pipelines and platform services.
Key Responsibilities
- Design and manage cloud infrastructure optimized for AI/ML workloads using Infrastructure as Code (Terraform, Pulumi, etc.)
- Deploy and maintain Kubernetes clusters tailored for GPU scheduling, distributed training, and inference workloads
- Build CI/CD pipelines for AI model training, validation, and deployment across environments
- Collaborate with data scientists and ML engineers to streamline model lifecycle management
- Implement observability and monitoring for AI services (e.g., Prometheus, Grafana, OpenTelemetry)
- Ensure infrastructure security, compliance, and cost-efficiency in multi-tenant AI environments
- Automate provisioning of AI-specific resources (e.g., GPU nodes, storage volumes, feature stores)
- Document infrastructure patterns, DevOps workflows, and platform architecture
Required Skills & Qualifications
- Strong experience with Kubernetes, including GPU scheduling and Helm
- Proficiency in Infrastructure as Code tools (Terraform, Pulumi, etc.)
- Familiarity with cloud platforms (AWS, Azure, GCP) and AI services (e.g., SageMaker, Vertex AI)
- Experience with CI/CD tools (GitHub Actions, GitLab CI, Argo Workflows)
- Scripting skills in Python, Bash, or Go
- Understanding of ML model lifecycle and data pipeline orchestration
- Excellent communication and collaboration skills across technical and business teams
Nice to Have
- Experience with Kubeflow, MLflow, or similar MLOps frameworks
- Knowledge of containerized AI workloads (e.g., TensorFlow Serving, Triton Inference Server)
- Familiarity with service mesh technologies (Istio, Linkerd) in AI microservices
- Certifications in Kubernetes or cloud platforms (e.g., CKA, AWS Certified DevOps Engineer)
Why Join Raydian Cloud?
- Shape the future of AI infrastructure and platform services
- Work with a visionary team blending deep tech and strategic execution
- Influence architecture decisions in a fast-moving AI startup environment
- Competitive compensation, flexible work culture, and growth opportunities
-
Senior DevOps Engineer
2 weeks ago
Singapore Goodnotes Full time
Overview
At Goodnotes, we believe that every individual holds untapped potential waiting to be unleashed. By reimagining the way we interact with information, we're merging human creativity with the breakthrough capabilities of AI. Our renewed vision and mission drive us to create the best medium for human and AI collaboration, empowering users to explore...
-
Backend Engineer
5 days ago
Singapore Manus AI Full time
We’re seeking a skilled Backend Engineer to join our growing team and drive the development of Manus’s core infrastructure and business systems. This role offers the opportunity to work across two specialized tracks: infrastructure engineering and business application...
-
Backend Engineer
2 weeks ago
Singapore Manus AI Full time
We're seeking a skilled Backend Engineer to join our growing team and drive the development of Manus's core infrastructure and business systems. This role offers the opportunity to work across two specialized tracks: infrastructure engineering and business application development, allowing you to shape both the technical foundation and user-facing features...
-
DevOps Engineer
14 hours ago
Singapore Triton AI Pte Ltd Full time
Perm
- Must have experience designing and building Microsoft Azure cloud infrastructure and hold a Microsoft Azure Administrator (AZ-104) certification.
- Minimum 3 years of deep knowledge of DevOps practices and processes; prior working experience as a DevOps Engineer.
**Job Description**:
- Design and development of innovative solutions in the cloud, deliver cloud migration...
-
DevOps Engineer, Data & AI Infra
2 weeks ago
Singapore Airwallex Full time
About Airwallex
Airwallex is the only unified payments and financial platform for global businesses. Powered by our proprietary infrastructure and software, we empower over 150,000 businesses worldwide – including Brex, Rippling, Navan, Qantas, SHEIN and many more – with fully...
-
DevOps Trainee
1 week ago
Singapore THERATECH.AI PTE. LTD. Full time
We're seeking a DevOps Trainee / Junior DevOps Engineer who can manage day-to-day technical and facility-related tasks while having the opportunity to learn and grow into Cloud or DevOps roles. If you're someone who loves solving technical problems, enjoys hands-on facility work, and wants a clear path to build modern IT skills, this is the perfect role for...
-
DevOps & Infrastructure Engineer
1 week ago
Singapore Selby Jennings Full time
Our client is a global blockchain technology company and a founding contributor to one of the world's leading Layer 1 ecosystems. With a mission to drive adoption of decentralized technologies, the company builds and maintains critical infrastructure powering a widely used crypto wallet, blockchain nodes, staking pools, and enterprise-grade cloud services....
-
Backend Engineer
2 days ago
Singapore Manus AI Full time
**Key Responsibilities**
Core Backend Development
- Build high-performance, scalable backend services using Golang to support core AI product features
- Design, develop, and optimize APIs (RESTful and GraphQL) ensuring efficient frontend-backend interactions
- Participate in system architecture design to ensure stability, reliability, and scalability at...
-
DevOps Engineer
1 week ago
Singapore Finsocial Full time
Operations Remote / Singapore Full-time
**About This Role**: We're seeking a DevOps Engineer to build and maintain our cloud infrastructure and deployment pipelines. You'll ensure our AI and blockchain solutions are deployed efficiently, securely, and with high availability.
**Responsibilities**:
- Manage cloud infrastructure on AWS, GCP, or Azure
-...
-
DevOps Engineer
3 days ago
Singapore Zettabyte Full time
Why this role exists
We’re looking for an Operations Engineer to help us design, build, and maintain the infrastructure powering our 10,000+ GPU cloud platform. You’ll be responsible for keeping our systems highly available, secure, and performant while working closely with backend, frontend, and infrastructure teams to enable rapid development and...