AI-Driven Big Data Engineer

4 days ago


Singapore, Singapore Pixalate, Inc Full time

Employment Type: Full-Time
Location: Remote, Singapore
Level: Entry to Mid Level (PhD Required)

Bridge Cutting-Edge AI Research with Petabyte-Scale Data Systems

Pixalate is an online trust and safety platform that protects businesses, consumers and children from deceptive, fraudulent and non-compliant mobile, CTV apps and websites. We're seeking a PhD-level Big Data Engineer to revolutionize how AI transforms massive-scale data operations.

Our impact is real and measurable. Our software has uncovered:

  • Gizmodo: An iCloud Feature Is Enabling a $65 Million Scam
  • Washington Post: Your kids' apps are spying on them
  • ProPublica: Porn, Piracy, Fraud: What Lurks Inside Google's Black Box Ad Empire

About the Role

Work at the intersection of big data and AI, where you'll develop intelligent, self-healing data systems processing trillions of data points daily. You'll have autonomy to pursue research in distributed ML systems and AI-enhanced data optimization, with your innovations deployed at unprecedented scale within months, not years.

This isn't traditional data engineering - you'll implement agentic AI for autonomous pipeline management, leverage LLMs for data quality assurance, and create ML-optimized architectures that redefine what's possible at petabyte scale.

Key Research Areas & Responsibilities

AI-Enhanced Data Infrastructure

  • Design intelligent pipelines with autonomous optimization and self-healing capabilities using agentic AI
  • Implement ML-driven anomaly detection for terabyte-scale datasets

Distributed Machine Learning at Scale

  • Build distributed ML pipelines
  • Develop real-time feature stores for billions of transactions
  • Optimize feature engineering with AutoML and neural architecture search

Required Qualifications

Education & Research

  • PhD in Computer Science, Data Science, or Distributed Systems (exceptional Master's with research experience considered)
  • Published research or expertise in distributed computing, ML infrastructure, or stream processing

Technical Expertise

  • Core Languages: Expert SQL (window functions, CTEs), Python (Pandas, Polars, PyArrow), Scala/Java
  • Big Data Stack: Spark 3.5+, Flink, Kafka, Ray, Dask
  • Storage & Orchestration: Delta Lake, Iceberg, Airflow, Dagster, Temporal
  • Cloud Platforms: GCP (BigQuery, Dataflow, Vertex AI), AWS (EMR, SageMaker), Azure (Databricks)
  • ML Systems: MLflow, Kubeflow, Feature Stores, Vector Databases, scikit-learn + search CV, H2O AutoML, auto-sklearn, GCP Vertex AI AutoML Tables
  • Neural Architecture Search: KerasTuner, AutoKeras, Ray Tune, Optuna, PyTorch Lightning + Hydra

Research Skills

  • Track record with 100TB+ datasets
  • Experience with lakehouse architectures, streaming ML, and graph processing at scale
  • Understanding of distributed systems theory and ML algorithm implementation

Preferred Qualifications

  • Experience applying LLMs to data engineering challenges
  • Ability to translate complex AutoML/NAS research into practical production workflows
  • Hands-on project examples of feature engineering automation or NAS experiments
  • Proven success in automating ML pipelines, from raw data to an optimized model architecture
  • Contributions to Apache projects (Spark, Flink, Kafka)
  • Knowledge of privacy-preserving techniques and data mesh architectures

What Makes This Role Unique

You'll work with one of the few truly petabyte-scale production datasets outside of major tech companies, with the freedom to experiment with cutting-edge approaches. Unlike traditional big data roles, you'll apply the latest AI research to fundamental data challenges - from using LLMs to understand data quality issues to implementing agentic systems that autonomously optimize and heal data pipelines.


  • Senior Data Engineer

    2 weeks ago


    Singapore, Singapore PLAUD ai Full time

    ABOUT PLAUD AI PLAUD AI is a pioneering AI-native hardware and software company that turns meetings and conversations into actionable insights with AI devices like PLAUD NOTE and PLAUD NotePin. By recording, transcribing, and summarizing real-life conversations, our solutions boost productivity and save time. Designed for precision and flexibility, whether...

  • AI Data Engineer

    4 weeks ago


    Singapore, Singapore INNOCELLENCE SYSTEMS PTE. LTD. Full time

    We are looking for a skilled and experienced AI Data Engineer to join our team. The ideal candidate will be responsible for designing, building, and maintaining robust data pipelines to support the processing and analysis of clinical study and digital device sensor data. As a Data Engineer, you will work closely with data scientists and software engineers to...


  • Big Data Platform Engineer

    1 week ago


    Singapore, Singapore Centre for Strategic Infocomm Technologies (CSIT) Full time

    Data processing is an essential part of national security in the modern world. As an engineer in...

  • Applied AI Scientist

    4 weeks ago


    Singapore, Singapore Bifrost AI Full time

    Applied AI Scientist - Synthetic Data for Perception

  • AI Data Engineer

    4 weeks ago


    Singapore, Singapore InnoCellence Full time

    We are looking for a skilled and experienced Data Science Engineer to join our team. The ideal candidate will be responsible for designing, building, and maintaining robust data pipelines to support the processing and analysis of clinical study and digital device sensor data. As a Data Science Engineer, you will work closely with data scientists and software...

  • Big Data Engineer

    7 days ago


    Singapore, Singapore LION & ELEPHANTS CONSULTANCY PTE. LTD. Full time

    Job Title: Big Data Engineer (Java, Spark, Hadoop). Location: Singapore. Experience: 7–12 years. Employment Type: Full-Time. Open to Citizens and SPR only | No visa sponsorship available. Job Summary: We are looking for a Senior Big Data Engineer with 7–12 years of experience to join our growing data engineering team. The ideal candidate will bring deep...

  • AI/Data Engineer

    2 days ago


    Singapore, Singapore VISTRA Full time

    Overview: It’s never been a more exciting time to join Vistra. At Vistra our purpose is progress. We believe that our clients have the power to change the world and to do great things for global progress, and we exist to remove the friction that comes from the complexity of global business – to help our clients...

  • Big Data Engineer

    2 days ago


    Singapore, Singapore DCS CARD CENTRE PTE. LTD. Full time

    Key Responsibilities Lead the design and implementation of a high-performance, real-time data platform for financial systems, supporting petabyte-scale daily transaction data across ingestion, storage, and computation layers. Architect a cloud-native data warehouse solution on AWS, incorporating Spark and Flink to support both batch and streaming use cases...


  • Snr Data Scientist, Data & AI Engineering


    Singapore, Singapore Singtel Group Full time

    NCS is a leading technology services firm that operates across the Asia Pacific region in over 20 cities, providing consulting, digital services, technology solutions, and more. We believe in harnessing the power of technology to achieve extraordinary things, creating...

  • Big Data Engineer

    7 days ago


    Singapore, Singapore Baidu, Inc. Full time

    Build the company's big data warehouse system, including batch and stream data flows. Develop an in-depth understanding of business systems and customer needs, design and implement big data systems that meet those needs, and ensure smooth project acceptance. Take responsibility for data integration and ETL architecture design and development....