Title: Principal AI Engineer - Scalable Systems, AI.DA STC

Job ID:  19694
Location: 

ST Engineering Hub, SG

Description: 

About the Role

We’re looking for a senior Software Engineer to help us build the next-generation agentic AI platform for computer vision. You’ll own core components of the platform that orchestrate AI agents, power automated ML workflows, and deliver robust, production-grade systems.

This role blends backend excellence, infrastructure ownership, and a collaborative engineering mindset to scale the capabilities of AI Engineers and AI Developers. You’ll work closely with AI colleagues to bring intelligent systems to life, helping them move their work from local experimentation to full production.

This is a hands-on, high-impact role for someone who thrives at the intersection of scalable systems and cutting-edge AI.

Key Responsibilities

Platform & Backend Development

  • Design backend services (Python, FastAPI, gRPC) to support agent workflows, computer vision pipelines, and evaluation loops.
  • Build scalable APIs for orchestration, task management, vector search, and model serving.

Infrastructure & Deployment

  • Own CI/CD pipelines (GitHub Actions, Terraform) and production deployments.
  • Develop infrastructure for memory stores, compute orchestration, and model packaging (Docker, TorchServe, BentoML).

Engineering Excellence

  • Establish quality practices including testing (Pytest), monitoring, and observability (Prometheus/Grafana).
  • Ensure fault-tolerant, modular, and scalable system design.

Collaboration & Leadership

  • Mentor peers through code reviews, documentation, and clean architecture.
  • Lead system design discussions and integration with AI and platform teams.

Must-Have Skills

  • 6+ years of software engineering, including 2+ in AI/ML environments.
  • Proficient in Python and production-grade API development (FastAPI, Flask, gRPC).
  • Experience with CI/CD and infrastructure-as-code (GitHub Actions, Terraform).
  • Skilled in containerization (Docker, Kubernetes) and cloud platforms (AWS, GCP, or Azure).
  • Familiarity with databases: SQL, NoSQL, and vector DBs (FAISS, Weaviate, pgvector).
  • Understanding of ML lifecycles: data ingestion, inference, monitoring, and recovery.
  • Proven ability to design distributed systems (API gateways, data pipelines, compute orchestration).

Bonus Skills

  • Familiarity with AI agent frameworks (LangChain, AutoGen, CrewAI).
  • Understanding of computer vision concepts and deployment challenges.
  • Exposure to LLM APIs or GenAI integrations.
  • Experience with ML observability and error logging systems.
  • Knowledge of front-end prototyping tools (Gradio, Streamlit, etc.).

What We Offer

  • Small, agile team (5–6 engineers + interns) with autonomy and real ownership.
  • Startup feel with big-company resources:
    • International environment: most of the team and leadership come from startups or large international corporations (Lazada, Gojek, IBM) and from a range of countries.
    • Low-bureaucracy, high-impact startup environment where your code directly supports next-gen AI deployment.
    • Experimentation and self-development are part of our culture.
    • Knowledge sharing and collaboration.
  • Direct collaboration with top AI researchers and computer vision scientists.
  • This role is on a 2-year employment contract.
  • Hybrid work setup: ~2–3 days in office per week.