We are an early-stage deep-tech company operating in stealth. Our mission is to build a next-generation platform that unifies AI, computational modeling, and advanced R&D workflows for some of the most demanding scientific and industrial environments. The stakes are high. The problems are real. The impact is global.
We are looking for a Senior DevOps/Infrastructure Engineer who can operate at the founding-team level: someone who understands that robust infrastructure is not just about keeping the lights on – it is the bedrock that allows complex science and AI to scale with unprecedented precision and efficiency.
What You Will Lead
Architecture for Hybrid & Complex Environments
You will design and own the infrastructure for a platform that spans cloud, on-premise, and hybrid environments. You will ensure seamless orchestration between standard microservices and high-performance computational workloads.
Security, Hygiene, and Automation
You will define the security posture from day one – managing IAM, secrets, and policies – while building the automation that makes compliance invisible and deployment effortless.
Reliability for Mission-Critical R&D
You will implement the observability stack, built on OpenTelemetry (OTel), and the telemetry strategies required to monitor distributed systems, so that multi-step AI and simulation workflows run reliably without constant intervention.
What Exceptional Looks Like
- You think in systems, not just scripts. You design infrastructure that stays coherent as complexity scales
- You live and breathe automation. You treat manual intervention as a bug. You write Infrastructure as Code (IaC) that others can safely build on
- You don't just deploy containers; you understand the topology, the permissions model, and the security implications of moving data across hybrid boundaries
- You are comfortable in ambiguous early-stage environments where initiative is expected, not requested
Responsibilities
- Own the Infrastructure Stack: Lead the design and implementation of our infrastructure using Terraform (IaC) across cloud and hybrid setups
- Build and administer VMs and containers using Docker and Kubernetes, enforcing rigorous security and permission models
- Design effective CI/CD pipelines and implement GitOps practices to accelerate development velocity
- Establish best practices for IAM, secrets management, and overall infrastructure hygiene to mitigate reliability and security risks
- Implement a full observability setup with OpenTelemetry (OTel) for telemetry across distributed and remote systems
- Design and maintain secure, efficient network topologies suitable for data-intensive applications
Required Experience
- 7+ years of experience building and maintaining production-grade infrastructure
- Extensive experience managing cloud (AWS/GCP), on-prem, and hybrid environments
- Advanced understanding of Docker (including security/permissions) and hands-on experience administering Kubernetes
- Deep experience designing CI/CD pipelines and implementing GitOps workflows
- Strong understanding of modern network topologies and security protocols
- Strong experience with Terraform/Pulumi
- Demonstrated experience with IAM, secrets management, and security policies
Good to Have
- Exposure to HPC schedulers (Slurm, LSF, PBS) or HPC container runtimes such as Apptainer
- Experience with workflow engines (Airflow, Temporal)
- Any background in MLOps/LLMOps, or interest in growing into it
- Experience administering relational or NoSQL databases
Why Join?
You will be part of a small, focused team building the foundations of a platform that does not yet exist. The scope is large, the ownership is real, and the work touches both engineering fundamentals and cutting-edge research. If you want to build something with long-term weight, this is the right place.