About the job
About ALSO.
At ALSO, we are revolutionizing electric mobility. Originally part of Rivian, we are a dedicated team of innovators committed to designing and developing cutting-edge, vertically integrated electric vehicles (EVs) that address the mobility challenges of today and tomorrow. Our mission is to inspire communities to choose ALSO by offering vehicles that are not only affordable and enjoyable but also 10 to 50 times more efficient than traditional vehicles.
We are seeking a Senior Site Reliability Engineer to join our team. In this pivotal role, you will architect and maintain scalable, cloud-native systems that support vehicle telemetry, fleet management, and data analytics. The ideal candidate has deep knowledge of distributed systems, Kubernetes, AWS infrastructure, and data pipelines, with a strong emphasis on reliability and operational excellence.
Key Responsibilities:
Operationalize microservices platforms using Kubernetes (EKS) and AWS ECS.
Enhance vehicle telemetry data ingestion and optimize data pipelines using streaming technologies (Kafka/Kinesis) for high-throughput, low-latency workloads.
Lead initiatives focused on reliability engineering, including the establishment of SLOs, SLIs, and incident response protocols.
Deploy advanced observability solutions using tools such as Datadog, Grafana, and comprehensive logging pipelines.
Design and maintain API Gateway-based service architectures.
Manage on-call rotations and incident response frameworks using PagerDuty.
Automate infrastructure provisioning with Terraform and CI/CD tools (ArgoCD, Concourse).
Advance system resilience, failover strategies, and multi-region reliability.
Collaborate with product and platform teams to enhance vehicle lifecycle management and cohort analytics.

