About the Role
Join our innovative team at flatgigs as we develop a cutting-edge, hardware-agnostic IoT platform from scratch. We are seeking a seasoned engineer who excels in both system architecture design and backend development. This dynamic role requires you to craft robust architectures in the morning and implement production-grade code in the afternoon. As a key player in our fast-paced startup environment, you will also manage cloud infrastructure as an interim DevOps engineer until we scale.
The Ideal Candidate
You have extensive experience building IoT backend platforms, not just using them. You are well-versed in complex challenges such as device authentication at scale, MQTT broker design, time-series data ingestion performance, multi-tenant data isolation, and real-time data delivery to web clients. You have the autonomy to make architectural decisions, document them thoroughly, and stand behind them. You are disciplined in remote work and address risks proactively, before they escalate into issues.
Key Responsibilities
Platform Architecture
- Design a comprehensive end-to-end IoT platform architecture including device connectivity, MQTT/protocol ingestion, stream processing, time-series storage, and real-time WebSocket delivery.
- Define a multi-tenant data model ensuring strict data isolation across customers with tenant-scoped API tokens and row-level security.
- Architect the device lifecycle management system incorporating provisioning, X.509/JWT authentication, device registry, status tracking, and decommissioning.
- Design a protocol abstraction layer that accommodates MQTT, Modbus, OPC-UA, CoAP, and HTTP devices, all normalized to a unified internal data model.
- Create a configurable rule engine for event-condition-action rules facilitating alerts, automations, and integrations, requiring no coding from customers.
- Plan OTA firmware update management, covering secure delivery, versioning, rollback, and fleet orchestration.
- Write Architecture Decision Records (ADRs) for every significant technical choice.
- Plan the scaling path from 100 devices in the pilot phase to over 500,000 devices in production without structural rework.
Backend Development
- Develop core platform services from the ground up, including device management, telemetry ingestion, rule engine, notification/alerting system, OTA updates, and a multi-tenant API gateway.
- Create REST and GraphQL APIs, with comprehensive OpenAPI specifications for the REST endpoints, version-controlled from day one.
- Implement WebSocket and SSE endpoints to facilitate real-time telemetry delivery to web and mobile clients.
- Build a command-and-control system for devices with acknowledgement, retry logic, and timeout handling.
- Implement a device shadow service that provides access to the last-known state of every device, even when the device is offline.
- Write thorough unit, integration, and load tests, ensuring no service reaches staging without adequate test coverage.
- Take ownership of service reliability, including defining SLOs, creating alerting runbooks, and managing on-call incident response.
