About the job
About Etched
Etched is building the world's first AI inference system designed specifically for transformers, delivering over 10x higher performance at significantly lower cost and latency than B200 systems. Our custom ASICs make possible products that were previously unattainable with GPUs, such as real-time video generation models and advanced chain-of-thought reasoning agents. With substantial backing from leading investors and a team of top engineers, Etched is reshaping the infrastructure of one of the fastest-growing industries in history.
Key Responsibilities
Assist in porting cutting-edge models to our architecture and contribute to the development of programming abstractions and testing capabilities to streamline the model porting process.
Develop, enhance, and scale Sohu’s runtime, focusing on multi-node inference, intra-node execution, state management, and effective error handling.
Optimize routing and communication layers utilizing Sohu's collectives.
Employ performance profiling and debugging tools to pinpoint bottlenecks and correctness challenges.
Ideal Candidate Profile
Strong proficiency in C++ or Rust programming languages.
Solid understanding of performance-critical and complex distributed software systems, including Linux internals, accelerator architectures (e.g., GPUs, TPUs), compilers, and high-speed interconnects (e.g., NVLink, InfiniBand).
Familiarity with machine learning frameworks such as PyTorch or JAX.
Experience in porting applications to non-standard accelerator hardware or platforms.
Preferred Qualifications
Experience in developing low-latency, high-performance applications using both kernel-level and user-space networking stacks.
In-depth understanding of distributed systems concepts, algorithms, and challenges, including consensus protocols and communication patterns.
Thorough knowledge of Transformer architectures, particularly Mixture-of-Experts (MoE).
Experience building applications with substantial SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths.
Benefits
- Health Insurance
- 401(k)

