About the job
Devsinc, a forward-thinking technology company, is seeking a Machine Learning Engineer with more than 2 years of hands-on experience developing and optimizing Generative AI models (including LLMs and diffusion models), Vision-Language Models (VLMs), and classical and deep learning systems. In this role, you will own the complete lifecycle of AI solutions, from conception to production deployment.
Your modeling and MLOps expertise will be central to high-impact projects such as building Generative AI applications, implementing Stable Diffusion, and developing solutions for OCR, theft detection, and recommendation systems. You will design, optimize, and deploy custom models for real-world applications.
Key Responsibilities:
- Develop production inference stacks: Optimize models (PyTorch → ONNX → TensorRT), apply quantization and pruning, and deliver low-latency GPU inference while maintaining accuracy.
- Build robust model-serving infrastructure: Implement FastAPI/gRPC inference services, manage model versioning, and conduct A/B testing.
- Create Computer Vision solutions: Design pipelines for object detection, theft detection, OCR, and surveillance analytics; fine-tune Hugging Face pretrained models.
- Fine-tune generative models for consistent image generation across various brand styles and downstream tasks.
- Train Vision-Language Models (VLMs) for multimodal tasks, using both from-scratch training and transfer learning.
- Design LLM-based Generative AI systems for conversational agents and domain-specific applications.
- Implement MLOps practices: Automate CI/CD processes and monitor models for performance and drift.
- Develop data acquisition pipelines: Create scrapers and ingestion systems with proxy rotation and rate-limit management.
- Integrate third-party models and APIs, and design hybrid inference strategies for optimal performance.

