Qualifications
Key Responsibilities
ML/LLM Platform Development
- Operationalize model training, evaluation, packaging, and deployment utilizing Databricks, Delta Lake, and medallion architecture.
- Implement Unity Catalog for model governance, lineage tracking, and access control.
- Create reusable job templates, cluster policies, and standardized deployment patterns.
ML/LLM Production Deployment
- Deploy and manage ML and GenAI solutions including risk scoring, anomaly detection, predictive maintenance, NLP, and RAG pipelines.
- Build and optimize LLM pipelines utilizing vector databases, model serving endpoints, and inference workflows.
- Enhance model performance through quantization, caching, and tuning techniques.
- Establish batch and real-time inference pipelines with defined SLAs.
Reliability, Security & Compliance
- Implement data contracts, schema validation, and data quality checks across ML pipelines.
- Ensure secure handling of sensitive data, including PII detection, classification, and obfuscation.
- Maintain full lineage from data sources to deployed models and serving endpoints.
- Enforce data residency, governance, and compliance policies.
CI/CD Automation & Testing
- Develop CI/CD pipelines using GitHub Actions and Databricks Asset Bundles.
About the job
At Irth Solutions, we are at the forefront of technological innovation, developing advanced software platforms that empower organizations with data-driven insights in areas such as Damage Prevention, Asset Integrity, Land Management, and Stakeholder Engagement. Our vibrant product culture, collaborative atmosphere, and significant growth opportunities provide a stimulating environment for professionals eager to work on enterprise-level data and AI platforms.
We are currently building a governed, multi-cloud Databricks Lakehouse across AWS, Azure, and GCP that will serve as the backbone for analytics, AI/ML advancements, and customer-oriented AI products.
Role Overview
As an MLOps / LLMOps Engineer, your primary responsibility will be to design, automate, and manage scalable ML and LLM systems on our enterprise Lakehouse platform. You will collaborate closely with Data Science, Engineering, and Product teams to implement reliable, secure, and production-ready ML and GenAI solutions. This role emphasizes operationalizing ML models, developing CI/CD pipelines, ensuring governance and compliance, and maintaining high-performance, observable AI systems.