About the job
About the Team You Will Join
- The ML Engineer (OCR) will be part of the Automation Platform Team at Toss Securities.
- The Automation Platform Team (APT) is dedicated to solving problems through technology to provide real value, aiming to enhance Toss Securities' productivity by 10x and create sustainable scale-up opportunities.
- This team develops and operates various automation products, including OCR, scraping, and QA automation, and handles the engineering work end to end.
Your Responsibilities Upon Joining
Develop OCR for Retail Operations Automation at Toss Securities.
- Automate various retail operations currently handled manually using OCR technology.
- Gradually integrate documents issued by external organizations and internal review content into the OCR pipeline, improving recognition rates for edge cases and expanding coverage.
- Collaborate directly with domain POs/engineers to define the problems OCR needs to address and design realistic solutions.
Build the OCR Learning Pipeline.
- Fine-tune and enhance OCR/VLM models.
- Lead all stages as an ML Engineer, from data collection, preprocessing, and augmentation through evaluation and deployment.
Manage the OCR Model Stack.
- Maintain the current stack, including open-source OCR/VLM models, document layout models, and orientation classifiers, replacing models as needed or introducing in-house trained ones.
- Take responsibility for quality across the entire process, including pre-processing and post-processing logic.
Ensure Model Stability in the Operating Environment.
- Work closely with service engineers to enhance runtime stability and accuracy.
Ideal Candidate Profile
- Experience in Image/Document Processing: Proficient in image/document processing with Python (OpenCV, PyMuPDF) and Node.js (sharp). Experience with categorizing large volumes of images and optimizing embedding and indexing structures is a plus.
- VLM/OCR Modeling Experience: Demonstrated experience in applying and evaluating SOTA VLM/OCR models. Familiarity with domain-specific tuning techniques such as LoRA, experience optimizing the accuracy and availability of smaller models, and experience with document layout models are a plus.
- Experience in Designing Learning Data Pipelines: Experience synthesizing domain documents and automatically generating labels, or designing augmentation strategies that mimic actual input distributions (e.g., scan, fax, JPEG). Experience developing models with limited training data is a plus.
Additional Preferred Experience
- Understanding and handling the peculiarities of financial domain documents, including personal data tagging and low-quality scans.
- Ability to independently manage the entire pipeline from data collection/preprocessing to modeling and service application.
- Familiarity with tools like DVC and MLflow for managing training data and experiment results.
