What are the responsibilities and job description for the Data Engineer position at Oliver James?
This position is no longer open for applications.

Qualifications:
- Master's degree or PhD in a related field.
- Proficient in Python.
- Strong background in software engineering.
- Meticulous in preventing and catching data mistakes.
- Enthusiastic about engaging deeply with raw data.
- Committed to adhering to engineering best practices.

Responsibilities:
- Demonstrate a strong understanding of the significance of high-quality data for creating high-performance machine learning systems.
- Integrate novel, high-quality text data sources into established data pipelines.
- Build models dedicated to precise classification and extraction of valuable text from raw HTML.
- Develop a sophisticated OCR pipeline to extract high-quality pretraining text from images and scans.
- Amass an extensive volume of multimodal data, exemplified by the collection of video transcripts spanning thousands of years.
- Devise innovative data generation pipelines that capitalize on existing data, such as the conversion of code from one programming language to another.
- Unify various annotation service providers into a user-friendly interface tailored for researchers.