You will:
• Collaborate with data scientists and analysts to understand data requirements and translate them into scalable, high-performance data pipeline solutions.
• Support data discovery & data preparation for model development. Perform detailed analysis of raw data sources by applying business context and collaborate with cross-functional teams to transform raw data into curated & certified data assets to be used for ML and BI use cases.
• Collaborate with data science and data engineering teams to build scalable and reproducible machine learning pipelines for training and inference.
• Deploy machine learning models into operations and processes via batch, streaming, and API methods.
• Monitor and troubleshoot data pipeline performance, identifying and resolving bottlenecks and issues.
• Develop, test, and maintain robust tools, frameworks, and libraries that standardize and streamline the data & machine learning lifecycle.
• Contribute to developing and maintaining an end-to-end MLOps lifecycle that automates the development and delivery of machine learning solutions.
• Implement a robust monitoring framework for model performance.
• Collaborate with cross-functional teams of Data Science, Data Engineering, business units and various IT teams.
• Create and maintain effective documentation for projects and practices, ensuring transparency and effective team communication.
You have:
• Bachelor’s or master’s degree in Computer Science, Data Science, Engineering, or a related field, with 5 years of experience.
• 4 years of experience working with Python, SQL, PySpark, and Bash scripts. Proficiency in the software development lifecycle and software engineering practices.
• 2 years of hands-on experience using the Databricks platform.
• 3 years of hands-on experience operationalizing machine learning solutions that are used in live production processes.
• 2 years of experience and proficiency in API development using the FastAPI framework, and familiarity with containerization technologies such as Docker or Kubernetes.
• 3 years of experience developing and maintaining robust data pipelines that supply data used by data scientists to build ML models.
• 3 years of experience working with cloud data warehousing platforms (Redshift, Snowflake, Databricks SQL, or equivalent) and experience working with distributed frameworks such as Spark.
• Solid understanding of machine learning life cycle, data mining, and ETL techniques.
• Experience with machine learning frameworks (such as Keras or PyTorch) and libraries (such as scikit-learn and XGBoost).
• Hands-on experience building and maintaining tools and libraries that have been used by multiple teams across the organization.
• Proficient in understanding and incorporating software engineering principles in the design and development process.
• Hands-on experience with CI/CD tools (e.g., Jenkins or equivalent), version control (GitHub, Bitbucket), and orchestration (Airflow, Prefect, or equivalent).
• Excellent communication skills and the ability to work and collaborate with cross-functional teams across technology and business.
Good to have:
• Familiarity with deep learning frameworks and deploying deep learning models for production use cases.
• Familiarity with using GPU compute for either model training or inference.
• Understanding of large language models (LLMs) and the MLOps lifecycle for operationalizing them.
Full Time
Restaurants & Catering Services
$115k-142k (estimate)
06/23/2024
07/21/2024
stealthmedia.com
New York, NY
25 - 50