What are the responsibilities and job description for the Mainframe Trans24 Developer position at Cerebra Consulting Inc?
Job Details
Engineer Level
6-month contract to start; will likely extend into a long-term, multi-year contract.
100% remote. Must work EST hours.
Required Skills:
The project involves operationalizing models, creating CI/CD pipelines, monitoring, alerting, and working closely with the 84.51 data science team.
The focus is on the serving side/ops.
Tech Stack: Vertex AI, Python, TensorFlow, PyTorch, MLOps (not model training: the team does not train the models; the work is production and MLOps, such as creating CI/CD, monitoring, and alerting).
Must Haves: Vertex AI; Python and Python libraries; MLOps experience (the team does not want a Data Scientist/ML Engineer whose focus is training models); strong soft skills. Collaboration is key for this role because the group works with multiple teams/LOBs, so the candidate will need to share new ideas, be patient, and be a team player.
Job Description:
We are seeking a dynamic Senior Software Engineer with an ML focus to lead the integration and operationalization of machine learning models in our Search area. This role requires close collaboration with data scientists and leadership teams, leveraging MLOps best practices to ensure smooth deployment and operation of ML models. The ideal candidate will have expertise in diverse ML platforms, including Google Vertex AI, cloud technologies, and open-source solutions.
This role sits at the intersection of MLOps, data science, and software engineering, ensuring the robustness and scalability of our ML infrastructure.
Required Qualifications:
- 5 years of experience in software engineering with a focus on machine learning and MLOps.
- Expertise in Google Vertex AI, cloud ML platforms, and open-source ML tools.
- Hands-on experience with recommender systems and deep learning frameworks.
- Strong software engineering skills to integrate ML models into large-scale applications.
- Experience with A/B testing, model evaluation, and optimization techniques.
- Solid understanding of infrastructure needs for ML deployment (GPU/CPU, networking, scaling).
- Proficiency in Python, TensorFlow, PyTorch, and distributed computing frameworks.
- Strong collaboration skills to work with data scientists, engineers, and leadership teams.
Key Responsibilities:
Recommender Systems & ML Model Development
- Develop and integrate recommender systems into customer-facing products.
- Implement ML techniques such as embedding-based retrieval, reinforcement learning, and transformers.
- Collaborate with engineering teams to ensure seamless model integration.
- Drive A/B testing and iterative optimization using data-driven methodologies.
- Assess infrastructure needs for ML deployment, including CPU/GPU resources and networking requirements.
Feature Store Management
- Efficiently manage, share, and reuse machine learning features at scale using Vertex AI Feature Store.
- Implement centralized feature stores to maintain transparency and consistency across ML operations.
- Enable secure and scalable feature delivery while maintaining access control and governance.
Data Management & Collaboration
- Work with data engineers and scientists to ensure high-quality labeled datasets.
- Ensure end-to-end integration of data pipelines with AI workflows using BigQuery and Bigtable.
- Optimize data structures and storage to enhance model performance and efficiency.
Continuous Monitoring & Optimization
- Monitor ML systems in production to identify bottlenecks and improvement opportunities.
- Implement automation strategies to improve model retraining, deployment, and performance tracking.
- Participate in support rotations and troubleshoot production ML issues as needed.
- Hands-on experience working on recommender systems, drawing from ML techniques such as embedding-based retrieval, reinforcement learning, and transformers.
- Software engineering skills to work with teams integrating the recommender systems into customer-facing products.
- Experience in A/B testing and iterative optimization using data-driven approaches.
- Understanding of infrastructure needs required to deploy ML systems (CPU/GPU, networking infrastructure).
Please share matching profiles at