What are the responsibilities and job description for the AI/ML Engineer position at Cloud and Things?
Our goal is to solve problems and deliver results for our clients. At Cloud and Things, you can be a part of transforming the public sector’s IT environment. Our team is at the forefront of helping to solve the government's most complex IT challenges. If you are seeking a role that offers the opportunity to work on rewarding projects, consider a career with Cloud and Things.
Overview:
We are seeking a highly skilled and experienced Senior AI/ML Engineer to work with our client. This role is responsible for architecting an on-premises AI/ML environment, ensuring a robust and scalable MLOps pipeline, and integrating model outputs into business reporting tools such as Power BI and Oracle Analytics, or exposing them through APIs.
Duties:
- AI/ML Infrastructure & Deployment
- Architect and deploy an on-prem AI/ML environment, including GPU clusters and high-performance computing resources.
- Collaborate with infrastructure teams to test and optimize networking, storage, and compute resources for AI workloads.
- Implement scalable storage solutions (e.g., distributed file systems, object storage) for efficient data handling.
- Ensure system reliability, security, and performance through best practices in Linux system administration and resource scheduling.
- Configure AI model training and inference environments, leveraging containerization (Docker, Kubernetes) and MLOps pipelines.
- Design and implement MLOps processes to support efficient model training, validation, deployment, and monitoring.
- Configure and set up an ML platform in Oracle Cloud from scratch, ensuring a scalable and production-ready infrastructure.
- Collaborate with cross-functional teams to understand data requirements and integrate AI/ML solutions into existing enterprise systems.
- Work with developers to integrate AI model outputs into business intelligence tools such as Power BI and Oracle Analytics.
Qualifications:
- Master’s or Ph.D. in Computer Science, Data Science, Machine Learning, or a related field.
- 3 years of experience in AI/ML engineering with a focus on infrastructure, MLOps, and cloud AI deployment.
- Experience configuring and setting up ML platforms on-premises or in Oracle Cloud from scratch.
- Strong expertise in Linux-based AI/ML environments, including performance optimization, package management, and shell scripting.
- Experience with HPC environments, GPU clusters (H100, A100, or similar), and distributed AI workloads.
- Strong programming skills in Python and experience with AI/ML frameworks such as TensorFlow, PyTorch, or similar.
- Hands-on experience with MLOps, including model training, validation, deployment, and monitoring.
- Experience integrating AI/ML models into business intelligence tools (Power BI, Oracle Analytics, or APIs).
- Experience with high-speed networking, storage solutions, and AI/ML system performance tuning.
- Ability to work onsite in Downtown Brooklyn, NY.