What are the responsibilities and job description for the Senior Software Engineer - AI/ML position at Impax Recruitment?
About the Role
This is an exciting opportunity to join our team at Impax Recruitment as a Senior Software Engineer - AI/ML. You will be responsible for building and managing large-scale ML infrastructure, designing scalable pipelines, and exploring new training techniques.
Key Responsibilities:
- Architect and maintain distributed systems for training and inference of large machine learning models, ensuring optimal performance across all stages.
- Develop and implement end-to-end data processing pipelines capable of handling massive datasets, from ingestion and transformation to model training and deployment.
- Research and implement cutting-edge training methods, including parallelization strategies and precision trade-offs, to improve the performance and scalability of model training.
- Analyze and enhance low-level GPU operations to improve efficiency, reduce latency, and maximize hardware utilization in complex ML tasks.
- Stay updated on industry trends and advancements in ML research to incorporate new ideas and techniques into our systems.
What We're Looking For:
- A strong problem-solver who executes quickly and adapts creatively when tackling complex problems.
- Expertise in optimizing ML workloads, including techniques such as mixed-precision training and hardware optimization.
- Experience with distributed training frameworks, such as FSDP or DeepSpeed, and cloud platforms, including GCP, AWS, or Azure.
- Hands-on experience with containerization and orchestration tools like Docker and Kubernetes.
- Expertise in distributed systems and scalable serving, including building task management systems and deploying ML models in production environments.
- Knowledge of monitoring and observability practices, including logging and tracking performance in ML systems.
What We Offer:
- A fully onsite position in SF with startup hours.