What are the responsibilities and job description for the Machine Learning Engineer (New York) position at Skyfall AI?
About the company
Skyfall is disrupting the entire AI ecosystem by building the first world model for the enterprise. The goal of the ‘Enterprise World Model’ is to overcome the severe limitations of LLMs (safety, hallucinations, expensive training) and deliver significant value to enterprises through a comprehensive understanding of the complex interplay between data, people and processes within organizations.
The Skyfall founding team consists of Maluuba founders who were previously pioneers in the deep learning revolution. Maluuba worked with AI pioneers such as Yoshua Bengio and Richard Sutton before it was acquired by Microsoft for $160M and became Microsoft’s AI research center in Canada.
Job Overview
Skyfall is hiring multiple ML Engineers across New York, Toronto and Bangalore to deploy and optimize large language models (LLMs) in production. You’ll be responsible for deploying fine-tuned and RLHF-trained LLMs, optimizing inference for cost and latency, and building scalable training pipelines using DeepSpeed, Accelerate, and Ray. The role involves designing distributed training infrastructure, managing multi-cloud ML deployments, and implementing cutting-edge model compression techniques.
Key Responsibilities
- Deploy post-trained LLMs (fine-tuned or RLHF-trained) into production environments.
- Optimize LLM inference for cost and latency, using techniques and tools such as model quantization, FlashAttention, and vLLM.
- Develop scalable training and inference pipelines using DeepSpeed, Accelerate, and Ray.
- Build internal tools for the data science and research teams to enable multi-GPU training and large-scale experimentation.
- Design and maintain distributed training infrastructure, ensuring efficient resource allocation.
- Develop cluster management tools for external compute infrastructure, potentially spanning multiple cloud vendors.
- Implement continuous model evaluation pipelines to track model drift, inference performance, and cost efficiency.
- Research and implement state-of-the-art model compression and inference acceleration techniques.
Minimum Requirements