Senior AI Infrastructure Engineer

Signify Technology
San Francisco, CA Full Time
POSTED ON 1/17/2025
AVAILABLE BEFORE 4/17/2025

Job Title : Senior AI Infrastructure Engineer

Location : Remote but must be located in the Bay Area

Salary Range : $200,000-$250,000 plus equity

About the Company

They are a fast-growing startup in the 3D generation space, focused on creating tools for 3D artists and game developers. With over 1 million users, their platform is at the forefront of revolutionizing the creation of 3D content using advanced AI and machine learning. Their products enable game developers to quickly generate high-quality 3D models. As they continue to expand, they are looking for an experienced Senior AI Infrastructure Engineer to help scale their AI and machine learning infrastructure.

About the Role

In this role, the engineer will be responsible for managing GPU clusters and training models on them, scaling data processing workflows, and optimizing the performance of AI models on cloud infrastructure. They will work hands-on with large-scale datasets and GPUs to build and scale the infrastructure required to support cutting-edge AI applications such as Text-to-3D and Image-to-3D generation. The ideal candidate will have experience managing their own GPU clusters (8 GPUs), scaling workloads, and working with large image datasets in a cloud environment.

Responsibilities

  • GPU Cluster Management : Lead training and inference for image-based AI models on GPU clusters. Manage and scale a cluster of 8 GPUs, ensuring efficient operation and optimal performance. This includes setup, monitoring, and troubleshooting of GPU resources.
  • Data Processing & Scaling : Work directly with large-scale data processing workflows. Ensure data is processed, cleaned, and ready for training. Scale data pipelines to support high throughput in cloud environments such as AWS or Azure.
  • Model Tuning & Training : Work with teams to fine-tune AI models on large image datasets. Train models from scratch or fine-tune pre-trained models for specific use cases, ensuring high performance and scalability. Tuning multi-GPU training setups will be a critical part of the role.
  • Cloud Infrastructure : Utilize cloud platforms like AWS or Azure to manage and scale GPU clusters. Optimize cloud resources for large-scale training jobs and ensure infrastructure supports the growing demands of their AI models.
  • Collaboration & Innovation : Collaborate closely with AI and ML teams to deploy new algorithms, experiment with distributed training, and enhance infrastructure. Play a key role in scaling their GenAI products and ensuring systems can handle millions of AI operations per month.

Required Skills

  • Experience with GPU Clusters : Proven hands-on experience managing and training models on GPU clusters of 8 GPUs, ideally managing the infrastructure independently (not via a company). Comfortable with both training and inferencing tasks on large-scale systems.
  • Large-Scale Data Experience : Experience processing large image datasets for machine learning tasks, including data preprocessing, scaling data workflows, and ensuring smooth pipelines for large training jobs.
  • Model Training & Tuning : Experience in training and fine-tuning deep learning models (primarily image-based models) using frameworks like PyTorch, TensorFlow, or similar. Proficiency in tuning models on GPUs to maximize performance.
  • Cloud Platforms & Tools : Experience working with cloud platforms like AWS or Azure to scale GPU clusters for deep learning workloads. Knowledge of cloud-based orchestration tools (e.g., Ray) is a plus.
  • Programming Skills : Proficiency in Python for developing and optimizing training pipelines. Experience with distributed computing and parallel processing tools is highly valued. Familiarity with JAX, PyTorch, or similar libraries for model training is beneficial.
