AI Inference Software Engineer

Signify Technology
Alameda, CA Full Time
POSTED ON 3/10/2025
AVAILABLE BEFORE 6/9/2025

Job Title : AI Inference Engineer – Real-Time Systems

Location : Hybrid, San Francisco

Base Salary Range : $200,000-$400,000

Role Overview

We are looking for a talented AI Inference Engineer to help build and optimize real-time inference systems for advanced AI models designed for multimodal data processing, such as text and audio, and text-to-3D or text-to-video. You will design and implement efficient, scalable inference engines that deliver low-latency, high-performance AI services. You will work with cutting-edge technologies in AI, backend engineering, and cloud infrastructure to enable seamless experiences across platforms.

Key Responsibilities

  • Inference Engine Development : Design and optimize real-time AI inference engines that support multimodal data processing, including both audio and text inputs.
  • Real-Time Systems : Develop high-throughput, low-latency pipelines for handling AI model inference, ensuring that performance and scalability meet the needs of production systems.
  • Technology Integration : Leverage technologies such as WebRTC, FastAPI, and cloud-native infrastructure to support real-time AI inference and communication.
  • Cross-Platform Support : Ensure the inference engine integrates efficiently with various platforms (iOS, Android, desktop), enabling smooth user experiences.
  • API Development : Build and maintain APIs to support scalable, real-time AI interactions, ensuring seamless communication between AI models and frontend applications.
  • Performance Optimization : Focus on optimizing AI inference systems to minimize latency, maximize throughput, and enhance overall system performance.
  • Collaborative Development : Work closely with product teams, engineers, and data scientists to ensure that the inference engine aligns with product goals and delivers optimal user experiences.
  • Infrastructure Management : Contribute to the management of cloud infrastructure, including GPU server clusters, CI/CD pipelines, and containerized environments (Docker, Kubernetes).
  • Fault Handling & Scalability : Implement effective fault tolerance strategies and design scalable systems to meet the demands of real-time AI inference at scale.

Required Skills & Qualifications

  • AI Inference Expertise : Proven experience building and optimizing inference engines for multimodal AI systems, particularly in real-time applications involving both audio and text.
  • Real-Time System Design : Strong knowledge of designing low-latency, high-performance systems capable of handling complex inference tasks in production environments.
  • Cloud Infrastructure : Experience working with AWS, GCP, or other cloud platforms, including managing server clusters and leveraging cloud-native technologies for scaling AI inference systems.
  • Backend Development : Proficiency in languages such as Python, Go, or similar for developing backend services. Experience with frameworks like FastAPI for building efficient APIs is a plus.
  • Performance Optimization : Expertise in optimizing inference pipelines, improving computation efficiency, and reducing system latency.
  • Cross-Platform Development : Familiarity with supporting mobile (iOS, Android) and desktop platforms for AI-powered applications.
  • Collaboration Skills : Excellent communication skills and the ability to collaborate effectively with cross-functional teams (product, AI, and engineering).
  • Continuous Integration / Deployment : Experience working with CI/CD tools (e.g., Jenkins, GitHub Actions) and managing software deployments in a production environment.

Preferred Qualifications

  • 4-5 years of experience in AI inference engine development or related fields.
  • Hands-on experience with WebRTC, LiveKit, or similar technologies for real-time communication.
  • Familiarity with containerization technologies like Docker and orchestration tools like Kubernetes for managing AI inference services.
  • Experience with GPU server management and optimizing workloads for AI models.
  • Ability to write clear technical documentation and maintain high coding standards.

Salary : $200,000-$400,000
