What are the responsibilities and job description for the AI Inference Software Engineer position at Signify Technology?
Job Title : AI Inference Engineer – Real-Time Systems
Location : Hybrid, San Francisco
Base Salary Range : $200,000-$400,000
Role Overview
We are looking for a talented AI Inference Engineer to help build and optimize real-time inference systems for advanced AI models, specifically those designed for multimodal data processing such as text, audio, and text-to-3D or text-to-video generation. The role involves designing and implementing highly efficient, scalable inference engines that deliver low-latency, high-performance AI services. You will work with cutting-edge technologies across AI, backend engineering, and cloud infrastructure to enable seamless experiences across platforms.
Key Responsibilities
- Inference Engine Development : Design and optimize real-time AI inference engines that support multimodal data processing, including both audio and text inputs.
- Real-Time Systems : Develop high-throughput, low-latency pipelines for handling AI model inference, ensuring that performance and scalability meet the needs of production systems.
- Technology Integration : Leverage technologies such as WebRTC , FastAPI , and cloud-native infrastructure to support real-time AI inference and communication.
- Cross-Platform Support : Ensure the inference engine integrates efficiently with various platforms (iOS, Android, desktop), enabling smooth user experiences.
- API Development : Build and maintain APIs to support scalable, real-time AI interactions, ensuring seamless communication between AI models and frontend applications.
- Performance Optimization : Focus on optimizing AI inference systems to minimize latency, maximize throughput, and enhance overall system performance.
- Collaborative Development : Work closely with product teams, engineers, and data scientists to ensure that the inference engine aligns with product goals and delivers optimal user experiences.
- Infrastructure Management : Contribute to the management of cloud infrastructure, including GPU server clusters, CI/CD pipelines, and containerized environments (Docker, Kubernetes).
- Fault Handling & Scalability : Implement effective fault tolerance strategies and design scalable systems to meet the demands of real-time AI inference at scale.
Required Skills & Qualifications
Preferred Qualifications