What are the responsibilities and job description for the AI Gateway Engineer position at Ascendum Solutions?
Description:
This role involves implementing an AI API service using Azure API Management, Google API Gateway, and LiteLLM-based services. The team will build scalable, low-latency, fault-tolerant API infrastructure for high-throughput AI model inference. Responsibilities include automating team onboarding to the AI gateways and implementing load balancing, caching, and auto-scaling to minimize latency and optimize throughput. The team will integrate Python SDKs for seamless model deployment and API consumption, build observability for the API services, and maintain AI API deployment standards while contributing to the Center of Excellence. The team will also stay current with advances in Generative AI and apply best practices to improve model performance and deployment strategies. Collaboration with data scientists, software engineers, and product managers will be essential to developing innovative AI-driven solutions. The project also requires supporting the implemented services, including deployments, change control, and troubleshooting of access and networking issues. Support requirements will increase progressively as services are delivered and teams are onboarded.
Core Experience:
- Building and supporting AI API gateways for applications and services.
- Proficiency in Kubernetes (K8s), Docker, and Azure Container Services.
- Expertise in prompt engineering and content management.
- Managing API Gateways (Azure API Management, Google API Gateway).
- Integrating Python SDKs for LiteLLM model deployment.
- Implementing load balancing and caching (Redis, Google Cloud CDN, Azure Front Door).
- Managing cloud infrastructure (Azure, GCP, Terraform).
- Deploying cloud services across multiple regions for high availability.
- Operating and supporting cloud services.