What are the responsibilities and job description for the Site Reliability Engineer - API Integration position at Albano Systems, Inc.?
W2 ONLY - NO CORP TO CORP - NO VISA TRANSFER - NO 3RD PARTIES
API Enablement Site Reliability Engineer Senior Staff Engineer
Job Description:
We are seeking a highly skilled and experienced API Enablement SRE Senior Staff Engineer to join our team. The ideal candidate will have a strong background in managing and optimizing complex systems, ensuring their reliability, scalability, and performance. This role focuses on enhancing our API Management Platforms and integrating SRE best practices.
Key Responsibilities:
• API Platform and Enablement Team:
o Design, implement, and maintain reliable and scalable SRE practices for API Management Platforms.
o Apply strong knowledge of and experience with API solutions, platforms, API delivery, and API management.
o Strengthen the maturity of SRE practices by building on and executing improvements to observability, resiliency, and stability.
o Assess ecosystem changes to determine risk, impact, and checkout needs for API and Integration Platforms.
o Proactively consider SRE improvements and GenAI opportunities, create solutions, and successfully execute them.
o Create self-service capabilities to enable API provider teams to easily integrate with SRE API best practices.
• Incident Management and On-Call Rotation:
o Lead incident management, structured triage, and analysis, including the creation and management of incident runbooks.
o Participate in on-call rotations for incidents and changes, including evenings and weekends.
o Conduct problem analysis, remediation, and continuous improvement to enhance system reliability.
• Observability, Dashboards, and Unified Views:
o Implement and maintain observability and monitoring solutions, including Splunk and Dynatrace.
o Create unified views, dashboards, and visualizations to provide a single pane of glass and information radiators for system health and performance.
o Create unified views that can be shared across stakeholders so they can quickly align on an issue's root cause.
• Resiliency and Strengthening SRE Maturity:
o Design, implement, and maintain reliable and scalable systems and infrastructure.
o Lead the team in SRE and proactive risk mitigation, including resiliency and disaster recovery exercises, change management, and upgrades and patches.
o Level up SRE maturity and demonstrate it through the achievement of KPIs and operational metrics.
• Performance and Automation:
o Monitor and optimize the performance, availability, and reliability of systems and applications.
o Develop and maintain automation tools and scripts to streamline operations and improve efficiency.
• Risk Management and Metrics:
o Define, operationalize, and integrate SRE-related KPIs, metrics, and ideas into day-to-day activities.
o Proactively manage risks, including assessing findings, planning remediation, and executing it to bring risks to prompt closure.
Qualifications:
• Strong knowledge and experience in API solutions, platforms, API delivery, and API management.
• Knowledge and skills in API Platforms (e.g., API Connect, Apigee, AWS API Gateway) and API Management.
• 5 years of experience in site reliability engineering or a related field.
• Expertise in SRE best practices, including incident management, resiliency, monitoring, detection, diagnosis, remediation, and prevention.
• Demonstrated on-call experience resolving incidents, including incident management and root cause analysis.
• Experience with large-scale distributed systems.
• Knowledge of CI/CD pipelines and DevOps practices.
• Experience with cloud platforms (e.g., AWS, Azure, GCP).
• Proficiency in scripting languages (e.g., Python, Bash) and automation tools (e.g., Ansible, Terraform).
• Familiarity with monitoring and observability tools (e.g., Splunk, Dynatrace).
• Demonstrated ability to mature SRE practices and strengthen stability through proven KPIs and metrics.
• Excellent documentation, communication, problem-solving, and collaboration skills.
Preferred Qualifications:
• Bachelor's degree in Computer Science, Engineering, or a related field.
Salary: $90 - $100