What are the responsibilities and job description for the Site Reliability Engineer position at Embrace Pet Insurance?
Brief Description
We're looking for a Site Reliability Engineer who is passionate about maintaining high system availability and delivering scalable solutions. If you thrive in a fast-paced environment and are dedicated to enhancing system reliability and performance, we want you on our team.
Job Description
As a Site Reliability Engineer, you will be instrumental in managing our operational systems and ensuring the reliability and stability of our online and internal platforms. You will work closely with development and infrastructure teams to integrate software engineering practices into system operations, aiming for high availability, optimal performance, and scalability.
Responsibilities
- Monitor and analyze the performance of production systems using tools such as Datadog, Sentry, and Grafana.
- Proactively address system issues and anomalies before they become critical.
- Develop and maintain automated tools for system health monitoring, disaster recovery, and performance benchmarks.
- Work with cross-functional teams to design and implement enhancements and fixes to improve system reliability and performance.
- Document system design and procedures related to system maintenance and operations.
- Conduct post-incident reviews and lead efforts to implement effective solutions to prevent recurrence.
- Ensure all system operations comply with security standards and regulatory requirements.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 3-5 years of experience as a Site Reliability Engineer or in a similar role, preferably in an e-commerce environment.
- Proficient with monitoring tools such as Datadog, Sentry, and Grafana.
- Strong experience with cloud services (AWS, Azure, Google Cloud) and understanding of cloud architecture.
- Experience with Docker, Kubernetes, or other container orchestration technologies.
- Proficiency in scripting languages (e.g., Python, Bash).
- Strong understanding of network systems, databases, and web services.
- Excellent problem-solving and communication skills.
- Ability to handle multiple tasks in a fast-paced environment.
Benefits:
- 401(k)
- Dental insurance
- Disability insurance
- Health insurance
- Paid time off
- Parental leave
- Tuition reimbursement
- Vision insurance
- Work from home
Embrace Pet Insurance is an equal-opportunity employer committed to fostering an inclusive and diverse work environment. In accordance with applicable federal, state, and local laws, we do not discriminate on the basis of race, color, religion, gender, gender identity or expression, sexual orientation, age, national origin, ancestry, marital status, pregnancy, genetic information, physical or mental disability, veteran or military status, or any other protected characteristic under applicable law. Our hiring decisions are solely based on merit, qualifications, and the needs of the business. We are dedicated to ensuring a fair, reciprocal, and positive work experience for all employees, and we encourage applications from individuals with diverse backgrounds, perspectives, and abilities. If you have any questions regarding our equal employment opportunity policy, please contact the Human Resources department or an appropriate representative within the company. Additionally, if you require reasonable accommodations during the application process or while working as an employee, please submit a written request to the Human Resources Department. We take our commitment to equal employment opportunity seriously and strive to create a respectful and inclusive work environment for all team members.