What are the responsibilities and job description for the Site Reliability Engineer position at Careerbuilder-US?

Job description :

We are a cutting-edge biomedical startup preparing for our first product release. This is a unique opportunity to be on the ground floor of a rapidly growing biomedical company. We are a tight-knit, agile group with many capable engineering, medical, and business personnel on the team and board alike. We are looking to further expand our team by adding a strong software development arm to the company.

Current Project :

Client's Humero Tech C1 changes the way shoulder injuries are rehabilitated with our innovative strength-building and sensor-based technology. Our rotator cuff machine tracks patients' efforts as they work through strength-based exercises. At the end of each session, the user gets a set of in-depth metrics to help inform the next steps for recovery.

Client is at the very beginning of device rollout into the field, and thus Titin is searching for a talented Site Reliability Engineer to ensure customers have a smooth experience while working with our software and their data.

Additionally, a strong and positive personality is critical because this person will inevitably be communicating directly with our customers.

System Monitoring and Incident Management

Set up and maintain monitoring tools to track system performance, availability, and reliability.

Respond to incidents, troubleshoot issues, and ensure fast recovery to minimize downtime.

Implement alerting mechanisms to proactively identify potential issues before they impact end users.

Automation and Efficiency

Automate manual operations and repetitive tasks to improve system reliability and speed.

Write scripts and create tools to streamline deployment, monitoring, and scaling processes.

Work with Continuous Integration / Continuous Deployment (CI / CD) management tools.

Infrastructure Management

Manage cloud infrastructure to ensure system reliability and scalability.

Monitor and maintain these systems to comply with HIPAA and SOC 2 requirements.

Performance Optimization

Analyze system performance and work on tuning to meet predefined service level objectives (SLOs).

Optimize resource usage, including compute, memory, and storage, to ensure cost-efficiency without sacrificing performance.

Disaster Recovery and High Availability

Develop, test, and implement disaster recovery plans.

Ensure high availability by using redundancy, failover mechanisms, and geographical distribution of systems.

Security and Compliance

Implement security best practices to safeguard data and systems.

Ensure compliance with industry regulations and internal security policies.

Cooperate with and respond to necessary compliance audits.

Collaboration and Communication

Work closely with development teams to integrate reliability into the software development lifecycle.

Participate in post-incident reviews to identify root causes and prevent future occurrences.

Provide technical support to teams and help to build a culture of reliability across the organization.

Documentation

Document incident response processes, infrastructure architecture, and SRE best practices.

Maintain clear, accessible records for troubleshooting, deployments, and maintenance tasks.

Generate work instructions to document tasks and enable smooth team expansion.

Continuous Improvement

Identify opportunities for process improvements and performance enhancements.

Keep up to date with the latest technology trends and industry practices, and adopt relevant innovations.

Application Question(s) :

Past Projects Portfolio

Education :

High school or equivalent (Required)

Undergraduate or equivalent experience (Preferred)

AWS Certifications (Preferred)

Required Experience :

Experience with AWS

Experience with Python

Experience with SQL / Databases

Knowledge of managing cloud-based infrastructure, networking, and storage.

Ability to write automation scripts for deployment, monitoring, and scaling.

Preferred Experience :

Experience with Linux / Unix systems

Experience with version control systems like Git.

Understanding of AWS IAM

Expertise in system administration tasks, such as patching, user management, and system performance tuning.

Familiarity with securing infrastructure, including access control, encryption, and vulnerability management.
