What are the responsibilities and job description for the SRE and Observability Engineer position at VDart Inc?
Job Title :
SRE and Observability Engineer - Mid Level
Location : Dallas, TX
Job Description :
Skills :
- 5-8 years of experience in software development, including exposure to cloud-native environments.
- Proficiency in software development using Python or Java.
- Hands-on experience with modern SDLC tools and implementing CI/CD practices.
- Knowledge of monitoring tools like Prometheus and Grafana.
- Practical experience with AWS/GCP cloud services.
- Good understanding of Kubernetes and containerization.
- Experience building data pipelines using cloud-native tools.
Our Observability & SRE Engineers are responsible for engineering the services and tools needed to gain deeper insights into our applications that are deployed into the public cloud. In this role as an Observability or Site Reliability Engineer in our Public Cloud group, you will work on complex and difficult technical problems, solving for scale, performance, and availability. The ideal candidate has experience gained in a software development environment and a deep appreciation of best practices for the design and deployment of fault-tolerant solutions for cloud platforms. You will have worked on applications or infrastructure that handle large datasets in either streaming or batch mode, and will have dealt with some of the complexities of working with such applications and infrastructure deployed at scale.
- Familiarity with TDD and automated testing frameworks.

- Collaborate in designing and developing scalable software using Python or Java.
- Implement CI/CD pipelines and integrate them with team workflows.
- Set up and maintain monitoring solutions using Prometheus and Grafana.
- Build and maintain distributed, cloud-based systems on AWS/GCP.
- Develop containerized applications using Kubernetes.
- Support the development of data pipelines leveraging cloud-native tools.
- Contribute to TDD practices and automated testing in project development.

- Certifications in AWS, GCP, or Kubernetes.
- Familiarity with distributed system design.

- Effective communication and collaboration skills.
- Strong problem-solving and analytical abilities.
- Willingness to learn and take ownership of tasks.

- Functional and efficient software solutions.
- CI/CD implementation and improvements.
- Reliable monitoring and observability setups.
- Contribution to data pipeline development.
- Documentation and participation in team growth.