What are the responsibilities and job description for the Site Reliability Engineer position at Altimetrik?
Location: Austin, TX or Fort Mill, SC
Key Responsibilities
- Design, develop, and maintain ELK Stack solutions to ensure efficient log management, monitoring, and search capabilities.
- Implement, optimize, and troubleshoot data pipelines for telemetry, analytics, and observability using Logstash, Beats, Kafka, or other ETL tools.
- Customize Elasticsearch indexing, queries, and storage solutions to enhance system performance and scalability.
- Develop dashboards, visualizations, and alerting mechanisms in Kibana and other monitoring tools to improve system observability.
- Integrate ELK solutions with cloud environments (AWS, Azure, or GCP) and implement security best practices for data storage and access.
- Monitor and optimize system performance, resource utilization, and search efficiency to maintain high availability and reliability.
- Collaborate with DevOps, Security, and Software Engineering teams to enhance log processing, alerting, and data enrichment strategies.
- Automate deployments and configurations using tools like Ansible, Terraform, Kubernetes, and CI/CD pipelines.
- Stay current with ELK Stack developments and industry trends, and apply new features and best practices as they emerge.
Required Skills and Qualifications
- 6-10 years of hands-on experience with Elasticsearch, Logstash, and Kibana (ELK Stack) in enterprise environments.
- Strong knowledge of log aggregation, indexing, and data parsing techniques.
- Proficiency in scripting and automation using Python, Bash, or Groovy.
- Experience with observability platforms and telemetry solutions, including Dynatrace.
- Knowledge of distributed systems, clustering, and high-availability architectures.
- Experience in tuning and scaling Elasticsearch clusters for performance optimization.
- Hands-on experience with Kafka, Fluentd, Prometheus, Grafana, and OpenSearch (preferred).
- Strong background in Linux systems, networking, and security.
- Familiarity with CI/CD pipelines, Git, Kubernetes, and containerization (Docker).
- Experience with cloud services like AWS OpenSearch, Azure Elastic Stack, or Google Cloud Logging.
- Strong problem-solving skills and the ability to thrive in fast-paced environments.