What are the responsibilities and job description for the Senior Data Reliability Engineer position at ACCUSAGA?
Title: Sr. Data Reliability Engineer
Location: San Antonio, TX (Hybrid, 3 days in the office).
Experience: 10 years
Visa: USC, GC or GC EAD Only
Contract to Hire
We are currently seeking a highly skilled Data Reliability Engineer to join our team. This role requires a blend of expertise in data pipeline technologies and Site Reliability Engineering (SRE) principles to ensure the highest standards of data reliability and system performance.
Responsibilities:
- Design, build, and maintain the infrastructure and data pipelines to support data transformation, data structures, metadata, dependency, and workload management.
- Develop and maintain scalable, reliable, and cost-effective data solutions using AWS technologies and big data tools like Databricks, Airflow, and Dremio.
- Implement robust monitoring and alerting systems using tools such as Datadog to ensure proactive management of the production environments.
- Work closely with data scientists and analytics teams to engineer and optimize data models using dbt (data build tool) and ensure seamless data flow across all segments.
- Enhance data validation and data quality metrics integration within data pipelines to ensure accuracy and reliability of data.
- Automate manual processes, optimize data delivery, and re-design infrastructure for greater scalability.
- Handle the deployment of additional AWS services such as Lambda functions and manage data storage solutions.
- Collaborate with IT and DevOps teams to enhance system performance and reliability through AWS solutions such as SNS, SQS, and Elastic Load Balancing.
- Engage in continuous improvement efforts to enhance performance and provide increased functionality across data platforms.
- Provide support and mentorship to offshore teams, ensuring best practices in coding, testing, and deployment are followed.
- Troubleshoot complex issues across multiple databases and work with various stakeholders to ensure robust architecture and operational standards are maintained.
Mandatory Skills:
- Strong proficiency in Apache Airflow, Python programming, and AWS cloud services.
- Experience with real-time monitoring tools such as Datadog.
- Expertise in data transformation tools like dbt and Dremio.
- Strong experience with Databricks and AWS Lambda.
- Excellent verbal and written communication skills, capable of working with cross-functional teams and managing relationships with business stakeholders.
Nice to Have Skills:
- Familiarity with AWS services such as SNS, SQS, Elastic Load Balancing, CodeBuild, CodePipeline, ECR, and EKS.
- Experience with both Linux and Windows operating systems.
- Knowledge of Microsoft Azure Data Lake and Synapse, plus additional Python scripting experience.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Engineering, Information Technology, or a related field.
- Proven experience as a Data Engineer, Data Reliability Engineer, or in a similar role in an SRE environment.
- Demonstrable experience with end-to-end monitoring of data pipelines and implementing reliability in data systems.
Thanks & Regards,
Siri
US IT Recruiter
siri@accusaga.com