What are the responsibilities and job description for the Data Engineer position at The Wound Pros?
WHO ARE WE?
The Wound Pros is the nation’s largest wound care management company, with a presence in 19 states and counting. Our mission is to standardize how chronic wounds are evaluated and treated in long-term care facilities by leveraging the power of AI and technology. We offer a range of essential services, including Digital Wound Management, Telemedicine, Advanced EHR Systems, Mobile Vascular Assessment, Digital Supply Tracking, Advanced Wound Care Dressings, and participation as a Medicare Part B provider.
Kickstart your career by joining a growing number of professionals committed to healing wounds and saving lives. At The Wound Pros, we live and breathe diversity, and we pride ourselves on our team of passionate professionals from all over the world. Our core values (LINK) are: we Listen, Innovate, Never give up, and ‘Kultivate’ & grow people in their careers.
JOB SUMMARY
We are seeking an experienced and forward-thinking Data Engineer to design, implement, and optimize our evolving data infrastructure. In this pivotal role, you will lead initiatives across AWS, Azure, and Databricks, ensuring our data ecosystems are secure, scalable, and performance-driven. You will collaborate closely with data scientists, DevOps, and software engineering teams to deliver robust data solutions that support advanced analytics, AI/ML workflows, and critical healthcare compliance requirements.
WHAT ARE YOUR ASSIGNMENTS?
Data Architecture & Pipeline Development
- Architect & Implement: Design and implement scalable, multi-cloud data pipelines (AWS and Azure) that handle data ingestion, transformation, and integration across diverse sources.
- Data Warehouses & Data Lakes: Lead the development and maintenance of data warehouse, data lake, and lakehouse architectures using platforms such as Databricks (Delta Lake, Iceberg), Azure Data Lake Storage (ADLS), AWS S3, and Redshift.
- ETL/ELT Processes: Build and optimize end-to-end data pipelines using dbt, Airflow, Dagster, or similar orchestration tools to ensure reliability, consistency, and high performance.
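To give a flavor of the orchestration work described in this list, here is a minimal sketch of a daily extract-transform-load pipeline in Airflow. The DAG name, schedule, and task bodies are illustrative assumptions, not a description of our production pipelines:

```python
# Minimal, illustrative Airflow DAG: a daily extract -> transform -> load chain.
# The dag_id, schedule, and task logic are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(
    dag_id="daily_wound_data_pipeline",  # hypothetical pipeline name
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
)
def daily_wound_data_pipeline():
    @task
    def extract() -> list[dict]:
        # Pull raw records from a source system (stubbed here).
        return [{"record_id": 1, "status": "open"}]

    @task
    def transform(records: list[dict]) -> list[dict]:
        # Normalize fields before loading downstream.
        return [{**r, "status": r["status"].upper()} for r in records]

    @task
    def load(records: list[dict]) -> None:
        # Write to the warehouse/lake; stubbed as a log line.
        print(f"loaded {len(records)} records")

    load(transform(extract()))


daily_wound_data_pipeline()
```

The same extract-transform-load shape carries over to Dagster or dbt-driven jobs; only the scheduling and dependency syntax changes.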
Databricks & Advanced Data Solutions
- Spark Development: Leverage Databricks to manage large-scale data processing, batch/streaming jobs, and ML model deployments.
- Modern Table Formats: Implement and optimize Delta Lake or Iceberg for fast, ACID-compliant transactions and scalable data analytics (see the sketch after this list).
- Collaboration & Best Practices: Promote best practices for Spark job creation, resource utilization, and distributed data processing to ensure efficient use of Databricks clusters.
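As a concrete illustration of the Delta Lake work referenced above, the sketch below performs an ACID upsert with the PySpark Delta MERGE API. The table path, schema, and session configuration are hypothetical, and the target table is assumed to already exist; on Databricks the Delta extensions come preconfigured:

```python
# Illustrative PySpark + Delta Lake upsert (MERGE): one ACID transaction that
# updates matching rows and inserts new ones. Paths and columns are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta-merge-example")
    # Preconfigured on Databricks; needed only for open-source Spark.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config(
        "spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    )
    .getOrCreate()
)

updates = spark.createDataFrame(
    [(1, "closed"), (2, "open")], ["record_id", "status"]
)

# Assumes a Delta table already exists at this (hypothetical) path.
target = DeltaTable.forPath(spark, "/tmp/delta/wound_records")

(
    target.alias("t")
    .merge(updates.alias("u"), "t.record_id = u.record_id")
    .whenMatchedUpdateAll()      # update rows whose record_id already exists
    .whenNotMatchedInsertAll()   # insert rows that are new
    .execute()
)
```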
Cloud Infrastructure & Integration
- Multi-Cloud Expertise: Utilize services from both AWS (RDS, Lambda, Glue, S3, Redshift) and Azure (Data Factory, Data Lake Storage, Synapse) to build resilient, cost-effective data solutions.
- Infrastructure as Code (IaC): Work with DevOps to implement and maintain infrastructure via Terraform, CloudFormation, or ARM/Bicep templates where applicable (a sketch follows this list).
- Cross-Functional Collaboration: Partner with DevOps to ensure robust monitoring, logging, and alerting solutions are in place (e.g., Datadog, ELK Stack, CloudWatch, Azure Monitor) for all data pipelines.
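The IaC sketch referenced above uses the AWS CDK in Python (one of the options listed under the role's attributes) to provision a hypothetical versioned, encrypted S3 bucket for a raw data-lake zone; the equivalent could be expressed in Terraform or Bicep:

```python
# Minimal AWS CDK (Python) sketch of the IaC approach: a versioned, encrypted
# S3 bucket for a raw data-lake zone. Stack and bucket names are hypothetical.
from aws_cdk import App, RemovalPolicy, Stack
from aws_cdk import aws_s3 as s3
from constructs import Construct


class DataLakeStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        s3.Bucket(
            self,
            "RawZoneBucket",
            versioned=True,                                    # recover from bad writes
            encryption=s3.BucketEncryption.S3_MANAGED,         # encryption at rest
            block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
            removal_policy=RemovalPolicy.RETAIN,               # never auto-delete data
        )


app = App()
DataLakeStack(app, "DataLakeStack")
app.synth()
```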
Data Governance, Security & Compliance
- Data Governance & Quality: Develop and enforce data governance policies, standards, and frameworks—ensuring high data quality, lineage, and stewardship.
- Healthcare Compliance: Ensure compliance with healthcare data regulations (e.g., HIPAA), implementing robust access controls, audit trails, and encryption strategies.
- Security Monitoring: Implement advanced security measures and monitor data access patterns, collaborating with the InfoSec team to conduct scans, penetration tests, and incident response drills.
Performance Optimization & Troubleshooting
- Data Performance: Continuously evaluate and optimize data storage/queries for performance and cost efficiency at scale (SQL tuning, partitioning strategies, caching); see the sketch after this list.
- Monitoring & Alerting: Set up proactive alerting and monitoring systems (e.g., Datadog, Prometheus, Grafana, CloudWatch Metrics) to promptly identify and address pipeline bottlenecks and failures.
- Incident Management: Investigate and resolve data-related issues, working closely with DevOps, Data Science, and Software Engineering teams to minimize downtime and maintain SLAs.
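The partitioning sketch referenced above shows one common storage-layout optimization in PySpark; dataset paths and column names are hypothetical:

```python
# Illustrative partitioned write in PySpark. Partitioning on a low-cardinality
# column that queries filter on lets the engine prune whole directories
# instead of scanning everything. Paths and columns are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("partitioning-example").getOrCreate()

events = spark.read.parquet("/tmp/raw/events")  # hypothetical source path

(
    events.write.mode("overwrite")
    .partitionBy("event_date")          # directory-per-date layout
    .parquet("/tmp/curated/events")
)

# Queries that filter on the partition column now read only matching partitions.
daily = spark.read.parquet("/tmp/curated/events").where("event_date = '2024-01-01'")
daily.cache()  # reuse across repeated downstream aggregations
print(daily.count())
```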
Technical Leadership & Mentorship
- Team Mentorship: Guide junior and mid-level data engineers, providing code reviews, technical direction, and professional development opportunities.
- Stakeholder Communication: Work closely with Product, Analytics, and AI teams to capture requirements, translate business needs into technical solutions, and communicate progress effectively.
- Innovation & Thought Leadership: Evaluate emerging technologies, tools, and frameworks; recommend and implement solutions that enhance the data platform’s capabilities.
WHAT YOU HAVE ALREADY ACHIEVED
- Education: Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- Experience: 5 years in data engineering roles with a proven record of leading complex data projects in both AWS and Azure environments.
- Cloud Expertise: Advanced proficiency in AWS (RDS, Lambda, Glue, Redshift, S3) and Azure (Data Factory, ADLS).
- Databricks & Spark: Hands-on experience with Databricks, Spark, and modern table formats (Delta, Iceberg).
- ETL/ELT & Orchestration: Strong background in building and maintaining pipelines using dbt, Airflow, Dagster, or similar tools.
- SQL & Python: Expert-level skills in SQL and Python for data processing, transformation, and automation.
- Version Control: Proficiency with Git for version control and CI/CD workflows.
- Data Modeling: Deep understanding of data modeling (3NF, star schema) and database design principles.
- Healthcare Compliance: Familiarity with HIPAA or similar regulatory frameworks, ensuring data privacy and security.
- Communication & Collaboration: Excellent verbal and written communication skills, with the ability to work effectively in cross-functional teams.
ATTRIBUTES NEEDED FOR THE ROLE
- Infrastructure as Code: Experience with Terraform, CloudFormation, Azure Resource Manager (ARM) templates, or AWS CDK.
- AI/ML Pipelines: Exposure to AI/ML workflows, model training, and model hosting on Databricks or other platforms.
- Containerization & Orchestration: Familiarity with Docker, Kubernetes, or similar technologies for packaging and deploying data applications.
- Performance Tuning: Experience optimizing large-scale data warehouse/lake architectures for cost, speed, and reliability.
- Mentorship & Leadership: Previous experience in a senior or lead capacity, mentoring junior engineers and driving team-wide best practices.