What are the responsibilities and job description for the Principal Data Engineer (data modeling) position at HireStarter, Inc.?
HireStarter's client is looking for a Principal Data Engineer to take ownership of their data architecture and lay the foundation for a robust, scalable analytics platform. You'll be the primary architect of the data warehouse, designing the schemas and pipelines that support real-time business insights and long-term growth.
While the company generates ~1 billion rows of data per month, the real opportunity is in structuring and modeling that data effectively—so teams across the company can make faster, smarter decisions.
What You'll Do:
Own the Data Warehouse Architecture: Design and implement a modern, cloud-based data warehouse with an emphasis on analytical performance and usability.
Design for Analytics: Build fact and dimension tables using dimensional modeling techniques (e.g., star and snowflake schemas) to support reporting, dashboards, and self-service analytics (see the star-schema sketch after this list).
Lead Data Modeling: Translate business requirements into scalable data models that are intuitive, performant, and maintainable.
Architect Transformations: Implement and maintain transformation logic in dbt, structuring raw data into usable, governed analytics layers (see the dbt model sketch after this list).
Pipeline Engineering: Build reliable batch and (eventually) streaming pipelines, with a focus on traceability, performance, and long-term scalability (see the orchestration sketch after this list).
Collaborate Cross-Functionally: Partner closely with analytics, product, and operations to deeply understand data needs and ensure models are easy to work with.
Establish Best Practices: Define standards for data governance, documentation, version control, and testing within the data stack.
Think Beyond ETL: Prioritize data consumption and usability over raw ingestion—your end-users are analysts, not just machines.
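For illustration, here is a minimal sketch of the kind of star schema this role would own, issued from Python against Snowflake. Every table, column, and connection value (dim_customer, fct_orders, the placeholder credentials) is hypothetical, not part of the client's actual stack:

```python
# Minimal star-schema sketch: one fact table keyed to two dimensions.
# All names are hypothetical; connection parameters are placeholders.
import snowflake.connector

DDL_STATEMENTS = [
    # Dimension: one row per customer, carrying descriptive attributes.
    """CREATE TABLE IF NOT EXISTS dim_customer (
        customer_key  INTEGER AUTOINCREMENT PRIMARY KEY,
        customer_id   VARCHAR NOT NULL,   -- natural key from the source system
        customer_name VARCHAR,
        region        VARCHAR
    )""",
    # Dimension: one row per calendar date.
    """CREATE TABLE IF NOT EXISTS dim_date (
        date_key  INTEGER PRIMARY KEY,    -- e.g. 20240131
        full_date DATE NOT NULL,
        year      SMALLINT,
        month     SMALLINT
    )""",
    # Fact: one narrow row per order, foreign-keyed to the dimensions.
    """CREATE TABLE IF NOT EXISTS fct_orders (
        order_key    INTEGER AUTOINCREMENT PRIMARY KEY,
        customer_key INTEGER REFERENCES dim_customer (customer_key),
        date_key     INTEGER REFERENCES dim_date (date_key),
        order_amount NUMERIC(12, 2),
        quantity     INTEGER
    )""",
]

def create_star_schema() -> None:
    """Apply the DDL above; account and credentials are placeholders."""
    conn = snowflake.connector.connect(
        account="YOUR_ACCOUNT",
        user="YOUR_USER",
        password="YOUR_PASSWORD",
        database="ANALYTICS",
        schema="MARTS",
    )
    try:
        cur = conn.cursor()
        for statement in DDL_STATEMENTS:
            cur.execute(statement)
    finally:
        conn.close()
```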
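Likewise, a minimal sketch of a transformation in the governed marts layer, written as a dbt Python model (supported on Snowflake, BigQuery, and Databricks targets). The upstream model stg_orders and the column names are assumptions:

```python
# Hypothetical dbt Python model, e.g. models/marts/fct_daily_revenue.py.
# Assumes a Snowflake target, where dbt.ref() returns a Snowpark DataFrame.
import snowflake.snowpark.functions as F


def model(dbt, session):
    # Materialize this model as a table in the marts layer.
    dbt.config(materialized="table")

    # Pull the upstream staging model (hypothetical name).
    orders = dbt.ref("stg_orders")

    # Aggregate raw order lines into one row per customer per day.
    return (
        orders.group_by("customer_key", "order_date")
        .agg(F.sum("order_amount").alias("daily_revenue"))
    )
```

An equivalent SQL model would serve just as well; the point is that transformation logic lives in version-controlled, testable dbt models rather than ad-hoc queries.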
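And a batch-pipeline skeleton using Airflow's TaskFlow API (any of the orchestrators named in the requirements below would do); the schedule, task names, and paths are illustrative:

```python
# Minimal daily batch-pipeline sketch (Airflow 2.x TaskFlow API).
from datetime import datetime

from airflow.decorators import dag, task
from airflow.operators.bash import BashOperator


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_warehouse_load():
    @task
    def extract_to_staging() -> str:
        # Land raw source data in object storage (stubbed here).
        return "s3://example-bucket/raw/orders/"  # hypothetical path

    # Rebuild and test the downstream dbt layers once staging has landed.
    run_dbt = BashOperator(
        task_id="run_dbt_models",
        bash_command="dbt build",  # runs and tests all models
    )

    extract_to_staging() >> run_dbt


daily_warehouse_load()
```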
What We're Looking For:
8 years of experience in data engineering, with a strong focus on data modeling and data warehouse architecture.
Proven experience designing fact and dimension tables for analytics in Snowflake, BigQuery, Redshift, or similar cloud data warehouses.
Advanced SQL skills and a strong understanding of query performance optimization at scale.
Hands-on experience with dbt and modern orchestration tools (Airflow, Prefect, Dagster, etc.).
Familiarity with real-time data processing frameworks such as Kafka, Flink, or Spark Streaming is a plus (see the consumer sketch after this list).
Experience in greenfield or early-stage data environments where you've built from scratch.
Strong communication skills and a user-first mindset—you care about how data is consumed, not just how it moves.
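On the streaming side, a minimal Kafka consumer sketch using kafka-python shows the shape such a pipeline might start from; the topic, broker, consumer group, and event fields are all assumptions:

```python
# Minimal streaming-ingestion sketch with kafka-python.
# Topic, broker, group id, and event fields are hypothetical.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders",                               # hypothetical topic
    bootstrap_servers="localhost:9092",     # placeholder broker
    group_id="warehouse-loader",            # hypothetical consumer group
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
    enable_auto_commit=False,               # commit only after a successful write
)

for message in consumer:
    event = message.value
    # A real pipeline would upsert into a staging table, keeping the
    # partition/offset metadata for the traceability called out above.
    print(event.get("order_id"), event.get("order_amount"))
    consumer.commit()
```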