What are the responsibilities and job description for the Data Engineer position at Infyshine Inc?
Job Details
Job Title: Data Engineer
Location: Plano, TX
Job Type: Full-Time
Experience: 12 Years
Job Summary:
We are seeking an experienced Data Engineer to join our team. The ideal candidate will have a strong background in data technologies, hands-on expertise in SQL development, and experience with MPP systems. You will play a critical role in designing, developing, and optimizing our data architecture, ensuring high data quality, and enabling efficient data processing and analytics.
Key Responsibilities:
Design, develop, and optimize data models (relational & dimensional) and schemas.
Implement and maintain data architecture, governance, and quality improvements to ensure reliability and consistency.
Develop and optimize SQL queries and curated datasets, ensuring high performance and scalability.
Work with cloud-based data warehousing and ETL solutions (e.g., AWS, Azure, Google Cloud Platform, Snowflake, Redshift, BigQuery) and Python to build and maintain data pipelines.
Leverage Google Cloud Platform (GCP) services, Big Data, and streaming integrations for data processing.
Utilize PySpark, Pandas, and other data processing libraries for large-scale data transformations.
Implement best practices for data movement, handling large volumes of data, and reporting.
Develop and manage ETL workflows using tools such as Apache Airflow, Google Cloud Dataflow, etc.
Apply data warehouse methodologies like Kimball, Inmon, or Data Vault for effective data organization.
Ensure smooth integration and deployment using CI/CD tools like Git, Terraform, etc.
Troubleshoot and optimize database performance, query plans, and indexing strategies.
Qualifications & Experience:
Bachelor's degree in Computer Science or a related field, or an equivalent combination of education and experience.
12 years of experience in Data Engineering with strong hands-on SQL development.
Extensive experience with MPP (Massively Parallel Processing) systems.
Strong understanding of data modeling, data architecture, metadata, and governance best practices.
Proficiency in Google Cloud Platform (GCP), BigQuery, and cloud-based data solutions.
Expertise in database programming, performance tuning, and query optimization.
Experience with big data processing frameworks such as PySpark and Pandas.
Familiarity with data warehouse best practices and methodologies (Kimball, Inmon, Data Vault).
Hands-on experience with ETL tools and workflow automation (Airflow, Dataflow, etc.).
Knowledge of CI/CD practices and tools like Git, Terraform, Jenkins.
Preferred Skills:
Experience in streaming data processing and real-time analytics.
Strong problem-solving skills and the ability to work in fast-paced, collaborative environments.
Excellent communication and stakeholder management skills.
If you are passionate about building scalable, high-performance data solutions and want to be part of an innovative team, we'd love to hear from you!
Please reach out to