What are the responsibilities and job description for the DataHub Lead (contract) position at Capgemini?
The DataHub Lead will be responsible for leading the design, development, and implementation of the DataHub platform, ensuring data governance, quality, and accessibility across the organization. This role requires expertise in data management, data warehousing, and data integration, along with experience with data cataloging tools like DataHub or Amundsen. The ideal candidate will work closely with data engineers, data scientists, and business stakeholders to enable self-service data access and promote data literacy while ensuring compliance with regulatory requirements.
Key Responsibilities
DataHub Development & Implementation:
- Lead the design, development, and deployment of the DataHub platform, ensuring data quality, accessibility, and usability.
- Define data models, schemas, lineage tracking, and integration strategies to improve data operations.
- Define and enforce data quality standards, ensuring data integrity and accuracy.
- Manage data access controls and user roles to ensure secure and compliant data usage.
- Ensure compliance with GDPR, CCPA, and other data privacy regulations.
- Promote data literacy across teams by educating and empowering business users.
- Enable self-service data discovery and access for data scientists, analysts, and business users.
- Develop and maintain documentation and training materials for end-users.
- Work closely with data engineers, scientists, and business stakeholders to gather requirements and implement solutions.
- Translate business needs into technical solutions, ensuring alignment with organizational goals.
- Monitor and maintain the DataHub platform, ensuring optimal performance.
- Identify and resolve performance bottlenecks and continuously improve the platform’s functionality and usability.
- Stay updated on the latest data management technologies and best practices to enhance data governance strategies.
- Mentor and guide junior data professionals, providing technical guidance and knowledge sharing.
- Foster a culture of continuous learning within the data management team.
Technical Skills:
- 5 years of experience as a Data Engineer or Data Architect, or in a similar role.
- Strong understanding of data warehousing, data lake architectures, and data integration principles.
- Hands-on experience with data cataloging tools like DataHub, Amundsen, or similar (preferred).
- Proficiency in SQL, Python, and data modeling techniques.
- Experience with cloud platforms (GCP preferred, but AWS/Azure is acceptable).
- Hands-on experience with data lineage tracking, data quality assessment, and metadata management.
- Strong problem-solving and analytical skills.
- Excellent communication and collaboration abilities to work with technical and non-technical stakeholders.
- Ability to work independently and as part of a cross-functional team.
- Experience with GCP (Google Cloud Platform).
- Knowledge of ETL frameworks and data pipeline orchestration tools.
- Exposure to Machine Learning Ops (MLOps) or AI-driven data governance frameworks.
- Core Expertise: DataHub, Data Governance, Data Architecture, Data Cataloging, Data Lineage.
- Technical Skills: SQL, Python, Cloud Platforms (GCP preferred), Data Modeling, Data Integration.
- Soft Skills: Collaboration, Communication, Stakeholder Engagement, Mentorship.
- Preferred Technologies: DataHub, Amundsen, ETL frameworks, Machine Learning Ops (MLOps).