What are the responsibilities and job description for the Datahub Consultant position at TechIntelli Solutions?
Role: Datahub Consultant
Location: Austin, TX (Onsite)
Job Description:
- Directed projects involving data cataloging with the open-source DataHub framework, anomaly detection with machine learning models, and a Spark-based framework.
- Ingested asset metadata from multiple systems, including the data lake and upstream and downstream systems.
- Developed custom API solutions that push data about ETL pipelines to DataHub. This enriched impact analysis by identifying the data pipelines that read from or write to a data asset.
- Provided a holistic picture of end-to-end lineage that helped with PII identification, governance, and impact analysis.
- Improved the performance of Spark-based applications, ensuring seamless functionality.
- Provided recommendations on the design and development of ETL pipelines using Spark. Developed and maintained a custom Spark-based client framework that provides a config-as-code mechanism for data enrichment and transfer.
- Successfully supported Spark version upgrades and executed AWS cost optimization initiatives for platform-wide efficiency.
- Worked with ML engineers to create features from profiled batch data and identify anomalies in data patterns.