What are the responsibilities and job description for the Agentic Data Engineer position at Scepter Technologies, Inc.?
Job Details
Title: Agentic Data Engineer
Location: Richmond, VA
Duration: 6 Months
Interview Mode: Telephonic / In-person
Required Education:
Bachelor's or Master's degree in Computer Science, AI, Data Science, or a related field
Azure, Big Data, ETL, GIS Products, Python
Job Responsibilities:
Design and develop data pipelines for agentic systems, building robust data flows that handle complex interactions between AI agents and data sources
Train and fine-tune large language models
Design and build the data architecture, including databases and data lakes, to support various data engineering tasks
Develop and manage Extract, Load, Transform (ELT) processes to ensure data is accurately and efficiently moved from source systems to the analytical platforms used in data science
Implement data pipelines that facilitate feedback loops, allowing human input to improve system performance in human-in-the-loop systems
Work with vector databases to store and retrieve embeddings efficiently
Collaborate with data scientists and engineers to preprocess data, train models, and integrate AI into applications
Optimize data storage and retrieval for high performance
Apply statistical analysis to identify trends and patterns, creating consistent data formats from multiple sources
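The vector-database responsibility above can be sketched minimally. The `ToyVectorIndex` class below is an illustrative assumption, not part of any specific product: a real deployment would store normalized embeddings in a managed vector database rather than this in-memory toy, but the store-then-retrieve-by-cosine-similarity flow is the same.

```python
import math

def _norm(v):
    """Euclidean length of a vector (list of floats)."""
    return math.sqrt(sum(x * x for x in v))

class ToyVectorIndex:
    """Minimal in-memory embedding index (illustrative only)."""

    def __init__(self):
        self.vectors = []   # unit-normalized embeddings
        self.payloads = []  # parallel list of source records

    def add(self, embedding, payload):
        # Normalize once at insert time so search is a plain dot product.
        n = _norm(embedding)
        self.vectors.append([x / n for x in embedding])
        self.payloads.append(payload)

    def search(self, query, k=3):
        # Cosine similarity = dot product of unit vectors.
        n = _norm(query)
        q = [x / n for x in query]
        sims = [sum(a * b for a, b in zip(v, q)) for v in self.vectors]
        ranked = sorted(range(len(sims)), key=lambda i: -sims[i])[:k]
        return [(self.payloads[i], sims[i]) for i in ranked]
```

An agent pipeline would call `add` during ingestion and `search` at query time to fetch the records most relevant to an agent's current context.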
Required Skills:
Understanding of big data technologies (1 year)
Experience developing ETL and ELT pipelines (1 year)
Experience with Spark, Graph DB, Azure Databricks (1 year)
Expertise in data partitioning (1 year)
Experience with data conflation (3 years)
Experience developing Python scripts (3 years)
Experience training LLMs with structured and unstructured data sets (2 years)
Experience with GIS spatial data (3 years)
Required Qualifications:
Strong data engineering fundamentals
Utilize big data frameworks such as Spark/Databricks
Training LLMs with structured and unstructured data sets
Understanding of Graph DB
Experience with Azure Blob Storage, Azure Data Lakes, Azure Databricks
Experience implementing Azure Machine Learning, Azure Computer Vision, Azure Video Indexer, Azure OpenAI models, Azure Media Services, Azure AI Search
Determine effective data partitioning criteria
Utilize Spark-based data storage to implement partitioning schemes
Understanding core machine learning concepts and algorithms
Familiarity with cloud computing
Strong programming skills in Python and experience with AI/ML frameworks
Proficiency in vector databases and embedding models for retrieval tasks
Expertise in integrating with AI agent frameworks
Experience with cloud AI services (Azure AI)
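The partitioning qualifications above come down to choosing a stable key-to-partition mapping. The sketch below is a generic hash-partitioning example in plain Python (the function names are illustrative assumptions); in Spark the same idea is what `repartition` on a key column does at cluster scale.

```python
import hashlib

def partition_for(key, num_partitions=8):
    """Map a record key to a partition number deterministically.

    md5 gives a hash that is stable across processes (unlike Python's
    built-in hash(), which is salted per interpreter run), so the same
    key always lands in the same partition.
    """
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

def partition_records(records, key_fn, num_partitions=8):
    """Group records into partitions by their hashed key."""
    parts = {p: [] for p in range(num_partitions)}
    for rec in records:
        parts[partition_for(key_fn(rec), num_partitions)].append(rec)
    return parts
```

Choosing the partitioning key (e.g. a spatial tile ID for GIS data versus a customer ID for transactional data) is the "effective partitioning criteria" decision the qualification refers to.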