What are the responsibilities and job description for the Data Engineer position at Real Advancement?
Job Details
Job Title: Data Engineer
Job type: W2, 6 months, may extend
Location: Hybrid in Philadelphia or Washington DC
Role Description
The Applied AI group in our client's organization is seeking a passionate and skilled Senior Software Engineer with expertise in designing and implementing scalable data pipelines powering a voice platform used by millions of people every day across the world. We create the technology that powers the voice remote and allows our customers to interact with their TV to do things like search for content, control connected devices, or find entertainment-related information.
As a member of the Content Retrieval team, you will focus on building scalable, real-time data pipelines and feedback loops that help us train, tune, and evaluate the models serving production traffic, as well as the architecture that supports them. We own the full process of taking an idea from prototype to production. You will collaborate with the team to identify the best solution to a problem, implement it, deploy it, and continuously monitor its performance. The ideal candidate has experience building and maintaining data pipelines using industry-leading tools like Airflow and Databricks.
Core Responsibilities
- Assist in the development and implementation of information retrieval systems, including search algorithms, ranking models, and indexing strategies.
- Work as part of a team of software engineers in conjunction with product stakeholders to understand business objectives, define technical requirements and build features for a content retrieval platform.
- Collaborate with cross-functional teams to verify the quality of ingested data, ensuring it meets the standards necessary for effective information retrieval processes.
- Design and implement scalable cloud infrastructure using Docker and Kubernetes to support the deployment and operation of the information retrieval platform, ensuring high availability and performance for search services.
- Collaborate with teammates and contribute to design discussions, project planning and code reviews.
- Resolve technical issues through debugging, research, and investigation.
- Rely on experience and judgment to plan and accomplish goals.
- Demonstrate a keen sense of responsibility and accountability towards the team's work, its quality, and timely delivery.
- Stay up-to-date with the latest trends and technologies in cloud infrastructure and apply that knowledge to improve internal systems.
Qualifications
- Bachelor's or Master's degree in Computer Science
- 5 or more years of relevant work experience
- Hands-on experience with containerization using Docker and orchestration using Kubernetes
- Proficiency in one or more backend programming languages such as Java, Kotlin, Go, or C
- Experience with operational metrics and monitoring tools like Prometheus and Grafana
- Familiarity with command-line interfaces (CLI)
- Strong understanding of network fundamentals and experience with troubleshooting network issues
- Willingness to learn new skills and flexibility to fill different roles to support high-priority initiatives
Preferred Qualifications:
- Experience deploying and operationally supporting information retrieval systems, ensuring that search algorithms and indexing strategies are effectively served and optimized for user queries.
- Experience with Generative AI technologies and frameworks, including but not limited to natural language processing (NLP) or reinforcement learning.
- Familiarity with optimizing AI models in production environments.
- Familiarity with cloud infrastructure and experience with AWS
- Experience implementing infrastructure as code using Terraform or CloudFormation
- Experience implementing continuous integration and continuous deployment (CI/CD) pipelines using GitHub Actions
- Experience with logging and tracing tools like ELK Stack, Zipkin, or OpenTracing
- Familiarity with service mesh technologies such as Istio or Linkerd
Top three skills:
- At least five years of experience as a Data Engineer or DevOps Engineer building models and pipelines with tools such as Databricks, Kubernetes, PySpark, and Airflow. Must have experience on the infrastructure side, not just building machine learning models.
- Prior experience deploying models to production is required.
- Nice to have: proven experience with backend programming languages such as Java, Kotlin, and Python.
Thank you.