What are the responsibilities and job description for the Data Engineer (Python) position at ENVISN INCORPORATED?
Job Title: Python Data Engineer
Location: Houston, TX (ONSITE ROLE)
Duration: Long-term contract
Job Description:
We are looking for a talented Data Engineer with expertise in Python data processing. The ideal candidate will have a strong background in Python API development, parallel data processing, and distributed systems design. You will be responsible for building and maintaining systems that handle large-scale data processing tasks, ensuring high performance and scalability.
Key Responsibilities:
Python API Development:
o Develop and maintain RESTful APIs using Python web frameworks such as FastAPI or Django.
o Collaborate with front-end developers to integrate user-facing elements with server-side logic.
Parallel Data Processing:
o Utilize Pandas, NumPy, and other libraries to process large datasets efficiently.
o Implement multithreading, multiprocessing, and asynchronous programming techniques.
o Optimize data processing pipelines to handle millions of rows with minimal latency.
Distributed Systems Design:
o Design and implement distributed systems with a focus on scalability and reliability.
o Understand and apply core concepts such as load balancing and task queues.
o Use Docker to containerize applications and manage dependencies.
o (Preferred) Experience with Kubernetes for container orchestration.
Technical Communication:
o Clearly articulate complex technical concepts to team members and stakeholders.
o Document system designs, processes, and code effectively.
o Collaborate with cross-functional teams to align on project goals and deliverables.
Must-Have Qualifications:
Experience in Python Web Frameworks:
o Proficiency with FastAPI, Django, or similar frameworks.
o C# coding.
o Understanding of RESTful API principles and best practices.
Docker Knowledge:
o Ability to create and manage Dockerfiles.
o Experience with containerization for deployment and development workflows.
Systems Design Understanding:
o Basic knowledge of load balancing, task queues, and distributed system concepts.
o Ability to design systems that are scalable and maintainable.
Concurrent and Parallel Computing Skills:
o Proficiency in multithreading and multiprocessing without relying solely on external libraries or frameworks.
o Familiarity with asynchronous programming, particularly asyncio in Python.
Communication Skills:
o Excellent technical communication abilities.
o Experience collaborating in team environments and conveying complex ideas clearly.
Preferred Qualifications:
Education:
o BS or MS in Computer Science.
Advanced Data Processing Tools:
o Experience with Polars, PySpark, or similar tools.
o Ability to handle large-scale data processing tasks efficiently.
Distributed Computing Experience:
o Hands-on experience with distributed architectures in Docker.
o Familiarity with concepts like task queuing, MapReduce, and saga patterns.
Kubernetes Experience:
o Knowledge of container orchestration using Kubernetes.
o Experience deploying and managing applications in a Kubernetes cluster.
Problem-Solving at Scale:
o Demonstrated ability to solve complex problems using parallel or distributed computing.
o Innovative thinking beyond single-threaded processes.
Compensation: $50.00 - $55.00 per hour