What are the responsibilities and job description for the Data Pipeline Engineer position at Near Space Labs?
Data Pipeline Engineer
At Near Space Labs we design, build, and operate a fleet of stratospheric robots to image the Earth. We launch our proprietary, balloon-based imaging robots to heights between 40,000 and 60,000 feet. From this vantage point, we capture petabytes of imagery for a variety of use cases.
We seek a proactive Data Pipeline Engineer to drive the evolution of our petabyte-scale geospatial imagery pipeline. This role will be instrumental in ensuring the seamless flow of data, collaborating closely with data and software engineers. You will provide ongoing support by actively monitoring and troubleshooting data pipeline performance and reliability. Your contributions will directly impact our ability to deliver high-quality imagery to customers quickly and reliably.
Your responsibilities will include:
- Build and maintain our proprietary data pipeline for batch geospatial data processing (an illustrative sketch follows this list).
- Monitor and support data pipeline performance and reliability.
- Develop new data processing tasks to support evolving data requirements.
- Design and implement data storage solutions using cloud-based storage services.
- Develop and optimize data workflows using distributed systems.
- Rapidly generate functional prototypes.
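For a concrete flavor of the first responsibility, here is a minimal, purely illustrative batch-step sketch using Shapely (from the stack below): it merges per-image footprints into one coverage polygon. The function name and coordinates are invented and are not Near Space Labs code.

```python
# Illustrative only: merge per-image capture footprints into a single
# coverage polygon. merge_footprints and the coordinates are hypothetical.
from shapely.geometry import Polygon
from shapely.ops import unary_union


def merge_footprints(footprints: list[Polygon]) -> Polygon:
    """Dissolve overlapping image footprints into one coverage geometry
    (returns a MultiPolygon if the footprints are disjoint)."""
    return unary_union(footprints)


if __name__ == "__main__":
    # Two overlapping capture footprints in lon/lat (WGS84).
    a = Polygon([(-74.02, 40.70), (-74.00, 40.70), (-74.00, 40.72), (-74.02, 40.72)])
    b = Polygon([(-74.01, 40.71), (-73.99, 40.71), (-73.99, 40.73), (-74.01, 40.73)])
    print(f"merged coverage area (deg^2): {merge_footprints([a, b]).area:.6f}")
```

Real pipeline tasks would read actual rasters (e.g., via GDAL) and run at far larger scale; this only shows the shape of a single batch step.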
Location
This is a remote position open to anyone authorized to work in the United States.
Our Stack
- Python & TypeScript
- Distributed systems, gRPC services, Protobuf, REST
- Kubernetes, Istio
- Google Cloud Platform, Docker, Pub/Sub
- PostgreSQL, MongoDB, Google Cloud Storage
- GDAL, Shapely, Spatio-Temporal Asset Catalogs (STAC; see the sketch below)
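To show how the last stack line might fit together, here is a hedged sketch that catalogs one capture as a STAC Item using Shapely and pystac. pystac is an assumed client library (the posting names STAC, not a specific tool), and the item id and footprint are invented.

```python
# Illustrative only: register one capture as a STAC Item.
# pystac is an assumed client library; "capture-0001" and the footprint
# are hypothetical values.
from datetime import datetime, timezone

import pystac
from shapely.geometry import Polygon, mapping

footprint = Polygon([(-122.5, 37.7), (-122.3, 37.7), (-122.3, 37.9), (-122.5, 37.9)])

item = pystac.Item(
    id="capture-0001",
    geometry=mapping(footprint),   # GeoJSON geometry of the image footprint
    bbox=list(footprint.bounds),   # [minx, miny, maxx, maxy]
    datetime=datetime.now(timezone.utc),
    properties={},
)
print(item.to_dict()["id"])
```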
Your Skills
- Proficient in Python: 5 years of experience developing robust data pipelines and applications, with a strong understanding of data structures, algorithms, and geospatial data.
- Data Pipeline Expertise: Proven experience designing, building, and maintaining scalable data pipelines using distributed processing frameworks (e.g., Airflow, Spark); a minimal DAG sketch follows this list.
- Containerization and Kubernetes: Demonstrated ability to interact with and manage containerized data applications using Docker and Kubernetes.
- Cloud Storage and Data Warehousing: Experience working with cloud-based storage solutions (e.g., AWS S3, Google Cloud Storage, Azure Blob Storage) and data warehousing technologies.
- Understanding of Distributed Systems: Knowledge of how distributed systems work and the challenges they present.
- Problem-Solving and Troubleshooting: Strong ability to diagnose and resolve complex data infrastructure and pipeline issues.
- Self-Starter and Collaborative: Ability to work independently and collaboratively in a fast-paced environment, managing projects and deliverables effectively.
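As referenced under Data Pipeline Expertise above, here is a minimal Airflow DAG sketch (assuming Airflow 2.4+, with placeholder tasks standing in for real imagery processing):

```python
# A minimal DAG sketch, assuming Airflow 2.4+. Task names and bodies are
# hypothetical placeholders, not Near Space Labs pipeline code.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest():
    print("pull raw imagery from object storage")   # placeholder


def process():
    print("orthorectify and tile the imagery")      # placeholder


with DAG(
    dag_id="imagery_batch_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    process_task = PythonOperator(task_id="process", python_callable=process)
    ingest_task >> process_task  # process runs only after ingest succeeds
```

The `>>` operator declares the dependency between tasks; in a real pipeline each callable would invoke the actual processing code.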
What We Offer
- An exciting startup culture where you will have the opportunity to play a critical role in building and scaling a one-of-a-kind technology and organization.
- You will be part of an enthusiastic, international and motivated team of professionals who are committed to building unique technologies, being rigorous, and finding novel solutions to interesting problems.
- A diverse and inclusive workplace where we welcome people of different backgrounds, experiences and perspectives.
- A commitment that you will never be bored.
Equal Employment Opportunity
Near Space Labs is committed to diversity in our organization and building an equitable and inclusive environment for people of all backgrounds and experiences. Near Space Labs provides equal employment opportunities to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability or genetics.