What are the responsibilities and job description for the Software Engineer, Data Ingestion position at Torch.AI?
Become Part of the Torch.AI Journey!
Torch.AI is a defense-focused AI software company. Unlike traditional government contractors, our team of experts takes calculated risks to self-fund R&D behind the scenes and then sells complete products "off-the-shelf" to mission owners for flexible, user-defined configuration completed by our solutions engineering teams. We conduct deep research to understand how AI and new data infrastructures can address a growing array of national defense needs. This allows us to go from ideation to full capability deployment in weeks and months, instead of years.
Today, we help improve national security, protect our warfighters, eliminate fraud, reduce risk, and enable better customer experiences. We’re passionate about solving complex problems. Join us in our mission to help organizations Unlock Human Potential.
The U.S. defense and national security industry offers an unparalleled opportunity to contribute to the safety and well-being of the nation while engaging with cutting-edge technologies. As a vital sector that shapes global stability, it offers a dynamic environment to tackle complex challenges across multidisciplinary domains. With substantial investment in innovation, the industry is at the forefront of developing AI, autonomous systems, and advanced national security solutions, each founded on the premise that information is the new battlefield. If this type of work is of interest, we’d love to hear from you.
The Role: Unlock Your Potential
As a Software Engineer specializing in Data Ingestion at Torch.AI, you will tackle the challenge of identifying, researching, and acquiring publicly available information (PAI) and open-source intelligence (OSINT) data, which serve as critical sources of intelligence for our customers. You will leverage existing collection capabilities and web crawlers within the Torch.AI platform, enhance related technologies and ML models to support customers at scale, and introduce innovative approaches to data collection.
Each of our customers requires unique technical solutions to remove obstacles to mission success, such as manually intensive data analysis and cognitive burden. Our modular end-to-end data infrastructure platform supports a wide variety of military functions and operations. We configure enterprise-grade solutions to meet the specialized needs of our customers and encourage company-wide collaboration to share context, skills, and expertise across a variety of tools, technologies, and development practices. Our platform enables customer-specific solution teams to rapidly design, configure, and deploy data architectures tailored to meet specific operational and analytical requirements.
You’ll work autonomously while driving coordinated, collaborative decisions across cross-functional solutions and product development teams, which include a mix of defense and national security experts, U.S. veterans, and experienced AI/ML, software, and data engineers. Your work will focus on developing and implementing efficient and creative methods for acquiring data, creating data pipelines, managing complex schemas, and optimizing data systems for improved collection capabilities. Successful candidates thrive in a fast-paced, entrepreneurial, and mission-driven environment. You’ll be encouraged to think creatively, challenge conventional approaches, and identify alternative ways of delivering customer value across complex problem sets. Your day-to-day workflow will vary, adapting to the requirements of our customers and the technical needs of their respective use cases. One day, you might be researching how to source PAI and OSINT data on a specific topic; another, you might be tasking Torch.AI’s collection capabilities to retrieve information for ingestion into data ecosystems; and the next, you might be working directly with customers, bringing deep intellectual curiosity to understanding their data needs.
What Sets This Role Apart
- Our decentralized operating model puts every employee at the forefront of our customers’ missions. You’ll work within and across small customer-account solutions teams and product development teams.
- We value customer intimacy, unique perspectives, and dedication to delivering lasting impact and results. You’ll have the opportunity to work on the frontlines of major customer programs and influence lasting success for Torch.AI and your teammates.
- You’ll have the opportunity to work on a wide range of projects, from designing and demonstrating early capabilities and prototypes to deploying large-scale production systems.
- You’ll directly help Torch.AI continue to position itself as a market leader in data infrastructure AI and compete against multi-billion-dollar incumbents and high-tech AI companies.
- We develop solutions directly supporting our nation’s warfighters and national security and prosperity; the impact of your work is directly visible.
Critical Skills
- B.S. degree in a related field or an equivalent combination of training and experience.
- Extensive experience with Python and JSON.
- Ability to extract data from open-access, publicly available data sources.
- Ability to develop and maintain large-scale web crawlers to access internet-wide data.
- Ability to build data pipelines for the rapid ingestion of data.
- Ability to extract and clean data from websites, ensuring data accuracy and consistency.
- Ability to optimize system performance through indexing, partitioning, and other techniques.
- Familiarity with Infrastructure-as-Code (IaC) tools and configuration management systems.
What We Value
- An entrepreneurial mindset.
- Ability to create targeted, deep collection tools to acquire data from high-value sources and maximize recall.
- Design and manage an internet-scale data retrieval system to collect accessible online information.
- Ability to translate data between various formats including JSON, Parquet, Avro.
- Knowledge of Spark, Kafka, and Airflow.
- Proficiency with cloud-based computing and storage platforms.
- Design and maintain database schemas to ensure efficient data storage and retrieval.
- Build tools to enhance the monitoring and troubleshooting of the collection system.
- Work closely with team members to optimize data acquisition workflows.
- Awareness of ethical considerations and responsible AI practices.
- Ensure data quality aligns with established standards and criteria.
- Excellent problem-solving skills, attention to detail, and ability to thrive in a fast-paced, collaborative environment.
- Eligible for Top Secret security clearance.
Professional Ambiance
- This role thrives in a cutting-edge, high-performance workspace.
- Our operations base is in Leawood, KS, with occasional opportunities for flexible hybrid or remote work arrangements.
Equity Program
- Employee Equity Pool: All employees are eligible to participate in the company equity incentive program within their first 12 months of employment. We are proud to say that we have a 100% participation rate among existing employees.
Incentives and Advantages
- Competitive salary, performance bonus, and benefits package.
- Opportunity to participate in Torch.AI’s employee equity incentive program.
- Unlimited PTO.
- 11 paid holidays each year.
- Dynamic and energetic teammates.
- Incredible chance for professional advancement in a rapidly scaling high-tech environment.
- Weekly in-office catering in our Leawood HQ.
- Access to company entertainment suite at the Kansas City T-Mobile Center, with tickets to all major events and concerts.
- Exceptional medical, dental, and vision insurance.
- Company-sponsored life and disability coverage.
- Relocation benefits.
Torch.AI is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, age, sex, national origin, protected veteran status, or status as an individual with a disability.
These positions are being reviewed and filled on a rolling basis, and multiple openings may be available for each role.