Data Engineer

O3 Technology Solutions
San Francisco, CA | Full Time
POSTED ON 3/6/2025
AVAILABLE BEFORE 5/6/2025

Job Details

Job Description:

We are seeking a skilled Data Engineer with expertise in building and optimizing data pipelines and infrastructure to support ML and AI applications. The ideal candidate should have strong programming skills in Python and Scala, with hands-on experience in Apache Spark and big data processing frameworks.

Key Responsibilities:

Data Pipeline Development

  • Design, build, and maintain scalable ETL/ELT data pipelines for structured and unstructured data.
  • Develop real-time and batch data processing pipelines using Apache Spark (PySpark, Scala).
  • Optimize data workflows for performance, reliability, and cost efficiency.

Big Data & Cloud Engineering

  • Work with distributed data processing frameworks such as Apache Spark, Hadoop, or Kafka.
  • Implement data lake, data warehouse, and data mart architectures.
  • Leverage cloud-based data solutions (AWS, Azure, or Google Cloud Platform) for storage, transformation, and analytics.

ML & AI Infrastructure Support

  • Design data pipelines for ML model training, evaluation, and deployment.
  • Support feature engineering, data validation, and model inference processes.
  • Collaborate with Data Scientists and ML Engineers to ensure high-quality data availability for AI models.

Database & Storage Optimization

  • Work with SQL and NoSQL databases (e.g., PostgreSQL, Redshift, Snowflake, BigQuery, Cassandra, MongoDB).
  • Optimize database performance, indexing, and query execution for large datasets.

Security & Compliance

  • Implement data security best practices, including encryption, access controls, and auditing.
  • Ensure compliance with data privacy and security requirements such as GDPR, CCPA, and PCI DSS.

Required Skills & Experience:

Programming Languages:

  • Proficient in Python & Scala (Scala preferred).
  • Experience with PySpark & Apache Spark for distributed data processing.

Big Data & Cloud Technologies:

  • Apache Spark (PySpark, Scala), Hadoop, Hive, Kafka.
  • Experience with Cloud Data Platforms such as AWS (Glue, EMR, Redshift), Azure (Databricks, Synapse), or Google Cloud Platform (BigQuery, Dataflow).
  • Working knowledge of containerization (Docker, Kubernetes).

Data Engineering & Pipeline Development:

  • Strong experience in ETL/ELT development using Spark, Airflow, or Dataflow.
  • Familiarity with orchestration tools (Apache Airflow, Prefect, Dagster, or AWS Step Functions).

Database & Query Optimization:

  • Proficiency in SQL (PostgreSQL, Snowflake, Redshift, BigQuery, or MySQL).
  • Experience with NoSQL databases (MongoDB, Cassandra, DynamoDB, or HBase).

ML & AI Data Infrastructure:

  • Understanding of data preprocessing for ML models.
  • Experience working with ML pipelines, feature stores, and model serving frameworks.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
