What are the responsibilities and job description for the Data Engineer (Databricks & Spark Specialist) position at Harmonia Holdings Group, LLC?
Harmonia Holdings Group, LLC, is an award-winning, rapidly growing federal government contractor committed to providing innovative, high-performing solutions to our government clients and focused on fostering a workplace that encourages growth, initiative, creativity, and employee satisfaction.
This role involves designing, building, and maintaining scalable data pipelines using Apache Spark and Databricks. The Data Engineer will work with cloud platforms such as Azure or AWS, handling data ingestion, transformation, and storage. Strong programming skills in Python, Scala, and SQL, along with expertise in ETL processes and data integration, are essential. Additional responsibilities include ensuring data security, implementing DevOps practices, and applying an understanding of machine learning concepts. A Databricks Certified Data Engineer Professional certification is required.
Technical Skills:
- Apache Spark: In-depth knowledge of Apache Spark, including Spark SQL, Spark Streaming, and Spark MLlib.
- Databricks: Hands-on experience with Databricks, including Databricks Runtime, Databricks Notebooks, and Databricks Jobs.
- Cloud Computing: Familiarity with cloud platforms such as Azure or AWS and experience with cloud-based data services like Azure Data Lake Storage (ADLS) or Amazon S3.
- Data Engineering: Knowledge of data engineering principles, including data ingestion, processing, and storage, as well as experience with data pipelines and workflows.
- Programming Languages: Proficiency in programming languages such as Python and Scala, and experience with SQL and query languages such as HiveQL or Spark SQL.
- Data Integration: Experience with data integration tools and techniques, including extract, transform, and load (ETL) processes.
- Data Analysis: Familiarity with data analysis techniques, including data visualization, reporting, and statistical analysis.
- Data Security: Knowledge of data security principles, including data encryption, access control, and authentication.
Additional Skills:
- DevOps: Familiarity with DevOps practices, including continuous integration and continuous deployment (CI/CD).
- Machine Learning: Knowledge of machine learning concepts and techniques, including model development, training, and deployment.
- Data Governance: Understanding of data governance principles, including data quality, data security, and data compliance.
- Cloud Security: Familiarity with cloud security principles, including identity and access management, network security, and data encryption.
Education: Bachelor's degree in Computer Science, Information Technology, or a related field.
Required Certification: Databricks Certified Data Engineer Professional