Staff Software Engineer - Distributed Data Systems

Databricks Inc.
San Francisco, CA Full Time
POSTED ON 1/10/2025
AVAILABLE BEFORE 3/28/2025

At Databricks, we are obsessed with enabling data teams to solve the world's toughest problems, from security threat detection to cancer drug development. We do this by building and running the world's best data and AI infrastructure platform, so our customers can focus on the high-value challenges that are central to their own missions.

Founded in 2013 by the original creators of Apache Spark, Databricks has grown from a tiny corner office in Berkeley, California to a global organization with over 1000 employees. Thousands of organizations, from small to Fortune 100, trust Databricks with their mission-critical workloads, making us one of the fastest growing SaaS companies in the world.

Our engineering teams build highly technical products that fulfill real, important needs in the world. We constantly push the boundaries of data and AI technology, while operating with the resilience, security and scale that are critical to making customers successful on our platform.

We develop and operate one of the largest-scale software platforms. The fleet consists of millions of virtual machines, generating terabytes of logs and processing exabytes of data per day. At our scale, we regularly observe cloud hardware, network, and operating system faults, and our software must gracefully shield our customers from all of them.

Modern data analysis employs sophisticated methods such as machine learning that go well beyond the roll-up and drill-down capabilities of traditional SQL query engines. As a software engineer on the Runtime team at Databricks, you will build the next-generation distributed data storage and processing systems that can outperform specialized SQL query engines in relational query performance, yet provide the expressiveness and programming abstractions to support diverse workloads ranging from ETL to data science.

Below are some example projects:

  • Apache Spark: Develop the de facto open source standard framework for big data.
  • Data Plane Storage: Deliver reliable, high-performance services and client libraries for storing and accessing very large amounts of data on cloud storage backends, e.g., AWS S3 and Azure Blob Storage.
  • Delta Lake: A storage management system that combines the scale and cost-efficiency of data lakes, the performance and reliability of a data warehouse, and the low latency of streaming. Its higher-level abstractions and guarantees, including ACID transactions and time travel, drastically simplify real-world data engineering architecture.
  • Delta Pipelines: It's difficult to manage even a single data engineering pipeline. The goal of the Delta Pipelines project is to make it simple to orchestrate and operate tens of thousands of data pipelines. It provides a higher-level abstraction for expressing data pipelines, lets customers deploy, test, and upgrade pipelines, and eliminates the operational burden of building and managing high-quality data pipelines.
  • Performance Engineering: Build the next-generation query optimizer and execution engine that's fast, tuning-free, scalable, and robust.

What we look for:

  • BS in Computer Science or a related technical field, or equivalent practical experience.
  • Optional: MS or PhD in databases or distributed systems.
  • Comfortable working towards a multi-year vision with incremental deliverables.
  • Driven by delivering customer value and impact.
  • 8 years of production-level experience in Java, Scala, or C++.
  • Strong foundation in algorithms and data structures and their real-world use cases.
  • Experience with distributed systems, databases, and big data systems (Spark, Hadoop).
About Databricks

    Databricks is the data and AI company. More than 5,000 organizations worldwide — including Comcast, Condé Nast, H&M, and over 40% of the Fortune 500 — rely on the Databricks Data Intelligence Platform to unify their data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe. Founded by the original creators of Apache Spark™, Delta Lake and MLflow, Databricks is on a mission to help data teams solve the world's toughest problems.

    Our Commitment to Diversity and Inclusion

    At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.

    Pay Range Transparency

    Databricks is committed to fair and equitable compensation practices. The pay range for this role is listed below and represents the base salary range for non-commissionable roles or on-target earnings for commissionable roles. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to job-related skills, depth of experience, relevant certifications and training, and specific work location.

    Local Pay Range:

    $192,000 - $260,000 USD

