Senior Data Engineer

Hybrid · San Francisco · Data · Posted 15d ago · Senior

Baselayer seeks a Senior Data Engineer to design and operate data infrastructure powering analytics and ML, with a focus on reliability and data quality. This hybrid role in SF offers a salary range of $135k-$220k, equity, unlimited vacation, and fully paid health benefits.

Annual salary: $135,000 – $220,000 + equity · health · 401k match

Responsibilities

Design, build, and maintain robust ETL and ELT pipelines for analytics and ML
Own and improve architecture and tooling for large-scale datasets in cloud data platforms
Implement orchestration and automation for data workflows using Airflow, dbt, or similar
Build and maintain reusable data models for faster experimentation and reliable reporting
Implement data quality checks, observability, and alerting to ensure data integrity

About the role

About Baselayer

Trusted by 2,200+ financial institutions, Baselayer is the intelligent business identity platform that helps verify any business, automate KYB, and monitor real-time risk. Baselayer’s B2B risk solutions and identity graph network leverage state and federal government filings and proprietary data sources to prevent fraud, accelerate onboarding, and lower credit losses.

About the Role

We are looking for a Data Engineer to design, build, and operate the data infrastructure that powers Baselayer’s analytics and machine learning capabilities. You will own robust, scalable pipelines that ingest, transform, and validate structured and unstructured data from internal systems and external sources, with a strong focus on reliability, observability, and data quality.

This is a hands-on role for someone who thrives in complexity, cares deeply about correctness, and wants to work close to AI and product workflows in a regulated domain.

What You’ll Do

Design, build, and maintain robust ETL and ELT pipelines that power analytics and machine learning use cases
Own and improve the architecture and tooling for storing, processing, and querying large-scale datasets in cloud data platforms
Implement orchestration and automation for data workflows using tools such as Airflow, dbt, or similar
Build and maintain reusable data models to enable faster experimentation and reliable reporting
Implement data quality checks, observability, and alerting to ensure integrity and reliability across environments
Partner with Data Science, ML Engineering, Product, and Engineering to ensure reliable data delivery and feature readiness for modeling
Optimize warehouse and query performance, scalability, and cost as data volumes grow
Maintain clear documentation, runbooks, and operational processes for pipelines and datasets
Partner with security and compliance stakeholders to ensure pipelines and access controls meet regulatory and internal standards

About You

You want to learn fast, take ownership, and build systems that other teams can rely on.

You are not just doing this for the win. You are doing it because you have something to prove and want to be great. You care about data integrity and reliability, you enjoy turning messy inputs into clean systems, and you are comfortable operating without a playbook. You are curious about AI and ML infrastructure and want to build the foundation that powers it.

About Baselayer

Visit job-boards.greenhouse.io for more.
