Data Foundations & Lineage Engineer

Remote
$50-60/hr.
Contract

Job Details

Confidential client of Biblioso

To apply for this job, email your details to resumes@biblioso.com.

Benefits

While benefits may vary based on work location and the nature of the job, in general our employees have access to a 401(k) retirement plan, disability coverage, an Employee Assistance Program (EAP), life insurance, health insurance, paid vacation and sick time, and paid holidays.

Job Description

Team Environment

In this role, the nature of the work is dynamic and requires a collaborative attitude. While you have specific duties, it's important to understand that the entire team is responsible for the final delivery, and this may occasionally involve taking on additional tasks outside your primary responsibilities. The ability to adapt and contribute wherever needed is key to succeeding in this environment.

Our Project

We are searching for a Data Foundations & Lineage Engineer to build, document, and maintain the core data ecosystem that drives Learning Data Intelligence. This role involves defining the structure, lineage, quality, and meaning of datasets within the Learning Lake. The engineer will ensure every dataset is discoverable, well-documented, and trustworthy.

This position requires hands-on work across the lakehouse, including mapping schemas, tracing lineage, profiling quality, eliminating manual dependencies, and constructing a durable documentation layer that serves engineering, analytics, AI agents, and business stakeholders.

Data Discovery & Documentation

Perform deep, brute‑force exploration of all Learning Lake schemas and tables to understand their meaning, business purpose, and dependencies.

Build a comprehensive documentation repository describing dataset definitions, column‑level semantics, business logic, refresh cadences, source systems, and downstream consumption patterns.

Translate implicit, tribal‑knowledge data flows into explicit, searchable documentation consistent with guidance.

Data Lineage & Architecture Clarity

Develop end‑to‑end lineage for Learning datasets, mapping sources, transformations, pipelines, and consumption (Power BI, semantic models, AI agents, etc.).

Identify and eliminate manual or undocumented data feeds, aligning with the Manual Dependency Elimination initiative.

Collaboration & Stakeholder Alignment

Work closely with the DRI team as subject‑matter partners; escalate questions and validate assumptions.

Partner with analytics, engineering, content, and program teams to ensure data design supports downstream reporting, modeling, and AI use cases.

Enablement & Self‑Service

Build the foundational metadata that powers data discovery, semantic models, and self‑service analytics.

Produce guides, readme files, and onboarding materials for all teams relying on Learning Lake.

Required Qualifications

4+ years of experience in data engineering, data analysis, data governance, or related fields.

Expert SQL and data‑profiling skills, with the ability to reverse‑engineer undocumented or ambiguous datasets.

Hands‑on experience with Azure Data Lake, Microsoft Fabric, Databricks, or Synapse in production environments.

Familiarity with metadata systems, data cataloging, lineage tooling, and orchestration best practices.

Demonstrated ability to operate effectively in ambiguous, poorly documented, and fast‑changing data environments.

Preferred Qualifications

Experience working across large‑scale data ecosystems with shifting taxonomies and inconsistent data quality, combined with strong foundations in data modeling, documentation systems, data product ownership, or semantic model design.

Proven ability to partner with engineering teams on data governance, lineage, metadata standards, and quality frameworks to improve reliability and trust.

Exposure to Learning or HR data domains (e.g., HCM, HRDP, Finance, Skills/Learning datasets), including familiarity with soft‑skilling, competency models, or employee capability frameworks.

Experience with or working knowledge of data architecture concepts (lakehouse, domain‑driven design, data contracts, schema governance).

A strategic thinker who can link data foundations to business impact and AI‑driven outcomes, with strong prioritization and cross‑functional influence.
