How to Define Product Lineage

Prerequisites

  • You can log in to Witboost.
  • You have access to view and edit the product descriptor for your product.

Overview

Lineage represents the dependencies between products and shows how data flows through the organization.

For product producers, defining lineage correctly is crucial for:

  • Providing transparency to data consumers about where data comes from and where it goes.
  • Supporting impact analysis when making changes to a product.
  • Enabling governance and observability features in the Marketplace.

In the product descriptor, lineage is defined through two types of relationships:

  • readsFrom: represents strong, operational dependencies tied to real, physical data flows.
  • logicallyReadsFrom: represents design-level or high-level dependencies, such as flows that are planned but not yet built, or flows whose details are not fully documented.

These relations are visualized in the Marketplace Lineage Graph as solid lines (readsFrom connections) and dashed lines (logicallyReadsFrom connections).

Step by step

  1. Understand the relation types

    Before editing the descriptor, decide which relation best describes each dependency:

    • Use readsFrom when there is a concrete, operational data flow between two products.
    • Use logicallyReadsFrom when you want to represent a design-level or high-level dependency, without (yet) modeling the full physical pipeline.

    In the Marketplace Lineage Graph:

    • readsFrom appears as a solid line.
    • logicallyReadsFrom appears as a dashed line.
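
The decision in this step can be sketched as a small Python helper. This is only an illustration: `Relation`, `LINE_STYLE`, `choose_relation`, and its two parameters are hypothetical names, not part of Witboost or of the descriptor schema. Only the two relation names and their line styles come from the text above.

```python
from enum import Enum


class Relation(Enum):
    READS_FROM = "readsFrom"
    LOGICALLY_READS_FROM = "logicallyReadsFrom"


# Line style shown in the Marketplace Lineage Graph for each relation.
LINE_STYLE = {
    Relation.READS_FROM: "solid",
    Relation.LOGICALLY_READS_FROM: "dashed",
}


def choose_relation(flow_is_operational: bool, modeled_in_full: bool) -> Relation:
    """Pick a relation for one dependency (hypothetical helper).

    readsFrom is chosen only for a concrete, operational flow whose
    physical pipeline is modeled; anything else is treated as logical.
    """
    if flow_is_operational and modeled_in_full:
        return Relation.READS_FROM
    return Relation.LOGICALLY_READS_FROM


# A planned flow that does not exist yet is drawn as a dashed line.
planned = choose_relation(flow_is_operational=False, modeled_in_full=False)
print(planned.value, LINE_STYLE[planned])
```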
  2. Configure readsFrom for strong, physical dependencies

    Use readsFrom to describe real, operational data flows between two products. It indicates that one of the consuming product's components (usually a workload) actively reads data from a specific published output port.

    Constraints

    • Source (Consumer)
      • Must always be a component or subcomponent, typically a workload (e.g., a service or pipeline).
      • Represents the element actively consuming data.
    • Target (Producer)
      • Must always be a consumable component or subcomponent, such as a published output port.
      • Represents the element exposing data for consumption.

    Best practices

    • Use readsFrom only when the physical flow is established and operational.
    • Be as specific as possible, linking directly to the exact consumable interface (output port).
    • Avoid using readsFrom for future or conceptual dependencies – those should be modeled with logicallyReadsFrom.
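
The readsFrom constraints above can be expressed as a small check. The sketch below is an assumption-laden illustration in Python: the `Entity` class, its `kind` and `consumable` fields, and `check_reads_from` are hypothetical and do not mirror the real descriptor schema or any Witboost API. The entity names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    kind: str                 # "system", "component", or "subcomponent"
    consumable: bool = False  # True for e.g. a published output port


def check_reads_from(source: Entity, target: Entity) -> list[str]:
    """Return the violated readsFrom constraints (empty list means valid)."""
    problems = []
    # Source (consumer): must be a component or subcomponent.
    if source.kind not in ("component", "subcomponent"):
        problems.append(f"source '{source.name}' must be a component or subcomponent")
    # Target (producer): must be a consumable component or subcomponent.
    if target.kind not in ("component", "subcomponent") or not target.consumable:
        problems.append(f"target '{target.name}' must be a consumable component or subcomponent")
    return problems


workload = Entity("orders-etl", "component")
output_port = Entity("customers-view", "component", consumable=True)
whole_product = Entity("customers", "system")

print(check_reads_from(workload, output_port))    # workload -> output port
print(check_reads_from(workload, whole_product))  # workload -> whole product
```

In this sketch, a dependency on a whole product fails the check, which is the case the text reserves for logicallyReadsFrom.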
  3. Configure logicallyReadsFrom for logical or high-level dependencies

    Use logicallyReadsFrom for weaker relationships, when you want to capture intent or high-level flows without a fully defined physical connection.

    When to use it

    • Future dependencies: the data flow does not exist yet, but is planned.
    • Simplified documentation: a real data flow exists, but you do not want to model every intermediate step.
    • Group-level relationships: the dependency is on a whole product or group of outputs (for example, a component that is the parent of multiple consumable subcomponents), not a single output port.

    Constraints

    • Source (Consumer)
      • Can be a system, component, or subcomponent.
      • Represents the consumer at any level of granularity.
    • Target (Producer)
      • Can be:
        • A group of consumables, such as a whole product or parent component containing multiple outputs.
        • A specific single consumable, like an output port.
    tip

    Prefer readsFrom over logicallyReadsFrom when defining a relationship from a component toward a consumable component or subcomponent. While logicallyReadsFrom can technically be used, readsFrom is recommended because it provides a stronger, more accurate representation of an actual operational data flow.
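
One way to read the constraints and the tip together is as a rule that suggests the stronger relation whenever it applies. The Python sketch below assumes a hypothetical `Entity` model and a hypothetical `suggest_relation` helper; neither exists in Witboost, and the `flow_is_operational` flag is an assumption standing in for "the physical flow is established".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    kind: str                 # "system", "component", or "subcomponent"
    consumable: bool = False  # True for e.g. a published output port


def suggest_relation(source: Entity, target: Entity, flow_is_operational: bool = True) -> str:
    """Suggest readsFrom where its constraints hold, else logicallyReadsFrom."""
    source_ok = source.kind in ("component", "subcomponent")
    target_ok = target.kind in ("component", "subcomponent") and target.consumable
    if flow_is_operational and source_ok and target_ok:
        # Component -> single consumable with a real flow: prefer the stronger relation.
        return "readsFrom"
    # Future flows, system-level sources, or group-level targets stay logical.
    return "logicallyReadsFrom"


workload = Entity("orders-etl", "component")
output_port = Entity("customers-view", "component", consumable=True)
product = Entity("customers", "system")

print(suggest_relation(workload, output_port))
print(suggest_relation(workload, product))
print(suggest_relation(workload, output_port, flow_is_operational=False))
```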

  4. Verify lineage in the Marketplace

    Once the descriptor is updated and your product is deployed and published:

    • Open the Marketplace Lineage Graph for your product.
    • Check that solid and dashed lines match the readsFrom and logicallyReadsFrom relations you defined.
    • Adjust the descriptor if any dependency is missing or modeled at the wrong level (physical vs logical).
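
The comparison in this step can also be done on paper or with a short script. The Python sketch below assumes you have noted the edges you see in the graph by hand as `(source, target, style)` tuples; it does not call any Witboost API, and the `relations` dictionary is a simplified, hypothetical stand-in for the descriptor, not its real structure.

```python
# Line style expected in the Lineage Graph for each relation.
STYLE = {"readsFrom": "solid", "logicallyReadsFrom": "dashed"}


def expected_edges(relations: dict) -> set:
    """Flatten {source: {relation: [targets]}} into (source, target, style) tuples."""
    edges = set()
    for source, by_relation in relations.items():
        for relation, targets in by_relation.items():
            for target in targets:
                edges.add((source, target, STYLE[relation]))
    return edges


def diff_lineage(relations: dict, observed: set) -> dict:
    """Compare declared relations with edges noted from the graph."""
    expected = expected_edges(relations)
    return {"missing": expected - observed, "unexpected": observed - expected}


# Hypothetical declared relations and hand-noted graph edges.
relations = {
    "orders-etl": {
        "readsFrom": ["customers-view"],
        "logicallyReadsFrom": ["billing"],
    }
}
observed = {
    ("orders-etl", "customers-view", "solid"),
    ("orders-etl", "billing", "solid"),  # drawn solid, but declared as logical
}

print(diff_lineage(relations, observed))
```

Here the `billing` edge shows up under both `missing` (as dashed) and `unexpected` (as solid), which points to a dependency modeled at the wrong level.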

Result

When lineage is correctly defined in the descriptor:

  • Data consumers can clearly see where data comes from and where it goes.
  • Impact analysis becomes easier when you change or deprecate a product.
  • Governance and observability features in the Marketplace have accurate information to work with.