[SL-2777] Reproduce error with conversion metric and missing group by #1492

Jstein77 · 2024-10-30T17:28:33Z

@courtney.holcomb said:

We've been seeing this error come up in the logs a lot lately, and I was finally able to repro. It's triggered by querying a conversion metric with a where filter but with no group by. The where filter is only applied to one of the measures, so then we aren't able to join the two aggregated measures. I'm not super familiar with conversion metric behavior. What do we want the experience to be in that scenario to fix this bug? cc @willymwai.deng @jordan.stein

[August 21st, 2024 11:30 AM] diego.fernandez: Hey all, we've seen this error a few times in MFS recently:
All parent nodes should have the same set of linkable instances since all values are coalesced.
They all seem to come from the same account, so maybe it's a configuration error on their side, but it shows up as errors on our side <https://dbtlabsmt.datadoghq.com/logs?query=%40extra.job.error.message%3A"All parent nodes should have the same set of linkable instances since all values are coalesced."&agg_m=count&agg_m_source=base&agg_t=count&cols=host%2Cservice&context_event=AZFy6QYvAAC7NuVb965U8ACb&fromUser=false&graphType=flamegraph&historicalData=true&hostGroup=*&lens=Performance&messageDisplay=inline&page=1&panelFrom=1724209200000&panelStorageType=hot&panelTo=1724210400...

_{From SyncLinear.com | SL-2777}

The text was updated successfully, but these errors were encountered:

Jstein77 · 2024-10-30T17:28:37Z

As a follow up:

Add validators when someone adds a filter on a conversion measure.

Jstein77 · 2024-10-30T17:28:38Z

we do not have anything that tracks what type of metric is being used. We could see if it gets logged so that we can make that calculation on DD

Jstein77 · 2024-10-30T17:28:38Z

Do we have any visibility into how often conversion metrics are being queried, and what % hit this error? I checked in our snowflake tables but didn't see anything. I'm trying to prioritize this bug.

Jstein77 · 2024-10-30T17:28:39Z

Commented on the PR. I don't know if we can quick-fix this, but if you've got ideas I'd be happy to be wrong about that.

Jstein77 · 2024-10-30T17:28:39Z

That error isn't happening in the pushdown optimizer, it's happening on plan conversion. A query of this form apparently doesn't convert to a sql query plan:

    parsed_query = query_parser.parse_and_validate_query(
        metric_names=("visit_buy_conversion_rate_7days",),
        group_by_names=("metric_time", "user__home_state_latest"),
        where_constraint=PydanticWhereFilter(where_sql_template="{{ Dimension('visit__referrer_id') }} = '123456'"),
    )

Jstein77 · 2024-10-30T17:28:39Z

Hmm, I made that change and it seems to break the predicate pushdown tests, but other tests seems to be fine

FAILED tests_metricflow/query_rendering/test_predicate_pushdown_rendering.py::test_conversion_metric_query_filters - RuntimeError: Expected exactly one matching instance for LinklessEntitySpec(element_name='user', entity_links=()) in instance set, but found: []. All entity instances: ()

@tomkit.lento mind lending a hand? 👀

Jstein77 · 2024-10-30T17:28:40Z

This only addresses the bug from Courtney's post and also the issue 1210, but 1199 is a separate issue

Jstein77 · 2024-10-30T17:28:41Z

Cause this base_measure_recipe.required_local_linkable_specs contains the specs needed to do the joins to get the converted events, but it contains the specs from the where filter too since it was built via __get_required_and_extraneous_linkable_specs and in the final aggregate measures node, we should just use the queried_specs as we don't want any of the specs from the filter to be there which is causing that bug

Jstein77 · 2024-10-30T17:28:41Z

So trying to remember the context of all this again, but it seems like you can repro this by filtering with a dim that exists in the base measure's semantic model, but that dimension isn't in the group by? I'm looking at the error, it seems like that error is coming from when we cross join the agg'd base measure set and the base measure set filtered by only converted where the base measure set filtered by converted rows
seems to have the linkable element (ie., the dimension) so it's erroring out during the cross join rendering since it should not have the spec from the filter.

I did some testing, it seems like this fixes that bug? #1381

Jstein77 · 2024-10-30T17:28:42Z

More details here:

#1199

Jstein77 · 2024-10-30T17:28:42Z

@jordan.stein no, we would not.

Jstein77 · 2024-10-30T17:28:43Z

Taking a look 👀

Jstein77 · 2024-10-30T17:28:43Z

I thought if we filter on the base measure set, then join in the conversion set we should only be matching users in the base measure set so wouldn't we be safe from false conversion matches?

Jstein77 · 2024-10-30T17:28:45Z

I can also repro if i run dbt sl query --metrics mql_to_seller_conversion_rate_base --where "{{Dimension('mql__origin')}} = 'direct_traffic'". In this case mql__origin is from the same semantic model as the base measure

Jstein77 · 2024-10-30T17:28:45Z

If you only apply to the filter to the base measure you may get inappropriate conversion matches.

Jstein77 · 2024-10-30T17:28:46Z

I feel like that was intentional but maybe I'm misremembering? @willymwai.deng would know

Jstein77 · 2024-10-30T17:28:46Z

I think we should apply the filter to the base measure, similar to how to do it for categorical dimensions. i.e The filter only gets applied here to limit the base measure set.

 (
    SELECT
      metric_time__day
      , SUM(mqls) AS mqls
    FROM (
      SELECT
        DATE_TRUNC('day', first_contact_date) AS metric_time__day
        , CASE WHEN mql_id IS NOT NULL THEN 1 ELSE 0 END AS mqls
      FROM ANALYTICS.dbt_jstein.olist_marketing_qualified_leads olist_mqls_src_10000
    ) subq_2
    WHERE metric_time__day > '2010-01-01'
    GROUP BY
      metric_time__day
  ) subq_4

Jstein77 · 2024-10-30T17:28:47Z

Yep that's what I would expect

Jstein77 · 2024-10-30T17:28:47Z

I get the error when filtering for metric time

Encountered an error error querying against the semantic layer: All join data sets should have the same set of linkable instances as the from dataset since all values are coalesced.
From dataset instance set: InstanceSet(measure_instances=(MeasureInstance(defined_from=(SemanticModelElementReference(semantic_model_name='olist_mqls', element_name='mqls'),), associated_columns=(ColumnAssociation(column_name='mqls', single_column_correlation_key=SingleColumnCorrelationKey(PYDANTIC_BUG_WORKAROUND=True)),), spec=MeasureSpec(element_name='mqls', non_additive_dimension_spec=None, fill_nulls_with=None), aggregation_state=AggregationState.COMPLETE),), dimension_instances=(), time_dimension_instances=(), entity_instances=(), group_by_metric_instances=(), metric_instances=(), metadata_instances=())
Join dataset instance sets: [InstanceSet(measure_instances=(MeasureInstance(defined_from=(SemanticModelElementReference(semantic_model_name='olist_closed_deals', element_name='sellers'),), associated_columns=(ColumnAssociation(column_name='sellers', single_column_correlation_key=SingleColumnCorrelationKey(PYDANTIC_BUG_WORKAROUND=True)),), spec=MeasureSpec(element_name='sellers', non_additive_dimension_spec=None, fill_nulls_with=None), aggregation_state=AggregationState.COMPLETE),), dimension_instances=(), time_dimension_instances=(TimeDimensionInstance(defined_from=(SemanticModelElementReference(semantic_model_name='olist_mqls', element_name='ds'),), associated_columns=(ColumnAssociation(column_name='metric_time__day', single_column_correlation_key=SingleColumnCorrelationKey(PYDANTIC_BUG_WORKAROUND=True)),), spec=TimeDimensionSpec(element_name='metric_time', entity_links=(), time_granularity=TimeGranularity.DAY, date_part=None, aggregation_state=None)),), entity_instances=(), group_by_metric_instances=(), metric_instances=(), metadata_instances=())]

Polling for queried result set
(cloud-env) jordanstein@Jordan-Stein jaffle-sl-template % dbt sl query --metrics mql_to_seller_conversion_rate_base --where "{{TimeDimension('metric_time')}} > '2010-01-01'"

Jstein77 · 2024-10-30T17:28:48Z

Actually, wait, I’m thinking of something else. It is a bit odd that it’s only time filters.

But the linked GH issue has a repro case in it, and that uses a metric_time filter as well.

Jstein77 · 2024-10-30T17:28:48Z

It should only happen with metric_time, for Reasons

Jstein77 assigned WilliamDee Oct 30, 2024

Jstein77 added Low priority Created by Linear-GitHub Sync Medium Created by Linear-GitHub Sync Metricflow Created by Linear-GitHub Sync labels Oct 30, 2024

Jstein77 added this to the v0.207.0 milestone Oct 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SL-2777] Reproduce error with conversion metric and missing group by #1492

[SL-2777] Reproduce error with conversion metric and missing group by #1492

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

[SL-2777] Reproduce error with conversion metric and missing group by #1492

[SL-2777] Reproduce error with conversion metric and missing group by #1492

Comments

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024

Jstein77 commented Oct 30, 2024