Thursday, March 22, 2012

Distinct Count Confusion

I have an issue with our cube. In the DW we have a column leadId. Performing a distinct count with no filters gives us about 18000 on that leadId. We have a measure using a Distinct Count on the leadId but the value only gives us about 15000 with no filters. Am I missing something with how the behavior of distinct count should act?Please, check queries that are issued from SSAS to the datasource database.|||

So I have found the reason the distinct values are being filtered, but I'm not sure what in our cube design is dictating the behavior to act this way. Here is the query issued to the DW... What confuses me is that planId is a nullable field. So I'm not sure why it would issue the statement like this. Any ideas?

SELECT [Facts_CurrentStatus].[Facts_CurrentStatusleadId0_0] AS [Facts_CurrentStatusleadId0_0],[Facts_CurrentStatus].[Facts_CurrentStatusapplicationId0_1] AS [Facts_CurrentStatusapplicationId0_1],[Facts_CurrentStatus].[Facts_CurrentStatusapplicationCustomerCount0_2] AS [Facts_CurrentStatusapplicationCustomerCount0_2],[Facts_CurrentStatus].[Facts_CurrentStatusapplicationPremium0_3] AS [Facts_CurrentStatusapplicationPremium0_3],[Facts_CurrentStatus].[Facts_CurrentStatuspolicyId0_4] AS [Facts_CurrentStatuspolicyId0_4],[Facts_CurrentStatus].[Facts_CurrentStatuspolicyCustomerCount0_5] AS [Facts_CurrentStatuspolicyCustomerCount0_5],[Facts_CurrentStatus].[Facts_CurrentStatuspolicyPremium0_6] AS [Facts_CurrentStatuspolicyPremium0_6],[Facts_CurrentStatus].[Facts_CurrentStatusleadArrivalToApplicationCompleteLagDays0_7] AS [Facts_CurrentStatusleadArrivalToApplicationCompleteLagDays0_7],[Facts_CurrentStatus].[Facts_CurrentStatusleadArrivalToPolicyIssueLagDays0_8] AS [Facts_CurrentStatusleadArrivalToPolicyIssueLagDays0_8],[Facts_CurrentStatus].[Facts_CurrentStatusapplicationCompleteToPolicyIssueLagDays0_9] AS [Facts_CurrentStatusapplicationCompleteToPolicyIssueLagDays0_9],[Facts_CurrentStatus].[Facts_CurrentStatusrateUp0_10] AS [Facts_CurrentStatusrateUp0_10],[Facts_CurrentStatus].[Facts_CurrentStatusapplicationOriginatedOnDate0_11] AS [Facts_CurrentStatusapplicationOriginatedOnDate0_11],[Facts_CurrentStatus].[Facts_CurrentStatussetByEmployeeId0_12] AS [Facts_CurrentStatussetByEmployeeId0_12],[Facts_CurrentStatus].[Facts_CurrentStatusutcDate0_13] AS [Facts_CurrentStatusutcDate0_13],[Facts_CurrentStatus].[Facts_CurrentStatuscarrierId0_14] AS [Facts_CurrentStatuscarrierId0_14],[Facts_CurrentStatus].[Facts_CurrentStatuspartnerId0_15] AS [Facts_CurrentStatuspartnerId0_15],[Facts_CurrentStatus].[Facts_CurrentStatusplanId0_16] AS [Facts_CurrentStatusplanId0_16],[Facts_CurrentStatus].[Facts_CurrentStatusleadOriginatedOnDate0_17] AS [Facts_CurrentStatusleadOriginatedOnDate0_17],[Facts_CurrentStatus].[Facts_CurrentStatuspolicyOriginatedOnDate0_18] AS [Facts_CurrentStatuspolicyOriginatedOnDate0_18],[Facts_CurrentStatus].[Facts_CurrentStatuspolicyIssuedOnDate0_19] AS [Facts_CurrentStatuspolicyIssuedOnDate0_19],[Facts_CurrentStatus].[Facts_CurrentStatuszipCodeId0_20] AS [Facts_CurrentStatuszipCodeId0_20],[Facts_CurrentStatus].[Facts_CurrentStatusapplicationCompletedOnDate0_21] AS [Facts_CurrentStatusapplicationCompletedOnDate0_21],[Facts_CurrentStatus].[Facts_CurrentStatusenrollerId0_22] AS [Facts_CurrentStatusenrollerId0_22],[Facts_CurrentStatus].[Facts_CurrentStatusagentId0_23] AS [Facts_CurrentStatusagentId0_23],[Facts_CurrentStatus].[Facts_CurrentStatuscustomerId0_24] AS [Facts_CurrentStatuscustomerId0_24],[Facts_CurrentStatus].[Facts_CurrentStatusstatusId0_25] AS [Facts_CurrentStatusstatusId0_25],[Facts_CurrentStatus].[Facts_CurrentStatuscampaignId0_26] AS [Facts_CurrentStatuscampaignId0_26],[Dimensions_Plan_20].[carrierId] AS [Dimensions_PlancarrierId3_0]

FROM

(

SELECT [leadId] AS [Facts_CurrentStatusleadId0_0],[applicationId] AS [Facts_CurrentStatusapplicationId0_1],[applicationCustomerCount] AS [Facts_CurrentStatusapplicationCustomerCount0_2],[applicationPremium] AS [Facts_CurrentStatusapplicationPremium0_3],[policyId] AS [Facts_CurrentStatuspolicyId0_4],[policyCustomerCount] AS [Facts_CurrentStatuspolicyCustomerCount0_5],[policyPremium] AS [Facts_CurrentStatuspolicyPremium0_6],DateDiff("d", leadOriginatedOnDate, applicationCompletedOnDate) AS [Facts_CurrentStatusleadArrivalToApplicationCompleteLagDays0_7],DateDiff("d", leadOriginatedOnDate, policyIssuedOnDate) AS [Facts_CurrentStatusleadArrivalToPolicyIssueLagDays0_8],DateDiff("d", applicationCompletedOnDate, policyIssuedOnDate) AS [Facts_CurrentStatusapplicationCompleteToPolicyIssueLagDays0_9],policyPremium - applicationPremium AS [Facts_CurrentStatusrateUp0_10],[applicationOriginatedOnDate] AS [Facts_CurrentStatusapplicationOriginatedOnDate0_11],[setByEmployeeId] AS [Facts_CurrentStatussetByEmployeeId0_12],[utcDate] AS [Facts_CurrentStatusutcDate0_13],[carrierId] AS [Facts_CurrentStatuscarrierId0_14],[partnerId] AS [Facts_CurrentStatuspartnerId0_15],[planId] AS [Facts_CurrentStatusplanId0_16],[leadOriginatedOnDate] AS [Facts_CurrentStatusleadOriginatedOnDate0_17],[policyOriginatedOnDate] AS [Facts_CurrentStatuspolicyOriginatedOnDate0_18],[policyIssuedOnDate] AS [Facts_CurrentStatuspolicyIssuedOnDate0_19],[zipCodeId] AS [Facts_CurrentStatuszipCodeId0_20],[applicationCompletedOnDate] AS [Facts_CurrentStatusapplicationCompletedOnDate0_21],[enrollerId] AS [Facts_CurrentStatusenrollerId0_22],[agentId] AS [Facts_CurrentStatusagentId0_23],[customerId] AS [Facts_CurrentStatuscustomerId0_24],[statusId] AS [Facts_CurrentStatusstatusId0_25],[campaignId] AS [Facts_CurrentStatuscampaignId0_26]

FROM [Facts].[CurrentStatus]

)

AS [Facts_CurrentStatus],[Dimensions].[Plan] AS [Dimensions_Plan_20]

WHERE

(

(

[Facts_CurrentStatus].[Facts_CurrentStatusplanId0_16] = [Dimensions_Plan_20].[surrogatePlanId] --Filters out the leads with no plan interest.

)

)

ORDER BY [Facts_CurrentStatus].[Facts_CurrentStatusleadId0_0]

ASC

|||

but I'm not sure what in our cube design is dictating the behavior to act this way.
Here is the query issued to the DW... What confuses me is that planId is a nullable field.

I don't know your cube design, therefore I can't say what is wrong.
My advice, try to avoid nullable field in the DW. It is not only my opinion.

|||Yes we already started to refactor the DW so the nullable fields would no longer exist. One thing though, I wasn't necessarily asking you to tell me how our design was wrong. I was more confused as to why AS would create a query that joins on a nullable field. Regardless, problem solved.

No comments:

Post a Comment