Model Performance

Model features and prediction accuracy

Model Performance can be found in your Dataro account under the Predictions menu item.

Start by selecting the relevant Dataro model from the list available. 

Model Summary and Data Sources

The first two sections, Model Summary and Data Sources, detail what type of model is being used, what it predicts, and the data sources it was trained on. Next, in the Sample of Top Predictors section, you will find a sample of 10 top predictors for a donor taking the action predicted by the model. 


Model Performance and Incidence Rate

This section details how often the predicted event occurs per 100 donors, relative to their Dataro score and rank.

In the example below:
  • Top Ranked Donors (rank 1 - 1,000): The 74.7% incidence rate means that for the top 1,000 donors selected by the model, 747 made an appeal donation in the 90 days after the prediction.
  • Low Ranked Donors (rank 10,001-11,000): The 0.1% incidence rate means that only 1 out of every 1,000 donors in this group made an appeal donation in the same period.
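The arithmetic behind these incidence rates can be sketched as follows (a minimal illustration only; the rank bands and outcome flags are made up to match the example above, not pulled from Dataro's system):

```python
def incidence_rate(outcomes):
    """Incidence rate = positive outcomes per 100 donors in a rank band.

    outcomes: list of 0/1 flags, one per donor in the band
    (1 = donor took the predicted action, e.g. made an appeal donation).
    """
    return 100 * sum(outcomes) / len(outcomes)

# Hypothetical top band: 747 of the top 1,000 ranked donors donated.
top_band = [1] * 747 + [0] * 253
print(incidence_rate(top_band))  # 74.7

# Hypothetical low band: 1 donation per 1,000 donors.
low_band = [1] + [0] * 999
print(incidence_rate(low_band))  # 0.1
```

A well-ordered model shows incidence rates falling steadily as rank increases, which is exactly what the chart visualises.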

For new clients, the charts generated will be green and the data will be simulated retroactively, showing what Dataro scores and ranks would have been had they been used to predict historical donor behavior and campaigns. 

Once there is enough data and/or enough time has elapsed (typically 6 months), the chart will be shown in purple and reflect how your Dataro predictions have actually performed during this period.

The chart illustrates the relationship between donor ranks, incidence rates, and revenue per donor. It helps you understand how well your predictive model orders donors by their likelihood to donate.

Toggle between the Dataro Ranks and Dataro Scores to see how your predictions performed by Rank or by Score.

We use two key metrics to evaluate our models: PRC AUC and Top-N. Both measure how accurately a model identifies the donors who go on to take the predicted action, and higher values mean better performance.

  • PRC AUC (Area Under the Precision-Recall Curve): This metric ranges from 0 to 1, with higher values indicating better performance. It evaluates how well the model balances precision (accurate positive predictions) and recall (capturing all true positives), which is particularly important for datasets with imbalanced outcomes, such as having many more non-donors than donors.

  • Top-N: This metric involves counting the number of positive cases within a specific timeframe (e.g. donations through direct mail in three months). We rank the predictions and review the top 'n' ranks to find the actual positive cases. For instance, if there were 150 direct mail donors starting from January 1, 2024, and the model accurately identified 120 of them within the top 150 ranks, the Top-N score would be 0.8.
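The Top-N calculation described above can be sketched in a few lines (a simplified illustration; the scores and labels below are invented, and Dataro's internal implementation may differ):

```python
def top_n(scores, labels):
    """Top-N: with n = number of actual positive cases, the fraction
    of those positives captured in the model's top n ranks.

    scores: model score per donor (higher = more likely to act).
    labels: 1 if the donor actually took the action, else 0.
    """
    n = sum(labels)
    # Rank donors by model score, highest first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    # Count actual positives found within the top n ranks.
    hits = sum(label for _, label in ranked[:n])
    return hits / n

# Hypothetical example: 3 actual donors among 6 scored contacts.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 1, 0, 0]
print(round(top_n(scores, labels), 2))  # 0.67 (2 of 3 donors in the top 3 ranks)
```

In the article's worked example, 120 of 150 direct mail donors appearing in the top 150 ranks would yield 120 / 150 = 0.8.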

Inspect Individual Cases

To dive deeper, you can inspect individual donor cases. This feature allows you to identify specific donor profiles and understand the factors influencing their predictive scores. Toggle between Highest rank donors, Mid rank donors, Lowest rank donors, and Custom rank donors to 'sense-check' the model across different predictive bands. Click 'See more' to open Contact View.

Historical Predictions

If you need to download historical scores and ranks as a reference (e.g., for a campaign run in the past), you can use this tool to access .csv files containing ranks and scores from previous dates. 

Model Settings
Dataro enables a degree of customisation for some models. 
Mid-Level Giving, Gift-in-Will and Major Giving model settings
Every gift you receive is allocated a Channel & Category tag in the Dataro system, depending upon the campaign with which the gift is associated. Read more about Categories and Channels here. The Model Settings feature in Model Performance allows you to select which Categories and Channels should be included in or excluded from Dataro's calculations when generating predictions for certain events, such as mid-level, major, or gift-in-will propensities. 
For example, we would typically recommend that you remove Peer-to-peer gifts from mid-level and major gift modelling, as these types of gifts may suggest a donor has personally contributed more than they actually have. 

Model Performance FAQ

What does the Training Data Size number mean? 
The training data size is the number of samples in the training data (positive and negative cases, i.e. people who did and didn't take the action). 
Why are the Model Performance & Incidence Rate graphs not available? 
Occasionally you may see a notification that says "Insufficient events have been detected to display useful prediction accuracy results." This occurs when Dataro has not found enough positive examples of the relevant behaviour (e.g. major giving) to provide a meaningful historic analysis. This is more common for rare events, such as major giving.