What personally identifiable information (PII) does Dataro use, and can this be customized for our organization?

Dataro never collects or uses health-related or other sensitive information. By default, Dataro collects the following personally identifiable information (PII) for each contact: contact_id, date of birth, gender, postcode and first name.


Strictly speaking, these data points are not personally identifiable, as you cannot reliably re-identify individuals in the dataset using just these fields. However, some organizations prefer to exclude some of these fields to balance their use of the product against their data-sharing preferences. Conversely, many Dataro users opt to include additional PII fields (e.g. email, phone, address, full name) to support more seamless audience building and generation of mail / telemarketing files.


The Dataro platform is flexible in the data it can work with. The most important data for Dataro’s predictive models is non-PII metadata related to historical donations and communications. However, if you choose to exclude some or all PII from your Dataro account, there will be some impact:

  • contact_id is essential as it is used to link to the other tables in the dataset, but it has no relation to anything outside of your database. This data point is required and cannot be excluded. 
  • First name is used to estimate age and gender when that data is not present for a contact. Age and gender have a significant impact on the Gift-in-Will and Major Giving models, so excluding first name data will reduce the accuracy of these models.
  • Internally, Dataro only resolves date of birth to an age for each contact, so you may choose to supply just year of birth or current age instead of date of birth.
  • Gender has a marginal impact on the models and can be excluded if you do not want it for later segmentation in the app.
  • Postcode has a marginal impact on the models and can be excluded if you do not want it for later segmentation in the app.
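
To illustrate the date of birth point above, here is a minimal sketch of how you might reduce a full date of birth to either a current age or a year of birth in your own export before sharing it. The function names and the record field names (date_of_birth, year_of_birth, age) are hypothetical and chosen for illustration only; they are not part of any Dataro API or required schema.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Return the number of whole years between date_of_birth and today."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)


def reduce_date_of_birth(contact: dict, today: date, keep: str = "age") -> dict:
    """Return a copy of contact with date_of_birth replaced by a coarser field.

    keep="age" stores the current age; keep="year_of_birth" stores the year only.
    All field names here are illustrative assumptions.
    """
    if keep not in ("age", "year_of_birth"):
        raise ValueError("keep must be 'age' or 'year_of_birth'")

    # Copy every field except the full date of birth.
    reduced = {k: v for k, v in contact.items() if k != "date_of_birth"}

    dob = contact.get("date_of_birth")
    if dob is not None:
        if keep == "age":
            reduced["age"] = age_on(dob, today)
        else:
            reduced["year_of_birth"] = dob.year
    return reduced


if __name__ == "__main__":
    contact = {"contact_id": "C-001", "first_name": "Sam", "date_of_birth": date(1990, 6, 15)}
    print(reduce_date_of_birth(contact, date.today(), keep="year_of_birth"))
```

Supplying year of birth rather than current age avoids the value going stale between exports, at the cost of being accurate only to within a year.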