AI/ML: Transparency and Choice
Learn about Sentry's approach to AI/ML
Sentry processes your service data, the data you configure to be collected and reported to your Sentry instance, to provide our service to you. As Sentry's service has evolved, however, prior heuristics-based approaches can no longer deliver the product value you've come to expect. To train and validate models for grouping, notifications, and workflow improvements, Sentry needs access to additional service data to deliver a better user experience.
You can update these settings in the "Service Data Usage" section of the Legal & Compliance page in Sentry, located under the "Usage & Billing" settings.
In accordance with our Terms of Service, Sentry may use non-identifying elements of your service data for product improvement. For example, we may aggregate web vitals data to show your site's performance against a Sentry-built benchmark. The data accessed for the benchmark cannot be linked back to any particular project or customer, making it non-identifying.
For upcoming features like priority alerts or ML-based grouping, if authorized by you, Sentry may access the following forms of service data for product improvement:
- Error messages
- Stack traces
- Spans
- DOM interactions
For generative AI features like Seer, Sentry is asking for access to the following forms of service data to provide insights, analysis, and solutions for your review. Your data will not be used to train any generative AI models without your express consent, and AI-generated output from your data is shown only to you, not other customers.
- Error messages
- Stack traces
- Sentry spans
- DOM interactions
- Profiles
- Relevant code from linked repositories
To ensure that data is stored in your selected region, generative AI features will be disabled by default for EU region customers where data storage in the EU region is not available.
| Access Type | Is the underlying data identifiable? | Who will this data (or any output) be shared with? | Will this data be used for training Sentry models? | Will this data be used to train 3rd party models? |
| --- | --- | --- | --- | --- |
| Non-identifying data | No | Other Sentry customers* | Yes | No |
| Aggregated identifying data | Yes | Approved subprocessors | Yes | No |
| Identifying data for generative AI features | Yes | Approved subprocessors | No | No |
*In these cases we don't share the underlying data, only aggregations or output generated from the data.
In addition to the consent mechanisms mentioned above:
- We'll continue to encourage all customers to use our various data scrubbing tools so that service data is sanitized before we receive it (see the SDK sketch after this list).
- We'll apply the same deletion and retention rules to our training data as we do to the underlying service data. This means that if you delete service data, it will also be removed from our machine learning models automatically.
- We'll scrub data for PII before it goes into any training set.
- We'll ensure that the only service data presented in the output of any generative AI feature belongs to the customer using the feature.
- We'll only use generative AI models built in-house, deployed in our production cloud, or provided by our existing trusted third-party subprocessors who have made contractual commitments that are consistent with the above.
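As an illustration of client-side scrubbing, here is a minimal sketch using the Python SDK's `before_send` hook, which runs before an event leaves your application. The DSN and the specific keys being filtered are placeholders; which fields you scrub depends on your own data.

```python
import sentry_sdk

# Illustrative set of keys to redact; adjust to match your own payloads.
SENSITIVE_KEYS = {"password", "email", "auth_token"}

def scrub_event(event, hint):
    """Remove potentially identifying values before the event is sent."""
    # Redact sensitive keys from submitted request data, if present.
    request = event.get("request", {})
    if isinstance(request.get("data"), dict):
        request["data"] = {
            key: "[Filtered]" if key.lower() in SENSITIVE_KEYS else value
            for key, value in request["data"].items()
        }
    # Strip the user's IP address and email, keeping only an internal id.
    user = event.get("user")
    if user:
        user.pop("ip_address", None)
        user.pop("email", None)
    return event  # returning None instead would drop the event entirely

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    send_default_pii=False,   # don't attach user PII automatically
    before_send=scrub_event,  # sanitize events before they leave the client
)
```

Server-side data scrubbing settings can act as a second layer, but scrubbing in the SDK keeps sensitive values from ever leaving your environment.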
We're confident that with these controls in place, we'll be able to use service data to improve and provide our products through AI while still protecting that data.