| 1 | +# Training Data Quality Analyzer for ServiceNow Predictive Intelligence |
| 2 | + |
| 3 | +## Overview |
| 4 | +This script analyzes the quality of incident data in ServiceNow to determine readiness for Predictive Intelligence (PI) model training. It provides detailed statistics and quality metrics to help ServiceNow developers and admins identify and address data issues before starting ML training jobs. |
| 5 | + |
| 6 | +## Purpose |
| 7 | +- Assess completeness and quality of key fields in incident records |
| 8 | +- Identify common data issues that could impact PI model performance |
| 9 | +- Provide actionable insights for improving training data |
| 10 | + |
| 11 | +## Features |
| 12 | +- Checks completeness of important fields (e.g., short_description, description, category, subcategory, close_notes, assignment_group) |
| 13 | +- Analyzes text quality for description and close notes |
| 14 | +- Evaluates category diversity and resolution times |
| 15 | +- Calculates an overall data quality score |
| 16 | +- Outputs results to the ServiceNow system logs |
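The text-quality feature could be sketched roughly as follows. This is a minimal illustration, not the script's actual logic: `summarizeText` and its `minLength` threshold are hypothetical, and the values are plain strings standing in for `description` or `close_notes` values read from incident records.

```javascript
// Hypothetical helper: summarizes free-text values such as description or close_notes.
// minLength is an assumed threshold, not a value taken from the analyzer script.
function summarizeText(values, minLength) {
    var totalLength = 0;
    var empty = 0;
    var tooShort = 0;
    for (var i = 0; i < values.length; i++) {
        // Treat null/undefined as empty and ignore surrounding whitespace.
        var text = String(values[i] == null ? '' : values[i]).trim();
        if (text.length === 0) {
            empty++;
            continue;
        }
        totalLength += text.length;
        if (text.length < minLength) {
            tooShort++;
        }
    }
    var filled = values.length - empty;
    return {
        filled: filled,                                        // non-empty values
        averageLength: filled > 0 ? totalLength / filled : 0,  // mean length of non-empty values
        tooShort: tooShort                                     // non-empty values below minLength
    };
}

// Sample data standing in for values read from incident records.
var sample = ['Printer offline', 'User cannot log in to VPN after password reset', '', null];
var summary = summarizeText(sample, 20);
```

Empty values are counted separately from short ones here, so that completeness and text quality can be reported as distinct metrics.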
| 17 | + |
| 18 | +## Setup Requirements |
| 19 | +1. **ServiceNow Instance** with Predictive Intelligence plugin enabled |
| 20 | +2. **Script Execution Permissions**: Run as a background script or Script Include with access to the `incident` table |
| 21 | +3. **No external dependencies**: Uses only standard ServiceNow APIs (GlideRecord, GlideAggregate, GlideDateTime) |
| 22 | +4. **Sufficient Data Volume**: At least 50 resolved/closed incidents recommended for meaningful analysis |
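A data-volume check along the lines of requirement 4 might look like the sketch below. The helper name `checkDataVolume` is hypothetical, and the state values `6` (Resolved) and `7` (Closed) are assumed to be the default incident states; adjust them if your instance uses different values. The aggregate constructor is passed in as a parameter so the function can also be exercised outside an instance with a stub.

```javascript
// Hypothetical helper: counts resolved/closed incidents and compares the count
// with a minimum (the requirements above recommend at least 50).
// AggregateCtor is expected to be GlideAggregate when run on an instance.
function checkDataVolume(AggregateCtor, minRecords) {
    var ga = new AggregateCtor('incident');
    // Assumption: 6 = Resolved and 7 = Closed (default incident state values).
    ga.addQuery('state', 'IN', '6,7');
    ga.addAggregate('COUNT');
    ga.query();
    var count = 0;
    if (ga.next()) {
        // getAggregate returns a string, so convert it to a number.
        count = parseInt(ga.getAggregate('COUNT'), 10) || 0;
    }
    return { count: count, sufficient: count >= minRecords };
}

// On an instance (background script), usage would be along these lines:
//   var result = checkDataVolume(GlideAggregate, 50);
//   gs.info('Resolved/closed incidents: ' + result.count);
```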
| 23 | + |
| 24 | +## How It Works |
| 25 | +1. **Field Existence Check**: Dynamically verifies that each key field exists on the incident table or its parent tables |
| 26 | +2. **Statistics Gathering**: Collects counts for total, resolved, and recent incidents |
| 27 | +3. **Completeness Analysis**: Calculates the percentage of records with each key field filled |
| 28 | +4. **Text Quality Analysis**: Measures average length and quality of description and close notes |
| 29 | +5. **Category Distribution**: Reports on the spread and diversity of incident categories |
| 30 | +6. **Resolution Time Analysis**: Evaluates how quickly incidents are resolved |
| 31 | +7. **Quality Scoring**: Combines all metrics into a single overall score |
| 32 | +8. **Log Output**: Prints all results and warnings to the ServiceNow logs for review |
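Steps 3 and 7 above can be illustrated with a small sketch. It operates on plain objects rather than GlideRecord rows, and the function names and weights are hypothetical: the README does not state how the script combines metrics, so the weighted average here is only one plausible approach.

```javascript
// Percentage (0-100) of records in which the given field is non-empty.
function completeness(records, field) {
    if (records.length === 0) {
        return 0;
    }
    var filled = 0;
    for (var i = 0; i < records.length; i++) {
        var value = records[i][field];
        if (value !== undefined && value !== null && String(value).trim() !== '') {
            filled++;
        }
    }
    return (filled / records.length) * 100;
}

// Weighted average of per-metric scores (each 0-100).
// The weighting scheme is an assumption made for illustration.
function overallScore(metrics, weights) {
    var weighted = 0;
    var totalWeight = 0;
    for (var name in weights) {
        if (weights.hasOwnProperty(name) && metrics.hasOwnProperty(name)) {
            weighted += metrics[name] * weights[name];
            totalWeight += weights[name];
        }
    }
    return totalWeight > 0 ? weighted / totalWeight : 0;
}

// Sample records standing in for incident rows.
var records = [
    { short_description: 'VPN down', category: 'network' },
    { short_description: 'Email bounce', category: '' },
    { short_description: '', category: null },
    { short_description: 'Disk full', category: 'hardware' }
];
var metrics = {
    short_description: completeness(records, 'short_description'),
    category: completeness(records, 'category')
};
var score = overallScore(metrics, { short_description: 3, category: 1 });
```

Keeping the per-field percentages separate from the combined score makes it easier to see which field is dragging the overall number down.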
| 33 | + |
| 34 | +## Customization |
| 35 | +- Adjust the `keyFields` array in the config section to match your organization's data requirements |
| 36 | +- Modify thresholds for text length, resolution time, and completeness as needed |
| 37 | +- Increase `sampleSize` for more detailed analysis if you have a large dataset |
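As an illustration of the customization points above, a config section might be shaped like the following sketch. Only `keyFields` and `sampleSize` are named in this README; every other property name, every default value, the `buildConfig` helper, and the custom field `u_business_service` are hypothetical.

```javascript
// Assumed shape of the config section. Only keyFields and sampleSize are named
// in the README; the remaining names and all default values are hypothetical.
var defaultConfig = {
    keyFields: ['short_description', 'description', 'category',
                'subcategory', 'close_notes', 'assignment_group'],
    sampleSize: 500,             // hypothetical default
    minDescriptionLength: 20,    // hypothetical text-length threshold (characters)
    maxResolutionHours: 720,     // hypothetical resolution-time threshold
    minCompletenessPercent: 80   // hypothetical completeness threshold
};

// Returns a new config object with overrides applied on top of the defaults.
// This is a shallow copy: arrays that are not overridden are shared with defaultConfig.
function buildConfig(overrides) {
    var config = {};
    var key;
    for (key in defaultConfig) {
        if (defaultConfig.hasOwnProperty(key)) {
            config[key] = defaultConfig[key];
        }
    }
    for (key in overrides) {
        if (overrides.hasOwnProperty(key)) {
            config[key] = overrides[key];
        }
    }
    return config;
}

// Example: a narrower field list with a custom field and a larger sample.
var config = buildConfig({
    keyFields: ['short_description', 'category', 'u_business_service'],
    sampleSize: 2000
});
```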
| 38 | + |
| 39 | +## Security & Best Practices |
| 40 | +- Do not run in production without review |
| 41 | +- Ensure no sensitive data is exposed in logs |
| 42 | +- Validate script results in a sub-production environment before using them for model training |
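One way to reduce the risk of sensitive data reaching the logs is to mask obvious identifiers before writing a message. The helper below is a hypothetical sketch, not part of the script: its two patterns (email addresses and long digit runs) are assumptions about what counts as sensitive, and they will not catch everything.

```javascript
// Hypothetical helper: masks email addresses and long digit runs before logging.
function redactForLog(text) {
    var value = String(text == null ? '' : text);
    // Mask anything that looks like an email address.
    value = value.replace(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, '[email]');
    // Mask runs of 7 or more digits (for example phone or account numbers).
    value = value.replace(/\d{7,}/g, '[number]');
    return value;
}

var line = redactForLog('Caller jane.doe@example.com, phone 5551234567, 3 retries');
```

On an instance, the result would be passed to the logger, for example `gs.info(redactForLog(message))`. Logging aggregate statistics only, rather than field values, avoids the problem more reliably.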