fix: update risk scoring thresholds to align with CVSS v3.x/v4.0 #6132
… in documentation and codebase
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: updating risk scoring thresholds to align with CVSS v3.x/v4.0 standards, which is reflected across all modified files. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining that risk score thresholds are being adjusted to match CVSS standards with a relevant reference link. |
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
site/docs/red-team/risk-scoring.md (1)
109-120: Update Scenario 3 label to “Medium”
With the new thresholds, a score of 5.65 now falls into the medium band (4.0-6.9). Please adjust the bolded classification (and any surrounding text) so the example matches the updated severity ranges.
🧹 Nitpick comments (2)
CHANGELOG.md (2)
7-12: Add PR number to comply with changelog rules.
The entry meets scope/prefix/category expectations, but it’s missing the PR reference required by our changelog rules. Append (#6132) at the end.
Apply this minimal diff:
- - refactor(redteam): update risk score thresholds to match CVSS v3.x/v4.0 standards (Critical: 9.0-10.0, High: 7.0-8.9, Medium: 4.0-6.9, Low: 0.1-3.9)
+ - refactor(redteam): update risk score thresholds to match CVSS v3.x/v4.0 standards (Critical: 9.0-10.0, High: 7.0-8.9, Medium: 4.0-6.9, Low: 0.1-3.9) (#6132)

As per coding guidelines.
7-12: Add a Documentation note if docs were updated.
If this PR also updated risk scoring docs/SLA mappings, add a single-line entry under “Documentation” with a conventional prefix and scope.
Example to add under Unreleased → Documentation:
+### Documentation
+
+- docs(redteam): align risk scoring docs and examples with CVSS v3.x/v4.0 severity ranges (#6132)

Based on learnings.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
CHANGELOG.md (1 hunk)
site/docs/red-team/risk-scoring.md (2 hunks)
src/redteam/riskScoring.ts (1 hunk)
🧰 Additional context used
📓 Path-based instructions (7)
src/redteam/**/*.ts
📄 CodeRabbit inference engine (src/redteam/CLAUDE.md)
src/redteam/**/*.ts: Always sanitize when logging test prompts or model outputs by passing them via the structured metadata parameter (second argument) to the logger, not raw string interpolation
Use the standardized risk severity levels: critical, high, medium, low when reporting results
Files:
src/redteam/riskScoring.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)
Prefer not to introduce new TypeScript types; reuse existing interfaces where possible
**/*.{ts,tsx}: Maintain consistent import order (Biome handles sorting)
Use consistent curly braces for all control statements
Prefer const over let and avoid var
Use object shorthand syntax when possible
Use async/await for asynchronous code
Use consistent error handling with proper type checks
**/*.{ts,tsx}: Use TypeScript with strict type checking enabled
Follow consistent import order (Biome will sort imports)
Use consistent curly braces for all control statements
Prefer const over let; avoid var
Use object property shorthand when possible
Use async/await for asynchronous code instead of raw promises/callbacks
When logging, pass sensitive data via the logger context object so it is auto-sanitized; avoid interpolating secrets into message strings
Manually sanitize sensitive objects with sanitizeObject before storing or emitting outside logging contexts
Files:
src/redteam/riskScoring.ts
src/**
📄 CodeRabbit inference engine (AGENTS.md)
Place core application/library logic under src/
Files:
src/redteam/riskScoring.ts
site/docs/**/*.md
📄 CodeRabbit inference engine (.cursor/rules/docusaurus.mdc)
site/docs/**/*.md: Prioritize minimal edits when updating existing documentation; avoid creating entirely new sections or rewriting substantial portions; focus edits on improving grammar, spelling, clarity, fixing typos, and structural improvements where needed; do not modify existing headings (h1, h2, h3, etc.) as they are often linked externally.
Structure content to reveal information progressively: begin with essential actions and information, then provide deeper context as necessary; organize information from most important to least important.
Use action-oriented language: clearly outline actionable steps users should take, use concise and direct language, prefer active voice over passive voice, and use imperative mood for instructions.
Use 'eval' instead of 'evaluation' in all documentation; when referring to command line usage, use 'npx promptfoo eval' rather than 'npx promptfoo evaluation'; maintain consistency with this terminology across all examples, code blocks, and explanations.
The project name can be written as either 'Promptfoo' (capitalized) or 'promptfoo' (lowercase) depending on context: use 'Promptfoo' at the beginning of sentences or in headings, and 'promptfoo' in code examples, terminal commands, or when referring to the package name; be consistent with the chosen capitalization within each document or section.
Each markdown documentation file must include required front matter fields: 'title' (the page title shown in search results and browser tabs) and 'description' (a concise summary of the page content, ideally 150-160 characters).
Only add a title attribute to code blocks that represent complete, runnable files; do not add titles to code fragments, partial examples, or snippets that aren't meant to be used as standalone files; this applies to all code blocks regardless of language.
Use special comment directives to highlight specific lines in code blocks: 'highlight-next-line' highlights the line immediately after the comment, 'highligh...
Files:
site/docs/red-team/risk-scoring.md
{site,examples}/**
📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)
Documentation-only changes (touching only site/ or examples/) must use docs: prefix in PR title
Files:
site/docs/red-team/risk-scoring.md
site/**
📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)
For feature changes, update relevant documentation under site/
Files:
site/docs/red-team/risk-scoring.md
CHANGELOG.md
📄 CodeRabbit inference engine (.cursor/rules/changelog.mdc)
CHANGELOG.md: Document all user-facing changes in CHANGELOG.md
Every pull request must add or update an entry in CHANGELOG.md under the [Unreleased] section
Follow Keep a Changelog structure under [Unreleased] with sections: Added, Changed, Fixed, Dependencies, Documentation, Tests, Removed
Each changelog entry must include the PR number formatted as (#1234) or temporary placeholder (#XXXX)
Each changelog entry must use a Conventional Commit prefix: feat:, fix:, chore:, docs:, test:, or refactor:
Each changelog entry must be concise and on a single line
Each changelog entry must be user-focused, describing what changed and why it matters to users
Each changelog entry must include a scope in parentheses, e.g., feat(providers): or fix(evaluator):
Use common scopes for consistency: providers, evaluator, webui or app, cli, redteam, core, assertions, config, database
Place all dependency updates under the Dependencies category
Place all test changes under the Tests category
Use categories consistently: Added for new features, Changed for modifications/refactors/CI, Fixed for bug fixes, Removed for removed features
After a PR number is assigned, replace (#XXXX) placeholders with the actual PR number
Be specific, use active voice, include context, and avoid repeating the PR title in changelog entries
Group related changes with multiple bullets in the same category when needed; use one entry per logical change
CHANGELOG.md: All user-facing changes require a CHANGELOG.md entry before creating a PR
Add entries under [Unreleased] in appropriate category (Added, Changed, Fixed, Dependencies, Documentation, Tests)
Each changelog entry must include PR number (#1234) or placeholder (#XXXX)
Use conventional commit prefixes in changelog entries (feat:, fix:, chore:, docs:, test:, refactor:)
CHANGELOG.md: Document all user-facing changes in CHANGELOG.md
Changelog entries must include the PR number in format (#1234)
Use conventional commit prefixes in changelog entries: feat:,...
Files:
CHANGELOG.md
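The entry-format rules listed above (conventional prefix, scope in parentheses, single line, trailing PR number or placeholder) can be checked mechanically. The following is a minimal sketch, not part of the project's tooling: the function name `isValidChangelogEntry` and the regular expression are assumptions made for illustration, and the pattern encodes only those four rules.

```typescript
// Hypothetical helper: checks one changelog bullet against the format rules
// quoted above. Both the name and the regex are illustrative assumptions.
const CHANGELOG_ENTRY =
  /^- (feat|fix|chore|docs|test|refactor)\(([a-z]+)\): .+ \(#(\d+|XXXX)\)$/;

function isValidChangelogEntry(entry: string): boolean {
  // "." does not match newlines and "$" anchors at the end of the string,
  // so multi-line entries are rejected as well.
  return CHANGELOG_ENTRY.test(entry);
}

// The entry from this PR, before and after appending the PR number.
const base =
  '- refactor(redteam): update risk score thresholds to match CVSS v3.x/v4.0 standards';
console.log(isValidChangelogEntry(base)); // false
console.log(isValidChangelogEntry(`${base} (#6132)`)); // true
```

A check like this would flag the missing PR reference noted in the nitpick comment, but it does not judge whether an entry is user-focused or placed under the right category.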
🧠 Learnings (10)
📓 Common learnings
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/redteam/CLAUDE.md:0-0
Timestamp: 2025-10-05T16:59:20.507Z
Learning: Applies to src/redteam/**/*.ts : Use the standardized risk severity levels: critical, high, medium, low when reporting results
📚 Learning: 2025-10-05T16:59:20.507Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/redteam/CLAUDE.md:0-0
Timestamp: 2025-10-05T16:59:20.507Z
Learning: Applies to src/redteam/**/*.ts : Use the standardized risk severity levels: critical, high, medium, low when reporting results
Applied to files:
src/redteam/riskScoring.ts
site/docs/red-team/risk-scoring.md
CHANGELOG.md
📚 Learning: 2025-10-24T22:41:44.088Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: .cursor/rules/gh-cli-workflow.mdc:0-0
Timestamp: 2025-10-24T22:41:44.088Z
Learning: Applies to CHANGELOG.md : Add entries under [Unreleased] in appropriate category (Added, Changed, Fixed, Dependencies, Documentation, Tests)
Applied to files:
CHANGELOG.md
📚 Learning: 2025-10-24T22:42:38.674Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-24T22:42:38.674Z
Learning: Applies to CHANGELOG.md : Add new entries under the 'Unreleased' section
Applied to files:
CHANGELOG.md
📚 Learning: 2025-10-24T22:41:09.485Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: .cursor/rules/changelog.mdc:0-0
Timestamp: 2025-10-24T22:41:09.485Z
Learning: Applies to CHANGELOG.md : Follow Keep a Changelog structure under [Unreleased] with sections: Added, Changed, Fixed, Dependencies, Documentation, Tests, Removed
Applied to files:
CHANGELOG.md
📚 Learning: 2025-10-27T08:53:44.103Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-10-27T08:53:44.103Z
Learning: Applies to CHANGELOG.md : Use standardized scopes: providers, webui, cli, assertions, api, config, deps, docs, tests, examples, redteam, site
Applied to files:
CHANGELOG.md
📚 Learning: 2025-10-24T22:41:09.485Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: .cursor/rules/changelog.mdc:0-0
Timestamp: 2025-10-24T22:41:09.485Z
Learning: Applies to CHANGELOG.md : Every pull request must add or update an entry in CHANGELOG.md under the [Unreleased] section
Applied to files:
CHANGELOG.md
📚 Learning: 2025-10-24T22:42:38.674Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-24T22:42:38.674Z
Learning: Applies to CHANGELOG.md : Document all user-facing changes in CHANGELOG.md
Applied to files:
CHANGELOG.md
📚 Learning: 2025-10-27T08:53:44.103Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-10-27T08:53:44.103Z
Learning: Applies to CHANGELOG.md : Keep entries concise, one line, user-focused (what changed, not how) under the correct category
Applied to files:
CHANGELOG.md
📚 Learning: 2025-10-24T22:41:09.485Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: .cursor/rules/changelog.mdc:0-0
Timestamp: 2025-10-24T22:41:09.485Z
Learning: Applies to CHANGELOG.md : Use categories consistently: Added for new features, Changed for modifications/refactors/CI, Fixed for bug fixes, Removed for removed features
Applied to files:
CHANGELOG.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: webui tests
- GitHub Check: Generate Assets
- GitHub Check: Build Docs
- GitHub Check: Share Test
- GitHub Check: Redteam (Production API)
- GitHub Check: Build on Node 24.x
- GitHub Check: Redteam (Staging API)
- GitHub Check: Test on Node 24.x and ubuntu-latest
- GitHub Check: Build on Node 20.x
- GitHub Check: Test on Node 24.x and windows-latest
- GitHub Check: Build on Node 22.x
- GitHub Check: Test on Node 20.x and macOS-latest
- GitHub Check: Test on Node 20.x and windows-latest
- GitHub Check: Test on Node 22.x and macOS-latest
- GitHub Check: Test on Node 22.x and windows-latest
- GitHub Check: Test on Node 22.x and ubuntu-latest
- GitHub Check: Test on Node 20.x and ubuntu-latest
- GitHub Check: Style Check
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (1)
src/redteam/riskScoring.ts (1)
181-188: Threshold update aligns with CVSS bands
The new cutoffs (≥9.0 critical, ≥7.0 high, ≥4.0 medium) match the qualitative ranges defined in CVSS v3.x/v4.0, so the mapping now reflects the standard terminology precisely. (first.org)
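As a minimal sketch of the cutoffs described above (this is not the actual code in src/redteam/riskScoring.ts, which is not shown in this conversation), the mapping could look like the following. The names `Severity` and `scoreToSeverity` are hypothetical, and reporting every score below 4.0 as low is an assumption; CVSS itself rates a score of 0.0 as None.

```typescript
// Hypothetical names; only the cutoffs (9.0, 7.0, 4.0) come from the review.
type Severity = 'critical' | 'high' | 'medium' | 'low';

function scoreToSeverity(score: number): Severity {
  if (score >= 9.0) {
    return 'critical';
  }
  if (score >= 7.0) {
    return 'high';
  }
  if (score >= 4.0) {
    return 'medium';
  }
  // Assumption: anything below 4.0 is reported as low.
  return 'low';
}

// The Scenario 3 score from the docs comment lands in the medium band.
console.log(scoreToSeverity(5.65)); // medium
```

Checking the boundaries from highest to lowest keeps each band's lower bound inclusive, which matches the "≥" cutoffs quoted in the comment.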
⏩ No test execution environment matched (1f57bf4)
Summary
Adjusts the risk score thresholds to match the qualitative severity ranges defined by CVSS: https://nvd.nist.gov/vuln-metrics/cvss