feat: Add error ratio-based circuit breaking policy to api-breaker plugin #12765
…ugin

- Add new 'unhealthy-ratio' policy that triggers circuit breaker based on error rate within sliding time window
- Implement three-state circuit breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED
- Add configurable parameters: error_ratio, min_request_threshold, sliding_window_size, permitted_number_of_calls_in_half_open_state, success_ratio
- Maintain full backward compatibility with existing 'unhealthy-count' policy as default
- Add comprehensive test coverage for new functionality
- Update documentation in both Chinese and English
- Follow APISIX coding standards and testing conventions

This enhancement provides more intelligent circuit breaking for microservices architectures by considering error rates rather than just consecutive failure counts.
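The three-state flow described in this commit message can be sketched roughly as follows. This is an illustrative Python model, not the plugin's Lua code: the names `RatioBreaker`, `State`, and the `clock` parameter are hypothetical, the window is assumed to reset wholesale once `sliding_window_size` seconds have elapsed, the ratio comparisons are assumed to be inclusive (`>=`), and `max_breaker_sec` is treated as a fixed open duration for simplicity.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class RatioBreaker:
    """Illustrative error-ratio circuit breaker (hypothetical, not the plugin)."""

    def __init__(self, error_ratio=0.5, min_request_threshold=10,
                 sliding_window_size=300, half_open_calls=3,
                 success_ratio=0.6, max_breaker_sec=60, clock=time.monotonic):
        self.error_ratio = error_ratio
        self.min_request_threshold = min_request_threshold
        self.sliding_window_size = sliding_window_size
        self.half_open_calls = half_open_calls
        self.success_ratio = success_ratio
        self.max_breaker_sec = max_breaker_sec
        self.clock = clock  # injectable so tests can control time
        self.state = State.CLOSED
        self._reset_window()

    def _reset_window(self):
        # Assumption: statistics are cleared wholesale when the window expires.
        self.window_start = self.clock()
        self.total = 0
        self.failures = 0

    def _trip(self):
        self.state = State.OPEN
        self.opened_at = self.clock()

    def allow(self):
        """Return True if a request may be forwarded upstream."""
        now = self.clock()
        if self.state is State.OPEN:
            if now - self.opened_at < self.max_breaker_sec:
                return False
            # Open period elapsed: probe with a limited call budget.
            self.state = State.HALF_OPEN
            self.probes_sent = 0
            self.probes_done = 0
            self.probe_successes = 0
        if self.state is State.HALF_OPEN:
            if self.probes_sent >= self.half_open_calls:
                return False  # probe budget exhausted
            self.probes_sent += 1
            return True
        # CLOSED: clear stale statistics once the window has expired.
        if now - self.window_start >= self.sliding_window_size:
            self._reset_window()
        return True

    def record(self, success):
        """Record the outcome of a request that allow() let through."""
        if self.state is State.OPEN:
            return
        if self.state is State.HALF_OPEN:
            self.probes_done += 1
            self.probe_successes += int(success)
            if self.probes_done >= self.half_open_calls:
                ratio = self.probe_successes / self.probes_done
                if ratio >= self.success_ratio:
                    self.state = State.CLOSED
                    self._reset_window()
                else:
                    self._trip()  # HALF_OPEN -> OPEN fallback
            return
        self.total += 1
        self.failures += int(not success)
        if (self.total >= self.min_request_threshold
                and self.failures / self.total >= self.error_ratio):
            self._trip()
```

Passing a fake `clock` (for example `lambda: now[0]` over a mutable list) makes it possible to step through window expiry and the open period deterministically without sleeping.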
Baoyuantop
left a comment
Thanks for your contribution! Based on the current configuration, we need to add some test cases:
- After the sliding window time (sliding_window_size) expires, are the statistics (total number of requests, number of failures) correctly cleared?
- Failure fallback in half-open state (Half-Open -> Open)
- Sending more requests than permitted_number_of_calls_in_half_open_state in half-open state
t/plugin/api-breaker2.t
Outdated
=== TEST $((${1}+1)): hit route (return 200)
--- request
GET /api_breaker
--- response_body
hello world



=== TEST $((${1}+1)): hit route and return 500 (first failure)
--- request
GET /api_breaker?code=500
--- error_code: 500
--- response_body
fault injection!
We can make multiple requests in a single case; you can refer to the tests in api-breaker.t
May I ask if there is an official APISIX test image? It is very difficult to set up an environment for running .t files locally.
Hi @HaoTien, can the test run in the environment?
I ran into some issues when building the development environment with Dev Containers on macOS:
Installing https://luarocks.org/penlight-1.14.0-3.src.rock
Missing dependencies for penlight 1.14.0-3:
luafilesystem (not installed)
penlight 1.14.0-3 depends on luafilesystem (not installed)
Installing https://luarocks.org/luafilesystem-1.8.0-1.src.rock
Error: LuaRocks 3.12.0 bug (please report at https://github.com/luarocks/luarocks/issues).
Arch.: linux-aarch64
/usr/local/share/lua/5.1/luarocks/fetch.lua:139: attempt to concatenate local 'name' (a nil value)
stack traceback:
/usr/local/share/lua/5.1/luarocks/fetch.lua:196: in function 'fetch_url'
/usr/local/share/lua/5.1/luarocks/fetch.lua:85: in function 'fetch_caching'
/usr/local/share/lua/5.1/luarocks/fetch.lua:243: in function 'fetch_url_at_temp_dir'
/usr/local/share/lua/5.1/luarocks/fetch.lua:347: in function 'fetch_and_unpack_rock'
/usr/local/share/lua/5.1/luarocks/cmd/build.lua:66: in function 'build_rock'
/usr/local/share/lua/5.1/luarocks/cmd/build.lua:125: in function 'do_build'
/usr/local/share/lua/5.1/luarocks/cmd/build.lua:171: in function 'installer'
/usr/local/share/lua/5.1/luarocks/deps.lua:237: in function 'fulfill_dependency'
/usr/local/share/lua/5.1/luarocks/deps.lua:332: in function 'process_dependencies'
/usr/local/share/lua/5.1/luarocks/build.lua:404: in function 'build_rockspec'
...
/usr/local/share/lua/5.1/luarocks/cmd/build.lua:125: in function 'do_build'
/usr/local/share/lua/5.1/luarocks/cmd/build.lua:171: in function 'installer'
/usr/local/share/lua/5.1/luarocks/deps.lua:237: in function 'fulfill_dependency'
/usr/local/share/lua/5.1/luarocks/deps.lua:332: in function 'process_dependencies'
/usr/local/share/lua/5.1/luarocks/build.lua:404: in function 'do_build'
/usr/local/share/lua/5.1/luarocks/cmd/build.lua:171: in function </usr/local/share/lua/5.1/luarocks/cmd/build.lua:138>
[C]: in function 'xpcall'
/usr/local/share/lua/5.1/luarocks/cmd.lua:794: in function 'run_command'
/usr/local/bin/luarocks:38: in main chunk
[C]: at 0xb537f8c98d94
make: *** [Makefile:134: deps] Error 99
[64668 ms] postCreateCommand from devcontainer.json failed with exit code 2. Skipping any further user-provided commands.
…open_state; insert some test cases
feat: Add error ratio-based circuit breaking policy to api-breaker plugin
What this PR does / why we need it
This PR implements error ratio-based circuit breaking (unhealthy-ratio policy) for the api-breaker plugin, providing more intelligent and adaptive circuit breaking behavior based on error rates within a sliding time window, rather than just consecutive failure counts.

Closes #12763
Types of changes
Description
Current Limitations
New Features Added
- unhealthy-ratio policy that triggers circuit breaker based on error rate within a sliding time window

New Configuration Parameters
| Parameter | Default |
|---|---|
| policy | "unhealthy-count" |
| unhealthy.error_ratio | 0.5 |
| unhealthy.min_request_threshold | 10 |
| unhealthy.sliding_window_size | 300 |
| unhealthy.permitted_number_of_calls_in_half_open_state | 3 |
| healthy.success_ratio | 0.6 |

Example Configuration
    {
      "plugins": {
        "api-breaker": {
          "break_response_code": 503,
          "policy": "unhealthy-ratio",
          "max_breaker_sec": 60,
          "unhealthy": {
            "http_statuses": [500, 502, 503, 504],
            "error_ratio": 0.5,
            "min_request_threshold": 10,
            "sliding_window_size": 300,
            "permitted_number_of_calls_in_half_open_state": 3
          },
          "healthy": {
            "http_statuses": [200, 201, 202],
            "success_ratio": 0.6
          }
        }
      }
    }

How Has This Been Tested?
Test Results
Files Modified
- apisix/plugins/api-breaker.lua - Core plugin logic with new ratio-based policy
- t/plugin/api-breaker2.t - New comprehensive test file for ratio-based circuit breaking
- docs/en/latest/plugins/api-breaker.md - Updated English documentation
- docs/zh/latest/plugins/api-breaker.md - Updated Chinese documentation

Checklist
Additional Notes
This implementation:
The feature addresses real-world use cases for:
Ready for review and feedback!