|
1 | | -# Incident Response Recommendations |
2 | | - |
3 | | -In this section, we provide recommendations for formulating a robust incident response plan. |
4 | | - |
5 | | -- [ ] **Identify specific individuals or roles responsible for carrying out the mitigations (deploying smart contracts, pausing contracts, upgrading the front end, etc.).** |
6 | | - - Defining these roles will enhance the incident response plan and facilitate the execution of mitigation actions when necessary. |
7 | | -- [ ] **Document internal processes in cases where deployed remediation fails or introduces new bugs.** |
8 | | - - Consider developing a fallback plan that outlines an action strategy for failed remediation attempts. |
9 | | -- [ ] **Provide a clear description of the intended contract deployment process.** |
10 | | -- [ ] **Consider whether and under what circumstances your company will compensate affected users in the event of certain issues.** |
11 | | - - Some situations to consider include individual or aggregate losses, losses resulting from user error, contract flaws, and third-party contract flaws. |
12 | | -- [ ] **Outline a plan for staying informed about new issues, so as to inform future development and enhance the security of the deployment toolchain and the external on-chain and off-chain services your system depends on.** |
13 | | - - For each language and component, identify reputable sources of vulnerability news. Subscribe to updates for each source. Consider creating a private Discord or Slack channel with a bot that posts the latest vulnerability news to help your team stay informed in a centralized location. Additionally, consider assigning specific team members to track vulnerability news for particular system components. |
14 | | -- [ ] **Examine scenarios involving issues that would indirectly affect the system.** |
15 | | -- [ ] **Decide when and how the team should seek assistance from or collaborate with external parties (auditors, affected users, other protocol developers, etc.).** |
16 | | - - Some problems may necessitate cooperation with external parties for efficient resolution. |
17 | | -- [ ] **Define abnormal contract behavior for off-chain monitoring purposes.** |
18 | | - - Consider implementing more robust detection and mitigation solutions, including specific alternate endpoints, queries for diverse data, status pages, and support contacts for impacted services. |
19 | | -- [ ] **Combine issues to evaluate whether new detection and mitigation scenarios are necessary.** |
20 | | -- [ ] **Conduct periodic dry runs of specific scenarios in the incident response plan to identify gaps and improvement opportunities, and build muscle memory.** |
21 | | - - Establish intervals for performing dry runs for each scenario. Conduct more frequent dry runs for scenarios with higher likelihoods of occurrence. Create a template to document improvements required after each dry run for the incident response plan. |
| 1 | +# Incident Response Guidelines |
| 2 | + |
| 3 | +How you respond during an incident is a direct reflection of your preparatory efforts. Shift from a reactive approach to a **proactive** one by planning with the assumption that incidents are inevitable. To get the most out of the following guidelines, apply them throughout application development rather than only at the final stage. |
| 4 | + |
| 5 | +## Application design |
| 6 | + |
| 7 | +- **Identify which components should or should not be**: |
| 8 | + - **Pausable**. While pausing a component can be beneficial during an incident, you must assess its potential impact on other contracts. |
| 9 | + - **Migratable or upgradeable**. Discovering a bug might necessitate a [migration strategy](https://blog.trailofbits.com/2018/10/29/how-contract-migration-works/) or a contract upgrade to fix the issue. However, always be aware that upgradeability has its own [set of risks](https://blog.trailofbits.com/2020/12/16/breaking-aave-upgradeability/). Making all contracts upgradeable might not be the best approach. |
| 10 | + - **Decentralized**. Using decentralized components can sometimes restrict rescue measures. |
| 11 | +- **Evaluate what events are needed**. A missed event in a critical spot might result in unnoticed incidents. |
| 12 | +- **Evaluate what components must be on-chain and off-chain**. On-chain components are generally more at risk, but off-chain components push the risks to the off-chain owner. |
| 13 | +- **Use an access control that allows fine-grained access**. Avoid setting all access controls to be available to an EOA. Opt for multisig wallets/MPC, and segregate access (e.g., the key responsible for setting fees shouldn't have access to the upgradeability feature). |
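The access segregation described above can be checked mechanically. The following is a minimal sketch, assuming a hand-maintained map of privileged roles to their holders; the role names (`FEE_SETTER`, `PAUSER`, `UPGRADER`), the holder names, and the `audit_roles` helper are all hypothetical and only illustrate the idea of flagging EOAs and keys that hold more than one privileged role.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Holder:
    name: str
    kind: str  # "eoa", "multisig", or "mpc"


# Hypothetical role assignments, for illustration only.
ROLES = {
    "FEE_SETTER": Holder("fees-multisig", "multisig"),
    "PAUSER": Holder("ops-multisig", "multisig"),
    "UPGRADER": Holder("governance-multisig", "multisig"),
}


def audit_roles(roles):
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings = []
    seen = {}  # holder name -> first privileged role seen for that holder
    for role, holder in sorted(roles.items()):
        # Flag privileged roles controlled by a single externally owned account.
        if holder.kind == "eoa":
            findings.append(f"{role} is held by an EOA ({holder.name})")
        # Flag holders that control more than one privileged role.
        if holder.name in seen:
            findings.append(f"{holder.name} holds both {seen[holder.name]} and {role}")
        else:
            seen[holder.name] = role
    return findings
```

Running such a check in CI against the documented role assignments is one possible way to notice when a deployment drifts back toward a single all-powerful key.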
| 14 | + |
| 15 | +## Documentation |
| 16 | + |
| 17 | +- **Document how to interpret abnormal event emissions**. Only emitting events isn't sufficient; proper documentation is crucial, and users should be empowered to decode them. |
| 18 | +- **Document how to access the wallets**. Both the location and the access procedure for every wallet should be clear and straightforward. |
| 19 | +- **Document the deployment and upgrade process**. Deployment and upgrade are risky processes, and must be thoroughly documented. This should include how to test the deployment/upgrade (e.g., using fork testing) and how to validate it. |
| 20 | +- **Document how to contact the users and external dependencies**. Define guidelines regarding which stakeholders to contact, including the timing and mode of communication in case of incidents. |
| 21 | + |
| 22 | +## Process |
| 23 | + |
| 24 | +- **Conduct periodic training and incident response exercises**. Regular sessions keep employees up to date and help highlight flaws in the current incident response protocol. |
| 25 | +- **Identify incident owners, with at least**: |
| 26 | + - **A technical lead**. Responsible for gathering and centralizing technical data. |
| 27 | + - **A communication lead**. Tasked with internal and external communication. |
| 28 | + - **A legal lead**. Either provides legal advice or ensures the right legal entities are contacted. Consider also liaising with the appropriate law enforcement agencies. |
| 29 | +- **Use automated monitoring tools**. Whether you opt for an in-house solution or third-party products, automation is key. Exercise caution when considering automated responses, such as pausing the system when irregular activity is detected. Without careful configuration, automatic responses might inadvertently facilitate denial-of-service (DoS) exploits. |
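One way to limit the DoS risk of automated responses is to require several alerts within a short window before pausing, and to hand control to humans if another pause would follow too soon after the previous one. The sketch below assumes this policy; the `PauseGuard` class, its thresholds, and the action names (`notify`, `pause`, `escalate`) are hypothetical and not taken from any specific monitoring product.

```python
from collections import deque


class PauseGuard:
    """Decide whether alerts should trigger an automatic pause (hypothetical policy)."""

    def __init__(self, min_alerts=3, window_s=60.0, cooldown_s=3600.0):
        self.min_alerts = min_alerts  # alerts required inside the window
        self.window_s = window_s      # sliding window length, in seconds
        self.cooldown_s = cooldown_s  # minimum delay between automatic pauses
        self._alerts = deque()
        self._last_pause = None

    def on_alert(self, now):
        """Record an alert at time `now` (seconds) and return the action to take."""
        self._alerts.append(now)
        # Drop alerts that have fallen out of the sliding window.
        while self._alerts and now - self._alerts[0] > self.window_s:
            self._alerts.popleft()
        if len(self._alerts) < self.min_alerts:
            return "notify"  # not enough evidence yet: alert the team only
        if self._last_pause is not None and now - self._last_pause < self.cooldown_s:
            return "escalate"  # repeated triggers: let humans decide
        self._last_pause = now
        self._alerts.clear()
        return "pause"
```

With the defaults, a single spurious alert only notifies the team, so an attacker cannot pause the system by triggering one cheap anomaly, and repeated triggers inside the cooldown are escalated to the incident owners instead of pausing again.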
| 30 | + |
| 31 | +## Threat Intelligence |
| 32 | + |
| 33 | +- **Identify similar protocols, and stay informed of related compromises**. Being aware of vulnerabilities in similar systems can help preemptively address potential threats in your own. |
| 34 | +- **Identify dependencies, and monitor their behaviors to be alerted in case of compromise.** Follow their Twitter, Discord, newsletters, etc. |
| 35 | +- **Maintain open communication lines with your dependencies' owners**. This will help you stay informed if one of your dependencies is compromised. |
| 36 | +- **Subscribe to https://newsletter.blockthreat.io/**. BlockThreat will help you stay informed about recent incidents. |
| 37 | + |
| 38 | +Additionally, consider conducting a threat modeling exercise. This will identify the risks an application faces at both structural and operational levels. If you're interested in undertaking such an exercise, [contact us](https://www.trailofbits.com/contact/). |
22 | 39 |
|
23 | 40 | ## Incident Response Plan Resources |
24 | 41 |
|
25 | 42 | - [How to Hack the Yield Protocol](https://docs.yieldprotocol.com/#/operations/how_to_hack) |
26 | 43 | - [Emergency Steps – Yearn](https://github.com/yearn/yearn-devdocs/blob/master/docs/developers/v2/EMERGENCY.md) |
| 44 | +- [Monitoring & Incident Response - Heidi Wilder (DSS 2023)](https://www.youtube.com/watch?v=TDlkkg8N0wc) |
27 | 45 |
|
28 | | -## Examples of Well-Handled Incidents |
| 46 | +### Examples of incident retrospectives |
29 | 47 |
|
30 | 48 | - [Yield Protocol](https://medium.com/yield-protocol/post-mortem-of-incident-on-august-5th-2022-7bb70dbb9ada) |