Axway-API-Management-Plus
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 2 additions & 1 deletion
diff --git a/README.md b/README.md
Lines changed: 99 additions & 13 deletions
diff --git a/apibuilder4elastic/conf/default.js b/apibuilder4elastic/conf/default.js
Lines changed: 3 additions & 0 deletions
diff --git a/apibuilder4elastic/custom_flow_nodes/api-builder-plugin-elk-solution-utils/src/actions.js b/apibuilder4elastic/custom_flow_nodes/api-builder-plugin-elk-solution-utils/src/actions.js
Lines changed: 75 additions & 1 deletion
diff --git a/apibuilder4elastic/custom_flow_nodes/api-builder-plugin-elk-solution-utils/src/flow-nodes.yml b/apibuilder4elastic/custom_flow_nodes/api-builder-plugin-elk-solution-utils/src/flow-nodes.yml
Lines changed: 43 additions & 1 deletion
@@ -21,13 +21,14 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
   - `LOGSTASH_ELASTICSEARCH_SSL_VERIFICATIONMODE` to configure Logstash to Elasticsearch certificate validation [#156](https://github.com/Axway-API-Management-Plus/apigateway-openlogging-elk/issues/156)
   - `FILEBEAT_ELASTICSEARCH_SSL_VERIFICATIONMODE` to configure Filebeat to Elasticsearch certificate validation [#156](https://github.com/Axway-API-Management-Plus/apigateway-openlogging-elk/issues/156)
   - Helm chart supports this by the new parameter: `validateElasticsearchCertificate` per component [#156](https://github.com/Axway-API-Management-Plus/apigateway-openlogging-elk/issues/156)
+- Support for configuring the retention period of indexed data instead of using hardcoded ILM settings [#160](https://github.com/Axway-API-Management-Plus/apigateway-openlogging-elk/issues/160)
 
 ### Fixed
 - APIBuilder4Elastic - The Swagger for this service is invalid - Duplicate operationId renamed [#158](https://github.com/Axway-API-Management-Plus/apigateway-openlogging-elk/issues/158)
 
 ### Security
 - Custom-Flow-Nodes dependencies updated to solve security issue https://github.com/advisories/GHSA-74fj-2j2h-c42q
-- API-Builder version update to version Exeter
+- API-Builder version update to version Exeter to solve security issues
 
 ## [4.0.3] 2021-12-20
 ### Changed
 
@@ -832,23 +832,109 @@ So your alerts should report a critical alert before 90%. For more information,
 ## Lifecycle Management
 
 Since new data is continuously stored in Elasticsearch in various indexes, these must of course be removed after a certain period of time.  
-Since version 2.0.0, the solution uses the Elasticsearch [ILM](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-lifecycle-management.html) feature for this purpose, which defines different lifecycle stages per index. The so-called ILM policies are automatically configured by the solution using [configuration files](apibuilder4elastic/elasticsearch_config) and can be reviewed in Kibana.  
+Since version 2.0.0, the solution uses the Elasticsearch [ILM](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-lifecycle-management.html) feature for this purpose, which defines different lifecycle stages per index. The so-called ILM policies are automatically configured by the solution with default values using [configuration files](apibuilder4elastic/elasticsearch_config) and can be reviewed in Kibana. Beginning with version 4.1.0, you can also configure the lifecycle of the data yourself according to your requirements.  
 The indices pass through stages such as Hot, Warm, Cold which can be used to deploy different performance hardware per stage. This means that traffic details from two weeks ago no longer have to be stored on high-performance machines.  
 
-The configuration is defined here per data type (e.g. Summary, Details, Audit, ...). The following table gives an overview.  
+The configuration is defined per data type (e.g. Summary, Details, Audit, ...). The following table gives an overview of the default values. The value that determines the retention period is the Delete column: it is the number of days for which the data is guaranteed to be available. More information on how the lifecycle works can be found later in this section. You can use the other phases, for example, to allocate less expensive resources.
 
-| Data-Type              | Description                                                            | Hot (Size/Days) | Warm    | Cold    | Delete  | Total   |
-| :---                   |:---                                                                    | :---            | :---    | :---    | :---    | :---    |
-| **Traffic-Summary**    | Main index for traffic-monitor overview and primary dashboard          | 30GB / 7 days   | 5 days  | 3 days  | 0 days  | 15 days |
-| **Traffic-Details**    | Details in Traffic-Monitor for Policy, Headers and Payload reference   | 30GB / 7 days   | 5 days  | 3 days  | 0 days  | 15 days |
-| **Traffic-Trace**      | Trace-Messages belonging to an API-Request shown in Traffic-Monitor    | 30GB / 7 days   | 5 days  | 3 days  | 0 days  | 15 days |
-| **General-Trace**      | General trace messages, like Start- & Stop-Messages                    | 30GB / 7 days   | 5 days  | 3 days  | 0 days  | 15 days |
-| **Gateway-Monitoring** | System status information (CPU, HDD, etc.) from Event-Files            | 30GB / 60 days  | 30 days | 15 days | 0 days  | 105 days|
-| **Domain-Audit**       | Domain Audit-Information as configured in Admin-Node-Manager           | 10GB / 270 days | 270 days| 720 days| 30 days | >3 years|
+| Data-Type              | Description                                                            | Hot (Rollover) | Warm    | Cold    | __Delete__  |
+| :---                   |:---                                                                    | :---           | :---    | :---    | :---        |
+| **Traffic-Summary**    | Main index for traffic-monitor overview and primary dashboard          | 30GB / 7d      | 0d      | 12d     | __15d__     |
+| **Traffic-Details**    | Details in Traffic-Monitor for Policy, Headers and Payload reference   | 30GB / 7d      | 0d      | 12d     | __15d__     |
+| **Traffic-Trace**      | Trace-Messages belonging to an API-Request shown in Traffic-Monitor    | 30GB / 7d      | 0d      | 12d     | __15d__     |
+| **General-Trace**      | General trace messages, like Start- & Stop-Messages                    | 30GB / 7d      | 0d      | 12d     | __15d__     |
+| **Gateway-Monitoring** | System status information (CPU, HDD, etc.) from Event-Files            | 30GB / 60d     | 0d      | 90d     | __105d__    |
+| **Domain-Audit**       | Domain Audit-Information as configured in Admin-Node-Manager           | 10GB / 270d    | 270d    | 720d    | __750d__    |
 
-Please note:  
-:point_right: It's optional to use different hardware per stage  
-:point_right: Do not change the ILM/Modify the ILM-Policies manually, as they are configured automatically. In a later version, the solution will provide options to customize the time range as needed without breaking updates.  
+### Configure the lifecycle
+
+As of version 4.1.0, you can configure how long the indexed data should be kept in Elasticsearch. Before starting, you should read and understand the following information thoroughly, because once deleted, data cannot be recovered.  
+Individual API transactions are stored as documents in Elasticsearch indices. However, individual documents are never deleted on their own; instead, an entire index with millions of transactions/documents is always deleted at once. Therefore, you can only control the retention period for an entire index, not per document.  
+When API transactions are stored in an index, the size of the index increases accordingly. To prevent an index from growing infinitely, it is rolled over once it reaches a certain age or size. A new active index is created, which receives all newly written data. It replaces the old index, which from then on is only used for reading. This process is called [rollover](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-rollover.html).  
+
+In order not to have to control this process manually, there are so-called [Index Lifecycle Management](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-lifecycle-management.html) (ILM) policies in Elasticsearch, which perform the rollover based on defined rules and then send the index through further phases for various purposes.  
+
+These ILM policies are configured automatically by the solution with default values and are stored and managed for each index in Elasticsearch. The default values result in the data being available for at least 2 weeks.  
+
+If you would like to customize the lifecycle, then, as of version 4.1.0, you can provide a configuration file and reference it with the parameter `RETENTION_PERIOD_CONFIG`. It is used to adapt the ILM policies accordingly.  
+
+Here is an example:  
+```json
+{
+    "retentionPeriods": {
+        "apigw-traffic-summary": {
+            "rollover": {
+                "max_age": "7d",
+                "max_size": "15gb"
+            }, 
+            "retentionPeriod": "7d"
+        }, 
+        "apigw-traffic-details": {
+            "rollover": {
+                "max_age": "7d",
+                "max_size": "15gb"
+            }, 
+            "retentionPeriod": "6d"
+        }, 
+        "apigw-traffic-trace": {
+            "rollover": {
+                "max_age": "7d",
+                "max_primary_shard_size": "15gb"
+            }, 
+            "retentionPeriod": "5d"
+        }
+    }
+}
+```
+
+The configuration is defined per index and is divided into two parts: when the rollover should happen, and how many days after the rollover the data should still be available.  
+The following figure illustrates the process:  
+
+![Lifecycle details](imgs/index-ilm-details.png)  
+
+__1. Create your retention period config file__  
+
+Create a new file for your retention period configuration. For example: `config/custom-retention-period.json`. As a template, you can use the file: `config/my-retention-period-sample.json`. 
+
+__2. Define the rollover__  
+
+It is important to understand that the time period until the rollover of an index is not exactly fixed.  
+For example, if you specify a maximum age and a maximum size for an index, then the index will be rolled over as soon as one of the conditions is met.  
+
+- If the maximum size is too small for your transaction volume, then an index can meet the size condition in less than 24 hours and will be rolled over. 
+- If the maximum size is too large, the index will be rolled over when it reaches the maximum age (e.g. after 7 days).  
+
+So the total time for which the data of an index is available is the period from the index's initial creation to the rollover __plus__ the period until the delete. As the rollover date cannot be defined exactly, you need to monitor your system and adjust the lifecycle to get the desired retention time.  
+
+You can use the following conditions for the rollover:
+ - `max_age`: Defines the maximum age of an index until it is rolled over
+ - `max_size`: The maximum index size. As an index has a Primary and a Replica, the required disk space is doubled (`max_size: 30gb` results in 60GB of disk space used)
+ - `max_primary_shard_size`: Starting with Elasticsearch version 7.13, you can also define the maximum primary shard size of an index. All indexes, except apigw-management-kpis and apigw-domainaudit, have 5 shards, so you have to multiply the specified size by 5 to get the index size.
+
+For more information please read: [ILM Rollover options](https://www.elastic.co/guide/en/elasticsearch/reference/current/ilm-rollover.html#ilm-rollover-options)
+
+__3. Define the retention period__  
+
+With the parameter `retentionPeriod`, you define the time period for which the data is guaranteed to be available. As already described, the time until the rollover of the index is added to this. You can only specify days here.
+
+__4. Apply the configuration__  
+
+The last step is to reference your configuration file in your `.env` file with the parameter: `RETENTION_PERIOD_CONFIG=./config/custom-retention-period.json` and restart API Builder.
+
+`docker-compose stop apibuilder4elastic`
+`docker-compose up`
+
+You can check in Kibana whether the ILM policy has been adjusted accordingly. To do this, go to Stack Management --> Index Lifecycle Policies, open the corresponding policy, and check the phases.
+
+__Further notes:__
+
+- Changes to the ILM-Policy have no influence on indices that have already been rolled over, as these have already entered lifecycle management
+- Indexes should not be too small, as this increases the load on Elasticsearch too much. 
+  - For each active index there are 5 Primary- and 5 Replica-Shards. 
+  - Each shard corresponds to a Lucene instance, which consumes corresponding resources. 
+  - The smaller an index, the more indexes, the more shards, the more resources are needed. 
+  - Elastic's recommendation is 30GB. The solution does not allow an index size below 5GB.
+- It's optional to use different hardware per stage  
 
 <p align="right"><a href="#table-of-content">Top</a></p>
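
Review note on the README change above: the relationship between rollover and retention period can be made concrete with a small calculation. The following is only a sketch under the assumptions stated in the README text (data is guaranteed for `retentionPeriod` days after the rollover, and the time until the rollover is added on top). The helper `estimateAvailability` is hypothetical and not part of the solution.

```javascript
// Hypothetical helper, not part of the solution: estimates how long the data
// of an index stays available, based on one entry of the retention period config.
function estimateAvailability(entry) {
  const retentionDays = parseInt(entry.retentionPeriod, 10);
  const maxAgeDays = entry.rollover && entry.rollover.max_age
    ? parseInt(entry.rollover.max_age, 10)
    : 0;
  return {
    // Documents written shortly before the rollover only get the retention period.
    guaranteedDays: retentionDays,
    // Documents written right after index creation can live until the rollover
    // (at most max_age days later) plus the retention period.
    upToDays: maxAgeDays + retentionDays,
  };
}

const trafficSummary = {
  rollover: { max_age: '7d', max_size: '15gb' },
  retentionPeriod: '7d',
};
console.log(estimateAvailability(trafficSummary)); // { guaranteedDays: 7, upToDays: 14 }
```

If the size condition triggers the rollover earlier than `max_age`, the real value lies somewhere between the two numbers, which is why the README recommends monitoring the system.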
 
 
@@ -33,6 +33,9 @@ module.exports = {
 	managementKPIsInterval: process.env.MANAGEMENT_KPIS_INTERVAL || '3600000',
 	managementKPIsEnabled: ("false" == process.env.MANAGEMENT_KPIS_ENABLED) ? false : true, 
 
+	// This path is optional and, if given, is used to adjust the ILM-Configuration.
+	retentionPeriodConfigFile: process.env.RETENTION_PERIOD_CONFIG || 'NotSet',
+
 	// These version are used, that Filebeat and Logstash are configured as required 
 	// by the API-Builder release
 	versions: {
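
Review note: the diff does not show where `retentionPeriodConfigFile` is read. A minimal sketch of how a caller might turn the `'NotSet'` default into "no custom configuration" is shown here. The function name `loadRetentionPeriodConfig` is an assumption for illustration and does not appear in this change.

```javascript
const fs = require('fs');

// Hypothetical loader: returns undefined when no file is configured, so that
// a caller can fall back to the default ILM policies.
function loadRetentionPeriodConfig(configFile) {
  if (!configFile || configFile === 'NotSet') {
    return undefined;
  }
  // Throws if the file is missing or does not contain valid JSON.
  const config = JSON.parse(fs.readFileSync(configFile, 'utf8'));
  if (!config.retentionPeriods) {
    throw new Error(`The file ${configFile} must contain a retentionPeriods object.`);
  }
  return config;
}

// Without a configured file the default ILM policies stay in place.
console.log(loadRetentionPeriodConfig('NotSet')); // undefined
```

Failing fast on an unreadable or incomplete file seems preferable to silently using defaults, because a wrong retention period leads to deleted data.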
 
@@ -52,6 +52,8 @@ async function getIndexConfig(params, options) {
 	if(indexConfig.ilm == undefined || indexConfig.ilm.config == undefined) {
 		indexConfig.ilm = { config: "NotSet" } ;
 	}
+	// Additionally add the name to the indexConfig
+	indexConfig.name = indexName;
 	return indexConfig;
 }
 
@@ -245,6 +247,77 @@ async function getPayloadFilename(params, options) {
 	return extractedFileName;
 }
 
+async function setupILMRententionPeriod(params, options) {
+	const { indexConfig, ilmConfig, rententionPeriodConfig } = params;
+	const { logger } = options;
+	if (!indexConfig) {
+		throw new Error('Missing required parameter: indexConfig');
+	}
+	if (!ilmConfig) {
+		throw new Error('Missing required parameter: ilmConfig');
+	}
+	if (!rententionPeriodConfig) {
+		logger.debug(`No retentionPeriodConfig is given. Using standard retention periods.`);
+		return options.setOutput('notChanged', ilmConfig);
+	}
+	if (!indexConfig.name) {
+		throw new Error('The name of the index is missing in the IndexConfig');
+	}
+	// The given retention period config must contain the retentionPeriods object
+	if (!rententionPeriodConfig.retentionPeriods) {
+		throw new Error('rententionPeriodConfig must contain retentionPeriods object.');
+	}
+	const indexName = indexConfig.name;
+	// Check if a retentionPeriod is defined for the given index
+	if(!rententionPeriodConfig.retentionPeriods[indexName]) {
+		logger.debug(`No retention period configured for index: ${indexName}. Using default ILM-Configuration.`);
+		return options.setOutput('notChanged', ilmConfig);
+	} else {
+		var periodConfig = rententionPeriodConfig.retentionPeriods[indexName];
+		// Defines when an index should be rolled over which means it enters the WARM, COLD, DELETE lifecycle
+		if(periodConfig.rollover) {
+			logger.info(`Setup ILM rollover configuration for index: ${indexName} with config: ${JSON.stringify(periodConfig.rollover)}`);
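+			if(periodConfig.rollover.max_age) {
+				// Validate max_age like the size options, otherwise "NaNd" would be written into the policy
+				const maxAge = parseInt(periodConfig.rollover.max_age);
+				if(isNaN(maxAge)) {
+					throw new Error(`The given max_age: ${periodConfig.rollover.max_age} for index: ${indexName} is not a valid number.`);
+				}
+				ilmConfig.policy.phases.hot.actions.rollover.max_age = `${maxAge}d`;
+			}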
+			if(periodConfig.rollover.max_size) {
+				const maxSize = parseInt(periodConfig.rollover.max_size);
+				if(isNaN(maxSize)) {
+					throw new Error(`The given max_size: ${periodConfig.rollover.max_size} for index: ${indexName} is not a valid number.`);
+				}
+				if(maxSize<5) {
+					throw new Error(`The given max_size: ${maxSize} for index: ${indexName} is too small. Please configure at least 5GB.`);
+				}
+				ilmConfig.policy.phases.hot.actions.rollover.max_size = `${maxSize}gb`;
+			}
+			if(periodConfig.rollover.max_primary_shard_size) {
+				const maxPrimaryShardSize = parseInt(periodConfig.rollover.max_primary_shard_size);
+				if(isNaN(maxPrimaryShardSize)) {
+					throw new Error(`The given max_primary_shard_size: ${periodConfig.rollover.max_primary_shard_size} for index: ${indexName} is not a valid number.`);
+				}
+				if(maxPrimaryShardSize<5) {
+					throw new Error(`The given max_primary_shard_size: ${maxPrimaryShardSize} for index: ${indexName} is too small. Please configure at least 5GB.`);
+				}
+				// Use maxPrimaryShardSize here, as maxSize is not defined in this block
+				ilmConfig.policy.phases.hot.actions.rollover.max_primary_shard_size = `${maxPrimaryShardSize}gb`;
+			}
+			
+		}
+		// The single retention period value is distributed across the lifecycle stages COLD and DELETE. WARM is not considered for now, as a rolled-over index should 
+		// move to WARM immediately after rollover. This might be enhanced later if needed with extra config options instead of days only
+		// The key retentionPeriod matches the configuration file format documented in the README
+		if(periodConfig.retentionPeriod) {
+			var givenDays = parseInt(periodConfig.retentionPeriod);
+			logger.info(`Setup ILM retention period for index: ${indexName} based on ${givenDays} days.`);
+			// COLD starts after half of the given days, DELETE after all of them
+			var coldDays = Math.round(givenDays / 2); // The index stays in WARM for a while before it goes to COLD
+			var deleteDays = givenDays; // The index stays in COLD for a while before it is deleted
+			ilmConfig.policy.phases.cold.min_age = `${coldDays}d`;
+			ilmConfig.policy.phases.delete.min_age = `${deleteDays}d`;
+		}
+	}
+	return ilmConfig;
+}
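
Review note: the way the single number of days is split across the COLD and DELETE phases can be reproduced in isolation. The snippet below only mirrors the arithmetic of the function above for illustration. The name `distributeRetention` is hypothetical.

```javascript
// Mirrors the arithmetic of the flow node for illustration only:
// COLD starts after half of the given days (rounded), DELETE after all of them.
function distributeRetention(givenDays) {
  return {
    coldMinAge: `${Math.round(givenDays / 2)}d`,
    deleteMinAge: `${givenDays}d`,
  };
}

console.log(distributeRetention(7));  // { coldMinAge: '4d', deleteMinAge: '7d' }
console.log(distributeRetention(15)); // { coldMinAge: '8d', deleteMinAge: '15d' }
```

Note that `Math.round` rounds halves up, so an odd number of days moves the index to COLD slightly after the midpoint.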
+
 async function getHostname(params, options) {
 	const hostname = os.hostname();
 	options.logger.debug(`API-Builder process is running on host: ${hostname}`);
@@ -257,5 +330,6 @@ module.exports = {
 	createIndices, 
 	updateRolloverAlias,
 	getPayloadFilename,
-	getHostname
+	getHostname,
+	setupILMRententionPeriod
 };
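
Review note: `parseInt` reads the leading digits and ignores everything after them, so a value such as `500mb` would be parsed as `500` and then written to the policy as `500gb`. One possible way to guard against this is a stricter parser that accepts only gigabyte values. This is a sketch, not the behavior of this change, and `parseSizeInGb` is a hypothetical name.

```javascript
// Hypothetical stricter parser: accepts only whole gigabyte values such as "15gb"
// and enforces the minimum size that the flow node requires.
function parseSizeInGb(value, minimumGb = 5) {
  const match = /^(\d+)\s*gb$/i.exec(String(value).trim());
  if (!match) {
    throw new Error(`The given size: ${value} must be given in gigabytes, e.g. 15gb.`);
  }
  const sizeInGb = parseInt(match[1], 10);
  if (sizeInGb < minimumGb) {
    throw new Error(`The given size: ${value} is too small. Please configure at least ${minimumGb}GB.`);
  }
  return `${sizeInGb}gb`;
}

console.log(parseSizeInGb('15gb')); // 15gb
try {
  parseSizeInGb('500mb');
} catch (err) {
  console.log(err.message); // The given size: 500mb must be given in gigabytes, e.g. 15gb.
}
```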
@@ -207,4 +207,46 @@ flow-nodes:
             type: object
             properties:
               message:
-                type: string
+                type: string
+
+      setupILMRententionPeriod:
+        name: Setup ILM Retention-Period
+        description: "Configures the given ILM-Policy according to the provided number of days."
+        parameters:
+          indexConfig:
+            name: Index config
+            description: "Index configuration as defined in elasticsearch_config/index_config.json. It also contains the index name."
+            required: true
+            schema:
+              type: object
+          rententionPeriodConfig:
+            name: Retention period config
+            description: "Contains the path of the retention period config file. If not given, the standard ILM is used."
+            required: true
+            schema:
+              type: string
+          ilmConfig:
+            name: ILM-Config
+            description: "The ILM Config object that is supposed to be sent to Elasticsearch to create or update the ILM-Policy. It is read from the $.indexConfig.ilm.config file and converted into an object."
+            required: true
+            schema:
+              type: object
+        outputs:
+          next:
+            name: Next
+            description: Returns the updated ILM-Configuration object.
+            context: $.ilmPolicyBody
+            schema:
+              type: object
+          notChanged:
+            name: Not changed
+            description: The given ILM-Policy has not changed, because either no retention period parameter is given or it defaults to 15 days.
+            context: $.ilmPolicyBody
+            schema:
+              type: object
+          error:
+            name: Error
+            description: An unexpected error happened
+            context: $.error
+            schema:
+              type: object