metrics-manager Installation Guide

This guide describes how to install the yuuvis® RAD metrics-manager.

yuuvis® RAD metrics-manager must be installed on every machine that hosts a yuuvis® RAD core-service or service-manager instance you want to monitor. In addition, you need to identify a system that will host the metrics-manager Elasticsearch database, Kibana, Logstash and optionally elastalert2. This machine should have sufficient free resources to handle this extra load (at least 4 CPUs and 12 GB RAM).


Where to Find the Installers

The setup for yuuvis® RAD metrics-manager is included in the regular product release folder.

Activating the Metrics Log Files

In order for yuuvis® RAD metrics-manager to work properly, the core-service and the service-manager must be configured to write their metrics information to a metrics log file. To do this, follow these steps:

  • core-service
    • Navigate to the logging configuration on the REST-WS GUI page at http://<gateway>/rest-ws/#PAGE:monitor/logging
    • Set the logger "com.os.ecm.ws.metrics" to the log level "TRACE".
      Make sure that "use parent handler" is not checked.

    • The change takes immediate effect.
  • service-manager
    • Edit the file <service-manager>\config\application-prod.yml
    • Set the parameter "monitoring.trace.enabled" to true (see the sketch after this list).
    • Save the file and restart the service-manager
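    For reference, the relevant part of <service-manager>\config\application-prod.yml might then look like the following minimal sketch (assuming the nested Spring-style YAML layout; the surrounding keys in your file may differ):

      # enables writing of metrics trace information to the metrics log file
      monitoring:
        trace:
          enabled: true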

Main Installation (including Elasticsearch, Logstash and Kibana)

  • Double-click to start the setup.
  • Click next to start the setup procedure.
  • Choose the installation directory. 
    Attention: Do not use an installation path containing spaces!
  • At HTTP port, you can configure the metrics-manager servicewatcher port. This port has no special significance and should only be changed if the default port (8283) is already in use.
  • Choose the IP address that metrics services should bind to. This should be the IP address visible to other machines in the LAN/WAN.
  • Click next to accept the installation of elasticsearch, kibana and logstash.
  • If you have a core-service and/or service-manager installation on this machine, keep filebeat marked for installation and enter the paths to the core-service metrics log file and/or service-manager metrics log file (or use the buttons to the right to open a file selection dialog).
    If you have a distributed system, simply leave the field(s) for the component(s) located on other machines empty. If you have neither a core-service nor a service-manager installed on this machine, you will still need filebeat to send the metricbeat information to logstash. If you do not want system metrics of this machine in your collected data, uncheck both the filebeat and the metricbeat checkboxes.
    The path to the metricbeat log file is predefined for you. If for some reason, this is incorrect or you want to change it, you can do so now.
    Under the metricbeat checkbox, make sure that the prefilled IP address is the address of the machine running the logstash service. If not, change it to that address.
  • If you chose to install metricbeat, you can now choose whether to install optional metrics modules for collecting metrics of the relational database, Elasticsearch, and ActiveMQ.
    • If you chose to install the database metrics module, enter the JDBC connection string (see the file <core-service>\standalone\configuration\jas-app.xml for a reference; typical formats are sketched after this list), the username, and the password for the database connection.
    • If you chose to install the elasticsearch metrics module, enter the host-address(es) of the elasticsearch server(s) and the username and password to access elasticsearch. You can refer to the file <service-manager>\config\application-es.yml for these values.
    • If you chose to install the ActiveMQ metrics module, enter the host address of the messaging-service (within the service-manager) and the path to the jolokia endpoints (the predefined value should be correct already).
  • Optionally, you can choose to install the Network Share Monitor by checking the "Install network share monitor" checkbox. Then enter the URI(s) for the network share(s) and the credentials, i.e., the Windows domain name, the Windows domain username, and the password. The path of the data file can be adjusted if desired. It is automatically added to the list of watched files in the configuration of the filebeat component.
  • Optionally you can choose to install elastalert2 which will provide alerting for critical and error situations. You can choose if you want to be alerted by email or via MS Teams notification, or both.
  • Setup now has all required information. Click next to start the installation.
  • If you like, you can start the service right away. 
  • Click finish to end the setup procedure.
  • If metricbeat is installed, it will automatically try to collect JVM runtime information from all microservices and the core-service. For this to work with the core-service, the wildfly-hawtio adapter needs to be deployed to it. To do so, copy the file <metrics-manager>\tools\hawtio-wildfly-2.15.0.war to <core-service>\standalone\deployments. It will be deployed automatically right away (or at the next start of the core-service).
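
For the database metrics module above, the JDBC connection string follows the usual vendor-specific pattern. The lines below are generic sketches only (host, port, and database names are placeholders, not values from your system); the authoritative value is the one found in jas-app.xml:

  jdbc:sqlserver://<db-host>:1433;databaseName=<database>
  jdbc:oracle:thin:@<db-host>:1521/<service-name>
  jdbc:postgresql://<db-host>:5432/<database>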

Log-Shipping Installation (only sending log information to Elasticsearch)

  • Double-click to start the setup.
  • Click next to start the setup procedure.
  • Choose the installation directory.
  • At HTTP port, you can configure the metrics-manager servicewatcher port. This port has no special significance and should only be changed if the default port (8283) is already in use.
  • Choose the IP address that metrics services should bind to. This should be the IP address visible to other machines in the LAN / WAN.
  • Uncheck all check boxes. (Elasticsearch, Kibana and Logstash are only for the main installation.)
  • If you have a core-service and/or service-manager installation on this machine, keep filebeat marked for installation and enter the paths to the core-service metrics log file and/or service-manager metrics log file (or use the buttons to the right to open a file selection dialog).
    If you have a distributed system, simply leave the field(s) for the component(s) located on other machines empty. If you have neither a core-service nor a service-manager installed on this machine, you will still need filebeat to send the metricbeat information to logstash. If you do not want system metrics of this machine in your collected data, uncheck both the filebeat and the metricbeat checkboxes.
    The path to the metricbeat log file is predefined for you. If for some reason, this is incorrect or you want to change it, you can do so now.
    Under the metricbeat checkbox, make sure that the prefilled IP address is the address of the machine running the logstash service. If not, change it to that address.
  • If you chose to install metricbeat, you can now choose whether to install optional metrics modules for collecting metrics of the relational database, Elasticsearch, and ActiveMQ.
    • If you chose to install the database metrics module, enter the JDBC connection string (see the file <core-service>\standalone\configuration\jas-app.xml for a reference), the username, and the password for the database connection.
    • If you chose to install the elasticsearch metrics module, enter the host-address(es) of the elasticsearch server(s) and the username and password to access elasticsearch. You can refer to the file <service-manager>\config\application-es.yml for these values.
    • If you chose to install the ActiveMQ metrics module, enter the host address of the messaging-service (within the service-manager) and the path to the jolokia endpoints (the predefined value should be correct already).
  • Optionally, you can choose to install the Network Share Monitor by checking the "Install network share monitor" checkbox. Then enter the URI(s) for the network share(s) and the credentials, i.e., the Windows domain name, the Windows domain username, and the password. The path of the data file can be adjusted if desired. It is automatically added to the list of watched files in the configuration of the filebeat component.
  • Keep elastalert unchecked.
  • Setup now has all required information. Click next to start the installation.
  • If you like, you can start the service right away. 
  • Click finish to end the setup procedure.
  • If metricbeat is installed, it will automatically try to collect JVM runtime information from all microservices and the core-service. For this to work with the core-service, the wildfly-hawtio adapter needs to be deployed to it. To do so, copy the file <metrics-manager>\tools\hawtio-wildfly-2.15.0.war to <core-service>\standalone\deployments. It will be deployed automatically right away (or at the next start of the core-service).

Starting Kibana

  • On the machine containing the main installation, open a browser and navigate to http://<local-ip-address>:5601
  • The kibana frontend will show up.
  • Log in with your credentials. The default is elastic / optimal.
  • You are automatically forwarded to the predefined metrics manager dashboard.
  • You can watch the data coming in and start to explore it. If you want to, you can also define your own visualizations and/or dashboards.

Elastalert2 (optional)

If you chose to use Elastalert2 to receive e-mail/Teams notifications about critical and error situations, these are the predefined rules (listed as rule filename: condition) that trigger an alert:

  • System:

    • server_down.yaml: less than 5 documents with host.name.keyword = Servername for 3 minutes
    • high_cpu_load.yaml: system.cpu.total.normalized.pct > 90% for 15 minutes per host.name.keyword
    • high_ram_utilization.yaml: system.memory.used.pct > 95% for 10 minutes per host.name.keyword
    • network_error_spike.yaml: system.network.in.errors + system.network.out.errors > 100 || 2 * previous timeframe value over 2x 1 minute
    • hdd_utilization_warning.yaml: system.filesystem.used.pct > 95% for 10 minutes per host.name.keyword and system.filesystem.device_name.keyword
    • hdd_full.yaml: system.filesystem.free < 1GB for 10 minutes per host.name.keyword and system.filesystem.device_name.keyword
    • jvm_heap_usage_warning.yaml: jolokia.services.memory.heap_usage.used_pct > 0.90 for 3 minutes
    • jvm_heap_usage_overload.yaml: jolokia.services.memory.heap_usage.used_pct > 0.98 for 3 minutes
  • yuuvis:

    • failed_logins_warning.yaml: data.headers.response.x-os-autherror: "USERNAME_PASSWORD_INVALID" > 3x in 10 minutes per data.authdetails.user.keyword
    • brute_force_warning.yaml: data.headers.response.x-os-autherror: "USERNAME_PASSWORD_INVALID" > 50x in 15 minutes
    • login_try_to_locked_account.yaml: data.headers.response.x-os-autherror: "ACCOUNT_LOCKED" > 1x in 5 minutes per data.authdetails.user.keyword
    • http_5xx_spike.yaml: data.headers.response.status = 500..599 > 100 || 2 * previous timeframe value over 2x 1 minute per data.headers.response.status
    • http_5xx_percentage.yaml: data.headers.response.status = 500..599 for > 2% of all statuses, at least 20, in 5 minutes per data.headers.response.status
    • http_4xx_spike.yaml: data.headers.response.status = 400..499 > 20 || 1.3 * previous timeframe value over 2x 1 minute per data.headers.response.status
    • http_4xx_percentage.yaml: data.headers.response.status = 400..499 for > 2% of all statuses, at least 20, in 5 minutes per data.headers.response.status
    • search_latency.yaml: duration_ms > 300 AND servicename:search.72 AND data.path:"/search" NOT data.path:"/search/aggregate" NOT data.path:"/search/storedqueries" for 5 minutes, at least 15 searches
    • microservice_went_down.yaml: service.name appeared at least once and then not anymore within 2 minutes per servicename
    • activemq_broker_down.yaml: activemq.broker.name.keyword appeared at least once and then not anymore within 3 minutes
    • activemq_error_queue_size.yaml: activemq.queue.messages.size.avg > 1 for 6 hours in queues "errors" and "ActiveMQ.DLQ"
    • activemq_queue_size_congestion.yaml: activemq.queue.messages.size.avg > 100 for 12 hours
    • metricsdata_missing.yaml: "host.name" and "log.file.path" appeared at least once and then not anymore within 5 minutes
  • Elasticsearch:

    • elasticsearch_cluster_down.yaml: elasticsearch.cluster.name appeared at least once and then not anymore within 3 minutes
    • elasticsearch_cluster_state_change.yaml: elasticsearch.cluster.stats.status changed its value in the last 1 minute
    • elasticsearch_index_state_change.yaml: elasticsearch.index.status changed its value in the last 1 minute
    • elasticsearch_shard_state_change.yaml: elasticsearch.shard.state changed its value in the last 1 minute
  • Relational Database:

    • mssql_database_down.yaml: mssql.database.name appeared at least once and then not anymore within 3 minutes
    • mssql_transaction_log_full.yaml: mssql.transaction_log.space_usage.used.pct > 90% for 10 minutes
    • mssql_transaction_log_utilization_warning.yaml: mssql.transaction_log.space_usage.used.pct > 98% for 10 minutes
    • oracle_database_down.yaml: oracle.tablespace.name.keyword appeared at least once and then not anymore within 3 minutes
    • oracle_tablespace_full.yaml: oracle.tablespace.space.used.pct > 95% for 10 minutes
    • postgresql_database_down.yaml: postgresql.database.name appeared at least once and then not anymore within 3 minutes
    • postgresql_database_rollback_spike.yaml: postgresql.database.transactions.rollback > 2 * previous timeframe value over 2x 1 minute

You can find these files in <metrics-manager>\config\elastalert\elastalert-rules\. If you want to add new rules or adapt the values of the existing rules, simply add new files or edit the existing ones. Changes take effect within 1-2 minutes at runtime - no restart is necessary.
You can find the documentation here: https://elastalert2.readthedocs.io/en/latest/ruletypes.html
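
As an illustration, a minimal frequency-type rule file could look like the sketch below. The rule name, index pattern, and threshold are made-up example values, not the shipped defaults; refer to the shipped rule files and the linked documentation for the exact fields used in your installation:

  # minimal ElastAlert2 frequency rule (illustrative values only)
  name: example_http_5xx_spike
  type: frequency
  index: metrics-*          # example index pattern, adjust to your installation
  num_events: 100           # alert when 100 or more matching documents ...
  timeframe:
    minutes: 1              # ... arrive within one minute
  filter:
    - range:
        data.headers.response.status:
          gte: 500
          lte: 599
  alert:
    - email                 # or "ms_teams", as configured during setup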

The list of e-mail recipients is globally defined in the <metrics-manager>\config\elastalert\elastalert.yaml file in the 'email' field. The value can either be a single address or an array of addresses in the form ["recipient@one", "recipient@two", ...]. You can also overwrite this list within the rule files.
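
For example (recipient addresses below are placeholders):

  # a single recipient
  email: "monitoring@example.com"
  # or, alternatively, a list of recipients
  # email: ["first.recipient@example.com", "second.recipient@example.com"]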


Enabling HTTPS Communication

To enable HTTPS communication for Kibana (external) and/or for Elasticsearch, Logstash, Metricbeat and Filebeat (internal), follow the instructions below:

Kibana

  • For Kibana, an SSL/TLS certificate in .cer/.crt and .key format is required. Place these two files in the <metrics-manager>\config folder.
  • Open the <metrics-manager>\config\kibana.yml file for editing 
    • Uncomment the following three lines and replace certificate.cer and certificate.key with the file names of your certificate files
      server.ssl.enabled: true
      server.ssl.certificate: ../../../config/certificate.cer
      server.ssl.key: ../../../config/certificate.key

    • Find the below lines and replace the hostname with the exact hostname defined in the certificate
      server.host: "metrics.optimal-systems.de"
      server.name: "metrics.optimal-systems.de"

    • Find the below line and change the protocol from http to https:
      server.publicBaseUrl: "https://metrics.optimal-systems.de:5601"

      Note: Do the following only if you're also enabling HTTPS for Elasticsearch:

    • Find the below line and change the protocol from http to https:
      elasticsearch.hosts: ["https://metrics.optimal-systems.de:5200"]

    • Find the below line and uncomment it. If the used certificate is self-signed, set the value to none, otherwise leave it at full
      elasticsearch.ssl.verificationMode: none
  • Save the file and restart Kibana. It is now accessible via https://metrics.optimal-systems.de:5601.
  • For ElastAlert2 to generate links to the new HTTPS address, edit the <metrics-manager>\config\elastalert\elastalert.yaml file and set the URL of the kibana_discover_app_url: parameter to a) use HTTPS and b) use the hostname defined in the certificate instead of an IP address.


  • If you already have a certificate in .p12 format (for example for the gateway microservice), you can generate the .cer and .key certificate files using the Keystore Explorer tool by following these steps:
    • Open the .p12 certificate in the Keystore Explorer.
    • Right click on the certificate and choose Export → Export certificate chain. This will create the .cer file
    • Right click on the certificate and choose Export → Export private key. Choose PKCS #8 as the format. In the following dialog uncheck encryption. This will generate the .key file.
    • These files can also be used for Logstash (see below).
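
    Alternatively, if OpenSSL is available on the machine, the same two files can be extracted from the .p12 keystore on the command line (a sketch; the file names are examples, and you will be prompted for the keystore password):

      openssl pkcs12 -in certificate.p12 -clcerts -nokeys -out certificate.cer
      openssl pkcs12 -in certificate.p12 -nocerts -nodes -out certificate.key

    The second command writes the private key unencrypted; check that the result matches the PKCS#8 requirement mentioned for Logstash below.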

Elasticsearch

To enable HTTPS in elasticsearch, a certificate in .p12 format (the same as for the gateway microservice) can be used. If Elasticsearch is set to HTTPS communication, the configuration of Kibana, Logstash and ElastAlert2 needs to be changed so that https is used for communication with Elasticsearch. This can be done by following the below steps:

  • Elasticsearch
    • Place the certificate file in the <metrics-manager>\config\elasticsearch folder.
    • Edit the <metrics-manager>\config\elasticsearch\elasticsearch.yml file.
    • Add the following lines at the end of the file. Replace certificate.p12 with the filename of your certificate and 'password' with the password for your certificate
      xpack.security.http.ssl.enabled: true
      xpack.security.http.ssl.verification_mode: certificate
      xpack.security.http.ssl.keystore.path: certificate.p12
      xpack.security.http.ssl.keystore.password: password

    • Save the file and restart Elasticsearch. It is now available at https://<certificate-hostname>:5200

  • Kibana 
    • If not already configured in the above steps (Kibana), follow these steps to use https communication with Elasticsearch.
    • Find the below line and change the protocol from http to https:
      elasticsearch.hosts: ["https://<certificate-hostname>:5200"]

    • Find the below line and uncomment it. If the used certificate is self-signed, set the value to none, otherwise leave it at full
      elasticsearch.ssl.verificationMode: none

  • Logstash
    • Edit the file <metrics-manager>\config\logstash\logstash.conf.
    • Find the following lines and change the url from http://<ip>:5200 to https://<certificate-hostname>:5200
      output {
          elasticsearch {
              hosts => ["https://metrics.optimal-systems.de:5200"]

  • ElastAlert2
    • Edit the file <metrics-manager>\config\elastalert\elastalert.yaml
    • Uncomment the line 'use_ssl: True'
    • Uncomment the line 'verify_certs: True' and set the value to 'False'
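
    After these two changes, the Elasticsearch-related lines in elastalert.yaml should look roughly like the following sketch (es_host and es_port keep the values already present in your file; only the SSL settings change):

      es_host: <certificate-hostname>
      es_port: 5200
      use_ssl: True
      verify_certs: False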

Logstash

  • For Logstash, an SSL/TLS certificate in .cer/.crt and .key format is required. The .key file needs to be in unencrypted PKCS#8 format. Place these two files in the <metrics-manager>\config\logstash folder.
  • Open the file <metrics-manager>\config\logstash\logstash.conf and expand the input section to look like below:

    input {
      # input from filebeat
      beats {
        # the port to listen on
        port => 5044
        ssl => true
        ssl_certificate => "D:\yuuvis\metrics-manager\config\logstash\certificate.cer"
        ssl_key => "D:\yuuvis\metrics-manager\config\logstash\certificate.key"
        ssl_verify_mode => "none"
      }
    }

    Replace the certificate.cer and certificate.key file names with the actual names of your certificate files. If the certificate is self-signed, use ssl_verify_mode with the value none (as shown above); otherwise, use force_peer. Only absolute paths are valid.


Filebeat

  • Open the file <metrics-manager>\config\logstash\filebeat.yml and find the section 'output.logstash'.
  • Change the hosts parameter to contain the hostname of the certificate instead of an IP.
    output.logstash:
    # The Logstash hosts
    hosts: ["schmittberger.optimal-systems.de:5044"]

  • Add the following two lines right below the hosts parameter (if you have a self-signed certificate, use the verification_mode 'none' as shown below; otherwise use 'full'):
    ssl.enabled: true
    ssl.verification_mode: none
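
    After these changes, the complete output.logstash section of filebeat.yml might look like the following sketch (the hostname is the example used above; substitute the hostname defined in your certificate):

      output.logstash:
        # The Logstash hosts
        hosts: ["schmittberger.optimal-systems.de:5044"]
        ssl.enabled: true
        ssl.verification_mode: none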