OpenStack telemetry: Ceilometer, Gnocchi and Aodh

In OpenStack, Ceilometer is the component that gathers data from the cloud and pre-processes it. It distinguishes between samples (for example, CPU time) and events (for example, the creation of an instance). Resources, meters and samples are the fundamental concepts in Ceilometer.

Samples are retrieved at regular intervals; if Ceilometer fails to get a sample, its value can be estimated by interpolation. Events are retrieved as they happen and cannot be estimated.
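To illustrate what estimating a missed sample can look like, here is a minimal sketch that assumes plain linear interpolation between the two neighbouring samples. The text above does not say which kind of interpolation is used, so treat this as an example of the idea only. The function name `interpolate` and the numbers are hypothetical.

```shell
#!/bin/sh
# interpolate T0 V0 T1 V1 T
# Estimate the value at time T from the samples (T0, V0) and (T1, V1),
# assuming the value changes linearly between them (illustration only).
interpolate() {
    awk -v t0="$1" -v v0="$2" -v t1="$3" -v v1="$4" -v t="$5" \
        'BEGIN { printf "%.2f\n", v0 + (v1 - v0) * (t - t0) / (t1 - t0) }'
}

# Samples exist at 0 s (value 100) and 120 s (value 160); the one at 60 s is missing.
interpolate 0 100 120 160 60   # prints 130.00
```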

Ceilometer sends events to a storage service, while samples are sent to a service named Gnocchi, which is optimized to handle large amounts of time-series data.

Aodh gets measures from Gnocchi, checks whether certain conditions are met and triggers actions. This is the foundation for application auto-scaling.

Other uses for Gnocchi data are monitoring the health of the cloud and billing.

Ceilometer has three ways of retrieving samples and events:

Services may voluntarily provide them by sending Ceilometer notifications via OpenStack’s messaging system. This is the preferred way, since it is based on the internal knowledge that the service has about its resources, and it is fast, with little overhead or stress on the systems.
Ceilometer actively retrieves data via service APIs, which is a costly way of collecting data for billing and alarming.
Ceilometer can get data by accessing sub-components of services, such as the hypervisor that runs the instances.

The second and third methods are also referred to as polling, because Ceilometer “polls” for the samples.

More details on OpenStack telemetry can be found at this link:
https://docs.openstack.org/ceilometer/latest/admin/telemetry-measurements.html

While Ceilometer has resources, meters and samples, Gnocchi has resources, metrics and measures. A Gnocchi resource corresponds to a Ceilometer resource, and a metric is roughly equivalent to a meter in Ceilometer. Gnocchi does not store every value it receives from Ceilometer; rather, it combines values and stores the results at regular intervals, according to an archive policy.
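The idea of combining values at regular intervals can be sketched with a few lines of shell and awk. The example groups raw "timestamp value" pairs into fixed-width buckets and keeps only the mean of each bucket, which is roughly what a mean aggregation with a 60-second granularity amounts to. It is an illustration, not Gnocchi's implementation; the function name `aggregate_mean` and the input data are hypothetical.

```shell
#!/bin/sh
# aggregate_mean GRANULARITY
# Read "epoch_seconds value" lines on stdin and print one
# "bucket_start mean" line per time bucket (illustration only).
aggregate_mean() {
    awk -v g="$1" '
        {
            bucket = $1 - ($1 % g)              # start of the bucket for this sample
            if (!(bucket in sum)) order[++n] = bucket
            sum[bucket] += $2
            count[bucket]++
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%d %.2f\n", order[i], sum[order[i]] / count[order[i]]
        }'
}

# Four hypothetical raw values that fall into two 60-second buckets.
printf '%s\n' "0 10" "30 20" "60 40" "90 60" | aggregate_mean 60
# prints:
#   0 15.00
#   60 50.00
```

Four incoming values become two stored points, which is why the stored series is much smaller than the raw stream.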

Listing gnocchi resources, metrics, measures:
Resources:
`gnocchi resource list`
`gnocchi resource show UUID`
Metrics:
`gnocchi metric list` *
`gnocchi metric show cpu --resource-id UUID`
Measures:
`gnocchi measures show cpu --resource-id UUID --start YYYY-MM-DDTHH:MM:SS+00:00`

* Output will be empty for non-admin users.
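The `--start` value above is a placeholder. One possible way to produce a timestamp in that form is sketched below. It assumes a `date` command that accepts `-d @EPOCH` (GNU coreutils and BusyBox do; BSD and macOS `date` use different options). The helper name `start_ts` is hypothetical.

```shell
#!/bin/sh
# start_ts SECONDS_AGO
# Print the UTC time SECONDS_AGO seconds in the past in the
# YYYY-MM-DDTHH:MM:SS+00:00 form used by --start.
# Assumes a date command that understands "-d @EPOCH".
start_ts() {
    now=$(date +%s)
    date -u -d "@$((now - $1))" +%Y-%m-%dT%H:%M:%S+00:00
}

start_ts 3600   # one hour ago

# Possible use against a real cloud (not executed here):
#   gnocchi measures show cpu --resource-id UUID --start "$(start_ts 3600)"
```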

Listing resources, metrics, measures with openstack client:
Resources:
`openstack metric resource list`
`openstack metric resource show UUID`
Metrics:
`openstack metric metric list` *
`openstack metric metric show cpu --resource-id UUID`
Measures:
`openstack metric measures show cpu --resource-id UUID --start YYYY-MM-DDTHH:MM:SS+00:00`

* Output will be empty for non-admin users.

Aggregation:
Server grouping:
`openstack server create --property metering.server_group=Mail` *
Metrics aggregation:
`gnocchi measures aggregation --query server_group=Mail --resource-type=instance --aggregation mean -m cpu_util`

* For Gnocchi, all servers created with `--property metering.server_group=Mail` can be considered tagged.

Listing Ceilometer Events:
Event types are defined in a YAML file:
`/etc/ceilometer/event_definitions.yaml`
List event types, events and event details:
`ceilometer event-type-list`
`ceilometer event-list`
`ceilometer event-show EVENT_ID`

* There is no option in the Horizon GUI to view events or statistics, but Gnocchi visualisation can be provided by Grafana.

Example generating CPU and disk load and showing gnocchi measures:
Create two instances:
`openstack server create --image cirros-image --flavor 1 --nic net-id=... --user-data cpu.sh cpu-user`
`openstack server create --image cirros-image --flavor 1 --nic net-id=... --user-data disk.sh disk-user`
Let the instances finish creating and leave them running for a while:
`openstack server list`
Show cpu usage measures from cpu-user server:
`gnocchi measures show cpu.delta --resource-id SERVER_UUID`
`gnocchi measures show cpu_util --resource-id SERVER_UUID`
Show disk usage measures from disk-user server:
`gnocchi measures show disk.read.requests --resource-id SERVER_UUID`
`gnocchi measures show disk.read.requests.rate --resource-id SERVER_UUID`

* Files cpu.sh and disk.sh are your bash scripts to generate CPU load and disk load.
** Don’t forget to stop cpu-user and disk-user after you have finished, since they will otherwise continue to generate CPU and disk load.
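The contents of cpu.sh and disk.sh are left to the reader. As a starting point, the sketch below writes two minimal candidates into the current directory. These are hypothetical scripts, not the ones used for the measurements above. They assume the image runs user data that starts with `#!`, and that its `dd` accepts `bs=1M`. The file `/tmp/load.bin` and the 32 MiB size are arbitrary choices.

```shell
#!/bin/sh
# Write two hypothetical user-data scripts into the current directory.

cat > cpu.sh <<'EOF'
#!/bin/sh
# Busy loop: keeps one vCPU close to 100 % until the instance is stopped.
while :; do :; done
EOF

cat > disk.sh <<'EOF'
#!/bin/sh
# Write a file, flush it to disk and read it back, forever.
while :; do
    dd if=/dev/zero of=/tmp/load.bin bs=1M count=32 2>/dev/null
    sync
    dd if=/tmp/load.bin of=/dev/null bs=1M 2>/dev/null
done
EOF

chmod +x cpu.sh disk.sh
```

Both loops run until the instance is stopped. Reads may be served from the page cache, so the read counters can grow more slowly than the write counters.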

Alarms:
An alarm has:
`Type`
`Condition: depends on type`
`Evaluation window`
`State: OK/Alarm/Insufficient Data`
`Actions for state transitions`
Condition example:
`mean cpu_util > 60 for all resources tagged server_group=Mail`
Single Resource Threshold Alarm:
`openstack alarm create --name cpuhigh --type gnocchi_resources_threshold --aggregation-method mean --metric cpu_util --comparison-operator gt --threshold 30 --resource-type instance --resource-id INSTANCE_UUID --granularity 60 --evaluation-periods 2 --alarm-action http://127.0.0.1:1234 --ok-action http://127.0.0.1:1234`
Alarm based on resource aggregates:
`openstack alarm create --name cpuhigh --type gnocchi_aggregation_by_resources_threshold --aggregation-method mean --metric cpu_util --comparison-operator gt --threshold 30 --resource-type instance --query '{ "=": { "server_group" : "Mail" }}' --granularity 60 --evaluation-periods 2 --alarm-action http://127.0.0.1:1234 --ok-action http://127.0.0.1:1234`
Alarm commands:
`openstack alarm list`
`openstack alarm show ALARM_ID`
`openstack alarm-history show ALARM_ID`
`openstack alarm state get ALARM_ID`
`openstack alarm update ALARM_ID ...`
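To show how a threshold, a comparison operator and a number of evaluation periods combine into a state, here is a simplified sketch. It is not Aodh's code. It assumes one aggregated value per granularity period and looks only at the most recent periods. The function name `evaluate_gt` and the values are hypothetical, and the "unchanged" result for a mixed window is a simplification.

```shell
#!/bin/sh
# evaluate_gt THRESHOLD PERIODS VALUE...
# Simplified sketch: look at the last PERIODS values and print
#   "insufficient data" if there are fewer than PERIODS values,
#   "alarm"             if all of them are greater than THRESHOLD,
#   "ok"                if none of them is,
#   "unchanged"         if the window is mixed.
evaluate_gt() {
    threshold=$1 periods=$2
    shift 2
    if [ "$#" -lt "$periods" ]; then
        echo "insufficient data"
        return
    fi
    # Drop everything except the last PERIODS values.
    while [ "$#" -gt "$periods" ]; do shift; done
    above=0
    for value in "$@"; do
        # awk does the floating-point comparison that sh arithmetic lacks.
        if awk -v v="$value" -v t="$threshold" 'BEGIN { exit !(v > t) }'; then
            above=$((above + 1))
        fi
    done
    if [ "$above" -eq "$periods" ]; then echo "alarm"
    elif [ "$above" -eq 0 ]; then echo "ok"
    else echo "unchanged"
    fi
}

evaluate_gt 30 2 12.5 45.0 71.3   # last two values exceed 30: alarm
evaluate_gt 30 2 45.0 12.5 9.8    # last two values are below 30: ok
evaluate_gt 30 2 80.0             # only one value: insufficient data
```

With `--threshold 30` and `--evaluation-periods 2`, a single spike above 30 would not be enough to reach the alarm state in this sketch; two consecutive periods are needed.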
