Node metrics

Collected by node-exporter.

Component source repository: https://github.com/prometheus/node_exporter

Sample file: nodes.txt

| Name | Type | Description |
| --- | --- | --- |
| node_boot_time_seconds | gauge | Unix time of last boot, including microseconds. |
| node_cpu_seconds_total | counter | Seconds the CPUs spent in each mode. |
| node_disk_read_bytes_total | counter | The total number of bytes read successfully. |
| node_disk_read_errors_total | counter | The total number of read errors. |
| node_disk_read_retries_total | counter | The total number of read retries. |
| node_disk_read_sectors_total | counter | The total number of sectors read successfully. |
| node_disk_read_time_seconds_total | counter | The total number of seconds spent by all reads. |
| node_disk_reads_completed_total | counter | The total number of reads completed successfully. |
| node_disk_write_errors_total | counter | The total number of write errors. |
| node_disk_write_retries_total | counter | The total number of write retries. |
| node_disk_write_time_seconds_total | counter | This is the total number of seconds spent by all writes. |
| node_disk_writes_completed_total | counter | The total number of writes completed successfully. |
| node_disk_written_bytes_total | counter | The total number of bytes written successfully. |
| node_disk_written_sectors_total | counter | The total number of sectors written successfully. |
| node_exporter_build_info | gauge | A metric with a constant '1' value labeled by version, revision, branch, goversion from which node_exporter was built, and the goos and goarch for the build. |
| node_filesystem_avail_bytes | gauge | Filesystem space available to non-root users in bytes. |
| node_filesystem_device_error | gauge | Whether an error occurred while getting statistics for the given device. |
| node_filesystem_files | gauge | Filesystem total file nodes. |
| node_filesystem_files_free | gauge | Filesystem total free file nodes. |
| node_filesystem_free_bytes | gauge | Filesystem free space in bytes. |
| node_filesystem_readonly | gauge | Filesystem read-only status. |
| node_filesystem_size_bytes | gauge | Filesystem size in bytes. |
| node_load1 | gauge | 1m load average. |
| node_load15 | gauge | 15m load average. |
| node_load5 | gauge | 5m load average. |
| node_memory_active_bytes | gauge | Memory information field active_bytes. |
| node_memory_compressed_bytes | gauge | Memory information field compressed_bytes. |
| node_memory_free_bytes | gauge | Memory information field free_bytes. |
| node_memory_inactive_bytes | gauge | Memory information field inactive_bytes. |
| node_memory_internal_bytes | gauge | Memory information field internal_bytes. |
| node_memory_purgeable_bytes | gauge | Memory information field purgeable_bytes. |
| node_memory_swap_total_bytes | gauge | Memory information field swap_total_bytes. |
| node_memory_swap_used_bytes | gauge | Memory information field swap_used_bytes. |
| node_memory_swapped_in_bytes_total | counter | Memory information field swapped_in_bytes_total. |
| node_memory_swapped_out_bytes_total | counter | Memory information field swapped_out_bytes_total. |
| node_memory_total_bytes | gauge | Memory information field total_bytes. |
| node_memory_wired_bytes | gauge | Memory information field wired_bytes. |
| node_network_noproto_total | counter | Network device statistic noproto. |
| node_network_receive_bytes_total | counter | Network device statistic receive_bytes. |
| node_network_receive_drop_total | counter | Network device statistic receive_drop. |
| node_network_receive_errs_total | counter | Network device statistic receive_errs. |
| node_network_receive_multicast_total | counter | Network device statistic receive_multicast. |
| node_network_receive_packets_total | counter | Network device statistic receive_packets. |
| node_network_transmit_bytes_total | counter | Network device statistic transmit_bytes. |
| node_network_transmit_colls_total | counter | Network device statistic transmit_colls. |
| node_network_transmit_errs_total | counter | Network device statistic transmit_errs. |
| node_network_transmit_multicast_total | counter | Network device statistic transmit_multicast. |
| node_network_transmit_packets_total | counter | Network device statistic transmit_packets. |
| node_power_supply_battery_health | gauge | IOKit Power Source information field battery_health for power_supply. |
| node_power_supply_charged | gauge | IOKit Power Source information field charged for power_supply. |
| node_power_supply_charging | gauge | IOKit Power Source information field charging for power_supply. |
| node_power_supply_current_ampere | gauge | IOKit Power Source information field current_ampere for power_supply. |
| node_power_supply_current_capacity | gauge | IOKit Power Source information field current_capacity for power_supply. |
| node_power_supply_info | gauge | IOKit Power Source information for power_supply. |
| node_power_supply_max_capacity | gauge | IOKit Power Source information field max_capacity for power_supply. |
| node_power_supply_power_source_state | gauge | IOKit Power Source information field power_source_state for power_supply. |
| node_power_supply_present | gauge | IOKit Power Source information field present for power_supply. |
| node_power_supply_time_to_empty_seconds | gauge | IOKit Power Source information field time_to_empty_seconds for power_supply. |
| node_power_supply_time_to_full_seconds | gauge | IOKit Power Source information field time_to_full_seconds for power_supply. |
| node_scrape_collector_duration_seconds | gauge | node_exporter: Duration of a collector scrape. |
| node_scrape_collector_success | gauge | node_exporter: Whether a collector succeeded. |
| node_textfile_scrape_error | gauge | 1 if there was an error opening or reading a file, 0 otherwise |
| node_thermal_cpu_available_cpu | gauge | Reflects how many, if any, CPUs have been taken offline. Represented as an integer number of CPUs (0 - Max CPUs). |
| node_thermal_cpu_scheduler_limit_ratio | gauge | Represents the percentage (0-100) of CPU time available. 100% at normal operation. The OS may limit this time for a percentage less than 100%. |
| node_thermal_cpu_speed_limit_ratio | gauge | Defines the speed & voltage limits placed on the CPU. Represented as a percentage (0-100) of maximum CPU speed. |
| node_time_seconds | gauge | System time in seconds since epoch (1970). |
| node_time_zone_offset_seconds | gauge | System time zone offset in seconds. |
| node_uname_info | gauge | Labeled system information as provided by the uname system call. |
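
As an illustration of how the node metrics can be consumed, the following is a minimal Python sketch that parses a few lines in the Prometheus text exposition format and derives a filesystem usage ratio from node_filesystem_size_bytes and node_filesystem_avail_bytes. The sample text, its label values, and the helper names `parse_samples` and `filesystem_used_ratio` are assumptions made up for this example; they are not taken from nodes.txt, and the parser is simplified rather than a complete implementation of the format.

```python
import re

# Hypothetical excerpt in the text exposition format (values are made up).
SAMPLE = """\
# HELP node_filesystem_size_bytes Filesystem size in bytes.
# TYPE node_filesystem_size_bytes gauge
node_filesystem_size_bytes{device="/dev/disk1",mountpoint="/"} 1000
node_filesystem_avail_bytes{device="/dev/disk1",mountpoint="/"} 250
node_load1 1.5
"""

# metric_name, optional {labels}, then the sample value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def filesystem_used_ratio(samples):
    """Map mountpoint -> fraction of space not available to non-root users."""
    size, avail = {}, {}
    for name, labels, value in samples:
        mount = labels.get("mountpoint")
        if name == "node_filesystem_size_bytes":
            size[mount] = value
        elif name == "node_filesystem_avail_bytes":
            avail[mount] = value
    return {m: 1 - avail[m] / size[m] for m in size if m in avail and size[m] > 0}


print(filesystem_used_ratio(parse_samples(SAMPLE)))
```

The ratio uses node_filesystem_avail_bytes rather than node_filesystem_free_bytes because, per the descriptions above, the former reflects the space available to non-root users.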

Container metrics

Collected by cAdvisor, which is built into newer versions of Kubernetes.

Component source location: https://github.com/kubernetes/kubernetes/tree/master/vendor/github.com/google/cadvisor

Sample file: cadvisor.txt

| Name | Type | Description |
| --- | --- | --- |
| cadvisor_version_info | gauge | A metric with a constant '1' value labeled by kernel version, OS version, docker version, cadvisor version & cadvisor revision. |
| container_blkio_device_usage_total | counter | Blkio Device bytes usage |
| container_cpu_cfs_periods_total | counter | Number of elapsed enforcement period intervals. |
| container_cpu_cfs_throttled_periods_total | counter | Number of throttled period intervals. |
| container_cpu_cfs_throttled_seconds_total | counter | Total time duration the container has been throttled. |
| container_cpu_load_average_10s | gauge | Value of container cpu load average over the last 10 seconds. |
| container_cpu_system_seconds_total | counter | Cumulative system cpu time consumed in seconds. |
| container_cpu_usage_seconds_total | counter | Cumulative cpu time consumed in seconds. |
| container_cpu_user_seconds_total | counter | Cumulative user cpu time consumed in seconds. |
| container_file_descriptors | gauge | Number of open file descriptors for the container. |
| container_fs_inodes_free | gauge | Number of available Inodes |
| container_fs_inodes_total | gauge | Number of Inodes |
| container_fs_io_current | gauge | Number of I/Os currently in progress |
| container_fs_io_time_seconds_total | counter | Cumulative count of seconds spent doing I/Os |
| container_fs_io_time_weighted_seconds_total | counter | Cumulative weighted I/O time in seconds |
| container_fs_limit_bytes | gauge | Number of bytes that can be consumed by the container on this filesystem. |
| container_fs_read_seconds_total | counter | Cumulative count of seconds spent reading |
| container_fs_reads_bytes_total | counter | Cumulative count of bytes read |
| container_fs_reads_merged_total | counter | Cumulative count of reads merged |
| container_fs_reads_total | counter | Cumulative count of reads completed |
| container_fs_sector_reads_total | counter | Cumulative count of sector reads completed |
| container_fs_sector_writes_total | counter | Cumulative count of sector writes completed |
| container_fs_usage_bytes | gauge | Number of bytes that are consumed by the container on this filesystem. |
| container_fs_write_seconds_total | counter | Cumulative count of seconds spent writing |
| container_fs_writes_bytes_total | counter | Cumulative count of bytes written |
| container_fs_writes_merged_total | counter | Cumulative count of writes merged |
| container_fs_writes_total | counter | Cumulative count of writes completed |
| container_last_seen | gauge | Last time a container was seen by the exporter |
| container_memory_cache | gauge | Number of bytes of page cache memory. |
| container_memory_failcnt | counter | Number of memory usage hits limits |
| container_memory_failures_total | counter | Cumulative count of memory allocation failures. |
| container_memory_mapped_file | gauge | Size of memory mapped files in bytes. |
| container_memory_max_usage_bytes | gauge | Maximum memory usage recorded in bytes |
| container_memory_rss | gauge | Size of RSS in bytes. |
| container_memory_swap | gauge | Container swap usage in bytes. |
| container_memory_usage_bytes | gauge | Current memory usage in bytes, including all memory regardless of when it was accessed |
| container_memory_working_set_bytes | gauge | Current working set in bytes. |
| container_network_receive_bytes_total | counter | Cumulative count of bytes received |
| container_network_receive_errors_total | counter | Cumulative count of errors encountered while receiving |
| container_network_receive_packets_dropped_total | counter | Cumulative count of packets dropped while receiving |
| container_network_receive_packets_total | counter | Cumulative count of packets received |
| container_network_transmit_bytes_total | counter | Cumulative count of bytes transmitted |
| container_network_transmit_errors_total | counter | Cumulative count of errors encountered while transmitting |
| container_network_transmit_packets_dropped_total | counter | Cumulative count of packets dropped while transmitting |
| container_network_transmit_packets_total | counter | Cumulative count of packets transmitted |
| container_processes | gauge | Number of processes running inside the container. |
| container_scrape_error | gauge | 1 if there was an error while getting container metrics, 0 otherwise |
| container_sockets | gauge | Number of open sockets for the container. |
| container_spec_cpu_period | gauge | CPU period of the container. |
| container_spec_cpu_quota | gauge | CPU quota of the container. |
| container_spec_cpu_shares | gauge | CPU share of the container. |
| container_spec_memory_limit_bytes | gauge | Memory limit for the container. |
| container_spec_memory_reservation_limit_bytes | gauge | Memory reservation limit for the container. |
| container_spec_memory_swap_limit_bytes | gauge | Memory swap limit for the container. |
| container_start_time_seconds | gauge | Start time of the container since unix epoch in seconds. |
| container_tasks_state | gauge | Number of tasks in given state |
| container_threads | gauge | Number of threads running inside the container |
| container_threads_max | gauge | Maximum number of threads allowed inside the container, infinity if value is zero |
| container_ulimits_soft | gauge | Soft ulimit values for the container root process. Unlimited if -1, except priority and nice |
| machine_cpu_cores | gauge | Number of logical CPU cores. |
| machine_cpu_physical_cores | gauge | Number of physical CPU cores. |
| machine_cpu_sockets | gauge | Number of CPU sockets. |
| machine_memory_bytes | gauge | Amount of memory installed on the machine. |
| machine_nvm_avg_power_budget_watts | gauge | NVM power budget. |
| machine_nvm_capacity | gauge | NVM capacity value labeled by NVM mode (memory mode or app direct mode). |
| machine_scrape_error | gauge | 1 if there was an error while getting machine metrics, 0 otherwise. |
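
Most of the container metrics are cumulative counters, so a single reading says little; what matters is how fast the value grows. The Python sketch below shows one way to turn successive readings of a counter such as container_cpu_usage_seconds_total into an average per-second rate, treating a drop in value as a counter reset. The helper names and the sample readings are hypothetical, and this is a simplification: it is not the exact algorithm behind PromQL's `rate()`, which also extrapolates to the window boundaries.

```python
def counter_increase(samples):
    """Total increase of a counter over a list of (timestamp, value) samples.

    A drop in value is treated as a counter reset (for example after a
    container restart), so the value seen after the reset counts as new
    increase.
    """
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += (cur - prev) if cur >= prev else cur
    return increase


def per_second_rate(samples):
    """Average per-second rate over the sampled window, or None if undefined."""
    if len(samples) < 2:
        return None
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    return counter_increase(samples) / elapsed


# Hypothetical (unix_timestamp, value) readings of container_cpu_usage_seconds_total.
cpu_seconds = [(0, 10.0), (30, 25.0), (60, 40.0)]

# CPU seconds consumed per wall-clock second, i.e. the average number of cores in use.
print(per_second_rate(cpu_seconds))
```

The same helper applies to the other counters in the table. For example, dividing the increase of container_cpu_cfs_throttled_periods_total by the increase of container_cpu_cfs_periods_total over the same window gives the fraction of enforcement periods in which the container was throttled.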

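Kubernetes state metrics

Collected by kube-state-metrics. These are snapshot values of every resource in the cluster, so the data volume is relatively large.

Component source repository: https://github.com/kubernetes/kube-state-metrics

Sample file: kube-state.txt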

| Name | Type | Description |
| --- | --- | --- |
| kube_node_annotations | gauge | Kubernetes annotations converted to Prometheus labels. |
| kube_node_created | gauge | [STABLE] Unix creation timestamp |
| kube_node_deletion_timestamp | gauge | Unix deletion timestamp |
| kube_node_info | gauge | [STABLE] Information about a cluster node. |
| kube_node_labels | gauge | [STABLE] Kubernetes labels converted to Prometheus labels. |
| kube_node_role | gauge | The role of a cluster node. |
| kube_node_spec_taint | gauge | [STABLE] The taint of a cluster node. |
| kube_node_spec_unschedulable | gauge | [STABLE] Whether a node can schedule new pods. |
| kube_node_status_allocatable | gauge | [STABLE] The allocatable for different resources of a node that are available for scheduling. |
| kube_node_status_capacity | gauge | [STABLE] The capacity for different resources of a node. |
| kube_node_status_condition | gauge | [STABLE] The condition of a cluster node. |
| kube_persistentvolumeclaim_labels | gauge | [STABLE] Kubernetes labels converted to Prometheus labels. |
| kube_persistentvolumeclaim_annotations | gauge | Kubernetes annotations converted to Prometheus labels. |
| kube_persistentvolumeclaim_info | gauge | [STABLE] Information about persistent volume claim. |
| kube_persistentvolumeclaim_status_phase | gauge | [STABLE] The phase the persistent volume claim is currently in. |
| kube_persistentvolumeclaim_resource_requests_storage_bytes | gauge | [STABLE] The capacity of storage requested by the persistent volume claim. |
| kube_persistentvolumeclaim_access_mode | gauge | [STABLE] The access mode(s) specified by the persistent volume claim. |
| kube_persistentvolumeclaim_status_condition | gauge | Information about status of different conditions of persistent volume claim. |
| kube_persistentvolumeclaim_created | gauge | Unix creation timestamp |
| kube_persistentvolumeclaim_deletion_timestamp | gauge | Unix deletion timestamp |
| kube_persistentvolume_claim_ref | gauge | [STABLE] Information about the Persistent Volume Claim Reference. |
| kube_persistentvolume_annotations | gauge | Kubernetes annotations converted to Prometheus labels. |
| kube_persistentvolume_labels | gauge | [STABLE] Kubernetes labels converted to Prometheus labels. |
| kube_persistentvolume_status_phase | gauge | [STABLE] The phase indicates if a volume is available, bound to a claim, or released by a claim. |
| kube_persistentvolume_info | gauge | [STABLE] Information about persistentvolume. |
| kube_persistentvolume_capacity_bytes | gauge | [STABLE] Persistentvolume capacity in bytes. |
| kube_persistentvolume_created | gauge | Unix creation timestamp |
| kube_persistentvolume_deletion_timestamp | gauge | Unix deletion timestamp |
| kube_pod_completion_time | gauge | [STABLE] Completion time in unix timestamp for a pod. |
| kube_pod_container_info | gauge | [STABLE] Information about a container in a pod. |
| kube_pod_container_resource_limits | gauge | The number of requested limit resource by a container. It is recommended to use the kube_pod_resource_limits metric exposed by kube-scheduler instead, as it is more precise. |
| kube_pod_container_resource_requests | gauge | The number of requested request resource by a container. It is recommended to use the kube_pod_resource_requests metric exposed by kube-scheduler instead, as it is more precise. |
| kube_pod_container_state_started | gauge | [STABLE] Start time in unix timestamp for a pod container. |
| kube_pod_container_status_last_terminated_reason | gauge | Describes the last reason the container was in terminated state. |
| kube_pod_container_status_last_terminated_exitcode | gauge | Describes the exit code for the last container in terminated state. |
| kube_pod_container_status_ready | gauge | [STABLE] Describes whether the containers readiness check succeeded. |
| kube_pod_container_status_restarts_total | counter | [STABLE] The number of container restarts per container. |
| kube_pod_container_status_running | gauge | [STABLE] Describes whether the container is currently in running state. |
| kube_pod_container_status_terminated | gauge | [STABLE] Describes whether the container is currently in terminated state. |
| kube_pod_container_status_terminated_reason | gauge | Describes the reason the container is currently in terminated state. |
| kube_pod_container_status_waiting | gauge | [STABLE] Describes whether the container is currently in waiting state. |
| kube_pod_container_status_waiting_reason | gauge | [STABLE] Describes the reason the container is currently in waiting state. |
| kube_pod_created | gauge | [STABLE] Unix creation timestamp |
| kube_pod_deletion_timestamp | gauge | Unix deletion timestamp |
| kube_pod_info | gauge | [STABLE] Information about pod. |
| kube_pod_ips | gauge | Pod IP addresses |
| kube_pod_init_container_info | gauge | [STABLE] Information about an init container in a pod. |
| kube_pod_init_container_resource_limits | gauge | The number of requested limit resource by an init container. |
| kube_pod_init_container_resource_requests | gauge | The number of requested request resource by an init container. |
| kube_pod_init_container_status_last_terminated_reason | gauge | Describes the last reason the init container was in terminated state. |
| kube_pod_init_container_status_ready | gauge | [STABLE] Describes whether the init containers readiness check succeeded. |
| kube_pod_init_container_status_restarts_total | counter | [STABLE] The number of restarts for the init container. |
| kube_pod_init_container_status_running | gauge | [STABLE] Describes whether the init container is currently in running state. |
| kube_pod_init_container_status_terminated | gauge | [STABLE] Describes whether the init container is currently in terminated state. |
| kube_pod_init_container_status_terminated_reason | gauge | Describes the reason the init container is currently in terminated state. |
| kube_pod_init_container_status_waiting | gauge | [STABLE] Describes whether the init container is currently in waiting state. |
| kube_pod_init_container_status_waiting_reason | gauge | Describes the reason the init container is currently in waiting state. |
| kube_pod_annotations | gauge | Kubernetes annotations converted to Prometheus labels. |
| kube_pod_labels | gauge | [STABLE] Kubernetes labels converted to Prometheus labels. |
| kube_pod_overhead_cpu_cores | gauge | The pod overhead in regards to cpu cores associated with running a pod. |
| kube_pod_overhead_memory_bytes | gauge | The pod overhead in regards to memory associated with running a pod. |
| kube_pod_owner | gauge | [STABLE] Information about the Pod's owner. |
| kube_pod_restart_policy | gauge | [STABLE] Describes the restart policy in use by this pod. |
| kube_pod_runtimeclass_name_info | gauge | The runtimeclass associated with the pod. |
| kube_pod_spec_volumes_persistentvolumeclaims_info | gauge | [STABLE] Information about persistentvolumeclaim volumes in a pod. |
| kube_pod_spec_volumes_persistentvolumeclaims_readonly | gauge | [STABLE] Describes whether a persistentvolumeclaim is mounted read only. |
| kube_pod_start_time | gauge | [STABLE] Start time in unix timestamp for a pod. |
| kube_pod_status_phase | gauge | [STABLE] The pods current phase. |
| kube_pod_status_qos_class | gauge | The pods current qosClass. |
| kube_pod_status_ready | gauge | [STABLE] Describes whether the pod is ready to serve requests. |
| kube_pod_status_ready_time | gauge | Readiness achieved time in unix timestamp for a pod. |
| kube_pod_status_initialized_time | gauge | Initialized time in unix timestamp for a pod. |
| kube_pod_status_container_ready_time | gauge | Readiness achieved time in unix timestamp for a pod containers. |
| kube_pod_status_reason | gauge | The pod status reasons |
| kube_pod_status_scheduled | gauge | [STABLE] Describes the status of the scheduling process for the pod. |
| kube_pod_status_scheduled_time | gauge | [STABLE] Unix timestamp when pod moved into scheduled status |
| kube_pod_status_unschedulable | gauge | [STABLE] Describes the unschedulable status for the pod. |
| kube_pod_tolerations | gauge | Information about the pod tolerations |
| kube_pod_service_account | gauge | The service account for a pod. |
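
Because these metrics are snapshots of resource state, a typical use is to aggregate them across the cluster. The Python sketch below counts pods per phase from kube_pod_status_phase samples, assuming the usual convention that the series for a pod's current phase carries the value 1 and the series for its other phases carry 0. The sample data, the pod names, and the helper name `pods_by_phase` are hypothetical and only illustrate the idea.

```python
from collections import Counter


def pods_by_phase(samples):
    """Count pods per phase from (labels, value) samples of kube_pod_status_phase.

    Only series with value 1 are counted, since those mark the phase a pod is
    currently in (an assumption about the metric's encoding, stated above).
    """
    counts = Counter()
    for labels, value in samples:
        if value == 1:
            counts[labels["phase"]] += 1
    return dict(counts)


# Hypothetical samples; namespaces and pod names are made up.
SAMPLES = [
    ({"namespace": "default", "pod": "web-0", "phase": "Running"}, 1),
    ({"namespace": "default", "pod": "web-0", "phase": "Pending"}, 0),
    ({"namespace": "default", "pod": "job-1", "phase": "Succeeded"}, 1),
    ({"namespace": "kube-system", "pod": "dns-0", "phase": "Running"}, 1),
]

print(pods_by_phase(SAMPLES))
```

Since every pod contributes one series per possible phase, the number of series grows quickly with cluster size, which is one reason the data volume of these metrics is comparatively large.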

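Kubernetes internal metrics

Internal metrics of the cluster. They are of little use in most scenarios and are mainly used for monitoring the management and control components.

Component source location: https://github.com/kubernetes/kubernetes/tree/master/staging/src/k8s.io/apiserver

Sample file: kube-internal.txt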
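
Several of the internal metrics in this section are histograms, which are exposed as cumulative bucket counts rather than as individual observations. The Python sketch below estimates a quantile from such buckets by linear interpolation inside the bucket that contains the target rank. It is a simplified approximation in the spirit of PromQL's `histogram_quantile()`, not a reimplementation of it; the function name `estimate_quantile` and the bucket values are assumptions made up for this example.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs and should
    include the +Inf bucket. Observations are assumed to be spread evenly
    within each bucket, and the first bucket is assumed to start at 0.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile lies beyond the last finite bound.
                return lower_bound
            in_bucket = count - lower_count
            if in_bucket == 0:
                return lower_bound
            width = upper_bound - lower_bound
            return lower_bound + width * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical buckets for a latency histogram, in seconds.
BUCKETS = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]

print(estimate_quantile(0.95, BUCKETS))
```

The accuracy of the estimate depends on the bucket layout: the wider the bucket that contains the target rank, the coarser the result.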

名称类型描述
apiserver_audit_event_totalcounterapiserver_audit_event_total [ALPHA] Counter of audit events generated and sent to the audit backend.
apiserver_audit_requests_rejected_totalcounterapiserver_audit_requests_rejected_total [ALPHA] Counter of apiserver requests rejected due to an error in audit logging backend.
apiserver_client_certificate_expiration_secondshistogramapiserver_client_certificate_expiration_seconds [ALPHA] Distribution of the remaining lifetime on the certificate used to authenticate a request.
apiserver_delegated_authn_request_duration_secondshistogramapiserver_delegated_authn_request_duration_seconds [ALPHA] Request latency in seconds. Broken down by status code.
apiserver_delegated_authn_request_totalcounterapiserver_delegated_authn_request_total [ALPHA] Number of HTTP requests partitioned by status code.
apiserver_delegated_authz_request_duration_secondshistogramapiserver_delegated_authz_request_duration_seconds [ALPHA] Request latency in seconds. Broken down by status code.
apiserver_delegated_authz_request_totalcounterapiserver_delegated_authz_request_total [ALPHA] Number of HTTP requests partitioned by status code.
apiserver_envelope_encryption_dek_cache_fill_percentgaugeapiserver_envelope_encryption_dek_cache_fill_percent [ALPHA] Percent of the cache slots currently occupied by cached DEKs.
apiserver_storage_data_key_generation_duration_secondshistogramapiserver_storage_data_key_generation_duration_seconds [ALPHA] Latencies in seconds of data encryption key(DEK) generation operations.
apiserver_storage_data_key_generation_failures_totalcounterapiserver_storage_data_key_generation_failures_total [ALPHA] Total number of failed data encryption key(DEK) generation operations.
apiserver_storage_envelope_transformation_cache_misses_totalcounterapiserver_storage_envelope_transformation_cache_misses_total [ALPHA] Total number of cache misses while accessing key decryption key(KEK).
apiserver_webhooks_x509_missing_san_totalcounterapiserver_webhooks_x509_missing_san_total [ALPHA] Counts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the lack of x509 certificate SAN extension missing (either/or, based on the runtime environment)
authentication_token_cache_active_fetch_countgaugeauthentication_token_cache_active_fetch_count [ALPHA]
authentication_token_cache_fetch_totalcounterauthentication_token_cache_fetch_total [ALPHA]
authentication_token_cache_request_duration_secondshistogramauthentication_token_cache_request_duration_seconds [ALPHA]
authentication_token_cache_request_totalcounterauthentication_token_cache_request_total [ALPHA]
csi_operations_secondshistogramcsi_operations_seconds [ALPHA] Container Storage Interface operation duration with gRPC error code status total
get_token_countcounterget_token_count [ALPHA] Counter of total Token() requests to the alternate token source
get_token_fail_countcounterget_token_fail_count [ALPHA] Counter of failed Token() requests to the alternate token source
go_gc_duration_secondssummarygo_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
go_goroutinesgaugego_goroutines Number of goroutines that currently exist.
go_infogaugego_info Information about the Go environment.
go_memstats_alloc_bytesgaugego_memstats_alloc_bytes Number of bytes allocated and still in use.
go_memstats_alloc_bytes_totalcountergo_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytesgaugego_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
go_memstats_frees_totalcountergo_memstats_frees_total Total number of frees.
go_memstats_gc_cpu_fractiongaugego_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
go_memstats_gc_sys_bytesgaugego_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytesgaugego_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytesgaugego_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytesgaugego_memstats_heap_inuse_bytes Number of heap bytes that are in use.
go_memstats_heap_objectsgaugego_memstats_heap_objects Number of allocated objects.
go_memstats_heap_released_bytesgaugego_memstats_heap_released_bytes Number of heap bytes released to OS.
go_memstats_heap_sys_bytesgaugego_memstats_heap_sys_bytes Number of heap bytes obtained from system.
go_memstats_last_gc_time_secondsgaugego_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_totalcountergo_memstats_lookups_total Total number of pointer lookups.
go_memstats_mallocs_totalcountergo_memstats_mallocs_total Total number of mallocs.
go_memstats_mcache_inuse_bytesgaugego_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytesgaugego_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytesgaugego_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytesgaugego_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytesgaugego_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytesgaugego_memstats_other_sys_bytes Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytesgaugego_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytesgaugego_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytesgaugego_memstats_sys_bytes Number of bytes obtained from system.
go_threadsgaugego_threads Number of OS threads created.
| kubelet_cgroup_manager_duration_seconds | histogram | kubelet_cgroup_manager_duration_seconds [ALPHA] Duration in seconds for cgroup manager operations. Broken down by method. |
| kubelet_container_log_filesystem_used_bytes | gauge | kubelet_container_log_filesystem_used_bytes [ALPHA] Bytes used by the container's logs on the filesystem. |
| kubelet_containers_per_pod_count | histogram | kubelet_containers_per_pod_count [ALPHA] The number of containers per pod. |
| kubelet_device_plugin_alloc_duration_seconds | histogram | kubelet_device_plugin_alloc_duration_seconds [ALPHA] Duration in seconds to serve a device plugin Allocation request. Broken down by resource name. |
| kubelet_device_plugin_registration_total | counter | kubelet_device_plugin_registration_total [ALPHA] Cumulative number of device plugin registrations. Broken down by resource name. |
| kubelet_docker_operations_duration_seconds | histogram | kubelet_docker_operations_duration_seconds [ALPHA] Latency in seconds of Docker operations. Broken down by operation type. |
| kubelet_docker_operations_errors_total | counter | kubelet_docker_operations_errors_total [ALPHA] Cumulative number of Docker operation errors by operation type. |
| kubelet_docker_operations_timeout_total | counter | kubelet_docker_operations_timeout_total [ALPHA] Cumulative number of Docker operation timeouts by operation type. |
| kubelet_docker_operations_total | counter | kubelet_docker_operations_total [ALPHA] Cumulative number of Docker operations by operation type. |
| kubelet_http_inflight_requests | gauge | kubelet_http_inflight_requests [ALPHA] Number of in-flight HTTP requests. |
| kubelet_http_requests_duration_seconds | histogram | kubelet_http_requests_duration_seconds [ALPHA] Duration in seconds to serve HTTP requests. |
| kubelet_http_requests_total | counter | kubelet_http_requests_total [ALPHA] Number of HTTP requests received since the server started. |
| kubelet_managed_ephemeral_containers | gauge | kubelet_managed_ephemeral_containers [ALPHA] Current number of ephemeral containers in pods managed by this kubelet. Ephemeral containers will be ignored if disabled by the EphemeralContainers feature gate, and this number will be 0. |
| kubelet_network_plugin_operations_duration_seconds | histogram | kubelet_network_plugin_operations_duration_seconds [ALPHA] Latency in seconds of network plugin operations. Broken down by operation type. |
| kubelet_network_plugin_operations_errors_total | counter | kubelet_network_plugin_operations_errors_total [ALPHA] Cumulative number of network plugin operation errors by operation type. |
| kubelet_network_plugin_operations_total | counter | kubelet_network_plugin_operations_total [ALPHA] Cumulative number of network plugin operations by operation type. |
| kubelet_node_name | gauge | kubelet_node_name [ALPHA] The node's name. The count is always 1. |
| kubelet_pleg_discard_events | counter | kubelet_pleg_discard_events [ALPHA] The number of discard events in PLEG. |
| kubelet_pleg_last_seen_seconds | gauge | kubelet_pleg_last_seen_seconds [ALPHA] Timestamp in seconds when PLEG was last seen active. |
| kubelet_pleg_relist_duration_seconds | histogram | kubelet_pleg_relist_duration_seconds [ALPHA] Duration in seconds for relisting pods in PLEG. |
| kubelet_pleg_relist_interval_seconds | histogram | kubelet_pleg_relist_interval_seconds [ALPHA] Interval in seconds between relisting in PLEG. |
| kubelet_pod_start_duration_seconds | histogram | kubelet_pod_start_duration_seconds [ALPHA] Duration in seconds for a single pod to go from pending to running. |
| kubelet_pod_worker_duration_seconds | histogram | kubelet_pod_worker_duration_seconds [ALPHA] Duration in seconds to sync a single pod. Broken down by operation type: create, update, or sync |
| kubelet_pod_worker_start_duration_seconds | histogram | kubelet_pod_worker_start_duration_seconds [ALPHA] Duration in seconds from seeing a pod to starting a worker. |
| kubelet_run_podsandbox_duration_seconds | histogram | kubelet_run_podsandbox_duration_seconds [ALPHA] Duration in seconds of the run_podsandbox operations. Broken down by RuntimeClass.Handler. |
| kubelet_run_podsandbox_errors_total | counter | kubelet_run_podsandbox_errors_total [ALPHA] Cumulative number of the run_podsandbox operation errors by RuntimeClass.Handler. |
| kubelet_running_containers | gauge | kubelet_running_containers [ALPHA] Number of containers currently running |
| kubelet_running_pods | gauge | kubelet_running_pods [ALPHA] Number of pods that have a running pod sandbox |
| kubelet_runtime_operations_duration_seconds | histogram | kubelet_runtime_operations_duration_seconds [ALPHA] Duration in seconds of runtime operations. Broken down by operation type. |
| kubelet_runtime_operations_errors_total | counter | kubelet_runtime_operations_errors_total [ALPHA] Cumulative number of runtime operation errors by operation type. |
| kubelet_runtime_operations_total | counter | kubelet_runtime_operations_total [ALPHA] Cumulative number of runtime operations by operation type. |
| kubelet_started_containers_errors_total | counter | kubelet_started_containers_errors_total [ALPHA] Cumulative number of errors when starting containers |
| kubelet_started_containers_total | counter | kubelet_started_containers_total [ALPHA] Cumulative number of containers started |
| kubelet_started_pods_errors_total | counter | kubelet_started_pods_errors_total [ALPHA] Cumulative number of errors when starting pods |
| kubelet_started_pods_total | counter | kubelet_started_pods_total [ALPHA] Cumulative number of pods started |
| kubelet_volume_stats_available_bytes | gauge | kubelet_volume_stats_available_bytes [ALPHA] Number of available bytes in the volume |
| kubelet_volume_stats_capacity_bytes | gauge | kubelet_volume_stats_capacity_bytes [ALPHA] Capacity in bytes of the volume |
| kubelet_volume_stats_inodes | gauge | kubelet_volume_stats_inodes [ALPHA] Maximum number of inodes in the volume |
| kubelet_volume_stats_inodes_free | gauge | kubelet_volume_stats_inodes_free [ALPHA] Number of free inodes in the volume |
| kubelet_volume_stats_inodes_used | gauge | kubelet_volume_stats_inodes_used [ALPHA] Number of used inodes in the volume |
| kubelet_volume_stats_used_bytes | gauge | kubelet_volume_stats_used_bytes [ALPHA] Number of used bytes in the volume |
| kubernetes_build_info | gauge | kubernetes_build_info [ALPHA] A metric with a constant '1' value labeled by major, minor, git version, git commit, git tree state, build date, Go version, and compiler from which Kubernetes was built, and platform on which it is running. |
| plugin_manager_total_plugins | gauge | plugin_manager_total_plugins [ALPHA] Number of plugins in Plugin Manager |
| process_cpu_seconds_total | counter | process_cpu_seconds_total Total user and system CPU time spent in seconds. |
| process_max_fds | gauge | process_max_fds Maximum number of open file descriptors. |
| process_open_fds | gauge | process_open_fds Number of open file descriptors. |
| process_resident_memory_bytes | gauge | process_resident_memory_bytes Resident memory size in bytes. |
| process_start_time_seconds | gauge | process_start_time_seconds Start time of the process since unix epoch in seconds. |
| process_virtual_memory_bytes | gauge | process_virtual_memory_bytes Virtual memory size in bytes. |
| process_virtual_memory_max_bytes | gauge | process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes. |
| rest_client_exec_plugin_certificate_rotation_age | histogram | rest_client_exec_plugin_certificate_rotation_age [ALPHA] Histogram of the number of seconds the last auth exec plugin client certificate lived before being rotated. If auth exec plugin client certificates are unused, histogram will contain no data. |
| rest_client_exec_plugin_ttl_seconds | gauge | rest_client_exec_plugin_ttl_seconds [ALPHA] Gauge of the shortest TTL (time-to-live) of the client certificate(s) managed by the auth exec plugin. The value is in seconds until certificate expiry (negative if already expired). If auth exec plugins are unused or manage no TLS certificates, the value will be +INF. |
| rest_client_rate_limiter_duration_seconds | histogram | rest_client_rate_limiter_duration_seconds [ALPHA] Client side rate limiter latency in seconds. Broken down by verb and URL. |
| rest_client_request_duration_seconds | histogram | rest_client_request_duration_seconds [ALPHA] Request latency in seconds. Broken down by verb and URL. |
| rest_client_requests_total | counter | rest_client_requests_total [ALPHA] Number of HTTP requests, partitioned by status code, method, and host. |
| storage_operation_duration_seconds | histogram | storage_operation_duration_seconds [ALPHA] Storage operation duration |
| storage_operation_errors_total | counter | storage_operation_errors_total [ALPHA] Storage operation errors (Deprecated since 1.21.0) |
| storage_operation_status_count | counter | storage_operation_status_count [ALPHA] Storage operation return statuses count (Deprecated since 1.21.0) |
| volume_manager_total_volumes | gauge | volume_manager_total_volumes [ALPHA] Number of volumes in Volume Manager |
| workqueue_adds_total | counter | workqueue_adds_total [ALPHA] Total number of adds handled by workqueue |
| workqueue_depth | gauge | workqueue_depth [ALPHA] Current depth of workqueue |
| workqueue_longest_running_processor_seconds | gauge | workqueue_longest_running_processor_seconds [ALPHA] How many seconds has the longest running processor for workqueue been running. |
| workqueue_queue_duration_seconds | histogram | workqueue_queue_duration_seconds [ALPHA] How long in seconds an item stays in workqueue before being requested. |
| workqueue_retries_total | counter | workqueue_retries_total [ALPHA] Total number of retries handled by workqueue |
| workqueue_unfinished_work_seconds | gauge | workqueue_unfinished_work_seconds [ALPHA] How many seconds of work has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases. |
| workqueue_work_duration_seconds | histogram | workqueue_work_duration_seconds [ALPHA] How long in seconds processing an item from workqueue takes. |
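
As a rough illustration of how gauges from the table can be combined, the sketch below parses a few lines in the Prometheus text exposition format and derives a used-space ratio per volume from kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes. The sample text, its values, the label names used as the key (namespace, persistentvolumeclaim), and the helper names parse_samples and volume_usage_ratio are assumptions made for this example; they are not taken from a real scrape. The parser is deliberately simplified and ignores optional timestamps.

```python
import re

# Hypothetical sample in the Prometheus text exposition format; values are made up.
SAMPLE = """\
# HELP kubelet_volume_stats_capacity_bytes [ALPHA] Capacity in bytes of the volume
# TYPE kubelet_volume_stats_capacity_bytes gauge
kubelet_volume_stats_capacity_bytes{namespace="default",persistentvolumeclaim="data-0"} 1000
kubelet_volume_stats_used_bytes{namespace="default",persistentvolumeclaim="data-0"} 250
kubelet_volume_stats_capacity_bytes{namespace="default",persistentvolumeclaim="data-1"} 2000
kubelet_volume_stats_used_bytes{namespace="default",persistentvolumeclaim="data-1"} 1500
"""

# metric_name{label="value",...} value   (labels optional, no timestamp support)
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (metric name, labels dict, float value) for each sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        yield name, labels, float(value)


def volume_usage_ratio(text):
    """Return {(namespace, claim): used / capacity} for volumes with both gauges."""
    used, capacity = {}, {}
    for name, labels, value in parse_samples(text):
        # Assumed label names; adjust to whatever the scraped output carries.
        key = (labels.get("namespace"), labels.get("persistentvolumeclaim"))
        if name == "kubelet_volume_stats_used_bytes":
            used[key] = value
        elif name == "kubelet_volume_stats_capacity_bytes":
            capacity[key] = value
    # Skip volumes with missing or zero capacity to avoid division by zero.
    return {key: used[key] / capacity[key] for key in used if capacity.get(key)}


if __name__ == "__main__":
    for (namespace, claim), ratio in sorted(volume_usage_ratio(SAMPLE).items()):
        print(f"{namespace}/{claim}: {ratio:.0%} used")
```

The same pairing idea applies to kubelet_volume_stats_inodes_used and kubelet_volume_stats_inodes. In a real deployment this arithmetic would more likely be written as a query in the monitoring system than computed in a script.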