HBase RegionServer Monitoring

Collect and monitor common performance metrics for HBase RegionServer.

Protocol: HTTP

Pre-Monitoring Operations

Check the hbase-site.xml file for the value of the hbase.regionserver.info.port configuration item. Monitoring connects to the RegionServer on this port.
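As a minimal sketch of this lookup, the following Python snippet reads hbase.regionserver.info.port from the contents of an hbase-site.xml file. The helper name `read_info_port` and the sample file content are hypothetical. The fallback of 16030 is the commonly documented default and should be verified against your HBase version.

```python
import xml.etree.ElementTree as ET

PORT_KEY = "hbase.regionserver.info.port"
DEFAULT_INFO_PORT = 16030  # assumed default; verify for your HBase version


def read_info_port(xml_text: str, default: int = DEFAULT_INFO_PORT) -> int:
    """Return the info port from hbase-site.xml content, or the default."""
    root = ET.fromstring(xml_text)
    # hbase-site.xml is a list of <property><name>/<value> pairs.
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == PORT_KEY:
            value = prop.findtext("value", default="").strip()
            if value:
                return int(value)
    return default


# Hypothetical file content, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>hbase.regionserver.info.port</name>
    <value>16031</value>
  </property>
</configuration>"""

print(read_info_port(SAMPLE))  # prints 16031
```

If the property is absent from the file, the function falls back to the default, which matches how HBase treats unset configuration items.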

Configuration Parameters

| Parameter Name | Parameter Description |
| --- | --- |
| Target Host | The IPv4 address, IPv6 address, or domain name of the monitored entity. Note ⚠️ Do not include the protocol header (e.g., https://, http://). |
| Port | The port number of the HBase RegionServer, i.e., the value of the hbase.regionserver.info.port parameter. Default is 16030. |
| Task Name | A unique name that identifies this monitoring task. |
| Query Timeout | The timeout for queries to the HBase RegionServer, in milliseconds. Default is 3000 ms. |
| Collection Interval | The interval between periodic data collections, in seconds. The minimum interval is 30 seconds. |
| Probe Before Adding | Whether to probe the target for availability before adding the monitoring task. The task is added only if the probe succeeds. |
| Description Note | Additional notes that identify and describe this monitoring task. |
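The parameters above can be sketched as a small Python probe. This is only an illustration, not the monitoring tool's own implementation. The `/jmx` path is an assumption based on the JSON metrics servlet that HBase daemons typically expose on their info port, and the function names are hypothetical.

```python
import json
import urllib.request

MIN_INTERVAL_S = 30  # minimum collection interval from the table above


def build_jmx_url(host: str, port: int = 16030) -> str:
    """Build the metrics URL from Target Host and Port (the /jmx path is assumed)."""
    if "://" in host:
        raise ValueError("Target Host must not include a protocol header")
    if ":" in host and not host.startswith("["):
        host = f"[{host}]"  # bare IPv6 literals need brackets inside a URL
    return f"http://{host}:{port}/jmx"


def check_interval(seconds: int) -> int:
    """Reject collection intervals below the documented minimum."""
    if seconds < MIN_INTERVAL_S:
        raise ValueError(f"Collection Interval must be at least {MIN_INTERVAL_S} s")
    return seconds


def fetch_jmx(url: str, timeout_ms: int = 3000) -> dict:
    """Fetch and decode the JSON payload, honoring Query Timeout (milliseconds)."""
    with urllib.request.urlopen(url, timeout=timeout_ms / 1000) as resp:
        return json.load(resp)


print(build_jmx_url("rs1.example.com"))  # prints http://rs1.example.com:16030/jmx
```

A probe before adding the task would then amount to calling `fetch_jmx(build_jmx_url(host, port))` and treating any exception as a failed probe.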

Collection Metrics

Metric names are taken directly from the official HBase fields, so some of them do not follow a consistent naming convention.

Metric Set: server

| Metric Name | Unit | Metric Description |
| --- | --- | --- |
| regionCount | None | Number of Regions |
| readRequestCount | None | Number of read requests since cluster restart |
| writeRequestCount | None | Number of write requests since cluster restart |
| averageRegionSize | MB | Average size of a Region |
| totalRequestCount | None | Total number of requests |
| ScanTime_num_ops | None | Total number of Scan requests |
| Append_num_ops | None | Total number of Append requests |
| Increment_num_ops | None | Total number of Increment requests |
| Get_num_ops | None | Total number of Get requests |
| Delete_num_ops | None | Total number of Delete requests |
| Put_num_ops | None | Total number of Put requests |
| ScanTime_mean | None | Average time of a Scan request |
| ScanTime_min | None | Minimum time of a Scan request |
| ScanTime_max | None | Maximum time of a Scan request |
| ScanSize_mean | bytes | Average size of a Scan request |
| ScanSize_min | bytes | Minimum size of a Scan request |
| ScanSize_max | bytes | Maximum size of a Scan request |
| slowPutCount | None | Number of slow Put operations |
| slowGetCount | None | Number of slow Get operations |
| slowAppendCount | None | Number of slow Append operations |
| slowIncrementCount | None | Number of slow Increment operations |
| slowDeleteCount | None | Number of slow Delete operations |
| blockCacheSize | None | Amount of memory used by the block cache |
| blockCacheCount | None | Number of blocks in the block cache |
| blockCacheExpressHitPercent | None | Block cache hit ratio |
| memStoreSize | None | Size of the MemStore |
| FlushTime_num_ops | None | Number of MemStore flushes to disk by the RegionServer |
| flushQueueLength | None | Length of the Region flush queue |
| flushedCellsSize | None | Size of data flushed to disk |
| storeFileCount | None | Number of StoreFiles |
| storeCount | None | Number of Stores |
| storeFileSize | None | Size of StoreFiles |
| compactionQueueLength | None | Length of the compaction queue |
| percentFilesLocal | None | Percentage of HFiles stored on the local HDFS DataNode |
| percentFilesLocalSecondaryRegions | None | Percentage of HFiles for secondary region replicas stored on the local HDFS DataNode |
| hlogFileCount | None | Number of WAL files |
| hlogFileSize | None | Size of WAL files |
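Several metrics in this set, such as readRequestCount and writeRequestCount, are cumulative counters, so the change between two collections is often more informative than the raw value. The sketch below derives per-second rates from two consecutive samples. The function name and the sample values are hypothetical, and treating a decreasing counter as a restart is an assumption of this example.

```python
def per_second_rates(prev: dict, curr: dict, interval_s: float, names: list) -> dict:
    """Convert cumulative counters into per-second rates between two samples."""
    rates = {}
    for name in names:
        delta = curr[name] - prev[name]
        if delta < 0:
            # The counter went backwards, so assume it was reset by a restart
            # and count only what has accumulated since then.
            delta = curr[name]
        rates[name] = delta / interval_s
    return rates


# Hypothetical samples taken 30 seconds apart (the minimum collection interval).
previous = {"readRequestCount": 1000, "writeRequestCount": 500}
current = {"readRequestCount": 1600, "writeRequestCount": 200}

rates = per_second_rates(previous, current, 30, ["readRequestCount", "writeRequestCount"])
print(rates["readRequestCount"])  # prints 20.0
```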

Metric Set: IPC

| Metric Name | Unit | Metric Description |
| --- | --- | --- |
| numActiveHandler | None | Current number of active RPC handlers |
| NotServingRegionException | None | Number of NotServingRegionException errors |
| RegionMovedException | None | Number of RegionMovedException errors |
| RegionTooBusyException | None | Number of RegionTooBusyException errors |

Metric Set: JVM

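| Metric Name | Unit | Metric Description |
| --- | --- | --- |
| MemNonHeapUsedM | MB | Non-heap memory currently used by the JVM |
| MemNonHeapCommittedM | MB | Non-heap memory currently committed to the JVM |
| MemHeapUsedM | MB | Heap memory currently used by the JVM |
| MemHeapCommittedM | MB | Heap memory currently committed to the JVM |
| MemHeapMaxM | MB | Maximum heap memory available to the JVM |
| MemMaxM | MB | Maximum memory available to the JVM |
| GcCount | None | Total number of garbage collections |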
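As one illustration of how the JVM metric set can be used, the sketch below computes heap utilization from MemHeapUsedM and MemHeapMaxM, assuming both are reported in the same unit. The payload shape (a top-level `beans` list) and the bean name `Hadoop:service=HBase,name=JvmMetrics` are assumptions about the RegionServer's JSON output, and the sample values are made up.

```python
JVM_BEAN = "Hadoop:service=HBase,name=JvmMetrics"  # assumed bean name


def find_bean(payload: dict, name: str) -> dict:
    """Return the bean with the given name from a JMX-style JSON payload."""
    for bean in payload.get("beans", []):
        if bean.get("name") == name:
            return bean
    raise KeyError(name)


def heap_used_percent(jvm: dict) -> float:
    """Heap utilization as a percentage of the maximum heap size."""
    return 100.0 * jvm["MemHeapUsedM"] / jvm["MemHeapMaxM"]


# Hypothetical payload, for illustration only.
sample_payload = {
    "beans": [
        {"name": JVM_BEAN, "MemHeapUsedM": 512.0, "MemHeapMaxM": 2048.0, "GcCount": 42}
    ]
}

jvm = find_bean(sample_payload, JVM_BEAN)
print(heap_used_percent(jvm))  # prints 25.0
```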