Monitoring: HBase RegionServer
Collect and monitor common performance metrics for HBase RegionServer.
Protocol: HTTP
Pre-Monitoring Operations
Review the hbase-site.xml file to obtain the value of the hbase.regionserver.info.port configuration item, which is used for monitoring.
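As an illustration of this step, the port can also be read from the file programmatically. The following is a minimal sketch, assuming the standard Hadoop-style layout of hbase-site.xml (`<property>` elements with `<name>` and `<value>` children). The function name `read_info_port` and the sample XML are hypothetical; the fallback of 16030 is the default port listed under Configuration Parameters.

```python
import xml.etree.ElementTree as ET

DEFAULT_INFO_PORT = 16030  # default RegionServer info port per this document


def read_info_port(xml_text: str, default: int = DEFAULT_INFO_PORT) -> int:
    """Return hbase.regionserver.info.port from hbase-site.xml content.

    Falls back to `default` when the property is absent or empty.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = (prop.findtext("name", "") or "").strip()
        if name == "hbase.regionserver.info.port":
            value = (prop.findtext("value", "") or "").strip()
            if value:
                return int(value)
    return default


# Hypothetical file content, used only for illustration.
SAMPLE_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.regionserver.info.port</name>
    <value>16031</value>
  </property>
</configuration>"""

print(read_info_port(SAMPLE_SITE_XML))
```

In practice the XML text would come from reading the hbase-site.xml file on the RegionServer host.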
Configuration Parameters
| Parameter Name | Parameter Description |
|---|---|
| Target Host | The IPv4 address, IPv6 address, or domain name of the monitored entity. Note ⚠️ Do not include the protocol header (e.g., https://, http://). |
| Port | The HBase RegionServer info port, i.e., the value of the hbase.regionserver.info.port parameter. Default is 16030. |
| Task Name | A unique name to identify this monitoring task. |
| Query Timeout | The timeout for queries to the HBase RegionServer, in milliseconds. Default is 3000 ms. |
| Collection Interval | The interval between periodic data collections, in seconds. The minimum interval is 30 seconds. |
| Probe Before Adding | Whether to probe the target for availability before adding the monitoring task. The task is added only if the probe succeeds. |
| Description Note | Optional notes to identify and describe this monitoring task. |
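The constraints in the table above can be sketched as a small validation helper. This is an illustrative sketch only, not the monitoring tool's own code: the class `RegionServerTask`, its field names, and the `/jmx` path are assumptions (HBase's info server commonly serves metrics as JSON under `/jmx`, but confirm this for your deployment).

```python
from dataclasses import dataclass


@dataclass
class RegionServerTask:
    """Hypothetical container for the parameters described above."""
    host: str
    port: int = 16030        # default info port per the table
    timeout_ms: int = 3000   # default query timeout per the table
    interval_s: int = 30     # minimum collection interval per the table

    def validate(self) -> None:
        if "://" in self.host:
            raise ValueError("Target Host must not include a protocol header")
        if not 1 <= self.port <= 65535:
            raise ValueError("Port must be between 1 and 65535")
        if self.timeout_ms <= 0:
            raise ValueError("Query Timeout must be positive")
        if self.interval_s < 30:
            raise ValueError("Collection Interval must be at least 30 seconds")

    def jmx_url(self) -> str:
        # IPv6 literals must be bracketed inside a URL.
        host = f"[{self.host}]" if ":" in self.host else self.host
        return f"http://{host}:{self.port}/jmx"  # '/jmx' path is an assumption


task = RegionServerTask(host="rs1.example.com")
task.validate()
print(task.jmx_url())
```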
Collection Metrics
All metric names are taken verbatim from the official HBase fields, so some do not follow a consistent naming convention.
Metric Set: server
| Metric Name | Unit | Metric Description |
|---|---|---|
| regionCount | None | Number of Regions |
| readRequestCount | None | Number of read requests since cluster restart |
| writeRequestCount | None | Number of write requests since cluster restart |
| averageRegionSize | MB | Average size of a Region |
| totalRequestCount | None | Total number of requests |
| ScanTime_num_ops | None | Total number of Scan requests |
| Append_num_ops | None | Total number of Append requests |
| Increment_num_ops | None | Total number of Increment requests |
| Get_num_ops | None | Total number of Get requests |
| Delete_num_ops | None | Total number of Delete requests |
| Put_num_ops | None | Total number of Put requests |
| ScanTime_mean | ms | Average time of a Scan request |
| ScanTime_min | ms | Minimum time of a Scan request |
| ScanTime_max | ms | Maximum time of a Scan request |
| ScanSize_mean | bytes | Average size of a Scan request |
| ScanSize_min | bytes | Minimum size of a Scan request |
| ScanSize_max | bytes | Maximum size of a Scan request |
| slowPutCount | None | Number of slow Put operations |
| slowGetCount | None | Number of slow Get operations |
| slowAppendCount | None | Number of slow Append operations |
| slowIncrementCount | None | Number of slow Increment operations |
| slowDeleteCount | None | Number of slow Delete operations |
| blockCacheSize | None | Size of memory used by block cache |
| blockCacheCount | None | Number of blocks in Block Cache |
| blockCacheExpressHitPercent | % | Block cache hit percentage |
| memStoreSize | None | Size of the MemStore |
| FlushTime_num_ops | None | Number of MemStore flushes to disk performed by the RegionServer |
| flushQueueLength | None | Length of Region Flush queue |
| flushedCellsSize | None | Size flushed to disk |
| storeFileCount | None | Number of Storefiles |
| storeCount | None | Number of Stores |
| storeFileSize | None | Size of Storefiles |
| compactionQueueLength | None | Length of Compaction queue |
| percentFilesLocal | % | Percentage of HFiles stored on the local HDFS DataNode |
| percentFilesLocalSecondaryRegions | % | Percentage of HFiles for secondary region replicas stored on the local HDFS DataNode |
| hlogFileCount | None | Number of WAL files |
| hlogFileSize | None | Size of WAL files |
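To show how fields like those above might be read from the HTTP response, here is a minimal sketch. It assumes the RegionServer returns a JSON document with a top-level `beans` array, and that the server metrics live in a bean named `Hadoop:service=HBase,name=RegionServer,sub=Server`. Both are assumptions about the JMX JSON output and should be checked against your HBase version. The helper `extract_metrics` and the sample payload are hypothetical.

```python
import json

# Assumed bean name for the "server" metric set; verify against your cluster.
SERVER_BEAN = "Hadoop:service=HBase,name=RegionServer,sub=Server"


def extract_metrics(jmx_json: str, bean_name: str, fields: list) -> dict:
    """Pick the requested fields from the named bean of a JMX JSON payload.

    Fields missing from the bean are skipped; an unknown bean yields {}.
    """
    payload = json.loads(jmx_json)
    for bean in payload.get("beans", []):
        if bean.get("name") == bean_name:
            return {f: bean[f] for f in fields if f in bean}
    return {}


# Made-up payload for illustration; real responses contain many more fields.
SAMPLE_JMX = json.dumps({
    "beans": [
        {"name": SERVER_BEAN, "regionCount": 12, "readRequestCount": 3400},
    ]
})

wanted = ["regionCount", "readRequestCount", "compactionQueueLength"]
print(extract_metrics(SAMPLE_JMX, SERVER_BEAN, wanted))
```

Skipping absent fields rather than raising keeps the sketch tolerant of differences between HBase versions, where some metrics may not be exposed.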
Metric Set: IPC
| Metric Name | Unit | Metric Description |
|---|---|---|
| numActiveHandler | None | Number of currently active RPC handlers |
| NotServingRegionException | None | Number of NotServingRegionException errors thrown |
| RegionMovedException | None | Number of RegionMovedException errors thrown |
| RegionTooBusyException | None | Number of RegionTooBusyException errors thrown |
Metric Set: JVM
| Metric Name | Unit | Metric Description |
|---|---|---|
| MemNonHeapUsedM | MB | Non-heap memory currently used by the JVM |
| MemNonHeapCommittedM | MB | Non-heap memory committed to the JVM |
| MemHeapUsedM | MB | Heap memory currently used by the JVM |
| MemHeapCommittedM | MB | Heap memory committed to the JVM |
| MemHeapMaxM | MB | Maximum heap memory available to the JVM |
| MemMaxM | MB | Maximum memory available to the JVM |
| GcCount | None | Total number of garbage collections |
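The JVM memory fields are often easier to act on as a ratio than as raw values. The sketch below derives heap usage as a percentage from MemHeapUsedM and MemHeapMaxM. The function name `heap_usage_percent` and the sample values are hypothetical, and the code assumes both fields have already been collected into a dictionary.

```python
from typing import Optional


def heap_usage_percent(jvm: dict) -> Optional[float]:
    """Return heap usage as a percentage, or None if it cannot be computed."""
    used = jvm.get("MemHeapUsedM")
    max_heap = jvm.get("MemHeapMaxM")
    # Guard against missing values and a non-positive maximum.
    if used is None or max_heap is None or max_heap <= 0:
        return None
    return round(100.0 * used / max_heap, 1)


# Made-up values for illustration.
sample_jvm = {"MemHeapUsedM": 512.0, "MemHeapMaxM": 2048.0}
print(heap_usage_percent(sample_jvm))  # 25.0
```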