Zabbix
Zabbix is an enterprise-class open source distributed monitoring solution designed to monitor and track the performance and availability of IT infrastructure, including networks, servers, virtual machines, applications, services, databases, websites, cloud services, and more.[1] It provides a flexible notification mechanism that allows users to set up alerts via email or other channels when predefined thresholds are exceeded, along with comprehensive reporting and data visualization tools for capacity planning and analysis.[1] Released under the GNU AGPLv3 license, Zabbix is free to use and distribute, with its source code publicly available, enabling customization and extension through plugins and integrations.[1] Created by Alexei Vladishev and first released publicly in 2001, Zabbix is actively developed and supported by Zabbix SIA, a company dedicated to advancing open source monitoring technologies.[2] The software supports both active and passive monitoring modes, including polling for data collection and trapping for real-time events, and features a web-based frontend for configuration, reporting, and statistics access.[1]

Scalable from single devices to environments with hundreds of thousands of monitored entities, Zabbix is deployed on-premises, in the cloud, or as a SaaS offering, with no licensing fees or per-device costs, making it suitable for enterprises, managed service providers, and organizations seeking low total cost of ownership.[3] Widely adopted by Fortune 500 companies and major brands such as Dell, Orange, and ICANN, Zabbix has earned recognition for its reliability and versatility in sectors including IT services, manufacturing, and telecommunications.[3] It has been named a Gartner Peer Insights Customers' Choice for IT Infrastructure Monitoring Tools multiple times, including in 2019 and 2020, reflecting high user satisfaction with its power, ease of integration, and support for multitenant environments.[4]

Introduction
Overview
Zabbix is an enterprise-class, open-source distributed monitoring software designed to oversee IT infrastructure, encompassing networks, servers, virtual machines, cloud services, and applications.[5] It enables real-time tracking of availability, performance, and integrity metrics across diverse environments, helping organizations prevent downtime, detect issues proactively, and optimize resource utilization.[6] The platform excels in scalability, supporting the collection of millions of metrics from hundreds of thousands of devices through features like distributed proxies and high-availability configurations.[6] Deployment options include on-premise installations, cloud-based SaaS via Zabbix Cloud, and hybrid setups integrated with major providers such as AWS, Azure, and Google Cloud, allowing flexible adaptation to varying infrastructure needs.[7][8][9][10] Zabbix is trusted by prominent organizations including Dell, Orange, and ICANN for its robust monitoring capabilities in production environments.[11][12][13] As an open-source solution with no per-device licensing fees, it offers a low total cost of ownership while providing enterprise-grade functionality without vendor lock-in.[14]

Development and Licensing
Zabbix originated as a personal project by Alexei Vladishev, a software engineer based in Riga, Latvia, who developed the initial monitoring solution to address his own needs for tracking IT infrastructure performance. The first public release of Zabbix occurred in April 2001, marking the beginning of its evolution as an open-source tool.[15] In 2005, Zabbix SIA was founded in Riga, Latvia, with Vladishev as CEO and owner, to commercialize the software, provide professional technical support, and sustain its ongoing development. The company, which also maintains offices in other regions, has since become the primary maintainer of Zabbix, offering enterprise-level services while preserving its open-source foundation.[16] Zabbix has been distributed under the GNU General Public License version 2 (GPLv2) or later since its inception in 2001, ensuring free access for both commercial and non-commercial use. Starting with version 7.0, released as a Long-Term Support (LTS) edition, the licensing shifted to the GNU Affero General Public License version 3 (AGPLv3) to better protect the project's copyleft principles in networked environments, adapt to modern software distribution practices, and balance community openness with enterprise sustainability. This change requires that modifications made to the software, when accessed over a network, be made available under the same license, without altering usage rights for end users.[17] The development of Zabbix follows a community-driven model, where contributions from users and developers are welcomed through the official Git repository hosted at git.zabbix.com, subject to a Contributor License Agreement to ensure compatibility with the project's licensing. Zabbix SIA oversees the core maintenance, integration of contributions, and provision of professional support services. 
Governance includes a structured release policy: standard versions are issued every six months with 12 months of support, while LTS versions are released every 1.5 years and receive five years of support, comprising three years of full support followed by two years of limited support.[18][19]

Architecture
Core Components
The core components of Zabbix form a distributed architecture designed for scalable monitoring of IT infrastructure. At the heart is the Zabbix server, which serves as the central hub responsible for collecting performance and availability data from monitored hosts, processing events, calculating triggers, and managing the overall configuration and operational data. Written primarily in C for optimal performance and low memory footprint, the server operates as a daemon on Unix-like systems and supports multiple concurrent processes such as pollers for data retrieval and alert managers for notifications.[20][21] Zabbix agents are lightweight software components installed on the devices or hosts being monitored, enabling direct data collection from local resources like CPU usage, disk space, and network interfaces. There are two variants: the traditional Zabbix agent (Agent 1), written in C and supporting a wide range of platforms, and the newer Zabbix agent 2 (introduced in version 4.4), written in Go with some reused C code for enhanced flexibility and plugin support. Agents operate in active mode, where they push data periodically to the server or proxy, or passive mode, where the server pulls data on request, allowing efficient integration into various network topologies.[22][23][24] For distributed environments, such as remote locations behind firewalls, Zabbix proxies act as intermediaries to offload the server's workload by collecting data from agents in their vicinity and forwarding it to the central server after local buffering. Like the server, proxies are written in C and require a separate lightweight database such as SQLite, MySQL, or PostgreSQL to store temporary data. 
This setup enables scalable monitoring without direct exposure of the main server to all network segments.[25][26]

The frontend provides a web-based interface for users to configure Zabbix, view dashboards, and visualize monitoring data, typically running on the same host as the server using PHP (version 8.0 or later) with web servers like Apache 2.4+ or Nginx 1.20+. It connects to the backend database, supporting MySQL 8.0+, MariaDB 10.5+, or PostgreSQL 13.0+ for data persistence and retrieval, ensuring a responsive and customizable user experience across browsers.[27][28]

Additional specialized gateways extend Zabbix's capabilities for specific monitoring needs. The Java Gateway, a daemon written in Java, facilitates JMX (Java Management Extensions) monitoring by allowing the Zabbix server or proxy to query it for counter values from Java applications, acting as a passive intermediary without caching data. For network devices, Zabbix integrates SNMP trap processing through external tools like snmptrapd, combined with scripts to forward traps to the server for real-time event detection and alerting.[29][30]

Data Collection and Storage
Zabbix collects monitoring data through a combination of pull and push mechanisms. The Zabbix server actively polls Zabbix agents and proxies at user-configurable intervals, with a minimum polling interval of one second to support high-intensity data collection.[6] This polling model allows the server to retrieve metrics from monitored hosts in a controlled manner. Additionally, Zabbix supports a push model where agents or external systems send data directly to the server via trapper processes, enabling real-time ingestion without constant polling.[6] Zabbix proxies act as intermediaries, collecting data on behalf of the server and forwarding it at intervals, which distributes the load in large environments.[24] Collected data is stored in the Zabbix database using two primary structures: history tables for raw, short-term values and trends tables for long-term aggregated data. History tables retain each individual value collected from items, typically for periods ranging from one hour to 25 years, depending on configuration, and are used for immediate analysis in features like latest data views.[31] Trends, in contrast, store hourly averages, minima, maxima, and counts of values, providing compressed representations for efficient long-term storage and querying over extended periods, also configurable up to 25 years.[31] This dual approach balances detail and storage efficiency, with trends reducing database size by aggregating data while preserving statistical insights. Scalability in data handling is achieved through distributed proxies and database optimizations. 
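As an illustration of the history and trends storage model described above, the following minimal Python sketch rolls raw history samples up into hourly records holding the minimum, average, maximum, and count. The function name `aggregate_trends`, the tuple layout, and the sample values are assumptions made for this example; they do not reflect the actual Zabbix database schema or server code.

```python
from collections import defaultdict


def aggregate_trends(history):
    """Roll (unix_timestamp, value) samples up into hourly trend records.

    Returns a dict mapping the start of each hour (unix time) to a dict
    with the min, avg, max, and count of the values seen in that hour.
    """
    buckets = defaultdict(list)
    for clock, value in history:
        hour_start = clock - (clock % 3600)  # truncate the timestamp to the hour
        buckets[hour_start].append(value)

    trends = {}
    for hour_start, values in sorted(buckets.items()):
        trends[hour_start] = {
            "min": min(values),
            "avg": sum(values) / len(values),
            "max": max(values),
            "count": len(values),
        }
    return trends


# Hypothetical samples: three in the first hour, one in the second.
samples = [(0, 1.0), (60, 3.0), (3599, 2.0), (3600, 10.0)]
trends = aggregate_trends(samples)
```

Each hour collapses to a single record regardless of how many raw values were collected, which is why trend data stays compact over long retention periods.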
Proxies enable load balancing by offloading collection tasks from the central server, supporting an unlimited number of proxies and proxy groups for vertical and horizontal scaling across large infrastructures.[32] For high-volume environments, database partitioning divides history and trends tables (daily for history and monthly for trends), improving query performance and manageability in setups monitoring thousands of hosts.[33] Zabbix can process millions of metrics per second, particularly with distributed preprocessing across servers and proxies, allowing it to handle environments with hundreds of thousands of monitored devices.[34]

Security measures protect data during collection and storage. Communications between Zabbix components, including server-to-agent, server-to-proxy, and proxy-to-agent connections, can be encrypted using Transport Layer Security (TLS) protocols version 1.2 or 1.3, with support for certificate-based or pre-shared key authentication.[35] Role-based access control (RBAC) governs data visibility through predefined or custom user roles, such as Guest, User, Admin, and Super Admin, which restrict access to specific hosts, items, and dashboards based on permissions.[36]

Performance is optimized via automated processes that manage data lifecycle and querying efficiency. The housekeeping process periodically purges outdated data from history, trends, events, and other tables according to configurable retention periods, preventing database bloat and maintaining system responsiveness; it can be enabled globally or overridden per item.[37] Trend calculations further enhance performance by enabling fast retrieval of summarized data for historical analysis, reducing the need to scan raw history tables in large datasets.[31] In conjunction with partitioning and proxy distribution, these features ensure reliable operation even under heavy loads.

Features
Monitoring and Discovery
Zabbix employs network discovery rules to automatically detect and add hosts within specified IP ranges, enabling proactive monitoring of dynamic IT environments. These rules allow Zabbix to scan networks periodically, checking for device availability using protocols such as SNMP for retrieving management data, LLDP for discovering neighboring devices, and ICMP for basic ping tests. Upon detection, Zabbix generates events that can trigger actions to dynamically add hosts to the monitoring system, assigning them to default groups like "Discovered hosts" and linking appropriate templates without manual intervention. As of Zabbix 7.4 (2025), network discovery supports nested host prototypes, allowing discovered hosts to automatically discover further sub-hosts in hierarchical environments like hypervisors and virtual machines.[38][39] Data collection in Zabbix relies on various item types, each designed to gather metrics from different sources. The Zabbix agent, installed on monitored hosts, collects internal system data such as CPU load or memory usage through active or passive checks. Simple checks, performed directly by the Zabbix server without an agent, include ICMP pings to verify host reachability. For network devices, SNMP items retrieve operational statistics like interface traffic. JMX items, facilitated by the Zabbix Java Gateway, monitor Java application metrics, such as heap memory usage. Calculated items derive new values from existing data using formulas, for instance, computing average response times across multiple hosts. Additionally, HTTP agent items perform synthetic checks by polling web endpoints to assess page load times or content availability.[40] Low-level discovery (LLD) enhances automation by dynamically identifying and configuring monitoring for variable entities on hosts, such as filesystems or network interfaces, eliminating the need for manual item creation. 
LLD operates through discovery rules defined in templates, where an item key, such as vfs.fs.discovery for mounted filesystems, queries the host and returns JSON data with macros (e.g., {#FSNAME} for filesystem names). Zabbix then uses prototypes to generate corresponding items, triggers, and graphs for each discovered entity, updating them as the environment changes. This process supports preprocessing steps to filter or transform data, ensuring scalability in environments with fluctuating resources. As of Zabbix 7.4 (2025), LLD supports nested discovery rules and prototypes, enabling multi-tier automation for complex structures like discovering interfaces on discovered VMs.[41][42]
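The prototype mechanism can be sketched in a few lines of Python: a discovery result is a JSON list of objects keyed by LLD macros, and each prototype key is expanded once per discovered entity. The helper name `expand_prototype` and the sample discovery data are assumptions for illustration; real expansion happens inside the Zabbix server and also covers triggers and graphs.

```python
import json


def expand_prototype(prototype_key, lld_json):
    """Return one concrete item key per discovered entity."""
    keys = []
    for entity in json.loads(lld_json):
        key = prototype_key
        # Substitute every LLD macro (e.g. {#FSNAME}) with its discovered value.
        for macro, value in entity.items():
            key = key.replace(macro, value)
        keys.append(key)
    return keys


# Hypothetical discovery result for two mounted filesystems.
discovered = json.dumps([
    {"{#FSNAME}": "/", "{#FSTYPE}": "ext4"},
    {"{#FSNAME}": "/var", "{#FSTYPE}": "xfs"},
])

item_keys = expand_prototype("vfs.fs.size[{#FSNAME},pused]", discovered)
```

Here the single prototype `vfs.fs.size[{#FSNAME},pused]` yields one item key per filesystem, so newly mounted filesystems gain monitoring without manual item creation.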
Application monitoring in Zabbix extends to business services and synthetic user simulations, providing oversight of end-to-end performance. Business service monitoring builds a hierarchical service tree to track availability and SLA compliance, mapping problems from underlying IT components to high-level services via tags, thus identifying bottlenecks in workflows like email delivery. Synthetic checks, implemented through web scenarios and browser items, simulate user interactions on web applications, measuring response times, step execution, and required string presence to detect usability issues. Log file monitoring complements this by scanning files for patterns using regular expressions, alerting on matches like error keywords in /var/log/syslog, with support for rotation and real-time analysis limited to recent entries for efficiency.[43][44][45]
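The pattern-matching idea behind log file monitoring can be shown with a short Python sketch. The Zabbix agent performs this matching itself; the function `find_matches`, the pattern, and the sample log lines below are hypothetical and serve only to illustrate how a regular expression selects the lines that would raise an alert.

```python
import re


def find_matches(lines, pattern):
    """Return (line_number, line) pairs whose text matches the pattern."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [(n, line) for n, line in enumerate(lines, start=1) if regex.search(line)]


# Hypothetical syslog excerpt.
log_lines = [
    "Jan 10 12:00:01 web01 sshd[811]: Accepted publickey for deploy",
    "Jan 10 12:00:07 web01 app[912]: ERROR database connection refused",
    "Jan 10 12:00:09 web01 kernel: Out of memory: Killed process 912",
]
matches = find_matches(log_lines, r"error|out of memory")
```

Only the second and third lines match, so routine entries such as successful logins produce no events.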
Representative metrics monitored include CPU utilization via agent keys like system.cpu.util, disk space with vfs.fs.size, and network traffic through SNMP OIDs for inbound/outbound bytes. Custom scripts, executed as user parameters in the agent configuration, allow tailored metrics, such as application-specific counters, to be pushed back to Zabbix for storage and analysis.[40][46]
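A user parameter is, in essence, a command whose standard output becomes the item value. The sketch below is one possible custom script, assuming a hypothetical spool directory whose file count should be tracked; the script name, the item key `custom.spool.count`, and all paths are illustrative assumptions.

```python
import os
import sys


def count_files(directory):
    """Return the number of regular files directly inside `directory`."""
    return sum(1 for entry in os.scandir(directory) if entry.is_file())


# When run as a script with one argument, print a single numeric value,
# because the agent uses the command's standard output as the metric.
if __name__ == "__main__" and len(sys.argv) == 2:
    print(count_files(sys.argv[1]))
```

Assuming the script is saved as /usr/local/bin/count_files.py, an agent configuration line of the form `UserParameter=custom.spool.count,/usr/bin/python3 /usr/local/bin/count_files.py /var/spool/myapp` would expose the value under the chosen key.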
Alerting and Reporting
Zabbix alerting processes data collected from monitored items to detect anomalies through triggers, which evaluate conditions and initiate actions when problems arise. Triggers use logical expressions to assess item values, such as firing when the average CPU load exceeds 0.8 over five minutes, calculated using functions like avg(/host/system.cpu.load,5m)>0.8. These expressions support time-based periods, value counts, and trend analysis for predictive alerting, with triggers recalculating on new data or periodically for certain functions. Trigger dependencies ensure that a dependent trigger fires only while the trigger it depends on is in an OK state, preventing cascading alerts from interrelated issues.[47][48]
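To show what an expression such as avg(/host/system.cpu.load,5m)>0.8 computes, the following Python sketch evaluates the same condition over in-memory samples. It is a simplified model under stated assumptions: the function names and sample data are hypothetical, and real trigger evaluation in Zabbix involves additional state handling not shown here.

```python
def avg_over_period(samples, now, period_seconds):
    """Average the values whose timestamps fall within the last period."""
    window = [v for ts, v in samples if now - period_seconds < ts <= now]
    if not window:
        return None  # no data in the window, so nothing to evaluate
    return sum(window) / len(window)


def trigger_fires(samples, now, period_seconds=300, threshold=0.8):
    """Mimic avg(/host/system.cpu.load,5m)>0.8 on in-memory samples."""
    average = avg_over_period(samples, now, period_seconds)
    return average is not None and average > threshold


# Hypothetical (unix_timestamp, load) samples collected once per minute.
load = [(0, 0.2), (60, 0.7), (120, 0.9), (180, 1.0), (240, 0.9), (300, 1.1)]
fires_now = trigger_fires(load, now=300)    # five-minute average is 0.92
fired_early = trigger_fires(load, now=120)  # average of the first three is 0.6
```

Averaging over a period rather than testing the latest value keeps a single short spike from raising a problem.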
Triggers are assigned one of six severity levels, numbered 0 to 5, to indicate problem urgency, influencing escalation and notification priorities. The default levels are:
| Severity | Level | Color | Description |
|---|---|---|---|
| Not classified | 0 | Gray | Default for unassigned triggers. |
| Information | 1 | Light blue | Informational events providing insights without immediate action. |
| Warning | 2 | Yellow | Potential issues requiring investigation. |
| Average | 3 | Orange | Significant problems needing prompt resolution. |
| High | 4 | Light red | Critical issues demanding immediate attention. |
| Disaster | 5 | Red | Severe incidents risking outages or data loss. |
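Because the levels in the table are ordered integers, they lend themselves to simple threshold filtering. The Python sketch below mirrors the table as an enumeration; the class name, helper function, and sample problems are assumptions made for illustration only.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels as numbered in the table above."""
    NOT_CLASSIFIED = 0
    INFORMATION = 1
    WARNING = 2
    AVERAGE = 3
    HIGH = 4
    DISASTER = 5


def at_least(problems, minimum):
    """Keep (name, severity) pairs at or above the given severity."""
    return [(name, sev) for name, sev in problems if sev >= minimum]


# Hypothetical open problems.
problems = [
    ("Disk space low", Severity.WARNING),
    ("Host unreachable", Severity.DISASTER),
    ("High CPU load", Severity.AVERAGE),
]
urgent = at_least(problems, Severity.HIGH)
```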
Trigger expressions can also apply trend functions, such as trendavg over historical data, to predict future issues like capacity exhaustion.[53][54][55]
Reporting features generate scheduled summaries for compliance and analysis, producing PDF or CSV exports of dashboard content at intervals like daily or monthly. Reports include SLA metrics with breakdowns by service, reporting period, and root cause insights from underlying trigger events, sent automatically to specified users or groups. This supports business service monitoring by quantifying uptime against defined objectives, such as 99.9% availability, and highlighting contributing problems.[56][57]
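The arithmetic behind an availability objective such as 99.9% can be worked through in a few lines of Python. This is a generic calculation, not Zabbix's SLA implementation; the function names, the 30-day reporting period, and the 30-minute outage are assumptions chosen for the example.

```python
def availability_percent(period_seconds, downtime_seconds):
    """Share of the reporting period during which the service was up."""
    return 100.0 * (period_seconds - downtime_seconds) / period_seconds


def allowed_downtime_seconds(period_seconds, slo_percent):
    """Downtime budget implied by an availability objective."""
    return period_seconds * (100.0 - slo_percent) / 100.0


THIRTY_DAYS = 30 * 24 * 3600  # 2,592,000 seconds

budget = allowed_downtime_seconds(THIRTY_DAYS, 99.9)  # about 2,592 s (43.2 min)
achieved = availability_percent(THIRTY_DAYS, downtime_seconds=1800)
slo_met = achieved >= 99.9
```

A 99.9% objective over 30 days therefore tolerates roughly 43 minutes of downtime, and a single 30-minute outage still leaves the objective met.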
History
Origins and Founding
Zabbix was developed by Alexei Vladishev, a software engineer seeking an affordable alternative to expensive commercial monitoring tools like HP OpenView and IBM Tivoli, which he encountered while working in IT infrastructure management. Motivated by the need for a robust, open-source solution to monitor network servers, applications, and services, Vladishev began the project in 1998, initially as a personal tool to address his own monitoring requirements in a resource-constrained environment.[58] The first public release of Zabbix occurred in April 2001 with version 1.0 alpha1, marking its debut as an enterprise-class distributed monitoring solution under the GNU General Public License (GPLv2). This initial version focused on basic metrics collection through agent-based monitoring, which allowed for more detailed and efficient data gathering compared to SNMP-only approaches prevalent in tools like early Nagios implementations, though Zabbix aimed for greater integration of alerting, visualization, and scalability features from the outset. Early development emphasized simplicity and extensibility, evolving from a rudimentary script-based system to handle growing demands in open-source IT environments.[15][58] Hosted on SourceForge.net from its inception, Zabbix saw initial community adoption in the early 2000s among sysadmins and IT teams looking for free, scalable alternatives to proprietary software, with downloads and forum discussions building momentum organically. Vladishev self-funded the project through its pre-commercial phase, prioritizing enterprise-grade features like distributed polling and high-availability support to differentiate it in the burgeoning open-source monitoring landscape. A key early milestone was the release of the first beta version in 2001, which stabilized core agent functionalities and laid the groundwork for broader use cases beyond basic network checks. 
In April 2005, Vladishev founded Zabbix SIA to provide professional support and advance the project's development.[59][15]

Release History
Zabbix's first stable release, version 1.0, was made available on March 23, 2004, establishing the foundational agent-server model for distributed monitoring of IT infrastructure.[60] This version introduced core components such as the Zabbix agent for data collection and the server for processing and storage, enabling basic polling and trapping mechanisms. Subsequent milestone releases marked significant evolutionary steps. Version 2.0, released on May 21, 2012, added web monitoring capabilities, allowing users to simulate user interactions with web applications and track response times and content changes.[61] In 2016, version 3.0 on February 15 introduced encryption support using TLS for communications between Zabbix components, enhancing data security across agents, proxies, and the server.[62] Version 4.0, an LTS release on October 1, 2018, improved proxy functionality with support for passive mode and better scalability for distributed environments.[63] The 5.0 LTS release on May 11, 2020, expanded encryption to database connections, promoting secure end-to-end data handling.[64] Version 6.0 LTS, launched February 14, 2022, brought event correlation features for linking related alerts and reducing noise in problem detection.[65] The 7.0 LTS version, released June 4, 2024, shifted to the GNU Affero General Public License version 3 (AGPLv3) and included observability enhancements such as synthetic web scenario monitoring and multi-factor authentication.[66] Recent standard releases include 7.2 on December 10, 2024, focusing on usability improvements, and 7.4 on June 30, 2025, with the latest minor update 7.4.5 issued on October 31, 2025, addressing stability fixes.[19][67] For the 7.0 LTS, support extends five years, with full support until June 2027 and limited security support until June 2029.[19] Zabbix follows a structured release policy: long-term support (LTS) versions every 1.5 years with five years of support, and standard releases every six months with 
12 months of support.[19] Standard releases often introduce developmental features, such as the planned 7.1, while LTS versions emphasize stability.[19]

Looking ahead, the roadmap outlines Zabbix 8.0 LTS for Q2 2026, announced at the Zabbix Summit 2025, with priorities including advanced log management, enhanced interoperability via OpenTelemetry, and complex event processing for better root cause analysis.[68][69] No significant disruptions to the 2025 release cadence have been reported.[70]

| Version | Release Date | Type | Key Innovation | Support End (Full/Limited) |
|---|---|---|---|---|
| 1.0 | March 23, 2004 | Stable | Agent-server model | Unsupported |
| 2.0 | May 21, 2012 | Major | Web monitoring | Unsupported |
| 3.0 | February 15, 2016 | Major | TLS encryption | Unsupported |
| 4.0 LTS | October 1, 2018 | LTS | Proxy improvements | October 2021 / October 2023 |
| 5.0 LTS | May 11, 2020 | LTS | Database encryption | May 2023 / May 2025 |
| 6.0 LTS | February 14, 2022 | LTS | Event correlation | February 28, 2025 / February 28, 2027 |
| 7.0 LTS | June 4, 2024 | LTS | AGPLv3, observability | June 2027 / June 2029 |
| 7.2 | December 10, 2024 | Standard | Usability enhancements | June 30, 2025 / December 31, 2025 |
| 7.4 | June 30, 2025 | Standard | Performance tweaks | Q1 2026 / Q3 2026 |
Deployment
Installation Options
Zabbix requires a 64-bit operating system, with Linux distributions recommended for the server, such as those based on Debian, Red Hat, or SUSE.[27] The minimum hardware specifications include 8 GiB of RAM and 2 CPU cores for small installations monitoring up to 1,000 metrics, while larger deployments (e.g., 100,000 metrics) recommend 64 GiB RAM and 16 cores.[27] Supported databases include MySQL (8.0.30 to 9.0.x), MariaDB (10.5 to 12.0.x), and PostgreSQL (13 to 18.x). For enhanced performance with extensive time-series data in large-scale setups, TimescaleDB (a PostgreSQL extension) is recommended.[27] For on-premise installations, Zabbix provides pre-built packages for major Linux distributions, including Debian, Ubuntu, RHEL derivatives like AlmaLinux and CentOS Stream, and SUSE Linux Enterprise Server.[71] These packages are available via the official Zabbix repository at repo.zabbix.com and support components such as the server, frontend, and agent, typically installed using tools like apt for Debian-based systems or yum/dnf for RPM-based ones.[72] Docker containers offer a quick setup option, where official images for the server, proxy, and Java gateway can be pulled from Docker Hub and deployed using Docker Compose for multi-container environments, including database integration.[73] Source compilation is available for custom builds, involving downloading the tarball, configuring with options like database type, and running make install on supported UNIX-like systems.[74] Cloud deployment options include Zabbix Cloud, a fully managed SaaS platform that eliminates on-premise hardware needs by providing scalable monitoring nodes deployable with a few clicks, handling server, database, and frontend automatically.[7] Additionally, pre-configured Zabbix appliance images are available on major cloud marketplaces, such as AWS, Azure, and Google Cloud Platform, allowing VM-based deployment in minutes via their respective consoles.[75] Zabbix proxies, used for 
distributed monitoring, can be installed via dedicated packages supporting SQLite for lightweight setups or full databases like PostgreSQL, with options for active or passive modes configured during installation.[26] Agents, essential for host monitoring, have separate packages for Linux (via repositories) and Windows (MSI installers supporting 64-bit systems from Windows Server 2003 onward), enabling passive or active communication with the server or proxy.[76]

Initial setup begins with creating the Zabbix database using provided SQL scripts for the chosen DBMS, followed by editing the server configuration file (zabbix_server.conf) to specify database credentials and other parameters like listen ports.[74] The frontend is then accessed via a web browser at http://<server_ip>/zabbix on a PHP-supported web server like Apache or Nginx, where a setup wizard guides the user through connecting to the database and completing the installation; the default credentials are Admin/zabbix.[77]

Configuration Essentials
After installation, configuring hosts forms the foundation of Zabbix monitoring setup. To add a host via the frontend, navigate to Data collection → Hosts and click "Create host" in the upper-right corner. Enter a unique host name (alphanumeric with spaces, dots, dashes, or underscores permitted, excluding leading or trailing spaces), an optional visible name for display purposes, and assign at least one host group for logical organization and permission assignment—non-existent group names can be entered to create new ones on the fly. Interfaces for agent, SNMP, JMX, or IPMI must be defined, specifying IP/DNS addresses, ports (e.g., 10050 for agents), and connection types. Templates can then be linked directly in the host form using a text input or "Select" button, automatically applying predefined items, triggers, graphs, and dashboards to the host; unlinking options include preserving or clearing these entities. Hosts can be enabled or disabled via a checkbox, and additional tabs allow configuration of IPMI authentication, tags with macros, host-level macros, inventory modes (manual or automatic), and encryption settings like PSK or certificates.[78] Host groups enhance management by grouping hosts logically, with each host requiring membership in at least one group to facilitate permissions and actions like mass updates. To create a group, go to Data collection → Host groups and click "Create host group," entering a unique name and optionally assigning hosts or templates. Groups support inheritance and are essential for scaling configurations across similar devices.[79] The template system in Zabbix enables reusable configurations, allowing a single set of monitoring entities—such as items, triggers, graphs, low-level discoveries, and dashboards—to be applied to multiple hosts efficiently. 
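The same host settings (name, group, agent interface, linked template) can also be expressed as a request to the Zabbix JSON-RPC API. The Python sketch below only builds the request body for the `host.create` method; the host name, IP address, group ID, and template ID are hypothetical, and authentication and the HTTP call itself are deliberately left out.

```python
import json


def build_host_create_request(host_name, ip, group_id, template_id, request_id=1):
    """Build a JSON-RPC 2.0 request body for the host.create API method."""
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            "host": host_name,
            "interfaces": [{
                "type": 1,        # 1 = Zabbix agent interface
                "main": 1,        # default interface of this type
                "useip": 1,       # connect by IP address rather than DNS name
                "ip": ip,
                "dns": "",
                "port": "10050",  # default agent port
            }],
            "groups": [{"groupid": group_id}],
            "templates": [{"templateid": template_id}],
        },
        "id": request_id,
    }


# Hypothetical values; real IDs come from the target Zabbix installation.
request = build_host_create_request("web01", "192.0.2.10", "2", "10001")
body = json.dumps(request, indent=2)
```

Linking the template in the same request means the new host receives its items and triggers immediately, matching the behavior of the frontend form.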
Templates are created in Data collection → Templates by clicking "Create template," specifying a unique name, visible name, groups, and description, then populating with entities via creation, copying, or import. Linking a template to a host inherits all its elements, promoting consistency; for example, unlinking preserves entities on the host unless "Unlink and clear" is selected to remove them. Official out-of-the-box templates cover common devices, including "Linux by Zabbix agent" for monitoring CPU, memory, disk, and network on Unix-like systems (requiring Zabbix agent 7.4 or later), and "Windows by Zabbix agent" for similar metrics on Windows hosts (also requiring agent 7.4 or later).[80][81][82]

Customization in templates relies on user macros, which provide flexibility by substituting variables in items, triggers, and other elements. Macros are defined at global, template, or host levels as name-value pairs (plain text, secret text, or Vault-integrated) and are referenced with the {$MACRO} syntax; they inherit hierarchically, allowing overrides at lower levels without altering the template. For instance, a macro like {$CPU.LOAD.CRIT:1m} can define critical thresholds adaptable per host. This approach ensures templates remain generic yet tailored, avoiding hard-coded values.[83]

User management in Zabbix employs role-based access control (RBAC) through user groups and roles to enforce permissions granularly. Users are organized into groups via Users → User groups → Create user group, where a unique name is set, members added via dropdown, and access media like frontend or LDAP specified; groups can be enabled/disabled and support multi-factor authentication defaults. Permissions are assigned per host or template group as read-write (full access), read (view-only), or deny (no access); when a user's groups grant conflicting permissions, read-write overrides read, but deny overrides both.
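The precedence rule for conflicting group permissions can be captured in a small Python function. This is a sketch of the rule as stated above, not Zabbix code; the constant names and the choice to return None when no group grants anything are assumptions of the example.

```python
DENY, READ, READ_WRITE = "deny", "read", "read-write"


def effective_permission(group_permissions):
    """Combine the permissions a user inherits from several user groups.

    Deny wins over everything; otherwise read-write wins over read.
    Returns None when no group grants any permission.
    """
    if DENY in group_permissions:
        return DENY
    if READ_WRITE in group_permissions:
        return READ_WRITE
    if READ in group_permissions:
        return READ
    return None


# A user in one read group and one read-write group gets read-write,
# but a single deny from any group removes access entirely.
combined = effective_permission([READ, READ_WRITE])
blocked = effective_permission([READ_WRITE, DENY])
```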
Roles, configurable in Administration → User roles, build on base types (User, Admin, Super admin) to define fine-grained UI access, such as module visibility or action permissions, enhancing security by limiting capabilities. A user can belong to multiple groups, aggregating permissions logically.[84][36] For authentication, Zabbix integrates with LDAP servers, including Microsoft Active Directory and OpenLDAP, to validate usernames and passwords externally while requiring local user accounts. Configuration occurs in Administration → Authentication → LDAP settings, enabling LDAP or just-in-time (JIT) provisioning to auto-create/update users on first login based on LDAP attributes like group membership; multiple servers can be defined per user group, with authentication attempting the alphabetically first viable server. Options include case-sensitive logins, base DN for searches, and provisioning periods (default 3600 seconds) to sync changes like group moves, ensuring seamless enterprise integration without storing credentials locally.[85] Basic tuning optimizes performance post-installation by adjusting parameters in configuration files like zabbix_server.conf. Polling intervals are controlled via process counts, such as StartPollers (default 5, up to 1000) for general pollers, StartAgentPollers (default 1) for agent-specific checks, and StartSNMPPollers (default 1) for SNMP; these determine parallel data collection threads, with CacheUpdateFrequency (default 10 seconds) refreshing the configuration cache. Buffer sizes include CacheSize (default 32M for configuration data), HistoryCacheSize (default 16M for recent values), ValueCacheSize (default 8M for history access), and TrendCacheSize (default 4M for aggregated trends), all tunable in bytes to balance memory usage and query speed—larger values reduce database load but increase RAM demands. 
Other settings like Timeout (default 3 seconds for connections) and HousekeepingFrequency (default 1 hour) fine-tune operations; changes require server restart.[86]

Enabling HTTPS for the frontend secures web access by configuring the hosting web server (e.g., Apache or Nginx) with a certificate and TLS 1.2 or 1.3, as Zabbix's built-in encryption focuses on component communications rather than browser sessions. Follow web server-specific SSL setup guides, ensuring certificate validity and strong ciphers to protect login and data transmission.[87]

Troubleshooting begins with log analysis, as Zabbix components log to files specified in their configs (e.g., zabbix_server.conf or zabbix_agentd.conf under LogFile, defaulting to system logs if unspecified). Server and agent logs support levels from 0 (basic start and stop information) to 5 (extended debug), with level 3 (warnings) standard and 4/5 for diagnostics; rotation occurs at LogFileSize (default 1 MB). Common errors include database connectivity failures, such as "Lost connection to MySQL server" (error 2013), often due to network timeouts or high load; mitigate by checking DB server keepalive settings (e.g., setting net.ipv4.tcp_keepalive_time to 300) and ensuring Zabbix's DBHost, DBName, and DBUser parameters match. Agent logs may reveal unreachable hosts from firewall blocks or mismatched ports, while server logs flag proxy issues or cache overflows; use tools like tail -f on logs and zabbix_server -R log_level_increase for real-time debugging. For inaccessible backends, Zabbix notifies and retries connections automatically.[86][88][89]

Integrations and Extensions
Built-in Integrations
Zabbix provides native support for several standard protocols to facilitate data collection from diverse IT environments. Simple Network Management Protocol (SNMP) enables monitoring of network devices such as routers, switches, and printers by polling objects defined in management information bases (MIBs) or by receiving traps. Intelligent Platform Management Interface (IPMI) allows direct access to hardware sensors for server health metrics like temperature, fan speed, and power usage without relying on the operating system. Java Management Extensions (JMX) integrates with Java-based applications, querying MBeans for performance data such as heap usage and thread counts via the Zabbix Java gateway.[90] For database monitoring, Open Database Connectivity (ODBC) supports querying relational databases like MySQL, PostgreSQL, and Oracle through standardized SQL statements, while Java Database Connectivity (JDBC) extends this capability for Java environments using the same gateway.[91]
Notification capabilities are built into Zabbix through configurable media types that deliver alerts via multiple channels. Email notifications send formatted messages with trigger details, escalation options, and attachments for immediate team awareness. Short Message Service (SMS) provides concise alerts to mobile devices for critical incidents, supporting providers like Twilio or direct modem connections. For issue tracking, webhooks integrate with tools like Jira by posting JSON payloads that create or update tickets automatically when a trigger fires.[92] Similarly, the ServiceNow integration uses webhooks to generate incidents, linking Zabbix events to service desk workflows for streamlined resolution.[93]
Zabbix includes out-of-the-box templates for virtualization and cloud platforms, supporting automatic discovery and metric collection.
VMware environments are monitored via vCenter or ESXi APIs, tracking hosts, virtual machines, datastores, and clusters for resource utilization and events.[94] Microsoft Hyper-V is supported through Windows agent-based templates that discover and monitor virtual machines, hosts, and storage for CPU, memory, and disk performance.[82] In cloud settings, Amazon Web Services (AWS) templates cover services like EC2 instances, RDS databases, S3 buckets, and Elastic Load Balancers via API polling for metrics such as billing, latency, and availability.[95] Microsoft Azure monitoring templates focus on virtual machines, scale sets, and costs, using HTTP-based discovery without external scripts.[96] OpenStack integration provides HTTP-based templates for Nova compute, Neutron networking, and Cinder storage, enabling auto-discovery of instances and quotas.[97]
Log and event monitoring is handled natively for system-level insights. Syslog messages from network devices can be captured as traps or monitored via file tailing on syslog servers, allowing events to be correlated with thresholds. Windows event logs are read using active checks with eventlog item keys, filtering by source, level, or ID for security and application errors.[98] For advanced log analysis, Zabbix supports data export to external systems like the ELK Stack (Elasticsearch, Logstash, Kibana) through trapper items or API pushes, facilitating correlation between metrics and logs.
Representative examples highlight Zabbix's extensibility in modern and industrial setups.
Kubernetes clusters are monitored using official templates that discover nodes, pods, and services via the Kubernetes API, tracking container metrics, resource limits, and deployment health.[99] In industrial IoT and operational technology (OT) environments, the built-in Modbus plugin for Zabbix Agent 2 enables polling of industrial devices like PLCs for sensor data such as temperature or pressure, supporting multiple concurrent connections over TCP or RTU.[100] As of Zabbix 7.4 (released July 2025), additional built-in templates include support for Palo Alto Networks firewalls, Pure Storage FlashArray, Azure SQL Managed Instance, and Azure MSSQL, along with refreshed integrations for Microsoft Teams, Jira, PagerDuty, and GitHub.[101]
API and Customization
The Zabbix API provides a programmatic interface for automating monitoring tasks, enabling users to create, update, and retrieve configuration elements such as hosts, items, triggers, and graphs. It operates as an HTTP-based service integrated into the Zabbix web frontend and adheres to the JSON-RPC 2.0 protocol, where requests and responses are structured as discrete method calls with JSON payloads. For instance, the host.create method allows automated provisioning of new hosts by specifying parameters like hostnames, interfaces, and group associations, while trigger.create defines alert conditions based on item data. The API supports extensive automation, including bulk operations on triggers, such as updating expressions or dependencies across multiple entities.[102][103][104]
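The JSON-RPC 2.0 envelope described above can be illustrated with a small sketch that only builds request bodies and does not send them. The helper rpc_request, the host name, the IP address, the group ID, and the trigger expression are all illustrative placeholders rather than values from a real installation.

```python
import json
from itertools import count

# Each JSON-RPC request carries an id so responses can be matched to requests.
_request_ids = count(1)


def rpc_request(method: str, params: dict) -> dict:
    """Wrap a method name and its parameters in a JSON-RPC 2.0 envelope."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": next(_request_ids),
    }


# Placeholder host definition: one agent interface and one host group.
host_request = rpc_request("host.create", {
    "host": "web-01.example.com",
    "interfaces": [{
        "type": 1,        # 1 = Zabbix agent interface
        "main": 1,
        "useip": 1,
        "ip": "192.0.2.10",
        "dns": "",
        "port": "10050",
    }],
    "groups": [{"groupid": "2"}],  # assumed ID of an existing host group
})

# Placeholder trigger: fires when the last CPU load value exceeds 5.
trigger_request = rpc_request("trigger.create", {
    "description": "High CPU load on web-01",
    "expression": "last(/web-01.example.com/system.cpu.load)>5",
    "priority": 3,
})

print(json.dumps(host_request, indent=2))
print(json.dumps(trigger_request, indent=2))
```

Each body would be sent as an HTTP POST to the frontend's API endpoint together with valid credentials; the exact parameters accepted by each method depend on the Zabbix version in use.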
To interact with the API, users can use client libraries in various languages, including Python (via the pyzabbix package), PHP (through native extensions or wrappers), and Go (with bindings like go-zabbix-api). Authentication uses either session IDs obtained through user.login or, in Zabbix 5.4 and later, API tokens generated under Administration → Users → API tokens, which offer finer-grained control and expiration options than traditional username-password logins. A token is passed with each API request, in the auth parameter or, in newer releases, in an Authorization: Bearer header, so scripts can authenticate without embedding a username and password.[102][105][106]
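As a minimal sketch of token-based access, the following builds (but does not send) an HTTP request using only the Python standard library, with the token placed in the auth field as described above. The URL and token are placeholders, and build_api_request is a hypothetical helper; newer releases may expect the token in an Authorization: Bearer header instead, so the documentation for the installed version should be checked.

```python
import json
import urllib.request

# Placeholders: replace with the real frontend URL and a token from the UI.
API_URL = "https://zabbix.example.com/api_jsonrpc.php"
PLACEHOLDER_TOKEN = "0123456789abcdef"


def build_api_request(url: str, method: str, params: dict,
                      token: str) -> urllib.request.Request:
    """Build a POST request whose JSON-RPC body carries the API token."""
    body = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": 1,
        "auth": token,  # token placement as described in the text
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json-rpc"},
        method="POST",
    )


request = build_api_request(
    API_URL, "host.get", {"output": ["hostid", "host"]}, PLACEHOLDER_TOKEN
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen would perform the call against a reachable server. In practice the token would be read from an environment variable or a secrets store rather than written into the script.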
Zabbix supports user-defined external scripts to extend data collection and response mechanisms beyond built-in capabilities. External checks, executed directly by the Zabbix server or proxy, invoke shell scripts or binaries from the ExternalScripts directory to gather metrics without requiring an agent on the target host; for example, a script might query a custom application log and return parsed values as item data. Script items, which are also executed by the Zabbix server or proxy, run user-provided JavaScript code to fetch data over HTTP/HTTPS or process parameters, enabling dynamic custom metrics like API responses from third-party services. For alerting, global scripts configurable under Alerts → Scripts allow execution of commands during action operations, such as restarting services on trigger events, while media scripts, often implemented via the webhook media type, use JavaScript to format and send custom notifications to external systems like Slack or PagerDuty. Permissions for these scripts are governed by user roles, with scopes limiting execution to specific contexts such as manual host actions or action operations.[107][108][109][110]
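An external check of the kind described above can be sketched as a small script that prints a single value to standard output, which Zabbix then stores as the item value. The script name count_errors.py, the log path, and the "ERROR" marker are hypothetical; the script is written in Python here only for illustration, since any executable can be used.

```python
#!/usr/bin/env python3
"""Hypothetical external check: count log lines containing a marker."""
import sys


def count_matches(lines, needle="ERROR"):
    """Return how many of the given lines contain the marker string."""
    return sum(1 for line in lines if needle in line)


def main(argv):
    """Usage: count_errors.py <log_path> [marker]"""
    log_path = argv[1]
    needle = argv[2] if len(argv) > 2 else "ERROR"
    # errors="replace" keeps the check alive if the log has odd bytes.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        print(count_matches(handle, needle))
    return 0


# Only act when arguments are supplied, as they would be by the item key.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Placed in the ExternalScripts directory and made executable, such a script could be referenced by an item key along the lines of count_errors.py["/var/log/app.log","ERROR"], with the key parameters passed to it as command-line arguments.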
Customization extends to plugins and modules for deeper integration. Zabbix agent 2 plugins, written in Go, allow developers to add support for custom metrics by implementing interfaces for data collection, such as monitoring proprietary hardware sensors or cloud APIs; these are loaded dynamically via the Plugins configuration parameter. On the frontend side, PHP-based modules enable tailored user interfaces, including custom dashboards or reports, defined through manifest files that specify entry points and dependencies for seamless integration into the Zabbix UI. For example, a module might generate CSV exports of health metrics directly from the browser.[111][112][113]
Automation scenarios often involve orchestrating Zabbix with infrastructure tools via the API. The official Zabbix Ansible collection provides modules like zabbix_host for creating and managing hosts, enabling playbook-driven deployments that synchronize inventory with monitoring configurations. Similarly, Terraform users can leverage API calls through providers to provision monitored resources, such as dynamically linking templates to newly created hosts. Inbound event handling is supported through webhook receivers in the media type configuration, where Zabbix can process incoming HTTP payloads to update events or acknowledge alerts, facilitating integrations like receiving GitHub notifications for deployment monitoring.[114][115][110]
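The idea of keeping monitoring in step with an inventory can be sketched as a simple reconciliation: compare the hosts the inventory lists with the hosts Zabbix already knows, then derive which API calls would be needed. This is an illustrative outline only, not how the Ansible collection or any Terraform provider works internally; plan_sync and the host names are hypothetical.

```python
def plan_sync(inventory: set[str], monitored: set[str]) -> dict[str, list[str]]:
    """Work out which hosts to add to or remove from monitoring.

    inventory: host names that should be monitored (e.g. from a CMDB).
    monitored: host names currently configured in Zabbix (e.g. from host.get).
    """
    return {
        # In the inventory but not yet monitored: candidates for host.create.
        "create": sorted(inventory - monitored),
        # Monitored but gone from the inventory: candidates for host.delete.
        "delete": sorted(monitored - inventory),
    }


# Hypothetical data standing in for an inventory export and a host.get result.
inventory_hosts = {"web-01", "web-02", "db-01"}
zabbix_hosts = {"web-01", "db-01", "old-batch-01"}

plan = plan_sync(inventory_hosts, zabbix_hosts)
print(plan)
```

Computing the plan separately from applying it makes a dry run possible, so the changes can be reviewed before any host.create or host.delete call is issued.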
Security considerations for extensions emphasize controlled access and isolation. API tokens should be assigned to dedicated service accounts with minimal user permissions, kept revocable via the frontend, and rotated regularly to limit the impact of exposure. Script execution permissions are enforced through role-based access control, which restricts sensitive operations like remote commands to authorized groups; scripts should also avoid hardcoded credentials. Best practices include running external scripts in chrooted environments or with restricted user privileges to sandbox potentially untrusted code, so that a vulnerability in a custom plugin or module cannot escalate further.[105][116][117]
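Regular token rotation can be supported by a periodic audit. The sketch below flags tokens that are older than a chosen age or that never expire. It assumes token records shaped like the API's token objects, with name, created_at, and expires_at fields holding Unix timestamps as strings and 0 meaning no expiry; the 90-day policy, the function name, and the sample records are hypothetical.

```python
import time

SECONDS_PER_DAY = 86400


def tokens_to_rotate(tokens, max_age_days=90, now=None):
    """Return names of tokens that are too old or have no expiry date."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    flagged = []
    for token in tokens:
        created = int(token["created_at"])
        expires = int(token.get("expires_at", 0))
        # A value of 0 is assumed to mean the token never expires.
        if created < cutoff or expires == 0:
            flagged.append(token["name"])
    return flagged


# Hypothetical records; a fixed "now" keeps the example deterministic.
NOW = 1_700_000_000
sample_tokens = [
    {"name": "ansible-sync", "created_at": str(NOW - 10 * SECONDS_PER_DAY),
     "expires_at": str(NOW + 80 * SECONDS_PER_DAY)},
    {"name": "legacy-ci", "created_at": str(NOW - 200 * SECONDS_PER_DAY),
     "expires_at": str(NOW + 30 * SECONDS_PER_DAY)},
    {"name": "no-expiry", "created_at": str(NOW - 1 * SECONDS_PER_DAY),
     "expires_at": "0"},
]

print(tokens_to_rotate(sample_tokens, now=NOW))
```

The audit only reports candidates; revoking and reissuing tokens would remain a deliberate step carried out through the frontend or the API.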