Google Compute Engine

Google Compute Engine is an infrastructure-as-a-service (IaaS) offering from Google Cloud that enables users to create and manage virtual machine (VM) instances and bare metal servers on Google's global data center infrastructure. It provides scalable compute resources, allowing customers to run diverse workloads such as web applications, databases, batch jobs, and machine learning models without managing underlying hardware. Launched in preview on June 28, 2012, and achieving general availability on December 2, 2013, Compute Engine utilizes a KVM-based hypervisor to deliver reliable, self-managed instances with options for Linux and Windows operating systems, while bare metal servers provide direct hardware access without a virtualization layer. Compute Engine supports a variety of machine types tailored to specific needs, including general-purpose (e.g., N2, E2), compute-optimized (e.g., C2, C3), memory-optimized (e.g., M1, M2), and accelerator-optimized instances equipped with GPUs, Google's custom Tensor Processing Units (TPUs), or Arm-based processors for machine learning and scale-out tasks. Users can customize configurations for vCPUs, memory, and storage, with options like Persistent Disk for block storage, Local SSD for high-performance temporary storage, and Hyperdisk for advanced throughput. The service guarantees 99.9% uptime for most instances and 99.95% for memory-optimized VMs, featuring live migration to minimize downtime during maintenance. Integrated seamlessly with other Google Cloud services, Compute Engine facilitates container orchestration via Google Kubernetes Engine (GKE), data analytics with BigQuery, and storage with Cloud Storage, enabling hybrid and multi-cloud architectures. Pricing models include pay-as-you-go, Spot VMs for up to 91% discounts on interruptible workloads, and committed use discounts for predictable savings, with a free tier offering one e2-micro instance monthly. Available across 42 regions and 127 zones worldwide as of 2025, it emphasizes security through features like Shielded VMs, customer-managed encryption keys, and compliance with standards such as ISO/IEC 27001, PCI DSS, and HIPAA.

History

Launch and Early Development

Google Compute Engine was announced on June 28, 2012, during the Google I/O developer conference as a limited preview service within the Google Cloud Platform (GCP). This launch marked Google's entry into the infrastructure-as-a-service (IaaS) market, offering users the ability to provision and manage virtual machines (VMs) on its global infrastructure without the need to handle underlying hardware. The service was positioned to compete with offerings like Amazon EC2, emphasizing Google's strengths in scale, performance, and cost-efficiency, with claims of providing 50% more compute power per dollar compared to competitors. At launch, Google Compute Engine focused on delivering KVM-based virtual machines primarily for Linux operating systems, enabling developers and businesses to run large-scale workloads such as web applications, batch processing, and data analysis. Initial VM configurations supported up to 8 virtual CPUs and 3.75 GB of memory per core, with persistent block storage for data durability. Key early integrations with other GCP services, such as Google Cloud Storage, allowed users to store and access unstructured data directly from VMs, facilitating seamless workflows for applications requiring object storage alongside compute resources. Access to the limited preview required sign-up and was initially restricted to selected developers, with no public pricing or general availability timeline disclosed. In early 2013, the service transitioned from limited preview to a broader beta phase, ending the free period and requiring users to provide billing details for continued access. By May 2013, Google opened the service to all users via the Google Cloud Console, expanding availability and introducing initial machine type offerings like the n1-standard series, which balanced CPU and memory for general-purpose workloads (e.g., n1-standard-1 with 1 vCPU and 3.75 GB of memory). During this period, support for additional Linux distributions grew, and foundational features like live migration for transparent maintenance were tested to ensure reliability. Windows support was introduced in limited preview later in development, broadening OS coverage. The service reached general availability on December 2, 2013, with a 99.95% monthly uptime SLA, 24/7 support, and reduced pricing to encourage broader adoption. This milestone solidified Google Compute Engine's role in GCP, transitioning it from experimental preview to a production-ready IaaS platform capable of supporting enterprise-scale deployments.

Major Milestones and Updates

Following its general availability in December 2013, Google Compute Engine saw key enhancements in operating system support and pricing models. In April 2014, the service introduced sustained use discounts, which automatically apply up to a 30% reduction for instances running more than 25% of a billing month, optimizing costs for long-running workloads without requiring commitments. Windows Server support launched in limited preview that same year, enabling users to run Windows workloads on the platform, with expanded capabilities—including license mobility for existing on-premises licenses—announced on December 8, 2014. Compute options diversified further in 2015 and 2016 to address interruptible and accelerated workloads. Preemptible virtual machines (now known as Spot VMs) debuted in beta on May 18, 2015, offering up to 70% discounts compared to on-demand pricing for batch jobs tolerant of interruptions (current Spot VMs offer up to 91% discounts), and achieved general availability in September 2015. Initial GPU-accelerated instances were announced on November 16, 2016, powered by NVIDIA Tesla K80 cards, and became available worldwide in early 2017 to support machine learning, data analytics, and scientific computing tasks. Infrastructure growth accelerated through the late 2010s, with regions and zones expanding to enhance global availability and reduce latency. By mid-2020, Google Cloud had grown to 24 regions across 73 zones in 17 countries, up from just a handful at launch, facilitating broader adoption for distributed applications. Integration with security and machine learning advanced notably in 2020, when Confidential Computing launched with Confidential VMs on Compute Engine; these use hardware-based trusted execution environments to encrypt data in use, protecting sensitive AI/ML models and processing without performance overhead. Recent updates from 2024 to 2025 emphasize performance for AI-driven and specialized workloads. In July 2024, Hyperdisk ML entered general availability as a high-throughput block storage option tailored for machine learning, delivering up to 1,200,000 MiB/s read throughput per volume to accelerate data loading for training pipelines across up to 2,500 attached VMs. September 2025 brought general availability of Flex-start VMs, which support short-duration tasks up to seven days using a flexible provisioning model that consumes Spot quota for cost savings on bursty or experimental workloads. The G4 accelerator-optimized machine series followed in October 2025, featuring NVIDIA RTX PRO 6000 Blackwell GPUs for graphics-intensive applications like virtual desktops and simulations, available in multiple regions with low-latency networking. November 2025 marked further hardware innovations, with the N4D VM series achieving general availability on November 7, powered by fifth-generation AMD EPYC processors and offering up to 96 vCPUs, 768 GB of DDR5 memory, and Titanium I/O for general-purpose tasks in regions like us-central1. On November 6, the N4A series entered preview, utilizing Google's custom Arm-based Axion processors, with configurations up to 64 vCPUs and 512 GB of DDR5 memory for efficient, scalable inference and web serving in limited regions such as us-central1 and europe-west3. These developments underscore ongoing efforts to balance cost, performance, and security in cloud infrastructure.

Overview and Core Concepts

Virtual Machine Instances

A virtual machine (VM) instance in Google Compute Engine is a self-managed compute resource that runs on Google's infrastructure using a KVM-based hypervisor, allowing users to deploy and operate workloads on customizable compute resources. These instances support both Linux and Windows operating systems and can be configured for a wide range of applications, from web servers to batch processing tasks. The lifecycle of a Compute Engine VM instance progresses through distinct states, including provisioning (where resources are allocated), running (when the instance is active and operational), stopping (where the instance is shut down but resources are preserved), and terminating (where the instance is deleted and resources are released). Users can monitor and manage these states to ensure efficient resource utilization and application availability throughout the instance's duration. Instances are created through the Google Cloud Console for a graphical interface, the gcloud CLI for command-line automation, or the Compute Engine API for programmatic integration, with key steps involving selection of a machine type, bootable image, and deployment zone. This process enables rapid deployment tailored to specific workload requirements, such as compute capacity and geographic placement. For scalable deployments, Compute Engine supports instance groups, which manage collections of identical instances; managed instance groups (MIGs) provide advanced features like automatic healing, rolling updates, and autoscaling based on metrics such as CPU utilization or load-balancing capacity. MIGs ensure high availability by distributing instances across multiple zones and dynamically adjusting group size to match demand. In September 2025, Google introduced Flex-start VMs in general availability, a feature for single-instance deployments with runtime limits up to seven days, optimized for bursty workloads like model training or batch processing through a queuing system that improves resource access efficiency. Compute Engine also offers bare metal instances, which provide direct hardware access without hypervisor overhead, catering to low-latency applications such as financial trading that require maximal performance and minimal interference. VM instances can attach to persistent storage options for durable data management, with details on these attachments covered in dedicated storage sections.
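
As a concrete sketch of the creation workflow described above, the following gcloud commands provision a small instance and check its lifecycle state; the instance name demo-vm, zone, machine type, and image family are illustrative placeholders, not values prescribed by the service:

    # Create a small Debian VM in a chosen zone (names and sizes are examples)
    gcloud compute instances create demo-vm \
        --zone=us-central1-a \
        --machine-type=e2-medium \
        --image-family=debian-12 \
        --image-project=debian-cloud

    # Inspect the instance's lifecycle state (PROVISIONING, RUNNING, etc.)
    gcloud compute instances describe demo-vm \
        --zone=us-central1-a --format="value(status)"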

Basic Resource Units

In Google Compute Engine, the fundamental resources are measured primarily in terms of virtual CPUs (vCPUs) and gigabytes (GB) of memory, which form the core building blocks for virtual machine instances. Historically, Google introduced the Google Compute Engine Unit (GCEU) as an abstraction for CPU capacity, where 2.75 GCEUs represented the compute power equivalent to one logical CPU core on an n1-standard-1 instance; however, this metric has been largely superseded in modern usage by direct vCPU and memory allocations for simplicity and alignment with hardware capabilities. A vCPU in Compute Engine represents a single hardware hyper-thread (or thread) on the underlying physical processors, which include Intel Xeon Scalable, AMD EPYC, and Arm-based CPUs such as the Ampere Altra used in the Tau series. By default, simultaneous multithreading (SMT, also known as hyper-threading) is enabled, allowing two vCPUs to share one physical core, thereby providing efficient resource utilization without dedicating full cores unless specified otherwise via configuration options. vCPUs can be allocated from 1 up to 384 per instance, depending on the machine type and series, with the exact mapping to physical hardware determined by the selected CPU platform. Memory is allocated in increments of GB and is closely tied to vCPU counts, with predefined ratios varying by machine family to balance performance needs. For general-purpose standard machine types, the typical ratio is 4 GB of memory per vCPU, though ranges can extend from 3 to 7 GB per vCPU; specialized families like high-memory types offer up to 24 GB per vCPU, while high-CPU types provide as little as 0.9 GB per vCPU to prioritize processing power. Custom allocations allow flexibility within these bounds, ensuring memory scales proportionally to computational demands. Disk resources are provisioned as block storage in GB, with Persistent Disks serving as the primary unit for durable, scalable storage attached to instances; quotas limit total disk size per project, with default limits varying by region and often starting in the terabyte range for standard Persistent Disk, though these can encompass both SSD and HDD variants. Network bandwidth is another key allocatable unit, measured in Gbps for ingress and egress; while ingress is unlimited, egress bandwidth is capped per instance based on machine type—ranging from 1 Gbps for small instances to 200 Gbps for high-performance series—with premium Tier_1 networking options enabling higher sustained throughput for data-intensive workloads. Compute Engine enforces quotas to manage resource availability, with default limits applied per project and per region to prevent overuse; for example, the standard CPU quota (total vCPUs) often starts at 8-24 per region for new projects as of early 2025, alongside corresponding memory quotas, and boot disks have a minimum size of 10 GB. These quotas are visible and adjustable via the Google Cloud console, where users can request increases through a form-based process, typically approved based on usage history and justification to accommodate scaling needs.
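
To see how regional quotas such as CPUS surface in practice, one common gcloud pattern flattens the region's quota list into a table; the region here is an arbitrary example, and the exact quota metrics returned vary by project:

    # Sketch: list quota metrics, limits, and current usage for a region
    gcloud compute regions describe us-central1 \
        --flatten="quotas[]" \
        --format="table(quotas.metric,quotas.limit,quotas.usage)"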

Infrastructure and Locations

Regions and Zones

Google Compute Engine organizes its infrastructure into regions and zones to provide geographical distribution, fault tolerance, and compliance options for deployments. A region is an independent geographic area, such as us-central1 in Council Bluffs, Iowa, that spans one or more physical locations and contains multiple zones. Each region operates independently, allowing users to select locations based on specific needs while ensuring resources within a region can communicate with low latency. As of November 2025, Google Cloud operates 42 regions worldwide, with expansions including new facilities in Europe, such as Stockholm, Sweden (europe-north2), and in North America, such as Querétaro, Mexico (northamerica-south1). Zones represent isolated locations within a region, designed to enhance fault tolerance by isolating failures such as power outages or network issues to a single zone without affecting others in the same region. For example, the us-central1 region includes zones like us-central1-a, us-central1-b, and us-central1-c, each hosting a portion of the region's infrastructure. With 127 zones available as of 2025, users can deploy instances across multiple zones within a region to achieve high availability, as resources in different zones are engineered to be failure-independent. Selecting regions and zones involves evaluating factors like latency, data residency, and service availability to optimize performance and meet legal requirements. For instance, to minimize latency for users in Western Europe, one might choose the europe-west1 region in Belgium, while data residency rules such as the EU's GDPR may necessitate deploying in European regions to keep data within the continent. Capacity considerations include checking zone-specific quotas and maintenance schedules to ensure uninterrupted operations. Multi-regional resources, such as replicated Cloud Storage buckets, enable global replication across multiple regions for enhanced durability and accessibility, though their use ties into broader resource scoping policies.
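
When weighing these placement factors, it can help to enumerate the available locations directly; a minimal gcloud sketch (the region filter is an arbitrary example):

    # Sketch: list all regions, then the zones that make up one region
    gcloud compute regions list
    gcloud compute zones list --filter="region:us-central1"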

Resource Scopes and Placement Policies

In Google Compute Engine, resources are organized into scopes that determine their availability and accessibility across the infrastructure. Zonal resources, such as instances, are confined to a single zone within a region and can only interact directly with other resources in that same zone. Regional resources, including regional managed instance groups (MIGs), span multiple zones within a single region, enabling broader distribution for improved availability. Global resources, like custom images and snapshots, are accessible across all regions and zones, facilitating reuse without location-specific constraints. Placement policies in Compute Engine allow users to control the physical distribution of virtual machines to optimize for reliability or performance. The compact placement policy groups instances closely together on the same underlying hardware or within the same network segment, reducing inter-instance communication latency, which is particularly useful for tightly coupled workloads like high-performance computing applications. In contrast, the spread placement policy distributes instances across distinct hardware to minimize the risk of correlated failures from host or zonal outages, enhancing overall availability for mission-critical services. The default "any" policy imposes no specific constraints, allowing the system to place instances based on available capacity. These placement policies effectively implement affinity and anti-affinity principles for instance placement. Compact policies enforce affinity by co-locating instances to promote low-latency interactions, while spread policies apply anti-affinity by separating them to avoid single points of failure, thereby supporting strategies for high availability without requiring custom scripting. At a higher level, Compute Engine resources are managed within a hierarchical structure that aligns with Google Cloud's overall organization. Projects serve as the primary containers for resources, where all Compute Engine instances, disks, and networks are created and billed. Folders provide optional intermediate grouping for projects, enabling structured organization by department or environment, while the organization node at the top represents the root for an entire enterprise, enforcing policies and access controls across the hierarchy. This structure ensures isolated, scalable management of resources while inheriting permissions downward. A recent enhancement to regional MIGs, introduced in public preview as of November 2025, allows automatic repair of failed virtual machines in an alternate zone within the same region when the primary zone is unavailable. This requires enabling update-on-repair and helps maintain instance group health during zonal disruptions, further bolstering availability without manual intervention.
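
Placement policies are created as resource policies and then referenced at instance creation; a hedged gcloud sketch of a compact policy follows, where the policy name, region, and machine type are illustrative choices:

    # Sketch: create a compact placement policy (affinity) ...
    gcloud compute resource-policies create group-placement low-latency-group \
        --region=us-central1 \
        --collocation=collocated

    # ... and apply it when creating a tightly coupled instance
    gcloud compute instances create node-1 \
        --zone=us-central1-a \
        --machine-type=c2-standard-8 \
        --resource-policies=low-latency-group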

Compute Resources

Machine Types

Google Compute Engine offers a range of predefined machine type families tailored to different workload requirements, balancing vCPU, memory, and other resources for optimal performance and cost-efficiency. These families include general-purpose, compute-optimized, memory-optimized, accelerator-optimized, and storage-optimized types, each with specific series designed for common use cases such as web serving, batch processing, in-memory databases, machine learning inference, and high-I/O databases. Machine types determine the vCPU-to-memory ratios, networking bandwidth, and other capabilities, allowing users to select configurations that align with their application's needs without custom modifications. The general-purpose machine family, suitable for versatile workloads like web servers, containerized applications, and development environments, encompasses the N1, N2, and N4 series. The N1 series, an earlier generation, supports up to 96 vCPUs with a maximum of 6.5 GB of memory per vCPU and networking up to 32 Gbps, providing balanced performance for standard tasks. The N2 series, powered by Intel Cascade Lake processors (with Ice Lake for instances over 80 vCPUs), scales to 128 vCPUs at 8 GB per vCPU and up to 32 Gbps networking, offering improved price-performance for medium-scale applications. The N4 series extends this with up to 80 vCPUs at 8 GB per vCPU and 50 Gbps networking, while the N4D variant, based on AMD EPYC processors, reaches 96 vCPUs with the same memory ratio and became generally available in November 2025 for enhanced flexibility in general workloads. Compute-optimized machine types, such as the C2 and C3 series, prioritize high-frequency CPUs for demanding tasks including high-performance computing (HPC), batch processing, and game servers. The C2 series delivers up to 60 vCPUs with 4 GB of memory per vCPU and sustained all-core frequencies up to 3.8 GHz, paired with up to 32 Gbps networking for compute-intensive operations. The C3 series advances this capability to 176 vCPUs at 8 GB per vCPU, supporting even larger-scale HPC and machine learning training workloads with networking bandwidth up to 100 Gbps. Memory-optimized types like the M1 and M2 series are engineered for applications requiring substantial memory, such as in-memory databases, caching layers, and SAP HANA deployments. The M1 series accommodates up to 160 vCPUs with up to 24 GB of memory per vCPU (totaling over 3.8 TB), and networking up to 32 Gbps to handle data-heavy queries efficiently. The M2 series focuses on ultra-high memory configurations, supporting 208–416 vCPUs with as much as 12 TB total (approximately 28 GB per vCPU in larger instances), ideal for analytics and real-time processing with the same networking bandwidth. Accelerator-optimized machine types, including the A2, A3, and G2 series, integrate GPUs for graphics rendering, machine learning inference, and generative AI tasks. The A2 series pairs up to 96 vCPUs with 16 NVIDIA A100 GPUs and up to 100 Gbps networking, optimized for large-scale ML training. The A3 series scales to 224 vCPUs with 8 NVIDIA H100 GPUs and exceptional 3,200 Gbps networking, targeting advanced AI workloads. The G2 series, featuring NVIDIA L4 GPUs, supports up to 96 vCPUs with 8 GPUs per instance and 100 Gbps networking, particularly suited for graphics-intensive applications like remote visualization. Storage-optimized machine types, represented by the Z3 series, cater to high-I/O workloads such as SQL/NoSQL databases, data analytics, and vector databases requiring rapid local storage access. These instances provide up to 176 vCPUs with 36 TiB of local SSD storage and networking bandwidth up to 100 Gbps, enabling low-latency data throughput for scale-out storage systems. The table below summarizes these families.
Machine Family | Key Series | vCPU Range | Memory Ratio (GB/vCPU) | Max Networking Bandwidth | Primary Use Cases
General-purpose | N1, N2, N4/N4D | Up to 128 | 6.5–8 | 32–50 Gbps | Web servers, containerized apps
Compute-optimized | C2, C3 | Up to 176 | 4–8 | Up to 100 Gbps | HPC, batch jobs
Memory-optimized | M1, M2 | Up to 416 | 14–28 | Up to 32 Gbps | In-memory databases
Accelerator-optimized | A2, A3, G2 | Up to 224 | Varies (8–16 base) | 100–3,200 Gbps | ML training, graphics
Storage-optimized | Z3 | Up to 176 | Varies | Up to 100 Gbps | High-I/O databases, analytics
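
The concrete vCPU and memory shapes behind these families can be browsed per zone with gcloud; in this sketch the zone and the name filter (N2 series) are arbitrary examples:

    # Sketch: list N2 machine types in one zone with their vCPU/memory shapes
    gcloud compute machine-types list \
        --filter="zone:us-central1-a AND name~^n2-" \
        --format="table(name,guestCpus,memoryMb)"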

Custom Configurations

Google Compute Engine allows users to create custom machine types that enable precise specification of virtual CPUs (vCPUs) and memory to match specific workload requirements, offering greater flexibility than predefined machine types. For example, a user can configure an instance with exactly 10 vCPUs and 60 GB of memory using the format custom-10-61440 (where memory is specified in MB), which is particularly useful for applications needing non-standard resource ratios, such as memory-intensive databases or compute-light services. Memory allocations must be in multiples of 256 MB, and the total configuration must align with the supported machine series, such as N2 or E2. Constraints on custom machine types ensure compatibility with underlying hardware. In standard configurations, memory per vCPU ranges from 0.9 to 6.5 GB, though this varies by series—for instance, the N1 series supports 0.922 to 6.656 GB per vCPU. Extended memory options, available for series like N4, N4A, N2, and N1, remove the per-vCPU upper limit, allowing up to 8 GB or more per vCPU (e.g., up to 624 GB total for N1), billed at a premium rate to support workloads like large-scale analytics. vCPUs can generally be specified in multiples of 1 starting from 1, except for certain series like E2, which require multiples of 2 up to 32 vCPUs. Sole-tenant nodes provide dedicated physical hardware isolation for custom machine types, ensuring that VMs run exclusively on servers reserved for a single project to meet compliance or security needs. These nodes support custom configurations in compatible series like N2, where VMs must match the node's machine series but can vary in size within the node's total capacity (e.g., up to 80 vCPUs and 640 GB of memory for an n2-standard-80 node). Custom machine types integrate with accelerators for enhanced performance in AI and machine learning. Users can attach GPUs (e.g., A100 or T4) or Google TPUs to custom VMs in supported series like N1 or A2, enabling tailored setups such as a custom-32-225280 instance with four T4 GPUs for model training. As of November 2025, the Arm-based N4A series, powered by Google's custom Axion processor, supports custom machine types in preview, offering up to 64 vCPUs and 512 GB of DDR5 memory with extended memory options for cost-effective general-purpose workloads.
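
The 10 vCPU / 60 GB example above can be expressed with gcloud's custom machine type flags; the instance name, zone, and series are illustrative placeholders:

    # Sketch: create an N2 custom machine type with 10 vCPUs and 60 GB memory
    # (equivalent to the custom-10-61440 format described above)
    gcloud compute instances create custom-demo \
        --zone=us-central1-a \
        --custom-vm-type=n2 \
        --custom-cpu=10 \
        --custom-memory=60GB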

Storage Options

Persistent Disks

Persistent Disks are block storage devices in Google Compute Engine that provide durable, high-availability storage independent of virtual machine (VM) instances, allowing data to persist even if the instance is stopped or terminated. They function like physical hard drives but are managed by Google Cloud, offering features such as live attachment and detachment to running VMs without downtime. Google Compute Engine offers several types of Persistent Disks to suit different workloads, balancing cost, performance, and durability requirements. Standard Persistent Disks (pd-standard) use hard disk drives (HDDs) and are optimized for large-scale sequential read/write operations, such as media serving or data analytics, with IOPS scaling at 0.75 read and 1.5 write IOPS per GiB of provisioned space, up to a maximum of 7,500 read and 15,000 write IOPS per instance on larger machines. Balanced Persistent Disks (pd-balanced) employ solid-state drives (SSDs) for a cost-effective mix of performance and price, delivering up to 6 IOPS (read and write) per GiB, with a baseline of 3,000 IOPS and maximums reaching 80,000 IOPS per instance, suitable for general-purpose applications like web servers. SSD Persistent Disks (pd-ssd) provide high-performance storage for demanding workloads such as databases, offering up to 30 IOPS (read and write) per GiB and peaking at 100,000 IOPS per instance, with throughput limits of 1,200 MiBps for reads and writes. For workloads requiring predictable performance, Extreme Persistent Disks (pd-extreme) allow provisioning of up to 120,000 IOPS and 4,000 MiBps throughput for reads, ensuring consistent performance without scaling solely on disk size. Across all types, performance scales with disk size and the number of vCPUs in the attached VM instance, but is capped by per-instance limits to prevent overload. Persistent Disks can be sized from a minimum of 10 GB to a maximum of 64 TB per volume, with sizes adjustable in 1 GB increments; for greater capacity, multiple disks can be combined using software configurations within the VM. Up to 128 Persistent Disk volumes (including the boot disk) can be attached to a single VM instance, supporting a total attached capacity of up to 257 TiB, which enables scalable storage setups for complex applications. All Persistent Disks are encrypted at rest by default using Google-managed encryption keys, ensuring data security without additional configuration; alternatively, users can opt for customer-supplied encryption keys (CSEK) to manage their own 256-bit keys, providing greater control over encryption for compliance needs, though Google does not store these keys and data becomes inaccessible if they are lost. Data in transit between the disk and VM is also encrypted. For backup, Persistent Disks support incremental snapshots that can be created and managed separately.
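
A minimal gcloud sketch of the attach workflow follows; the disk name, size, zone, and target instance (demo-vm) are illustrative placeholders:

    # Sketch: create a 500 GiB SSD persistent disk ...
    gcloud compute disks create data-disk \
        --zone=us-central1-a \
        --type=pd-ssd \
        --size=500GB

    # ... and attach it live to a running VM (no downtime required)
    gcloud compute instances attach-disk demo-vm \
        --zone=us-central1-a --disk=data-disk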

Local SSD and Hyperdisk

Google Compute Engine offers Local SSD as an ephemeral storage option that provides high-performance, low-latency block storage physically attached to the host machine running the instance. This storage uses NVMe or SCSI interfaces and is designed for temporary workloads where data persistence is not required, as all data on Local SSD disks is lost when the instance stops, is preempted, or the host encounters an error. Unlike persistent disks, which maintain data independently of the instance lifecycle, Local SSD emphasizes speed over durability and cannot be detached from the instance or used for snapshots. Local SSD disks come in standard and Titanium SSD variants, with each standard disk offering 375 GiB of capacity, though Titanium SSD supports up to 6 TiB per disk on certain bare metal configurations. Instances can attach multiple disks, enabling up to 72 TiB of total Local SSD capacity, depending on the machine type and series (e.g., the Z3 series allows 12 disks of 6 TiB each). Performance scales with the number of disks and interface; for example, Titanium SSD on NVMe can deliver up to 9,000,000 read IOPS, 6,000,000 write IOPS, 36,000 MiB/s read throughput, and 30,000 MiB/s write throughput. Common use cases include caching, scratch space for high-I/O applications like databases (e.g., temporary tables in SQL Server), and transient data processing in analytics environments. Limitations include incompatibility with shared-core machine types, the inability to add disks after instance creation, and no support for customer-supplied encryption keys or data preservation beyond preview features for live migrations. Hyperdisk provides a family of durable, high-performance block storage volumes that can be customized for IOPS and throughput independently of capacity, making it suitable for demanding workloads while maintaining data persistence across instance restarts. Available in several types—Balanced, Balanced High Availability, Extreme, Throughput, and ML—Hyperdisk volumes attach directly to instances like physical disks and support features such as regional replication for high availability and, for the ML variant, sharing across multiple read-only instances with limits varying by volume size (up to 2,500 for volumes ≤256 GiB and lower for larger volumes). The Balanced type offers a general-purpose balance with up to 160,000 IOPS and 2,400 MiB/s throughput; Extreme prioritizes IOPS at up to 350,000 with 5,000 MiB/s throughput; and Throughput focuses on sequential bandwidth with up to 2,400 MiB/s at lower cost. Hyperdisk ML, optimized for AI and machine learning workloads, delivers the highest performance in the family, with up to 1,200,000 MiB/s throughput and 19,200,000 IOPS, enabling faster model loading and reduced idle time for accelerators in inference and training scenarios. This type supports volumes from 4 GiB to 64 TiB and is particularly useful for immutable datasets like model weights, where multiple instances can access the same volume in read-only mode for large-scale HPC or analytics tasks such as those in Hadoop or Spark. It became generally available in 2024, enhancing support for AI-driven applications. Limitations for Hyperdisk include restrictions on using Extreme, ML, or Throughput types as boot disks, zonal-only availability for ML volumes, and the need to adjust performance settings in increments (e.g., throughput every 6 hours), with attachment limits varying by volume size (up to 2,500 for ≤256 GiB, lower for larger). The table below summarizes the Hyperdisk family.
Hyperdisk Type | Max IOPS | Max Throughput (MiB/s) | Key Focus
Balanced | 160,000 | 2,400 | General-purpose workloads
Extreme | 350,000 | 5,000 | High random I/O
Throughput | 9,600 | 2,400 | Sequential access
ML | 19,200,000 | 1,200,000 | AI/ML data loading
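
Because Hyperdisk decouples performance from capacity, IOPS and throughput are specified explicitly at creation; a hedged gcloud sketch follows, with the volume name, zone, size, and performance figures chosen as arbitrary examples within Balanced limits:

    # Sketch: provision a Hyperdisk Balanced volume with explicit performance
    gcloud compute disks create hd-balanced-demo \
        --zone=us-central1-a \
        --type=hyperdisk-balanced \
        --size=1TB \
        --provisioned-iops=10000 \
        --provisioned-throughput=600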

Networking and Connectivity

Virtual Private Cloud

Google Compute Engine (GCE) utilizes Virtual Private Cloud (VPC) networks as the foundational networking layer, providing isolated, scalable virtual environments for resources like virtual machine (VM) instances. A VPC network acts as a global, virtual version of a physical network, spanning multiple regions without the need for physical cabling, and enables users to define subnets within specific regions for logical segmentation of resources. These networks support auto-mode or custom-mode configurations, where auto-mode automatically creates subnets in every region, while custom-mode allows manual definition of IP ranges and subnet placements to suit workload requirements. IP addressing in GCE VPCs includes internal and external addresses assigned to instances, with support for both dual-stack and IPv6-only configurations to accommodate modern networking needs. External IPv4 addresses can be optionally attached to instances for public reachability, while alias IP ranges enable secondary IP assignments to containers or services on a VM without additional network interfaces. Subnets are associated with primary IP ranges (CIDR blocks) and can include secondary ranges for alias IPs, ensuring efficient address management across regional deployments. IPv6 support, introduced in 2022 and expanded thereafter, allows globally routable addresses for improved scalability in IPv6-enabled workloads. Firewall rules in VPC networks control ingress and egress traffic using distributed, stateful enforcement that applies to all instances within the network. Rules are defined by priority (lower numbers take precedence), direction, and action (allow or deny), with matching based on IP protocols, ports, source/destination IP ranges, and instance tags or service accounts for granular control. For example, a common rule might allow HTTP (TCP port 80) from any source to instances tagged "web-server," while denying all other ingress to minimize exposure. These rules are enforced at the instance level but defined at the VPC level, with a default quota of 1000 rules per project, and hierarchical firewall policies available for enterprise-scale management. Routes in VPC networks direct traffic, including a default route for internet-bound traffic and static routes for on-premises or peered connectivity. The system-generated default route (0.0.0.0/0) handles outbound traffic to the internet via Google's edge routers, while custom routes can specify next-hop types such as VM instances, VPN tunnels, or interconnects with metrics to prioritize paths. Route propagation is automatic for connected networks, ensuring dynamic updates without manual intervention in most cases. VPC Network Peering enables secure, low-latency connectivity between multiple VPC networks, either within the same project, across projects, or between different organizations, without requiring gateways or VPNs. Peering connections exchange routes automatically (unless disabled), allowing instances in peered networks to communicate using internal IP addresses as if they were in the same network, which is particularly useful for multi-project architectures or hybrid cloud setups. Limitations include no transitive peering (direct connections only) and non-overlapping IP ranges to prevent conflicts. Shared VPCs extend this capability by allowing centralized network administration across projects, where a host project owns the VPC and subnets are shared with attached service projects for deployment.
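
The "web-server" example above maps directly onto a gcloud firewall rule; the rule and network names are illustrative placeholders:

    # Sketch: allow inbound HTTP from any source to instances tagged "web-server"
    gcloud compute firewall-rules create allow-http \
        --network=my-vpc \
        --direction=INGRESS \
        --action=ALLOW \
        --rules=tcp:80 \
        --source-ranges=0.0.0.0/0 \
        --target-tags=web-server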

Load Balancing and IP Management

Google Cloud Load Balancing provides scalable traffic distribution for Compute Engine instances, supporting various types tailored to different protocols and scopes. The platform offers the external Application Load Balancer for HTTP(S) traffic, which operates globally using proxy-based distribution to handle content-based routing and SSL offloading. The external passthrough Network Load Balancer supports TCP/SSL/UDP protocols with non-proxied forwarding for low-latency applications, while internal Application and passthrough Network Load Balancers manage intra-VPC traffic for private services. These load balancers integrate with managed instance groups (MIGs) as backend services, enabling automatic distribution of traffic across multiple VM instances for availability and scalability. IP address management in Compute Engine distinguishes between ephemeral and static external IPs to support reliable external connectivity. Ephemeral external IPs are automatically assigned from Google's pool upon VM creation and released when the instance stops or terminates, making them suitable for temporary workloads but unsuitable for services requiring persistent addressing. Static external IPs can be reserved in advance or promoted from an existing ephemeral IP, ensuring consistent public access for DNS records or external integrations, with options for regional or global scopes. Reservations allow pre-allocation without attachment to a specific instance, facilitating flexible assignment across projects or regions. Global load balancing leverages anycast addressing to route traffic to the nearest healthy backend across worldwide regions, minimizing latency for multi-region deployments. A single anycast IP address serves as the frontend, with Google's edge network directing packets based on proximity and backend health, supporting both Premium and Standard network tiers for optimized performance. This approach enables seamless failover and content delivery integration, such as with Cloud CDN, for applications spanning multiple zones or regions. Autoscaling integrates with load balancing through backend services and health checks to dynamically adjust instance counts based on traffic demands. Health checks probe instance groups at configurable intervals to verify responsiveness, removing unhealthy backends from load distribution and triggering autoscaling policies. Autoscalers can base decisions on load balancing capacity metrics, such as serving capacity or HTTP request rates, ensuring resources scale in tandem with incoming traffic while integrating with MIGs for rolling updates. Recent enhancements to global load balancing include traffic isolation policies, introduced in May 2025, which route requests preferentially to the closest region for multi-region applications, reducing latency in preview mode. Additionally, failover capabilities for global external Application Load Balancers, reaching general availability in November 2024, provide regional backup backends for improved resilience in distributed setups. These updates build on prior optimizations like service load balancing policies from July 2024, enhancing multi-region traffic management.
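
Reserving a static external IP, for example as a global load balancer frontend, is a one-line gcloud operation; the address name below is an illustrative placeholder:

    # Sketch: reserve a global static external IP and read back its value
    gcloud compute addresses create lb-frontend-ip --global
    gcloud compute addresses describe lb-frontend-ip \
        --global --format="value(address)"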

Images and Snapshots

Operating System Images

Google Compute Engine provides a variety of preconfigured operating system (OS) images that users can select to boot virtual machine (VM) instances, ensuring compatibility with Google's infrastructure. These images include popular Linux distributions and Windows Server editions, all optimized for cloud workloads with built-in support for features like automatic security updates and networking. Linux images are maintained by Google or partners and are available at no additional licensing cost for most variants, while Windows images incur on-demand licensing fees. Among the supported Linux public images are Debian (versions 13, 12, and 11), Ubuntu (such as 24.04 LTS and 22.04 LTS), and Rocky Linux (10 and 9), each with default disk sizes ranging from 10 GB to 20 GB and configurations that disable root password login for enhanced security. For Windows, public images encompass Windows Server 2022, 2019, 2016, and the 2025 edition, which achieved general availability in late 2024 and supports extended updates until November 2034; these images enable automatic updates and integration with Google Cloud tools like the guest environment for metadata access. Users can list and select these images via the Google Cloud console or gcloud CLI commands, with regular patches applied for critical vulnerabilities. In addition to public images, users can create custom images to tailor environments with specific software or configurations. Custom images are generated from existing boot disks, snapshots, or imported virtual disks stored in Cloud Storage, allowing for the pre-installation of applications before launching instances. This approach supports scenarios like migrating on-premises workloads or standardizing VM setups across deployments. To manage version updates efficiently, Google Compute Engine uses image families, which are logical groupings of related images within projects like debian-cloud or ubuntu-os-cloud. For instance, the debian-11 family always references the latest non-deprecated Debian 11 image, enabling rolling updates without manual intervention; if issues arise, administrators can deprecate the current image to revert to a prior stable version. This mechanism ensures access to the most recent stable releases while avoiding end-of-life versions. Deprecated images, such as those for Ubuntu 18.04 and 20.04, enter an end-of-support phase where Google ceases updates and eventually deletes them from public availability, prompting users to migrate to supported alternatives. During deprecation, image families automatically exclude these versions, and existing VMs can continue running but without patches or guarantees; extended paid support may be available for select OSes like Windows via Microsoft's programs. Images in Google Compute Engine operate on a global scope, allowing them to be shared seamlessly across projects and regions without duplication. Public images are inherently accessible project-wide, while custom images can be exported to Cloud Storage or shared via IAM policies for use in other projects, facilitating consistent deployments in multi-project environments.
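
The image family mechanism can be inspected directly with gcloud; this sketch resolves which concrete image the debian-12 family currently points to (the family and project shown are standard public-image examples):

    # Sketch: resolve the latest non-deprecated image in a family
    gcloud compute images describe-from-family debian-12 \
        --project=debian-cloud \
        --format="value(name,family,status)"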

Snapshots and Data Backup

Google Compute Engine provides disk snapshots as a mechanism for backing up data from persistent disks and Hyperdisks. These snapshots capture the contents of a disk at a specific point in time and serve as incremental backups, storing only the data that has changed since the previous snapshot to optimize storage efficiency. All disk snapshots are encrypted at rest using Google-managed keys, and they are stored in Cloud Storage, with options for multi-regional or regional locations to ensure durability and availability. Disk snapshots can be created manually through the Google Cloud console, gcloud CLI, or API, allowing users to initiate backups on demand. Automated creation is supported via snapshot schedules, which enable periodic backups at user-defined intervals, such as daily or weekly, configurable through resource policies. Retention policies can be applied to manage snapshot lifecycle, with standard snapshots suitable for short- to medium-term retention and archive snapshots designed for long-term storage at lower costs; snapshots persist independently of the source disk and can be retained indefinitely until manually deleted. To restore data from a disk snapshot, users create a new persistent disk or Hyperdisk from the snapshot, which must be at least as large as the original source disk. This new disk can then be attached to a running or new instance using the console, gcloud commands, or APIs, after which the disk is mounted to access the restored data. The restoration process supports both zonal and regional disks, enabling quick recovery without downtime for the original instance. In addition to disk-level backups, Google Compute Engine offers machine images, which provide full backups of an entire instance. A machine image captures the instance's configuration, metadata, permissions, operating system, and data from all attached disks in a crash-consistent manner, using differential snapshots for subsequent images to store only changes from prior versions. These are particularly useful for backing up entire instances, instance cloning, or replicating environments across projects. Disk snapshots and machine images have global scope, allowing them to be created and restored in any region or zone within the project. Standard and archive snapshots are automatically replicated across multiple regions for high durability (up to 99.999999999% over a year), facilitating disaster recovery by enabling quick restoration in a secondary region during outages.
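
Both manual snapshots and snapshot schedules are available through gcloud; in this sketch the disk, snapshot, and schedule names, region, retention, and start time are illustrative placeholders:

    # Sketch: take a one-off snapshot of a disk ...
    gcloud compute disks snapshot data-disk \
        --zone=us-central1-a --snapshot-names=data-disk-backup

    # ... create a daily snapshot schedule with 14-day retention ...
    gcloud compute resource-policies create snapshot-schedule daily-backup \
        --region=us-central1 \
        --max-retention-days=14 \
        --daily-schedule \
        --start-time=04:00

    # ... and attach the schedule to the disk for automated backups
    gcloud compute disks add-resource-policies data-disk \
        --zone=us-central1-a --resource-policies=daily-backup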

Features and Capabilities

Performance Optimizations

Google Compute Engine offers various CPU platforms to optimize virtual machine (VM) performance based on workload requirements, supporting Intel processors such as Granite Rapids and Emerald Rapids for general-purpose and compute-intensive tasks, AMD EPYC processors including Genoa and Turin for cost-effective scale-out applications, and Arm-based processors like Google Axion and NVIDIA Grace for energy-efficient and cloud-native workloads. These platforms enable users to select machine series tailored to specific architectures, with Intel providing advanced vector extensions like AVX-512 for HPC, AMD offering strong multi-threaded performance for scale-out computing, and Arm delivering up to 50% better price-performance in certain inference scenarios. Transparent maintenance in Compute Engine is achieved through automatic live migration, which seamlessly transfers running VMs to healthy hosts during infrastructure events like repairs or software updates without downtime, reboots, or changes to instance configurations such as IP addresses or attached storage. This process involves a brief blackout period of under one second, during which the VM's memory state is copied to the target host, ensuring continuity for most workloads while excluding specialized setups like those with attached GPUs or large local SSDs. Disk performance can be enhanced by tuning IOPS and throughput provisions, particularly with Hyperdisk volumes that allow dynamic adjustments every four to six hours without detaching the disk, enabling workloads to scale from baseline levels up to 350,000 IOPS for Hyperdisk Extreme or 1,200,000 MiB/s throughput for Hyperdisk ML. For read-heavy applications, Hyperdisk supports asynchronous replication to create read replicas in a secondary region, providing low-latency access to duplicated data while maintaining primary write performance. Optimization techniques include aligning application I/O patterns with provisioned limits and using tools like fio for benchmarking to avoid contention across multiple disks. Acceleration for machine learning workloads is facilitated by attaching GPUs or TPUs to VMs, with GPU-enabled machine types like the A4 series integrating up to eight NVIDIA B200 GPUs for training large models and the G2 series supporting up to eight L4 GPUs for efficient inference. TPUs, optimized for tensor operations, can be deployed as TPU VMs directly connected to Compute Engine instances for hybrid setups, accelerating frameworks like TensorFlow and JAX with up to 10x peak performance gains over previous generations in training and serving tasks. These attachments require compatible machine types and zones, ensuring seamless integration for data processing and graphics-intensive applications. As of November 2025, the N4D machine series, powered by fifth-generation AMD EPYC processors and featuring up to 768 GB of DDR5 memory, delivers up to 3.5x better price-performance for web-serving workloads, 50% higher performance for general computing, and 70% higher performance for Java-based workloads compared to the prior N2D series, enhancing VM efficiency for web serving and Java-based applications. This update supports custom machine types with up to 96 vCPUs and a 4.1 GHz max-boost frequency, providing substantial gains in memory-bound scenarios without requiring changes to existing configurations.
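
The dynamic Hyperdisk tuning described above is performed in place with gcloud; in this sketch the disk name, zone, and the raised IOPS/throughput targets are illustrative values within Balanced limits:

    # Sketch: raise provisioned performance on an attached Hyperdisk volume
    # without detaching it (subject to the 4-6 hour adjustment interval)
    gcloud compute disks update hd-balanced-demo \
        --zone=us-central1-a \
        --provisioned-iops=20000 \
        --provisioned-throughput=800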

Management and Automation Tools

Google Compute Engine provides several tools for orchestrating, monitoring, and automating the lifecycle of instances, enabling efficient management at scale. Cloud Deployment Manager is an infrastructure-as-code service that automates the creation, updating, and deletion of Compute Engine resources through declarative configuration files written in YAML or Python and Jinja2 templates. It supports deploying instance groups, networks, and disks by leveraging underlying Compute Engine APIs, allowing users to version and reuse infrastructure definitions for consistent environments. Note that Cloud Deployment Manager will reach end of support on March 31, 2026, with recommendations to migrate to alternatives like Infrastructure Manager. For broader orchestration, Google Compute Engine integrates seamlessly with Terraform, an open-source infrastructure-as-code tool from HashiCorp. Users can provision and manage Compute Engine instances, such as virtual machines and autoscalers, using Terraform's declarative language and the official Google provider, which translates configurations into API calls for creating resources like google_compute_instance. This integration supports complex setups, including state management and dependency handling, to ensure reproducible deployments across projects. Monitoring and logging capabilities are essential for maintaining Compute Engine operations, with Cloud Monitoring collecting metrics such as CPU utilization, disk I/O, and network traffic from virtual machine instances via the Ops Agent. Users can create dashboards, set alerting policies based on thresholds, and visualize performance trends to proactively address issues. Complementing this, Cloud Logging aggregates and analyzes logs from Compute Engine VMs, including system events and application outputs, enabling real-time search, filtering, and export for compliance and troubleshooting. These tools integrate with other Google Cloud services to provide unified observability across hybrid environments. Autoscaling in Compute Engine is handled through managed instance groups (MIGs), which automatically adjust the number of virtual machine instances based on predefined policies tied to metrics like CPU usage, memory consumption, or custom signals from Cloud Monitoring. For example, a policy can scale out by adding instances when average CPU exceeds 60% and scale in when it drops below 40%, ensuring resource elasticity without manual intervention; a gcloud sketch of such a policy appears below. This feature supports both zonal and regional MIGs, with options for predictive scaling based on historical load patterns to minimize latency during traffic spikes. Operating system management is streamlined via OS Config, a service within VM Manager that automates patching, reporting, and enforcement on Compute Engine instances. It applies OS updates using native mechanisms for supported images like Debian, RHEL, and Windows, with scheduling options to patch during maintenance windows and assess against baselines such as CIS benchmarks. The OS Config agent, installed on VMs, reports status and vulnerabilities, enabling fleet-wide remediation without downtime risks. Automation is further enhanced through the gcloud CLI, part of the Google Cloud SDK, which provides scripting-friendly commands for managing Compute Engine resources programmatically. Commands like gcloud compute instances create and gcloud compute instance-groups managed list-instances support batch operations, filtering, and output formatting in JSON or YAML for integration into pipelines or shell scripts.
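
A CPU-based autoscaling policy like the one just described can be attached to an existing MIG with a single gcloud command; the group name, zone, and thresholds are illustrative placeholders:

    # Sketch: autoscale a MIG between 2 and 10 replicas, targeting 60% CPU
    gcloud compute instance-groups managed set-autoscaling web-mig \
        --zone=us-central1-a \
        --min-num-replicas=2 \
        --max-num-replicas=10 \
        --target-cpu-utilization=0.60 \
        --cool-down-period=90s
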
As of November 10, 2025, observability fields for reservations became generally available, allowing users to query via the API or gcloud CLI which reservations a VM consumes and to list VMs attached to specific reservations, improving visibility into committed resource utilization.

Billing and Pricing

Pricing Models

Google Compute Engine employs a pay-as-you-go model for virtual machine (VM) instances, where costs are calculated based on the resources consumed, including vCPUs, memory, and storage. Billing occurs per second after a one-minute minimum charge, allowing for flexible usage without long-term commitments. For on-demand instances, vCPUs and memory are priced separately; for example, in the us-central1 region, an N2 vCPU may cost approximately $0.03465 per hour, while memory is around $0.003938 per GiB-hour, with rates varying by machine family and region. Spot VMs offer a cost-effective alternative for interruptible workloads, providing discounts typically up to 91% (previously 60-91%) compared to on-demand prices, with dynamic pricing that can adjust up to once per day based on supply and demand. Starting October 28, 2025, discounts may be less than 60% off on-demand prices. These instances can be preempted by Google with a 30-second notice, making them suitable for fault-tolerant applications like batch processing. For instance, a Spot VM equivalent to an on-demand N2 instance might cost as low as $0.003465 per vCPU-hour under optimal conditions. Certain operating systems incur premium charges in addition to base VM costs, particularly for licensed images such as Windows Server or specialized Linux distributions. Windows licensing, for example, adds fees that scale with vCPU count, such as $0.006 per core per hour for instances with 9-127 vCPUs, billed on-demand through Google Cloud. Similarly, Red Hat Enterprise Linux (RHEL) adds approximately $0.06 per hour for instances with 1-4 vCPUs and $0.13 per hour for those with 5+ vCPUs (rates as of 2023; verify current pricing), while SUSE Linux Enterprise Server (SLES) charges $0.02-$0.11 per hour depending on the machine type. These fees ensure compliance with vendor licensing while integrating seamlessly with Compute Engine billing. Network egress traffic, which includes data leaving the Google Cloud network, follows tiered pricing to encourage efficient data management. Inbound traffic is free, but outbound traffic to the internet is charged per GiB; for example, in North America, the first 1 GiB is free monthly, followed by $0.12 per GiB for the next 1 TiB, decreasing to $0.08 per GiB beyond 10 TiB. Inter-region transfers within the same continent cost $0.01 per GiB, while cross-continent egress starts at $0.02-$0.14 per GiB depending on the destination. Persistent disk storage contributes to overall costs, with pricing based on provisioned capacity rather than usage. Standard persistent disks (HDD) cost $0.04 per GiB-month, while SSD-backed disks are $0.17 per GiB-month, prorated per second for the full provisioned amount. To illustrate total cost calculation for a simple on-demand VM: consider an instance with 2 vCPUs at $0.03465 per hour each, 8 GiB of memory at $0.003938 per GiB-hour, a 100 GiB SSD disk at $0.17 per GiB-month (approximately $0.000235 per GiB-hour), running for 730 hours (one month). The monthly total would be (2 × 0.03465 × 730) + (8 × 0.003938 × 730) + (100 × 0.17) ≈ $50.59 (vCPUs) + $22.99 (memory) + $17.00 (disk) = $90.58, excluding any premium OS or egress fees. This formula—total = (vCPU-hours × vCPU rate) + (memory GiB-hours × memory rate) + (provisioned GiB-months × disk rate)—highlights the granular, resource-based billing structure.
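
The worked example above can be reproduced as a small shell calculation; the rates are the illustrative figures quoted in this section, not current list prices, and per-component rounding may shift the total by a cent:

    # Sketch: apply the cost formula with awk (rates are illustrative)
    awk 'BEGIN {
      hours = 730                     # hours in the billing month
      vcpu  = 2   * 0.03465  * hours  # vCPU-hours x vCPU rate
      mem   = 8   * 0.003938 * hours  # memory GiB-hours x memory rate
      disk  = 100 * 0.17              # provisioned GiB-months x disk rate
      printf "vCPU: $%.2f  memory: $%.2f  disk: $%.2f  total: $%.2f\n",
             vcpu, mem, disk, vcpu + mem + disk
    }'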

Discounts and Reservations

Google Compute Engine offers several mechanisms to reduce costs for long-term or predictable workloads, including sustained use discounts, committed use discounts, reservations, and promotional credits. These options allow users to optimize expenses without altering their workloads, by applying automatic reductions or guaranteeing resource availability. Sustained use discounts (SUDs) are applied automatically to eligible resources that run for more than 25% of a billing month, providing tiered savings that increase with higher utilization levels, up to 30% off on-demand prices for full-month usage. SUDs apply to vCPU and memory usage on eligible machine series, but only if no other discounts like committed use are active on the same usage. Committed use discounts (CUDs) enable deeper savings through contractual commitments to specific usage over 1- or 3-year periods, without upfront payments. Resource-based CUDs target predictable workloads on particular machine types, offering discounts up to 57% compared to on-demand prices, depending on the term and machine family. In contrast, flexible CUDs are spend-based commitments that apply broadly across Compute Engine, Google Kubernetes Engine, and Cloud Run, providing a flat 28% discount for 1-year commitments and 46% for 3-year commitments on eligible vCPU and memory usage. These can be combined with reservations to ensure capacity while maximizing savings, and they automatically cover the highest-discount eligible usage first. Reservations allow users to pre-provision capacity in specific zones or regions, guaranteeing availability for critical workloads even during high demand. Users are charged at standard on-demand rates for reserved resources, but reservations can be paired with CUDs or SUDs for additional discounts, and they support features like future reservations for planning up to one year ahead. To aid management, Compute Engine provides reservation recommendations based on historical usage, helping identify opportunities to optimize idle capacity. New customers receive $300 in free credits upon signup, applicable to Compute Engine and other Google Cloud services for the first 90 days, alongside an always-free tier that includes one e2-micro VM instance per month in select regions. In November 2025, Google made generally available enhanced reservation observability features, including API fields to verify which reservation a VM consumes and list VMs using a specific reservation, improving transparency for capacity management.
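
Creating a zonal reservation is straightforward with gcloud; in this sketch the reservation name, zone, VM count, and machine type are illustrative placeholders for capacity that matching on-demand or committed-use VMs can then consume:

    # Sketch: reserve capacity for ten n2-standard-4 VMs in one zone
    gcloud compute reservations create web-capacity \
        --zone=us-central1-a \
        --vm-count=10 \
        --machine-type=n2-standard-4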

References

  1. [1]
    Compute Engine overview | Google Cloud Documentation
    Compute Engine is an infrastructure as a service (IaaS) product that offers self-managed virtual machine (VM) instances and bare metal instances. Compute Engine ...PMU overview · Instance creation overview · Stop or suspend VMs overview
  2. [2]
    Compute Engine | Google Cloud
    Compute Engine is a computing and hosting service that lets you create and run virtual machines on Google infrastructure, comparable to Amazon EC2 and Azure ...
  3. [3]
    Google Compute Engine launches, expanding Google's cloud ...
    Jun 28, 2012 · Google App Engine has been at the heart of Google's cloud offerings since our launch in 2008, and we're excited to begin providing developers ...
  4. [4]
    Google Compute Engine is now Generally Available with expanded ...
    Dec 2, 2013 · Google Compute Engine is now Generally Available with expanded OS support, transparent maintenance, and lower prices · Expanded operating system ...
  5. [5]
    Compute Engine instances - Google Cloud Documentation
    This page provides an overview of Compute Engine instances. A Compute Engine instance can be either a virtual machine (VM) or bare metal instance that is ...Create and start an instance · Windows workloads · Microsoft Licensing on Google...
  6. [6]
    Regions and zones | Compute Engine - Google Cloud Documentation
    Authenticate workloads to Google Cloud API using service accounts · Authenticate workloads to other workloads over mTLS · Agent for Compute Workloads overview ...Resource Quotas · Location Selection Tips · Available Regions And Zones
  7. [7]
    Google Announces Cloud Infrastructure Service: Google Compute ...
    Jun 28, 2012 · The cloud service, which Google describes as in a 'Limited Preview Release,' allows users to run large-scale computing workloads on Linux ...Missing: June | Show results with:June
  8. [8]
    Google Taps KVM for Compute Engine - Database Trends and ...
    Google recently announced the Google Compute Engine, a service running on KVM that enables developers to have quick access to vast numbers of virtual ...
  9. [9]
    Google's New IaaS Offering Runs Linux VMs in the Cloud - InfoQ
    Jun 28, 2012 · As is typical for Google, GCE will be made available on a limited preview basis, which you must sign up to receive. No date for general release ...
  10. [10]
    Free Ride Will Soon Be Over For Google Compute Engine Limited ...
    Feb 13, 2013 · Google Compute Engine Limited Preview beta customers who want to continue using the cloud service will have to fork over their credit-card ...
  11. [11]
    Google Compute Engine is now open to all
    May 15, 2013 · To get started, go to the Google Cloud Console, select Compute Engine and click the “New Instance” button. Fill out the required information ...
  12. [12]
    [PDF] Early Observations on Performance of Google Compute Engine for ...
    Dec 23, 2013 · Corresponding to the single-CPU GCE type (i.e. n1-standard-1-d), we only show the single-thread memory performance of different VM types. As ...
  13. [13]
    Expanded Windows Support on Google Cloud Platform
    Dec 8, 2014 · Monday, December 8, 2014. Our customers, large and small, have put a number of things on their holiday wish lists, including better support ...
  14. [14]
    Google Compute Engine Goes Into General Availability - Forbes
    Dec 3, 2013 · GCE has only been in beta for a little over 12 months, such a short time to GA indicates the seriousness with which Google approaches this ...
  15. [15]
    Introducing Sustained Use Discounts - Google Cloud Platform Blog
    Apr 4, 2014 · Introducing Sustained Use Discounts - Automatically pay less for sustained workloads on Compute Engine. Friday, April 4, 2014. At Google Cloud ...
  16. [16]
    Introducing Preemptible VMs, a new class of compute available at ...
    May 18, 2015 · Preemptible VMs are the same as regular instances except for one key difference - they may be shut down at any time.
  17. [17]
    Announcing GPUs for Google Cloud Platform
    Nov 16, 2016 · Early in 2017, Google Cloud Platform will offer GPUs worldwide for Google Compute Engine and Google Cloud Machine Learning users.
  18. [18]
    The new Google Cloud region in Jakarta is now open
    Jun 24, 2020 · With this region, Google Cloud now offers 24 regions and 73 zones across 17 countries worldwide. Having a region in Jakarta will help new and ...
  19. [19]
    Introducing Google Cloud Confidential Computing & VMs
    Jul 14, 2020 · As of June 16, 2022, Confidential VMs are generally available on compute optimized C2D VMs. Learn more here. At Google, we believe the future of ...
  20. [20]
    Compute Engine release notes - Google Cloud Documentation
    On January 21, 2026, all remaining accounts will automatically migrate to the new model. You can opt in before that date to start receiving the expanded ...
  21. [21]
    Introducing Flex-start VMs for the Compute Engine Instance API.
    Sep 25, 2025 · Flex-start VMs can run uninterrupted for a maximum of seven days and consume preemptible quota. A new way to request capacity.
  22. [22]
    G4 VMs powered by NVIDIA RTX 6000 Blackwell GPUs are GA
    Oct 20, 2025 · The G4 VM is available now, bringing GPU availability to more Google Cloud regions than ever before, for applications that are latency sensitive ...
  23. [23]
    Compute Engine instance lifecycle - Google Cloud Documentation
    This document explains the lifecycle of a Compute Engine instance, covering the various states it can go through from creation to deletion.
  24. [24]
    Create and start a Compute Engine instance
    In the Google Cloud console, go to the Create an instance page. To configure instance properties, use the options in the navigation ...
  25. [25]
    Overview of creating Compute Engine instances
    Compute Engine lets you create and run instances on Google infrastructure. This document provides an overview of the various configuration parameters that ...
  26. [26]
    Instance groups | Compute Engine - Google Cloud Documentation
    Compute Engine offers two kinds of VM instance groups, managed and unmanaged: Managed instance groups (MIGs) let you operate apps on multiple identical VMs. You ...
  27. [27]
    Autoscaling groups of instances | Compute Engine
    Managed instance groups (MIGs) offer autoscaling capabilities that let you automatically add or delete virtual machine (VM) instances from a MIG based on ...
  28. [28]
    About Flex-start VMs - Compute - Google Cloud Documentation
    This document provides an overview of Flex-start VMs, detailing their key characteristics, as well as the requirements and limitations that you apply when ...
  29. [29]
    Bare metal instances on Compute Engine
    Learn about the features and available machine series for bare metal instances on Compute Engine.
  30. [30]
    Set the number of threads per core | Compute Engine
    On Compute Engine, each virtual CPU (vCPU) is implemented as a single hardware multithread, and two vCPUs share each physical CPU core by default.
  31. [31]
    CPU platforms | Compute Engine - Google Cloud Documentation
    The machine type of your compute instance determines the number of vCPUs and amount of memory allocated to the instance. CPU processor, Processor SKU, Supported ...
  33. [33]
    Machine families resource and comparison guide | Compute Engine
    This document describes the machine families, machine series, and machine types that you can choose from to create a virtual machine (VM) instance or bare ...
  34. [34]
    Allocation quotas | Compute Engine - Google Cloud Documentation
    This quota is visible in the Google Cloud console on the Quotas page. Compute Engine automatically sets this quota to be 10 times your regular CPU quota. You ...
  35. [35]
    Network bandwidth | Compute Engine - Google Cloud Documentation
    Read about Google Cloud's network bandwidth and Tier_1 networking for compute instances, ingress and egress rates, receive and transmit queues, and queue ...
  36. [36]
    Compute Engine quota and limits overview
    A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software ...
  37. [37]
    Google Compute Engine FAQ | Google Cloud Documentation
    How do I find out how much quota I have used or have left? Check your quota limits and usage in the quota page on the Google Cloud console. If you reach the ...
  38. [38]
    Regions and zones  |  Compute Engine  |  Google Cloud
  41. [41]
    General-purpose machine family for Compute Engine  |  Google Cloud
  42. [42]
    N4D VMs based on AMD Turin now GA | Google Cloud Blog
    N4D now GA: Gain up to 3.5x price-performance for scale-out workloads. November 11, 2025 ...
  45. [45]
    Create a VM with a custom machine type  |  Compute Engine  |  Google Cloud
  46. [46]
    Sole-tenancy overview  |  Compute Engine  |  Google Cloud
  47. [47]
    Axion-based N4A VMs now in preview | Google Cloud Blog
    Unlock 2x better price-performance with Axion-based N4A VMs, now in preview. November 6, 2025 ...
  48. [48]
    About Persistent Disk | Compute Engine
    This document describes the features, types, performance and benefits of Persistent Disk volumes. If you need block storage for a virtual machine (VM) ...
  49. [49]
    Persistent Disk performance overview | Compute Engine
    For N2D, the maximum write throughput per instance for SSD Persistent Disk is 600 MiBps. Sharing disks between instances. Attaching a disk to multiple instances ...
  50. [50]
    Extreme Persistent Disk | Compute Engine
    When you create an Extreme Persistent Disk, you can provision 2,500 to 120,000 IOPS. If you need more than 125,000 IOPS, we recommend Google Cloud Hyperdisk.
  51. [51]
    Create a new Persistent Disk volume | Compute Engine
    In the Google Cloud console, activate Cloud Shell. · Use the gcloud compute disks create command to create the zonal Persistent Disk volume. · After you create ...
  53. [53]
    About Local SSD disks | Compute Engine
    Local SSD disks offer superior I/O operations per second (IOPS), and very low latency compared to the persistent storage provided by Google Cloud Hyperdisk and ...
  54. [54]
    Google Cloud Hyperdisk overview | Compute Engine
    This document describes the features of Google Cloud Hyperdisk. Hyperdisk is the fastest and most efficient durable disk for Compute Engine.
  55. [55]
    About Hyperdisk ML | Compute Engine - Google Cloud Documentation
    This document describes the features of Hyperdisk ML, which offers the highest throughput of all Google Cloud Hyperdisk types.
  56. [56]
    Google Cloud Platform Technology Nuggets — October 16–31 ...
    Nov 4, 2024 · Hyperdisk ML, the AI/ML-focused block storage service is generally available. Google has been announced as a Leader in Gartner Magic Quadrant ...
  57. [57]
    Choose a load balancer - Google Cloud Documentation
    External versus internal load balancing; Global versus regional load balancing; Premium versus Standard Network Service Tiers; Proxy versus passthrough load ...
  58. [58]
    IP addresses | Compute Engine - Google Cloud Documentation
    Compute Engine automatically assigns a single IPv4 address from the primary IPv4 subnet ranges. You assign a specific internal IPv4 address when you create a ...
  59. [59]
    Configure static external IP addresses | Compute Engine
    In the Google Cloud console, go to the IP addresses page. Find the address in the list and check the Type column for the type of IP address.
  60. [60]
    Cloud Load Balancing overview - Google Cloud Documentation
    Single anycast IP address. With Cloud Load Balancing, a single anycast IP address is the frontend for all of your backend instances in regions around the world.
  61. [61]
    External Application Load Balancer overview
    Global Anycast external IP addresses over Premium Tier; Can access backends across multiple regions; Supports Cloud CDN; Supports Cloud Armor. Classic ...
  62. [62]
    Load balancing and scaling | Compute Engine
    Google Cloud offers several different types of load balancing that differ in capabilities, usage scenarios, and how you configure them. See Google Cloud load ...
  63. [63]
    Scaling based on load balancing serving capacity | Compute Engine
    You can use autoscaling in conjunction with load balancing by setting up an autoscaler that scales based on the load of your instances. An external or internal ...
  64. [64]
    Cloud Load Balancing release notes - Google Cloud Documentation
    Global and cross-region load balancers now support enabling traffic isolation on the service load balancing policy. By default, these load balancers use the ...
  65. [65]
    OS images  |  Compute Engine  |  Google Cloud
  67. [67]
    Google Cloud latest news and announcements
    Jan 1, 2025 · We are excited to announce the general availability of Windows Server 2025 on Google Compute Engine. You can now run Windows Server 2025 Data ...
  68. [68]
    Operating systems lifecycle  |  Compute Engine  |  Google Cloud
  70. [70]
    Create archive and standard disk snapshots  |  Compute Engine  |  Google Cloud
  73. [73]
    Machine images  |  Compute Engine  |  Google Cloud
  74. [74]
    Live migration process during maintenance events | Compute Engine
    Live migration lets Google Cloud perform maintenance without interrupting a workload, rebooting an instance, or modifying any of the instance's properties ...
  75. [75]
    Analyze the provisioned IOPS and throughput for Hyperdisk volumes
    You can change the provisioned IOPS or throughput once every 6 hours for Hyperdisk ML or once every 4 hours for all other Hyperdisk types.
  76. [76]
    Optimize Hyperdisk performance | Compute Engine
    After you provision your Google Cloud Hyperdisk volumes, your application and operating system might require performance tuning to meet your performance needs.
  77. [77]
    GPU machine types | Compute Engine - Google Cloud Documentation
    This document outlines the NVIDIA GPU models available on Compute Engine, which you can use to accelerate machine learning (ML), data processing, ...
  78. [78]
    Introduction to Cloud TPU - Google Cloud Documentation
    Cloud TPUs are optimized for specific workloads. In some situations, you might want to use GPUs or CPUs on Compute Engine instances to run your machine learning ...
  80. [80]
    Supported resource types | Cloud Deployment Manager
    Deployment Manager uses the underlying APIs of each Google Cloud service to deploy your resources. For example, to create Compute Engine virtual machine ...
  81. [81]
    Deployment Manager Fundamentals - Google Cloud Documentation
    Cloud Deployment Manager will reach end of support on March 31, 2026. If you currently use Deployment Manager, please migrate to Infrastructure Manager or ...
  82. [82]
    Provision Compute Engine resources with Terraform
    This page introduces you to using Terraform with Compute Engine, including an introduction to how Terraform works and some resources to help you get started ...
  83. [83]
    Getting Started with the Google Cloud provider - Terraform Registry
    A Google Compute Engine VM instance is named google_compute_instance in Terraform. The google part of the name identifies the provider for Terraform.
  84. [84]
    Cloud Monitoring documentation - Google Cloud
    Learn how to collect and monitor metrics from an Apache web server installed on a Compute Engine virtual machine (VM) instance by using the Ops Agent.
  85. [85]
    Cloud Monitoring - Google Cloud
    Cloud Monitoring offers automatic out-of-the-box metric collection dashboards for Google Cloud services. It also supports monitoring of hybrid and multicloud ...
  86. [86]
    Cloud Logging documentation - Google Cloud
    This hands-on lab shows you how to view your Cloud Run functions with their execution times, execution counts, and memory usage in the Google Cloud console.
  87. [87]
    Cloud Logging overview | Google Cloud Documentation
    This document provides an overview of Cloud Logging, which is a real-time log-management system with storage, search, analysis, and monitoring support.
  88. [88]
    Create a MIG with autoscaling enabled | Compute Engine
    This document describes how to create an autoscaled managed instance group (MIG) that automatically adds and removes VMs based on average CPU utilization ...
  89. [89]
    Compute Engine managed instance groups get scale-in controls
    Oct 21, 2020 · New scale-in controls in Compute Engine let you limit the VM deletion rate by preventing the autoscaler from reducing a MIG's size by more VM instances.
  90. [90]
    google_compute_autoscaler | Resources | hashicorp/google
    Autoscalers allow you to automatically scale virtual machine instances in managed instance groups according to an autoscaling policy that you define.
  91. [91]
    About Patch | VM Manager - Google Cloud Documentation
    The OS Config service enables patch management in your environment while the OS Config agent uses the update mechanism for each operating system to apply ...
  92. [92]
    Using Compute Engine's OS patch management service
    Jan 13, 2021 · In this blog, we share a step-by-step guide on how to set up a project with a schedule to automatically patch filtered VM instances.
  93. [93]
    Best practices for OS patch management on Compute Engine.
    Sep 14, 2020 · Google Cloud's OS patch management service is a powerful tool to help you install updates at scale across the whole fleet safely and effectively.
  94. [94]
    Scripting gcloud CLI commands | Google Cloud SDK
    Google Cloud SDK comes with a variety of tools like filtering, formatting, and the --quiet flag, enabling you to effectively handle output and automate tasks.
  95. [95]
    VM instance pricing | Google Cloud
    This page describes the cost of running a Compute Engine VM instance with any of the following machine types, as well as other VM instance-related pricing.
  96. [96]
    Pricing | Compute Engine: Virtual Machines (VMs) - Google Cloud
    This page is a list of Compute Engine pricing in a single place for convenience. It is intended for reference purposes and does not provide detailed pricing ...
  97. [97]
    Spot VMs pricing - Google Cloud
    Prices adjust based on market trends and supply and demand for Spot VMs capacity. You pay the Spot price that is in effect when your instances are running.
  99. [99]
    Disk and image pricing | Google Cloud
    Hyperdisk ML provisioned space. $0.000109589 / 1 gibibyte hour. Hyperdisk ML provisioned throughput. $0.000164384 / 1 hour. If you pay in a currency other than ...
  101. [101]
    Network pricing
  102. [102]
    Sustained use discounts - Compute - Google Cloud Documentation
    The discount increases incrementally with usage and you can get up to a 30% net discount off of the resource cost for virtual machine (VM) instances that run ...
  103. [103]
    Pricing Overview | Google Cloud
    With Google Cloud's pay-as-you-go pricing structure, you only pay for the services you use. No up-front fees. No termination charges. Pricing varies by product ...
  104. [104]
    Committed use discounts (CUDs) for Compute Engine
    This document explains Google Cloud's committed use discounts (CUDs) and the types of CUDs that you can receive for Compute Engine. Google Cloud offers CUDs ...
  105. [105]
    Combine reservations with committed use discounts | Compute Engine
    Committed use discounts (CUDs) provide deeply discounted prices for your Compute Engine resources in exchange for 1-year or 3-year committed-use contracts ...
  106. [106]
    About reservations | Compute Engine - Google Cloud Documentation
    You're charged for the reserved resources at the same on-demand rate as running VMs, including any applicable discounts, as long as the reservation exists.
  107. [107]
    Reservation recommendations | Compute Engine
    Compute Engine provides reservation recommendations to help you identify idle or underutilized on-demand reservations for the previous seven days so that you ...Customize Recommendations · Choose The Right... · The Observation Period<|control11|><|separator|>