Google Compute Engine
Google Compute Engine is an infrastructure-as-a-service (IaaS) offering from Google Cloud Platform that enables users to create and manage virtual machine (VM) instances and bare metal servers on Google's global data center infrastructure.[1] It provides scalable compute resources, allowing customers to run diverse workloads such as web applications, databases, batch processing, and high-performance computing without managing underlying hardware.[2] Launched in preview on June 28, 2012, and reaching general availability on December 2, 2013, Compute Engine uses a KVM-based hypervisor to deliver reliable, self-managed virtual machine instances with options for Linux and Windows operating systems, while bare metal servers provide direct hardware access without virtualization.[3][4][5]

Compute Engine supports a variety of machine types tailored to specific needs, including general-purpose (e.g., N1, E2), compute-optimized (e.g., C2), memory-optimized (e.g., M2), and accelerator-optimized instances equipped with GPUs, Google's custom Tensor Processing Units (TPUs), or Arm-based processors like Axion for AI and machine learning tasks.[1] Users can customize configurations for vCPUs, memory, and storage, with options like Persistent Disk for block storage, Local SSD for high-performance temporary storage, and Hyperdisk for advanced throughput.[6] The service guarantees 99.9% uptime for most instances and 99.95% for memory-optimized VMs, featuring live migration to minimize downtime during maintenance.[2]

Integrated with other Google Cloud services, Compute Engine facilitates container orchestration via Google Kubernetes Engine (GKE), data analytics with BigQuery, and storage with Cloud Storage, enabling hybrid and multi-cloud architectures.[1] Pricing models include pay-as-you-go, Spot VMs with discounts of up to 91% for interruptible workloads, and committed use discounts for predictable savings, with a free tier offering one e2-micro instance monthly.[2] Available across 42 regions and 127 zones worldwide as of 2025, the service emphasizes security through features like Shielded VMs, customer-managed encryption keys, and compliance with standards such as SOC, PCI DSS, and HIPAA.[7]

History
Launch and Early Development
Google Compute Engine was announced on June 28, 2012, during the Google I/O developer conference as a limited preview service within the Google Cloud Platform (GCP). This launch marked Google's entry into the infrastructure-as-a-service (IaaS) market, offering users the ability to provision and manage virtual machines (VMs) on its global infrastructure without the need to handle underlying hardware.[8] The service was positioned to compete with offerings like Amazon EC2, emphasizing Google's strengths in scalability, performance, and cost-efficiency, with claims of providing 50% more compute power per dollar than competitors.

At launch, Google Compute Engine focused on delivering KVM-based virtual machines primarily for Linux operating systems, enabling developers and businesses to run large-scale workloads such as web applications, batch processing, and data analysis.[9] Initial VM configurations supported up to 8 virtual CPUs and 3.75 GB of RAM per core, with persistent block storage for data durability.[10] Key early integrations with other GCP services, such as Google Cloud Storage, allowed users to store and access unstructured data directly from VMs, supporting workflows for applications requiring object storage alongside compute resources. Access to the limited preview required sign-up and was initially restricted to selected developers, with no public pricing or general availability timeline disclosed.

In early 2013, the service transitioned from limited preview to a broader beta phase, ending the free trial period and requiring users to provide credit card details for continued access.[11] By May 2013, Google opened the beta to all users via the Google Cloud Console, expanding availability and introducing initial machine type offerings like the n1-standard series, which balanced CPU and memory for general-purpose workloads (e.g., n1-standard-1 with 1 vCPU and 3.75 GB RAM).[12][13] During this beta period, support for additional Linux distributions grew, and foundational features like live migration for maintenance were tested to ensure high availability. Windows support was introduced in limited preview later in development, broadening OS compatibility.[14]

The service reached general availability on December 2, 2013, with a 99.95% monthly uptime SLA, 24/7 support, and reduced pricing to encourage broader adoption.[4] This milestone solidified Google Compute Engine's role in GCP, transitioning it from experimental preview to a production-ready IaaS platform capable of supporting enterprise-scale deployments.[15]

Major Milestones and Updates
Following its general availability in 2013, Google Compute Engine saw key enhancements in operating system support and pricing models. In April 2014, the service introduced sustained use discounts, which automatically apply up to a 30% reduction for instances running more than 25% of a billing month, optimizing costs for long-running workloads without requiring commitments.[16] Windows Server support launched in limited preview that same year, enabling users to run Microsoft workloads on the platform, with expanded capabilities—including license mobility for existing on-premises licenses—announced on December 8, 2014.[14]

Compute options diversified further in 2015 and 2016 to address interruptible and accelerated workloads. Preemptible virtual machines (now known as Spot VMs) debuted in beta on May 18, 2015, offering up to 70% discounts compared to on-demand pricing for batch jobs tolerant of interruptions (current Spot VMs offer discounts of up to 91%), and achieved general availability in September 2015.[17] Initial GPU-accelerated instances were announced on November 16, 2016, powered by NVIDIA Tesla K80 cards, and became available worldwide in early 2017 to support machine learning, data analytics, and high-performance computing tasks.[18]

Infrastructure growth accelerated through the late 2010s, with regions and zones expanding to enhance global availability and reduce latency. By mid-2020, Google Cloud had grown to 24 regions across 73 zones in 17 countries, up from just a handful at launch, facilitating broader adoption for distributed applications.[19] Integration with AI and machine learning advanced notably in 2020, when Confidential Computing launched with Confidential VMs on Compute Engine; these use hardware-based trusted execution environments to encrypt data in use, protecting sensitive AI/ML models and processing without performance overhead.[20]

Recent updates from 2024 to 2025 emphasize performance for AI-driven and specialized workloads. In July 2024, Hyperdisk ML entered general availability as a high-throughput block storage option tailored for machine learning, delivering up to 1,200,000 MiB/s read throughput per volume to accelerate data loading for training pipelines across up to 2,500 attached VMs.[21] September 2025 brought general availability of Flex-start VMs, which support short-duration tasks up to seven days using a flexible provisioning model that consumes Spot quota for cost savings on bursty or experimental workloads.[22] The G4 accelerator-optimized machine series followed in October 2025, featuring NVIDIA RTX PRO 6000 Blackwell GPUs for graphics-intensive applications like virtual desktops and Omniverse simulations, available in multiple regions with low-latency networking.[23]

November 2025 marked further hardware innovations, with the N4D VM series achieving general availability on November 7, powered by fifth-generation AMD EPYC Turin processors and offering up to 96 vCPUs, 768 GB of DDR5 memory, and Titanium I/O for general-purpose tasks in regions like us-central1.[21] On November 6, the N4A series entered preview, utilizing Google's custom Axion processors based on Arm Neoverse N3 architecture, with configurations up to 64 vCPUs and 512 GB DDR5 for efficient, scalable AI inference and web serving in limited regions such as us-central1 and europe-west3.[21] These developments underscore ongoing efforts to balance cost, performance, and security in cloud computing.

Overview and Core Concepts
Virtual Machine Instances
A virtual machine (VM) instance in Google Compute Engine is a self-managed virtual server that runs on Google's infrastructure using a KVM-based hypervisor, allowing users to deploy and operate workloads on customizable compute resources.[6][1] These instances support both Linux and Windows operating systems and can be configured for a wide range of applications, from web servers to high-performance computing tasks.[6]

The lifecycle of a Compute Engine VM instance progresses through distinct states, including provisioning (where resources are allocated), running (when the instance is active and operational), stopping (where the instance is shut down but resources are preserved), and terminating (where the instance is deleted and resources are released).[24] Users can monitor and manage these states to ensure efficient resource utilization and application availability throughout the instance's duration.[24] Instances are created through the Google Cloud Console for a graphical interface, the gcloud CLI for command-line automation, or the Compute Engine API for programmatic integration, with key steps involving selection of a machine type, bootable image, and deployment zone.[25][26] This process enables rapid deployment tailored to specific workload requirements, such as compute capacity and geographic placement.[25]

For scalable deployments, Compute Engine supports instance groups, which manage collections of identical VMs; managed instance groups (MIGs) provide advanced features like automatic healing, rolling updates, and autoscaling based on metrics such as CPU utilization or custom load balancing.[27][28] MIGs ensure high availability by distributing instances across multiple zones and dynamically adjusting group size to match demand.[27] In September 2025, Google introduced Flex-start VMs in general availability, a feature for single-instance deployments with runtime limits up to seven days, optimized for bursty workloads like AI training or batch processing through a queuing system that improves resource access efficiency.[22][29]

Compute Engine also offers bare metal instances, which provide direct hardware access without virtualization overhead, catering to low-latency applications such as financial trading or real-time analytics that require maximal performance and minimal interference.[5][2] VM instances can attach to persistent storage options for durable data management, with details on these attachments covered in dedicated storage sections.[6]
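To make the creation workflow concrete, the following gcloud sketch provisions a VM, checks its lifecycle state, and then stops and deletes it; the instance name demo-vm, the zone, and the machine type are placeholder values rather than recommendations.

```bash
# Create a small Debian VM in a single zone (names and sizes are illustrative).
gcloud compute instances create demo-vm \
    --zone=us-central1-a \
    --machine-type=e2-medium \
    --image-family=debian-12 \
    --image-project=debian-cloud

# Inspect the instance's lifecycle state (PROVISIONING, RUNNING, STOPPING, TERMINATED, ...).
gcloud compute instances describe demo-vm \
    --zone=us-central1-a \
    --format="value(status)"

# Stop, then delete, the instance when finished.
gcloud compute instances stop demo-vm --zone=us-central1-a
gcloud compute instances delete demo-vm --zone=us-central1-a --quiet
```

Basic Resource Units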
In Google Compute Engine, the fundamental resources are measured primarily in terms of virtual CPUs (vCPUs) and gigabytes (GB) of memory, which form the core building blocks for virtual machine instances. Historically, Google introduced the Google Compute Engine Unit (GCEU) as an abstraction for CPU capacity, where 2.75 GCEUs represented the compute power equivalent to one logical CPU core on an n1-standard-1 instance; however, this metric has been largely superseded in modern usage by direct vCPU and memory allocations for simplicity and alignment with hardware capabilities.

A vCPU in Compute Engine represents a single hardware hyper-thread (or thread) on the underlying physical processors, which include Intel Xeon Scalable, AMD EPYC, and Arm-based (Tau) CPUs. By default, simultaneous multithreading (SMT, also known as hyper-threading) is enabled, allowing two vCPUs to share one physical core, thereby providing efficient resource utilization without dedicating full cores unless specified otherwise via configuration options.[30][31] vCPUs can be allocated from 1 up to 384 per instance, depending on the machine type and series, with the exact mapping to physical hardware determined by the selected CPU platform.[32]

Memory is allocated in increments of GB and is closely tied to vCPU counts, with predefined ratios varying by machine family to balance performance needs. For general-purpose standard machine types, the typical ratio is 4 GB of memory per vCPU, though ranges can extend from 3 to 7 GB per vCPU; specialized families like high-memory types offer up to 24 GB per vCPU, while high-CPU types provide as low as 0.9 GB per vCPU to prioritize processing power.[33] Custom allocations allow flexibility within these bounds, ensuring memory scales proportionally to computational demands.

Disk resources are provisioned as block storage in GB, with Persistent Disks serving as the primary unit for durable, scalable storage attached to instances; quotas limit total disk size per region, with default limits varying by project and often starting in the terabyte range for standard Persistent Disk, though these can encompass both SSD and HDD variants.[34] Network bandwidth is another key allocatable unit, measured in Gbps for ingress and egress; while ingress is unlimited, egress bandwidth is capped per instance based on machine type—ranging from 1 Gbps for small instances to 200 Gbps for high-performance series—with premium Tier_1 networking options enabling higher sustained throughput for data-intensive workloads.[35]

Compute Engine enforces quotas to manage resource availability, with default limits applied per project and region to prevent overuse; for example, the standard CPU quota (total vCPUs) often starts at 8-24 per region for new projects as of early 2025, alongside corresponding memory quotas, and boot disks have a minimum size of 10 GB.[36] These quotas are visible and adjustable via the Google Cloud console, where users can request increases through a form-based process, typically approved based on usage history and justification to accommodate scaling needs.[37]
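Current quota limits and usage can also be inspected from the command line before provisioning; a brief gcloud sketch, in which the region name is a placeholder and the output projection may need adjusting for a given CLI version:

```bash
# Per-region quotas (vCPUs, disk capacity, in-use addresses) with current usage.
gcloud compute regions describe us-central1 --format="yaml(quotas)"

# Project-wide quotas (images, snapshots, networks, ...) live on the project resource.
gcloud compute project-info describe --format="yaml(quotas)"
```

Infrastructure and Locations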
Regions and Zones
Google Compute Engine organizes its infrastructure into regions and zones to provide geographical distribution, fault tolerance, and compliance options for deployments. A region is an independent geographic area, such as us-central1 in Iowa, United States, that spans one or more physical locations and contains multiple zones.[38] Each region operates independently, allowing users to select locations based on specific needs while ensuring resources within a region can communicate with low latency. As of November 2025, Google Cloud operates 42 regions worldwide, with expansions including new facilities in Europe, such as Stockholm, Sweden (europe-north2), and in North America, such as Querétaro, Mexico (northamerica-south1).[39][40][41]
Zones represent isolated locations within a region, designed to enhance fault tolerance by isolating failures such as power outages or network issues to a single zone without affecting others in the same region. For example, the us-central1 region includes zones like us-central1-a, us-central1-b, and us-central1-c, each hosting a subset of the region's capacity. With 127 zones available as of November 2025, users can deploy instances across multiple zones within a region to achieve high availability, as resources in different zones are engineered to be failure-independent.[38][39]
Selecting regions and zones involves evaluating factors like latency, regulatory compliance, and service availability to optimize performance and meet legal requirements. For instance, to minimize latency for users in Europe, one might choose the europe-west1 region in Belgium, while data residency rules such as the EU's GDPR may necessitate deploying in European regions to keep personal data within the continent. Availability considerations include checking zone-specific quotas and maintenance schedules to ensure uninterrupted operations. Multi-regional resources, such as replicated storage buckets, enable global replication across multiple regions for enhanced durability and accessibility, though their use ties into broader resource scoping policies.[42][43]
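Available locations can be enumerated directly when weighing these factors; a short gcloud sketch, in which the filter value is only an example:

```bash
# List all regions, then the zones belonging to one region.
gcloud compute regions list
gcloud compute zones list --filter="region:us-central1"
```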
Resource Scopes and Placement Policies
In Google Compute Engine, resources are organized into scopes that determine their availability and accessibility across the infrastructure. Zonal resources, such as virtual machine instances, are confined to a single zone within a region and can only interact with other resources in that same zone. Regional resources, including managed instance groups (MIGs), span multiple zones within a single region, enabling broader distribution for improved fault tolerance. Global resources, like custom images and snapshots, are accessible across all regions and zones, facilitating reuse without location-specific constraints.

Placement policies in Compute Engine allow users to control the physical distribution of virtual machines to optimize for reliability, performance, or latency. The compact placement policy groups instances closely together on the same underlying hardware or within the same network topology, reducing inter-instance communication latency, which is particularly useful for tightly coupled workloads like high-performance computing applications. In contrast, the spread placement policy distributes instances across distinct hardware to minimize the risk of correlated failures from hardware or zonal outages, enhancing overall availability for mission-critical services. The default "any" policy imposes no specific constraints, allowing the system to place instances based on availability. These placement policies effectively implement affinity and anti-affinity principles for instance placement. Compact policies enforce affinity by co-locating instances to promote low-latency interactions, while spread policies apply anti-affinity by separating them to avoid single points of failure, thereby supporting strategies for high availability without requiring custom scripting.

At a higher level, Compute Engine resources are managed within a hierarchical structure that aligns with Google Cloud's overall organization. Projects serve as the primary containers for resources, where all Compute Engine instances, disks, and networks are created and billed. Folders provide optional intermediate grouping for projects, enabling structured organization by department or environment, while the organization node at the top represents the root for an entire enterprise, enforcing policies and access controls across the hierarchy. This structure ensures isolated, scalable management of resources while inheriting permissions downward.

A recent enhancement to regional MIGs, introduced in public preview as of November 2025, allows automatic repair of failed virtual machines in an alternate zone within the same region when the primary zone is unavailable. This feature requires enabling update-on-repair and helps maintain instance group health during zonal disruptions, further bolstering availability without manual intervention.[21]
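As an illustration of the compact policy described above, the following hedged gcloud sketch creates a placement policy and attaches it to a new VM at creation time; the policy name, instance name, zone, and machine type are placeholders, and some machine series may require additional scheduling settings.

```bash
# Create a compact (affinity) placement policy in a region.
gcloud compute resource-policies create group-placement low-latency-group \
    --collocation=collocated \
    --region=us-central1

# Create a VM that is subject to the policy.
gcloud compute instances create hpc-node-1 \
    --zone=us-central1-a \
    --machine-type=c2-standard-16 \
    --resource-policies=low-latency-group
```

Compute Resources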
Machine Types
Google Compute Engine offers a variety of predefined machine type families tailored to different workload requirements, balancing vCPU, memory, and other resources for optimal performance and cost-efficiency. These families include general-purpose, compute-optimized, memory-optimized, accelerator-optimized, and storage-optimized types, each with specific series designed for common use cases such as web serving, high-performance computing, in-memory databases, machine learning inference, and high-I/O data processing. Machine types determine the vCPU-to-memory ratios, networking bandwidth, and other capabilities, allowing users to select configurations that align with their application's needs without custom modifications.

The general-purpose machine family, suitable for versatile workloads like web servers, containerized applications, and development environments, encompasses the N1, N2, and N4 series. The N1 series, an earlier generation, supports up to 96 vCPUs with memory ratios of up to 6.5 GB per vCPU and networking bandwidth up to 32 Gbps, providing balanced performance for standard tasks. The N2 series, powered by Intel Cascade Lake processors (with Ice Lake for instances over 80 vCPUs), scales to 128 vCPUs at up to 8 GB of memory per vCPU and up to 32 Gbps networking, offering improved price-performance for medium-scale applications. The N4 series extends this with up to 80 vCPUs at up to 8 GB per vCPU and 50 Gbps networking, while the N4D variant, based on AMD EPYC Turin processors, reaches 96 vCPUs with the same memory ratio and became generally available in November 2025 for enhanced flexibility in general workloads.[44][45]

Compute-optimized machine types, such as the C2 and C3 series, prioritize high-frequency CPUs for demanding tasks including high-performance computing (HPC), batch processing, and game servers. The C2 series delivers up to 60 vCPUs with 4 GB of memory per vCPU and sustained all-core turbo frequencies up to 3.8 GHz, paired with up to 32 Gbps networking for compute-intensive operations. The C3 series advances this capability to 176 vCPUs at up to 8 GB per vCPU, supporting even larger-scale HPC and AI training workloads with networking bandwidth up to 100 Gbps.[46]

Memory-optimized types like the M1 and M2 series are engineered for applications requiring substantial RAM, such as in-memory databases, caching layers, and SAP HANA deployments. The M1 series accommodates up to 160 vCPUs with up to 24 GB of memory per vCPU (totaling over 3.8 TB), and networking up to 32 Gbps to handle data-heavy queries efficiently. The M2 series focuses on ultra-high memory configurations, supporting 208–416 vCPUs with as much as 12 TB of total memory (approximately 28 GB per vCPU in larger instances), ideal for analytics and real-time processing with the same networking bandwidth.[47]

Accelerator-optimized machine types, including the A2, A3, and G2 series, integrate GPUs for graphics rendering, machine learning inference, and generative AI tasks. The A2 series pairs up to 96 vCPUs with 16 NVIDIA A100 GPUs and up to 100 Gbps networking, optimized for large-scale ML training. The A3 series scales to 224 vCPUs with 8 NVIDIA H100 GPUs and exceptional 3,200 Gbps networking, targeting advanced AI workloads. The G2 series, featuring NVIDIA L4 GPUs, supports up to 96 vCPUs with 8 GPUs per instance and 100 Gbps networking, particularly suited for graphics-intensive applications like remote visualization and video processing.
Storage-optimized machine types, represented by the Z3 series, cater to high-I/O workloads such as SQL/NoSQL databases, data analytics, and vector databases requiring rapid local storage access. These instances provide up to 176 vCPUs with 36 TiB of local SSD storage and networking bandwidth up to 100 Gbps, enabling low-latency data throughput for scale-out storage systems.

| Machine Family | Key Series | vCPU Range | Memory Ratio (GB/vCPU) | Max Networking Bandwidth | Primary Use Cases |
|---|---|---|---|---|---|
| General-purpose | N1, N2, N4/N4D | Up to 128 | 6.5–8 | 32–50 Gbps | Web servers, microservices |
| Compute-optimized | C2, C3 | Up to 176 | 4–8 | Up to 100 Gbps | HPC, AI/ML batch jobs |
| Memory-optimized | M1, M2 | Up to 416 | 14–28 | Up to 32 Gbps | In-memory databases |
| Accelerator-optimized | A2, A3, G2 | Up to 224 | Varies (8–16 base) | 100–3,200 Gbps | ML training, graphics |
| Storage-optimized | Z3 | Up to 176 | Varies | Up to 100 Gbps | High-I/O databases, analytics |
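
For hands-on comparison of these families, the predefined machine types available in a zone can be listed and filtered; a sketch in which the zone, filter, and output fields are illustrative:

```bash
# List N2 machine types in one zone with their vCPU and memory dimensions.
gcloud compute machine-types list \
    --zones=us-central1-a \
    --filter="name~'^n2-'" \
    --format="table(name, guestCpus, memoryMb)"
```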
Custom Configurations
Google Compute Engine allows users to create custom machine types that enable precise specification of virtual CPUs (vCPUs) and memory to match specific workload requirements, offering greater flexibility than predefined machine types.[48] For example, a user can configure an instance with exactly 10 vCPUs and 60 GB of memory using the format custom-10-61440 (where memory is specified in MB), which is particularly useful for applications needing non-standard resource ratios, such as memory-intensive databases or compute-light services.[48] Memory allocations must be in multiples of 256 MB, and the total configuration must align with the supported machine series, such as N2 or E2.[44]
Constraints on custom machine types ensure compatibility with underlying hardware. In standard configurations, memory per vCPU ranges from 0.9 GB to 6.5 GB, though this varies by series—for instance, N1 series supports 0.922 GB to 6.656 GB per vCPU.[48] Extended memory options, available for series like N4, N4A, N2, and N1, remove the per-vCPU upper limit, allowing up to 8 GB or more per vCPU (e.g., up to 624 GB total for N1), billed at a premium rate to support workloads like large-scale analytics.[44] vCPUs can generally be specified in multiples of 1 starting from 1, except for certain series like E2, which require multiples of 2 up to 32 vCPUs.[44]
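A minimal gcloud sketch of creating the custom shape described above; the instance name, zone, and image are placeholders, and the machine series is selected explicitly rather than relying on the default:

```bash
# 10 vCPUs and 60 GB (61440 MB) of memory; with the N2 series the resulting
# machine type name is n2-custom-10-61440.
gcloud compute instances create custom-shape-vm \
    --zone=us-central1-a \
    --custom-vm-type=n2 \
    --custom-cpu=10 \
    --custom-memory=60GB \
    --image-family=debian-12 \
    --image-project=debian-cloud
```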
Sole-tenant nodes provide dedicated physical hardware isolation for custom machine types, ensuring that VMs run exclusively on servers reserved for a single project to meet compliance or security needs.[49] These nodes support custom configurations in compatible series like N2, where VMs must match the node's machine series but can vary in size within the node's total capacity (e.g., up to 80 vCPUs and 640 GB of memory on an n2-node-80-640 node type).[49]
Custom machine types integrate with accelerators for enhanced performance in AI and high-performance computing. Users can attach NVIDIA GPUs (e.g., A100 or T4) or Google TPUs to custom VMs in supported series like N1 or A2, enabling tailored setups such as a custom-32-225280 instance with four T4 GPUs for machine learning training.
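A hedged sketch of attaching accelerators to a custom-shaped VM, loosely mirroring the example above but kept within the standard per-vCPU memory limit; the instance name, zone, and image are placeholders, and GPU-equipped VMs must opt out of live migration:

```bash
# Custom N1 shape (default series for the bare custom- prefix) with four T4 GPUs.
# 208 GB stays within the standard 6.5 GB-per-vCPU limit; larger values would
# require extended memory (--custom-extensions).
gcloud compute instances create ml-training-vm \
    --zone=us-central1-a \
    --custom-cpu=32 \
    --custom-memory=208GB \
    --accelerator=type=nvidia-tesla-t4,count=4 \
    --maintenance-policy=TERMINATE \
    --image-family=debian-12 \
    --image-project=debian-cloud
```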
As of November 2025, the Arm-based N4A series, powered by Google's Axion processor on the Arm Neoverse N3 platform, supports custom machine types in preview, offering up to 64 vCPUs and 512 GB of DDR5 memory with extended memory options for cost-effective general-purpose workloads.[50]
Storage Options
Persistent Disks
Persistent Disks are block storage devices in Google Compute Engine that provide durable, high-availability storage independent of virtual machine (VM) instances, allowing data to persist even if the instance is stopped or terminated.[51] They function like physical hard drives but are managed by Google Cloud, offering features such as live attachment and detachment to running VMs without downtime.[51]

Google Compute Engine offers several types of Persistent Disks to suit different workloads, balancing cost, performance, and latency requirements. Standard Persistent Disks (pd-standard) use hard disk drives (HDDs) and are optimized for large-scale sequential read/write operations, such as media serving or data analytics, with performance scaling at 0.75 read IOPS and 1.5 write IOPS per GiB of provisioned space, up to a maximum of 7,500 read and 15,000 write IOPS per instance on larger machines.[52] Balanced Persistent Disks (pd-balanced) employ solid-state drives (SSDs) for a cost-effective mix of performance and price, delivering up to 6 IOPS (read and write) per GiB, with a baseline of 3,000 IOPS and maximums reaching 80,000 IOPS per instance, suitable for general-purpose applications like web servers.[52] SSD Persistent Disks (pd-ssd) provide high-performance storage for demanding workloads such as databases, offering up to 30 IOPS (read and write) per GiB and peaking at 100,000 IOPS per instance, with throughput limits of 1,200 MiBps for reads and writes.[52] For workloads requiring predictable latency, Extreme Persistent Disks (pd-extreme) allow provisioning of up to 120,000 IOPS and 4,000 MiBps throughput for reads, ensuring consistent performance without scaling solely on disk size.[53] Across all types, performance scales with disk size and the number of vCPUs in the attached VM instance, but is capped by per-instance limits to prevent overload.[52]

Persistent Disks can be sized from a minimum of 10 GB to a maximum of 64 TB per volume, with sizes adjustable in 1 GB increments; for greater capacity, multiple disks can be combined using software RAID configurations within the VM.[54] Up to 128 Persistent Disk volumes (including the boot disk) can be attached to a single VM instance, supporting a total attached capacity of up to 257 TiB, which enables scalable storage setups for complex applications.[51]

All Persistent Disks are encrypted at rest by default using Google-managed encryption keys, ensuring data security without additional configuration; alternatively, users can opt for customer-supplied encryption keys (CSEK) to manage their own 256-bit AES keys, providing greater control over encryption for compliance needs, though Google does not store these keys and data becomes inaccessible if they are lost.[55] Data in transit between the disk and VM is also encrypted. For backup, Persistent Disks support incremental snapshots that can be created and managed separately.[51]
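The attach-and-resize workflow can be sketched with gcloud; the disk and instance names and sizes below are placeholders:

```bash
# Create a 500 GB SSD Persistent Disk and attach it to an existing VM.
gcloud compute disks create app-data-disk \
    --zone=us-central1-a \
    --type=pd-ssd \
    --size=500GB

gcloud compute instances attach-disk demo-vm \
    --disk=app-data-disk \
    --zone=us-central1-a

# Grow the disk later without detaching it (disks can only increase in size).
gcloud compute disks resize app-data-disk --size=1TB --zone=us-central1-a
```

Local SSD and Hyperdisk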
Google Compute Engine offers Local SSD as an ephemeral storage option that provides high-performance, low-latency block storage physically attached to the host machine running the virtual machine instance.[56] This storage uses NVMe or SCSI interfaces and is designed for temporary workloads where data persistence is not required, as all data on Local SSD disks is lost when the instance stops, is preempted, or the host encounters an error.[56] Unlike persistent disks, which maintain data independently of the instance lifecycle, Local SSD emphasizes speed over durability and cannot be detached from the instance or used for snapshots.[56]

Local SSD disks come in standard and Titanium variants, with each disk offering 375 GiB of capacity, though Titanium SSD supports up to 6 TiB per disk on certain bare metal configurations.[56] Instances can attach multiple disks, enabling up to 72 TiB of total Local SSD capacity, depending on the machine type and series (e.g., the Z3 series allows 12 disks of 6 TiB each).[56] Performance scales with the number of disks and interface; for example, Titanium SSD on NVMe can deliver up to 9,000,000 read IOPS, 6,000,000 write IOPS, 36,000 MiB/s read throughput, and 30,000 MiB/s write throughput.[56] Common use cases include caching, scratch space for high-I/O applications like databases (e.g., temporary tables in SQL Server), and transient data processing in high-performance computing environments.[56] Limitations include incompatibility with shared-core machine types, the inability to add disks after instance creation, and no support for custom encryption keys or data preservation beyond preview features for live migrations.[56]

Hyperdisk provides a family of durable, high-performance block storage volumes that can be customized for IOPS and throughput independently of capacity, making it suitable for demanding workloads while maintaining data persistence across instance restarts.[57] Available in several types—Balanced, Balanced High Availability, Extreme, Throughput, and ML—Hyperdisk volumes attach directly to instances like physical disks and support features such as regional replication for high availability and, for the ML variant, sharing across multiple read-only instances with limits varying by volume size (up to 2,500 for volumes ≤256 GiB and lower for larger volumes).[57] The Balanced type offers a general-purpose balance with up to 160,000 IOPS and 2,400 MiB/s throughput; Extreme prioritizes IOPS at up to 350,000 with 5,000 MiB/s throughput; and Throughput focuses on sequential access with up to 2,400 MiB/s at lower IOPS.[57]

Hyperdisk ML, optimized for AI and machine learning workloads, delivers the highest performance in the family, with up to 1,200,000 MiB/s throughput and 19,200,000 IOPS, enabling faster model loading and reduced idle time for accelerators in inference and training scenarios.[58] This type supports volumes from 4 GiB to 64 TiB and is particularly useful for immutable datasets like model weights, where multiple instances can access the same volume in read-only mode for large-scale HPC or analytics tasks such as those in Hadoop or Spark.[58] It became generally available in 2024, enhancing support for AI-driven applications.[59]

Limitations for Hyperdisk include restrictions on using the Extreme, ML, or Throughput types as boot disks, zonal-only availability for ML volumes, and the need to adjust performance settings in increments (e.g., throughput changes at most every 6 hours), with the read-only attachment limits by volume size noted above.[57][58]

| Hyperdisk Type | Max IOPS | Max Throughput (MiB/s) | Key Focus |
|---|---|---|---|
| Balanced | 160,000 | 2,400 | General-purpose workloads |
| Extreme | 350,000 | 5,000 | High random I/O |
| Throughput | 9,600 | 2,400 | Sequential access |
| ML | 19,200,000 | 1,200,000 | AI/ML data loading |
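
The provisioning model summarized in the table can be made concrete with gcloud; in this sketch the disk and instance names, sizes, and performance values are illustrative rather than recommended settings:

```bash
# Provision a Hyperdisk Balanced volume with explicit IOPS and throughput,
# independent of its capacity.
gcloud compute disks create analytics-disk \
    --zone=us-central1-a \
    --type=hyperdisk-balanced \
    --size=1TB \
    --provisioned-iops=10000 \
    --provisioned-throughput=400

# Local SSD must be attached at instance creation time; it cannot be added later.
gcloud compute instances create scratch-vm \
    --zone=us-central1-a \
    --machine-type=n2-standard-8 \
    --local-ssd=interface=NVME \
    --image-family=debian-12 \
    --image-project=debian-cloud
```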
Networking and Connectivity
Virtual Private Cloud
Google Compute Engine (GCE) utilizes Virtual Private Cloud (VPC) networks as the foundational networking layer, providing isolated, scalable virtual environments for resources like virtual machine (VM) instances. A VPC network acts as a global, virtual version of a physical network, spanning multiple regions without the need for physical cabling, and enables users to define subnets within specific regions for logical segmentation of resources. These networks support auto-mode or custom-mode configurations, where auto-mode automatically creates subnets in every region, while custom-mode allows manual definition of IP ranges and subnet placements to suit workload requirements.

IP addressing in GCE VPCs includes internal IPv4 and IPv6 addresses assigned to instances, with support for both dual-stack and IPv6-only configurations to accommodate modern networking needs. External IPv4 addresses can be optionally attached to instances for public internet access, while alias IP ranges enable secondary IP assignments to VMs or load balancers without additional network interfaces. Subnets are associated with primary IP ranges (CIDR blocks) and can include secondary ranges for alias IPs, ensuring efficient address management across regional deployments. IPv6 support, introduced in 2022 and expanded thereafter, allows global anycast addresses for improved scalability in IPv6-enabled workloads.[60]

Firewall rules in VPC networks control ingress and egress traffic using distributed, stateful firewalls that apply to all instances within the network. Rules are defined by priority (lower numbers take precedence), direction, and action (allow or deny), with matching based on IP protocols, ports, source/destination IP ranges, and instance tags or service accounts for granular security. For example, a common rule might allow HTTP traffic (port 80) from any source to instances tagged "web-server," while denying all other ingress to minimize exposure. These rules are enforced at the instance level but defined at the VPC level, with a default quota of 1000 rules per project, and hierarchical firewall policies available for enterprise-scale management.[61]

Routes in VPC networks direct traffic flow, including default routes for internet-bound traffic and custom static routes for on-premises or peered network connectivity. The system-generated default route (0.0.0.0/0) handles outbound traffic to the internet via Google's edge routers, while custom routes can specify next-hop types such as VM instances, VPN tunnels, or interconnects with metrics to prioritize paths. Route propagation is automatic for connected networks, ensuring dynamic updates without manual intervention in most cases.

VPC Network Peering enables secure, low-latency connectivity between multiple VPC networks, either within the same project, across projects, or between different organizations, without requiring gateways or VPNs. Peering connections exchange routes automatically (unless disabled), allowing instances in peered networks to communicate using internal IP addresses as if they were in the same network, which is particularly useful for multi-project architectures or hybrid cloud setups. Limitations include no transitive peering (direct connections only) and non-overlapping IP ranges to prevent conflicts. Shared VPCs extend this capability by allowing centralized network administration across projects, where a host project owns the VPC and subnets are shared with attached projects for resource deployment.
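A brief gcloud sketch of the custom-mode setup and the HTTP firewall rule described above; the network, subnet, and range values are placeholders:

```bash
# Custom-mode VPC with one regional subnet.
gcloud compute networks create demo-vpc --subnet-mode=custom

gcloud compute networks subnets create demo-subnet \
    --network=demo-vpc \
    --region=us-central1 \
    --range=10.10.0.0/24

# Allow inbound HTTP from anywhere, but only to instances tagged "web-server".
gcloud compute firewall-rules create allow-http \
    --network=demo-vpc \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80 \
    --source-ranges=0.0.0.0/0 \
    --target-tags=web-server
```

Load Balancing and IP Management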
Google Cloud Load Balancing provides scalable traffic distribution for Compute Engine instances, supporting various types tailored to different protocols and scopes. The platform offers the external Application Load Balancer for HTTP(S) traffic, which operates globally using proxy-based distribution to handle content-based routing and SSL offloading. The external passthrough Network Load Balancer supports TCP/SSL/UDP protocols with non-proxied forwarding for low-latency applications, while internal Application and passthrough Network Load Balancers manage intra-VPC traffic for private services. These load balancers integrate with managed instance groups (MIGs) as backend services, enabling automatic distribution of traffic across multiple VM instances for high availability and scalability.[62]

IP address management in Compute Engine distinguishes between ephemeral and static external IPs to support reliable external connectivity. Ephemeral external IPs are automatically assigned from Google's pool upon VM creation and released when the instance stops or terminates, making them suitable for temporary workloads but unsuitable for services requiring persistent addressing. Static external IPs can be reserved in advance or promoted from an existing ephemeral IP, ensuring consistent public access for DNS records or external integrations, with options for regional or global scopes. Reservations allow pre-allocation without attachment to a specific instance, facilitating flexible assignment across projects or regions.[63][64]

Global load balancing leverages anycast IP addressing to route traffic to the nearest healthy backend across worldwide regions, minimizing latency for multi-region deployments. A single anycast IP serves as the frontend, with Google's edge network directing packets based on proximity and backend health, supporting both premium and standard network tiers for optimized performance. This approach enables seamless failover and content delivery integration, such as with Cloud CDN, for applications spanning multiple zones or regions.[65][66]

Autoscaling integrates with load balancing through backend services and health checks to dynamically adjust instance counts based on traffic demands. Health checks probe instance groups at configurable intervals to verify responsiveness, removing unhealthy backends from load distribution and triggering autoscaling policies. Autoscalers can base decisions on load balancing capacity metrics, such as serving capacity or HTTP request rates, ensuring resources scale in tandem with incoming traffic while integrating with MIGs for rolling updates.[67][68]

Recent enhancements to global load balancing include traffic isolation policies, introduced in May 2025, which route requests preferentially to the closest region for multi-region applications, reducing latency in preview mode. Additionally, failover capabilities for global external Application Load Balancers, reaching general availability in November 2024, provide regional backup backends for improved resilience in distributed setups. These updates build on prior optimizations like service load balancing policies from July 2024, enhancing multi-region traffic management.[69]
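The static IP workflow described above can be sketched with gcloud; the address and instance names are placeholders, and a global address would be reserved with --global instead of --region:

```bash
# Reserve a regional static external IP, then attach it to a new VM at creation time.
gcloud compute addresses create web-frontend-ip --region=us-central1

gcloud compute instances create web-vm \
    --zone=us-central1-a \
    --machine-type=e2-small \
    --address=web-frontend-ip \
    --image-family=debian-12 \
    --image-project=debian-cloud
```

Images and Snapshots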
Operating System Images
Google Compute Engine provides a variety of preconfigured public operating system (OS) images that users can select to boot virtual machine (VM) instances, ensuring compatibility with Google's infrastructure. These images include popular Linux distributions and Windows Server editions, all optimized for cloud workloads with built-in support for features like automatic security updates and IPv6 networking. Public images are maintained by Google or partners and are available at no additional licensing cost for most Linux variants, while Windows images incur on-demand licensing fees.[70][71]

Among the supported Linux public images are Debian (versions 13, 12, and 11), Ubuntu LTS (such as 24.04 and 22.04), and CentOS Stream (10 and 9), each with default disk sizes ranging from 10 GB to 20 GB and configurations that disable root password login for enhanced security. For Windows, public images encompass Server 2022, 2019, 2016, and the 2025 edition, which achieved general availability in late 2024 and supports extended security updates until November 2034; these images enable automatic updates and integration with Google Cloud tools like the guest environment for metadata access. Users can list and select these images via the Google Cloud console or gcloud CLI commands, with regular patches applied for critical vulnerabilities.[71][70][72][73]

In addition to public images, users can create custom images to tailor environments with specific software or configurations. Custom images are generated from existing boot disks, snapshots, or imported virtual disks stored in Cloud Storage, allowing for the pre-installation of applications before launching instances. This approach supports scenarios like migrating on-premises workloads or standardizing VM setups across deployments.

To manage version updates efficiently, Google Compute Engine uses image families, which are logical groupings of related images within projects like debian-cloud or ubuntu-os-cloud. For instance, the debian-11 family always references the latest non-deprecated Debian 11 image, enabling rolling updates without manual intervention; if issues arise, administrators can deprecate the current image to revert to a prior stable version. This mechanism ensures access to the most recent stable releases while avoiding end-of-life versions.[70]
Deprecated images, such as those for Windows Server 2012 and Ubuntu 20.04, enter an end-of-support phase where Google ceases updates and eventually deletes them from public availability, prompting users to migrate to supported alternatives. During deprecation, image families automatically exclude these versions, and existing VMs can continue running but without security patches or compatibility guarantees; extended paid support may be available for select OSes like Windows via Microsoft's programs.[74][71][75]
Images in Google Compute Engine operate on a global scope, allowing them to be shared seamlessly across projects and regions without duplication. Public images are inherently accessible project-wide, while custom images can be exported to Cloud Storage or granted permissions via IAM policies for use in other projects, facilitating consistent deployments in multi-project environments.[70]
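A sketch of the custom image and image family workflow with gcloud; the disk, image, family, and project names are placeholders:

```bash
# Build a custom image from an existing boot disk and place it in an image family.
gcloud compute images create webapp-image-v2 \
    --source-disk=webapp-builder-disk \
    --source-disk-zone=us-central1-a \
    --family=webapp-images

# Launch a VM from whatever image is currently the latest in that family,
# including from another project granted access to the image.
gcloud compute instances create webapp-vm \
    --zone=us-central1-a \
    --image-family=webapp-images \
    --image-project=my-project
```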
Snapshots and Data Backup
Google Compute Engine provides disk snapshots as a mechanism for backing up data from persistent disks and Hyperdisks. These snapshots capture the contents of a disk at a specific point in time and serve as incremental backups, storing only the data that has changed since the previous snapshot to optimize storage efficiency. All disk snapshots are encrypted at rest using Google-managed keys, and they are stored in Google Cloud Storage, with options for multi-regional or regional locations to ensure durability and availability.[76][77]

Disk snapshots can be created manually through the Google Cloud console, gcloud CLI, or REST APIs, allowing users to initiate backups on demand. Automated creation is supported via snapshot schedules, which enable periodic backups at user-defined intervals, such as daily or weekly, configurable through the APIs. Retention policies can be applied to manage snapshot lifecycle, with standard snapshots suitable for short- to medium-term retention and archive snapshots designed for long-term storage at lower costs; snapshots persist independently of the source disk and can be retained indefinitely until manually deleted.[77][78]

To restore data from a disk snapshot, users create a new persistent disk or Hyperdisk from the snapshot, which must be at least as large as the original source disk. This new disk can then be attached to a running or new virtual machine instance using the console, gcloud commands, or APIs, after which the file system is mounted to access the restored data. The restoration process supports both zonal and regional disks, enabling quick recovery without downtime for the original instance.[79]

In addition to disk-level backups, Google Compute Engine offers machine images, which provide full backups of an entire virtual machine instance. A machine image captures the instance's configuration, metadata, permissions, operating system, and data from all attached disks in a crash-consistent manner, using differential snapshots for subsequent images to store only changes from prior versions. These are particularly useful for cloning instances, troubleshooting, or replicating environments across projects.[80]

Disk snapshots and machine images have global scope, allowing them to be created and restored in any region or zone within the project. Standard and archive snapshots are automatically replicated across multiple regions for high durability (up to 99.999999999% over a year), facilitating disaster recovery by enabling quick restoration in a secondary region during outages.[76][80]
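A gcloud sketch of the on-demand, scheduled, and restore paths described above; all names, times, and retention values are placeholders:

```bash
# Take an on-demand snapshot of a zonal disk.
gcloud compute snapshots create app-data-snap-001 \
    --source-disk=app-data-disk \
    --source-disk-zone=us-central1-a

# Define a daily snapshot schedule and attach it to the disk.
gcloud compute resource-policies create snapshot-schedule daily-backups \
    --region=us-central1 \
    --daily-schedule \
    --start-time=04:00 \
    --max-retention-days=14

gcloud compute disks add-resource-policies app-data-disk \
    --resource-policies=daily-backups \
    --zone=us-central1-a

# Restore by creating a new disk from the snapshot, then attach it to a VM.
gcloud compute disks create app-data-restored \
    --source-snapshot=app-data-snap-001 \
    --zone=us-central1-a
```

Features and Capabilities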
Performance Optimizations
Google Compute Engine offers various CPU platforms to optimize virtual machine (VM) performance based on workload requirements, supporting Intel Xeon processors such as Granite Rapids and Sapphire Rapids for general-purpose and compute-intensive tasks, AMD EPYC processors including Turin and Genoa for cost-effective scale-out applications, and Arm-based processors like Google Axion and NVIDIA Grace for energy-efficient AI and cloud-native workloads.[31] These platforms enable users to select machine series tailored to specific architectures, with Intel providing advanced vector extensions like AVX512 for high-performance computing, AMD offering strong multi-threaded performance for databases, and Arm delivering up to 50% better price-performance in certain inference scenarios.[32]

Transparent maintenance in Compute Engine is achieved through automatic live migration, which seamlessly transfers running VMs to healthy hosts during infrastructure events like hardware repairs or software updates without downtime, reboot, or changes to instance configurations such as IP addresses or attached storage.[81] This process involves a brief blackout period of under one second, during which the VM's memory state is copied to the target host, ensuring high availability for most workloads while excluding specialized setups like those with attached GPUs or large local SSDs.[81]

Disk performance can be enhanced by tuning IOPS and throughput provisions, particularly with Hyperdisk volumes that allow dynamic adjustments every four to six hours without detaching the disk, enabling workloads to scale from baseline levels up to 350,000 IOPS for Hyperdisk Extreme or 1,200,000 MiB/s throughput for Hyperdisk ML.[82] For read-heavy applications, Hyperdisk supports asynchronous replication to create read replicas in a secondary region, providing low-latency access to duplicated data while maintaining primary write performance.[57] Optimization techniques include aligning application I/O patterns with provisioned limits and using tools like fio for benchmarking to avoid shared resource contention across multiple disks.[83]
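Two of these controls can be set directly from the CLI; a sketch with placeholder names, where the provisioned-performance flags apply to Hyperdisk volumes and should be checked against the current gcloud reference:

```bash
# Keep a VM on live migration during host maintenance events.
gcloud compute instances set-scheduling demo-vm \
    --zone=us-central1-a \
    --maintenance-policy=MIGRATE

# Retune a Hyperdisk volume's provisioned performance in place.
gcloud compute disks update analytics-disk \
    --zone=us-central1-a \
    --provisioned-iops=20000 \
    --provisioned-throughput=600
```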
Acceleration for machine learning workloads is facilitated by attaching GPUs or TPUs to VMs, with GPU-enabled machine types like the A4 series integrating up to eight NVIDIA B200 GPUs for training large models and the G2 series supporting up to eight L4 GPUs for efficient inference.[84] TPUs, optimized for tensor operations, can be deployed as TPU VMs directly connected to Compute Engine instances for hybrid setups, accelerating frameworks like TensorFlow with up to 10x peak performance gains over previous generations in training and serving tasks.[85] These attachments require compatible machine types and zones, ensuring seamless integration for data processing and graphics-intensive applications.[86]
As of November 2025, the N4D machine series, powered by AMD EPYC Turin processors and featuring up to 768 GB of DDR5 memory, delivers up to 3.5x better price-performance for web-serving workloads, 50% higher performance for general computing, and 70% higher performance for Java workloads compared to the prior N2D series.[45] The series supports custom machine types with up to 96 vCPUs and a 4.1 GHz max-boost frequency, providing substantial gains in memory-bound scenarios without changes to existing maintenance and migration behavior.[21]
Management and Automation Tools
Google Compute Engine provides several tools for orchestrating, monitoring, and automating the lifecycle of virtual machine instances, enabling efficient management at scale. Cloud Deployment Manager is an infrastructure-as-code service that automates the creation, updating, and deletion of Compute Engine resources through declarative configuration files written in YAML or Python templates. It supports deploying instance groups, networks, and disks by leveraging underlying Compute Engine APIs, allowing users to version and reuse infrastructure definitions for consistent environments.[87] Note that Cloud Deployment Manager will reach end of support on March 31, 2026, with recommendations to migrate to alternatives like Infrastructure Manager.[88]

For broader orchestration, Google Compute Engine integrates seamlessly with Terraform, an open-source infrastructure-as-code tool from HashiCorp. Users can provision and manage Compute Engine instances, such as virtual machines and autoscalers, using Terraform's declarative language and the official Google provider, which translates configurations into API calls for creating resources like google_compute_instance.[89] This integration supports complex setups, including state management and dependency handling, to ensure reproducible deployments across projects.[90]
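For completeness, a minimal Deployment Manager sketch of the declarative workflow mentioned above, bearing in mind the service's announced end of support; the deployment and configuration file names are placeholders:

```bash
# Deploy resources described in a YAML configuration.
gcloud deployment-manager deployments create demo-deployment \
    --config=vm-config.yaml

# Stage an update as a preview; running "deployments update demo-deployment"
# again without --preview would apply the staged changes.
gcloud deployment-manager deployments update demo-deployment \
    --config=vm-config.yaml \
    --preview
```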
Monitoring and logging capabilities are essential for maintaining Compute Engine operations, with Cloud Monitoring collecting metrics such as CPU utilization, disk I/O, and network traffic from virtual machine instances via the Ops Agent.[91] Users can create dashboards, set alerting policies based on thresholds, and visualize performance trends to proactively address issues.[92] Complementing this, Cloud Logging aggregates and analyzes logs from Compute Engine VMs, including system events and application outputs, enabling real-time search, filtering, and export for compliance and troubleshooting.[93] These tools integrate with other Google Cloud services to provide unified observability across hybrid environments.[94]
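The Ops Agent is typically installed on each VM so that Cloud Monitoring and Cloud Logging receive detailed metrics and logs; the commands below follow Google's documented installer script path at the time of writing and should be verified against current documentation:

```bash
# Download and run the Ops Agent repository/installer script on a Linux VM.
curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
sudo bash add-google-cloud-ops-agent-repo.sh --also-install
```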
Autoscaling in Compute Engine is handled through managed instance groups (MIGs), which automatically adjust the number of virtual machine instances based on predefined policies tied to metrics like CPU usage, memory consumption, or custom signals from Cloud Monitoring.[95] For example, a policy can target 60% average CPU utilization, adding instances when sustained utilization rises above the target and removing them when it falls below, ensuring resource elasticity without manual intervention.[96] This feature supports both zonal and regional MIGs, with options for predictive scaling based on historical load patterns to minimize latency during traffic spikes.[97]
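A gcloud sketch of this setup; the group, template, and limit values are placeholders chosen to mirror the 60% target described above:

```bash
# Create a zonal MIG from an existing instance template.
gcloud compute instance-groups managed create web-mig \
    --zone=us-central1-a \
    --template=web-template \
    --size=2

# Enable CPU-based autoscaling toward a 60% utilization target.
gcloud compute instance-groups managed set-autoscaling web-mig \
    --zone=us-central1-a \
    --min-num-replicas=2 \
    --max-num-replicas=10 \
    --target-cpu-utilization=0.6 \
    --cool-down-period=90
```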
Operating system management is streamlined via OS Config, a service within VM Manager that automates patching, compliance reporting, and configuration enforcement on Compute Engine instances.[98] It applies OS updates using native mechanisms for supported images like Debian, RHEL, and Windows, with scheduling options to patch during maintenance windows and assess compliance against baselines such as CIS benchmarks.[99] The OS Config agent, installed on VMs, reports patch status and vulnerabilities, enabling fleet-wide remediation without downtime risks.[100]
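On-demand patching can also be triggered from the CLI; the os-config command group below is part of VM Manager, but the exact flag spellings are an assumption to verify against the current gcloud reference:

```bash
# Start a patch job across all VMs in the project, then check its status.
# (Flag names are assumptions; consult the gcloud os-config documentation.)
gcloud compute os-config patch-jobs execute \
    --instance-filter-all \
    --display-name="monthly-os-patch"

gcloud compute os-config patch-jobs list
```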
Automation is further enhanced through the gcloud CLI, part of the Google Cloud SDK, which provides scripting-friendly commands for managing Compute Engine resources programmatically. Commands like gcloud compute instances create and gcloud compute instance-groups managed list-instances support batch operations, filtering, and output formatting in JSON or YAML for integration into CI/CD pipelines or shell scripts.[101] As of November 10, 2025, observability fields for reservations became generally available, allowing users to query via API or CLI which reservations a VM consumes and list VMs attached to specific reservations, improving visibility into committed resource utilization.[21]
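A scripting-oriented sketch combining filtering and machine-readable output, as used in pipelines; the filter expression and projected fields are examples only:

```bash
# Emit running instances in one zone as JSON for downstream tooling.
gcloud compute instances list \
    --filter="status=RUNNING AND zone:us-central1-a" \
    --format="json(name, machineType, status)"
```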