Amazon Neptune
Amazon Neptune is a fully managed graph database service provided by Amazon Web Services (AWS) that enables the storage, querying, and analysis of highly connected datasets, supporting both the property graph and Resource Description Framework (RDF) models to handle billions of relationships with millisecond latency.[1] It is designed for applications requiring complex traversals and pattern matching, such as recommendation engines, fraud detection, knowledge graphs, network security analysis, and drug discovery.[1] Neptune adheres to open standards for graph technologies, including Apache TinkerPop Gremlin for property graphs, the openCypher query language (compatible with Neo4j), and the W3C's SPARQL for RDF data, allowing developers to use familiar tools without vendor lock-in.[1]

Because the service is fully managed, AWS handles infrastructure tasks such as hardware provisioning, software patching, backups to Amazon S3, point-in-time recovery, and replication across multiple Availability Zones, providing greater than 99.99% availability.[1] Neptune supports up to 15 read replicas per cluster for high-throughput read workloads and uses SSD-backed storage for optimized performance.[1] Security features include encryption at rest and in transit using AWS Key Management Service (KMS), integration with Amazon Virtual Private Cloud (VPC), and fine-grained access control via AWS Identity and Access Management (IAM).[1]

The service was first announced in preview at AWS re:Invent 2017 and became generally available on May 30, 2018, initially in select AWS regions.[2] Since its launch, Neptune has expanded to support advanced use cases such as GraphRAG for AI applications and integrates with services such as Amazon Bedrock for knowledge bases and agentic AI.[3]
History
Announcement and Initial Development
Amazon Neptune was announced on November 29, 2017, during the AWS re:Invent conference as a fully managed graph database service designed to simplify building and running applications that work with highly connected datasets.[4] The service was introduced to address the limitations of traditional relational databases in modeling complex relationships, which often result in intricate join operations, increased development costs, and suboptimal query performance.[4] Developed from the ground up by Amazon Web Services (AWS), Neptune was optimized to handle billions of relationships across property graph and Resource Description Framework (RDF) models, delivering millisecond latency for queries at scale.[4] It integrates with the AWS ecosystem, running within an Amazon Virtual Private Cloud (VPC) for secure deployment and supporting data loading from Amazon S3 to enable efficient ingestion of large datasets in formats like CSV for property graphs and Turtle for RDF.[4] This foundational design emphasized high availability, durability, and ease of management, allowing developers to focus on application logic rather than infrastructure maintenance.[2]

Following the announcement, Neptune entered a limited preview phase in late 2017, where early adopters could sign up to access the core engine supporting Apache TinkerPop Gremlin for property graphs and SPARQL for RDF queries.[4] During this period, AWS incorporated customer feedback to refine capabilities such as read replicas, failover mechanisms, and encryption at rest.[2] The service achieved general availability on May 30, 2018, initially in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Ireland) regions, marking the completion of its initial development and rollout for production use.[2]
Key Milestones and Updates
Subsequent milestones included the launch of Amazon Neptune Serverless on October 26, 2022, which introduced automatic scaling to handle variable workloads without manual provisioning.[5] This was followed by the introduction of Amazon Neptune Analytics on November 29, 2023, enabling fast, in-memory graph analytics for large-scale queries.[6] The service's engine has evolved steadily from the initial 1.0.1.0 release in 2018 through multiple minor and patch updates to version 1.4.6.1 as of September 18, 2025.[7] Key enhancements along the way include the upgrade to Apache TinkerPop 3.4.1 on July 26, 2019, which added support for advanced Gremlin features such as improved traversal patterns and the GraphBinary serialization format for efficient data exchange.[8] Later versions incorporated performance optimizations, notably engine 1.4.6.0, released on September 2, 2025, which improved SPARQL update operations and openCypher mutation performance for CREATE, MERGE, and SET queries.[9] Neptune also gained full-text search integration with Amazon OpenSearch Service, enabling hybrid graph and text queries in Gremlin and SPARQL.[12]

In 2025, Amazon Neptune underwent several updates focused on reliability and security, including operating system upgrades to improve performance and address vulnerabilities.[10] On April 2, 2025, AWS updated the service level agreement to commit to 99.99% monthly uptime for Multi-AZ deployments, reflecting improvements in high-availability configurations.[11] On September 4, 2025, Neptune introduced public endpoints, allowing secure access from outside a VPC without VPNs or bastion hosts, available from engine version 1.4.6.x.[13] The service also expanded to new regions, including Asia Pacific (Malaysia) on April 9, 2025, and Canada West (Calgary) on May 28, 2025.[14]
Features
Data Models and Query Languages
Amazon Neptune supports two primary graph data models: the property graph model and the Resource Description Framework (RDF) model. These models allow users to represent and query highly connected datasets without maintaining separate databases for each, as Neptune's engine natively handles both within a unified storage layer.

The property graph model organizes data into vertices (nodes) and edges (relationships), where both vertices and edges can carry properties as key-value pairs. Vertices are identified by unique identifiers, edges connect a source vertex to a target vertex with a label describing the relationship type, and properties store additional attributes such as strings, numbers, or lists. This structure facilitates modeling complex networks like social graphs or recommendation systems.[15]

In contrast, the RDF model represents data as triples consisting of a subject, predicate, and object, forming statements about resources identified by URIs or literals. Neptune extends this to quads by including a graph identifier, enabling named graphs for partitioning data and supporting multiple RDF datasets in a single instance. This model is particularly suited to semantic web applications, knowledge graphs, and linked data scenarios, and it adheres to the W3C RDF 1.1 standards. Both models are stored using a common quad-based internal representation (subject-predicate-object-graph), which optimizes storage efficiency and query performance across paradigms.[15][16]

For querying the property graph model, Neptune supports Apache TinkerPop Gremlin, an imperative traversal language that allows step-by-step navigation of vertices and edges. Gremlin enables complex traversals, aggregations, and transformations, and is compatible with TinkerPop 3 implementations in languages such as Java, Python, and Groovy. Neptune also supports openCypher, a declarative query language originally developed for Neo4j and open-sourced under the Apache 2.0 license, which uses pattern-matching syntax (e.g., MATCH clauses with motifs like ()-[]->()) to express graph queries in an SQL-like manner. Neptune's openCypher support is compliant with version 9 of the openCypher specification and lets developers familiar with relational querying perform reads and updates on property graphs; because both languages operate on the same data, applications are not forced to choose between Gremlin and openCypher.[17][18][19]

The RDF model is queried using W3C SPARQL 1.1, a declarative language for retrieving and manipulating RDF data through patterns in SELECT, CONSTRUCT, ASK, and DESCRIBE queries, as well as updates via INSERT, DELETE, and LOAD operations. SPARQL supports federated queries, entailment regimes, and functions for filtering and aggregating results, making it well suited to semantic querying and inference. Neptune's implementation complies with the SPARQL 1.1 Query Language recommendation, including support for property paths and subqueries.[20][21]

Neptune's query engine is natively optimized for both models, leveraging index-free adjacency for fast traversals and SSD-backed storage to achieve low-latency execution of Gremlin, openCypher, and SPARQL queries on graphs with billions of relationships. This unified architecture eliminates the need for model-specific databases, enabling seamless switching between query languages based on application needs.[1]
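As an illustration of how the three query languages reach the same database, the following sketch uses the gremlinpython driver for a Gremlin traversal and plain HTTPS requests for openCypher and SPARQL. The cluster endpoint, labels, and queries are hypothetical, and IAM authentication is omitted; Neptune endpoints are normally reachable only from inside the cluster's VPC.

```python
# Minimal sketch: querying one Neptune cluster with Gremlin, openCypher, and SPARQL.
# The endpoint below is hypothetical; IAM auth and certificate handling are omitted.
import requests
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

ENDPOINT = "my-cluster.cluster-abc123.us-east-1.neptune.amazonaws.com"  # hypothetical

# Property graph via Gremlin (imperative traversal) over WebSocket.
conn = DriverRemoteConnection(f"wss://{ENDPOINT}:8182/gremlin", "g")
g = traversal().withRemote(conn)
print(g.V().hasLabel("person").limit(5).valueMap().toList())
conn.close()

# Property graph via openCypher (declarative pattern matching) over HTTPS.
cypher = "MATCH (p:person)-[:knows]->(f) RETURN p, f LIMIT 5"
print(requests.post(f"https://{ENDPOINT}:8182/openCypher", data={"query": cypher}).json())

# RDF via SPARQL 1.1 over HTTPS.
sparql = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
print(requests.post(f"https://{ENDPOINT}:8182/sparql", data={"query": sparql}).json())
```

The Gremlin and openCypher calls read the same property graph data, while the SPARQL call addresses the RDF model, reflecting the unified storage layer described above.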
Performance and Scalability
Amazon Neptune achieves high query throughput, capable of processing more than 100,000 queries per second on large graphs, enabling efficient handling of demanding graph workloads.[3] This performance is supported by a memory-optimized architecture that includes a buffer pool cache, which keeps frequently accessed graph data in memory to reduce disk I/O and accelerate traversals.[22] Neptune also offers optional indexing features, such as the Object-Subject-Graph-Predicate (OSGP) index, which benefits datasets with a large number of unique predicates by allowing faster predicate-based lookups without scanning the entire graph.[23]

For scalability, Neptune provides automatic storage scaling that grows the cluster volume up to 128 TiB as data increases, ensuring seamless capacity expansion without manual intervention.[24] Read scalability is enhanced by adding up to 15 low-latency read replicas that share the same underlying storage as the primary instance, distributing read traffic to maintain performance under high load.[25] Write operations use quorum-based durability: data is replicated across six copies in three Availability Zones (AZs), and four acknowledgments are required for a commit, balancing consistency with fault tolerance.[26] Neptune's reliability is underpinned by a 99.99% availability Service Level Agreement (SLA) for Multi-AZ deployments, minimizing downtime for production environments.[11] Failover typically completes in under 60 seconds when replicas are present, supporting a low recovery time objective.[27] For elastic workloads, Neptune Serverless adds automatic compute scaling on top of these mechanisms, while provisioned clusters rely on fixed instance capacity for consistent performance.[25]
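Read replicas are added through the management API. The boto3 sketch below, with hypothetical cluster and instance identifiers, creates one additional replica that serves read traffic from the cluster's shared storage volume.

```python
# Minimal sketch: adding a Neptune read replica with boto3 (identifiers are hypothetical).
import boto3

neptune = boto3.client("neptune", region_name="us-east-1")

# The new instance joins the existing cluster and serves read-only traffic
# from the same shared cluster volume as the primary instance.
neptune.create_db_instance(
    DBInstanceIdentifier="my-cluster-replica-1",
    DBClusterIdentifier="my-cluster",
    DBInstanceClass="db.r5.large",
    Engine="neptune",
)
```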
Security and Compliance
Amazon Neptune provides robust security features to protect data in graph databases, emphasizing network isolation, access controls, and encryption. Neptune clusters are deployed within an Amazon Virtual Private Cloud (VPC), which restricts access to resources within the VPC boundaries. This setup uses private endpoints so that database endpoints are not publicly accessible unless explicitly configured, preventing unauthorized external connections while allowing secure communication between Neptune and other AWS services or EC2 instances in the same VPC.[28]

Access to Neptune is managed through integration with AWS Identity and Access Management (IAM), which supports fine-grained permissions for controlling API actions such as creating, modifying, or deleting database resources. IAM policies can be attached to users, groups, or roles to enforce least-privilege access, ensuring that only authorized entities can perform specific operations on the cluster. Neptune also supports IAM database authentication, which allows users to authenticate to the database with IAM credentials rather than traditional passwords, improving security by relying on short-lived tokens and eliminating the need to manage database-level credentials.

Data protection in Neptune includes encryption at rest and in transit. At rest, all data, automated backups, snapshots, and replicas are encrypted using keys managed by AWS Key Management Service (KMS), and customers can supply their own customer-managed keys for greater control over key lifecycle and access; encryption at rest is enabled during cluster creation and cannot be disabled once activated. In transit, Neptune enforces Transport Layer Security (TLS) to encrypt connections between clients and the database endpoint, safeguarding data during query execution and replication.

Neptune adheres to numerous compliance standards, with more than 20 certifications applicable through AWS services in scope, including FedRAMP Moderate, HIPAA (via Business Associate Agreement), PCI DSS Level 1, and the SOC 1, SOC 2, and SOC 3 reports. Compliance validation reports and audit artifacts are available for download via AWS Artifact, allowing customers to verify adherence to regulatory requirements. Audit logging is provided through AWS CloudTrail, which captures API calls and management events for Neptune clusters, enabling monitoring, compliance auditing, and forensic analysis of security-related activities.[29][30]
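When IAM database authentication is enabled, each data-plane request to the cluster endpoint must carry a Signature Version 4 signature. The sketch below signs a SPARQL request with botocore's SigV4 helper; the endpoint is hypothetical, and the example assumes AWS credentials are available in the environment.

```python
# Minimal sketch: signing a Neptune data-plane request with SigV4 for IAM database auth.
import boto3
import requests
from urllib.parse import urlencode
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

ENDPOINT = "https://my-cluster.cluster-abc123.us-east-1.neptune.amazonaws.com:8182/sparql"  # hypothetical
REGION = "us-east-1"

body = urlencode({"query": "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"})
request = AWSRequest(method="POST", url=ENDPOINT, data=body,
                     headers={"Content-Type": "application/x-www-form-urlencoded"})

# "neptune-db" is the service name Neptune uses for data-plane request signing.
creds = boto3.Session().get_credentials()
SigV4Auth(creds, "neptune-db", REGION).add_auth(request)  # adds Authorization, X-Amz-Date, etc.

response = requests.post(ENDPOINT, data=body, headers=dict(request.headers.items()))
print(response.json())
```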
Storage and Replication
Amazon Neptune employs a custom, distributed storage engine optimized for graph databases, using a shared architecture with NVMe SSD-based cluster volumes that scale automatically to accommodate growing data.[31] The engine uses write-ahead logging (WAL) to ensure transaction durability: internal transaction logs are maintained separately from the primary data storage, helping to prevent data loss during failures while contributing to the storage high-water mark.[31] For reliability, Neptune replicates each piece of data into six copies distributed across three Availability Zones (AZs) within a region, providing a high degree of durability with minimal risk of data loss even if an AZ fails.[31][32]

Volume management is fully automated, beginning with a minimum allocation of 10 GiB and expanding in 10 GiB increments up to a maximum of 128 TiB per cluster volume in most regions, or 64 TiB in AWS China Regions and AWS GovCloud (US).[31][33] Scaling occurs without downtime or manual intervention as data grows, though storage cannot be shrunk directly; reducing it requires exporting and reloading data into a new cluster.[31] Neptune also offers an I/O-optimized storage configuration, available since engine version 1.3.0.0, tailored for workloads with high input/output demands and delivering predictable performance and lower latency than the standard storage option.[34] Storage is billed in GiB-month increments based on the provisioned high-water mark, ensuring efficient resource utilization without over-provisioning.[31]

Replication in Neptune balances durability and read scalability through a combination of synchronous and asynchronous mechanisms. Synchronous multi-AZ replication is inherent to the cluster volume design: writes to the primary DB instance are durably committed only after successful replication to the six copies across three AZs, enabling automatic failover with low recovery time objectives.[31][1] For read-heavy applications, up to 15 asynchronous read replicas can be provisioned in additional AZs, each connecting to the shared cluster volume without duplicating data; these replicas handle read-only queries to offload traffic from the primary instance and support horizontal scaling.[1] This approach maintains consistency while distributing query loads, though replicas may experience slight replication lag under high write throughput.[1]

Backup capabilities ensure data protection through continuous, automated mechanisms. Automated backups are enabled by default with a configurable retention period of 1 to 35 days, stored durably in Amazon S3 and usable for full cluster recovery or cross-region replication.[35][36] Complementing snapshots, point-in-time recovery (PITR) leverages continuous transaction log backups to allow restoration to any second within the backup retention window with minimal data loss.[35][25] These features operate transparently, with no performance impact during backup operations, and support encryption if the cluster is configured for it.[35]
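Point-in-time recovery and manual snapshots are exposed through the management API. The boto3 sketch below, with hypothetical identifiers and timestamp, restores a new cluster to a specific second within the retention window and takes a manual snapshot for longer-term retention.

```python
# Minimal sketch: point-in-time restore and manual snapshot of a Neptune cluster
# (identifiers and timestamp are hypothetical).
from datetime import datetime, timezone
import boto3

neptune = boto3.client("neptune", region_name="us-east-1")

# Restores into a new cluster; the source cluster keeps running unchanged.
# DB instances must then be created separately in the restored cluster.
neptune.restore_db_cluster_to_point_in_time(
    DBClusterIdentifier="my-cluster-restored",
    SourceDBClusterIdentifier="my-cluster",
    RestoreToTime=datetime(2025, 9, 1, 12, 0, 0, tzinfo=timezone.utc),
)

# A manual snapshot can be taken at any time and retained beyond the backup window.
neptune.create_db_cluster_snapshot(
    DBClusterSnapshotIdentifier="my-cluster-snap-2025-09-01",
    DBClusterIdentifier="my-cluster",
)
```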
Specialized Offerings
Neptune Serverless
Amazon Neptune Serverless is an on-demand, fully managed deployment option for the Neptune graph database service that automatically adjusts compute and memory capacity to match workload demands, eliminating the need for manual provisioning.[37] Launched on October 26, 2022, it scales seamlessly from idle states to handling thousands of queries per second without downtime or over-provisioning, making it suitable for applications with unpredictable traffic patterns.[38]

Capacity in Neptune Serverless is measured in Neptune Capacity Units (NCUs), where each NCU provides approximately 2 GiB of memory along with proportional CPU and networking resources.[39] Users configure a minimum and maximum NCU range, from a minimum of 1.0 NCU (adjustable in 0.5 NCU increments) up to a maximum of 128 NCUs (equivalent to 256 GiB of memory), and the system scales dynamically in fractions of a second based on real-time monitoring of CPU, memory, and network utilization.[39] When idle, the cluster scales down to the minimum capacity to minimize costs, while bursts trigger rapid upscaling to maintain performance.[38]

Neptune Serverless supports the same core data models and query languages as the provisioned Neptune offering, including property graphs with Gremlin and openCypher, as well as RDF with SPARQL.[40] It is designed for operational workloads such as development environments, multi-tenant applications, and production graphs with variable query volumes, like fraud detection or knowledge graphs, where automatic scaling removes capacity-planning overhead.[41] Pricing is based on the NCU-hours consumed, with details covered under Neptune's serverless pricing model.[42]
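The NCU scaling range is specified when the cluster is created. The boto3 sketch below, with hypothetical identifiers and a 1.0 to 16.0 NCU range, creates a serverless cluster and a db.serverless instance in it.

```python
# Minimal sketch: creating a Neptune Serverless cluster with boto3 (identifiers hypothetical).
import boto3

neptune = boto3.client("neptune", region_name="us-east-1")

# The scaling range is expressed in Neptune Capacity Units (NCUs).
neptune.create_db_cluster(
    DBClusterIdentifier="my-serverless-cluster",
    Engine="neptune",
    ServerlessV2ScalingConfiguration={"MinCapacity": 1.0, "MaxCapacity": 16.0},
)

# Instances in a serverless cluster use the db.serverless instance class.
neptune.create_db_instance(
    DBInstanceIdentifier="my-serverless-instance",
    DBClusterIdentifier="my-serverless-cluster",
    DBInstanceClass="db.serverless",
    Engine="neptune",
)
```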
Neptune Analytics
Amazon Neptune Analytics is a serverless, fully managed graph analytics service launched on November 29, 2023, designed for rapid analysis of large graph datasets without infrastructure management.[6] It allows users to run complex graph queries and analytics on datasets with billions of relationships, returning results in seconds through a memory-optimized architecture.[43] The service supports multiple graph query languages, including Apache TinkerPop Gremlin, openCypher, and SPARQL, enabling flexible querying across property graph and RDF models.[43] Key capabilities include built-in vector indexes for similarity searches integrated into graph traversals, as well as machine learning integrations that leverage embeddings for advanced pattern detection and recommendations.[33] For data ingestion, it offers a bulk loader for loading data from Amazon S3 buckets, alongside streaming ingestion for real-time updates; each graph can use up to 4096 GB of RAM (4096 m-NCUs) for in-memory processing, and since July 30, 2024, configurations start from 32 m-NCUs.[43][44][45]

Distinctive features of Neptune Analytics include support for GraphRAG workflows via integration with Amazon Bedrock, which enhances retrieval-augmented generation by combining graph traversals with generative AI for more contextual responses.[25] It also provides query cancellation and status tracking through APIs, allowing users to monitor and interrupt long-running analytics jobs.[43] These capabilities make it particularly suited to exploratory analytics on knowledge graphs, fraud detection, and recommendation systems.[46]
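Neptune Analytics graphs are managed through their own API, separate from the Neptune database API. The boto3 sketch below, with a hypothetical graph name and query and memory expressed in m-NCUs, creates a graph with the neptune-graph client and runs an openCypher query once it is available.

```python
# Minimal sketch: creating and querying a Neptune Analytics graph (names are hypothetical).
import time
import boto3

graph_api = boto3.client("neptune-graph", region_name="us-east-1")

# provisionedMemory is expressed in m-NCUs.
graph = graph_api.create_graph(graphName="my-analytics-graph", provisionedMemory=32)
graph_id = graph["id"]

# Wait until the graph finishes provisioning before querying it.
while graph_api.get_graph(graphIdentifier=graph_id)["status"] != "AVAILABLE":
    time.sleep(30)

# Run an openCypher query directly against the in-memory graph.
result = graph_api.execute_query(
    graphIdentifier=graph_id,
    queryString="MATCH (n) RETURN count(n) AS nodes",
    language="OPEN_CYPHER",
)
print(result["payload"].read())
```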
Availability and Deployment
Regional and Global Support
Amazon Neptune is available in over 30 AWS regions worldwide as of 2025, enabling customers to deploy graph databases in locations that meet their data residency and latency requirements.[47] Recent expansions include the Asia Pacific (Melbourne) region (ap-southeast-4) and the Canada West (Calgary) region (ca-west-1), both launched on May 28, 2025, to support growing demand in the Asia-Pacific and North American markets.[14] This regional footprint spans North America, South America, Europe, the Middle East, Africa, Asia Pacific, China, and the AWS GovCloud (US) regions, totaling 31 supported areas.[47]

Neptune Analytics, the serverless graph analytics offering, has also seen regional growth, with availability extended to the AWS Canada (Central) region (ca-central-1) and the Australia (Sydney) region (ap-southeast-2) in October 2025.[48] These additions broaden the options for real-time graph analytics workloads in key international markets, complementing the core Neptune database's global presence.

For cross-region data distribution, Amazon Neptune Global Database provides low-latency replication across multiple regions, achieving sub-second replication lag to support globally distributed applications.[49] As of July 2025, this feature expanded to five additional regions: Europe (Frankfurt) (eu-central-1), Asia Pacific (Singapore) (ap-southeast-1), Asia Pacific (Osaka) (ap-northeast-3), Asia Pacific (Jakarta) (ap-southeast-3), and Israel (Tel Aviv) (il-central-1).[50] Storage capacity varies by region: while most regions support up to 128 TiB per cluster volume, deployments in the China (Beijing and Ningxia) and AWS GovCloud (US) regions are capped at 64 TiB.[31]
High Availability Configurations
Amazon Neptune provides high availability through multi-AZ deployments that distribute database instances and storage across multiple Availability Zones (AZs) within an AWS Region, ensuring resilience against AZ-level failures.[31] In a Multi-AZ configuration, the primary DB instance handles both reads and writes, while read replicas placed in other AZs enable automatic failover if the primary fails.[26] The underlying cluster volume replicates data into six copies across three AZs, providing high durability and automatic repair of corrupted segments from redundant copies.[31] A Multi-AZ deployment requires a VPC with subnets in at least two AZs, and Neptune automatically distributes instances across these zones for fault tolerance.[51]

Upon detecting a primary instance failure, Neptune initiates an automatic failover to a read replica in another AZ, typically restoring service in less than 120 seconds and often in under 60 seconds, with no manual intervention required.[27] This process promotes the selected read replica to primary, minimizing downtime while maintaining data consistency thanks to the shared storage volume.[26] For additional read scaling and availability, clusters support up to 15 read replicas, each sharing the same cluster volume as the primary and typically exhibiting replication lag under 100 milliseconds.[1] Replicas can be added or removed without affecting the underlying data replication across AZs, and in disaster scenarios a read replica can be manually promoted to a standalone DB instance.[26]

Disaster recovery in Neptune relies on point-in-time recovery and snapshot management to restore clusters after failures or data corruption.[1] Continuous automated backups are stored durably in Amazon S3, enabling point-in-time recovery to any second within the retention period, which can be configured up to 35 days.[1] User-initiated snapshots, also stored in S3, support cross-region copying for broader recovery options, allowing restoration in a different AWS Region to mitigate regional outages.[52] Copying a snapshot across regions can take hours depending on data volume, but it provides a low-overhead disaster recovery method that does not require ongoing replication.[53]

Neptune's Multi-AZ configurations are backed by a 99.99% monthly uptime Service Level Agreement (SLA), applicable to DB instances, clusters, and graphs deployed across multiple AZs.[11] Under this SLA, AWS commits to commercially reasonable efforts to achieve the uptime target, with service credits for downtime: 10% of charges for monthly uptime between 99.0% and 99.99%, 25% for uptime between 95.0% and 99.0%, and 100% for uptime below 95.0%.[11] Credits are calculated on the total charges for the affected Multi-AZ resources and must be requested via the AWS Support Center within two billing cycles.[11] Single-AZ deployments, in contrast, qualify for a lower 99.5% uptime SLA.[11]
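Failover can also be triggered manually, for example during disaster-recovery testing. The boto3 sketch below, with hypothetical identifiers, forces a failover to a chosen read replica.

```python
# Minimal sketch: manually triggering a Neptune failover for DR testing
# (identifiers are hypothetical).
import boto3

neptune = boto3.client("neptune", region_name="us-east-1")

# Promotes the named read replica to primary; the old primary rejoins as a replica
# once it recovers, and the cluster (writer) endpoint follows the new primary.
neptune.failover_db_cluster(
    DBClusterIdentifier="my-cluster",
    TargetDBInstanceIdentifier="my-cluster-replica-1",
)
```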
Pricing
Instance-Based Models
Amazon Neptune's instance-based pricing model applies to provisioned database instances: users pay for the compute capacity they allocate, covering both primary instances for read-write workloads and read replicas for scaling reads and supporting failover.[42] On-demand pricing charges an hourly rate based on the selected instance type, with rates varying by region; for example, in US East (N. Virginia), a db.r5.large instance costs $0.348 per hour under the standard configuration or $0.4698 per hour under the I/O-optimized configuration (as of November 2025).[42] Other instance types, such as db.r5.xlarge at $0.696 per hour (standard) or db.r5.24xlarge at $16.704 per hour (standard), scale similarly, allowing users to choose based on workload requirements such as memory and vCPU needs.[42]

For long-term commitments, Amazon Neptune supports Reserved Instances and Savings Plans, which can deliver significant savings over on-demand pricing through 1- or 3-year terms and, in some cases, apply to provisioned instances without upfront capacity reservations.[42] These options help optimize costs for predictable workloads by committing to a consistent spend level across Neptune and other AWS services.

Beyond compute, additional costs include storage at $0.10 per GB-month for the standard configuration or $0.225 per GB-month for I/O-optimized, which provides higher throughput for intensive graph traversals.[42] I/O requests are charged at $0.20 per million under the standard configuration, while I/O-optimized eliminates this fee in exchange for higher storage and instance rates.[42] Backup storage is free up to 100% of the total database storage for up to seven days, with excess or longer-retained snapshots costing $0.021 per GB-month.[42] New AWS customers can access a limited free tier for Neptune provisioned instances, offering 750 hours of db.t3.medium or db.t4g.medium usage, 10 million I/O requests, 1 GB of storage, and 1 GB of backup storage within the first 30 days of account creation.[42] Data transfer within the same Availability Zone remains free, supporting efficient intra-region operations without additional charges.[42]

| Instance Type | Standard On-Demand Rate (US East, $/hour) | I/O-Optimized On-Demand Rate (US East, $/hour) |
|---|---|---|
| db.r5.large | 0.348 | 0.4698 |
| db.r5.xlarge | 0.696 | 0.9396 |
| db.r5.24xlarge | 16.704 | 22.5552 |
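As a rough illustration of how these components combine, the sketch below estimates the monthly cost of a single db.r5.large primary under the standard configuration, using the US East rates quoted above with placeholder storage and I/O volumes.

```python
# Rough monthly cost estimate for one db.r5.large Neptune instance (standard config,
# US East rates quoted in this section; storage and I/O volumes are illustrative).
HOURS_PER_MONTH = 730

instance = 0.348 * HOURS_PER_MONTH          # on-demand compute
storage = 0.10 * 100                        # 100 GB-month of storage
io = 0.20 * (50_000_000 / 1_000_000)        # 50 million I/O requests

total = instance + storage + io
print(f"Estimated monthly cost: ${total:,.2f}")  # roughly $274
```

Actual bills also depend on backup storage beyond the free allowance, data transfer, and any read replicas, so this arithmetic should be treated only as an order-of-magnitude guide.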