• To do AI, IoT, and other big data-dependent projects right, companies are looking beyond the confines of traditional relational databases.
• Two recent, related partnerships between highly specialized “graph” database developers, Neo4j and TigerGraph, and public cloud platform providers, Google and Amazon, underscore the importance of surfacing insights that would otherwise remain hidden within traditional database architectures.
Organizations anxious to put AI to work as a means of driving innovation must first invest in big data. AI algorithms and predictive models are nothing without a constant influx of high-quality data. The trouble is that not all data is created equal, at least in terms of its ability to match the demands of a given initiative, be that AI, IoT, mobility, or edge computing.
Such specific demands in turn drive the adoption of highly specialized data architecture, extending down to the database itself. There are traditional relational databases as well as those specializing in key-value storage, document storage, in-memory processing, time-series evaluation, transaction ledgers, and graph analysis. Each in turn solves very specific problems; self-driving cars, for example, won’t work without an underlying database capable of performing time-series analysis.
Two recent, related partnerships between highly specialized “graph” database developers and public cloud platform providers underscore the importance of specialized databases capable of surfacing insights that would otherwise remain hidden within traditional database architectures.
Neo4j, a very early entrant in the graph database market, announced that its database was now available as a fully managed service on Google Cloud Platform (GCP). Similarly, TigerGraph, a comparative newcomer, announced that its graph database was available on Amazon Web Services (AWS), going so far as to offer a full pay-as-you-go pricing model.
Why are graph databases important? Customers seeking to create a predictive model of a well-instrumented task such as fraud detection, for instance, must invest in a data architecture that can readily scale to accommodate a massive amount of real-time (often streaming) data spanning a large number of data points originating across a wide array of data sources. But those discrete data points, if housed in a traditional relational database, don’t tell the entire story.
Where traditional databases focus on the meaning of individual data points, graph databases focus instead on the relationships between those data points. Take a fraud ring, for example, which might employ a series of drivers, each engaging in similar “fake” accidents. It is the ability to identify the relationships between those drivers that brings their fraudulent activities to light.
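To make the fraud ring example concrete, here is a minimal sketch in plain Python rather than in any particular graph database. It assumes a hypothetical set of claims in which each driver is linked to a shared attribute (a phone number or a repair shop); the data, the function names, and the three-driver threshold are all illustrative assumptions. Drivers who look unrelated as individual rows turn out to belong to one connected group once the links are followed.

```python
from collections import defaultdict

# Hypothetical claims data: (driver, attribute the driver shares with others).
claims = [
    ("driver_a", "phone:555-0101"),
    ("driver_b", "phone:555-0101"),
    ("driver_b", "shop:quickfix"),
    ("driver_c", "shop:quickfix"),
    ("driver_d", "phone:555-0199"),
]


def build_graph(pairs):
    """Build an undirected graph linking drivers to the attributes they share."""
    graph = defaultdict(set)
    for driver, attribute in pairs:
        graph[driver].add(attribute)
        graph[attribute].add(driver)
    return graph


def connected_groups(graph):
    """Return the connected components of the graph using a depth-first walk."""
    seen = set()
    groups = []
    for start in graph:
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        groups.append(component)
    return groups


def suspicious_rings(pairs, min_drivers=3):
    """Flag groups of at least min_drivers drivers linked by shared attributes."""
    drivers = {driver for driver, _ in pairs}
    rings = []
    for component in connected_groups(build_graph(pairs)):
        members = sorted(component & drivers)
        if len(members) >= min_drivers:
            rings.append(members)
    return rings


print(suspicious_rings(claims))  # [['driver_a', 'driver_b', 'driver_c']]
```

No single record ties driver_a to driver_c; the connection only appears by walking from a shared phone number to a shared repair shop, which is the kind of multi-hop question graph databases are built to answer.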
What is a graph database? Technically, a graph database is a type of NoSQL database. Both emerged as a response to the limitations inherent in traditional relational databases like MySQL and PostgreSQL, where relationships between data points demand some forethought, such as connecting tables and defining database schemas. Conversely, graph (and NoSQL) databases throw the notion of a rigid, predefined schema out the window and instead offer an array of searchable nodes where the relationships between elements are treated as first-class database citizens.
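As a rough illustration of what treating relationships as first-class citizens can mean, the sketch below models a tiny property graph in plain Python. The `PropertyGraph` and `Relationship` names are hypothetical and do not correspond to the API of Neo4j, TigerGraph, or any other product; the point is only that each relationship is stored as its own object, with a type and properties, instead of being inferred from foreign keys at query time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A stored, typed link between two nodes, with its own properties."""
    source: str
    kind: str
    target: str
    properties: tuple = ()


class PropertyGraph:
    """Hypothetical in-memory property graph, for illustration only."""

    def __init__(self):
        self.nodes = {}      # node id -> dict of node properties
        self.outgoing = {}   # node id -> list of outgoing Relationship objects

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties
        self.outgoing.setdefault(node_id, [])

    def relate(self, source, kind, target, **properties):
        rel = Relationship(source, kind, target, tuple(sorted(properties.items())))
        self.outgoing[source].append(rel)
        return rel

    def neighbors(self, node_id, kind):
        """Follow stored relationships of one type directly, with no join step."""
        return [rel.target for rel in self.outgoing[node_id] if rel.kind == kind]


graph = PropertyGraph()
graph.add_node("alice", role="driver")
graph.add_node("bob", role="driver")
graph.add_node("claim_17", amount=4200)
graph.relate("alice", "FILED", "claim_17", date="2019-03-02")
graph.relate("bob", "WITNESSED", "claim_17")

print(graph.neighbors("alice", "FILED"))  # ['claim_17']
```

In a relational design, the same question would typically be answered by joining a drivers table to a claims table through a key column that had to be planned in advance.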
Fortunately, enterprise data practitioners are seeing a rapid influx of available options. This is especially true for those seeking a managed service from global, hyperscale public cloud providers. There’s Cosmos DB from Microsoft, Neptune from Amazon, and HANA from SAP. And beyond Neo4j Platform and TigerGraph DB, there’s OrientDB, Teradata Aster, DataStax, and many more pure-play and multimodel options.
Take Microsoft’s Cosmos DB as an example. This multimodel database supports document, key-value, and columnar data models. It also provides graph analysis. But unlike its open source, multiplatform-capable rival, Neo4j, Cosmos DB runs as a cloud-only commercial product on Microsoft Azure. Plus, Cosmos DB has to support several data models and associated use cases, which can impose some restrictions.
That’s why the ability to stand up a highly specialized third-party database like TigerGraph on AWS, with full access to many of the core features normally reserved for AWS’ own services, does far more than cater to the needs of a few highly specialized customers. Rather, it allows any and all AWS customers to build toward a business outcome, say fraud detection, without having to trade functionality or familiarity for mere convenience.