agtypeapache_age is a Postgres Pro extension that provides graph database functionality. AGE is an acronym for A Graph Extension. The goal of the project is to create single storage that can handle both relational and graph model data so that users can use standard ANSI SQL along with openCypher, the graph query language.
User tables in apache_age databases
contain columns using reg* OID-referencing system data types,
therefore upgrading to a major version with
pg_upgrade is not supported.
The apache_age extension is provided with Postgres Pro Enterprise
as a separate pre-built package apache-age-ent-17
(for the detailed installation instructions, see Chapter 17).
Once you have
Postgres Pro Enterprise installed,
create the apache_age extension:
CREATE EXTENSION age;
For every connection of apache_age you start, you will need to load the apache_age library.
LOAD 'age';
Non-superusers must specify the full path to load the apache_age library.
LOAD '$libdir/plugins/age.so';
We recommend adding ag_catalog to your
search_path to simplify your queries. The
rest of this document will assume you have done so. If you do
not, remember to add ag_catalog to your Cypher
query function calls.
SET search_path = ag_catalog, '$user', public;
In order to use apache_age, non-superusers need
USAGE privileges on the
ag_catalog schema (example for user
db_user):
GRANT USAGE ON SCHEMA ag_catalog TO db_user; GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA ag_catalog TO db_user;
A graph consists of a set of vertices and edges, where each individual node and edge possesses a map of properties. A vertex is the basic object of a graph, that can exist independently of everything else in the graph. An edge creates a directed connection between two vertices.
To create a graph, use the create_graph
function, located in the ag_catalog
namespace.
create_graph(graph_name text) returns void
#This function will not return any results. The graph is created if there is no error message. Tables needed to set up the graph are created automatically.
SELECT * FROM ag_catalog.create_graph('graph_name');
As a superuser, you can grant privilege on a specific
existing graph to a non-superuser (example for graph
graph1 and user db_user):
GRANT USAGE ON SCHEMA graph1 TO db_user; GRANT ALL PRIVILEGES ON SCHEMA graph1 TO db_user; GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA graph1 TO db_user; GRANT ALL PRIVILEGES ON TABLE graph1._ag_label_vertex TO db_user;
To delete a graph, use the drop_graph
function, located in the ag_catalog
namespace.
drop_graph(graph_name text, cascade boolean) returns void
#
This function will not return any results. If there is no
error message the graph has been deleted. It is recommended to
set the cascade option to true, otherwise
everything in the graph must be manually dropped
with SQL DDL commands.
SELECT * FROM ag_catalog.drop_graph('graph_name', true);
When creating graphs with apache_age,
a Postgres Pro namespace will be
generated for every individual graph. The name and namespace of
the created graphs can be seen within the
ag_graph table from the
ag_catalog namespace:
SELECT create_graph('new_graph');
NOTICE: graph 'new_graph' has been created
create_graph
--------------
(1 row)
SELECT * FROM ag_catalog.ag_graph;
name | namespace
-----------+-----------
new_graph | new_graph
(1 row)
After creating the graph, two tables are going to be created
under the graph namespace to store vertices and edges:
_ag_label_vertex and
_ag_label_edge. These will be the parent
tables of any new vertex or edge label. The query below shows
how to retrieve the edge and vertex labels for all the graphs in
the database.
-- Before creating a new vertex label.
SELECT * FROM ag_catalog.ag_label;
name | graph | id | kind | relation | seq_name
------------------+-------+----+------+----------------------------+-------------------------
_ag_label_vertex | 68484 | 1 | v | new_graph._ag_label_vertex | _ag_label_vertex_id_seq
_ag_label_edge | 68484 | 2 | e | new_graph._ag_label_edge | _ag_label_edge_id_seq
(2 rows)
-- Creating a new vertex label.
SELECT create_vlabel('new_graph', 'Person');
NOTICE: VLabel 'Person' has been created
create_vlabel
---------------
(1 row)
-- After creating a new vertex label.
SELECT * FROM ag_catalog.ag_label;
name | graph | id | kind | relation | seq_name
------------------+-------+----+------+----------------------------+-------------------------
_ag_label_vertex | 68484 | 1 | v | new_graph._ag_label_vertex | _ag_label_vertex_id_seq
_ag_label_edge | 68484 | 2 | e | new_graph._ag_label_edge | _ag_label_edge_id_seq
Person | 68484 | 3 | v | new_graph.'Person' | Person_id_seq
(3 rows)
Whenever a vertex label is created with the
create_vlabel() function, a new table is
generated within the new_graph namespace:
new_graph.'. The same
works for the label'create_elabel() function for
the creation of edge labels. Creating vertices and edges with
Cypher will automatically make these tables.
-- Creating two vertices and one edge.
SELECT * FROM cypher('new_graph', $$
CREATE (:Person {name: 'Daedalus'})-[:FATHER_OF]->(:Person {name: 'Icarus'})
$$) AS (a agtype);
a
---
(0 rows)
-- Showing the newly created tables.
SELECT * FROM ag_catalog.ag_label;
name | graph | id | kind | relation | seq_name
------------------+-------+----+------+----------------------------+-------------------------
_ag_label_vertex | 68484 | 1 | v | new_graph._ag_label_vertex | _ag_label_vertex_id_seq
_ag_label_edge | 68484 | 2 | e | new_graph._ag_label_edge | _ag_label_edge_id_seq
Person | 68484 | 3 | v | new_graph.'Person' | Person_id_seq
FATHER_OF | 68484 | 4 | e | new_graph.'FATHER_OF' | FATHER_OF_id_seq
(4 rows)
It is recommended that no DML or DDL commands are executed in the namespace that is reserved for the graph.
Cypher queries are constructed using a function called
cypher in ag_catalog,
which returns a Postgres Pro
SETOF records.
cypher(graph_name name, query_string cstring, parameters agtype) returns setof record
#
Executes the Cypher query passed as an argument.
If the Cypher query does not return results, a
record definition still needs to be defined.
The parameter map specified as parameters
can only be used with
prepared
statements. An error will be thrown otherwise.
SELECT * FROM cypher('graph_name', $$
/* Cypher Query Here */
$$) AS (result1 agtype, result2 agtype);
Cypher may not be used as part of an expression, use a subquery instead. See Advanced Cypher Queries for information about how to use Cypher queries with expressions.
Calling Cypher in the SELECT clause as an
independent column is not allowed. However, Cypher may be used
when it belongs as a conditional.
SELECT
cypher('graph_name', $$
MATCH (v:Person)
RETURN v.name
$$);
ERROR: cypher(...) in expressions is not supported
LINE 3: cypher('graph_name', $$
^
HINT: Use subquery instead if possible.
agtype #
apache_age uses a custom data type called
agtype, which is the only data
type returned by apache_age.
agtype is a superset of json and a custom
implementation of jsonb.
In Cypher, null is used to represent
missing or undefined values. Conceptually,
null means “a missing unknown
value”, and it is treated differently from
other values. For example, getting a property from a vertex
that does not have said property produces
null. Most expressions that take
null as input will produce
null. This includes boolean expressions
that are used as predicates in the WHERE
clause. In this case, anything that is not true is interpreted
as being false. null is not equal to
null. Not knowing two values does not imply
that they are the same value. So the expression
null = null yields null
and not true.
The following query returns null as an empty space.
SELECT *
FROM cypher('graph_name', $$
RETURN NULL
$$) AS (null_result agtype);
null_result
--------------
(1 row)
The concept of NULL in agtype and
Postgres Pro is the same as it is in Cypher.
The integer type stores whole numbers, i.e. numbers
without fractional components. Integer data type is a 64-bit field
that stores values from -9,223,372,036,854,775,808 to
9,223,372,036,854,775,807. Attempts to store values outside
this range will result in an error.
The type integer is the common choice, as it offers
the best balance between range, storage size, and performance. The
smallint type is generally used only if
disk space is at a premium. The bigint type
is designed to be used when the range of the integer type is
insufficient.
SELECT *
FROM cypher('graph_name', $$
RETURN 1
$$) AS (int_result agtype);
int_result
--------------
1
(1 row)
The data type float is an inexact,
variable-precision numeric type, conforming to the IEEE 754
Standard.
Inexact means that some values cannot be converted exactly to the internal format and are stored as approximations, so that storing and retrieving a value might show slight discrepancies. Managing these errors and how they propagate through calculations is the subject of an entire branch of mathematics and computer science and will not be discussed here, except for the following points:
If you require exact storage and calculations (such as for
monetary amounts), use the numeric type instead.
If you want to do complicated calculations with these types for anything important, especially if you rely on certain behavior in boundary cases (infinity, underflow), you should evaluate the implementation carefully.
Comparing two floating-point values for equality might not always work as expected.
Values that are too large or too small will cause an error. Rounding might take place if the precision of an input number is too high. Numbers too close to zero that are not representable as distinct from zero will cause an underflow error.
In addition to ordinary numeric values, the floating-point types have several special values:
Infinity
-Infinity
NaN
These represent the IEEE 754 special values “infinity”, “negative infinity”, and “not-a-number”, respectively. When writing these values as constants in a Cypher command, you must put quotes around them and typecast them, for example:
SET x.float_value = '-Infinity'::float
On input, these strings are recognized in a case-insensitive manner.
Note that IEEE 754 specifies that NaN should not compare
equal to any other floating-point value (including NaN).
However, in order to allow floats to be sorted correctly,
apache_age evaluates
'NaN'::float = 'NaN'::float to true. See
Comparability
and Equality for more details.
To use a float, denote a decimal value.
SELECT *
FROM cypher('graph_name', $$
RETURN 1.0
$$) AS (float_result agtype);
float_result
--------------
1.0
(1 row)
The type numeric can store numbers with a
very large number of digits. It is especially recommended for
storing monetary amounts and other quantities where exactness is
required. Calculations with numeric values yield exact
results where possible, e.g., addition, subtraction, multiplication.
However, calculations on numeric values are very slow
compared to the integer types, or to the floating-point types.
We use the following terms below: The
precision of a numeric
is the total count of significant digits in the whole number,
that is, the number of digits to both sides of the decimal point.
The scale of a numeric is the
count of decimal digits in the fractional part, to the right of the
decimal point. So the number 23.5141 has a precision of 6 and a
scale of 4. Integers can be considered to have a scale of zero.
Specifying NUMERIC
without any precision or scale creates a column in which numeric
values of any precision and scale can be stored, up to the
implementation limit on precision. A column of this kind will
not coerce input values to any particular scale, whereas
numeric columns with a declared scale will coerce
input values to that scale. (The SQL standard
requires a default scale of 0, i.e., coercion to integer
precision. We find this a bit useless. If you are concerned
about portability, always specify the precision and scale
explicitly.)
The maximum allowed precision when explicitly specified in the
type declaration is 1000; NUMERIC without a specified
precision is subject to the limits described in Table 8.2.
If the scale of a value to be stored is greater than the declared scale of the column, the system will round the value to the specified number of fractional digits. Then, if the number of digits to the left of the decimal point exceeds the declared precision minus the declared scale, an error is raised.
Numeric values are physically stored without any extra leading or
trailing zeroes. Thus, the declared precision and scale of a column
are maximums, not fixed allocations. (In this sense the numeric
type is more akin to varchar(
than to n)char(.) The actual storage
requirement is two bytes for each group of four decimal digits,
plus three to eight bytes overhead.
n)
In addition to ordinary numeric values, the numeric
type allows the special value NaN, meaning
“not-a-number”. Any operation on NaN
yields another NaN. When writing this value
as a constant in an SQL command, you must put quotes around it,
for example UPDATE table SET x = 'NaN'. On input,
the string NaN is recognized in a case-insensitive manner.
In most implementations of the “not-a-number” concept,
NaN is not considered equal to any other numeric
value (including NaN). However, in order to allow floats to be
sorted correctly, apache_age evaluates
'NaN'::numeric = 'NaN':numeric to true.
See Comparability
and Equality for more details.
When rounding values, the numeric type rounds ties away
from zero, while (on most machines) the real
and double precision types round ties to the nearest even
number.
When creating a numeric data type, the
::numeric data annotation is required.
SELECT *
FROM cypher('graph_name', $$
RETURN 1.0::numeric
$$) AS (numeric_result agtype);
numeric_result
--------------
1.0::numeric
(1 row)
apache_age provides the standard Cypher
type boolean. The boolean
type can have several states: true,
false, and a third state,
unknown, which is represented by
the agtype null value.
Boolean constants can be represented in Cypher queries by
the keywords TRUE,
FALSE, and NULL.
SELECT *
FROM cypher('graph_name', $$
RETURN TRUE
$$) AS (boolean_result agtype);
boolean_result
--------------
true
(1 row)
Unlike Postgres Pro, in
apache_age boolean
outputs as the full word, i.e. true and
false as opposed to t
and f.
agtype string literals can contain the following
escape sequences:
Table H.1. Escape Sequences
| Escape Sequence | Character |
|---|---|
| \t | Tab |
| \b | Backspace |
| \n | Newline |
| \r | Carriage Return |
| \f | Form Feed |
| \' | Single Quote |
| \" | Double Quote |
| \\ | Backslash |
| \uXXXX |
Unicode UTF-16 code point (4 hex digits must follow
\u)
|
Use single (') quotes to identify a string. The output will use double (") quotes.
SELECT *
FROM cypher('graph_name', $$
RETURN 'This is a string'
$$) AS (string_result agtype);
string_result
--------------
"This is a string"
(1 row)
All examples will use the
WITH
clause and
RETURN
clause.
A literal list is created by using brackets and separating the elements in the list with commas.
SELECT *
FROM cypher('graph_name', $$
WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst
$$) AS (lst agtype);
lst
------------------------------------
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
(1 row)
A list can hold the value null, unlike
when a null is an independent value, it
will appear as the word null in a list.
SELECT *
FROM cypher('graph_name', $$
WITH [null] as lst
RETURN lst
$$) AS (lst agtype);
lst
--------
[null]
(1 row)
To access individual elements in the list, we use the square brackets again. This will extract from the start index and up to but not including the end index.
SELECT *
FROM cypher('graph_name', $$
WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst[3]
$$) AS (element agtype);
element
---------
3
(1 row)
Map elements in lists:
SELECT *
FROM cypher('graph_name', $$
WITH [0, {key: 'key_value'}, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst
$$) AS (map_value agtype);
map_value
-------------------------------------------------------
[0, {"key": "key_value"}, 2, 3, 4, 5, 6, 7, 8, 9, 10]
(1 row)
Accessing map elements in lists:
SELECT *
FROM cypher('graph_name', $$
WITH [0, {key: 'key_value'}, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst[1].key
$$) AS (map_value agtype);
map_value
-------------
"key_value"
(1 row)
You can also use negative numbers, to start from the end of the list instead.
SELECT *
FROM cypher('graph_name', $$
WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst[-3]
$$) AS (element agtype);
element
---------
8
(1 row)
Finally, you can use ranges inside the brackets to return ranges of the list.
SELECT *
FROM cypher('graph_name', $$
WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst[0..3]
$$) AS (element agtype);
element
-----------
[0, 1, 2]
(1 row)
Negative index ranges:
SELECT *
FROM cypher('graph_name', $$
WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst[0..-5]
$$) AS (lst agtype);
lst
--------------------
[0, 1, 2, 3, 4, 5]
(1 row)
Positive slices:
SELECT *
FROM cypher('graph_name', $$
WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst[..4]
$$) AS (lst agtype);
lst
--------------
[0, 1, 2, 3]
(1 row)
Negative slices:
SELECT *
FROM cypher('graph_name', $$
WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst[-5..]
$$) AS (lst agtype);
lst
------------------
[6, 7, 8, 9, 10]
(1 row)
Out-of-bound slices are simply truncated, but out-of-bound single elements return null.
SELECT *
FROM cypher('graph_name', $$
WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst[15]
$$) AS (element agtype);
element
---------
(1 row)
SELECT *
FROM cypher('graph_name', $$
WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
RETURN lst[5..15]
$$) AS (element agtype);
element
---------------------
[5, 6, 7, 8, 9, 10]
(1 row)
Maps can be constructed using Cypher.
You can construct a simple map with simple agtype values.
SELECT *
FROM cypher('graph_name', $$
WITH {int_key: 1, float_key: 1.0, numeric_key: 1::numeric, bool_key: true, string_key: 'Value'} as m
RETURN m
$$) AS (m agtype);
m
------------------------------------------------------------------------------------------------------
{"int_key": 1, "bool_key": true, "float_key": 1.0, "string_key": "Value", "numeric_key": 1::numeric}
(1 row)
A map can also contain composite data types, i.e. lists and other maps.
SELECT *
FROM cypher('graph_name', $$
WITH {listKey: [{inner: 'Map1'}, {inner: 'Map2'}], mapKey: {i: 0}} as m
RETURN m
$$) AS (m agtype);
m
-------------------------------------------------------------------------
{"mapKey": {"i": 0}, "listKey": [{"inner": "Map1"}, {"inner": "Map2"}]}
(1 row)
Property access of a map:
SELECT *
FROM cypher('graph_name', $$
WITH {int_key: 1, float_key: 1.0, numeric_key: 1::numeric, bool_key: true, string_key: 'Value'} as m
RETURN m.int_key
$$) AS (int_key agtype);
int_key
---------
1
(1 row)
Accessing list elements in maps:
SELECT *
FROM cypher('graph_name', $$
WITH {listKey: [{inner: 'Map1'}, {inner: 'Map2'}], mapKey: {i: 0}} as m
RETURN m.listKey[0]
$$) AS (m agtype);
m
-------------------
{"inner": "Map1"}
(1 row)
An entity has a unique, comparable identity which defines whether or not two entities are equal.
An entity is assigned a set of properties, each of which are uniquely identified in the set by the irrespective property keys.
Simple entities are assigned a unique graphid.
A graphid is a
unique composition of the entity label id and a unique
sequence assigned to each label. Note that there will be
overlap in IDs when comparing entities from different graphs.
A label is an identifier that classifies vertices and edges into certain categories.
Edges are required to have a label, but vertices do not.
The names of labels between vertices and edges cannot overlap.
See CREATE
clause for information about how to make entities with labels.
Both vertices and edges may have properties. Properties are attribute values, and each attribute name should be defined only as a string type.
A vertex is the basic entity of the graph, with the unique attribute of being able to exist in and of itself.
A vertex may be assigned a label.
A vertex may have zero or more outgoing edges.
A vertex may have zero or more incoming edges.
Table H.2. Data Format
| Attribute Name | Description |
|---|---|
| id | Graph ID for this vertex |
| label | Name of the label this vertex has |
| properties | Properties associated with this vertex |
{id:1; label: 'label_name'; properties: {prop1: value1, prop2: value2}}::vertex
Type casting a map to a vertex:
SELECT *
FROM cypher('graph_name', $$
WITH {id: 0, label: 'label_name', properties: {i: 0}}::vertex as v
RETURN v
$$) AS (v agtype);
v
------------------------------------------------------------------
{"id": 0, "label": "label_name", "properties": {"i": 0}}::vertex
(1 row)
An edge is an entity that encodes a directed connection between exactly two nodes, the source node and the target node. An outgoing edge is a directed relationship from the point of view of its source node. An incoming edge is a directed relationship from the point of view of its target node. An edge is assigned exactly one edge type.
Table H.3. Data Format
| Attribute Name | Description |
|---|---|
| id | Graph ID for this edge |
| startid | Graph ID for the source node |
| endid | Graph ID for the target node |
| label | Name of the label this vertex has |
| properties | Properties associated with this vertex |
{id: 3; startid: 1; endid: 2; label: 'edge_label' properties{prop1: value1, prop2: value2}}::edge
Type casting a map to an edge:
SELECT *
FROM cypher('graph_name', $$
WITH {id: 2, start_id: 0, end_id: 1, label: 'label_name', properties: {i: 0}}::edge as e
RETURN e
$$) AS (e agtype);
e
--------------------------------------------------------------------------------------------
{"id": 2, "label": "label_name", "end_id": 1, "start_id": 0, "properties": {"i": 0}}::edge
(1 row)
A path is a series of alternating vertices and edges. A path must start with a vertex, and have at least one edge.
Type casting a list to a path:
SELECT *
FROM cypher('graph_name', $$
WITH [{id: 0, label: 'label_name_1', properties: {i: 0}}::vertex,
{id: 2, start_id: 0, end_id: 1, label: 'edge_label', properties: {i: 0}}::edge,
{id: 1, label: 'label_name_2', properties: {}}::vertex
]::path as p
RETURN p
$$) AS (p agtype);
p
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{"id": 0, "label": "label_name_1", "properties": {"i": 0}}::vertex, {"id": 2, "label": "edge_label", "end_id": 1, "start_id": 0, "properties": {"i": 0}}::edge, {"id": 1, "label": "label_name_2", "properties": {}}::vertex]::path
(1 row)
apache_age already has good semantics for equality within the primitive types (booleans, strings, integers, and floats) and maps. Furthermore, Cypher has good semantics for comparability and orderability for integers, floats, and strings, within each of the types. However, working with values of different types deviates from Postgres Pro defined logic and the openCypher specification:
Comparability between values of different types is defined.
This deviation is particularly pronounced when it occurs as
part of the evaluation of predicates (in WHERE).
ORDER BY will not fail if the values passed
to it have different types.
The underlying conceptual model is complex and sometimes
inconsistent. This leads to an unclear relationship between
comparison operators, equality, grouping, and ORDER BY.
Comparability and orderability are aligned with each other
consistently, as all types can be ordered and compared. The
difference between equality and equivalence, as exposed by
IN, =,
DISTINCT, and grouping,
in apache_age is limited to
testing two instances of the value null to each other. In
equality, null = null is
null. In equivalence, used by
DISTINCT and when grouping values, two null
values are always treated as being the same value. However,
equality treats null values differently if they are an element of
a list or a map value.
The openCypher specification features four distinct concepts related to equality and ordering:
Comparability is used by the inequality operators (>, <, >=, <=) and defines the underlying semantics of how to compare two values.
Equality is used by the equality operators (=, <>), and
the list membership operator (IN). It
defines the underlying semantics to determine if two values
are the same in these contexts. Equality is also used
implicitly by literal maps in node and relationship patterns,
since such literal maps are merely a shorthand notation for
equality predicates.
Orderability is used by the ORDER BY
clause, and defines the underlying semantics of how to order
values.
Equivalence is used by the DISTINCT
modifier and by grouping in projection clauses
(WITH, RETURN), and
defines the underlying semantics to determine if two values
are the same in these contexts.
Comparison operators need to function as one would expect comparison operators to function — equality and comparability. But, at the same time, they need to allow the sorting of column data — equivalence and orderability.
Unfortunately, it may not be possible to implement separate comparison operators for equality and comparison operations, and, equivalence and orderability operations, in Postgres Pro, for the same query. So we prioritize equivalence and orderability over equality and comparability to allow for ordering of output data.
Comparability is defined between any pair of values, as specified below.
Numbers
Numbers of different types (excluding NaN values and
the Infinities) are compared to each other as if both
numbers would have been coerced to arbitrary precision
bigdecimal (currently outside the
Cypher type system) before comparing them with each other
numerically in ascending order.
Comparison to any value that is not also number follows the rules of orderability.
Floats do not have the required precision to represent
all of the whole numbers in the range of agtype
integer and agtype numeric. When
casting an integer or agtype
numeric to a float, unexpected results can
occur when casting values in the high and low range.
Integers
Integers are compared numerically in ascending order.
Floats
Floats (excluding NaN values and the Infinities) are compared numerically in ascending order.
Positive infinity is of type
FLOAT, equal to itself and
greater than any other number, except NaN values.
Negative infinity is of type
FLOAT, equal to itself and less
than any other number.
NaN values are comparable to each and greater than any other float value.
Numeric
Numerics are compared numerically in ascending order.
Booleans
Booleans are compared such that false is less than true.
Comparison to any value that is not also a boolean follows the rules of orderability.
Strings
Strings are compared in dictionary order,
i.e. characters are compared pairwise in ascending
order from the start of the string to the end.
Characters missing in a shorter string are considered
to be less than any other character. For example,
'a' < 'aa'.
Comparison to any value that is not also a string follows the rules of orderability.
Lists
Lists are compared in sequential order, i.e. list
elements are compared pairwise in ascending order from
the start of the list to the end. Elements missing in
a shorter list are considered to be less than any
other value (including null values),
for example, [1] < [1, 0] but also
[1] < [1, null].
Comparison to any value that is not also a list follows the rules of orderability.
Maps
The comparison order for maps is unspecified and left to implementations.
The comparison order for maps must align with the
equality semantics outlined below. In consequence, any
map that contains an entry that maps its key to a null
value is incomparable. For example,
{a: 1} <= {a: 1, b: null}
evaluates to null.
Comparison to any value that is not also a regular map follows the rules of orderability.
Entities:
Vertices: The comparison order for vertices is
based on the assigned graphid.
Edges: The comparison order for edges is based
on the assigned graphid.
Paths: Paths are compared as if they were a list of alternating nodes and
relationships of the path from the start node to the end node.
For example, given nodes n1,
n2, n3, and
relationships r1 and r2,
and given that n1 < n2 < n3 and
r1 < r2, then the path
p1 from n1 to
n3 via r1 would be less
than the path p2 to n1
from n2 via r2.
Paths are expressed in terms of lists:
p1 < p2 <=> [n1, r1, n3] < [n1, r2, n2] <=> n1 < n1 || (n1 = n1 && [r1, n3] < [r2, n2]) <=> false || (true && [r1, n3] < [r2, n2]) <=> [r1, n3] < [r2, n2] <=> r1 < r2 || (r1 = r2 && n3 < n2) <=> true || (false && false) <=> true
Comparison to any value that is not also a path will return false.
NULL: null is incomparable with any other
value (including other null values.)
agtype Types #
The ordering of different agtype types,
when using <, <=, >, >= from the smallest value
to the largest value is:
Path
Edge
Vertex
Object
Array
String
Bool
Numeric, Integer, Float
NULL
This is subject to change in future releases.
SELECT * FROM cypher('graph_name', $$
CREATE (:Person {name: 'John'}),
(:Person {name: 'Jeff'}),
(:Person {name: 'Joan'}),
(:Person {name: 'Bill'})
$$) AS (result agtype);
Performs case-sensitive prefix searching on strings.
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
WHERE v.name STARTS WITH "J"
RETURN v.name
$$) AS (names agtype);
names
--------
"John"
"Jeff"
"Joan"
(3 rows)
Performs case-sensitive inclusion searching in strings.
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
WHERE v.name CONTAINS "o"
RETURN v.name
$$) AS (names agtype);
names
--------
"John"
"Joan"
(2 rows)
Performs case-sensitive suffix searching on strings.
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
WHERE v.name ENDS WITH "n"
RETURN v.name
$$) AS (names agtype);
names
--------
"John"
"Joan"
(2 rows)
apache_age supports the use of
POSIX regular
expressions using the =~
operator. By default =~ is case sensitve.
The =~ operator when no special
characters are given, act like the =
operator.
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
WHERE v.name =~ 'John'
RETURN v.name
$$) AS (names agtype);
names
--------
"John"
(1 row)
Adding (?i) at the beginning of the
string will make the comparison case insensitive.
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
WHERE v.name =~ '(?i)JoHn'
RETURN v.name
$$) AS (names agtype);
names
--------
"John"
(1 row)
. Wildcard #
The . operator acts as a wildcard
to match any single character.
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
WHERE v.name =~ 'Jo.n'
RETURN v.name
$$) AS (names agtype);
names
--------
"John"
"Joan"
(2 rows)
* Wildcard #
The * wildcard after a character will
match to 0 or more of the previous character.
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
WHERE v.name =~ 'Johz*n'
RETURN v.name
$$) AS (names agtype);
names
--------
"John"
(1 row)
+ Operator #
The + operator matches to 1 or more
the previous character.
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
WHERE v.name =~ 'Bil+'
RETURN v.name
$$) AS (names agtype);
names
--------
"Bill"
(1 row)
. and * Wildcards Together #
You can use the . and *
wildcards together to represent the rest of the string.
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
WHERE v.name =~ 'J.*'
RETURN v.name
$$) AS (names agtype);
names
--------
"John"
"Jeff"
"Joan"
(3 rows)
Operator precedence in apache_age is shown below:
Table H.4. Operator Precedence
| Precedence | Operator | Description |
|---|---|---|
| 1 | . | Property Access |
| 2 | [] | Map and List Subscripting |
| () | Function Call | |
| 3 | STARTS WITH | Case-sensitive prefix searching on strings |
| ENDS WITH | Case-sensitive suffix searching on strings | |
| CONTAINS | Case-sensitive inclusion searching on strings | |
| =~ | Regular expression string matching | |
| 4 | - | Unary Minus |
| 5 | IN | Checking if an element exists in a list |
| IS NULL | Checking a value is NULL | |
| IS NOT NULL | Checking a value is not NULL | |
| 6 | ^ | Exponentiation |
| 7 | * / % | Multiplication, division and remainder |
| 8 | + - | Addition and Subtraction |
| 9 | = <> | For relational = and ≠ respectively |
| < <= | For relational < and ≤ respectively | |
| > >= | For relational > and ≥ respectively | |
| 10 | NOT | Logical NOT |
| 11 | AND | Logical AND |
| 12 | OR | Logical OR |
Generally an aggregation aggr(expr) processes
all matching rows for each aggregation key found in an incoming
record (keys are compared using
equivalence).
In a regular aggregation (i.e. of the form
aggr(expr)), the list of aggregated values is
the list of candidate values with all null values removed from it.
SELECT * FROM cypher('graph_name', $$
CREATE (a:Person {name: 'A', age: 13}),
(b:Person {name: 'B', age: 33, eyes: "blue"}),
(c:Person {name: 'C', age: 44, eyes: "blue"}),
(d1:Person {name: 'D', eyes: "brown"}),
(d2:Person {name: 'D'}),
(a)-[:KNOWS]->(b),
(a)-[:KNOWS]->(c),
(a)-[:KNOWS]->(d1),
(b)-[:KNOWS]->(d2),
(c)-[:KNOWS]->(d2)
$$) as (a agtype);
To calculate aggregated data, Cypher offers aggregation,
analogous to SQL GROUP BY.
Aggregating functions take a set of values and calculate an
aggregated value over them. Examples are
avg()
that calculates the average of multiple numeric values, or
min()
that finds the smallest numeric or string value in a set of
values. When we say below that an aggregating function operates
on a set of values, we mean these to be the result of the
application of the inner expression (such as
n.age) to all the records within the same
aggregation group.
Aggregation can be computed over all the matching subgraphs, or it can be further divided by introducing grouping keys. These are non-aggregate expressions, that are used to group the values going into the aggregate functions.
Assume we have the following RETURN statement:
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
RETURN v.name, count(*)
$$) as (grouping_key agtype, count agtype);
grouping_key | count
--------------+-------
"A" | 1
"D" | 2
"B" | 1
"C" | 1
(4 rows)
We have two return expressions: grouping_key,
and count(*). The first,
grouping_key, is not an aggregate function,
and so it will be the grouping key. The latter,
count(*) is an aggregate expression. The
matching subgraphs will be divided into different buckets,
depending on the grouping key. The aggregate function will then
be run on these buckets, calculating an aggregate value per
bucket.
To use aggregations to sort the result set, the aggregation must
be included in the RETURN to be used in the
ORDER BY.
SELECT *
FROM cypher('graph_name', $$
MATCH (me:Person)-[]->(friend:Person)
RETURN count(friend), me
ORDER BY count(friend)
$$) as (friends agtype, me agtype);
In a distinct aggregation (i.e. of the form
aggr(DISTINCT expr)), the list of aggregated
values is the list of candidate values with all null values
removed from it. Furthermore, in a distinct aggregation, only
one of all equivalent candidate values is included in the list
of aggregated values, i.e. duplicates under equivalence are
removed.
The DISTINCT operator works in conjunction
with aggregation. It is used to make all values unique before
running them through an aggregate function.
SELECT *
FROM cypher('graph_name', $$
MATCH (v:Person)
RETURN count(DISTINCT v.eyes), count(v.eyes)
$$) as (distinct_eyes agtype, eyes agtype);
distinct_eyes | eyes
---------------+------
2 | 3
(1 row)
This feature of not requiring the user to specify their grouping keys for a query allows for ambiguity on what Cypher should qualify as their grouping keys.
SELECT * FROM cypher('graph_name', $$
CREATE (:L {a: 1, b: 2, c: 3}),
(:L {a: 2, b: 3, c: 1}),
(:L {a: 3, b: 1, c: 2})
$$) as (a agtype);
apache_age solution to this problem is to not allow a
WITH or RETURN column to
combine aggregate functions with variables that are not
explicitly listed in another column of the same
WITH or RETURN clause.
SELECT * FROM cypher('graph_name', $$
MATCH (x:L)
RETURN x.a + count(*) + x.b + count(*) + x.c
$$) as (a agtype);
ERROR: 'x' must be either part of an explicitly listed key or used inside an aggregate function
LINE 3: RETURN x.a + count(*) + x.b + count(*) + x.c
Columns that do not include an aggregate function in apache_age are
considered to be the grouping keys for that
WITH or RETURN clause.
For the above query, the user could rewrite the query in several ways that will return results.
SELECT * FROM cypher('graph_name', $$
MATCH (x:L)
RETURN (x.a + x.b + x.c) + count(*) + count(*), x.a + x.b + x.c
$$) as (count agtype, key agtype);
count | key
-------+-----
12 | 6
(1 row)
x.a + x.b + x.c is the grouping key.
Grouping keys created like this must include parenthesis.
SELECT * FROM cypher('graph_name', $$
MATCH (x:L)
RETURN x.a + count(*) + x.b + count(*) + x.c, x.a, x.b, x.c
$$) as (count agtype, a agtype, b agtype, c agtype);
count | a | b | c
-------+---+---+---
10 | 3 | 1 | 2
10 | 2 | 3 | 1
10 | 1 | 2 | 3
(3 rows)
x.a, x.b, and
x.c will be considered different grouping
keys.
Alternatively, the grouping key can be a vertex or edge, and
then any properties of the vertex or edge can be specified
without being explicitly stated in a WITH
or RETURN column.
SELECT * FROM cypher('graph_name', $$
MATCH (x:L)
RETURN count(*) + count(*) + x.a + x.b + x.c, x
$$) as (count agtype, key agtype);
count | key
-------+----------------------------------------------------------------------------------------
8 | {"id": 1407374883553283, "label": "L", "properties": {"a": 3, "b": 1, "c": 2}}::vertex
8 | {"id": 1407374883553281, "label": "L", "properties": {"a": 1, "b": 2, "c": 3}}::vertex
8 | {"id": 1407374883553282, "label": "L", "properties": {"a": 2, "b": 3, "c": 1}}::vertex
(3 rows)
Results will be grouped on x, because it is
safe to assume that properties considered unnecessary for
grouping are unambiguous.
If the grouping key is considered unnecessary for the query
output, the aggregation can be done in a
WITH clause then passing information to the
RETURN clause.
SELECT * FROM cypher('graph_name', $$
MATCH (x:L)
WITH count(*) + count(*) + x.a + x.b + x.c as column, x
RETURN column
$$) as (a agtype);
a
---
8
8
8
(3 rows)
You can use the following instructions to create a graph from the files:
User must create graph and labels before loading data from files.
Following are the details about the functions to create vertices and edges from the file.
Function load_labels_from_file is used to
load vertices from CSV files.
load_labels_from_file('graph_name',
'label_name',
'file_path')
By adding the fourth parameter user can exclude the
id field.
Use this when there is no id field in the file.
load_labels_from_file('graph_name',
'label_name',
'file_path',
false)
Function load_edges_from_file can be used to
load edges from the CSV file. See the file structure below.
Make sure that IDs in the edge file are identical to ones that are in vertices files.
load_edges_from_file('graph_name',
'label_name',
'file_path');
Following is the explanation about the structure for CSV files for vertices and edges.
A CSV file for nodes is formatted as follows:
Table H.5. CSV File Format for Nodes
| Field name | Field description |
|---|---|
| id |
The first column of the file. All values
are a positive integer. This is an optional
field when id_field_exists is
false. However, it should be present when
id_field_exists is
not set to false.
|
| Properties | All other columns contain the properties for the nodes. Header row contains the name of property. |
Similarly, a CSV file for edges is formatted as follows:
Table H.6. CSV File Format for Edges
| Field name | Field description |
|---|---|
| start_id |
Node ID of the node from where the edge is started.
This ID is present in nodes.csv file.
|
| start_vertex_type | Class of the node |
| end_id | End ID of the node at which the edge is terminated. |
| end_vertex_type | Class of the node |
| properties | Properties of the edge. The header contains the property name. |
Load apache_age and create a graph.
LOAD 'age';
SET search_path TO ag_catalog;
SELECT create_graph('agload_test_graph');
Create label Country and load vertices
from the CSV file. Note that this CSV file has
the id field.
SELECT create_vlabel('agload_test_graph','Country');
SELECT load_labels_from_file('agload_test_graph',
'Country',
'/age/regress/age_load/data/countries.csv');
Create label City and load vertices from the
CSV file. Note that this CSV file has
the id field.
SELECT create_vlabel('agload_test_graph','City');
SELECT load_labels_from_file('agload_test_graph',
'City',
'/age/regress/age_load/data/cities.csv');
Create label has_city and load edges from
the CSV file.
SELECT create_elabel('agload_test_graph','has_city');
SELECT load_edges_from_file('agload_test_graph', 'has_city',
'/age/regress/age_load/data/edges.csv');
Check if the graph has been loaded properly.
SELECT table_catalog, table_schema, table_name, table_type
FROM information_schema.tables
WHERE table_schema = 'agload_test_graph';
SELECT COUNT(*) FROM agload_test_graph."Country";
SELECT COUNT(*) FROM agload_test_graph."City";
SELECT COUNT(*) FROM agload_test_graph."has_city";
SELECT COUNT(*) FROM cypher('agload_test_graph', $$MATCH(n) RETURN n$$) as (n agtype);
SELECT COUNT(*) FROM cypher('agload_test_graph', $$MATCH (a)-[e]->(b) RETURN e$$) as (n agtype);
Create label Country2 and load vertices
from the CSV file. Note that this CSV file has no
id field.
SELECT create_vlabel('agload_test_graph','Country2');
SELECT load_labels_from_file('agload_test_graph',
'Country2',
'/age/regress/age_load/data/countries.csv',
false);
Create label City2 and load vertices
from CSV file. Note this CSV file has no id field.
SELECT create_vlabel('agload_test_graph','City2');
SELECT load_labels_from_file('agload_test_graph',
'City2',
'/age/regress/age_load/data/cities.csv',
false);
Check if the graph has been loaded properly and perform difference analysis between IDs created automatically and picked from the files.
Labels Country and
City were created with the
id field in the file.
Labels Country2 and
City2 were created with no
id field in the file.
SELECT COUNT(*) FROM agload_test_graph."Country2";
SELECT COUNT(*) FROM agload_test_graph."City2";
SELECT id FROM agload_test_graph."Country" LIMIT 10;
SELECT id FROM agload_test_graph."Country2" LIMIT 10;
SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country {iso2 : 'BE'})
RETURN id(n), n.name, n.iso2 $$) as ('id(n)' agtype, 'n.name' agtype, 'n.iso2' agtype);
SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country2 {iso2 : 'BE'})
RETURN id(n), n.name, n.iso2 $$) as ('id(n)' agtype, 'n.name' agtype, 'n.iso2' agtype);
SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country {iso2 : 'AT'})
RETURN id(n), n.name, n.iso2 $$) as ('id(n)' agtype, 'n.name' agtype, 'n.iso2' agtype);
SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country2 {iso2 : 'AT'})
RETURN id(n), n.name, n.iso2 $$) as ('id(n)' agtype, 'n.name' agtype, 'n.iso2' agtype);
SELECT drop_graph('agload_test_graph', true);
The MATCH clause allows you to specify the
patterns Cypher will search for in the database. This is the
primary way of getting data into the current set of bindings. It
is worth reading up more on the specification of the patterns
themselves in Section 9.7.
MATCH is often coupled to a
WHERE part, which adds restrictions, or
predicates, to the MATCH patterns, making them
more specific. The predicates are part of the pattern description,
and should not be considered a filter applied only after the
matching is done. This means that WHERE should
always be put together with the MATCH clause it
belongs to.
MATCH can occur at the beginning of the query or
later, possibly after a WITH. If it is the first
clause, nothing will have been bound yet, and Cypher will design a search
to find the results matching the clause and any associated
predicates specified in any WHERE part.
Vertices and edges found by this search are available as bound
pattern elements, and can be used for pattern matching of
sub-graphs. They can also be used in any future clauses, where
Cypher will use the known elements, and from there find further
unknown elements.
Cypher is declarative, and so usually the query itself does not
specify the algorithm to use to perform the search. Predicates in
WHERE parts can be evaluated before pattern
matching, during pattern matching, or after finding matches.
By just specifying a pattern with a single vertex and no labels, all vertices in the graph will be returned.
SELECT * FROM cypher('graph_name', $$
MATCH (v)
RETURN v
$$) as (v agtype);
v
-------------------------------------------------------------------------------
{id: 0; label: 'Person'; properties: {name: 'Charlie Sheen'}}::vertex
{id: 1; label: 'Person'; properties: {name: 'Martin Sheen'}}::vertex
{id: 2; label: 'Person'; properties: {name: 'Michael Douglas'}}::vertex
{id: 3; label: 'Person'; properties: {name: 'Oliver Stone'}}::vertex
{id: 4; label: 'Person'; properties: {name: 'Rob Reiner'}}::vertex
{id: 5; label: 'Movie'; properties: {name: 'Wall Street'}}::vertex
{id: 6; label: 'Movie'; properties: {title: 'The American President'}}::vertex
(7 rows)
Returns all the vertices in the database.
Getting all vertices with a label on them is done with a single node pattern where the vertex has a label on it.
SELECT * FROM cypher('graph_name', $$
MATCH (movie:Movie)
RETURN movie.title
$$) as (title agtype);
title
------------------------
'Wall Street'
'The American President'
(2 rows)
Returns all the movies in the database.
The symbol -[]- means related to, without
regard to type or direction of the edge.
SELECT * FROM cypher('graph_name', $$
MATCH (director {name: 'Oliver Stone'})-[]-(movie)
RETURN movie.title
$$) as (title agtype);
title
-------------
'Wall Street'
(1 row)
Returns all the movies directed by “Oliver Stone”.
To constrain your pattern with labels on vertices, you add it to your vertex in the pattern, using the label syntax.
SELECT * FROM cypher('graph_name', $$
MATCH (:Person {name: 'Oliver Stone'})-[]-(movie:Movie)
RETURN movie.title
$$) as (title agtype);
title
-------------
'Wall Street'
(1 row)
Returns any vertices connected with the
Person “Oliver” that are
labeled Movie.
When the direction of an edge is of interest, it is shown by
using -> or <-.
SELECT * FROM cypher('graph_name', $$
MATCH (:Person {name: 'Oliver Stone'})-[]->(movie)
RETURN movie.title
$$) as (title agtype);
title
-------------
'Wall Street'
(1 row)
Returns any vertices connected with the
Person “Oliver” by an outgoing
edge.
If a variable is required, either for filtering on properties of the edge, or to return the edge, this is how you introduce the variable.
SELECT * FROM cypher('graph_name', $$
MATCH (:Person {name: 'Oliver Stone'})-[r]->(movie)
RETURN type(r)
$$) as (title agtype);
title
----------
'DIRECTED'
(1 row)
Returns the type of each outgoing edge from “Oliver”.
When you know the edge type you want to match on, you can specify it by using a colon together with the edge type.
SELECT * FROM cypher('graph_name', $$
MATCH (:Movie {title: 'Wall Street'})<-[:ACTED_IN]-(actor)
RETURN actor.name
$$) as (actors_name agtype);
actors_name
-----------------
'Charlie Sheen'
'Martin Sheen'
'Michael Douglas'
(3 rows)
Returns all actors that ACTED_IN
“Wall Street”.
If you both want to introduce a variable to hold the edge, and specify the edge type you want, just add them both.
SELECT * FROM cypher('graph_name', $$
MATCH ({title: 'Wall Street'})<-[r:ACTED_IN]-(actor)
RETURN r.role
$$) as (role agtype);
role
--------------
'Gordon Gekko'
'Carl Fox'
'Bud Fox'
(3 rows)
Returns ACTED_IN roles for “Wall
Street”.
Edges can be expressed by using multiple statements in the
form of ()-[]-(), or they can be strung
together.
SELECT * FROM cypher('graph_name', $$
MATCH (charlie {name: 'Charlie Sheen'})-[:ACTED_IN]->(movie)<-[:DIRECTED]-(director)
RETURN movie.title, director.name
$$) as (title agtype, name agtype);
title | name
--------------+--------------
'Wall Street' | 'Oliver Stone'
(1 row)
Returns the movie “Charlie Sheen” acted in and its director.
When the connection between two vertices is of variable length, the list of edges that form the connection can be returned using the following connection.
Rather than describing a long path using a sequence of many vertex and edge descriptions in a pattern, many edges (and the intermediate vertices) can be described by specifying a length in the edge description of a pattern.
(u)-[*2]->(v)
Which describes a right directed path of three vertices and two edges can be rewritten to:
(u)-[]->()-[]->(v)
A range length can also be given:
(u)-[*3..5]->(v)
Which is equivalent to:
(u)-[]->()-[]->()-[]->(v) and (u)-[]->()-[]->()-[]->()-[]->(v) and (u)-[]->()-[]->()-[]->()-[]->()-[]->(v)
The previous example provided gave the path both a lower and
upper bound for the number of edges (and vertices) between
u and v. Either one or
both of these binding values can be excluded.
(u)-[*3..]->(v)
Returns all paths between u and
v that have three or more edges included.
(u)-[*..5]->(v)
Returns all paths between u and
v that have 5 or fewer edges included.
(u)-[*]->(v)
Returns all paths between u and
v.
SELECT * FROM cypher('graph_name', $$
MATCH p = (actor {name: 'Willam Dafoe'})-[:ACTED_IN*2]-(co_actor)
RETURN relationships(p)
$$) as (r agtype);
r
------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{id: 0; label:"ACTED_IN"; properties: {role: "Green Goblin"}}::edge, {id: 1; label: "ACTED_IN; properties: {role: "Spiderman", actor: "Toby Maguire}}::edge]
[{id: 0; label:"ACTED_IN"; properties: {role: "Green Goblin"}}::edge, {id: 2; label: "ACTED_IN; properties: {role: "Spiderman", actor: "Andrew Garfield"}}::edge]
(2 rows)
Returns the list of edges, including the one that “Willam Dafoe” acted in and the two “Spiderman” actors he worked with.
Using WITH, you can manipulate the output
before it is passed on to the following query parts. The
manipulations can be of the shape and/or number of entries in
the result set.
WITH can also, like
RETURN, alias expressions that are introduced
into the results using the aliases as the binding name.
WITH is also used to separate the reading of
the graph from updating of the graph. Every part of a query must
be either read-only or write-only. When going from a writing
part to a reading part, the switch can be done with an optional
WITH clause.
Aggregated results have to pass through a
WITH clause to be able to filter on.
SELECT *
FROM cypher('graph_name', $$
MATCH (david {name: 'David'})-[]-(otherPerson)-[]->()
WITH otherPerson, count(*) AS foaf
WHERE foaf > 1
RETURN otherPerson.name
$$) as (name agtype);
name
--------
"Anders"
(1 row)
The name of the person connected to “David” with the at least more than one outgoing relationship will be returned by the query.
collect #
You can sort your results before passing them to
collect, thus sorting the resulting list.
SELECT *
FROM cypher('graph_name', $$
MATCH (n)WITH n
ORDER BY n.name DESC LIMIT 3
RETURN collect(n.name)
$$) as (names agtype);
names
-------------------------
["Emil","David","Ceasar"]
(1 row)
A list of the names of people in reverse order, limited to 3, is returned.
You can match paths, limit to a certain number, and then match again using those paths as a base, as well as any number of similar limited searches.
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'Anders'})-[]-(m)WITH m
ORDER BY m.name DESC LIMIT 1
MATCH (m)-[]-(o)
RETURN o.name
$$) as (name agtype);
name
-------
"Anders"
"Bossman"
(2 rows)
Starting at “Anders”, find all matching nodes, order by name descending and get the top result, then find all the nodes connected to that top result, and return their names.
In the RETURN part of your query, you define
which parts of the pattern you are interested in. It can be nodes,
relationships, or properties on these.
To return a node, list it in the RETURN
statement.
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'B'})
RETURN n
$$) as (n agtype);
n
---------------------------------------------------
{id: 0; label: '' properties: {name: 'B'}}::vertex
(1 row)
The example will return the node.
To return n edges, just include it in the
RETURN list.
SELECT *
FROM cypher('graph_name', $$
MATCH (n)-[r:KNOWS]->()
WHERE n.name = 'A'
RETURN r
$$) as (r agtype);
r
-------------------------------------------------------------------
{id: 2; startid: 0; endid: 1; label: 'KNOWS' properties: {}}::edge
(1 row)
The relationship is returned by the example.
To return a property, use the dot separator, like this:
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'A'})
RETURN n.name
$$) as (name agtype);
name
------
'A'
(1 row)
The value of the property name gets returned.
When you want to return all vertices, edges and paths found in a
query, you can use the * symbol.
SELECT *
FROM cypher('graph_name', $$
MATCH (a {name: 'A'})-[r]->(b)
RETURN *
$$) as (a agtype, b agtype, r agtype);
a | b | r
--------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------
{"id": 281474976710659, "label": "", "properties": {"age": 55, "name": "A", "happy": "Yes!"}}::vertex | {"id": 1125899906842625, "label": "BLOCKS", "end_id": 281474976710660, "start_id": 281474976710659, "properties": {}}::edge | {"id": 281474976710660, "label": "", "properties": {"name": "B"}}::vertex
{"id": 281474976710659, "label": "", "properties": {"age": 55, "name": "A", "happy": "Yes!"}}::vertex | {"id": 1407374883553281, "label": "KNOWS", "end_id": 281474976710660, "start_id": 281474976710659, "properties": {}}::edge | {"id": 281474976710660, "label": "", "properties": {"name": "B"}}::vertex
(2 rows)
This returns the two vertices, and the edge used in the query.
To introduce a placeholder that is made up of characters that
are not contained in the English alphabet, you can use the
` to enclose the variable, like this:
SELECT *
FROM cypher('graph_name', $$
MATCH (`This isn\'t a common variable`)
WHERE `This isn\'t a common variable`.name = 'A'
RETURN `This isn\'t a common variable`.happy
$$) as (happy agtype);
happy
-------
"Yes!"
(1 row)
The node with name “A” is returned.
If the name of the field should be different from the expression used, you can rename it by changing the name in the column list definition.
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'A'})
RETURN n.name
$$) as (objects_name agtype);
objects_name
-------------
'A'
(1 row)
Returns the property of a node, but renames the field.
If a property might or might not be there, you can still select
it as usual. It will be treated as null if it is missing.
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
RETURN n.age
$$) as (age agtype);
age
-----
55
NULL
(2 rows)
This example returns age when the node has
that property, or null if the property is not there.
Any expression can be used as a return item—literals, predicates, properties, functions, and everything else.
SELECT *
FROM cypher('graph_name', $$
MATCH (a)
RETURN a.age > 30, 'I\'m a literal', id(a)
$$) as (older_than_30 agtype, literal agtype, id agtype);
older_than_30 | literal | id
---------------+-----------------+----
true | 'I'm a literal' | 1
(1 row)
Returns a predicate, a literal and function call with a pattern expression parameter.
DISTINCT retrieves only unique records
depending on the fields that have been selected to output.
SELECT *
FROM cypher('graph_name', $$
MATCH (a {name: 'A'})-[]->(b)
RETURN DISTINCT b
$$) as (b agtype);
b
----------------------------------------------------
{id: 1; label: '' properties: {name: 'B'}}::vertex
(1 row)
The node named “B” is returned by the query, but only once.
ORDER BY is a sub-clause following
WITH, and it specifies that the output should
be sorted and how.
Note that you cannot sort on nodes or relationships, just on
properties on these. ORDER BY relies on
comparisons to sort the output, see
ordering and comparison of values.
In terms of scope of variables, ORDER BY
follows special rules, depending on if the projecting
RETURN or WITH clause is
either aggregating or DISTINCT. If it is an
aggregating or DISTINCT projection, only the
variables available in the projection are available. If the
projection does not alter the output cardinality (which
aggregation and DISTINCT do), variables
available from before the projecting clause are also available.
When the projection clause shadows already existing variables,
only the new variables are available.
Lastly, it is not allowed to use aggregating expressions in the
ORDER BY sub-clause if they are not also
listed in the projecting clause. This last rule is to make sure
that ORDER BY does not change the results,
only the order of them.
ORDER BY is used to sort the output.
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
WITH n.name as name, n.age as age
ORDER BY n.name
RETURN name, age
$$) as (name agtype, age agtype);
name | age
--------+-----
"A" | 34
"B" | 34
"C" | 32
(3 rows)
The nodes are returned, sorted by their name.
You can order by multiple properties by stating each variable in
the ORDER BY clause. Cypher will sort the
result by the first variable listed, and for equal values, go to
the next property in the ORDER BY clause, and
so on.
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
WITH n.name as name, n.age as age
ORDER BY n.age, n.name
RETURN name, age
$$) as (name agtype, age agtype);
name | age
--------+-----
"C" | 32
"A" | 34
"B" | 34
(3 rows)
This returns the nodes, sorted first by their age, and then by their name.
By adding DESC[ENDING] after the variable to
sort on, the sort will be done in reverse order.
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
WITH n.name AS name, n.age AS age
ORDER BY n.name DESC
RETURN name, age
$$) as (name agtype, age agtype);
name | age
--------+-----
"C" | 32
"B" | 34
"A" | 34
(3 rows)
The example returns the nodes, sorted by their name in reverse order.
When sorting the result set, null will always come at the end of
the result set for ascending sorting, and first when doing
descending sort.
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
WITH n.name AS name, n.age AS age, n.height AS height
ORDER BY n.height
RETURN name, age, height
$$) as (name agtype, age agtype, height agtype);
name | age | height
--------+-----+--------
"A" | 34 | 170
"C" | 32 | 185
"B" | 34 | NULL
(3 rows)
The nodes are returned sorted by the length property, with a node without that property last.
SKIP defines from which record to start
including the records in the output.
By using SKIP, the result set will get
trimmed from the top. Please note that no guarantees are made on
the order of the result unless the query specifies the
ORDER BY clause. SKIP
accepts any expression that evaluates to a positive integer.
To return a subset of the result, starting from the top, use this syntax:
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
RETURN n.name
ORDER BY n.name
SKIP 3
$$) as (names agtype);
names
-------
"D"
"E"
(2 rows)
The node is returned, and no property age exists on it.
To return a subset of the result, starting in the middle, use this syntax:
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
RETURN n.name
ORDER BY n.name
SKIP 1
LIMIT 2
$$) as (names agtype);
names
-------
"B"
"C"
(2 rows)
Two vertices from the middle are returned.
Using an expression with SKIP to return a
subset of the rows.
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
RETURN n.name
ORDER BY n.name
SKIP (3 * rand())+ 1
$$) as (a agtype);
names
-------
"C"
"D"
"E"
(3 rows)
The first two vertices are skipped, and only the last three are returned in the result.
LIMIT constrains the number of records in the
output.
LIMIT accepts any expression that evaluates
to a positive integer.
To return a subset of the result, starting from the top, use this syntax:
SELECT *
FROM cypher('graph_name', $$
MATCH (n)RETURN n.name
ORDER BY n.name
LIMIT 3
$$) as (names agtype);
names
"A"
"B"
"C"
3 rows
The node is returned, and no property age exists on it.
LIMIT accepts any expression that evaluates
to a positive integer as long as it is not referring to any
external variables:
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
RETURN n.name
ORDER BY n.name
LIMIT toInteger(3 * rand()) + 1
$$) as (names agtype);
names
-------
"A"
"B"
(2 rows)
Returns one to three top items.
The CREATE clause is used to create graph
vertices and edges.
A CREATE clause that is not followed by
another clause is called a terminal clause. When the Cypher query
ends with a terminal clause, no results will be returned from
the Cypher function call. However, the Cypher function call
still requires a column list definition. When the Cypher query ends with a
terminal node, define a single value in the column list
definition: no data will be returned in this variable.
SELECT *
FROM cypher('graph_name', $$
CREATE /* Create clause here, no following clause */
$$) as (a agtype);
a
---
(0 rows)
Creating a single vertex is done by issuing the following query.
SELECT *
FROM cypher('graph_name', $$
CREATE (n)
$$) as (v agtype);
v
---
(0 rows)
Nothing is returned from this query.
Creating multiple vertices is done by separating them with a comma.
SELECT *
FROM cypher('graph_name', $$
CREATE (n), (m)
$$) as (v agtype);
a
-------
(0 rows)
To add a label when creating a vertex, use the syntax below.
SELECT *
FROM cypher('graph_name', $$
CREATE (:Person)
$$) as (v agtype);
v
-------
(0 rows)
Nothing is returned from this query.
When creating a new vertex with labels, you can add properties at the same time.
SELECT *
FROM cypher('graph_name', $$
CREATE (:Person {name: 'Andres', title: 'Developer'})
$$) as (n agtype);
n
-------
(0 rows)
Nothing is returned from this query.
Creating a single node is done by issuing the following query.
SELECT *
FROM cypher('graph_name', $$
CREATE (a {name: 'Andres'})
RETURN a
$$) as (a agtype);
a
--------------------------------------------------------------------------------
{"id": 281474976710660, "label": "", "properties": {"name": "Andres"}}::vertex
(1 row)
The newly-created node is returned.
To create an edge between two vertices, we first get the two vertices. Once the nodes are loaded, we simply create an edge between them.
SELECT *
FROM cypher('graph_name', $$
MATCH (a:Person), (b:Person)
WHERE a.name = 'Node A' AND b.name = 'Node B'
CREATE (a)-[e:RELTYPE]->(b)
RETURN e
$$) as (e agtype);
e
-----------------------------------------------------------------------
{id: 3; startid: 0, endid: 1; label: 'RELTYPE'; properties: {}}::edge
(1 row)
The created edge is returned by the query.
Setting properties on edges is done in a similar manner to how it is done when creating vertices. Note that the values can be any expression.
SELECT *
FROM cypher('graph_name', $$
MATCH (a:Person), (b:Person)
WHERE a.name = 'Node A' AND b.name = 'Node B'
CREATE (a)-[e:RELTYPE {name:a.name + '<->' + b.name}]->(b)
RETURN e
$$) as (e agtype);
e
---------------------------------------------------------------------------------------------------
{id: 3; startid: 0, endid: 1; label: 'RELTYPE'; properties: {name: 'Node A<->Node B'}}::edge
(1 row)
The newly-created edge is returned by the example query.
When you use CREATE and a pattern, all parts
of the pattern that are not already in scope at this time will
be created.
SELECT *
FROM cypher('graph_name', $$
CREATE p = (andres {name:'Andres'})-[:WORKS_AT]->(neo)<-[:WORKS_AT]-(michael {name:'Michael'})
RETURN p
$$) as (p agtype);
p
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{"id": 281474976710661, "label": "", "properties": {"name": "Andres"}}::vertex, {"id": 1407374883553282, "label": "WORKS_AT", "end_id": 281474976710662, "start_id": 281474976710661, "properties": {}}::edge, {"id": 281474976710662, "label": "", "properties": {}}::vertex, {"id": 1407374883553281, "label": "WORKS_AT", "end_id": 281474976710662, "start_id": 281474976710663, "properties": {}}::edge, {"id": 281474976710663, "label": "", "properties": {"name": "Michael"}}::vertex]::path
(1 row)
This query creates three nodes and two relationships in one go, assigns it to a path variable, and returns it.
The DELETE clause is used to delete graph
elements — nodes, relationships, or paths.
A DELETE clause that is not followed by
another clause is called a terminal clause. When the Cypher query
ends with a terminal clause, no results will be returned from
the Cypher function call. However, the Cypher function call
still requires a column list definition. When the Cypher query ends with a
terminal node, define a single value in the column list
definition: no data will be returned in this variable.
For removing properties, see Section H.1.17.4.
You cannot delete a node without also deleting edges that start
or end on said vertex. Either explicitly delete the vertices,or
use DETACH DELETE.
To delete a vertex, use the DELETE clause.
SELECT *
FROM cypher('graph_name', $$
MATCH (v:Useless)
DELETE v
$$) as (v agtype);
v
-------
(0 rows)
This will delete the vertices (with label Useless) that have no edges. Nothing is returned from this query.
Running a MATCH clause will collect all nodes,
use the DETACH option to delete vertice edges
first, and then delete the vertex itself.
SELECT *
FROM cypher('graph_name', $$
MATCH (v:Useless)
DETACH DELETE v
$$) as (v agtype);
v
-------
(0 rows)
Nothing is returned from this query.
To delete an edge, use the MATCH clause to
find your edges, then add the variable to the DELETE.
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'Andres'})-[r:KNOWS]->()
DELETE r
$$) as (v agtype);
v
-------
(0 rows)
Nothing is returned from this query.
In apache_age, you can return vertices that have been deleted.
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'A'})
DELETE n
RETURN n
$$) as (a agtype);
a
---------------------------------------------------------------------------
{"id": 281474976710659, "label": "", "properties": {"name": "A"}}::vertex
(1 rows)
The SET clause is used to update labels on
nodes and properties on vertices and edges.
A SET clause that is not followed by another
clause is called a terminal clause. When the Cypher query ends
with a terminal clause, no results will be returned from the
Cypher function call. However, the Cypher function call still
requires a column list definition. When the Cypher query ends with a
terminal node, define a single value in the column list
definition: no data will be returned in this variable.
To set a property on a node or relationship, use
SET.
SELECT *
FROM cypher('graph_name', $$
MATCH (v {name: 'Andres'})
SET v.surname = 'Taylor'
$$) as (v agtype);
v
-------
(0 rows)
The newly changed node is returned by the query.
Creating a single vertex is done by issuing the following query.
SELECT *
FROM cypher('graph_name', $$
MATCH (v {name: 'Andres'})
SET v.surname = 'Taylor'
RETURN v
$$) as (v agtype);
v
-----------------------------------------------------------------------------------------------------
{id: 3; label: 'Person'; properties: {surname:"Taylor", name:"Andres", age:36, hungry:true}}::vertex
(1 row)
The newly changed vertex is returned by the query.
Normally you remove a property by using
REMOVE, but it's sometimes handy to do it
using the SET command. One example is if the
property comes from a parameter.
SELECT *
FROM cypher('graph_name', $$
MATCH (v {name: 'Andres'})
SET v.name = NULL
RETURN v
$$) as (v agtype);
v
---------------------------------------------------------------------------------------
{id: 3; label: 'Person'; properties: {surname:"Taylor", age:36, hungry:true}}::vertex
(1 row)
The node is returned by the query, and the name property is now missing.
If you want to set multiple properties in one go, simply separate them with a comma.
SELECT *
FROM cypher('graph_name', $$
MATCH (v {name: 'Andres'})
SET v.position = 'Developer', v.surname = 'Taylor'
RETURN v
$$) as (v agtype);
v
------------------------------------------------------------------------------------------------------------------------------
{"id": 281474976710661, "label": "", "properties": {"name": "Andres", "surname": "Taylor", "position": "Developer"}}: :vertex
(1 row)
The REMOVE clause is used to remove properties
from vertex and edges.
A REMOVE clause that is not followed by
another clause is called a terminal clause. When the Cypher query
ends with a terminal clause, no results will be returned from
the Cypher function call. However, the Cypher function call
still requires a column list definition. When the Cypher query ends with a
terminal node, define a single value in the column list
definition: no data will be returned in this variable.
Cypher does not allow storing null in
properties. Instead, if no value exists, the property is just
not there. So, removing a property value on a node or a
relationship is also done with REMOVE.
SELECT *
FROM cypher('graph_name', $$
MATCH (andres {name: 'Andres'})
REMOVE andres.age
RETURN andres
$$) as (andres agtype);
andres
---------------------------------------------------------------
{id: 3; label: 'Person'; properties: {name:"Andres"}}::vertex
(1 row)
The node is returned, and no property age exists on it.
The MERGE clause ensures that a pattern exists
in the graph. Either the pattern already exists, or it needs to be
created.
MERGE either matches existing nodes, or creates
new data. It is a combination of MATCH and
CREATE.
For example, you can specify that the graph must contain a node
for a user with a certain name. If there is not a node with the
correct name, a new node will be created and its name property
set. When using MERGE on full patterns, the
behavior is that either the whole pattern matches, or the whole
pattern is created. MERGE will not partially
use existing patterns. If partial matches are needed, this can be
accomplished by splitting a pattern up into multiple
MERGE clauses.
As with MATCH, MERGE can
match multiple occurrences of a pattern. If there are multiple
matches, they will all be passed on to later stages of the query.
SELECT * from cypher('graph_name', $$
CREATE (A:Person {name: 'Charlie Sheen', bornIn: 'New York'}),
(B:Person {name: 'Michael Douglas', bornIn: 'New Jersey'}),
(C:Person {name: 'Rob Reiner', bornIn: 'New York'}),
(D:Person {name: 'Oliver Stone', bornIn: 'New York'}),
(E:Person {name: 'Martin Sheen', bornIn: 'Ohio'})
$$) as (result agtype);
By just specifying a pattern with a single vertex and no labels, all vertices in the graph will be returned.
SELECT * FROM cypher('graph_name', $$
MERGE (v:Critic)
RETURN v
$$) as (v agtype);
v
-------------------------------------------------
{id: 0; label: 'Critic': properties:{}}::vertex
(1 row)
If there exists a vertex with the label “Critic”, the vertex returns. Otherwise, the vertex is created and returned.
Merging a vertex node with properties where not all properties match any existing vertex.
SELECT * FROM cypher('graph_name', $$
MERGE (charlie {name: 'Charlie Sheen', age: 10})
RETURN charlie
$$) as (v agtype);
v
------------------------------------------------------------------------------
{id: 0; label: 'Actor': properties:{name: 'Charlie Sheen', age: 10}}::vertex
(1 row)
If there exists a vertex with the label “Critic”, the vertex returns. Otherwise, the vertex is created and returned.
If a vertex with all the properties exists, it is returned. Otherwise, a new vertex with the name “Charlie Sheen” will be created and returned.
Merging a vertex where both label and property constraints match an existing vertex.
SELECT * FROM cypher('graph_name', $$
MERGE (michael:Person {name: 'Michael Douglas'})
RETURN michael.name, michael.bornIn
$$) as (Name agtype, BornIn agtype);
name | bornin
-------------------+--------------
"Michael Douglas" | "New Jersey"
(1 row)
“Michael Douglas” will match the existing vertex,
and the vertex name and
bornIn properties are returned.
Predicates are boolean functions that return true or false for a
given set of input. They are most commonly used to filter out
subgraphs in the WHERE part of a query.
exists(property agtype) returns agtype boolean
#
exists() returns true if
the specified property exists in the node, relationship or map.
This is different from the EXISTS clause.
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
WHERE exists(n.surname)
RETURN n.first_name, n.last_name
$$) as (first_name agtype, last_name agtype);
first_name | last_name
------------+------------
'John' | 'Smith'
'Patty' | 'Patterson'
(2 rows)
exists(path agtype) returns agtype boolean
#
exists() returns true
if for the given path, there already exists the given path.
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
WHERE exists((n)-[]-({name: 'Willem Defoe'}))
RETURN n.full_name
$$) as (full_name agtype);
full_name
--------------
'Toby Maguire'
'Tom Holland'
(2 rows)
id(expression agtype) returns agtype integer
#
id() returns the ID of a vertex or edge.
SELECT *
FROM cypher('graph_name', $$
MATCH (a)
RETURN id(a)
$$) as (id agtype);
id
----
0
1
2
3
(4 rows)
start_id(expression agtype) returns agtype integer
#
start_id() returns the ID of the vertex that
is the starting vertex for the edge.
SELECT *
FROM cypher('graph_name', $$
MATCH ()-[e]->()
RETURN start_id(e)
$$) as (start_id agtype);
start_id
----------
0
1
2
3
(4 rows)
end_id(expression agtype) returns agtype integer
#
end_id() returns the ID of the vertex that
is the ending vertex for the edge.
SELECT *
FROM cypher('graph_name', $$
MATCH ()-[e]->()
RETURN end_id(e)
$$) as (end_id agtype);
end_id
--------
4
5
6
7
(4 rows)
type(edge agtype) returns agtype string
#
type() returns the string representation of
the edge type.
SELECT *
FROM cypher('graph_name', $$
MATCH ()-[e]->()
RETURN type(e)
$$) as (type agtype);
type
------
'KNOWS'
'KNOWS'
(2 rows)
properties(expression agtype) returns agtype map
#
properties() returns an agtype map
containing all the properties of a vertex or edge. If the
argument is already a map, it is returned unchanged.
properties(null) returns null.
SELECT *
FROM cypher('graph_name', $$
CREATE (p:Person {name: 'Stefan', city: 'Berlin'})
RETURN properties(p)
$$) as (type agtype);
type
--------------------------------------
{"city": "Berlin", "name": "Stefan"}
(1 row)
head(list agtype) returns agtype
#
head() returns the first element in
an agtype list. head(null) returns
null. If the first element in the list is
null, head(list) will return
null.
SELECT *
FROM cypher('graph_name', $$
MATCH (a)
WHERE a.name = 'Eskil'
RETURN a.array, head(a.array)
$$) as (lst agtype, lst_head agtype);
lst | lst_head
-----------------------+----------
["one","two","three"] | "one"
(1 row)
last(list agtype) returns agtype
#
last() returns the last element in
an agtype list. last(null) returns
null. If the last element in the list is
null, last(list) will return
null.
SELECT *
FROM cypher('graph_name', $$
MATCH (a)
WHERE a.name = 'Eskil'
RETURN a.array, last(a.array)
$$) as (lst agtype, lst_tail agtype);
lst | lst_tail
-----------------------+----------
["one","two","three"] | "three"
(1 row)
length(path agtype) returns agtype integer
#
length() returns the length of a path.
length(null) returns null.
SELECT *
FROM cypher('graph_name', $$
MATCH p = (a)-[]->(b)-[]->(c)
WHERE a.name = 'Alice'
RETURN length(p)
$$) as (length_of_path agtype);
length_of_path
----------------
2
2
2
(3 rows)
size(list variadic "any") returns agtype integer
#
size() returns the length of a list.
size(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN size(['Alice', 'Bob'])
$$) as (size_of_list agtype);
size_of_list
--------------
2
(1 row)
startNode(edge agtype) returns agtype
#
startNode() returns the start node of an edge.
startNode(null) returns null.
SELECT *
FROM cypher('graph_name', $$
MATCH (x:Developer)-[r]-()
RETURN startNode(r)
$$) as (v agtype);
v
------------------------------------------
Node[0]{name:"Alice",age:38,eyes:"brown"}
Node[0]{name:"Alice",age:38,eyes:"brown"}
(2 rows)
endNode(edge agtype) returns agtype
#
endNode() returns the end node of an edge.
endNode(null) returns null.
SELECT *
FROM cypher('graph_name', $$
MATCH (x:Developer)-[r]-()
RETURN endNode(r)
$$) as (v agtype);
v
-------------------------------------------
Node[2]{name:"Charlie",age:53,eyes:"green"}
Node[1]{name:"Bob",age:25,eyes:"blue"}
(2 rows)
timestamp() returns agtype integer
#
timestamp() returns the difference, measured
in milliseconds, between the current time and midnight, January
1, 1970 UTC. timestamp will return the same
value during one entire query, even for long-running queries.
SELECT *
FROM cypher('graph_name', $$
RETURN timestamp()
$$) as (t agtype);
t
---------------
1613496720760
(1 row)
toBoolean(expression variadic "any") returns agtype boolean
#
toBoolean() converts a string value to a
boolean value. toBoolean(null) returns
null. If expression is a boolean value,
it will be returned unchanged. If the parsing fails,
null will be returned.
SELECT *
FROM cypher('graph_name', $$
RETURN toBoolean('TRUE'), toBoolean('not a boolean')
$$) as (a_bool agtype, not_a_bool agtype);
a_bool | not_a_bool
--------+------------
true | NULL
(1 row)
toFloat(expression variadic "any") returns agtype float
#
toFloat() converts an integer or string value
to a floating point number. toFloat(null) returns
null. If expression is a floating point number,
it will be returned unchanged. If the parsing fails,
null will be returned.
SELECT *
FROM cypher('graph_name', $$
RETURN toFloat('11.5'), toFloat('not a number')
$$) as (a_float agtype, not_a_float agtype);
a_float | not_a_float
---------+-------------
11.5 | NULL
(1 row)
toInteger(expression variadic "any") returns agtype integer
#
toInteger() converts a floating point or
string value to an integer value. toInteger(null)
returns null. If expression is an integer value,
it will be returned unchanged. If the parsing fails,
null will be returned.
SELECT *
FROM cypher('graph_name', $$
RETURN toInteger('42'), toInteger('not a number')
$$) as (an_integer agtype, not_an_integer agtype);
an_integer | not_an_integer
------------+----------------
42 | NULL
(1 row)
coalesce(expression agtype [, expression agtype]*) returns agtype
#
coalesce() returns the first non-null value
in the given list of expressions. null will
be returned if all the arguments are null.
SELECT *
FROM cypher('graph_name', $$
MATCH (a)
WHERE a.name = 'Alice'
RETURN coalesce(a.hairColor, a.eyes), a.hair_color, a.eyes
$$) as (color agtype, hair_color agtype, eyes agtype);
color | hair_color | eyes
-------+------------+--------
“brown”| NULL | “Brown”
(1 row)
SELECT * from cypher('graph_name', $$
CREATE (A:Person {name: 'Alice', age: 38, eyes: 'brown'}),
(B:Person {name: 'Bob', age: 25, eyes: 'blue'}),
(C:Person {name: 'Charlie', age: 53, eyes: 'green'}),
(D:Person {name: 'Daniel', age: 54, eyes: 'brown'}),
(E:Person {name: 'Eskil', age: 41, eyes: 'blue', array: ['one', 'two', 'three']}),
(A)-[:KNOWS]->(B),
(A)-[:KNOWS]->(C),
(B)-[:KNOWS]->(D),
(C)-[:KNOWS]->(D),
(B)-[:KNOWS]->(E)
$$) as (result agtype);
keys(expression agtype) returns agtype list
#
keys() returns a list containing the string
representations for all the property names of a vertex, edge,
or map. keys(null) returns null.
SELECT * from cypher('graph_name', $$
MATCH (a)
WHERE a.name = 'Alice'
RETURN keys(a)
$$) as (result agtype);
result
-------------------------
["age", "eyes", "name"]
(1 row)
A list containing the names of all the properties on the vertex
bound to a is returned.
range(start variadic "any", end variadic "any" [, step variadic "any"]) returns agtype list
#
range() returns a list comprising all integer
values within a range bounded by a start value
start and end value
end, where the difference
step between any two
consecutive values is constant; i.e. an arithmetic progression.
The range is inclusive, and the arithmetic progression will
therefore always contain
start and — depending on the
values of start,
step, and
end — end.
SELECT *
FROM cypher('graph_name', $$
RETURN range(0, 10), range(2, 18, 3)
$$) as (no_step agtype, step agtype);
no_step | step
------------------------------------+-----------------------
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | [2, 5, 8, 11, 14, 17]
(1 row)
Two lists of numbers in the given ranges are returned.
labels(vertex agtype) returns agtype list
#
labels() returns a list containing the string
representations for all the labels of a node.
labels(null) returns
null.
SELECT *
FROM cypher('graph_name', $$
MATCH (a)
WHERE a.name = 'Alice'
RETURN labels(a)
$$) as (edges agtype);
edges
------------
["Person"]
(1 row)
A list containing all the labels of the node bound to
a is returned.
nodes(path agtype) returns agtype list
#
nodes() returns a list containing all
the vertices in a path. nodes(null) returns
null.
SELECT *
FROM cypher('graph_name', $$
MATCH p = (a)-[]->(b)-[]->(c)
WHERE a.name = 'Alice' AND c.name = 'Eskil'
RETURN nodes(p)
$$) as (vertices agtype);
vertices
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{"id": 844424930131969, "label": "Person", "properties": {"age": 38, "eyes": "brown", "name": "Alice"}}::vertex, {"id": 844424930131970, "label": "Person", "properties": {"age": 25, "eyes": "blue", "name": "Bob"}}::vertex, {"id": 844424930131973, "label": "Person", "properties": {"age": 41, "eyes": "blue", "name": "Eskil", "array": ["one", "two", "three"]}}::vertex]
(1 row)
A list containing all the vertices in the path
p is returned.
relationships(path agtype) returns agtype list
#
relationships() returns a list containing all
the relationships in a path. relationships(null)
returns null.
SELECT *
FROM cypher('graph_name', $$
MATCH p = (a)-[]->(b)-[]->(c)
WHERE a.name = 'Alice' AND c.name = 'Eskil'
RETURN relationships(p)
$$) as (edges agtype);
edges
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{"id": 1125899906842625, "label": "KNOWS", "end_id": 844424930131970, "start_id": 844424930131969, "properties": {}}::edge, {"id": 1125899906842629, "label": "KNOWS", "end_id": 844424930131973, "start_id": 844424930131970, "properties": {}}::edge]
(1 row)
A list containing all the edges in the path p
is returned.
toBooleanList(list variadic "any") returns agtype list
#
toBooleanList() converts a list of values and
returns a list of boolean values. If any values are not
convertible to boolean they will be null in
the list returned.
Any null element in list is preserved. Any
boolean value in list is preserved. If the list is
null, null
will be returned.
SELECT * FROM cypher('expr', $$
RETURN toBooleanList(['true', 'false', 'true'])
$$) AS (toBooleanList agtype);
toBooleanList
--------------------
[true, false, true]
(1 row)
rand() returns agtype float
#
rand() returns a random floating point number
in the range from 0 (inclusive) to 1 (exclusive); i.e.[0,1). The
numbers returned follow an approximate uniform distribution.
SELECT *
FROM cypher('graph_name', $$
RETURN rand()
$$) as (random_number agtype);
random_number
-------------------
0.3586784748902053
(1 row)
abs(list variadic "any") returns agtype
#
abs() returns the absolute value of the given
number. abs(null) returns null.
If expression is negative,
-(
(i.e. the negation of expression) is returned.
expression)
SELECT *
FROM cypher('graph_name', $$
MATCH (a), (e) WHERE a.name = 'Alice' AND e.name = 'Eskil'
RETURN a.age, e.age, abs(a.age - e.age)
$$) as (alice_age agtype, eskil_age agtype, difference agtype);
alice_age | eskil_age | difference
-----------+-----------+------------
38 | 41 | 3
(1 row)
The absolute value of the age difference is returned.
ceil(expression variadic "any") returns agtype float
#
ceil() returns the smallest floating point
number that is greater than or equal to the given number and
equal to a mathematical integer. ceil(null)
returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN ceil(0.1)
$$) as (ceil_value agtype);
ceil_value
------------
1.0
(1 row)
The ceiling of 0.1 is returned.
floor(expression variadic "any") returns agtype float
#
floor() returns the greatest floating point
number that is less than or equal to the given number and equal
to a mathematical integer. floor(null)
returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN floor(0.1)
$$) as (flr agtype);
flr
-----
0.0
(1 row)
The floor of 0.1 is returned.
round(expression variadic "any") returns agtype float
#
round() returns the value of the given number
rounded to the nearest integer. round(null)
returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN round(3.141592)
$$) as (rounded_value agtype);
rounded_value
---------------
3.0
(1 row)
sign(expression variadic "any") returns agtype integer
#
sign() returns the signum of the given
number: 0 if the number is 0, -1 for any negative number, and 1
for any positive number. sign(null) returns
null.
SELECT *
FROM cypher('graph_name', $$
RETURN sign(-17), sign(0.1), sign(0)
$$) as (negative_sign agtype, positive_sign agtype, zero_sign agtype);
negative_sign | positive_sign | zero_sign
---------------+---------------+-----------
-1 | 1 | 0
(1 row)
The signs of -17 and 0.1 are returned.
e() returns agtype float
#
e() returns the base of the natural logarithm,
e.
SELECT *
FROM cypher('graph_name', $$
RETURN e()
$$) as (e agtype);
e
-------------------
2.718281828459045
(1 row)
sqrt(expression variadic "any") returns agtype float
#
sqrt() returns the square root of a number.
SELECT *
FROM cypher('graph_name', $$
RETURN sqrt(144)
$$) as (results agtype);
results
---------
12.0
(1 row)
exp(expression variadic "any") returns agtype float
#
exp() returns e^n,
where e is the base of the
natural logarithm, and n is the value of the argument
expression. exp(null) returns
null.
SELECT *
FROM cypher('graph_name', $$
RETURN exp(2)
$$) as (e agtype);
e
------------------
7.38905609893065
(1 row)
e to the power of 2 is returned.
log(expression variadic "any") returns agtype float
#
log() returns the natural logarithm of a
number. log(null) returns
null. log(0) returns
null.
SELECT *
FROM cypher('graph_name', $$
RETURN log(27)
$$) as (natural_logarithm agtype);
natural_logarithm
-------------------
3.295836866004329
(1 row)
The natural logarithm of 27 is returned.
log10(expression variadic "any") returns agtype float
#
log10() returns the common logarithm (base
10) of a number. log10(null) returns
null. log10(0) returns
null.
SELECT *
FROM cypher('graph_name', $$
RETURN log10(27)
$$) as (common_logarithm agtype);
common_logarithm
--------------------
1.4313637641589874
(1 row)
The common logarithm of 27 is returned.
degrees(expression variadic "any") returns agtype float
#
degrees() converts radians to degrees.
degrees(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN degrees(3.14159)
$$) as (deg agtype);
deg
-------------------
179.9998479605043
(1 row)
The number of degrees close to pi is returned.
radians(expression variadic "any") returns agtype float
#
radians() converts degrees to radians.
radians(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN radians(180)
$$) as (rad agtype);
rad
-------------------
3.141592653589793
(1 row)
The number of degrees close to pi is returned.
pi() returns agtype float
#
pi() returns the mathematical constant pi.
SELECT *
FROM cypher('graph_name', $$
RETURN pi()
$$) as (p agtype);
p
-------------------
3.141592653589793
(1 row)
The constant pi is returned.
sin(expression variadic "any") returns agtype float
#
sin() returns the sine of a number.
sin(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN sin(0.5)
$$) as (s agtype);
s
-------------------
0.479425538604203
(1 row)
The sine of 0.5 is returned.
cos(expression variadic "any") returns agtype float
#
cos() returns the sine of a number.
cos(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN cos(0.5)
$$) as (c agtype);
c
--------------------
0.8775825618903728
(1 row)
The cosine of 0.5 is returned.
tan(expression variadic "any") returns agtype float
#
tan() returns the tangent of a number.
tan(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN tan(0.5)
$$) as (t agtype);
t
--------------------
0.5463024898437905
(1 row)
The tangent of 0.5 is returned.
cot(expression variadic "any") returns agtype float
#
cot() returns the cotangent of a number.
cot(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN cot(0.5)
$$) as (t agtype);
t
-------------------
1.830487721712452
(1 row)
The cotangent of 0.5 is returned.
asin(expression variadic "any") returns agtype float
#
asin() returns the arcsine of a number.
asin(null) returns null.
If (expression < -1) or (expression > 1), then
asin(expression) returns
null.
SELECT *
FROM cypher('graph_name', $$
RETURN asin(0.5)
$$) as (arc_s agtype);
arc_s
--------------------
0.5235987755982989
(1 row)
The arcsine of 0.5 is returned.
acos(expression variadic "any") returns agtype float
#
acos() returns the arcsine of a number.
acos(null) returns null.
If (expression < -1) or (expression > 1), then
acos(expression) returns
null.
SELECT *
FROM cypher('graph_name', $$
RETURN acos(0.5)
$$) as (arc_c agtype);
arc_c
--------------------
1.0471975511965979
(1 row)
The arccosine of 0.5 is returned.
atan(expression variadic "any") returns agtype float
#
atan() returns the arctangent of a number.
atan(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN atan(0.5)
$$) as (arc_t agtype);
arc_t
--------------------
0.4636476090008061
(1 row)
The arctangent of 0.5 is returned.
atan2(expression1 variadic "any", expression2 variadic "any") returns agtype float
#
atan2() returns the arctangent of a set of
coordinates in radians.
atan2(null, null),
atan2(null, expression2) and
atan(expression1, null) all return null.
SELECT *
FROM cypher('graph_name', $$
RETURN atan2(0.5, 0.6)
$$) as (arc_t2 agtype);
arc_t2
--------------------
0.6947382761967033
(1 row)
The arctangent of 0.5 and 0.6 is returned.
replace(original, search variadic "any", replace variadic "any") returns agtype string
#
replace() returns a string in which all
occurrences of a specified string in the original string have
been replaced by another (specified) string.
If any argument is null,
null will be returned.
If search is not found in original,
original will be returned.
SELECT *
FROM cypher('graph_name', $$
RETURN replace('hello', 'l', 'w')
$$) as (str_array agtype);
str_array
-----------
"hewwo"
(1 row)
split(original variadic "any", split_delimiter variadic "any") returns list of agtype strings
#
split() returns a list of strings resulting
from the splitting of the original string around matches of the
given delimiter.
split(null, splitDelimiter) and
split(original, null) both return
null.
SELECT *
FROM cypher('graph_name', $$
RETURN split('one,two', ',')
$$) as (split_list agtype);
split_list
----------------
["one", "two"]
(1 row)
left(original variadic "any", length variadic "any") returns agtype string
#
left() returns a string containing the
specified number of leftmost characters of the original string.
left(null, length) and
left(null, null) both return
null.
left(original, null) will raise an error.
If length is not a positive integer, an
error is raised.
If length exceeds the size of
original, original is
returned.
SELECT *
FROM cypher('graph_name', $$
RETURN left('Hello', 3)
$$) as (new_str agtype);
new_str
---------
"Hel"
(1 row)
right(original variadic "any", length variadic "any") returns agtype string
#
right() returns a string containing the
specified number of rightmost characters of the original string.
right(null, length) and
right(null, null) both return
null.
right(original, null) will raise an
error.
If length is not a positive integer, an
error is raised.
If length exceeds the size of
original, original is
returned.
SELECT *
FROM cypher('graph_name', $$
RETURN right('hello', 3)
$$) as (new_str agtype);
new_str
---------
"llo"
(1 row)
substring(original variadic "any", start variadic "any" [, length variadic "any"]) returns agtype string
#
substring() returns a substring of the
original string, beginning with a 0-based index start and
length.
start uses a zero-based index.
If length is omitted, the function
returns the substring starting at the position given by
start and extending to the end of
original.
If original is null,
null is returned.
If either start or
length is null or a
negative integer, an error is raised.
If start is 0, the substring will start
at the beginning of original.
If length is 0, the empty string will be
returned.
SELECT *
FROM cypher('graph_name', $$
RETURN substring('hello', 1, 3), substring('hello', 2)
$$) as (sub_str1 agtype, sub_str2 agtype);
sub_str1 | sub_str2
----------+----------
"ell" | "llo"
(1 row)
rTrim(original variadic "any") returns agtype string
#
rTrim() returns the original string with
trailing whitespace removed. rTrim(null)
returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN rTrim(' hello ')
$$) as (right_trimmed_str agtype);
right_trimmed_str
-------------------
" hello"
(1 row)
lTrim(original variadic "any") returns agtype string
#
lTrim() returns the original string with
leading whitespace removed. lTrim(null)
returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN lTrim(' hello ')
$$) as (left_trimmed_str agtype);
left_trimmed_str
------------------
"hello "
(1 row)
trim(original variadic "any") returns agtype string
#
trim() returns the original string with
leading and trailing whitespace removed.
trim(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN trim(' hello ')
$$) as (trimmed_str agtype);
trimmed_str
-------------
"hello"
(1 row)
toLower(original variadic "any") returns agtype string
#
toLower() returns the original string in
lowercase.
toLower(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN toLower('HELLO')
$$) as (lower_str agtype);
lower_str
-----------
"hello"
(1 row)
toUpper(original variadic "any") returns agtype string
#
toUpper() returns the original string in
uppercase.
toUpper(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN toUpper('hello')
$$) as (upper_str agtype);
upper_str
-----------
"HELLO"
(1 row)
reverse(original variadic "any") returns agtype string
#
reverse() returns a string in which the order
of all characters in the original string have been reversed.
reverse(null) returns null.
SELECT *
FROM cypher('graph_name', $$
RETURN reverse('hello')
$$) as (reverse_str agtype);
reverse_str
-------------
"olleh"
(1 row)
toString(expression agtype) returns string
#
toString() converts an integer,
float or boolean value to a string.
toString(null) returns null.
If expression is a string, it will be returned
unchanged.
SELECT *
FROM cypher('graph_name', $$
RETURN toString(11.5),toString('a string'), toString(true)
$$) as (float_to_str agtype, str_to_str agtype, bool_to_string agtype);
float_to_str | str_to_str | bool_to_string
--------------+------------+----------------
"11.5" | "a string" | "true"
(1 row)
Functions that activate auto aggregation.
LOAD 'age';
SET search_path TO ag_catalog;
SELECT create_graph('graph_name');
SELECT * FROM cypher('graph_name', $$
CREATE (a:Person {name: 'A', age: 13}),
(b:Person {name: 'B', age: 33, eyes: 'blue'}),
(c:Person {name: 'C', age: 44, eyes: 'blue'}),
(d1:Person {name: 'D', eyes: 'brown'}),
(d2:Person {name: 'D'}),
(a)-[:KNOWS]->(b),
(a)-[:KNOWS]->(c),
(a)-[:KNOWS]->(d1),
(b)-[:KNOWS]->(d2),
(c)-[:KNOWS]->(d2)
$$) as (a agtype);
min(expression variadic "any") returns agtype
#
min() returns the minimum value in a set of
values.
Any null values are excluded from the calculation.
In a mixed set, any string value is always considered to be
lower than any numeric value, and any list is always
considered to be lower than any string.
Lists are compared in dictionary order, i.e. list elements
are compared pairwise in ascending order from the start of
the list to the end.
min(null) returns null.
SELECT *
FROM cypher('graph_name', $$
MATCH (v:Person)
RETURN min(v.age)
$$) as (min_age agtype);
min_age
---------
13
(1 row)
The lowest of all the values in the property age
is returned.
To clarify the following example, assume the next three commands are run first:
SELECT * FROM cypher('graph_name', $$
CREATE (:min_test {val:'d'})
$$) as (result agtype);
SELECT * FROM cypher('graph_name', $$
CREATE (:min_test {val:['a', 'b', 23]})
$$) as (result agtype);
SELECT * FROM cypher('graph_name', $$
CREATE (:min_test {val:[1, 'b', 23]})
$$) as (result agtype);
The example below shows using min() with lists:
SELECT *
FROM cypher('graph_name', $$
MATCH (v:min_test)
RETURN min(v.val)
$$) as (min_val agtype);
min_val
----------------
["a", "b", 23]
(1 row)
The lowest of all the values in the set is returned
(in this case, the list ["a", "b", 23]), as
the two lists are considered to be lower values than the
string "d", and the string "a" is considered to be a lower
value than the numerical value 1.
max(expression variadic "any") returns agtype float
#
max() returns the maximum value in a set of
values. Any null values are excluded from the calculation.
In a mixed set, any numeric value is always considered to be
higher than any string value, and any string value is always
considered to be higher than any list.
Lists are compared in dictionary order, i.e. list elements
are compared pairwise in ascending order from the start of
the list to the end.
max(null) returns null.
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN max(n.age)
$$) as (max_age agtype);
max_age
---------
44
(1 row)
The highest of all the values in the property age
is returned.
stDev(expression variadic "any") returns agtype float
#
stDev() returns the standard deviation for
the given value over a group. It uses a standard two-pass
method, with N - 1 as the denominator, and should be used when
taking a sample of the population for an unbiased estimate. When
the standard deviation of the entire population is being
calculated, stDevP
should be used. Any null values are excluded from the calculation.
stDev(null) returns 0.0 (zero).
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN stDev(n.age)
$$) as (stdev_age agtype);
stdev_age
--------------------
15.716233645501712
(1 row)
The standard deviation of the values in the property
age is returned.
stDevP(expression variadic "any") returns agtype float
#
stDev() returns the standard deviation for
the given value over a group. It uses a standard two-pass
method, with N as the denominator, and should be used when
calculating the standard deviation for an entire population.
When the standard deviation of only a sample of the population
is being calculated, stDev
should be used. Any null values are excluded from the calculation.
stDevP(null) returns 0.0 (zero).
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN stDevP(n.age)
$$) as (stdevp_age agtype);
stdevp_age
--------------------
12.832251036613439
(1 row)
The population standard deviation of the values in the property
age is returned.
percentileCont(expression agtype, percentile agtype) returns agtype float
#
percentileCont() returns the percentile of
the given value over a group, with a percentile from 0.0 to 1.0.
It uses a linear interpolation method, calculating a weighted
average between two values if the desired percentile lies
between them. For nearest values using a rounding method, see
percentileDisc.
Any null values are excluded from the calculation.
percentileCont(null, percentile) returns
null.
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN percentileCont(n.age, 0.4)
$$) as (percentile_cont_age agtype);
percentile_cont_age
---------------------
29.0
(1 row)
The 40th percentile of the values in the property
age is returned, calculated with a weighted
average. In this case, 0.4 is the median, or 40th percentile.
percentileDisc(expression agtype, percentile agtype) returns agtype float
#
percentileDisc() returns the percentile of
the given value over a group, with a percentile from 0.0 to 1.0.
It uses a rounding method and calculates the nearest value to
the percentile. For interpolated values, see
percentileCont.
Any null values are excluded from the calculation.
percentileDisc(null, percentile) returns
null.
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN percentileDisc(n.age, 0.5)
$$) as (percentile_disc_age agtype);
percentile_disc_age
---------------------
33.0
(1 row)
The 50th percentile of the values in the property
age is returned.
count(expression agtype) returns agtype integer
#
count() returns the number of values or
records, and appears in two variants:
count(*) returns the number of matching
records.
count(expr) returns the number of
non-null values returned by an expression.
count(*) includes records returning
null. count(expr) ignores
null values. count(null)
returns 0 (zero). count(*) can be used
to return the number of nodes; for example, the number of nodes
connected to some node n.
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'A'})-[]->(x)
RETURN n.age, count(*)
$$) as (age agtype, number_of_people agtype);
age | number_of_people
-----+------------------
13 | 3
(1 row)
The age property of the start node n
(with a name value of "A") and the number of nodes related
to n are returned.
count(*) can be used to group and count
relationship types, returning the number of relationships of
each type.
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'A'})-[r]->()
RETURN type(r), count(*)
$$) as (label agtype, count agtype);
label | count
---------+-------
"KNOWS" | 3
(1 row)
The relationship type and the number of relationships with that type are returned.
Instead of simply returning the number of records with
count(*), it may be more useful to return
the actual number of values returned by an expression.
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'A'})-[]->(x)
RETURN count(x)
$$) as (count agtype);
count
-------
3
(1 row)
The number of nodes connected to the start node
n is returned.
count(expression) can be used to return the
number of non-null values returned by the expression.
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN count(n.age)
$$) as (count agtype);
count
-------
3
(1 row)
The number of nodes with the label Person
that have a non-null value for the age property
is returned.
In the following example we are trying to find all our friends of
friends, and count them. The first aggregate function,
count(DISTINCT friend_of_friend), will only
count a friend_of_friend once, as
DISTINCT removes the duplicates.
The second aggregate function,
count(friend_of_friend), will consider the
same friend_of_friend multiple times.
SELECT *
FROM cypher('graph_name', $$
MATCH (me:Person)-[]->(friend:Person)-[]->(friend_of_friend:Person)
WHERE me.name = 'A'
RETURN count(DISTINCT friend_of_friend), count(friend_of_friend)
$$) as (friend_of_friends_distinct agtype, friend_of_friends agtype);
friend_of_friends_distinct | friend_of_friends
----------------------------+-------------------
1 | 2
(1 row)
Both B and C know D and thus D will get counted twice when not
using DISTINCT.
avg(expression agtype) returns agtype integer
#
avg() returns the average of a set of numeric
values. Any null values are excluded from the calculation.
avg(null) returns null.
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN avg(n.age)
$$) as (avg_age agtype);
avg_age
---------
30.0
(1 row)
The average of all the values in the property age
is returned.
sum(expression agtype) returns agtype float
#
sum() returns the sum of a set of numeric
values. Any null values are excluded from the calculation.
sum(null) returns null.
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN sum(n.age)
$$) as (total_age agtype);
total_age
-----------
90
(1 row)
The sum of all the values in the property age
is returned.
Users may add custom functions to
apache_age. When using the Cypher
function, all function calls with a Cypher query use the default
namespace of: ag_catalog. However, if a user
wants to use a function outside this namespace, they may do so by
adding the namespace before the function name.
Syntax: namespace_name.function_name
SELECT *
FROM cypher('graph_name', $$
RETURN pg_catalog.sqrt(25)
$$) as (result agtype);
result
--------
25
(1 row)
All queries so far have followed the same pattern: a
SELECT clause followed by a single Cypher call
in the FROM clause. However, a Cypher query can
be used in many other ways. This section highlights some more
advanced ways of using the Cypher call within a more complex
SQL/Cypher Hybrid Query.
There are no restrictions to using Cypher with CTEs (Common Table Expression).
WITH graph_query as (
SELECT *
FROM cypher('graph_name', $$
MATCH (n)
RETURN n.name, n.age
$$) as (name agtype, age agtype)
)
SELECT * FROM graph_query;
name | age
-----------+-----
"Andres" | 36
"Tobias" | 25
"Peter" | 35
(3 rows)
A Cypher query can be part of a JOIN clause.
Cypher queries using the CREATE,
SET, REMOVE clauses cannot
be used in SQL queries with joins, as they affect
the Postgres Pro transaction system.
One possible solution is to protect the query with CTEs.
SELECT id,
graph_query.name = t.name as names_match,
graph_query.age = t.age as ages_match
FROM schema_name.sql_person AS t
JOIN cypher('graph_name', $$
MATCH (n:Person)
RETURN n.name, n.age, id(n)
$$) as graph_query(name agtype, age agtype, id agtype)
ON t.person_id = graph_query.id;
id | names_match | ages_match
---+-------------+------------
1 | True | True
2 | False | True
3 | True | False
(3 rows)
Cypher cannot be used in an expression, the query must exists in
the FROM clause of a query. However, if the
Cypher query is placed in a subquery, it will behave as any SQL
style query.
= #
When writing a Cypher query that is known to return 1 column and
1 row, the = comparison operator may be used.
SELECT t.name FROM schema_name.sql_person AS t
where t.name = (
SELECT a
FROM cypher('graph_name', $$
MATCH (v)
RETURN v.name
$$) as (name varchar(50))
ORDER BY name
LIMIT 1);
name | age
----------+-----
"Andres" | 36
(1 rows)
When writing a Cypher query that is known to return 1 column,
but may have multiple rows, the IN operator
may be used.
SELECT t.name, t.age FROM schema_name.sql_person as t
where t.name in (
SELECT *
FROM cypher('graph_name', $$
MATCH (v:Person)
RETURN v.name
$$) as (a agtype));
name | age
-----------+-----
"Andres" | 36
"Tobias" | 25
"Peter" | 35
(3 rows)
When writing a Cypher query that may have more than 1 column and
row returned, the EXISTS operator may be
used.
SELECT t.name, t.age
FROM schema_name.sql_person as t
WHERE EXISTS (
SELECT *
FROM cypher('graph_name', $$
MATCH (v:Person)
RETURN v.name, v.age
$$) as (name agtype, age agtype)
WHERE name = t.name AND age = t.age
);
name | age
-----------+-----
"Andres" | 36
"Tobias" | 25
"Peter" | 35
(3 rows)
There is no restriction to the number of graphs an SQL statement can query, allowing users to query more than one graph at the same time.
SELECT graph_1.name, graph_1.age, graph_2.license_number
FROM cypher('graph_1', $$
MATCH (v:Person)
RETURN v.name, v.age
$$) as graph_1(col_1 agtype, col_2 agtype, col_3 agtype)
JOIN cypher('graph_2', $$
MATCH (v:Doctor)
RETURN v.name, v.license_number
$$) as graph_2(name agtype, license_number agtype)
ON graph_1.name = graph_2.name
name | age | license_number
-----------+-----+----------------
"Andres" | 36 | 1234567890
(1 rows)
Cypher can run a read query within a prepared statement. When using parameters with stored procedures, an SQL parameter must be placed in the Cypher function call. See The apache_age Query Format for details.
A Cypher parameter is in the format of a "$"
followed by an identifier. Unlike Postgres Pro
parameters, Cypher parameters start with a letter, followed by
an alphanumeric string of arbitrary length.
Example: $parameter_name
Preparing prepared statements in Cypher is an extension of
Postgres Pro stored procedure system. Use the
PREPARE clause to create a query with the
Cypher function call in it. Do not place
Postgres Pro style
parameters in the Cypher query call, instead place Cypher
parameters in the query and place a
Postgres Pro parameter as the
third argument in the Cypher function call.
PREPARE cypher_stored_procedure(agtype) AS
SELECT *
FROM cypher('expr', $$
MATCH (v:Person)
WHERE v.name = $name //Cypher parameter
RETURN v
$$, $1) //An SQL Parameter must be placed in the Cypher function call
AS (v agtype);
When executing the prepared statement, place an agtype map with
the parameter values where the Postgres Pro
parameter in the Cypher function call is. The value must be an
agtype map or an error will be thrown. Exclude the "$"
for parameter names.
EXECUTE cypher_prepared_statement('{'name': 'Tobias'}');
Cypher commands can be run in PL/pgSQL functions without restriction.
SELECT *
FROM cypher('imdb', $$
CREATE (toby:actor {name: 'Toby Maguire'}),
(tom:actor {name: 'Tom Holland'}),
(willam:actor {name: 'Willam Dafoe'}),
(robert:actor {name: 'Robert Downey Jr'}),
(spiderman:movie {title: 'Spiderman'}),
(no_way_home:movie {title: 'Spiderman: No Way Home'}),
(homecoming:movie {title: 'Spiderman: Homecoming'}),
(ironman:movie {title: 'Ironman'}),
(tropic_thunder:movie {title: 'Tropic Thunder'}),
(toby)-[:acted_in {role: 'Peter Parker', alter_ego: 'Spiderman'}]->(spiderman),
(willam)-[:acted_in {role: 'Norman Osborn', alter_ego: 'Green Goblin'}]->(spiderman),
(toby)-[:acted_in {role: 'Toby Maguire'}]->(tropic_thunder),
(robert)-[:acted_in {role: 'Kirk Lazarus'}]->(tropic_thunder),
(robert)-[:acted_in {role: 'Tony Stark', alter_ego: 'Ironman'}]->(homecoming),
(tom)-[:acted_in {role: 'Peter Parker', alter_ego: 'Spiderman'}]->(homecoming),
(tom)-[:acted_in {role: 'Peter Parker', alter_ego: 'Spiderman'}]->(no_way_home),
(toby)-[:acted_in {role: 'Peter Parker', alter_ego: 'Spiderman'}]->(no_way_home),
(willam)-[:acted_in {role: 'Norman Osborn', alter_ego: 'Green Goblin'}]->(no_way_home)
$$) AS (a agtype);
The example of function creation is shown below.
CREATE OR REPLACE FUNCTION get_all_actor_names()
RETURNS TABLE(actor agtype)
LANGUAGE plpgsql
AS $BODY$
BEGIN
LOAD 'age';
SET search_path TO ag_catalog;
RETURN QUERY
SELECT *
FROM ag_catalog.cypher('imdb', $$
MATCH (v:actor)
RETURN v.name
$$) AS (a agtype);
END
$BODY$;
Execute the query:
SELECT * FROM get_all_actor_names();
actor
--------------------
"Toby Maguire"
"Tom Holland"
"Willam Dafoe"
"Robert Downey Jr"
(4 rows)
It is recommended to add the LOAD 'age' command
and setting the search_path to the function
declaration to ensure that the CREATE FUNCTION
command works consistently.
CREATE OR REPLACE FUNCTION get_actors_who_played_role(role agtype)
RETURNS TABLE(actor agtype, movie agtype)
LANGUAGE plpgsql
AS $function$
DECLARE sql VARCHAR;
BEGIN
load 'age';
SET search_path TO ag_catalog;
sql := format('
SELECT *
FROM cypher(''imdb'', $$
MATCH (actor)-[:acted_in {role: %s}]->(movie:movie)
RETURN actor.name, movie.title
$$) AS (actor agtype, movie agtype);
', role);
RETURN QUERY EXECUTE sql;
END
$function$;
SELECT * FROM get_actors_who_played_role('"Peter Parker"');
actor | movie
----------------+--------------------------
"Toby Maguire" | "Spiderman: No Way Home"
"Toby Maguire" | "Spiderman"
"Tom Holland" | "Spiderman: Homecoming"
"Tom Holland" | "Spiderman: No Way Home"
(4 rows)
apache_age does not support SQL being directly written in Cypher. However, with user-defined functions you can write SQL queries and call them in a Cypher command.
This applies to void and scalar-value functions only. Set returning functions are not currently supported.
Create a function:
CREATE OR REPLACE FUNCTION public.get_event_year(name agtype) returns agtype AS $$
SELECT year::agtype
FROM history AS h
WHERE h.event_name = name::text
LIMIT 1;
$$ LANGUAGE sql;
SELECT * FROM cypher('graph_name', $$
MATCH (e:event)
WHERE e.year < public.get_event_year(e.name)
RETURN n.name
$$) as (n agtype);
n
-------------------
"Apache Con 2021"
(1 row)