Gremlin Drivers and Variants


At this point, readers should be well familiar with the Introduction to this Reference Documentation and will likely be thinking about implementation details specific to the graph provider they have selected as well as the programming language they intend to use. The choice of programming language could have implications for the architecture and design of the application, and the choice itself may be limited by the chosen graph provider. For example, a Remote Gremlin Provider will require the selection of a driver to interact with it. On the other hand, a graph system that is designed for embedded use, like TinkerGraph, needs the Java Virtual Machine (JVM) environment, which is easily accessed with a JVM programming language. If, however, the programming language is not built for the JVM, then it will require Gremlin Server in the architecture as well.

TinkerPop provides an array of drivers in different programming languages as a way to connect to a remote Gremlin Server or Remote Gremlin Provider. Drivers allow the developer to make requests to that remote system and get back results from the TinkerPop-enabled graphs hosted within. A driver can submit Gremlin strings and Gremlin bytecode over this sub-protocol. Gremlin strings are written in the scripting language made available by the remote system that the driver is connecting to (typically, Groovy-based). This connection approach is quite similar to what developers are likely familiar with when using JDBC and SQL.

The preferred approach is to use bytecode-based requests, which essentially allows the ability to craft Gremlin directly in the programming language of choice. As Gremlin makes use of two fundamental programming constructs: function composition and function nesting, it is possible to embed the Gremlin language in any modern programming language. It is a far more natural way to program, because it enables IDE interaction, compile time checks, and language level checks that can help prevent errors prior to execution. The differences between these two approaches were outlined in the Connecting Via Drivers Section, which applies to Gremlin Server, but also to Remote Gremlin Providers.

In addition to the languages and drivers that TinkerPop supports, there are also third-party implementations, as well as extensions to the Gremlin language that might be specific to a particular graph provider. That listing can be found on the TinkerPop home page. Their description is beyond the scope of this documentation.

Tip
When possible, it is typically best to align the version of TinkerPop used on the client with the version supported on the server. While it is not impossible to have a different version between client and server, it may require additional configuration and/or a deeper knowledge of the changes introduced between versions. It’s simply safer to avoid the conflict, when allowed to do so.
Important
Gremlin-Java is the canonical representation of Gremlin and any (proper) Gremlin language variant will emulate its structure as best as possible given the constructs of the host language. A strong correspondence between variants ensures that the general Gremlin reference documentation is applicable to all variants and that users moving between development languages can easily adopt the Gremlin variant for that language.

The following sections describe each language variant and driver that is officially a part of the TinkerPop project, providing more detailed information about usage, configuration and known limitations.

Gremlin-Java

Apache TinkerPop’s Gremlin-Java implements Gremlin within the Java language and can be used by any Java Virtual Machine. Gremlin-Java is considered the canonical, reference implementation of Gremlin and serves as the foundation that all other Gremlin language variants should emulate. As the Gremlin Traversal Machine that processes Gremlin queries is also written in Java, it can be used in all three connection methods described in the Connecting Gremlin Section.

<dependency>
   <groupId>org.apache.tinkerpop</groupId>
   <artifactId>gremlin-core</artifactId>
   <version>x.y.z</version>
</dependency>

<!-- when using Gremlin Server or Remote Gremlin Provider a driver is required -->
<dependency>
   <groupId>org.apache.tinkerpop</groupId>
   <artifactId>gremlin-driver</artifactId>
   <version>x.y.z</version>
</dependency>

<!--
alternatively the driver is packaged as an uberjar with shaded non-optional dependencies including gremlin-core and
tinkergraph-gremlin which are not shaded.
-->
<dependency>
   <groupId>org.apache.tinkerpop</groupId>
   <artifactId>gremlin-driver</artifactId>
   <version>x.y.z</version>
   <classifier>shaded</classifier>
   <!-- The shaded JAR uses the original POM, therefore conflicts may still need resolution -->
   <exclusions>
      <exclusion>
         <groupId>io.netty</groupId>
         <artifactId>*</artifactId>
      </exclusion>
   </exclusions>
</dependency>

Connecting

The pattern for connecting is described in Connecting Gremlin and it basically distills down to creating a GraphTraversalSource. For embedded mode, this involves first creating a Graph and then spawning the GraphTraversalSource:

Graph graph = ...;
GraphTraversalSource g = traversal().withEmbedded(graph);

Using "g" it is then possible to start writing Gremlin. The "g" allows for the setting of many configuration options which affect traversal execution. The Traversal Section describes some of these options and some are only suitable with embedded style usage. For remote options however there are some added configurations to consider and this section looks to address those.

When connecting to Gremlin Server or Remote Gremlin Providers it is possible to configure the DriverRemoteConnection manually as shown in earlier examples where the host and port are provided as follows:

GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using("localhost",8182,"g"));

It is also possible to create it from a configuration. The most basic way to do so involves the following line of code:

GraphTraversalSource g = traversal().withRemote("conf/remote-graph.properties");

The remote-graph.properties file simply provides connection information to the GraphTraversalSource which is used to configure a RemoteConnection. That file looks like this:

gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
gremlin.remote.driver.clusterFile=conf/remote-objects.yaml
gremlin.remote.driver.sourceName=g

The RemoteConnection is an interface that provides the transport mechanism for "g" and makes it possible for that mechanism to be altered (typically by graph providers who have their own protocols). TinkerPop provides one such implementation called the DriverRemoteConnection which enables transport over Gremlin Server protocols using the TinkerPop driver. The driver is configured by the specified gremlin.remote.driver.clusterFile and the local "g" is bound to the GraphTraversalSource on the remote end with gremlin.remote.driver.sourceName which in this case is also "g".

There are other ways to configure the traversal using withRemote() as it has other overloads. It can take an Apache Commons Configuration object which would have keys similar to those shown in the properties file and it can also take a RemoteConnection instance directly. The latter is interesting in that it means it is possible to programmatically construct all aspects of the RemoteConnection. For TinkerPop usage, that might mean directly constructing the DriverRemoteConnection and the driver instance that supplies the transport mechanism. For example, the command shown above could be re-written using programmatic construction as follows:

Cluster cluster = Cluster.open();
GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using(cluster, "g"));

Please consider the following example:

// Gremlin Console (Groovy)
g = traversal().withRemote('conf/remote-graph.properties')
g.V().elementMap()
g.close()

// Java
GraphTraversalSource g = traversal().withRemote("conf/remote-graph.properties");
List<Map<Object, Object>> list = g.V().elementMap().toList();
g.close();

Note the call to close() above. The call to withRemote() internally instantiates a connection via the driver that can only be released by "closing" the GraphTraversalSource. It is important to take that step to release network resources associated with g.

If working with multiple remote TraversalSource instances it is more efficient to construct Cluster and Client objects and then re-use them.

cluster = Cluster.open('conf/remote-objects.yaml')
client = cluster.connect()
g = traversal().withRemote(DriverRemoteConnection.using(client, "g"))
g.V().elementMap()
g.close()
client.close()
cluster.close()

If the Client instance is supplied externally, as is shown above, then it is not closed implicitly by the close of "g". Closing "g" will have no effect on "client" or "cluster". When supplying them externally, the Client and Cluster objects must also be closed explicitly. It’s worth noting that the close of a Cluster will close all Client instances spawned by the Cluster.

Some connection options can also be set on individual requests made through the Java driver using with() step on the TraversalSource. For instance to set request timeout to 500 milliseconds:

GraphTraversalSource g = traversal().withRemote(conf);
List<Vertex> vertices = g.with(Tokens.ARGS_EVAL_TIMEOUT, 500L).V().out("knows").toList();

The following options are allowed on a per-request basis in this fashion: batchSize, requestId, userAgent and evaluationTimeout (formerly scriptEvaluationTimeout which is also supported but now deprecated). Use of Tokens to reference these options is preferred.

Common Imports

There are a number of classes, functions and tokens that are typically used with Gremlin. The following imports provide most of the common functionality required to use Gremlin:

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.IO;
import static org.apache.tinkerpop.gremlin.process.traversal.AnonymousTraversalSource.traversal;
import static org.apache.tinkerpop.gremlin.process.traversal.Operator.*;
import static org.apache.tinkerpop.gremlin.process.traversal.Order.*;
import static org.apache.tinkerpop.gremlin.process.traversal.P.*;
import static org.apache.tinkerpop.gremlin.process.traversal.Pop.*;
import static org.apache.tinkerpop.gremlin.process.traversal.SackFunctions.*;
import static org.apache.tinkerpop.gremlin.process.traversal.Scope.*;
import static org.apache.tinkerpop.gremlin.process.traversal.TextP.*;
import static org.apache.tinkerpop.gremlin.structure.Column.*;
import static org.apache.tinkerpop.gremlin.structure.Direction.*;
import static org.apache.tinkerpop.gremlin.structure.T.*;
import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.*;

Configuration

The following table describes the various configuration options for the Gremlin Driver:

|===
|Key |Description |Default

|connectionPool.channelizer |The fully qualified classname of the client Channelizer that defines how to connect to the server. |Channelizer.WebSocketChannelizer
|connectionPool.enableSsl |Determines if SSL should be enabled or not. If enabled on the server then it must be enabled on the client. |false
|connectionPool.keepAliveInterval |Length of time in milliseconds to wait on an idle connection before sending a keep-alive request. Set to zero to disable this feature. |180000
|connectionPool.keyStore |The private key in JKS or PKCS#12 format. |none
|connectionPool.keyStorePassword |The password of the keyStore if it is password-protected. |none
|connectionPool.keyStoreType |JKS (Java 8 default) or PKCS12 (Java 9+ default) |none
|connectionPool.maxContentLength |The maximum length in bytes of a message that can be sent to the server. This number can be no greater than the setting of the same name in the server configuration. |65536
|connectionPool.maxInProcessPerConnection |The maximum number of in-flight requests that can occur on a connection. |4
|connectionPool.maxSimultaneousUsagePerConnection |The maximum number of times that a connection can be borrowed from the pool simultaneously. |16
|connectionPool.maxSize |The maximum size of a connection pool for a host. |8
|connectionPool.maxWaitForConnection |The amount of time in milliseconds to wait for a new connection before timing out. |3000
|connectionPool.maxWaitForClose |The amount of time in milliseconds to wait for pending messages to be returned from the server before closing the connection. |3000
|connectionPool.minInProcessPerConnection |The minimum number of in-flight requests that can occur on a connection. |1
|connectionPool.minSimultaneousUsagePerConnection |The minimum number of times that a connection can be borrowed from the pool simultaneously. |8
|connectionPool.minSize |The minimum size of a connection pool for a host. |2
|connectionPool.reconnectInterval |The amount of time in milliseconds to wait before trying to reconnect to a dead host. |1000
|connectionPool.resultIterationBatchSize |The override value for the size of the result batches to be returned from the server. |64
|connectionPool.sslCipherSuites |The list of JSSE ciphers to support for SSL connections. If specified, only the ciphers that are listed and supported will be enabled. If not specified, the JVM default is used. |none
|connectionPool.sslEnabledProtocols |The list of SSL protocols to support for SSL connections. If specified, only the protocols that are listed and supported will be enabled. If not specified, the JVM default is used. |none
|connectionPool.sslSkipCertValidation |Configures the TrustManager to trust all certs without any validation. Should not be used in production. |false
|connectionPool.trustStore |File location for a SSL Certificate Chain to use when SSL is enabled. If this value is not provided and SSL is enabled, the default TrustManager will be used. |none
|connectionPool.trustStorePassword |The password of the trustStore if it is password-protected. |none
|connectionPool.validationRequest |A script that is used to test server connectivity. A good script to use is one that evaluates quickly and returns no data. The default simply returns an empty string, but if a graph is required by a particular provider, a good traversal might be g.inject(). |''
|connectionPool.connectionSetupTimeoutMillis |Duration of time in milliseconds provided for connection setup to complete which includes WebSocket protocol handshake and SSL handshake. |15000
|hosts |The list of hosts that the driver will connect to. |localhost
|jaasEntry |Sets the AuthProperties.Property.JAAS_ENTRY properties for authentication to Gremlin Server. |none
|nioPoolSize |Size of the pool for handling request/response operations. |available processors
|password |The password to submit on requests that require authentication. |none
|path |The URL path to the Gremlin Server. |/gremlin
|port |The port of the Gremlin Server to connect to. The same port will be applied for all hosts. |8182
|protocol |Sets the AuthProperties.Property.PROTOCOL properties for authentication to Gremlin Server. |none
|serializer.className |The fully qualified class name of the MessageSerializer that will be used to communicate with the server. Note that the serializer configured on the client should be supported by the server configuration. |none
|serializer.config |A Map of configuration settings for the serializer. |none
|username |The username to submit on requests that require authentication. |none
|workerPoolSize |Size of the pool for handling background work. |available processors * 2
|===

Please see the Cluster.Builder javadoc to get more information on these settings.

Transactions

Transactions with Java are best described in The Traversal - Transactions section of this documentation as Java covers both embedded and remote use cases.

Serialization

Remote systems like Gremlin Server and Remote Gremlin Providers accept requests made in a particular serialization format and respond by serializing results to some format to be interpreted by the client. For JVM-based languages, there are three options for serialization: Gryo, GraphSON and GraphBinary. It is important that the client and server have the same serializers configured in the same way, or else one or the other will experience serialization exceptions and fail to communicate reliably. Discrepancy in serializer registration between client and server can happen fairly easily as different graph systems may automatically include serializers on the server-side, thus leaving the client to be configured manually. As an example:

IoRegistry registry = ...; // an IoRegistry instance exposed by a specific graph provider
TypeSerializerRegistry typeSerializerRegistry = TypeSerializerRegistry.build().addRegistry(registry).create();
MessageSerializer serializer = new GraphBinaryMessageSerializerV1(typeSerializerRegistry);
Cluster cluster = Cluster.build().
                          serializer(serializer).
                          create();
Client client = cluster.connect();
GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using(client, "g"));

The IoRegistry tells the serializer what classes from the graph provider to auto-register during serialization. Gremlin Server roughly uses this same approach when it configures its serializers, so using this same model will ensure compatibility when making requests. Obviously, it is possible to switch to GraphSON or Gryo by using the appropriate MessageSerializer (e.g. GraphSONMessageSerializerV3d0 or GryoMessageSerializerV3d0 respectively) in the same way and building that into the Cluster object.

Note
Gryo is no longer the preferred binary serialization format for Gremlin Server - please prefer GraphBinary.

The Lambda Solution

Supporting anonymous functions across languages is difficult as most languages do not support lambda introspection and thus, code analysis. In Gremlin-Java and with embedded usage, lambdas can be leveraged directly:

g.V().out("knows").map(t -> t.get().value("name") + " is the friend name") (1)
g.V().out("knows").sideEffect(System.out::println) (2)
g.V().as("a").out("knows").as("b").select("b").by((Function<Vertex, Integer>) v -> v.<String>value("name").length()) (3)
  1. A Java Function is used to map a Traverser<S> to an object E.

  2. Gremlin steps that take consumer arguments can be passed Java method references.

  3. Gremlin-Java may sometimes require explicit lambda typing when types can not be automatically inferred.

When sending traversals remotely to Gremlin Server or Remote Gremlin Providers, the static methods of Lambda should be used and should denote a particular JSR-223 ScriptEngine that is available on the remote end (typically, this is Groovy). Lambda creates a string-based lambda that is then converted into a lambda/closure/anonymous-function/etc. by the respective lambda language’s JSR-223 ScriptEngine implementation.

g.V().out("knows").map(Lambda.function("it.get().value('name') + ' is the friend name'"))
g.V().out("knows").sideEffect(Lambda.consumer("println it"))
g.V().as("a").out("knows").as("b").select("b").by(Lambda.<Vertex,Integer>function("it.value('name').length()"))

Finally, Gremlin Bytecode that includes lambdas requires that the traversal be processed by the ScriptEngine. To avoid continued recompilation costs, it supports the encoding of bindings, which allow Gremlin Server to cache traversals that will be reused over and over again save that some parameterization may change. Thus, instead of translating, compiling, and then executing each submitted bytecode request, it is possible to simply execute. To express bindings in Java, use Bindings.

b = Bindings.instance()
g.V(b.of('id',1)).out('created').values('name').map{t -> "name: " + t.get() }
g.V(b.of('id',4)).out('created').values('name').map{t -> "name: " + t.get() }
g.V(b.of('id',4)).out('created').values('name').getBytecode()
g.V(b.of('id',4)).out('created').values('name').getBytecode().getBindings()
cluster.close()

Both traversals are abstractly defined as g.V(id).out('created').values('name').map{t -> "name: " + t.get() } and thus, the first submission can be cached for faster evaluation on the next submission.

Warning
It is generally advised to avoid lambda usage. Please consider A Note On Lambdas for more information.

Submitting Scripts

TinkerPop comes equipped with a reference client for Java-based applications. It is referred to as gremlin-driver, which enables applications to send requests to Gremlin Server and get back results.

Gremlin scripts are sent to the server from a Client instance. A Client is created as follows:

Cluster cluster = Cluster.open();  (1)
Client client = cluster.connect(); (2)
  1. Opens a reference to localhost - note that there are many configuration options available in defining a Cluster object.

  2. Creates a Client given the configuration options of the Cluster.

Once a Client instance is ready, it is possible to issue some Gremlin Groovy scripts:

ResultSet results = client.submit("[1,2,3,4]");  (1)
results.stream().map(i -> i.get(Integer.class) * 2);       (2)

CompletableFuture<List<Result>> allResults = client.submit("[1,2,3,4]").all();  (3)

CompletableFuture<ResultSet> future = client.submitAsync("[1,2,3,4]"); (4)

Map<String,Object> params = new HashMap<>();
params.put("x",4);
client.submit("[1,2,3,x]", params); (5)
  1. Submits a script that simply returns a List of integers. This method blocks until the request is written to the server and a ResultSet is constructed.

  2. Even though the ResultSet is constructed, it does not mean that the server has sent back the results (or even evaluated the script potentially). The ResultSet is just a holder that is awaiting the results from the server. In this case, they are streamed from the server as they arrive.

  3. Submit a script, get a ResultSet, then return a CompletableFuture that will be called when all results have been returned.

  4. Submit a script asynchronously without waiting for the request to be written to the server.

  5. Parameterized requests are considered the most efficient way to send Gremlin to the server as they can be cached, which will boost performance and reduce resources required on the server.
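
The caching benefit of parameterized requests can be pictured with a small model in plain Java. This is only a sketch under the assumption that a server keeps compiled scripts keyed by their text; the class ScriptCacheSketch and its methods are hypothetical and are not part of gremlin-driver or Gremlin Server.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model: NOT Gremlin Server code, only an illustration of why
// a parameterized script can be compiled once and then reused.
final class ScriptCacheSketch {

    private final Set<String> compiledScripts = new HashSet<>();
    private int compilations = 0;

    // Assumption: the cache key is the script text alone. Parameters are
    // supplied at execution time and do not take part in the key.
    void submit(String script, Map<String, ?> params) {
        if (compiledScripts.add(script)) {
            compilations++; // cache miss: the script has to be compiled
        }
    }

    int compilations() {
        return compilations;
    }

    public static void main(String[] args) {
        ScriptCacheSketch parameterized = new ScriptCacheSketch();
        parameterized.submit("[1,2,3,x]", Collections.singletonMap("x", 4));
        parameterized.submit("[1,2,3,x]", Collections.singletonMap("x", 5));

        ScriptCacheSketch inlined = new ScriptCacheSketch();
        inlined.submit("[1,2,3,4]", Collections.emptyMap());
        inlined.submit("[1,2,3,5]", Collections.emptyMap());

        System.out.println("parameterized compilations: " + parameterized.compilations());
        System.out.println("inlined compilations: " + inlined.compilations());
    }
}
```

In this model the parameterized script is compiled once regardless of the value bound to x, while each inlined variant is a distinct script text and is compiled separately.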

Per Request Settings

There are a number of overloads to Client.submit() that accept a RequestOptions object. The RequestOptions provide a way to include options that are specific to the request made with the call to submit(). A good use-case for this feature is to set a per-request override to the evaluationTimeout so that it only applies to the current request.

Cluster cluster = Cluster.open();
Client client = cluster.connect();
RequestOptions options = RequestOptions.build().timeout(500).create();
List<Result> result = client.submit("g.V().repeat(both()).times(100)", options).all().get();

The preferred method for setting a per-request timeout for scripts is demonstrated above, but those familiar with bytecode may try g.with(EVALUATION_TIMEOUT, 500) within a script. Gremlin Server will respect timeouts set this way in scripts as well. With scripts of course, it is possible to send multiple traversals at once in the same script. In such events, the timeout for the request is interpreted as a sum of all timeouts identified in the script.

RequestOptions options = RequestOptions.build().timeout(500).create();
List<Result> result = client.submit("g.with(EVALUATION_TIMEOUT, 500).addV().iterate();" +
                                    "g.addV().iterate();" +
                                    "g.with(EVALUATION_TIMEOUT, 500).addV();", options).all().get();

In the above example, RequestOptions defines a timeout of 500 milliseconds, but the script has three traversals with two internal settings for the timeout using with(). The request timeout used by the server will therefore be 1000 milliseconds (overriding the 500 which itself was an override for whatever configuration was on the server).
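
The arithmetic in that example can be written out in plain Java. The sketch below models only the rule as stated in the text and is not Gremlin Server code; the class ScriptTimeoutSketch and its method are hypothetical, and the fallback to the RequestOptions value when no traversal sets a timeout is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical toy model of the timeout rule described in the text.
final class ScriptTimeoutSketch {

    // Each element represents one traversal in the script: present when the
    // traversal sets a timeout using with(), empty when it does not.
    static long effectiveTimeout(long requestOptionsTimeout, List<OptionalLong> perTraversalTimeouts) {
        long sum = 0;
        boolean anySet = false;
        for (OptionalLong timeout : perTraversalTimeouts) {
            if (timeout.isPresent()) {
                sum += timeout.getAsLong();
                anySet = true;
            }
        }
        // Assumption: with no with() timeouts in the script, the RequestOptions value applies.
        return anySet ? sum : requestOptionsTimeout;
    }

    public static void main(String[] args) {
        // Three traversals: two set 500 ms using with(), the middle one sets nothing.
        List<OptionalLong> timeouts = Arrays.asList(
                OptionalLong.of(500), OptionalLong.empty(), OptionalLong.of(500));
        System.out.println(effectiveTimeout(500, timeouts)); // 500 + 500 = 1000
    }
}
```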

Aliases

Scripts submitted to Gremlin Server automatically have the globally configured Graph and TraversalSource instances made available to them. Therefore, if Gremlin Server configures two TraversalSource instances called "g1" and "g2" a script can simply reference them directly as:

client.submit("g1.V()");
client.submit("g2.V()");

While this is an acceptable way to submit scripts, it has the downside of forcing the client to encode the server-side variable name directly into the script being sent. If the server configuration ever changed such that "g1" became "g100", the client-side code might have to see a significant amount of change. Decoupling the script code from the server configuration can be managed by the alias method on Client as follows:

Client g1Client = client.alias("g1");
Client g2Client = client.alias("g2");
g1Client.submit("g.V()");
g2Client.submit("g.V()");

The above code demonstrates how the alias method can be used such that the script need only contain a reference to "g" and "g1" and "g2" are automatically rebound into "g" on the server-side.
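
The rebinding can be pictured as a name lookup with one level of indirection. The following plain Java sketch is a toy model, not the driver or server implementation; the class AliasSketch, its resolve method and the placeholder strings standing in for TraversalSource instances are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of alias rebinding on the server side.
final class AliasSketch {

    // Resolves a variable name used in a script: an aliased name is first
    // translated to the server-side name, then looked up in the server globals.
    static Object resolve(String nameInScript, Map<String, Object> serverGlobals, Map<String, String> aliases) {
        String serverName = aliases.getOrDefault(nameInScript, nameInScript);
        return serverGlobals.get(serverName);
    }

    public static void main(String[] args) {
        Map<String, Object> serverGlobals = new HashMap<>();
        serverGlobals.put("g1", "traversal source for graph 1"); // placeholder objects
        serverGlobals.put("g2", "traversal source for graph 2");

        // Corresponds to the request made by client.alias("g1"): "g" in the script means "g1".
        Map<String, String> aliases = Collections.singletonMap("g", "g1");
        System.out.println(resolve("g", serverGlobals, aliases));
    }
}
```

If the server later renamed "g1" to "g100", only the alias target would change in this model; the script text "g.V()" would stay the same.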

Domain Specific Languages

Creating a Domain Specific Language (DSL) in Java requires the @GremlinDsl Java annotation in gremlin-core. This annotation should be applied to a "DSL interface" that extends GraphTraversal.Admin:

@GremlinDsl
public interface SocialTraversalDsl<S, E> extends GraphTraversal.Admin<S, E> {
}
Important
The name of the DSL interface should be suffixed with "TraversalDsl", as in SocialTraversalDsl. All characters in the interface name before that become the "name" of the DSL.

In this interface, define the methods that the DSL will be composed of:

@GremlinDsl
public interface SocialTraversalDsl<S, E> extends GraphTraversal.Admin<S, E> {
    public default GraphTraversal<S, Vertex> knows(String personName) {
        return out("knows").hasLabel("person").has("name", personName);
    }

    public default <E2 extends Number> GraphTraversal<S, E2> youngestFriendsAge() {
        return out("knows").hasLabel("person").values("age").min();
    }

    public default GraphTraversal<S, Long> createdAtLeast(int number) {
        return outE("created").count().is(P.gte(number));
    }
}
Important
Follow the TinkerPop convention of using <S,E> in naming generics as those conventions are taken into account when generating the anonymous traversal class. The processor attempts to infer the appropriate type parameters when generating the anonymous traversal class. If it cannot do it correctly, it is possible to avoid the inference by using the GremlinDsl.AnonymousMethod annotation on the DSL method. It allows explicit specification of the types to use.

The @GremlinDsl annotation is used by the Java Annotation Processor to generate the boilerplate class structure required to properly use the DSL within the TinkerPop framework. These classes can be generated and maintained by hand, but it would be time consuming, monotonous and error-prone to do so. Typically, the Java compilation process is automatically configured to detect annotation processors on the classpath and will automatically use them when found. If that does not happen, it may be necessary to make configuration changes to the build to allow for the compilation process to be aware of the following javax.annotation.processing.Processor implementation:

org.apache.tinkerpop.gremlin.process.traversal.dsl.GremlinDslProcessor

The annotation processor will generate several classes for the DSL:

  • SocialTraversal - A Traversal interface that extends the SocialTraversalDsl proxying methods to its underlying interfaces (such as GraphTraversal) to instead return a SocialTraversal

  • DefaultSocialTraversal - A default implementation of SocialTraversal (typically not used directly by the user)

  • SocialTraversalSource - Spawns DefaultSocialTraversal instances.

  • __ - Spawns anonymous DefaultSocialTraversal instances.
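
The naming convention behind these generated classes can be expressed as simple string manipulation. The sketch below is plain Java and only mirrors the names listed above for the SocialTraversalDsl example; the class DslNamingSketch is hypothetical and is not how GremlinDslProcessor is implemented.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that derives the generated class names from a DSL interface name.
final class DslNamingSketch {

    private static final String SUFFIX = "TraversalDsl";

    // The DSL "name" is everything before the "TraversalDsl" suffix.
    static String dslName(String interfaceName) {
        if (!interfaceName.endsWith(SUFFIX) || interfaceName.length() == SUFFIX.length()) {
            throw new IllegalArgumentException("Expected a name ending in " + SUFFIX + ": " + interfaceName);
        }
        return interfaceName.substring(0, interfaceName.length() - SUFFIX.length());
    }

    // Names of the classes listed in the text for a given DSL interface.
    static List<String> generatedClassNames(String interfaceName) {
        String name = dslName(interfaceName);
        return Arrays.asList(
                name + "Traversal",
                "Default" + name + "Traversal",
                name + "TraversalSource",
                "__");
    }

    public static void main(String[] args) {
        System.out.println(generatedClassNames("SocialTraversalDsl"));
    }
}
```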

Using the DSL then just involves telling the Graph to use it:

SocialTraversalSource social = traversal(SocialTraversalSource.class).withEmbedded(graph);
social.V().has("name","marko").knows("josh");

The SocialTraversalSource can also be customized with DSL functions. As an additional step, include a class that extends from GraphTraversalSource and with a name that is suffixed with "TraversalSourceDsl". Include in this class, any custom methods required by the DSL:

public class SocialTraversalSourceDsl extends GraphTraversalSource {

    public SocialTraversalSourceDsl(Graph graph, TraversalStrategies traversalStrategies) {
        super(graph, traversalStrategies);
    }

    public SocialTraversalSourceDsl(Graph graph) {
        super(graph);
    }

    public SocialTraversalSourceDsl(RemoteConnection connection) {
        super(connection);
    }

    public GraphTraversal<Vertex, Vertex> persons(String... names) {
        GraphTraversalSource clone = this.clone();

        // Manually add a "start" step for the traversal in this case the equivalent of V(). GraphStep is marked
        // as a "start" step by passing "true" in the constructor.
        clone.getBytecode().addStep(GraphTraversal.Symbols.V);
        GraphTraversal<Vertex, Vertex> traversal = new DefaultGraphTraversal<>(clone);
        traversal.asAdmin().addStep(new GraphStep<>(traversal.asAdmin(), Vertex.class, true));

        traversal = traversal.hasLabel("person");
        if (names.length > 0) traversal = traversal.has("name", P.within(names));

        return traversal;
    }
}

Then, back in the SocialTraversalDsl interface, update the GremlinDsl annotation with the traversalSource argument to point to the fully qualified class name of the SocialTraversalSourceDsl:

@GremlinDsl(traversalSource = "com.company.SocialTraversalSourceDsl")
public interface SocialTraversalDsl<S, E> extends GraphTraversal.Admin<S, E> {
    ...
}

It is then possible to use the persons() method to start traversals:

SocialTraversalSource social = traversal(SocialTraversalSource.class).withEmbedded(graph);
social.persons("marko").knows("josh");
Note
Using Maven, as shown in the gremlin-archetype-dsl module, makes developing DSLs with the annotation processor straightforward in that it sets up appropriate paths to the generated code automatically.

Application Examples

The available Maven archetypes are as follows:

  • gremlin-archetype-dsl - An example project that demonstrates how to build Domain Specific Languages with Gremlin in Java.

  • gremlin-archetype-server - An example project that demonstrates the basic structure of a Gremlin Server project, how to connect with the Gremlin Driver, and how to embed Gremlin Server in a testing framework.

  • gremlin-archetype-tinkergraph - A basic example of how to structure a TinkerPop project with Maven.

Use Maven to generate these example projects with a command like:

$ mvn archetype:generate -DarchetypeGroupId=org.apache.tinkerpop -DarchetypeArtifactId=gremlin-archetype-server \
      -DarchetypeVersion=x.y.z -DgroupId=com.my -DartifactId=app -Dversion=0.1 -DinteractiveMode=false

This command will generate a new Maven project in a directory called "app" with a pom.xml specifying a groupId of com.my. Please see the README.asciidoc in the root of each generated project for information on how to build and execute it.

Gremlin-Groovy

Apache TinkerPop’s Gremlin-Groovy implements Gremlin within the Apache Groovy language. As a JVM-based language variant, Gremlin-Groovy is backed by Gremlin-Java constructs. Moreover, given its scripting nature, Gremlin-Groovy serves as the language of Gremlin Console and Gremlin Server.

compile group: 'org.apache.tinkerpop', name: 'gremlin-core', version: 'x.y.z'
compile group: 'org.apache.tinkerpop', name: 'gremlin-driver', version: 'x.y.z'

Differences

In Groovy, as, in, and not are reserved words. Gremlin-Groovy therefore does not allow these steps to be called statically as bare anonymous traversal steps; they must always be prefixed with __. For instance: g.V().as('a').in().as('b').where(__.not(__.as('a').out().as('b')))

Since Groovy has access to the full JVM as Java does, it is possible to construct Date-like objects directly, but the Gremlin language does offer a datetime() function that is exposed in the Gremlin Console and as a function for Gremlin scripts sent to Gremlin Server. The function accepts the following forms of dates and times using a default time zone offset of UTC(+00:00):

  • 2018-03-22

  • 2018-03-22T00:35:44

  • 2018-03-22T00:35:44Z

  • 2018-03-22T00:35:44.741

  • 2018-03-22T00:35:44.741Z

  • 2018-03-22T00:35:44.741+1600
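The accepted forms can be illustrated with a small Python sketch. It is not the TinkerPop implementation of datetime(); the helper name parse_datetime and the use of strptime are assumptions made purely to show how each listed form resolves, with UTC applied when no offset is given.

```python
from datetime import datetime, timezone

# One strptime pattern per listed form; %z accepts both "Z" and "+1600".
_FORMATS = (
    "%Y-%m-%d",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%dT%H:%M:%S%z",
    "%Y-%m-%dT%H:%M:%S.%f",
    "%Y-%m-%dT%H:%M:%S.%f%z",
)


def parse_datetime(text):
    """Hypothetical helper: parse one of the listed forms, defaulting to UTC."""
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        # Apply the default UTC offset only when the input carried none.
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed
    raise ValueError("unsupported date/time form: " + text)
```

A value such as parse_datetime("2018-03-22") resolves to midnight UTC on that date, while a value with an explicit offset keeps that offset.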

Gremlin-Python

Apache TinkerPop’s Gremlin-Python implements Gremlin within the Python language and can be used on any Python virtual machine including the popular CPython machine. Python’s syntax has the same constructs as Java including "dot notation" for function chaining (a.b.c), round bracket function arguments (a(b,c)), and support for global namespaces (a(b()) vs a(__.b())). As such, anyone familiar with Gremlin-Java will immediately be able to work with Gremlin-Python. Moreover, there are a few added constructs to Gremlin-Python that make traversals a bit more succinct.

To install Gremlin-Python, use Python’s pip package manager.

pip install gremlinpython
pip install gremlinpython[kerberos]     # Optional, not available on Microsoft Windows

Connecting

The pattern for connecting is described in Connecting Gremlin and it basically distills down to creating a GraphTraversalSource. A GraphTraversalSource is created from the anonymous traversal() method where the "g" provided to the DriverRemoteConnection corresponds to the name of a GraphTraversalSource on the remote end.

g = traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))

If you need to send additional headers in the websockets connection, you can pass an optional headers parameter to the DriverRemoteConnection constructor.

g = traversal().withRemote(DriverRemoteConnection(
    'ws://localhost:8182/gremlin', 'g', headers={'Header':'Value'}))

Gremlin-Python supports plain text and Kerberos SASL authentication, both of which are set through the connection options.

# Plain text authentication
g = traversal().withRemote(DriverRemoteConnection(
    'ws://localhost:8182/gremlin', 'g', username='stephen', password='password'))

# Kerberos authentication
g = traversal().withRemote(DriverRemoteConnection(
    'ws://localhost:8182/gremlin', 'g', kerberized_service='gremlin@hostname.your.org'))

The value specified for the kerberized_service should correspond to the first part of the principal name configured for the gremlin service, but with the slash replaced by an at sign. The Gremlin-Python client reads the kerberos configurations from your system. It finds the KDC’s hostname and port from the krb5.conf file at the default location or as indicated in the KRB5_CONFIG environment variable. It finds credentials from the credential cache or a keytab file at the default locations or as indicated in the KRB5CCNAME or KRB5_KTNAME environment variables.
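The mapping from a principal name to the kerberized_service value can be sketched as follows. The helper name to_kerberized_service and the example realm YOUR.ORG are hypothetical; the sketch only assumes a principal of the usual service/host@REALM shape.

```python
def to_kerberized_service(principal):
    """Hypothetical helper: 'service/host@REALM' -> 'service@host'."""
    # The "first part" of the principal is everything before the realm.
    first_part = principal.rsplit("@", 1)[0] if "@" in principal else principal
    service, separator, host = first_part.partition("/")
    if not separator or not service or not host:
        raise ValueError("expected a principal of the form service/host[@REALM]")
    # Replace the slash with an at sign.
    return service + "@" + host
```

With this sketch, a principal of gremlin/hostname.your.org@YOUR.ORG yields the value gremlin@hostname.your.org used in the example above.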

If you authenticate to a remote Gremlin Server or Remote Gremlin Provider, this server normally has SSL activated and the websockets url will start with 'wss://'. If Gremlin-Server uses a self-signed certificate for SSL, Gremlin-Python needs access to a local copy of the CA certificate file (in openssl .pem format), to be specified in the SSL_CERT_FILE environment variable.

Note
If connecting from an inherently single-threaded Python process where blocking while waiting for Gremlin traversals to complete is acceptable, it might be helpful to set pool_size and max_workers parameters to 1. See the Configuration section just below. Examples where this could apply are serverless cloud functions or WSGI worker processes.

Some connection options can also be set on individual requests using the with() step (with_() in Python) on the TraversalSource. For instance, to set the request timeout to 500 milliseconds:

vertices = g.with_('evaluationTimeout', 500).V().out('knows').toList()

The following options are allowed on a per-request basis in this fashion: batchSize, requestId, userAgent and evaluationTimeout (formerly scriptEvaluationTimeout which is also supported but now deprecated).

Common Imports

There are a number of classes, functions and tokens that are typically used with Gremlin. The following imports provide most of the typical functionality required to use Gremlin:

from gremlin_python import statics
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.traversal import T
from gremlin_python.process.traversal import Order
from gremlin_python.process.traversal import Cardinality
from gremlin_python.process.traversal import Column
from gremlin_python.process.traversal import Direction
from gremlin_python.process.traversal import Operator
from gremlin_python.process.traversal import P
from gremlin_python.process.traversal import TextP
from gremlin_python.process.traversal import Pop
from gremlin_python.process.traversal import Scope
from gremlin_python.process.traversal import Barrier
from gremlin_python.process.traversal import Bindings
from gremlin_python.process.traversal import WithOptions

These can be used analogously to how they are used in Gremlin-Java.

>>> g.V().hasLabel('person').has('age',P.gt(30)).order().by('age',Order.desc).toList()
[v[6], v[4]]

Moreover, by importing the statics of Gremlin-Python, the class prefixes can be omitted.

>>> statics.load_statics(globals())

With statics loaded, it is possible to represent the above traversal as below.

>>> g.V().hasLabel('person').has('age',gt(30)).order().by('age',desc).toList()
[v[6], v[4]]

Statics includes all the __-methods and thus, anonymous traversals like __.out() can be expressed as below. That is, without the __-prefix.

>>> g.V().repeat(out()).times(2).name.fold().toList()
[['ripple', 'lop']]
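The effect of loading statics can be pictured with a simplified, standalone sketch. It assumes that loading amounts to copying a registry of names into the dictionary that is passed in; the registry contents and the tuple return values below are placeholders, not the real Gremlin-Python objects.

```python
# Placeholder stand-ins for predicates, steps and tokens held in a registry.
def gt(value):
    return ("gt", value)


def out(*labels):
    return ("out",) + labels


static_registry = {"gt": gt, "out": out, "desc": "desc"}


def load_statics(namespace):
    """Copy every registered name into the given namespace dictionary."""
    for name, obj in static_registry.items():
        namespace[name] = obj


def unload_statics(namespace):
    """Remove the registered names again."""
    for name in static_registry:
        namespace.pop(name, None)


namespace = {}
load_statics(namespace)
```

Passing globals() instead of a fresh dictionary is what makes the unprefixed names available at module level, which is also why name clashes with existing globals are possible.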

There may be situations where certain graphs may want a more exact data type than what Python will allow as a language. To support these situations gremlin-python has a few special type classes that can be imported from statics. They include:

from gremlin_python.statics import long         # Java long
from gremlin_python.statics import timestamp    # Java timestamp
from gremlin_python.statics import SingleByte   # Java byte type
from gremlin_python.statics import SingleChar   # Java char type
from gremlin_python.statics import GremlinType  # Java Class
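Why such marker types help can be shown with a minimal sketch. It assumes a marker class that subclasses int and a hypothetical graphson_type helper that picks between the GraphSON type names g:Int32 and g:Int64; the real serializers handle more cases than this.

```python
class long(int):
    """Marker subclass: behaves like int but requests a 64-bit Java long."""


def graphson_type(value):
    """Hypothetical helper picking a GraphSON type name for an integer."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled by this sketch")
    if isinstance(value, long):
        # An explicit marker wins, even for small values.
        return "g:Int64"
    if -2**31 <= value < 2**31:
        return "g:Int32"
    return "g:Int64"
```

Because the marker is still an int, arithmetic and comparisons keep working, and only the serialization decision changes.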

Configuration

The following table describes the various configuration options for the Gremlin-Python Driver. They can be passed to the Client or DriverRemoteConnection instance as keyword arguments:

|===
|Key |Description |Default

|headers |Additional headers that will be added to each request message. |None
|max_workers |Maximum number of worker threads. |Number of CPUs * 5
|message_serializer |The message serializer implementation. |gremlin_python.driver.serializer.GraphSONMessageSerializer
|password |The password to submit on requests that require authentication. |""
|pool_size |The number of connections used by the pool. |4
|protocol_factory |A callable that returns an instance of AbstractBaseProtocol. |gremlin_python.driver.protocol.GremlinServerWSProtocol
|transport_factory |A callable that returns an instance of AbstractBaseTransport. |gremlin_python.driver.aiohttp.transport.AiohttpTransport
|username |The username to submit on requests that require authentication. |""
|kerberized_service |The first part of the principal name configured for the gremlin service. |""
|session |A unique string-based identifier (typically a UUID) to enable a session-based connection. This is not a valid configuration for DriverRemoteConnection. |None
|===
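The defaults in the table can be summarized as a plain dictionary. This is only a sketch: driver_settings is a hypothetical helper, and the serializer, protocol and transport entries are left out so that the block runs without the driver installed.

```python
import os


def driver_settings(**overrides):
    """Hypothetical helper: table defaults merged with caller overrides."""
    defaults = {
        "headers": None,
        "max_workers": (os.cpu_count() or 1) * 5,  # Number of CPUs * 5
        "password": "",
        "pool_size": 4,
        "username": "",
        "kerberized_service": "",
        "session": None,
    }
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError("unknown option(s): " + ", ".join(sorted(unknown)))
    defaults.update(overrides)
    return defaults


# Settings suggested earlier for an inherently single-threaded process.
single_threaded = driver_settings(pool_size=1, max_workers=1)
```

The resulting dictionary could then be unpacked as keyword arguments, in the same way the options are passed to Client or DriverRemoteConnection.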

Note that the transport_factory can allow for additional configuration of the AiohttpTransport, which allows pass through of the named parameters available in AIOHTTP’s ws_connect, and the ability to call the api from an event loop:

import ssl
from gremlin_python.driver.aiohttp.transport import AiohttpTransport
...
g = traversal().withRemote(
  DriverRemoteConnection('ws://localhost:8182/gremlin','g',
                         transport_factory=lambda: AiohttpTransport(read_timeout=10,
                                                                    write_timeout=10,
                                                                    heartbeat=1.0,
                                                                    call_from_event_loop=True,
                                                                    max_content_length=100*1024*1024,
                                                                    ssl_options=ssl.create_default_context(ssl.Purpose.CLIENT_AUTH))))

Compression configuration options are described in the zlib documentation. By default, compression settings are configured as shown in the above example.

Traversal Strategies

In order to add and remove traversal strategies from a traversal source, Gremlin-Python has a TraversalStrategy class along with a collection of subclasses that mirror the standard Gremlin-Java strategies.

>>> g = g.withStrategies(SubgraphStrategy(vertices=hasLabel('person'),edges=has('weight',gt(0.5))))
>>> g.V().name.toList()
['marko', 'vadas', 'josh', 'peter']
>>> g.V().outE().elementMap().toList()
[{<T.id: 1>: 8, <T.label: 4>: 'knows', <Direction.IN: 2>: {<T.id: 1>: 4, <T.label: 4>: 'person'}, <Direction.OUT: 3>: {<T.id: 1>: 1, <T.label: 4>: 'person'}, 'weight': 1.0}]
>>> g = g.withoutStrategies(SubgraphStrategy)
>>> g.V().name.toList()
['marko', 'vadas', 'lop', 'josh', 'ripple', 'peter']
>>> g.V().outE().elementMap().toList()
[{<T.id: 1>: 9, <T.label: 4>: 'created', <Direction.IN: 2>: {<T.id: 1>: 3, <T.label: 4>: 'software'}, <Direction.OUT: 3>: {<T.id: 1>: 1, <T.label: 4>: 'person'}, 'weight': 0.4}, {<T.id: 1>: 7, <T.label: 4>: 'knows', <Direction.IN: 2>: {<T.id: 1>: 2, <T.label: 4>: 'person'}, <Direction.OUT: 3>: {<T.id: 1>: 1, <T.label: 4>: 'person'}, 'weight': 0.5}, {<T.id: 1>: 8, <T.label: 4>: 'knows', <Direction.IN: 2>: {<T.id: 1>: 4, <T.label: 4>: 'person'}, <Direction.OUT: 3>: {<T.id: 1>: 1, <T.label: 4>: 'person'}, 'weight': 1.0}, {<T.id: 1>: 10, <T.label: 4>: 'created', <Direction.IN: 2>: {<T.id: 1>: 5, <T.label: 4>: 'software'}, <Direction.OUT: 3>: {<T.id: 1>: 4, <T.label: 4>: 'person'}, 'weight': 1.0}, {<T.id: 1>: 11, <T.label: 4>: 'created', <Direction.IN: 2>: {<T.id: 1>: 3, <T.label: 4>: 'software'}, <Direction.OUT: 3>: {<T.id: 1>: 4, <T.label: 4>: 'person'}, 'weight': 0.4}, {<T.id: 1>: 12, <T.label: 4>: 'created', <Direction.IN: 2>: {<T.id: 1>: 3, <T.label: 4>: 'software'}, <Direction.OUT: 3>: {<T.id: 1>: 6, <T.label: 4>: 'person'}, 'weight': 0.2}]
>>> g = g.withComputer(workers=2,vertices=has('name','marko'))
>>> g.V().name.toList()
['marko']
>>> g.V().outE().valueMap().with_(WithOptions.tokens).toList()
[{<T.id: 1>: 9, <T.label: 4>: 'created', 'weight': 0.4}, {<T.id: 1>: 7, <T.label: 4>: 'knows', 'weight': 0.5}, {<T.id: 1>: 8, <T.label: 4>: 'knows', 'weight': 1.0}]
Note
Many of the TraversalStrategy classes in Gremlin-Python are proxies to the respective strategy on Apache TinkerPop’s JVM-based Gremlin traversal machine. As such, their apply(Traversal) method does nothing. However, the strategy is encoded in the Gremlin-Python bytecode and transmitted to the Gremlin traversal machine for re-construction machine-side.
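The proxy idea can be sketched without the driver. The class and function names below (StrategyProxy, with_strategies) and the list-of-tuples encoding are illustrative assumptions; they only show a strategy object that carries a name and configuration, does nothing locally, and is recorded for the server to reconstruct.

```python
class StrategyProxy:
    """Client-side placeholder: carries a name and configuration only."""

    def __init__(self, name, **configuration):
        self.name = name
        self.configuration = configuration

    def apply(self, traversal):
        # Intentionally a no-op: the real strategy runs on the server.
        return None


def with_strategies(source_instructions, *strategies):
    """Return new source instructions with a 'withStrategies' entry appended."""
    encoded = [(s.name, dict(s.configuration)) for s in strategies]
    return source_instructions + [("withStrategies", encoded)]


subgraph = StrategyProxy("SubgraphStrategy", vertices="hasLabel('person')")
instructions = with_strategies([], subgraph)
```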

The Lambda Solution

Supporting anonymous functions across languages is difficult as most languages do not support lambda introspection and thus, code analysis. In Gremlin-Python, a Gremlin lambda should be represented as a zero-arg callable that returns a string representation of the lambda expected for use in the traversal. The lambda should be written as a Gremlin-Groovy string. When the lambda is represented in Bytecode its language is encoded such that the remote connection host can infer which translator and ultimate execution engine to use.

>>> g.V().out().map(lambda: "it.get().value('name').length()").sum().toList()
[24]
Tip
When running into situations where Groovy cannot properly discern a method signature based on the Lambda instance created, it will help to fully define the closure in the lambda expression - so rather than lambda: ("it.get().value('name')", "gremlin-groovy"), prefer lambda: ("x -> x.get().value('name')", "gremlin-groovy").
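How a zero-arg callable might be turned into a script and a language can be sketched as below. The helper resolve_lambda is hypothetical, and the sketch assumes that a bare string defaults to gremlin-groovy while a two-element tuple names the language explicitly.

```python
DEFAULT_LAMBDA_LANGUAGE = "gremlin-groovy"


def resolve_lambda(func):
    """Call a zero-arg lambda and normalize its result to (script, language)."""
    result = func()
    if isinstance(result, str):
        # A bare string is assumed to be written in the default language.
        return result, DEFAULT_LAMBDA_LANGUAGE
    script, language = result
    return script, language


short_form = resolve_lambda(lambda: "it.get().value('name').length()")
full_form = resolve_lambda(lambda: ("x -> x.get().value('name')", "gremlin-groovy"))
```

The Python lambda is never inspected; it is only called, and the string it returns is what travels to the server.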

Finally, Gremlin Bytecode that includes lambdas requires that the traversal be processed by the ScriptEngine. To avoid continued recompilation costs, it supports the encoding of bindings, which allow a remote engine to cache traversals that will be reused over and over again, save that some parameterization may change. Thus, instead of translating, compiling, and then executing each submitted bytecode, it is possible to simply execute.

>>> g.V(Bindings.of('x',1)).out('created').map(lambda: "it.get().value('name').length()").sum().toList()
[3]
>>> g.V(Bindings.of('x',4)).out('created').map(lambda: "it.get().value('name').length()").sum().toList()
[9]
Warning
As explained throughout the documentation, when possible avoid lambdas.
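The caching benefit of bindings can be illustrated with a standalone sketch. Python's own compile() and eval() stand in for the remote ScriptEngine here, and ScriptCache is a hypothetical class; the point is only that identical script text compiles once while the bound values change per call.

```python
class ScriptCache:
    """Compile each distinct script once, then re-run it with new bindings."""

    def __init__(self):
        self._compiled = {}
        self.compilations = 0

    def evaluate(self, script, bindings):
        code = self._compiled.get(script)
        if code is None:
            # Stand-in for the expensive translate-and-compile step.
            code = compile(script, "<script>", "eval")
            self._compiled[script] = code
            self.compilations += 1
        # Only the bound values differ between executions.
        return eval(code, {"__builtins__": {}}, dict(bindings))


cache = ScriptCache()
first = cache.evaluate("x * 10", {"x": 1})
second = cache.evaluate("x * 10", {"x": 4})
```

Had the values 1 and 4 been written into the script text directly, each call would have produced a different script and a separate compilation.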

Submitting Scripts

The Client class implementation/interface is based on the Java Driver, with some restrictions. Most notably, Gremlin-Python does not yet implement the Cluster class. Instead, Client is instantiated directly. Usage is as follows:

from gremlin_python.driver import client (1)
client = client.Client('ws://localhost:8182/gremlin', 'g') (2)
  1. Import the Gremlin-Python client module.

  2. Opens a reference to localhost - note that there are various configuration options that can be passed to the Client object upon instantiation as keyword arguments.

Once a Client instance is ready, it is possible to issue some Gremlin:

result_set = client.submit('[1,2,3,4]')  (1)
future_results = result_set.all()  (2)
results = future_results.result() (3)
assert results == [1, 2, 3, 4] (4)

future_result_set = client.submitAsync('[1,2,3,4]') (5)
result_set = future_result_set.result() (6)
result = result_set.one() (7)
assert result == [1, 2, 3, 4] (8)
assert result_set.done.done() (9)

client.close() (10)
  1. Submit a script that simply returns a List of integers. This method blocks until the request is written to the server and a ResultSet is constructed.

  2. Even though the ResultSet is constructed, it does not mean that the server has sent back the results (or even evaluated the script potentially). The ResultSet is just a holder that is awaiting the results from the server. The all method returns a concurrent.futures.Future that resolves to a list when it is complete.

  3. Block until the script is evaluated and results are sent back by the server.

  4. Verify the result.

  5. Submit the same script to the server but don’t block.

  6. Wait until request is written to the server and ResultSet is constructed.

  7. Read a single result off the result stream.

  8. Again, verify the result.

  9. Verify that all results have been read and the stream is closed.

  10. Close client and underlying pool connections.
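The relationship between the result holder and its futures can be sketched with the standard library alone. SketchResultSet and submit below are hypothetical stand-ins that simulate a server streaming batches from a worker thread; they are not the driver's classes.

```python
from concurrent.futures import Future, ThreadPoolExecutor
import queue


class SketchResultSet:
    """Holder that fills up as (simulated) server responses arrive."""

    def __init__(self):
        self.stream = queue.Queue()
        self.done = Future()  # resolves once the last batch has arrived

    def one(self):
        # Block until a single batch is available.
        return self.stream.get()

    def all(self):
        # Return a Future that resolves to every remaining result.
        future = Future()

        def collect(_):
            results = []
            while not self.stream.empty():
                results.extend(self.stream.get())
            future.set_result(results)

        self.done.add_done_callback(collect)
        return future


def submit(executor, batches):
    """Simulate a server that streams batches back on a worker thread."""
    result_set = SketchResultSet()

    def respond():
        for batch in batches:
            result_set.stream.put(batch)
        result_set.done.set_result(None)

    executor.submit(respond)
    return result_set


with ThreadPoolExecutor(max_workers=1) as executor:
    results = submit(executor, [[1, 2], [3, 4]]).all().result()
```

The call to submit returns as soon as the holder exists, and only result() blocks until the simulated server has finished responding.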

Per Request Settings

The client.submit() functions accept a request_options argument, which expects a dictionary. The request_options provide a way to include options that are specific to the request made with the call to submit(). A good use-case for this feature is to set a per-request override to the evaluationTimeout so that it only applies to the current request.

result_set = client.submit('g.V().repeat(both()).times(100)', request_options={'evaluationTimeout': 5000})

The following options are allowed on a per-request basis in this fashion: batchSize, requestId, userAgent and evaluationTimeout (formerly scriptEvaluationTimeout which is also supported but now deprecated).

Important
The preferred method for setting a per-request timeout for scripts is demonstrated above, but those familiar with bytecode may try g.with(EVALUATION_TIMEOUT, 500) within a script. Scripts with multiple traversals and multiple timeouts will be interpreted as a sum of all timeouts identified in the script for that request.
RequestOptions options = RequestOptions.build().timeout(500).create();
List<Result> result = client.submit("g.with(EVALUATION_TIMEOUT, 500).addV().iterate();" +
                                    "g.addV().iterate();" +
                                    "g.with(EVALUATION_TIMEOUT, 500).addV();", options).all().get();

In the above example, RequestOptions defines a timeout of 500 milliseconds, but the script has three traversals with two internal settings for the timeout using with(). The request timeout used by the server will therefore be 1000 milliseconds (overriding the 500 which itself was an override for whatever configuration was on the server).
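The arithmetic described above can be checked with a short sketch. The helper effective_timeout and its regular expression are assumptions made for illustration; the server's actual parsing is not shown in this document.

```python
import re

# Matches with(EVALUATION_TIMEOUT, <milliseconds>) inside a script string.
_TIMEOUT = re.compile(r"with\(\s*EVALUATION_TIMEOUT\s*,\s*(\d+)\s*\)")


def effective_timeout(script, request_timeout=None):
    """Sum in-script timeouts; fall back to the request-level value."""
    in_script = [int(ms) for ms in _TIMEOUT.findall(script)]
    if in_script:
        return sum(in_script)
    return request_timeout


script = ("g.with(EVALUATION_TIMEOUT, 500).addV().iterate();"
          "g.addV().iterate();"
          "g.with(EVALUATION_TIMEOUT, 500).addV();")
```

For the script above and a request-level timeout of 500, the sketch arrives at 1000 milliseconds, matching the explanation.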

Domain Specific Languages

Writing a Gremlin Domain Specific Language (DSL) in Python simply requires direct extension of several classes:

  • GraphTraversal - which exposes the various steps used in traversal writing

  • __ - which spawns anonymous traversals from steps

  • GraphTraversalSource - which spawns GraphTraversal instances

The Social DSL based on the "modern" toy graph might look like this:

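from gremlin_python.process.graph_traversal import GraphTraversal, GraphTraversalSource
from gremlin_python.process.graph_traversal import __ as AnonymousTraversal
from gremlin_python.process.traversal import Bytecode, P

class SocialTraversal(GraphTraversal):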

    def knows(self, person_name):
        return self.out('knows').hasLabel('person').has('name', person_name)

    def youngestFriendsAge(self):
        return self.out('knows').hasLabel('person').values('age').min()

    def createdAtLeast(self, number):
        return self.outE('created').count().is_(P.gte(number))

class __(AnonymousTraversal):

    graph_traversal = SocialTraversal

    @classmethod
    def knows(cls, *args):
        return cls.graph_traversal(None, None, Bytecode()).knows(*args)

    @classmethod
    def youngestFriendsAge(cls, *args):
        return cls.graph_traversal(None, None, Bytecode()).youngestFriendsAge(*args)

    @classmethod
    def createdAtLeast(cls, *args):
        return cls.graph_traversal(None, None, Bytecode()).createdAtLeast(*args)


class SocialTraversalSource(GraphTraversalSource):

    def __init__(self, *args, **kwargs):
        super(SocialTraversalSource, self).__init__(*args, **kwargs)
        self.graph_traversal = SocialTraversal

    def persons(self, *args):
        traversal = self.get_graph_traversal()
        traversal.bytecode.add_step('V')
        traversal.bytecode.add_step('hasLabel', 'person')

        if len(args) > 0:
            traversal.bytecode.add_step('has', 'name', P.within(args))

        return traversal
Note
The AnonymousTraversal class above is just an alias for __ as in from gremlin_python.process.graph_traversal import __ as AnonymousTraversal

Using the DSL is straightforward and just requires that the graph instance know the SocialTraversalSource should be used:

social = traversal(SocialTraversalSource).withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
social.persons('marko').knows('josh')
social.persons('marko').youngestFriendsAge()
social.persons().filter(__.createdAtLeast(2)).count()

Syntactic Sugar

Python supports meta-programming and operator overloading. There are three uses of these techniques in Gremlin-Python that make traversals a bit more concise: slicing, indexing, and attribute access.

>>> g.V().both()[1:3].toList()
[v[2], v[4]]
>>> g.V().both()[1].toList()
[v[2]]
>>> g.V().both().name.toList()
['lop', 'lop', 'lop', 'vadas', 'josh', 'josh', 'josh', 'marko', 'marko', 'marko', 'peter', 'ripple']
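How these three forms might map onto ordinary steps can be sketched with a toy class. SketchTraversal is hypothetical and only records steps as tuples; it assumes slices and indexes translate to range() and attribute access to values(), with Gremlin's convention that a high of -1 means unbounded.

```python
class SketchTraversal:
    """Records steps as tuples instead of talking to a server."""

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def _add(self, name, *args):
        return SketchTraversal(self.steps + ((name,) + args,))

    def range_(self, low, high):
        return self._add("range", low, high)

    def values(self, *keys):
        return self._add("values", *keys)

    def __getitem__(self, index):
        # t[1:3] -> range(1, 3); t[1] -> range(1, 2)
        if isinstance(index, slice):
            low = 0 if index.start is None else index.start
            high = -1 if index.stop is None else index.stop
            return self.range_(low, high)
        if isinstance(index, int):
            return self.range_(index, index + 1)
        raise TypeError("index must be an int or a slice")

    def __getattr__(self, key):
        # Called only when normal lookup fails, so t.name -> values('name').
        if key.startswith("_"):
            raise AttributeError(key)
        return self.values(key)


t = SketchTraversal()
```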

Differences

In situations where Python reserved words and global functions overlap with standard Gremlin steps and tokens, those bits of conflicting Gremlin get an underscore appended as a suffix:

Tokens - Scope.global_

Limitations

  • Traversals that return a Set might be coerced to a List in Python. In the case of Python, number equality is different from JVM languages which produces different Set results when those types are in use. When this case is detected during deserialization, the Set is coerced to a List so that traversals return consistent results within a collection across different languages. If a Set is needed then convert List results to Set manually.

  • Gremlin is capable of returning Dictionary results that use non-hashable keys (e.g. Dictionary as a key) and Python does not support that at a language level. Using GraphSON 3.0 or GraphBinary (after 3.5.0) makes it possible to return such results. In all other cases, Gremlin that returns such results will need to be re-written to avoid that sort of key.

  • The subgraph()-step is not supported by any variant that is not running on the Java Virtual Machine as there is no Graph instance to deserialize a result into on the client-side. A workaround is to replace the step with aggregate(local) and then convert those results to something the client can use locally.
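The first limitation follows from how Python compares numbers, which a few lines can demonstrate. The helper coerce_set is a hypothetical illustration of the "keep a List when a Set would lose elements" rule, not the deserializer's code.

```python
def coerce_set(values):
    """Return a set when no element is lost, otherwise keep a list."""
    values = list(values)
    as_set = set(values)
    # Python treats 1 and 1.0 as equal, so a JVM Set holding an Integer 1
    # and a Double 1.0 would shrink if converted directly.
    if len(as_set) < len(values):
        return values
    return as_set


mixed = coerce_set([1, 1.0, 2])
plain = coerce_set([1, 2, 3])
```

Here mixed stays a list of three elements because a set would collapse 1 and 1.0 into one entry, while plain becomes an ordinary set.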

Application Examples

The TinkerPop source code contains a simple Python script that shows a basic example of how gremlinpython works. It can be found in GitHub here and is designed to work best with a running Gremlin Server configured with the default conf/gremlin-server.yaml file as included with the standard release packaging.

pip install gremlinpython
pip install aiohttp
python example.py

Gremlin.Net

Apache TinkerPop’s Gremlin.Net implements Gremlin within the C# language. It targets .NET Standard and can therefore be used on different operating systems and with different .NET frameworks, such as .NET Framework and .NET Core. Since the C# syntax is very similar to that of Java, it should be easy to switch between Gremlin-Java and Gremlin.Net. The only major syntactical difference is that all method names in Gremlin.Net use PascalCase as opposed to camelCase in Gremlin-Java in order to comply with .NET conventions.

nuget install Gremlin.Net

Connecting

The pattern for connecting is described in Connecting Gremlin and it basically distills down to creating a GraphTraversalSource. A GraphTraversalSource is created from the AnonymousTraversalSource.traversal() method where the "g" provided to the DriverRemoteConnection corresponds to the name of a GraphTraversalSource on the remote end.

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsTests.cs[role=include]

Some connection options can also be set on individual requests using the With() step on the TraversalSource. For instance to set request timeout to 500 milliseconds:

var l = g.With(Tokens.ArgsEvalTimeout, 500).V().Out("knows").Count().ToList();

The following options are allowed on a per-request basis in this fashion: batchSize, requestId, userAgent and evaluationTimeout (formerly scriptEvaluationTimeout which is also supported but now deprecated). These options are available as constants on the Gremlin.Net.Driver.Tokens class.

Common Imports

There are a number of classes, functions and tokens that are typically used with Gremlin. The following imports provide most of the typical functionality required to use Gremlin:

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsTests.cs[role=include]

Configuration

The connection properties for the Gremlin.Net driver can be passed to the GremlinServer instance as keyword arguments:

|===
|Key |Description |Default

|hostname |The hostname that the driver will connect to. |localhost
|port |The port on which Gremlin Server can be reached. |8182
|enableSsl |Determines if SSL should be enabled or not. If enabled on the server then it must be enabled on the client. |false
|username |The username to submit on requests that require authentication. |none
|password |The password to submit on requests that require authentication. |none
|===

Connection Pool

It is also possible to configure the ConnectionPool of the Gremlin.Net driver. These configuration options can be set as properties on the ConnectionPoolSettings instance that can be passed to the GremlinClient:

|===
|Key |Description |Default

|PoolSize |The size of the connection pool. |4
|MaxInProcessPerConnection |The maximum number of in-flight requests that can occur on a connection. |32
|ReconnectionAttempts |The number of attempts to get an open connection from the pool to submit a request. |4
|ReconnectionBaseDelay |The base delay used for the exponential backoff for the reconnection attempts. |1 s
|===

A NoConnectionAvailableException is thrown if all connections have reached the MaxInProcessPerConnection limit when a new request comes in. A ServerUnavailableException is thrown if no connection is available to the server to submit a request after ReconnectionAttempts retries.
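What an exponential backoff with these defaults could look like is sketched here in Python. The doubling factor is an assumption, since the exact formula used by Gremlin.Net is not stated in this section, and backoff_delays is a hypothetical helper.

```python
def backoff_delays(attempts=4, base_delay=1.0):
    """Delays in seconds before each retry, doubling from the base delay."""
    return [base_delay * (2 ** attempt) for attempt in range(attempts)]


delays = backoff_delays()
```

Under that assumption, raising ReconnectionAttempts lengthens the total wait far more quickly than raising ReconnectionBaseDelay does.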

Serialization

The Gremlin.Net driver uses GraphSON 3.0 by default, but it is also possible to use another serialization format by passing a message serializer when creating the GremlinClient.

GraphBinary can be configured like this:

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsTests.cs[role=include]

and GraphSON 2.0 like this:

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsTests.cs[role=include]

Traversal Strategies

In order to add and remove traversal strategies from a traversal source, Gremlin.Net has an AbstractTraversalStrategy class along with a collection of subclasses that mirror the standard Gremlin-Java strategies.

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsTests.cs[role=include]
Note
Many of the TraversalStrategy classes in Gremlin.Net are proxies to the respective strategy on Apache TinkerPop’s JVM-based Gremlin traversal machine. As such, their Apply(ITraversal) method does nothing. However, the strategy is encoded in the Gremlin.Net bytecode and transmitted to the Gremlin traversal machine for re-construction machine-side.

Transactions

To get a full understanding of this section, it would be good to start by reading the Transactions section of this documentation, which discusses transactions in the general context of TinkerPop itself. This section builds on that content by demonstrating the transactional syntax for C#.

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsTests.cs[role=include]

The Lambda Solution

Supporting anonymous functions across languages is difficult as most languages do not support lambda introspection and thus, code analysis. While Gremlin.Net doesn’t support C# lambdas, it is still able to represent lambdas in other languages. When the lambda is represented in Bytecode its language is encoded such that the remote connection host can infer which translator and ultimate execution engine to use.

g.V().Out().Map<int>(Lambda.Groovy("it.get().value('name').length()")).Sum<int>().ToList();      (1)
g.V().Out().Map<int>(Lambda.Python("lambda x: len(x.get().value('name'))")).Sum<int>().ToList(); (2)
  1. Lambda.Groovy() can be used to create a Groovy lambda.

  2. Lambda.Python() can be used to create a Python lambda.

The ILambda interface returned by these two methods inherits interfaces like IFunction and IPredicate that mirror their Java counterparts which makes it possible to use lambdas with Gremlin.Net for the same steps as in Gremlin-Java.

Tip
When running into situations where Groovy cannot properly discern a method signature based on the Lambda instance created, it will help to fully define the closure in the lambda expression - so rather than Lambda.Groovy("it.get().value('name')"), prefer Lambda.Groovy("x -> x.get().value('name')").

Submitting Scripts

Gremlin scripts are sent to the server from an IGremlinClient instance. An IGremlinClient is created as follows:

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsTests.cs[role=include]

If the remote system has authentication and SSL enabled, then the GremlinServer object can be configured as follows:

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsTests.cs[role=include]

It is also possible to initialize the Client to use sessions:

var gremlinServer = new GremlinServer("localhost", 8182);
var client = new GremlinClient(gremlinServer, sessionId: Guid.NewGuid().ToString());

Per Request Settings

The GremlinClient.Submit() functions accept an option to build a raw RequestMessage. A good use-case for this feature is to set a per-request override to the evaluationTimeout so that it only applies to the current request.

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsTests.cs[role=include]

The following options are allowed on a per-request basis in this fashion: batchSize, requestId, userAgent and evaluationTimeout (formerly scriptEvaluationTimeout which is also supported but now deprecated). These options are available as constants on the Gremlin.Net.Driver.Tokens class.

Important
The preferred method for setting a per-request timeout for scripts is demonstrated above, but those familiar with bytecode may try g.with(EVALUATION_TIMEOUT, 500) within a script. Scripts with multiple traversals and multiple timeouts will be interpreted as a sum of all timeouts identified in the script for that request.

Domain Specific Languages

Developing a Domain Specific Language (DSL) for .NET is most easily done with Extension Methods, as they don’t require direct extension of classes in the TinkerPop hierarchy. Extension Method classes simply need to be constructed for the GraphTraversal and the GraphTraversalSource. Unfortunately, anonymous traversals (spawned from __) can’t use the Extension Method approach, as extension methods do not work for static classes and static classes can’t be extended. The only options are to re-implement the methods of __ as a wrapper in the anonymous traversal for the DSL or to simply create a static class for the DSL and use the two anonymous traversal creators independently. The following example uses the latter approach as it saves a lot of boilerplate code with the minor annoyance of having a second static class to deal with when writing traversals rather than just calling __ for everything.

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsDsl.cs[role=include]

Note the creation of __Social as the Social DSL’s "extension" to the available ways in which to spawn anonymous traversals. The use of the double underscore prefix in the name is just a convention to consider using and is not a requirement. To use the DSL, bring it into scope with the using directive:

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsDslTests.cs[role=include]

and then it can be called from the application as follows:

link:../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Reference/GremlinVariantsDslTests.cs[role=include]

Differences

The biggest difference between Gremlin in .NET and the canonical version in Java is the casing of steps. Canonical Gremlin utilizes camelCase as is typical in Java for function names, but C# utilizes PascalCase as it is more typical in that language. Therefore, when viewing a typical Gremlin example written in Gremlin Console, the conversion to C# usually just requires capitalization of the first letter in the step name, thus the following example in Groovy:

g.V().has('person','name','marko').
  out('knows').
  elementMap().toList()

would become the following in C#:

g.V().Has("person","name","marko").
  Out("knows").
  ElementMap().ToList();

In addition to the uppercase change, also note the conversion of the single quotes to double quotes as is expected for declaring string values in C# and the addition of the semi-colon at the end of the line. In short, don’t forget to apply the common syntax expectations for C# when trying to convert an example of Gremlin from a different language.
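The casing rule itself is mechanical, as this small Python sketch shows. The helper to_pascal_case is hypothetical and covers only the step name; quoting, semicolons and generics still need to be adjusted by hand.

```python
def to_pascal_case(step_name):
    """Capitalize the first letter only: 'elementMap' -> 'ElementMap'."""
    return step_name[:1].upper() + step_name[1:]


groovy_steps = ["V", "has", "out", "elementMap", "toList"]
csharp_steps = [to_pascal_case(step) for step in groovy_steps]
```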

Another common conversion issue lies in having to explicitly define generics, which can make canonical Gremlin appear much more complex in C# where type erasure is not a feature of the language. For example, the following example in Groovy:

g.V().repeat(__.out()).times(2).values('name')

must be written as:

g.V().Repeat(__.Out()).Times(2).Values<string>("name");

Gremlin allows for Map instances to include null keys, but null keys in C# Dictionary instances are not allowed. It is therefore necessary to rewrite a traversal such as:

g.V().groupCount().by('age')

where "age" is not a valid key for all vertices, so that a null key never needs to be returned:

g.V().has('age').groupCount().by('age')
g.V().hasLabel('person').groupCount().by('age')

Either of the above two options accomplishes the desired goal as both prevent groupCount() from having to process the possibility of null.

Limitations

  • The subgraph()-step is not supported by any variant that is not running on the Java Virtual Machine as there is no Graph instance to deserialize a result into on the client-side. A workaround is to replace the step with aggregate(local) and then convert those results to something the client can use locally.

Application Examples

This dotnet template helps with getting started with Gremlin.Net. It creates a new C# console project that shows how to connect to a Gremlin Server with Gremlin.Net.

You can install the template with the dotnet CLI tool:

dotnet new -i Gremlin.Net.Template

After the template is installed, a new project based on this template can be created:

dotnet new gremlin

Specify the output directory for the new project which will then also be used as the name of the created project:

dotnet new gremlin -o MyFirstGremlinProject

Gremlin-JavaScript

Apache TinkerPop’s Gremlin-JavaScript implements Gremlin within the JavaScript language. It targets the Node.js runtime and can be used on different operating systems with any Node.js version 6 or above. Since the JavaScript naming conventions are very similar to those of Java, it should be very easy to switch between Gremlin-Java and Gremlin-JavaScript.

npm install gremlin

Connecting

The pattern for connecting is described in Connecting Gremlin and it basically distills down to creating a GraphTraversalSource. A GraphTraversalSource is created from the AnonymousTraversalSource.traversal() method, where the traversal source name used by the DriverRemoteConnection ("g" by default) corresponds to the name of a GraphTraversalSource on the remote end.

const g = traversal().withRemote(new DriverRemoteConnection('ws://localhost:8182/gremlin'));

Gremlin-JavaScript supports plain text SASL authentication, which can be set in the connection options.

const authenticator = new gremlin.driver.auth.PlainTextSaslAuthenticator('myuser', 'mypassword');
const g = traversal().withRemote(new DriverRemoteConnection('ws://localhost:8182/gremlin', { authenticator }));

Given that I/O operations in Node.js are asynchronous by default, Terminal Steps return a Promise:

  • Traversal.toList(): Returns a Promise with an Array as result value.

  • Traversal.next(): Returns a Promise with a { value, done } tuple as result value, according to the async iterator proposal.

  • Traversal.iterate(): Returns a Promise without a value.

For example:

g.V().hasLabel('person').values('name').toList()
  .then(names => console.log(names));

When using async functions it is possible to await the promises:

const names = await g.V().hasLabel('person').values('name').toList();
console.log(names);
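
The { value, done } shape returned by next() can also be consumed in a loop. The following is only a sketch of that pattern: fakeTraversal is a hypothetical stand-in that mimics the documented result shape so that the snippet runs without a server, and drain is an illustrative helper name rather than a driver function. With a real connection, the same loop would be applied to a traversal spawned from g.

```javascript
// Hypothetical stand-in for a traversal: next() resolves to { value, done },
// mirroring the result shape described above. It is NOT part of the driver.
function fakeTraversal(items) {
  let index = 0;
  return {
    next: async () =>
      index < items.length
        ? { value: items[index++], done: false }
        : { value: null, done: true },
  };
}

// Collects every remaining result by calling next() until done is true.
async function drain(traversal) {
  const results = [];
  let item = await traversal.next();
  while (!item.done) {
    results.push(item.value);
    item = await traversal.next();
  }
  return results;
}

drain(fakeTraversal(['marko', 'vadas', 'josh']))
  .then(names => console.log(names));
```

When all results are wanted at once, toList() is usually the simpler choice. A loop over next() is mainly useful when results should be processed one at a time.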

Some connection options can also be set on individual requests through the use of the with_() step on the TraversalSource. For instance, to set the request timeout to 500 milliseconds:

const vertices = await g.with_('evaluationTimeout', 500).V().out('knows').toList()

The following options are allowed on a per-request basis in this fashion: batchSize, requestId, userAgent and evaluationTimeout (formerly scriptEvaluationTimeout which is also supported but now deprecated).

Common Imports

There are a number of classes, functions and tokens that are typically used with Gremlin. The following imports provide most of the typical functionality required to use Gremlin:

const gremlin = require('gremlin');
const traversal = gremlin.process.AnonymousTraversalSource.traversal;
const __ = gremlin.process.statics;
const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;
const column = gremlin.process.column;
const direction = gremlin.process.direction;
const p = gremlin.process.P;
const textp = gremlin.process.TextP;
const pick = gremlin.process.pick;
const pop = gremlin.process.pop;
const order = gremlin.process.order;
const scope = gremlin.process.scope;
const t = gremlin.process.t;

By defining these imports it becomes possible to write Gremlin in the more shorthand, canonical style that is demonstrated in most examples found here in the documentation:

const { P: { gt } } = gremlin.process;
const { order: { desc } } = gremlin.process;
g.V().hasLabel('person').has('age',gt(30)).order().by('age',desc).toList()

Transactions

To get a full understanding of this section, it would be good to start by reading the Transactions section of this documentation, which discusses transactions in the general context of TinkerPop itself. This section builds on that content by demonstrating the transactional syntax for JavaScript.

const g = traversal().withRemote(new DriverRemoteConnection('ws://localhost:8182/gremlin'));
const tx = g.tx(); // create a Transaction

// spawn a new GraphTraversalSource binding all traversals established from it to tx
const gtx = tx.begin();

// traversals executed using gtx occur within the scope of the transaction held by tx. the
// tx is closed after calls to commit or rollback and cannot be re-used. simply spawn a
// new Transaction from g.tx() to create a new one as needed. the g context remains
// accessible through all this as a sessionless connection.
Promise.all([
  gtx.addV("person").property("name", "jorge").iterate(),
  gtx.addV("person").property("name", "josh").iterate()
]).then(() => {
  return tx.commit();
}).catch(() => {
  return tx.rollback();
});
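
The same commit-or-rollback flow can be written with async/await. The following is a minimal sketch rather than a driver feature: inTransaction is a hypothetical helper name, and it relies only on the g.tx(), tx.begin(), tx.commit() and tx.rollback() calls shown above. Unlike the Promise chain above, it rethrows the error after rolling back so that the caller can react to the failure.

```javascript
// Hypothetical helper (not part of the driver): runs `work` inside a
// transaction spawned from g, committing on success and rolling back on error.
async function inTransaction(g, work) {
  const tx = g.tx();       // create a Transaction
  const gtx = tx.begin();  // traversal source bound to the transaction
  try {
    const result = await work(gtx);
    await tx.commit();
    return result;
  } catch (err) {
    await tx.rollback();
    throw err;             // surface the failure to the caller
  }
}

// Possible usage with a connected g (not executed here):
//
// await inTransaction(g, gtx => Promise.all([
//   gtx.addV('person').property('name', 'jorge').iterate(),
//   gtx.addV('person').property('name', 'josh').iterate()
// ]));
```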

The Lambda Solution

Supporting anonymous functions across languages is difficult as most languages do not support lambda introspection and thus, code analysis. In Gremlin-JavaScript, a Gremlin lambda should be represented as a zero-arg callable that returns a string representation of the lambda expected for use in the traversal. The returned lambda should be written as a Gremlin-Groovy string. When the lambda is represented in Bytecode its language is encoded such that the remote connection host can infer which translator and ultimate execution engine to use.

g.V().out().
  map(() => "it.get().value('name').length()").
  sum().
  toList().then(total => console.log(total))
Tip
When running into situations where Groovy cannot properly discern a method signature based on the Lambda instance created, it will help to fully define the closure in the lambda expression - so rather than () => "it.get().value('name')", prefer () => "x -> x.get().value('name')".
Warning
As explained throughout the documentation, when possible avoid lambdas.
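
To make the "zero-arg callable that returns a string" idea concrete, the sketch below shows both lambda forms as plain JavaScript values. The variable names are illustrative only. Nothing here contacts a server; it simply shows that the JavaScript function is a carrier for Groovy source text.

```javascript
// A Gremlin lambda in this style is a zero-argument function whose return
// value is Groovy source text; the string is what describes the lambda.
const implicitIt = () => "it.get().value('name').length()";

// Fully defined closure form, as suggested in the tip above.
const explicitClosure = () => "x -> x.get().value('name').length()";

// Calling either function locally yields only the Groovy string.
console.log(implicitIt());
console.log(explicitClosure());
```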

Submitting Scripts

It is possible to submit parametrized Gremlin scripts to the server as strings, using the Client class:

const gremlin = require('gremlin');
const client = new gremlin.driver.Client('ws://localhost:8182/gremlin', { traversalSource: 'g' });

const result1 = await client.submit('g.V(vid)', { vid: 1 });
const vertex = result1.first();

const result2 = await client.submit('g.V().hasLabel(label).tail(n)', { label: 'person', n: 3 });

// ResultSet is an iterable
for (const vertex of result2) {
  console.log(vertex.id);
}

It is also possible to initialize the Client to use sessions:

const client = new gremlin.driver.Client('ws://localhost:8182/gremlin', { traversalSource: 'g', 'session': 'unique-string-id' });

With this configuration, the state of variables within scripts is preserved between requests.

Per Request Settings

The client.submit() function accepts a requestOptions argument, which expects a dictionary (a plain object). The requestOptions provide a way to include options that are specific to the request made with the call to submit(). A good use-case for this feature is to set a per-request override to the evaluationTimeout so that it only applies to the current request.

const result = await client.submit("g.V().repeat(both()).times(100)", null, { evaluationTimeout: 5000 })

The following options are allowed on a per-request basis in this fashion: batchSize, requestId, userAgent and evaluationTimeout (formerly scriptEvaluationTimeout which is also supported but now deprecated).
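
When request settings come from application configuration, it can help to reduce them to the option names listed above before passing them to submit(). The following is a minimal sketch under that assumption: toRequestOptions and PER_REQUEST_OPTIONS are hypothetical names, not part of the driver, and the sample values are made up.

```javascript
// Option names listed above as allowed on a per-request basis.
const PER_REQUEST_OPTIONS = ['batchSize', 'requestId', 'userAgent', 'evaluationTimeout'];

// Hypothetical helper (not part of the driver): keeps only the per-request
// options and maps the deprecated scriptEvaluationTimeout to evaluationTimeout.
function toRequestOptions(settings) {
  const options = {};
  for (const key of PER_REQUEST_OPTIONS) {
    if (settings[key] !== undefined) {
      options[key] = settings[key];
    }
  }
  if (options.evaluationTimeout === undefined &&
      settings.scriptEvaluationTimeout !== undefined) {
    options.evaluationTimeout = settings.scriptEvaluationTimeout;
  }
  return options;
}

// traversalSource is not a per-request option, so it is dropped.
const requestOptions = toRequestOptions({
  scriptEvaluationTimeout: 5000,
  batchSize: 32,
  traversalSource: 'g'
});
console.log(requestOptions);
```

The resulting object could then be passed as the third argument to client.submit().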

Important
The preferred method for setting a per-request timeout for scripts is demonstrated above, but those familiar with bytecode may try g.with(EVALUATION_TIMEOUT, 500) within a script. Scripts with multiple traversals and multiple timeouts will be interpreted as a sum of all timeouts identified in the script for that request.

Domain Specific Languages

Developing Gremlin DSLs in JavaScript largely requires extension of existing core classes with use of standalone functions for anonymous traversal spawning. The pattern is demonstrated in the following example:

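// GraphTraversal, GraphTraversalSource and Bytecode come from gremlin.process
// (gremlin is the module loaded in Common Imports)
const { GraphTraversal, GraphTraversalSource, Bytecode } = gremlin.process;

class SocialTraversal extends GraphTraversal {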
  constructor(graph, traversalStrategies, bytecode) {
    super(graph, traversalStrategies, bytecode);
  }

  aged(age) {
    return this.has('person', 'age', age);
  }
}

class SocialTraversalSource extends GraphTraversalSource {
  constructor(graph, traversalStrategies, bytecode) {
    super(graph, traversalStrategies, bytecode, SocialTraversalSource, SocialTraversal);
  }

  person(name) {
    return this.V().has('person', 'name', name);
  }
}

function anonymous() {
  return new SocialTraversal(null, null, new Bytecode());
}

function aged(age) {
  return anonymous().aged(age);
}

SocialTraversal extends the core GraphTraversal class and has a three argument constructor which is immediately proxied to the GraphTraversal constructor. New DSL steps are then added to this class using available steps to construct the underlying traversal to execute as demonstrated in the aged() step.

The SocialTraversal is spawned from a SocialTraversalSource which is extended from GraphTraversalSource. Steps added here are meant to be start steps. In the above case, the person() start step finds a "person" vertex to begin the traversal from.

Typically, steps that are made available on a GraphTraversal (i.e. SocialTraversal in this example) should also be made available as spawns for anonymous traversals. The recommendation is that these steps be exposed in the module as standalone functions. In the example above, the standalone aged() step creates an anonymous traversal through an anonymous() utility function. The method for creating these standalone functions can be handled in other ways if desired.

To use the DSL, simply initialize the g as follows:

const g = traversal(SocialTraversalSource).withRemote(connection);
g.person('marko').aged(29).values('name').toList().
  then(names => console.log(names));

Differences

In situations where JavaScript reserved words and global functions overlap with standard Gremlin steps and tokens, those bits of conflicting Gremlin get an underscore appended as a suffix:

Steps - from_(), in_(), with_()

Gremlin allows for Map instances to include null keys, but null keys in JavaScript objects have some interesting behavior as in:

> var a = { null: 'something', 'b': 'else' };
> JSON.stringify(a)
'{"null":"something","b":"else"}'
> JSON.parse(JSON.stringify(a))
{ null: 'something', b: 'else' }
> a[null]
'something'
> a['null']
'something'

This behavior needs to be considered when using Gremlin to return such results. A typical situation where this might happen is with group() or groupCount() as in:

g.V().groupCount().by('age')

where "age" is not a valid key for all vertices. In these cases, it will return null for that key and group on that. It may be better in JavaScript to filter away those vertices to avoid the return of null in the returned Map:

g.V().has('age').groupCount().by('age')
g.V().hasLabel('person').groupCount().by('age')

Either of the above two options accomplishes the desired goal as both prevent groupCount() from having to process the possibility of null.
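
The key coercion shown earlier is specific to plain objects, whereas a JavaScript Map keeps null as a distinct key. The sketch below illustrates the difference with made-up counts, and shows a client-side way to drop a null group when the traversal itself cannot be changed. withoutNullKey is a hypothetical helper name, and no assumption is made here about which container a particular driver version returns.

```javascript
// Plain objects coerce every key to a string, so null and 'null' collide.
const asObject = {};
asObject[null] = 2;
asObject['null'] = 5;                // overwrites the entry above
console.log(Object.keys(asObject));  // a single key: 'null'

// A Map keeps null and the string 'null' as two distinct keys.
const asMap = new Map([[null, 2], ['null', 5], [29, 1]]);
console.log(asMap.get(null), asMap.get('null'));

// Client-side alternative to filtering in the traversal: drop the null group.
// The sample counts above are made up for illustration.
function withoutNullKey(groupCounts) {
  return new Map([...groupCounts].filter(([key]) => key !== null));
}
console.log(withoutNullKey(asMap).size);
```

Filtering in the traversal, as in the two options above, remains preferable where possible since it avoids sending the unwanted group over the wire.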

Limitations

  • The subgraph()-step is not supported by any variant that is not running on the Java Virtual Machine as there is no Graph instance to deserialize a result into on the client-side. A workaround is to replace the step with aggregate(local) and then convert those results to something the client can use locally.