Euler OP EN

Graph initialization
Node/edge sampling OP
Random walk OP
Node neighbor visiting OP
Node/edge attribute visiting OP

tf_euler defines a series of OPs that bridge TensorFlow and Euler. These OPs can be used directly inside a TensorFlow computation graph to query the graph, and all of them are accessible from the top-level tf_euler module.

Graph initialization

Before running the TensorFlow computation graph, call one of the following two functions to initialize the Euler graph.

# Load an entire graph and use it independently
tf_euler.initialize_embedded_graph(
    directory, # graph data directory
    graph_type # graph type: compact / fast, defaults to compact
) # Return a bool to indicate whether the graph is initialized successfully.

# Automatically split and share graph data between different workers
tf_euler.initialize_shared_graph(
    directory, # graph data directory
    zk_addr, # Zookeeper address, ip:port
    zk_path,
    shard_idx, # index of this shard
    shard_num, # total number of shards
    global_sampler_type, # global sampler type: all / node / edge / none, defaults to node.
    graph_type, # graph type: compact / fast, defaults to compact.
    server_thread_num # Euler service thread number, defaults to 4.
) # Return a bool to indicate whether the graph is initialized successfully.

In shared mode, the entire graph is split into multiple shards, and each shard can have multiple worker / euler instances.
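
Both initialization functions report failure through their bool return value, so it can be worth checking the result before building the model. The following is a minimal sketch, not part of tf_euler: `init_graph_or_fail` is a hypothetical helper, and the module is passed in as a parameter only so the helper can be exercised without Euler installed.

```python
def init_graph_or_fail(euler, directory, graph_type="compact"):
    """Initialize an embedded graph and raise if initialization fails.

    `euler` is expected to be the imported tf_euler module. It is a
    parameter (an assumption of this sketch) so a stand-in can be used.
    """
    # Arguments are passed positionally, in the order documented above.
    ok = euler.initialize_embedded_graph(directory, graph_type)
    if not ok:
        raise RuntimeError("failed to initialize graph from %r" % (directory,))
    return ok
```

In a real program this would be called as `init_graph_or_fail(tf_euler, directory)` before the first `session.run`.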

Node/edge sampling OP

tf_euler provides sampling of nodes and edges by type.

tf_euler.ALL_NODE_TYPE # int, pass it as node_type to tf_euler.sample_node to sample across all node types.

tf_euler.sample_node(
  count, # 0-D int32 tf.Tensor, number of samples
  node_type # 0-D int32 tf.Tensor, the node type to sample
) # return 1-D int64 tf.Tensor [count], the sampling results

tf_euler.sample_edge(
  count, # 0-D int32 tf.Tensor, number of samples
  edge_type # 0-D int32 tf.Tensor, the edge type to sample
) # return 2-D int64 tf.Tensor [count, 2], the sampling results

# Sample nodes whose type is decided by the type of each src node
tf_euler.sample_node_with_src(
  src_nodes, # 1-D int64 tf.Tensor [src_node_count], src node IDs, used to decide which type is sampled
  n, # 0-D int32 tf.Tensor, the number of samples for each src node
  share_sample # bool, defaults to False: n nodes are sampled independently for each src node;
               # if set to True, src nodes of the same type share the same n sampled nodes.
) # return 2-D int64 tf.Tensor [src_node_count, n], the sampling results
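
The effect of share_sample can be illustrated with a small pure-Python sketch. It is not Euler's implementation: the dictionaries `node_type_of` and `nodes_by_type`, the `rng` parameter, and uniform sampling with replacement are all assumptions made for illustration.

```python
import random

def sample_node_with_src_sketch(src_nodes, n, node_type_of, nodes_by_type,
                                share_sample=False, rng=None):
    """Return a [len(src_nodes), n] nested list of sampled node IDs.

    node_type_of:  dict mapping node ID -> node type (hypothetical input).
    nodes_by_type: dict mapping node type -> list of node IDs (hypothetical).
    """
    rng = rng or random.Random()
    shared = {}  # node type -> the n samples shared by src nodes of that type
    rows = []
    for src in src_nodes:
        node_type = node_type_of[src]
        if share_sample:
            if node_type not in shared:
                pool = nodes_by_type[node_type]
                shared[node_type] = [rng.choice(pool) for _ in range(n)]
            rows.append(list(shared[node_type]))
        else:
            # Independent draw of n nodes for every src node.
            pool = nodes_by_type[node_type]
            rows.append([rng.choice(pool) for _ in range(n)])
    return rows
```

With share_sample set to True, rows belonging to src nodes of the same type are identical, which is what makes shared negative samples cheap to reuse within a batch.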

Random walk OP

tf_euler provides meta-path guided random walk and node pair generation operations for defining random walk models. Refer to Node2Vec for details on the parameters p and q.

tf_euler.random_walk(
  nodes, # 1-D int64 tf.Tensor, the start nodes of the random walks.
  edge_types, # Python list of int32 tf.Tensor, the edge types to walk along in each step.
  p, # float, defaults to 1
  q, # float, defaults to 1
  default_node # int, the padding value used when the source node doesn't exist or has no out-going edges, defaults to -1.
) # return 2-D int64 tf.Tensor [#nodes, len(edge_types) + 1], the random walk results
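
To make the roles of edge_types, p, q, and default_node concrete, here is a pure-Python sketch of a Node2Vec-style biased walk. It is an illustration under stated assumptions, not Euler's kernel: the adjacency format `adj` (node -> list of `(neighbor, weight, edge_type)`), the `rng` parameter, and the function name are hypothetical.

```python
import random

def random_walk_sketch(adj, nodes, edge_types, p=1.0, q=1.0,
                       default_node=-1, rng=None):
    """Return one walk per start node, each of length len(edge_types) + 1."""
    rng = rng or random.Random()
    walks = []
    for start in nodes:
        walk, prev, cur = [start], None, start
        for step_types in edge_types:
            allowed = set(step_types)
            cands = [] if cur == default_node else [
                e for e in adj.get(cur, []) if e[2] in allowed]
            if not cands:
                # Missing node or no matching out-going edge: pad the walk.
                walk.append(default_node)
                prev, cur = cur, default_node
                continue
            prev_nbrs = set(e[0] for e in adj.get(prev, [])) if prev is not None else set()
            weights = []
            for nxt, weight, _ in cands:
                if prev is None or nxt in prev_nbrs:
                    weights.append(weight)        # first step, or distance 1 from prev
                elif nxt == prev:
                    weights.append(weight / p)    # return to the previous node
                else:
                    weights.append(weight / q)    # move further away from prev
            nxt = rng.choices([e[0] for e in cands], weights=weights, k=1)[0]
            walk.append(nxt)
            prev, cur = cur, nxt
        walks.append(walk)
    return walks
```

A small p makes the walk more likely to step back to where it came from, while a small q pushes it outward, which matches the Node2Vec intuition.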

tf_euler.gen_pair(
  paths, # 2-D int64 tf.Tensor, random walk results
  left_win_size, # int, left window size
  right_win_size # int, right window size
) # return 3-D int64 tf.Tensor [#paths, #pairs, 2], node pairs
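
One plausible reading of gen_pair is the skip-gram style pairing sketched below: every node in a path is paired with the nodes that fall inside its left and right windows. This is a pure-Python illustration, and the pair ordering inside each path is an assumption; Euler's kernel may order or pad pairs differently.

```python
def gen_pair_sketch(paths, left_win_size, right_win_size):
    """Return a [#paths, #pairs, 2] nested list of (center, context) pairs."""
    result = []
    for path in paths:
        pairs = []
        for i, center in enumerate(path):
            first = max(0, i - left_win_size)
            last = min(len(path) - 1, i + right_win_size)
            for j in range(first, last + 1):
                if j != i:  # a node is not paired with its own position
                    pairs.append([center, path[j]])
        result.append(pairs)
    return result
```

Because all paths produced by one random walk call have the same length, every path yields the same number of pairs, which is consistent with a rectangular 3-D result.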

Node neighbor visiting OP

tf_euler provides two ways to visit a subset of a node's neighbors, filtered by edge type:

Weighted sampling, in a single-hop version and a multi-hop version;
Top-K selection by weight.

and two ways to visit all neighbors:

The full neighbors;
The full neighbors sorted by node ID.

tf_euler.sample_neighbor(
  nodes, # 1-D int64 tf.Tensor, the source/start nodes
  edge_types, # 1-D int64 tf.Tensor, the edge types
  count, # int, number of samples per node
  default_node # int, the padding value used when the source node doesn't exist or has no out-going edges, defaults to -1.
) # return a 3-tuple: (neighbor, weight, edge_type)
  # (2-D int64 tf.Tensor [#nodes, count],
  #  2-D float tf.Tensor [#nodes, count],
  #  2-D int32 tf.Tensor [#nodes, count])
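
The shape contract of sample_neighbor can be sketched in pure Python. This is an illustration, not Euler's implementation: the adjacency format `adj` (node -> list of `(neighbor, weight, edge_type)`), sampling with replacement, and the padding weight 0.0 and padding type -1 are assumptions of the sketch.

```python
import random

def sample_neighbor_sketch(adj, nodes, edge_types, count,
                           default_node=-1, rng=None):
    """Return (neighbors, weights, types), each a [len(nodes), count] nested list."""
    rng = rng or random.Random()
    allowed = set(edge_types)
    neighbors, weights, types = [], [], []
    for node in nodes:
        cands = [e for e in adj.get(node, []) if e[2] in allowed]
        if cands:
            # Weighted sampling with replacement over the matching edges.
            picks = rng.choices(cands, weights=[e[1] for e in cands], k=count)
        else:
            # Unknown node or no matching edge: fill the row with padding.
            picks = [(default_node, 0.0, -1)] * count
        neighbors.append([e[0] for e in picks])
        weights.append([e[1] for e in picks])
        types.append([e[2] for e in picks])
    return neighbors, weights, types
```

Every row has exactly count entries, so downstream code can reshape the result without inspecting how many neighbors each node really has.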

tf_euler.sample_fanout(
  nodes, # 1-D int64 tf.Tensor, the source/start nodes
  edge_types, # list of 1-D int64 tf.Tensor, the edge types per hop
  counts, # list of int, number of samples per hop
  default_node # int, the padding value used when the source node doesn't exist or has no out-going edges, defaults to -1.
) # return a 3-tuple: (neighbor_list, weight_list, edge_type_list)
  # (list of 1-D int64 tf.Tensor with sizes [#nodes], [#nodes x count1], [#nodes x count1 x count2], ...,
  #  list of 1-D float tf.Tensor with sizes [#nodes x count1], [#nodes x count1 x count2], ...,
  #  list of 1-D int32 tf.Tensor with sizes [#nodes x count1], [#nodes x count1 x count2], ...)
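
The sizes listed above grow multiplicatively with each hop. A short helper makes the arithmetic explicit; `fanout_sizes` is a hypothetical name used only for this illustration.

```python
def fanout_sizes(num_nodes, counts):
    """Return the element counts of the neighbor tensors, source nodes first."""
    sizes = [num_nodes]
    for count in counts:
        # Each hop samples `count` neighbors for every node of the previous hop.
        sizes.append(sizes[-1] * count)
    return sizes
```

For example, a batch of 4 source nodes with counts [10, 5] gives sizes 4, 40, and 200, so fan-out settings should be chosen with the resulting batch memory in mind.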

tf_euler.get_top_k_neighbor(
  nodes, # 1-D int64 tf.Tensor, the source/start nodes
  edge_types, # 1-D int64 tf.Tensor, the edge types
  k, # int
  default_node # the padding value, defaults to -1.
) # return a 3-tuple: (neighbor, weight, edge_type)
  # (2-D int64 tf.Tensor [#nodes, k],
  #  2-D float tf.Tensor [#nodes, k],
  #  2-D int32 tf.Tensor [#nodes, k])
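
Top-K selection differs from sampling in that it is deterministic. The sketch below shows the idea in pure Python under the same illustrative assumptions as elsewhere on this page: `adj` maps a node to a list of `(neighbor, weight, edge_type)`, and padding entries use weight 0.0 and type -1, which are choices of this sketch rather than documented behavior.

```python
import heapq

def top_k_neighbor_sketch(adj, nodes, edge_types, k, default_node=-1):
    """Return (neighbors, weights, types), each a [len(nodes), k] nested list."""
    allowed = set(edge_types)
    neighbors, weights, types = [], [], []
    for node in nodes:
        cands = [e for e in adj.get(node, []) if e[2] in allowed]
        # Keep the k heaviest edges, largest weight first.
        top = heapq.nlargest(k, cands, key=lambda e: e[1])
        # Pad rows that have fewer than k matching neighbors.
        top = top + [(default_node, 0.0, -1)] * (k - len(top))
        neighbors.append([e[0] for e in top])
        weights.append([e[1] for e in top])
        types.append([e[2] for e in top])
    return neighbors, weights, types
```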

tf_euler.get_full_neighbor(
  nodes, # 1-D int64 tf.Tensor, the source/start nodes
  edge_types # 1-D int64 tf.Tensor, the edge types
) # return a 3-tuple: (neighbor, weight, edge_type)
  # (2-D int64 tf.SparseTensor [#nodes, max_num_neighbors],
  #  2-D float tf.SparseTensor [#nodes, max_num_neighbors],
  #  2-D int32 tf.SparseTensor [#nodes, max_num_neighbors])

tf_euler.get_sorted_full_neighbor(
  nodes, # 1-D int64 tf.Tensor, the source/start nodes
  edge_types # 1-D int64 tf.Tensor, the edge types
) # return a 3-tuple: (neighbor, weight, edge_type)
  # (2-D int64 tf.SparseTensor [#nodes, max_num_neighbors],
  #  2-D float tf.SparseTensor [#nodes, max_num_neighbors],
  #  2-D int32 tf.SparseTensor [#nodes, max_num_neighbors])
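
Because nodes have different numbers of neighbors, the full-neighbor OPs return sparse tensors instead of padded dense ones. The sketch below builds the `(indices, values, dense_shape)` triple that a tf.SparseTensor is made of, for the neighbor IDs only. It is a pure-Python illustration: the `adj` format (node -> list of `(neighbor, weight, edge_type)`) and the function name are assumptions.

```python
def full_neighbor_sparse_sketch(adj, nodes, edge_types, sort_by_id=False):
    """Return (indices, values, dense_shape) describing the neighbor IDs."""
    allowed = set(edge_types)
    indices, values, max_num_neighbors = [], [], 0
    for row, node in enumerate(nodes):
        nbrs = [e[0] for e in adj.get(node, []) if e[2] in allowed]
        if sort_by_id:
            nbrs.sort()  # mimics the "sorted by node ID" variant
        for col, nbr in enumerate(nbrs):
            indices.append([row, col])
            values.append(nbr)
        max_num_neighbors = max(max_num_neighbors, len(nbrs))
    return indices, values, [len(nodes), max_num_neighbors]
```

The second dimension of dense_shape is the largest neighbor count in the batch, which corresponds to max_num_neighbors in the signatures above.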

Node/edge attribute visiting OP

tf_euler classifies features into three types: dense, sparse, and binary.

The dense feature corresponds to the Float32Feature in the graph and can be used as an input to a fully connected layer or a convolutional layer in the model;
The sparse feature corresponds to the UInt64Feature in the graph and can be used as an input to tf.nn.embedding_lookup_sparse in the model;
The binary feature corresponds to the BinaryFeature in the graph and can be used in a user-defined way.

tf_euler provides operations to read all three feature types for both nodes and edges.

tf_euler.get_dense_feature(
  nodes, # 1-D int64 tf.Tensor, the set of nodes
  feature_ids, # list of int, feature id of each dense feature
  dimensions # list of int, dimension of each dense feature
) # return 2-D float tf.Tensor [#nodes, dimension]
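
The dimensions argument fixes the width of the returned dense tensor. The sketch below shows one way such a fixed width can be produced for a single feature. It assumes, without confirmation from this page, that shorter stored values are zero-padded and longer ones truncated; the `features` dictionary and the function name are hypothetical.

```python
def dense_feature_sketch(features, nodes, feature_id, dimension):
    """Return a [len(nodes), dimension] nested list of floats.

    features: dict mapping node ID -> dict of feature_id -> list of float.
    """
    rows = []
    for node in nodes:
        raw = features.get(node, {}).get(feature_id, [])
        row = [float(x) for x in raw[:dimension]]   # truncate if too long
        row += [0.0] * (dimension - len(row))       # zero-pad if too short
        rows.append(row)
    return rows
```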

tf_euler.get_sparse_feature(
  nodes, # 1-D int64 tf.Tensor, the set of nodes
  feature_ids, # list of int, feature id of each sparse feature
  default_values # list of int, default value used when a node has no values for a feature
) # return 2-D int64 tf.SparseTensor [#nodes, max_num_features]
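
The role of default_values can be shown with a pure-Python sketch for a single sparse feature: a node without any stored value still gets one entry, so every row of the sparse result is non-empty. The `features` dictionary and the function name are assumptions made for illustration, not Euler's data structures.

```python
def sparse_feature_sketch(features, nodes, feature_id, default_value):
    """Return (indices, values, dense_shape) for one sparse feature.

    features: dict mapping node ID -> dict of feature_id -> list of int.
    """
    indices, values, max_num_features = [], [], 0
    for row, node in enumerate(nodes):
        ids = features.get(node, {}).get(feature_id) or [default_value]
        for col, value in enumerate(ids):
            indices.append([row, col])
            values.append(value)
        max_num_features = max(max_num_features, len(ids))
    return indices, values, [len(nodes), max_num_features]
```

A triple like this is the kind of ID input that tf.nn.embedding_lookup_sparse expects as its sp_ids argument once wrapped in a tf.SparseTensor.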

tf_euler.get_binary_feature(
  nodes, # 1-D int64 tf.Tensor, the set of nodes
  feature_ids # list of int, feature id of each binary feature
) # return 1-D string tf.Tensor [#nodes]
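
Binary features arrive as opaque byte strings, so decoding is up to the user. As one example, if a feature had been written as packed little-endian float32 values (purely an assumption for this sketch; the actual encoding is whatever was used when the graph data was built), the bytes could be decoded like this:

```python
import struct

def decode_float32_blob(blob):
    """Decode bytes holding little-endian float32 values into a list of floats."""
    count = len(blob) // 4  # 4 bytes per float32; trailing bytes are ignored
    return list(struct.unpack("<%df" % count, blob[:count * 4]))
```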

tf_euler.get_edge_dense_feature(
  edges, # 2-D int64 tf.Tensor [#edges, 3], the set of edges
  feature_ids, # list of int, feature id of each dense feature
  dimensions # list of int, dimension of each dense feature
) # return 2-D float tf.Tensor [#edges, dimension]

tf_euler.get_edge_sparse_feature(
  edges, # 2-D int64 tf.Tensor [#edges, 3], the set of edges
  feature_ids, # list of int, feature id of each sparse feature
  default_value # list of int, default value used when an edge has no values for a feature
) # return 2-D int64 tf.SparseTensor [#edges, max_num_features]

tf_euler.get_edge_binary_feature(
  edges, # 2-D int64 tf.Tensor [#edges, 3], the set of edges
  feature_ids # list of int, feature id of each binary feature
) # return 1-D string tf.Tensor [#edges]
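
The edge OPs take a [#edges, 3] tensor. Assuming the three columns are (source node, destination node, edge type), which this page does not state explicitly, such rows can be assembled from neighbor sampling results. The following pure-Python sketch uses a hypothetical helper name and skips padded neighbors:

```python
def build_edge_rows(src_nodes, neighbors, edge_types, default_node=-1):
    """Flatten per-node neighbor samples into [#edges, 3] rows.

    neighbors and edge_types are [len(src_nodes), count] nested lists.
    """
    rows = []
    for src, nbr_row, type_row in zip(src_nodes, neighbors, edge_types):
        for dst, edge_type in zip(nbr_row, type_row):
            if dst == default_node:
                continue  # padding entry, no real edge behind it
            rows.append([src, dst, edge_type])
    return rows
```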