Euler OP EN
- Graph initialization
- Node/edge sampling OP
- Random walk OP
- Node neighbor visiting OP
- Node/edge attribute visiting OP
tf_euler defines a series of OPs that bridge TensorFlow and Euler. Users can call these OPs directly in a TensorFlow computation graph to query the graph. All OPs are accessible from the top-level tf_euler module.
Before running the TensorFlow computing graph, you need to call one of the following two functions to initialize the graph.
# Load an entire graph and use it independently
tf_euler.initialize_embedded_graph(
directory, # graph data directory
graph_type # graph type: compact / fast, defaults to compact
) # Return a bool to indicate whether the graph is initialized successfully.
# Automatically split and share graph data between different workers
tf_euler.initialize_shared_graph(
directory, # graph data directory
zk_addr, # Zookeeper address, ip:port
zk_path,
shard_idx,
shard_num,
global_sampler_type, # global_sampler_type: all / node / edge / none, defaults to node.
graph_type, # graph type, compact / fast, defaults to compact.
server_thread_num # euler service thread number, defaults to 4.
) # return bool to indicate whether the graph is initialized successfully.
In shared mode, the entire graph is split into multiple shards, and each shard can have multiple worker / euler instances.
tf_euler provides sampling of nodes and edges by type.
tf_euler.ALL_NODE_TYPE # int, the node_type used with tf_euler.sample_node for full-type sampling.
tf_euler.sample_node(
count, # 0-D int32 tf.Tensor, sampling number
node_type # 0-D int32 tf.Tensor, sampling node_type
) # return 1-D int64 tf.Tensor [count], the sampling results
tf_euler.sample_edge(
count, # 0-D int32 tf.Tensor, sampling number
edge_type # 0-D int32 tf.Tensor, sampling edge_type
) # return 2-D int64 tf.Tensor [count, 2], the sampling results
# Decide which type of node to be sampled based on the type of src_node
tf_euler.sample_node_with_src(
src_nodes, # 1-D int64 tf.Tensor, src node ID, used to decide which type to be sampled, shape:[src_node_count]
n, # 0-D int32 tf.Tensor, the sampling number for each src_id
# bool, defaults to False: n nodes are sampled independently for each src_node;
# if set to True, src_nodes of the same type share the same n sampled nodes.
share_sample
)
# return 2-D int64 tf.Tensor, sampling results, shape: [src_node_count, n]
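The effect of `share_sample` can be illustrated in plain Python. This is only a sketch of the semantics described in the comments above, not tf_euler's implementation: `NODE_TYPE`, `NODES_BY_TYPE`, and `sample_with_src` are hypothetical names, and uniform sampling with replacement is an assumption.

```python
import random

# Hypothetical toy graph: node id -> node type, plus the reverse index.
NODE_TYPE = {1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
NODES_BY_TYPE = {0: [1, 2], 1: [3, 4, 5]}

def sample_with_src(src_nodes, n, share_sample=False, seed=0):
    """Return a [len(src_nodes), n] nested list of sampled node ids.

    Each row holds nodes of the same type as the corresponding src node.
    Uniform sampling with replacement is an assumption of this sketch.
    """
    rng = random.Random(seed)
    shared = {}  # node type -> one sample reused by all src nodes of that type
    rows = []
    for src in src_nodes:
        node_type = NODE_TYPE[src]
        if share_sample:
            if node_type not in shared:
                shared[node_type] = rng.choices(NODES_BY_TYPE[node_type], k=n)
            rows.append(list(shared[node_type]))
        else:
            rows.append(rng.choices(NODES_BY_TYPE[node_type], k=n))
    return rows

independent = sample_with_src([1, 2, 3], n=4)
shared_rows = sample_with_src([1, 2, 3], n=4, share_sample=True)
```

With `share_sample=True`, the rows for nodes 1 and 2 (both type 0) are identical, while node 3 gets its own row drawn from type 1.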
tf_euler provides meta-path guided random walk and node pair generation operations for defining random walk models. Refer to Node2Vec for details.
tf_euler.random_walk(
nodes, # 1-D int64 tf.Tensor, the source/start node for random walk.
edge_types, # Python list of int32 tf.Tensor, the walked edge type in each step.
p, # float, defaults to 1
q, # float, defaults to 1
default_node # int, the padding value when the source node doesn't exist or there is not any out-going edge, defaults to -1.
) # return 2-D int64 tf.Tensor [#nodes, len(edge_types) + 1], the random walk results
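To make the roles of `edge_types`, `p`, `q`, and `default_node` concrete, here is a minimal pure-Python sketch of a Node2Vec-style walk for a single start node. It is not tf_euler's implementation: `biased_walk`, `ADJ`, and the adjacency format are hypothetical, and the bias rule (divide the weight by `p` when returning to the previous node, by `q` when moving to a node that is not a neighbor of the previous node) follows the usual Node2Vec definition.

```python
import random

def biased_walk(adj, start, edge_types, p=1.0, q=1.0, default_node=-1, seed=0):
    """Walk len(edge_types) steps from `start`.

    adj: dict mapping node -> list of (dst, weight, edge_type) out-edges.
    edge_types: one collection of allowed edge types per step (the meta-path).
    Returns a list of length len(edge_types) + 1, padded with default_node
    once the walk reaches a node with no usable out-going edge.
    """
    rng = random.Random(seed)
    path = [start]
    prev, cur = None, start
    for allowed in edge_types:
        # Keep only the out-edges whose type is allowed at this step.
        edges = [(dst, w) for dst, w, t in adj.get(cur, []) if t in allowed]
        if not edges:
            path.append(default_node)
            prev, cur = cur, default_node
            continue
        prev_neighbors = {dst for dst, _, _ in adj.get(prev, [])}
        weights = []
        for dst, w in edges:
            if prev is None:
                weights.append(w)        # first step: no bias yet
            elif dst == prev:
                weights.append(w / p)    # return to the previous node
            elif dst in prev_neighbors:
                weights.append(w)        # stay close to the previous node
            else:
                weights.append(w / q)    # move further away
        nxt = rng.choices([dst for dst, _ in edges], weights=weights, k=1)[0]
        path.append(nxt)
        prev, cur = cur, nxt
    return path

# Hypothetical toy graph with two edge types (0 and 1).
ADJ = {
    1: [(2, 1.0, 0), (3, 1.0, 0)],
    2: [(1, 1.0, 0), (3, 1.0, 0)],
    3: [(4, 1.0, 1)],
}
walk = biased_walk(ADJ, start=1, edge_types=[{0}, {0}, {1}, {1}])
dead_end = biased_walk(ADJ, start=4, edge_types=[{0}, {0}])
```

Node 4 has no out-going edges, so `dead_end` consists of the start node followed by padding only.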
tf_euler.gen_pair(
paths, # 2-D int64 tf.Tensor, random walk results
left_win_size, # int, left window size
right_win_size # int, right window size
) # return 3-D int64 tf.Tensor [#paths, #pairs, 2], node pairs
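The windowing idea behind pair generation can be sketched for a single path as follows. `gen_pairs` is a hypothetical helper written for illustration; the order in which tf_euler emits pairs, and how it treats padded nodes, are not specified above and are assumptions here.

```python
def gen_pairs(path, left_win_size, right_win_size):
    """Pair every node in `path` with the nodes inside its context window."""
    pairs = []
    for i, center in enumerate(path):
        lo = max(0, i - left_win_size)               # clip window at path start
        hi = min(len(path) - 1, i + right_win_size)  # clip window at path end
        for j in range(lo, hi + 1):
            if j != i:
                pairs.append((center, path[j]))
    return pairs

pairs = gen_pairs([10, 11, 12, 13], left_win_size=1, right_win_size=1)
# pairs == [(10, 11), (11, 10), (11, 12), (12, 11), (12, 13), (13, 12)]
```

Interior nodes contribute one pair per window slot, while nodes near either end of the path contribute fewer because the window is clipped.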
tf_euler provides two ways to visit a subset of neighbors by edge type:
- Weight sampling, single-hop version and multi-hop version;
- Take top K according to the weight.
and two ways to visit the full neighbors:
- The full neighbors;
- The full neighbors sorted by node id.
tf_euler.sample_neighbor(
nodes, # 1-D int64 tf.Tensor, the source/start nodes
edge_types, # 1-D int64 tf.Tensor, the edge types
count, # int, number of samples per node
default_node # the padding value for when the source node doesn't exist or there is not any out-going edge, defaults to -1.
) # return a 3-element Python tuple: (neighbor, weight, edge_type)
# (2-D int64 tf.Tensor [#nodes, count],
# 2-D float tf.Tensor [#nodes, count],
# 2-D int32 tf.Tensor [#nodes, count])
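The shape contract of weighted neighbor sampling (one row of `count` samples per source node, padded when nothing can be sampled) can be sketched in plain Python. `sample_neighbor_sketch` and the adjacency format are hypothetical; sampling with replacement and the padding values used for weight (`0.0`) and edge type (`-1`) are assumptions of this sketch rather than documented tf_euler behavior.

```python
import random

def sample_neighbor_sketch(adj, nodes, edge_types, count,
                           default_node=-1, seed=0):
    """Weighted neighbor sampling restricted to `edge_types`.

    adj: dict mapping node -> list of (dst, weight, edge_type) out-edges.
    Returns three nested lists, each shaped [len(nodes), count].
    """
    rng = random.Random(seed)
    allowed = set(edge_types)
    out_ids, out_weights, out_types = [], [], []
    for node in nodes:
        edges = [e for e in adj.get(node, []) if e[2] in allowed]
        if edges:
            # Draw `count` edges with probability proportional to weight.
            picked = rng.choices(edges, weights=[e[1] for e in edges], k=count)
        else:
            # Missing node or no matching out-going edge: pad the whole row.
            picked = [(default_node, 0.0, -1)] * count
        out_ids.append([e[0] for e in picked])
        out_weights.append([e[1] for e in picked])
        out_types.append([e[2] for e in picked])
    return out_ids, out_weights, out_types

ADJ = {1: [(2, 1.0, 0), (3, 3.0, 0), (4, 1.0, 1)]}
ids, weights, types = sample_neighbor_sketch(ADJ, [1, 9], [0], count=3)
```

Node 9 does not exist in the toy graph, so its row is filled entirely with `default_node`.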
tf_euler.sample_fanout(
nodes, # 1-D int64 tf.Tensor, the source/start nodes
edge_types, # list of 1-D int64 tf.Tensor, the edge types per hop
counts, # list of int, number of samples per hop
default_node, # the padding value for when the source node doesn't exist or there is not any out-going edge, defaults to -1.
) # return a 3-element Python tuple: (neighbor_list, weight_list, edge_type_list)
# ([2-D int64 tf.Tensor [#nodes], [#nodes x count1], [#nodes x count1 x count2], ...],
# [2-D float tf.Tensor [#nodes x count1], [#nodes x count1 x count2], ...],
# [2-D int32 tf.Tensor [#nodes x count1], [#nodes x count1 x count2], ...] )
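The way the result sizes grow from hop to hop can be sketched as follows. `fanout_sketch` is a hypothetical helper; it samples uniformly and ignores weights and edge types to keep the example short, and it reads the sizes listed above as flattened lengths, which is an assumption.

```python
import random

def fanout_sketch(adj, nodes, counts, default_node=-1, seed=0):
    """Multi-hop sampling: each hop samples `count` neighbors per frontier node.

    adj: dict mapping node -> list of neighbor ids.
    Returns a list of flat lists with lengths
    [n, n * counts[0], n * counts[0] * counts[1], ...].
    """
    rng = random.Random(seed)
    layers = [list(nodes)]
    for count in counts:
        frontier = []
        for node in layers[-1]:
            neighbors = adj.get(node, [])
            if neighbors:
                frontier.extend(rng.choices(neighbors, k=count))
            else:
                # Padding nodes propagate: they have no neighbors either.
                frontier.extend([default_node] * count)
        layers.append(frontier)
    return layers

ADJ = {1: [2, 3], 2: [1], 3: [1, 2]}
layers = fanout_sketch(ADJ, nodes=[1, 2], counts=[3, 2])
sizes = [len(layer) for layer in layers]
# sizes == [2, 6, 12]
```

Because every hop multiplies the frontier by its count, even modest counts expand quickly over several hops.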
tf_euler.get_top_k_neighbor(
nodes, # 1-D int64 tf.Tensor, the source/start nodes
edge_types, # 1-D int64 tf.Tensor, the edge types
k, # int
default_node # the padding value, defaults to -1.
) # return a 3-element Python tuple:
# (2-D int64 tf.Tensor [#nodes, k],
# 2-D float tf.Tensor [#nodes, k],
# 2-D int32 tf.Tensor [#nodes, k])
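Top-K selection differs from sampling in that it is deterministic: the `k` heaviest matching edges are kept and shorter rows are padded. A plain-Python sketch, where `top_k_neighbor_sketch` and the adjacency format are hypothetical and the padding values for weight and edge type are assumptions:

```python
def top_k_neighbor_sketch(adj, nodes, edge_types, k, default_node=-1):
    """Keep the k highest-weight neighbors per node, padding short rows.

    adj: dict mapping node -> list of (dst, weight, edge_type) out-edges.
    Returns three nested lists, each shaped [len(nodes), k].
    """
    allowed = set(edge_types)
    ids, weights, types = [], [], []
    for node in nodes:
        edges = [e for e in adj.get(node, []) if e[2] in allowed]
        edges.sort(key=lambda e: e[1], reverse=True)  # heaviest first
        edges = edges[:k]
        edges += [(default_node, 0.0, -1)] * (k - len(edges))
        ids.append([e[0] for e in edges])
        weights.append([e[1] for e in edges])
        types.append([e[2] for e in edges])
    return ids, weights, types

ADJ = {1: [(2, 0.5, 0), (3, 2.0, 0), (4, 1.0, 1)]}
ids, weights, types = top_k_neighbor_sketch(ADJ, [1, 9], [0, 1], k=2)
# ids == [[3, 4], [-1, -1]]
```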
tf_euler.get_full_neighbor(
nodes, # 1-D int64 tf.Tensor, the source/start nodes
edge_types # 1-D int64 tf.Tensor, the edge types
) # return a 3-element Python tuple:
# (2-D int64 tf.SparseTensor [#nodes, max_num_neighbors],
# 2-D float tf.SparseTensor [#nodes, max_num_neighbors],
# 2-D int32 tf.SparseTensor [#nodes, max_num_neighbors])
tf_euler.get_sorted_full_neighbor(
nodes, # 1-D int64 tf.Tensor, the source/start nodes
edge_types # 1-D int64 tf.Tensor, the edge types
) # return a 3-element Python tuple:
# (2-D int64 tf.SparseTensor [#nodes, max_num_neighbors],
# 2-D float tf.SparseTensor [#nodes, max_num_neighbors],
# 2-D int32 tf.SparseTensor [#nodes, max_num_neighbors])
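Full-neighbor results are sparse because nodes have different numbers of neighbors. The sketch below builds the same kind of coordinate layout that a `tf.SparseTensor` uses (`indices`, `values`, `dense_shape`) in plain Python. `full_neighbor_coo` and the adjacency format are hypothetical, and the exact layout tf_euler produces is an assumption.

```python
def full_neighbor_coo(adj, nodes, edge_types, sort_by_id=False):
    """Collect all matching neighbors in coordinate (COO) form.

    adj: dict mapping node -> list of (dst, weight, edge_type) out-edges.
    Returns (indices, ids, weights, types, dense_shape), where indices[i]
    is the (row, col) position of ids[i], weights[i], and types[i].
    """
    allowed = set(edge_types)
    indices, ids, weights, types = [], [], [], []
    max_len = 0
    for row, node in enumerate(nodes):
        edges = [e for e in adj.get(node, []) if e[2] in allowed]
        if sort_by_id:
            edges = sorted(edges, key=lambda e: e[0])
        max_len = max(max_len, len(edges))
        for col, (dst, weight, edge_type) in enumerate(edges):
            indices.append((row, col))
            ids.append(dst)
            weights.append(weight)
            types.append(edge_type)
    dense_shape = (len(nodes), max_len)  # [#nodes, max_num_neighbors]
    return indices, ids, weights, types, dense_shape

ADJ = {1: [(5, 1.0, 0), (3, 2.0, 0)], 2: [(4, 1.0, 1)]}
indices, ids, weights, types, dense_shape = full_neighbor_coo(
    ADJ, [1, 2, 7], [0, 1], sort_by_id=True)
# indices == [(0, 0), (0, 1), (1, 0)], ids == [3, 5, 4], dense_shape == (3, 2)
```

Node 7 has no neighbors, so it contributes a row to `dense_shape` but no entries to `indices`.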
tf_euler classifies features into three types: dense, sparse, and binary.
- The dense feature corresponds to the Float32Feature in the graph and can be used as an input to a fully connected or convolutional layer in the model;
- The sparse feature corresponds to the UInt64Feature in the graph and can be used as an input to tf.nn.embedding_lookup_sparse in the model;
- The binary feature corresponds to the BinaryFeature in the graph and can be used in a user-defined way.
tf_euler provides operations to visit all three feature types for both nodes and edges.
tf_euler.get_dense_feature(
nodes, # 1-D int64 tf.Tensor, the set of nodes
feature_ids, # list of int, feature id of each dense feature
dimensions # list of int, dimension of each dense feature
) # return 2-D float tf.Tensor [#nodes, dimension]
tf_euler.get_sparse_feature(
nodes, # 1-D int64 tf.Tensor, the set of nodes
feature_ids, # list of int, feature id of each sparse feature
default_values # list of int, default value used when a node has no values for a feature
) # return 2-D int64 tf.SparseTensor [#nodes, max_num_features]
tf_euler.get_binary_feature(
nodes, # 1-D int64 tf.Tensor, the set of nodes
feature_ids # list of int, feature id of each binary feature
) # return 1-D string tf.Tensor [#nodes]
tf_euler.get_edge_dense_feature(
edges, # 2-D int64 tf.Tensor [#edges, 3], the set of edges
feature_ids, # list of int, feature id of each dense feature
dimensions # list of int, dimension of each dense feature
) # return 2-D float tf.Tensor [#edges, dimension]
tf_euler.get_edge_sparse_feature(
edges, # 2-D int64 tf.Tensor [#edges, 3], the set of edges
feature_ids, # list of int, feature id of each sparse feature
default_value # list of int, default value used when an edge has no values for a feature
) # return 2-D int64 tf.SparseTensor [#edges, max_num_features]
tf_euler.get_edge_binary_feature(
edges, # 2-D int64 tf.Tensor [#edges, 3], the set of edges
feature_ids # list of int, feature id of each binary feature
) # return 1-D string tf.Tensor [#edges]