GenericWorkflow

class lsst.ctrl.bps.GenericWorkflow(name, incoming_graph_data=None, **attr)

Bases: networkx.classes.digraph.DiGraph

A generic representation of a workflow used to submit to specific workflow management systems.

Parameters:
name : str

Name of generic workflow.

incoming_graph_data : Any, optional

Data used to initialize the graph; it is passed through to the nx.DiGraph constructor. Can be any type supported by networkx.DiGraph.

attr : dict

Keyword arguments passed through to nx.DiGraph constructor.

Attributes Summary

adj Graph adjacency object holding the neighbors of each node.
degree A DegreeView for the Graph as G.degree or G.degree().
edges An OutEdgeView of the DiGraph as G.edges or G.edges().
in_degree An InDegreeView for (node, in_degree) or in_degree for single node.
in_edges An InEdgeView of the Graph as G.in_edges or G.in_edges().
name Retrieve name of generic workflow.
nodes A NodeView of the Graph as G.nodes or G.nodes().
out_degree An OutDegreeView for (node, out_degree)
out_edges An OutEdgeView of the DiGraph as G.edges or G.edges().
pred Graph adjacency object holding the predecessors of each node.
succ Graph adjacency object holding the successors of each node.

Methods Summary

add_edge(u_of_edge, v_of_edge, **attr) Add edge connecting jobs in workflow.
add_edges_from(ebunch_to_add, **attr) Add several edges between jobs in the generic workflow.
add_job(job[, parent_names, child_names]) Add job to generic workflow.
add_job_inputs(job_name, files) Add files as inputs to specified job.
add_job_outputs(job_name, files) Add output files to a job.
add_job_relationships(parents, children) Add dependencies between parent and child jobs.
add_node(node_for_adding, **attr) Override networkx function to call more specific add_job function.
add_nodes_from(nodes_for_adding, **attr) Add multiple nodes.
add_weighted_edges_from(ebunch_to_add[, weight]) Add weighted edges in ebunch_to_add with specified weight attr
adjacency() Returns an iterator over (node, adjacency dict) tuples for all nodes.
clear() Remove all nodes and edges from the graph.
copy([as_view]) Returns a copy of the graph.
del_job(job_name) Delete job from generic workflow leaving connected graph.
draw(stream[, format_]) Output generic workflow in a visualization format.
edge_subgraph(edges) Returns the subgraph induced by the specified edges.
get_edge_data(u, v[, default]) Returns the attribute dictionary associated with edge (u, v).
get_file(name) Retrieve a file object by name.
get_files([data, transfer_only]) Retrieve files from generic workflow.
get_job(job_name) Retrieve job by name from workflow.
get_job_inputs(job_name[, data, transfer_only]) Return the input files for the given job.
get_job_outputs(job_name[, data, transfer_only]) Return the output files for the given job.
has_edge(u, v) Returns True if the edge (u, v) is in the graph.
has_node(n) Returns True if the graph contains the node n.
has_predecessor(u, v) Returns True if node u has predecessor v.
has_successor(u, v) Returns True if node u has successor v.
is_directed() Returns True if graph is directed, False otherwise.
is_multigraph() Returns True if graph is a multigraph, False otherwise.
load(stream[, format_]) Load a GenericWorkflow from the given stream
nbunch_iter([nbunch]) Returns an iterator over nodes contained in nbunch that are also in the graph.
neighbors(n) Returns an iterator over successor nodes of n.
number_of_edges([u, v]) Returns the number of edges between two nodes.
number_of_nodes() Returns the number of nodes in the graph.
order() Returns the number of nodes in the graph.
predecessors(n) Returns an iterator over predecessor nodes of n.
remove_edge(u, v) Remove the edge between u and v.
remove_edges_from(ebunch) Remove all edges specified in ebunch.
remove_node(n) Remove node n.
remove_nodes_from(nodes) Remove multiple nodes.
reverse([copy]) Returns the reverse of the graph.
save(stream[, format_]) Save the generic workflow in a format that is loadable.
size([weight]) Returns the number of edges or total of all edge weights.
subgraph(nodes) Returns a SubGraph view of the subgraph induced on nodes.
successors(n) Returns an iterator over successor nodes of n.
to_directed([as_view]) Returns a directed representation of the graph.
to_directed_class() Returns the class to use for empty directed copies.
to_undirected([reciprocal, as_view]) Returns an undirected representation of the digraph.
to_undirected_class() Returns the class to use for empty undirected copies.
update([edges, nodes]) Update the graph using nodes/edges/graphs as input.
validate() Run checks to ensure this is still a valid generic workflow graph.

Attributes Documentation

adj

Graph adjacency object holding the neighbors of each node.

This object is a read-only dict-like structure with node keys and neighbor-dict values. The neighbor-dict is keyed by neighbor to the edge-data-dict. So G.adj[3][2]['color'] = 'blue' sets the color of the edge (3, 2) to "blue".

Iterating over G.adj behaves like a dict. Useful idioms include for nbr, datadict in G.adj[n].items():.

The neighbor information is also provided by subscripting the graph. So for nbr, foovalue in G[node].data('foo', default=1): works.

For directed graphs, G.adj holds outgoing (successor) info.
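A minimal sketch of the points above, using a plain networkx.DiGraph as a stand-in for a GenericWorkflow. The node names and the color attribute are illustrative assumptions, not part of the bps API:

```python
import networkx as nx

G = nx.DiGraph()
G.add_edge("a", "b", color="red")
G.add_edge("c", "a")

# For a directed graph, G.adj["a"] lists only outgoing neighbors,
# so the incoming edge from "c" does not appear.
out_neighbors = list(G.adj["a"])

# The views are read-only, but the innermost edge-data dict can be updated.
G.adj["a"]["b"]["color"] = "blue"
edge_color = G.edges["a", "b"]["color"]

print(out_neighbors, edge_color)
```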

degree

A DegreeView for the Graph as G.degree or G.degree().

The node degree is the number of edges adjacent to the node. The weighted node degree is the sum of the edge weights for edges incident to that node.

This object provides an iterator for (node, degree) as well as lookup for the degree for a single node.

Parameters:
nbunch : single node, container, or all nodes (default= all nodes)

The view will only report edges incident to these nodes.

weight : string or None, optional (default=None)

The name of an edge attribute that holds the numerical value used as a weight. If None, then each edge has weight 1. The degree is the sum of the edge weights adjacent to the node.

Returns:
If a single node is requested
deg : int

Degree of the node

OR if multiple nodes are requested
nd_iter : iterator

The iterator returns two-tuples of (node, degree).

See also

in_degree, out_degree

Examples

>>> G = nx.DiGraph()   # or MultiDiGraph
>>> nx.add_path(G, [0, 1, 2, 3])
>>> G.degree(0) # node 0 with degree 1
1
>>> list(G.degree([0, 1, 2]))
[(0, 1), (1, 2), (2, 2)]
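The example above uses unweighted edges. A small sketch of the weight parameter, again on a plain networkx.DiGraph with made-up node names and weights:

```python
import networkx as nx

G = nx.DiGraph()
G.add_edge("a", "b", weight=2)
G.add_edge("b", "c", weight=5)

# Without a weight, every edge counts as 1: one incoming plus one outgoing.
plain = G.degree("b")

# With weight="weight", the edge attribute values are summed instead.
weighted = G.degree("b", weight="weight")
weighted_in = G.in_degree("b", weight="weight")
weighted_out = G.out_degree("b", weight="weight")

print(plain, weighted, weighted_in, weighted_out)
```

For a directed graph the degree of a node is its in-degree plus its out-degree, which is why the weighted degree here equals the sum of the two directional values.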
edges

An OutEdgeView of the DiGraph as G.edges or G.edges().

edges(self, nbunch=None, data=False, default=None)

The OutEdgeView provides set-like operations on the edge-tuples as well as edge attribute lookup. When called, it also provides an EdgeDataView object which allows control of access to edge attributes (but does not provide set-like operations). Hence, G.edges[u, v]['color'] provides the value of the color attribute for edge (u, v) while for (u, v, c) in G.edges.data('color', default='red'): iterates through all the edges yielding the color attribute with default 'red' if no color attribute exists.

Parameters:
nbunch : single node, container, or all nodes (default= all nodes)

The view will only report edges incident to these nodes.

data : string or bool, optional (default=False)

The edge attribute returned in 3-tuple (u, v, ddict[data]). If True, return edge attribute dict in 3-tuple (u, v, ddict). If False, return 2-tuple (u, v).

default : value, optional (default=None)

Value used for edges that don’t have the requested attribute. Only relevant if data is not True or False.

Returns:
edges : OutEdgeView

A view of edge attributes, usually it iterates over (u, v) or (u, v, d) tuples of edges, but can also be used for attribute lookup as edges[u, v]['foo'].

See also

in_edges, out_edges

Notes

Nodes in nbunch that are not in the graph will be (quietly) ignored. For directed graphs this returns the out-edges.

Examples

>>> G = nx.DiGraph()   # or MultiDiGraph, etc
>>> nx.add_path(G, [0, 1, 2])
>>> G.add_edge(2, 3, weight=5)
>>> [e for e in G.edges]
[(0, 1), (1, 2), (2, 3)]
>>> G.edges.data()  # default data is {} (empty dict)
OutEdgeDataView([(0, 1, {}), (1, 2, {}), (2, 3, {'weight': 5})])
>>> G.edges.data('weight', default=1)
OutEdgeDataView([(0, 1, 1), (1, 2, 1), (2, 3, 5)])
>>> G.edges([0, 2])  # only edges incident to these nodes
OutEdgeDataView([(0, 1), (2, 3)])
>>> G.edges(0)  # only edges incident to a single node (use G.adj[0]?)
OutEdgeDataView([(0, 1)])

in_degree

An InDegreeView for (node, in_degree) or in_degree for single node.

The node in_degree is the number of edges pointing to the node. The weighted node degree is the sum of the edge weights for edges incident to that node.

This object provides an iterator over (node, in_degree) as well as lookup for the degree for a single node.

Parameters:
nbunch : single node, container, or all nodes (default= all nodes)

The view will only report edges incident to these nodes.

weight : string or None, optional (default=None)

The name of an edge attribute that holds the numerical value used as a weight. If None, then each edge has weight 1. The degree is the sum of the edge weights adjacent to the node.

Returns:
If a single node is requested
deg : int

In-degree of the node

OR if multiple nodes are requested
nd_iter : iterator

The iterator returns two-tuples of (node, in-degree).

See also

degree, out_degree

Examples

>>> G = nx.DiGraph()
>>> nx.add_path(G, [0, 1, 2, 3])
>>> G.in_degree(0) # node 0 with degree 0
0
>>> list(G.in_degree([0, 1, 2]))
[(0, 0), (1, 1), (2, 1)]

in_edges

An InEdgeView of the Graph as G.in_edges or G.in_edges().

in_edges(self, nbunch=None, data=False, default=None)

Parameters:
nbunch : single node, container, or all nodes (default= all nodes)

The view will only report edges incident to these nodes.

data : string or bool, optional (default=False)

The edge attribute returned in 3-tuple (u, v, ddict[data]). If True, return edge attribute dict in 3-tuple (u, v, ddict). If False, return 2-tuple (u, v).

default : value, optional (default=None)

Value used for edges that don’t have the requested attribute. Only relevant if data is not True or False.

Returns:
in_edges : InEdgeView

A view of edge attributes, usually it iterates over (u, v) or (u, v, d) tuples of edges, but can also be used for attribute lookup as edges[u, v]['foo'].

See also

edges
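This entry has no example of its own. A short sketch of in_edges on a plain networkx.DiGraph (node names and weights are illustrative assumptions):

```python
import networkx as nx

G = nx.DiGraph()
G.add_edge("a", "c", weight=3)
G.add_edge("b", "c")
G.add_edge("c", "d")

# Only edges pointing into "c" are reported; ("c", "d") is an out-edge.
incoming = sorted(G.in_edges("c"))

# data="weight" yields 3-tuples; default fills in edges lacking the attribute.
incoming_weights = sorted(G.in_edges("c", data="weight", default=1))

print(incoming)
print(incoming_weights)
```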

name

Retrieve name of generic workflow.

Returns:
name : str

Name of generic workflow.

nodes

A NodeView of the Graph as G.nodes or G.nodes().

Can be used as G.nodes for data lookup and for set-like operations. Can also be used as G.nodes(data='color', default=None) to return a NodeDataView which reports specific node data but no set operations. It presents a dict-like interface as well with G.nodes.items() iterating over (node, nodedata) 2-tuples and G.nodes[3]['foo'] providing the value of the foo attribute for node 3. In addition, a view G.nodes.data('foo') provides a dict-like interface to the foo attribute of each node. G.nodes.data('foo', default=1) provides a default for nodes that do not have attribute foo.

Parameters:
data : string or bool, optional (default=False)

The node attribute returned in 2-tuple (n, ddict[data]). If True, return entire node attribute dict as (n, ddict). If False, return just the nodes n.

default : value, optional (default=None)

Value used for nodes that don’t have the requested attribute. Only relevant if data is not True or False.

Returns:
NodeView

Allows set-like operations over the nodes as well as node attribute dict lookup and calling to get a NodeDataView. A NodeDataView iterates over (n, data) and has no set operations. A NodeView iterates over n and includes set operations.

When called, if data is False, an iterator over nodes. Otherwise an iterator of 2-tuples (node, attribute value) where the attribute is specified in data. If data is True then the attribute becomes the entire data dictionary.

Notes

If your node data is not needed, it is simpler and equivalent to use the expression for n in G, or list(G).

Examples

There are two simple ways of getting a list of all nodes in the graph:

>>> G = nx.path_graph(3)
>>> list(G.nodes)
[0, 1, 2]
>>> list(G)
[0, 1, 2]

To get the node data along with the nodes:

>>> G.add_node(1, time='5pm')
>>> G.nodes[0]['foo'] = 'bar'
>>> list(G.nodes(data=True))
[(0, {'foo': 'bar'}), (1, {'time': '5pm'}), (2, {})]
>>> list(G.nodes.data())
[(0, {'foo': 'bar'}), (1, {'time': '5pm'}), (2, {})]
>>> list(G.nodes(data='foo'))
[(0, 'bar'), (1, None), (2, None)]
>>> list(G.nodes.data('foo'))
[(0, 'bar'), (1, None), (2, None)]
>>> list(G.nodes(data='time'))
[(0, None), (1, '5pm'), (2, None)]
>>> list(G.nodes.data('time'))
[(0, None), (1, '5pm'), (2, None)]
>>> list(G.nodes(data='time', default='Not Available'))
[(0, 'Not Available'), (1, '5pm'), (2, 'Not Available')]
>>> list(G.nodes.data('time', default='Not Available'))
[(0, 'Not Available'), (1, '5pm'), (2, 'Not Available')]

If some of your nodes have an attribute and the rest are assumed to have a default attribute value you can create a dictionary from node/attribute pairs using the default keyword argument to guarantee the value is never None:

>>> G = nx.Graph()
>>> G.add_node(0)
>>> G.add_node(1, weight=2)
>>> G.add_node(2, weight=3)
>>> dict(G.nodes(data='weight', default=1))
{0: 1, 1: 2, 2: 3}

out_degree

An OutDegreeView for (node, out_degree) or out_degree for single node.

The node out_degree is the number of edges pointing out of the node. The weighted node degree is the sum of the edge weights for edges incident to that node.

This object provides an iterator over (node, out_degree) as well as lookup for the degree for a single node.

Parameters:
nbunch : single node, container, or all nodes (default= all nodes)

The view will only report edges incident to these nodes.

weight : string or None, optional (default=None)

The name of an edge attribute that holds the numerical value used as a weight. If None, then each edge has weight 1. The degree is the sum of the edge weights adjacent to the node.

Returns:
If a single node is requested
deg : int

Out-degree of the node

OR if multiple nodes are requested
nd_iter : iterator

The iterator returns two-tuples of (node, out-degree).

See also

degree, in_degree

Examples

>>> G = nx.DiGraph()
>>> nx.add_path(G, [0, 1, 2, 3])
>>> G.out_degree(0) # node 0 with degree 1
1
>>> list(G.out_degree([0, 1, 2]))
[(0, 1), (1, 1), (2, 1)]

out_edges

An OutEdgeView of the DiGraph as G.edges or G.edges().

edges(self, nbunch=None, data=False, default=None)

The OutEdgeView provides set-like operations on the edge-tuples as well as edge attribute lookup. When called, it also provides an EdgeDataView object which allows control of access to edge attributes (but does not provide set-like operations). Hence, G.edges[u, v]['color'] provides the value of the color attribute for edge (u, v) while for (u, v, c) in G.edges.data('color', default='red'): iterates through all the edges yielding the color attribute with default 'red' if no color attribute exists.

Parameters:
nbunch : single node, container, or all nodes (default= all nodes)

The view will only report edges incident to these nodes.

data : string or bool, optional (default=False)

The edge attribute returned in 3-tuple (u, v, ddict[data]). If True, return edge attribute dict in 3-tuple (u, v, ddict). If False, return 2-tuple (u, v).

default : value, optional (default=None)

Value used for edges that don’t have the requested attribute. Only relevant if data is not True or False.

Returns:
edges : OutEdgeView

A view of edge attributes, usually it iterates over (u, v) or (u, v, d) tuples of edges, but can also be used for attribute lookup as edges[u, v]['foo'].

See also

in_edges, out_edges

Notes

Nodes in nbunch that are not in the graph will be (quietly) ignored. For directed graphs this returns the out-edges.

Examples

>>> G = nx.DiGraph()   # or MultiDiGraph, etc
>>> nx.add_path(G, [0, 1, 2])
>>> G.add_edge(2, 3, weight=5)
>>> [e for e in G.edges]
[(0, 1), (1, 2), (2, 3)]
>>> G.edges.data()  # default data is {} (empty dict)
OutEdgeDataView([(0, 1, {}), (1, 2, {}), (2, 3, {'weight': 5})])
>>> G.edges.data('weight', default=1)
OutEdgeDataView([(0, 1, 1), (1, 2, 1), (2, 3, 5)])
>>> G.edges([0, 2])  # only edges incident to these nodes
OutEdgeDataView([(0, 1), (2, 3)])
>>> G.edges(0)  # only edges incident to a single node (use G.adj[0]?)
OutEdgeDataView([(0, 1)])

pred

Graph adjacency object holding the predecessors of each node.

This object is a read-only dict-like structure with node keys and neighbor-dict values. The neighbor-dict is keyed by neighbor to the edge-data-dict. So G.pred[2][3]['color'] = 'blue' sets the color of the edge (3, 2) to "blue".

Iterating over G.pred behaves like a dict. Useful idioms include for nbr, datadict in G.pred[n].items():. A data-view not provided by dicts also exists: for nbr, foovalue in G.pred[node].data('foo'): A default can be set via a default argument to the data method.

succ

Graph adjacency object holding the successors of each node.

This object is a read-only dict-like structure with node keys and neighbor-dict values. The neighbor-dict is keyed by neighbor to the edge-data-dict. So G.succ[3][2]['color'] = 'blue' sets the color of the edge (3, 2) to "blue".

Iterating over G.succ behaves like a dict. Useful idioms include for nbr, datadict in G.succ[n].items():. A data-view not provided by dicts also exists: for nbr, foovalue in G.succ[node].data('foo'): and a default can be set via a default argument to the data method.

The neighbor information is also provided by subscripting the graph. So for nbr, foovalue in G[node].data('foo', default=1): works.

For directed graphs, G.adj is identical to G.succ.
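The relationship between pred, succ, and adj can be sketched on a plain networkx.DiGraph. The node names and the color attribute below are illustrative assumptions:

```python
import networkx as nx

G = nx.DiGraph()
G.add_edge("parent", "child", color="red")

# succ is keyed by the source node of an edge, pred by the target node.
successors_of_parent = list(G.succ["parent"])
predecessors_of_child = list(G.pred["child"])

# Both views expose the same edge-data dict, so an update made through
# one is visible through the other.
G.pred["child"]["parent"]["color"] = "blue"
color_via_succ = G.succ["parent"]["child"]["color"]

# For a directed graph, adj and succ report the same neighbors.
adj_matches_succ = dict(G.adj["parent"]) == dict(G.succ["parent"])

print(successors_of_parent, predecessors_of_child, color_via_succ)
```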

Methods Documentation

add_edge(u_of_edge: str, v_of_edge: str, **attr)

Add edge connecting jobs in workflow.

Parameters:
u_of_edge : str

Name of parent job.

v_of_edge : str

Name of child job.

attr : keyword arguments, optional

Attributes to save with edge.

add_edges_from(ebunch_to_add, **attr)

Add several edges between jobs in the generic workflow.

Parameters:
ebunch_to_add : Iterable of tuple of str

Iterable of job name pairs between which a dependency should be saved.

attr : keyword arguments, optional

Data can be assigned using keyword arguments (not currently used).

add_job(job, parent_names=None, child_names=None)

Add job to generic workflow.

Parameters:
job : GenericWorkflowJob

Job to add to the generic workflow.

parent_names : list of str, optional

Names of jobs that are parents of given job.

child_names : list of str, optional

Names of jobs that are children of given job.

add_job_inputs(job_name: str, files)

Add files as inputs to specified job.

Parameters:
job_name : str

Name of job to which inputs should be added.

files : GenericWorkflowFile or list

File object(s) to be added as inputs to the specified job.

add_job_outputs(job_name, files)

Add output files to a job.

Parameters:
job_name : str

Name of job to which the files should be added as outputs.

files : list of GenericWorkflowFile

File objects to be added as outputs for specified job.

add_job_relationships(parents, children)

Add dependencies between parent and child jobs. All parents will be connected to all children.

Parameters:
parents : list of str

Parent job names.

children : list of str

Children job names.
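"All parents will be connected to all children" describes a cross product of the two lists. A stdlib-only sketch of the resulting dependency pairs follows; the helper name dependency_pairs and the job names are hypothetical, and this is not the library's implementation:

```python
from itertools import product


def dependency_pairs(parents, children):
    """Return every (parent, child) pair, i.e. the full cross product."""
    return list(product(parents, children))


pairs = dependency_pairs(["init", "setup"], ["proc1", "proc2", "proc3"])
print(len(pairs))  # 2 parents x 3 children = 6 dependencies
```

The number of edges grows as the product of the two list lengths, which is worth keeping in mind when both lists are large.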

add_node(node_for_adding, **attr)

Override networkx function to call more specific add_job function.

Parameters:
node_for_adding : GenericWorkflowJob

Job to be added to generic workflow.

attr :

Needed to match original networkx function, but not used.

add_nodes_from(nodes_for_adding, **attr)

Add multiple nodes.

Parameters:
nodes_for_adding : iterable container

A container of nodes (list, dict, set, etc.), or a container of (node, attribute dict) tuples. Node attributes are updated using the attribute dict.

attr : keyword arguments, optional (default= no attributes)

Update attributes for all nodes in nodes. Node attributes specified in nodes as a tuple take precedence over attributes specified via keyword arguments.

See also

add_node

Examples

>>> G = nx.Graph()   # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.add_nodes_from('Hello')
>>> K3 = nx.Graph([(0, 1), (1, 2), (2, 0)])
>>> G.add_nodes_from(K3)
>>> sorted(G.nodes(), key=str)
[0, 1, 2, 'H', 'e', 'l', 'o']

Use keywords to update specific node attributes for every node.

>>> G.add_nodes_from([1, 2], size=10)
>>> G.add_nodes_from([3, 4], weight=0.4)

Use (node, attrdict) tuples to update attributes for specific nodes.

>>> G.add_nodes_from([(1, dict(size=11)), (2, {'color':'blue'})])
>>> G.nodes[1]['size']
11
>>> H = nx.Graph()
>>> H.add_nodes_from(G.nodes(data=True))
>>> H.nodes[1]['size']
11

add_weighted_edges_from(ebunch_to_add, weight='weight', **attr)

Add weighted edges in ebunch_to_add with the specified weight attribute.

Parameters:
ebunch_to_add : container of edges

Each edge given in the list or container will be added to the graph. The edges must be given as 3-tuples (u, v, w) where w is a number.

weight : string, optional (default= ‘weight’)

The attribute name for the edge weights to be added.

attr : keyword arguments, optional (default= no attributes)

Edge attributes to add/update for all edges.

See also

add_edge
add a single edge
add_edges_from
add multiple edges

Notes

Adding the same edge twice for Graph/DiGraph simply updates the edge data. For MultiGraph/MultiDiGraph, duplicate edges are stored.

Examples

>>> G = nx.Graph()   # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.add_weighted_edges_from([(0, 1, 3.0), (1, 2, 7.5)])

adjacency()

Returns an iterator over (node, adjacency dict) tuples for all nodes.

For directed graphs, only outgoing neighbors/adjacencies are included.

Returns:
adj_iter : iterator

An iterator over (node, adjacency dictionary) for all nodes in the graph.

Examples

>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> [(n, nbrdict) for n, nbrdict in G.adjacency()]
[(0, {1: {}}), (1, {0: {}, 2: {}}), (2, {1: {}, 3: {}}), (3, {2: {}})]

clear()

Remove all nodes and edges from the graph.

This also removes the name, and all graph, node, and edge attributes.

Examples

>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.clear()
>>> list(G.nodes)
[]
>>> list(G.edges)
[]

copy(as_view=False)

Returns a copy of the graph.

The copy method by default returns an independent shallow copy of the graph and attributes. That is, if an attribute is a container, that container is shared by the original and the copy. Use Python’s copy.deepcopy for new containers.

If as_view is True then a view is returned instead of a copy.

Parameters:
as_view : bool, optional (default=False)

If True, the returned graph-view provides a read-only view of the original graph without actually copying any data.

Returns:
G : Graph

A copy of the graph.

See also

to_directed
return a directed copy of the graph.

Notes

All copies reproduce the graph structure, but data attributes may be handled in different ways. There are four types of copies of a graph that people might want, plus read-only views.

Deepcopy – A “deepcopy” copies the graph structure as well as all data attributes and any objects they might contain. The entire graph object is new so that changes in the copy do not affect the original object. (see Python’s copy.deepcopy)

Data Reference (Shallow) – For a shallow copy the graph structure is copied but the edge, node and graph attribute dicts are references to those in the original graph. This saves time and memory but could cause confusion if you change an attribute in one graph and it changes the attribute in the other. NetworkX does not provide this level of shallow copy.

Independent Shallow – This copy creates new independent attribute dicts and then does a shallow copy of the attributes. That is, any attributes that are containers are shared between the new graph and the original. This is exactly what dict.copy() provides. You can obtain this style copy using:

>>> G = nx.path_graph(5)
>>> H = G.copy()
>>> H = G.copy(as_view=False)
>>> H = nx.Graph(G)
>>> H = G.__class__(G)

Fresh Data – For fresh data, the graph structure is copied while new empty data attribute dicts are created. The resulting graph is independent of the original and it has no edge, node or graph attributes. Fresh copies are not enabled. Instead use:

>>> H = G.__class__()
>>> H.add_nodes_from(G)
>>> H.add_edges_from(G.edges)

View – Inspired by dict-views, graph-views act like read-only versions of the original graph, providing a copy of the original structure without requiring any memory for copying the information.

See the Python copy module for more information on shallow and deep copies, https://docs.python.org/2/library/copy.html.
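Because the independent shallow copy behaves like dict.copy(), the sharing of container attributes can be illustrated with plain dictionaries. A stdlib-only sketch (the tags attribute is made up for illustration):

```python
import copy

attrs = {"tags": ["raw"]}

shallow = attrs.copy()        # new dict, but the list inside is shared
deep = copy.deepcopy(attrs)   # new dict and a new list

attrs["tags"].append("calibrated")

print(shallow["tags"])  # the shared list sees the append
print(deep["tags"])     # the deep copy does not
```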

Examples

>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> H = G.copy()

del_job(job_name: str)

Delete job from generic workflow leaving connected graph.

Parameters:
job_name : str

Name of job to delete from workflow.
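One way to read "leaving connected graph" is that the deleted job's parents are wired directly to its children before the node is removed. The sketch below shows that idea on a plain networkx.DiGraph; the helper remove_and_reconnect is hypothetical, and whether del_job is implemented exactly this way is an assumption:

```python
from itertools import product

import networkx as nx


def remove_and_reconnect(graph, node):
    """Remove node, linking each of its parents to each of its children."""
    parents = list(graph.predecessors(node))
    children = list(graph.successors(node))
    graph.add_edges_from(product(parents, children))
    graph.remove_node(node)


G = nx.DiGraph([("a", "b"), ("b", "c"), ("b", "d")])
remove_and_reconnect(G, "b")
remaining_edges = sorted(G.edges)
print(remaining_edges)
```

With this approach the ordering constraints that passed through the removed node are preserved: anything that had to run after "a" via "b" still runs after "a".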

draw(stream, format_='dot')

Output generic workflow in a visualization format.

Parameters:
stream : str or io.BufferedIOBase

Stream to which the visualization should be written.

format_ : str, optional

Which visualization format to use. It defaults to the format for the dot program.

edge_subgraph(edges)

Returns the subgraph induced by the specified edges.

The induced subgraph contains each edge in edges and each node incident to any one of those edges.

Parameters:
edges : iterable

An iterable of edges in this graph.

Returns:
G : Graph

An edge-induced subgraph of this graph with the same edge attributes.

Notes

The graph, edge, and node attributes in the returned subgraph view are references to the corresponding attributes in the original graph. The view is read-only.

To create a full graph version of the subgraph with its own copy of the edge or node attributes, use:

>>> G.edge_subgraph(edges).copy()  

Examples

>>> G = nx.path_graph(5)
>>> H = G.edge_subgraph([(0, 1), (3, 4)])
>>> list(H.nodes)
[0, 1, 3, 4]
>>> list(H.edges)
[(0, 1), (3, 4)]

get_edge_data(u, v, default=None)

Returns the attribute dictionary associated with edge (u, v).

This is identical to G[u][v] except the default is returned instead of an exception if the edge doesn’t exist.

Parameters:
u, v : nodes

default : any Python object (default=None)

Value to return if the edge (u, v) is not found.

Returns:
edge_dict : dictionary

The edge attribute dictionary.

Examples

>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G[0][1]
{}

Warning: Assigning to G[u][v] is not permitted, but it is safe to assign attributes such as G[u][v]['foo'].

>>> G[0][1]['weight'] = 7
>>> G[0][1]['weight']
7
>>> G[1][0]['weight']
7
>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.get_edge_data(0, 1)  # default edge data is {}
{}
>>> e = (0, 1)
>>> G.get_edge_data(*e)  # tuple form
{}
>>> G.get_edge_data('a', 'b', default=0)  # edge not in graph, return 0
0

get_file(name)

Retrieve a file object by name.

Parameters:
name : str

Name of file object

Returns:
file_ : GenericWorkflowFile

File matching given name.

get_files(data=False, transfer_only=True)

Retrieve files from generic workflow. This accessor exists so that callers are unaffected if the way files are stored changes (e.g., if the workflow becomes a bipartite graph with job and file nodes).

Parameters:
data : bool, optional

Whether to return the file data as well as the file object name.

transfer_only : bool, optional

Whether to only return files for which a workflow management system would be responsible for transferring.

Returns:
files : list of GenericWorkflowFile

Files from generic workflow meeting specifications.

get_job(job_name: str)

Retrieve job by name from workflow.

Parameters:
job_name : str

Name of job to retrieve.

Returns:
job : GenericWorkflowJob

Job matching given job_name.

get_job_inputs(job_name, data=True, transfer_only=False)

Return the input files for the given job.

Parameters:
job_name : str

Name of the job.

data : bool, optional

Whether to return the file data as well as the file object name.

transfer_only : bool, optional

Whether to only return files for which a workflow management system would be responsible for transferring.

Returns:
inputs : list of GenericWorkflowFile

Input files for the given job.

get_job_outputs(job_name, data=True, transfer_only=False)

Return the output files for the given job.

Parameters:
job_name : str

Name of the job.

data : bool

Whether to return the file data as well as the file object name. It defaults to True thus returning file data as well.

transfer_only : bool

Whether to only return files for which a workflow management system would be responsible for transferring. It defaults to False thus returning all output files.

Returns:
outputs : list of GenericWorkflowFile

Output files for the given job.

has_edge(u, v)

Returns True if the edge (u, v) is in the graph.

This is the same as v in G[u] without KeyError exceptions.

Parameters:
u, v : nodes

Nodes can be, for example, strings or numbers. Nodes must be hashable (and not None) Python objects.

Returns:
edge_ind : bool

True if edge is in the graph, False otherwise.

Examples

>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.has_edge(0, 1)  # using two nodes
True
>>> e = (0, 1)
>>> G.has_edge(*e)  #  e is a 2-tuple (u, v)
True
>>> e = (0, 1, {'weight':7})
>>> G.has_edge(*e[:2])  # e is a 3-tuple (u, v, data_dictionary)
True

The following expressions are equivalent:

>>> G.has_edge(0, 1)
True
>>> 1 in G[0]  # though this gives KeyError if 0 not in G
True

has_node(n)

Returns True if the graph contains the node n.

Identical to n in G

Parameters:
n : node

Examples

>>> G = nx.path_graph(3)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.has_node(0)
True

It is more readable and simpler to use

>>> 0 in G
True

has_predecessor(u, v)

Returns True if node u has predecessor v.

This is true if graph has the edge u<-v.

has_successor(u, v)

Returns True if node u has successor v.

This is true if graph has the edge u->v.
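The argument order of these two methods is easy to mix up. A short sketch on a plain networkx.DiGraph, with illustrative node names:

```python
import networkx as nx

G = nx.DiGraph([("parent", "child")])

# Edge parent -> child: "child" is a successor of "parent" ...
forward = G.has_successor("parent", "child")
# ... and "parent" is a predecessor of "child".
backward = G.has_predecessor("child", "parent")
# Swapping the arguments asks about the edge child -> parent, which is absent.
swapped = G.has_successor("child", "parent")

print(forward, backward, swapped)
```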

is_directed()

Returns True if graph is directed, False otherwise.

is_multigraph()

Returns True if graph is a multigraph, False otherwise.

classmethod load(stream, format_='pickle')

Load a GenericWorkflow from the given stream.

Parameters:
stream : str or io.BufferedIOBase

Stream to pass to the format-specific loader. Accepts anything that the loader accepts.

format_ : str, optional

Format of data to expect when loading from stream. It defaults to pickle format.

Returns:
generic_workflow : GenericWorkflow

Generic workflow loaded from the given stream
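Since the default format is pickle, a stream-based round trip can be sketched with the standard pickle module and an in-memory binary stream. The dictionary below is only a stand-in for a workflow object, and how save and load use the stream internally is an assumption here:

```python
import io
import pickle

# Stand-in payload; a real call would go through GenericWorkflow.save/load.
workflow_like = {"name": "demo", "jobs": ["init", "proc1"]}

stream = io.BytesIO()
pickle.dump(workflow_like, stream)   # analogous to save(stream, format_="pickle")

stream.seek(0)                       # rewind before reading back
restored = pickle.load(stream)       # analogous to load(stream, format_="pickle")

print(restored == workflow_like)
```

The stream must be opened in binary mode for pickle data, which is why io.BytesIO (or a file opened with "wb"/"rb") is used rather than a text stream.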

nbunch_iter(nbunch=None)

Returns an iterator over nodes contained in nbunch that are also in the graph.

The nodes in nbunch are checked for membership in the graph and if not are silently ignored.

Parameters:
nbunch : single node, container, or all nodes (default= all nodes)

The view will only report edges incident to these nodes.

Returns:
niter : iterator

An iterator over nodes in nbunch that are also in the graph. If nbunch is None, iterate over all nodes in the graph.

Raises:
NetworkXError

If nbunch is not a node or a sequence of nodes, or if a node in nbunch is not hashable.

See also

Graph.__iter__

Notes

When nbunch is an iterator, the returned iterator yields values directly from nbunch, becoming exhausted when nbunch is exhausted.

To test whether nbunch is a single node, one can use “if nbunch in self:”, even after processing with this routine.

If nbunch is not a node or a (possibly empty) sequence/iterator or None, a NetworkXError is raised. Also, if any object in nbunch is not hashable, a NetworkXError is raised.
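A brief sketch of the filtering behavior described above, on a small networkx path graph (the node 99 is deliberately absent from the graph):

```python
import networkx as nx

G = nx.path_graph(3)  # nodes 0, 1, 2

# 99 is not in the graph, so it is dropped without raising.
kept = list(G.nbunch_iter([0, 2, 99]))

# nbunch=None (the default) iterates over every node.
all_nodes = list(G.nbunch_iter())

# A single node is accepted as well.
single = list(G.nbunch_iter(1))

print(kept, all_nodes, single)
```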

neighbors(n)

Returns an iterator over successor nodes of n.

A successor of n is a node m such that there exists a directed edge from n to m.

Parameters:
n : node

A node in the graph

Raises:
NetworkXError

If n is not in the graph.

See also

predecessors

Notes

neighbors() and successors() are the same.
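A small sketch of both points, the equivalence with successors() and the error for an unknown node, on a plain networkx.DiGraph with illustrative node names:

```python
import networkx as nx

G = nx.DiGraph([("a", "b"), ("a", "c"), ("d", "a")])

# The incoming edge ("d", "a") is not reported by neighbors().
via_neighbors = sorted(G.neighbors("a"))
via_successors = sorted(G.successors("a"))

# Asking about a node that is not in the graph raises NetworkXError.
try:
    G.neighbors("missing")
    raised = False
except nx.NetworkXError:
    raised = True

print(via_neighbors, via_successors, raised)
```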

number_of_edges(u=None, v=None)

Returns the number of edges between two nodes.

Parameters:
u, v : nodes, optional (default=all edges)

If u and v are specified, return the number of edges between u and v. Otherwise return the total number of all edges.

Returns:
nedges : int

The number of edges in the graph. If nodes u and v are specified return the number of edges between those nodes. If the graph is directed, this only returns the number of edges from u to v.

See also

size

Examples

For undirected graphs, this method counts the total number of edges in the graph:

>>> G = nx.path_graph(4)
>>> G.number_of_edges()
3

If you specify two nodes, this counts the total number of edges joining the two nodes:

>>> G.number_of_edges(0, 1)
1

For directed graphs, this method can count the total number of directed edges from u to v:

>>> G = nx.DiGraph()
>>> G.add_edge(0, 1)
>>> G.add_edge(1, 0)
>>> G.number_of_edges(0, 1)
1

number_of_nodes()

Returns the number of nodes in the graph.

Returns:
nnodes : int

The number of nodes in the graph.

See also

order, __len__

Examples

>>> G = nx.path_graph(3)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.number_of_nodes()
3

order()

Returns the number of nodes in the graph.

Returns:
nnodes : int

The number of nodes in the graph.

See also

number_of_nodes, __len__

Examples

>>> G = nx.path_graph(3)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.order()
3

predecessors(n)

Returns an iterator over predecessor nodes of n.

A predecessor of n is a node m such that there exists a directed edge from m to n.

Parameters:
n : node

A node in the graph

Raises:
NetworkXError

If n is not in the graph.

See also

successors
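As a hedged illustration on a plain networkx.DiGraph (node names are hypothetical), predecessors can be used to list the direct parents of a node, and a missing node raises NetworkXError:

```python
import networkx as nx

# Two jobs feed into "c", which feeds into "d".
G = nx.DiGraph()
G.add_edges_from([("a", "c"), ("b", "c"), ("c", "d")])

parents_of_c = sorted(G.predecessors("c"))

# Nodes without any predecessor are the entry points of the graph.
roots = sorted(n for n in G if not list(G.predecessors(n)))

# Asking about a node that is not in the graph raises NetworkXError.
try:
    G.predecessors("missing")
    missing_raises = False
except nx.NetworkXError:
    missing_raises = True
print(parents_of_c, roots, missing_raises)
```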

remove_edge(u, v)

Remove the edge between u and v.

Parameters:
u, v : nodes

Remove the edge between nodes u and v.

Raises:
NetworkXError

If there is not an edge between u and v.

See also

remove_edges_from
remove a collection of edges

Examples

>>> G = nx.Graph()   # or DiGraph, etc
>>> nx.add_path(G, [0, 1, 2, 3])
>>> G.remove_edge(0, 1)
>>> e = (1, 2)
>>> G.remove_edge(*e) # unpacks e from an edge tuple
>>> e = (2, 3, {'weight':7}) # an edge with attribute data
>>> G.remove_edge(*e[:2]) # select first part of edge tuple

remove_edges_from(ebunch)

Remove all edges specified in ebunch.

Parameters:
ebunch : list or container of edge tuples

Each edge given in the list or container will be removed from the graph. The edges can be:

  • 2-tuples (u, v) edge between u and v.
  • 3-tuples (u, v, k) where k is ignored.

See also

remove_edge
remove a single edge

Notes

Will fail silently if an edge in ebunch is not in the graph.

Examples

>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> ebunch = [(1, 2), (2, 3)]
>>> G.remove_edges_from(ebunch)

remove_node(n)

Remove node n.

Removes the node n and all adjacent edges. Attempting to remove a non-existent node will raise an exception.

Parameters:
n : node

A node in the graph

Raises:
NetworkXError

If n is not in the graph.

Examples

>>> G = nx.path_graph(3)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> list(G.edges)
[(0, 1), (1, 2)]
>>> G.remove_node(1)
>>> list(G.edges)
[]

remove_nodes_from(nodes)

Remove multiple nodes.

Parameters:
nodes : iterable container

A container of nodes (list, dict, set, etc.). If a node in the container is not in the graph it is silently ignored.

See also

remove_node

Examples

>>> G = nx.path_graph(3)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> e = list(G.nodes)
>>> e
[0, 1, 2]
>>> G.remove_nodes_from(e)
>>> list(G.nodes)
[]

reverse(copy=True)

Returns the reverse of the graph.

The reverse is a graph with the same nodes and edges but with the directions of the edges reversed.

Parameters:
copy : bool, optional (default=True)

If True, return a new DiGraph holding the reversed edges. If False, the reverse graph is created using a view of the original graph.
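The difference between the two modes can be sketched as follows, using a plain networkx.DiGraph as a stand-in (node names are arbitrary). The copy is independent of later changes, while the view tracks them:

```python
import networkx as nx

G = nx.DiGraph([("a", "b"), ("b", "c")])

R = G.reverse()            # independent graph with reversed edges
V = G.reverse(copy=False)  # read-only view onto G with reversed edges

# Modify the original after both reversals were created.
G.add_edge("c", "d")

copy_edges = sorted(R.edges)  # unaffected by the new edge
view_edges = sorted(V.edges)  # reflects the new edge, reversed
print(copy_edges, view_edges)
```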

save(stream, format_='pickle')

Save the generic workflow in a format that is loadable.

Parameters:
stream : str or io.BufferedIOBase

Stream to pass to the format-specific writer. Accepts anything that the writer accepts.

format_ : str, optional

Format in which to write the data. It defaults to pickle format.
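This reference does not show how save and load are implemented. The following is only a stand-in sketch of the round-trip idea, using the standard pickle module directly on a plain networkx.DiGraph with an in-memory stream; it does not call GenericWorkflow.save or GenericWorkflow.load, and the graph and job names are hypothetical:

```python
import io
import pickle

import networkx as nx

# Stand-in graph; a real GenericWorkflow would hold job objects.
G = nx.DiGraph(name="demo_workflow")
G.add_edge("jobA", "jobB")

# Write to a binary stream, then rewind and read it back.
stream = io.BytesIO()
pickle.dump(G, stream)
stream.seek(0)
restored = pickle.load(stream)

print(restored.name, sorted(restored.edges))
```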

size(weight=None)

Returns the number of edges or total of all edge weights.

Parameters:
weight : string or None, optional (default=None)

The edge attribute that holds the numerical value used as a weight. If None, then each edge has weight 1.

Returns:
size : numeric

The number of edges or (if weight keyword is provided) the total weight sum.

If weight is None, returns an int. Otherwise a float (or more general numeric if the weights are more general).

See also

number_of_edges

Examples

>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.size()
3
>>> G = nx.Graph()   # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.add_edge('a', 'b', weight=2)
>>> G.add_edge('b', 'c', weight=4)
>>> G.size()
2
>>> G.size(weight='weight')
6.0

subgraph(nodes)

Returns a SubGraph view of the subgraph induced on nodes.

The induced subgraph of the graph contains the nodes in nodes and the edges between those nodes.

Parameters:
nodes : list, iterable

A container of nodes which will be iterated through once.

Returns:
G : SubGraph View

A subgraph view of the graph. The graph structure cannot be changed but node/edge attributes can and are shared with the original graph.

Notes

The graph, edge and node attributes are shared with the original graph. Changes to the graph structure are ruled out by the view, but changes to attributes are reflected in the original graph.

To create a subgraph with its own copy of the edge/node attributes use: G.subgraph(nodes).copy()

For an inplace reduction of a graph to a subgraph you can remove nodes: G.remove_nodes_from([n for n in G if n not in set(nodes)])

Subgraph views are sometimes NOT what you want. In most cases where you want to do more than simply look at the induced edges, it makes more sense to just create the subgraph as its own graph with code like:

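# Create a subgraph SG based on a (possibly multigraph) G.
# largest_wcc is the collection of nodes to keep (for example, a set).
SG = G.__class__()
SG.add_nodes_from((n, G.nodes[n]) for n in largest_wcc)
if SG.is_multigraph():  # call the method; the bare attribute is always truthy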
    SG.add_edges_from((n, nbr, key, d)
        for n, nbrs in G.adj.items() if n in largest_wcc
        for nbr, keydict in nbrs.items() if nbr in largest_wcc
        for key, d in keydict.items())
else:
    SG.add_edges_from((n, nbr, d)
        for n, nbrs in G.adj.items() if n in largest_wcc
        for nbr, d in nbrs.items() if nbr in largest_wcc)
SG.graph.update(G.graph)
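The view-versus-copy distinction described in these notes can be checked with a short sketch on a plain networkx.DiGraph (the "label" attribute is a made-up name for illustration):

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([(0, 1), (1, 2), (2, 3)])

view = G.subgraph([0, 1, 2])        # shares attribute dicts with G
own = G.subgraph([0, 1, 2]).copy()  # has its own attribute dicts

# An attribute change on the original is visible through the view only.
G.nodes[1]["label"] = "changed"
view_label = view.nodes[1].get("label")
own_label = own.nodes[1].get("label")

# The structure of a view is frozen; edits raise NetworkXError.
try:
    view.add_edge(0, 2)
    frozen = False
except nx.NetworkXError:
    frozen = True
print(view_label, own_label, frozen)
```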

Examples

>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> H = G.subgraph([0, 1, 2])
>>> list(H.edges)
[(0, 1), (1, 2)]

successors(n)

Returns an iterator over successor nodes of n.

A successor of n is a node m such that there exists a directed edge from n to m.

Parameters:
n : node

A node in the graph

Raises:
NetworkXError

If n is not in the graph.

See also

predecessors

Notes

neighbors() and successors() are the same.

to_directed(as_view=False)

Returns a directed representation of the graph.

Returns:
G : DiGraph

A directed graph with the same name, same nodes, and with each edge (u, v, data) replaced by two directed edges (u, v, data) and (v, u, data).

Notes

This returns a “deepcopy” of the edge, node, and graph attributes which attempts to completely copy all of the data and references.

This is in contrast to the similar D=DiGraph(G) which returns a shallow copy of the data.

See the Python copy module for more information on shallow and deep copies, https://docs.python.org/2/library/copy.html.

Warning: If you have subclassed Graph to use dict-like objects in the data structure, those changes do not transfer to the DiGraph created by this method.

Examples

>>> G = nx.Graph()  # or MultiGraph, etc
>>> G.add_edge(0, 1)
>>> H = G.to_directed()
>>> list(H.edges)
[(0, 1), (1, 0)]

If the graph is already directed, a (deep) copy is returned:

>>> G = nx.DiGraph()  # or MultiDiGraph, etc
>>> G.add_edge(0, 1)
>>> H = G.to_directed()
>>> list(H.edges)
[(0, 1)]

to_directed_class()

Returns the class to use for empty directed copies.

If you subclass the base classes, use this to designate what directed class to use for to_directed() copies.

to_undirected(reciprocal=False, as_view=False)

Returns an undirected representation of the digraph.

Parameters:
reciprocal : bool (optional)

If True only keep edges that appear in both directions in the original digraph.

as_view : bool (optional, default=False)

If True return an undirected view of the original directed graph.

Returns:
G : Graph

An undirected graph with the same name and nodes and with edge (u, v, data) if either (u, v, data) or (v, u, data) is in the digraph. If both edges exist in digraph and their edge data is different, only one edge is created with an arbitrary choice of which edge data to use. You must check and correct for this manually if desired.

See also

Graph, copy, add_edge, add_edges_from

Notes

If edges in both directions (u, v) and (v, u) exist in the graph, attributes for the new undirected edge will be a combination of the attributes of the directed edges. The edge data is updated in the (arbitrary) order that the edges are encountered. For more customized control of the edge attributes use add_edge().

This returns a “deepcopy” of the edge, node, and graph attributes which attempts to completely copy all of the data and references.

This is in contrast to the similar G=DiGraph(D) which returns a shallow copy of the data.

See the Python copy module for more information on shallow and deep copies, https://docs.python.org/2/library/copy.html.

Warning: If you have subclassed DiGraph to use dict-like objects in the data structure, those changes do not transfer to the Graph created by this method.
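The effect of the reciprocal flag can be sketched on a plain networkx.DiGraph (node values are arbitrary). Only the pair of nodes connected in both directions survives when reciprocal=True:

```python
import networkx as nx

D = nx.DiGraph()
# 0 <-> 1 is mutual; 1 -> 2 exists in one direction only.
D.add_edges_from([(0, 1), (1, 0), (1, 2)])

U = D.to_undirected()                 # keep an edge if either direction exists
R = D.to_undirected(reciprocal=True)  # keep an edge only if both directions exist

# Normalize each undirected edge to (low, high) for a stable comparison.
all_edges = sorted(tuple(sorted(e)) for e in U.edges)
mutual_edges = sorted(tuple(sorted(e)) for e in R.edges)
print(all_edges, mutual_edges, R.number_of_nodes())
```

Note that nodes are kept in both cases; only the edge set differs.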

Examples

>>> G = nx.path_graph(2)   # or MultiGraph, etc
>>> H = G.to_directed()
>>> list(H.edges)
[(0, 1), (1, 0)]
>>> G2 = H.to_undirected()
>>> list(G2.edges)
[(0, 1)]

to_undirected_class()

Returns the class to use for empty undirected copies.

If you subclass the base classes, use this to designate what undirected class to use for to_undirected() copies.

update(edges=None, nodes=None)

Update the graph using nodes/edges/graphs as input.

Like dict.update, this method takes a graph as input, adding the graph’s nodes and edges to this graph. It can also take two inputs: edges and nodes. Finally it can take either edges or nodes. To specify only nodes the keyword nodes must be used.

The collections of edges and nodes are treated similarly to the add_edges_from/add_nodes_from methods. When iterated, they should yield 2-tuples (u, v) or 3-tuples (u, v, datadict).

Parameters:
edges : Graph object, collection of edges, or None

The first parameter can be a graph or some edges. If it has attributes nodes and edges, then it is taken to be a Graph-like object and those attributes are used as collections of nodes and edges to be added to the graph. If the first parameter does not have those attributes, it is treated as a collection of edges and added to the graph. If the first argument is None, no edges are added.

nodes : collection of nodes, or None

The second parameter is treated as a collection of nodes to be added to the graph unless it is None. If edges is None and nodes is None an exception is raised. If the first parameter is a Graph, then nodes is ignored.

See also

add_edges_from
add multiple edges to a graph
add_nodes_from
add multiple nodes to a graph

Notes

If you want to update the graph using an adjacency structure, it is straightforward to obtain the edges/nodes from the adjacency. The following examples cover common cases; your adjacency may be slightly different and require tweaks to these examples.

>>> G = nx.Graph()
>>> # dict-of-set/list/tuple
>>> adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
>>> e = [(u, v) for u, nbrs in adj.items() for v in nbrs]
>>> G.update(edges=e, nodes=adj)
>>> DG = nx.DiGraph()
>>> # dict-of-dict-of-attribute
>>> adj = {1: {2: 1.3, 3: 0.7}, 2: {1: 1.4}, 3: {1: 0.7}}
>>> e = [(u, v, {'weight': d}) for u, nbrs in adj.items()
...      for v, d in nbrs.items()]
>>> DG.update(edges=e, nodes=adj)
>>> # dict-of-dict-of-dict
>>> adj = {1: {2: {'weight': 1.3}, 3: {'color': 0.7, 'weight': 1.2}}}
>>> e = [(u, v, d) for u, nbrs in adj.items()
...      for v, d in nbrs.items()]
>>> DG.update(edges=e, nodes=adj)
>>> # predecessor adjacency (dict-of-set)
>>> pred = {1: {2, 3}, 2: {3}, 3: {3}}
>>> e = [(v, u) for u, nbrs in pred.items() for v in nbrs]
>>> # MultiDiGraph dict-of-dict-of-dict-of-attribute
>>> MDG = nx.MultiDiGraph()
>>> adj = {1: {2: {0: {'weight': 1.3}, 1: {'weight': 1.2}}},
...        3: {2: {0: {'weight': 0.7}}}}
>>> e = [(u, v, ekey, d) for u, nbrs in adj.items()
...      for v, keydict in nbrs.items()
...      for ekey, d in keydict.items()]
>>> MDG.update(edges=e)

Examples

>>> G = nx.path_graph(5)
>>> G.update(nx.complete_graph(range(4,10)))
>>> from itertools import combinations
>>> edges = ((u, v, {'power': u * v})
...          for u, v in combinations(range(10, 20), 2)
...          if u * v < 225)
>>> nodes = [1000]  # for singleton, use a container
>>> G.update(edges, nodes)

validate()

Run checks to ensure this is still a valid generic workflow graph.