Opened 22 months ago
Closed 14 months ago
#28904 closed enhancement (fixed)
Move reversed graph from backend to CGraph for sparse graphs
Reported by:  ghkliem  Owned by:  

Priority:  major  Milestone:  sage9.2 
Component:  graph theory  Keywords:  sparse graphs, backend 
Cc:  vdelecroix, dcoudert  Merged in:  
Authors:  Jonathan Kliem  Reviewers:  David Coudert 
Report Upstream:  N/A  Work issues:  
Branch:  a46e717 (Commits, GitHub, GitLab)  Commit:  a46e717377c1cd020b5bda37922cf8ff42e914a6 
Dependencies:  Stopgaps: 
Description (last modified by )
Currently, for sparse graphs the backend keeps a copy of the reversed graph. This makes iterating over ingoing neighbors much faster.
However, SparseGraph
itself does not have access to this. So e.g. when deleting vertices, this results in a massive slow down as it needs to delete incoming edges.
We initialize the sparse backend with the reversed structure and instead add the reversed structure directly to SparseGraph
.
Before this ticket:
sage: edges = [(randint(0,1000), randint(0,1000)) for _ in range(100000)] sage: save(edges, '/tmp/edges') sage: D = DiGraph(multiedges=True, loops=True) sage: %time D.add_edges(edges) CPU times: user 80.1 ms, sys: 0 ns, total: 80.1 ms Wall time: 79.6 ms sage: %time for i in range(100): D.delete_vertex(i) CPU times: user 11.7 ms, sys: 0 ns, total: 11.7 ms Wall time: 11.5 ms
With this ticket:
sage: edges = load('/tmp/edges') sage: D = DiGraph(multiedges=True, loops=True) sage: %time D.add_edges(edges) CPU times: user 69 ms, sys: 0 ns, total: 69 ms Wall time: 68.6 ms sage: %time for i in range(100): D.delete_vertex(i) CPU times: user 4.42 ms, sys: 16 µs, total: 4.43 ms Wall time: 4.44 ms
We fix an issue in subdivide_edges
that would have caused a doctest to fail with the new setup:
There is a doctest calling the method subdivide_edges
with self.edges()
. As one can see from the documenation from EdgesView
, one should not modify the graph while iterating through EdgesView
, because it updates itself. So the following test in generic_graph.py
is wrong usage of subdivide_edges
or edges
:
sage: g.subdivide_edges(g.edges(), 1)
However, as it seems very natural to do this, we pass a tuple instead of EdgesView
to the backend.
Change History (30)
comment:1 Changed 22 months ago by
comment:2 Changed 22 months ago by
 Branch set to public/28904
 Commit set to 47a3897ead68a87329f38e729508462af5472392
 Status changed from new to needs_review
This probably needs to be cleaned up a lot. Is it going in a good direction? On my side, all tests in graphs seem to succeed.
New commits:
47a3897  moved duplicate of sparse graph into the cgraph structure

comment:3 Changed 22 months ago by
 Cc dcoudert added
comment:4 Changed 22 months ago by
 Description modified (diff)
One needs to be careful, about taking the same set of edges I guess.
comment:5 Changed 22 months ago by
 Commit changed from 47a3897ead68a87329f38e729508462af5472392 to 7d70aafae5b56c1d440256bf4a1ed63af43291a0
Branch pushed to git repo; I updated commit sha1. New commits:
7d70aaf  make class final

comment:6 followup: ↓ 10 Changed 22 months ago by
Thank you for doing this, it's really needed.
With these changes, _cg_rev
is no longer initialized, right ? If it's no longer needed, may be we can remove it, but we have to ensure that noone is using it. Actually I fear that some users are using it for personal code. I see 2 solutions for that:
 send a pool to sagedev for complete removal
 maintain an object build from
_cg
with appropriate pointers, and add a warning in the documentation.
## in sparse_graph.pyx
:
 for i from 0 <= i < self.active_vertices.size * self.hash_length: + for i in range(self.active_vertices.size * self.hash_length):
I think you can simplify tests to NULL
 while temp[0] != NULL: + while temp[0]:
 Lists the inneighbors of a vertex as BTNodes + List the inneighbors of a vertex as BTNodes
 Builds the list of arcs into a vertex. + Build the list of arcs into a vertex.
 Adds arc (u, v) to the graph with label l. + Add arc (u, v) to the graph with label l.
Better to use True
and False
instead of 0 and 1.
 cdef int u_int = self.check_labelled_vertex(u, 0)  cdef int v_int = self.check_labelled_vertex(v, 0) + cdef int u_int = self.check_labelled_vertex(u, False) + cdef int v_int = self.check_labelled_vertex(v, False)
## in c_graph.pyx
we can do some extra optimisations, like
in verts
 cdef int i  return [i for i in range(<int>self.active_vertices.size)  if bitset_in(self.active_vertices, i)] + return bitset_list(self.active_vertices)
in iterator_verts
 for i in range(self._cg.active_vertices.size):  if (bitset_in(self._cg.active_vertices, i)  and i not in self.vertex_labels  and i not in self.vertex_ints):  yield i + i = bitset_first(self._cg.active_vertices) + while i >= 0: + if (i not in self.vertex_labels + and i not in self.vertex_ints): + yield i + i = bitset_next(self._cg.active_vertices, i + 1)
comment:7 Changed 22 months ago by
 Milestone changed from sage9.0 to sage9.1
Ticket retargeted after milestone closed
comment:8 Changed 22 months ago by
 Branch changed from public/28904 to public/28904reb
 Commit changed from 7d70aafae5b56c1d440256bf4a1ed63af43291a0 to 64c949321b61352f991bca2aefdfb59732898d55
comment:9 Changed 22 months ago by
 Commit changed from 64c949321b61352f991bca2aefdfb59732898d55 to 81e99e641b7ea36d16dcdd30dd6db9e01bb35d1e
Branch pushed to git repo; I updated commit sha1. New commits:
81e99e6  lots of small improvements

comment:10 in reply to: ↑ 6 Changed 22 months ago by
Thanks for the comments.
Replying to dcoudert:
in
iterator_verts
 for i in range(self._cg.active_vertices.size):  if (bitset_in(self._cg.active_vertices, i)  and i not in self.vertex_labels  and i not in self.vertex_ints):  yield i + i = bitset_first(self._cg.active_vertices) + while i >= 0: + if (i not in self.vertex_labels + and i not in self.vertex_ints): + yield i + i = bitset_next(self._cg.active_vertices, i + 1)
Fixing this makes a mistake apparent: There is a doctest in generic_graph.py
that subdivides all edges of a graph by using self.edges()
. However, one should not iterate over an EdgesView
object while modifying the graph. I have modified the method subdivide_edges
to generate a tuple from EdgesView
.
comment:11 Changed 22 months ago by
 Commit changed from 81e99e641b7ea36d16dcdd30dd6db9e01bb35d1e to f4423f08ba37e8d5478f1564f574b8bb64a212ba
Branch pushed to git repo; I updated commit sha1. New commits:
f4423f0  small optimizations

comment:12 Changed 22 months ago by
 I have some compilation warnings. It's better to check if it's possible to avoid.
[2/4] gcc Wnounusedresult Wsigncompare Wunreachablecode DNDEBUG g fwrapv O3 Wall Wnounused I./sage/cpython I./sage/libs/ntl I/Users/dcoudert/sage3/sage/local/lib/python3.7/sitepackages/cysignals Isage/cpython I./sage/rings I/Users/dcoudert/sage3/sage/local/include I/Users/dcoudert/sage3/sage/src I/Users/dcoudert/sage3/sage/src/sage/ext I/Users/dcoudert/sage3/sage/local/include/python3.7m I/Users/dcoudert/sage3/sage/local/lib/python3.7/sitepackages/numpy/core/include Ibuild/cythonized I/Users/dcoudert/sage3/sage/local/include/python3.7m c build/cythonized/sage/graphs/base/c_graph.cpp o build/temp.macosx10.9x86_643.7/build/cythonized/sage/graphs/base/c_graph.o fnostrictaliasing DCYTHON_CLINE_IN_TRACEBACK=1 std=c++11 build/cythonized/sage/graphs/base/c_graph.cpp:16256:48: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'long' [Wsigncompare] __pyx_t_2 = ((__pyx_cur_scope>__pyx_v_i != 1L) != 0); ~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~ 1 warning generated. g++ bundle undefined dynamic_lookup L/Users/dcoudert/sage3/sage/local/lib Wl,rpath,/Users/dcoudert/sage3/sage/local/lib L/usr/local/opt/openssl/lib L. L/Users/dcoudert/sage3/sage/local/lib Wl,rpath,/Users/dcoudert/sage3/sage/local/lib L/usr/local/opt/openssl/lib L/Users/dcoudert/sage3/sage/local/lib Wl,rpath,/Users/dcoudert/sage3/sage/local/lib build/temp.macosx10.9x86_643.7/build/cythonized/sage/graphs/base/c_graph.o L/Users/dcoudert/sage3/sage/local/lib L/Users/dcoudert/sage3/sage/local/lib lgmp lstdc++ o build/lib.macosx10.9x86_643.7/sage/graphs/base/c_graph.cpython37mdarwin.so lpari [3/4] gcc Wnounusedresult Wsigncompare Wunreachablecode DNDEBUG g fwrapv O3 Wall Wnounused I./sage/cpython Isage/cpython I/Users/dcoudert/sage3/sage/local/lib/python3.7/sitepackages/cysignals I/Users/dcoudert/sage3/sage/local/include I/Users/dcoudert/sage3/sage/src I/Users/dcoudert/sage3/sage/src/sage/ext I/Users/dcoudert/sage3/sage/local/include/python3.7m I/Users/dcoudert/sage3/sage/local/lib/python3.7/sitepackages/numpy/core/include Ibuild/cythonized I/Users/dcoudert/sage3/sage/local/include/python3.7m c build/cythonized/sage/graphs/base/sparse_graph.c o build/temp.macosx10.9x86_643.7/build/cythonized/sage/graphs/base/sparse_graph.o fnostrictaliasing DCYTHON_CLINE_IN_TRACEBACK=1 std=c99 build/cythonized/sage/graphs/base/sparse_graph.c:11350:35: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [Wsigncompare] for (__pyx_t_5 = 0; __pyx_t_5 < __pyx_t_4; __pyx_t_5+=1) { ~~~~~~~~~ ^ ~~~~~~~~~ build/cythonized/sage/graphs/base/sparse_graph.c:11384:35: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [Wsigncompare] for (__pyx_t_5 = 0; __pyx_t_5 < __pyx_t_4; __pyx_t_5+=1) { ~~~~~~~~~ ^ ~~~~~~~~~ build/cythonized/sage/graphs/base/sparse_graph.c:19114:87: warning: incompatible pointer types passing 'struct __pyx_obj_4sage_6graphs_4base_7c_graph_CGraph *' to parameter of type 'struct __pyx_obj_4sage_6graphs_4base_12sparse_graph_SparseGraph *' [Wincompatiblepointertypes] ...((struct __pyx_obj_4sage_6graphs_4base_7c_graph_CGraph *)((struct __pyx_obj_4sage_6graphs_4base_12sparse_graph_SparseGraph *)__pyx_v_self>__pyx_base._cg)), __pyx_v_u_int... ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ build/cythonized/sage/graphs/base/sparse_graph.c:9396:149: note: passing argument to parameter '__pyx_v_self' here static int __pyx_f_4sage_6graphs_4base_12sparse_graph_11SparseGraph_has_arc_unsafe(struct __pyx_obj_4sage_6graphs_4base_12sparse_graph_SparseGraph *__pyx_v_self, int __pyx... ^ build/cythonized/sage/graphs/base/sparse_graph.c:25037:87: warning: incompatible pointer types passing 'struct __pyx_obj_4sage_6graphs_4base_7c_graph_CGraph *' to parameter of type 'struct __pyx_obj_4sage_6graphs_4base_12sparse_graph_SparseGraph *' [Wincompatiblepointertypes] ...((struct __pyx_obj_4sage_6graphs_4base_7c_graph_CGraph *)((struct __pyx_obj_4sage_6graphs_4base_12sparse_graph_SparseGraph *)__pyx_v_self>__pyx_base._cg)), __pyx_v_u_int... ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ build/cythonized/sage/graphs/base/sparse_graph.c:9396:149: note: passing argument to parameter '__pyx_v_self' here static int __pyx_f_4sage_6graphs_4base_12sparse_graph_11SparseGraph_has_arc_unsafe(struct __pyx_obj_4sage_6graphs_4base_12sparse_graph_SparseGraph *__pyx_v_self, int __pyx... ^ 4 warnings generated.
 method
del_arc_unsafe
returns nothing
 in method
del_arc_label_unsafe
, can't we do this (not sure if it's better or not): cdef int error = self._del_arc_label_unsafe(u, v, l, self.vertices)  if error:  return 1 # indicate an error + if self._del_arc_label_unsafe(u, v, l, self.vertices): + return 1 # indicate an error
 Thanks for correcting
subdivise_edges
.
I confirm that vertex deletion is much faster now.
comment:13 Changed 22 months ago by
There are tons of compilation warnings, as we include all of bitset.pxi
, which is annoying, but of course not serious.
The compilation warning about wrong casting of self._cg
is a bit tricky. I think I took care of it now. The problem is that SparseGraphBackend
casts the SparseGraph
to a CGraph
in order to assign it to self._cg
, which is declared to be a CGraph
. This is later on casted back into a SpareGraph
, which of course produces a warning.
My approach now: Remove the attribute _cg
from CGraphBackend
and add it to the inherited backends with the proper declaration. Add an inline method cg
, which returns _cg
casted to a CGraph
. This is cleaner, I think.
comment:14 Changed 22 months ago by
 Commit changed from f4423f08ba37e8d5478f1564f574b8bb64a212ba to 4af81b4339f574584e3f86a49cb597cae9002741
comment:15 Changed 22 months ago by
I'm happy with the current code, but it would be better to get extra opinion (e.g., from the thread https://groups.google.com/forum/#!topic/sagedevel/2eIaD3r8Vl8).
comment:16 Changed 22 months ago by
There remains the question, whether we should remove cg_rev
altogether and whether we should do it now.
The advantage of removing it, is that code referring to it won't compile anymore and/or give a proper warning. Currently, if one assumes this to be initialized and uses it in cython, sage might crash hard (segmentation fault I believe).
comment:17 Changed 22 months ago by
 Commit changed from 4af81b4339f574584e3f86a49cb597cae9002741 to bfd25a908e6e13da8280b98794424669c291e1b5
Branch pushed to git repo; I updated commit sha1. New commits:
bfd25a9  removed attribute _cg_rev from CGraphBackend

comment:18 Changed 22 months ago by
I removed the attribute _cg_rev
now.
The method check_labelled_vertex
just ignores the input reverse
now.
The method c_graph
return (self._cg, None)
instead of (self._cg, self._cg_rev)
.
comment:19 Changed 20 months ago by
Why
diff git a/src/sage/graphs/generic_graph.py b/src/sage/graphs/generic_graph.py index 139c8df..4fa7f75 100644  a/src/sage/graphs/generic_graph.py +++ b/src/sage/graphs/generic_graph.py @@ 11001,6 +11001,8 @@ class GenericGraph(GenericGraph_pyx):  :meth:`subdivide_edge`  subdivides one edge """ + if isinstance(edges, EdgesView): + edges = tuple(edges)
comment:20 Changed 20 months ago by
This is independent from the purpose of this ticket, but this is effectively an issue to be fixed.
comment:21 Changed 20 months ago by
 Description modified (diff)
I updated the description of the ticket to account for the change.
comment:22 Changed 19 months ago by
 Milestone changed from sage9.1 to sage9.2
Batch modifying tickets that will likely not be ready for 9.1, based on a review of the ticket title, branch/review status, and last modification date.
comment:23 Changed 15 months ago by
What's the status of this ticket?
comment:24 Changed 15 months ago by
Needs review.
It's annoying to check, I guess. But it would be a nice improvements with many applications. Maybe I should request a review on sagedevel.
comment:25 followup: ↓ 27 Changed 15 months ago by
overall, it's a very nice improvement. I have only a few suggestions.
sparse_graph.pyx
 instead of checking
self.vertices != self.vertices_rev
, we could store a boolean valueself._directed
and have a methodself.is_directed())
.
 instead of
<size_t>1
, don't we have a constant equal to the max value ? This is something we should do in several places in the code. Could be done in dedicated tickets
 in
out_neighbors_unsafe
andin_neighbors_unsafe
, removecdef list l = []
, not used
 I know that the documentation of cdef methods is not displayed, but it would be cleaner to follow the same standard than in the other parts of the code. For instance
Of course, we can do that in a future ticket. Not a priority here
INPUT:  u, v  nonnegative integers + +  ``u, v``  nonnegative integers
 methods
out_arcs_unsafe
andin_arcs_unsafe
return lists. Why not changing to an iterator ? If I'm not mistaken, these methods are only used in for loops.
 same for method
all_arcs
which is used in a for loop and in some doctests where we could uselist(...)
.confetti:sage dcoudert$ git grep i "\.all_arcs" src/sage/ src/sage/graphs/base/c_graph.pyx:  :meth:`all_arcs <sage.graphs.base.sparse_graph.SparseGraph.all_arcs>` src/sage/graphs/base/c_graph.pyx: sage: G.all_arcs(0, 1) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(0,1) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(1,2) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(7,3) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(0,1) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(1,2) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(5,4) src/sage/graphs/base/sparse_graph.pyx: sage: G.all_arcs(1,2) src/sage/graphs/base/sparse_graph.pyx: num_arcs = self.all_arcs_unsafe(u, v, arc_labels, size) src/sage/graphs/base/sparse_graph.pyx: sage: G.all_arcs(0,1) src/sage/graphs/base/sparse_graph.pyx: sage: G.all_arcs(0,1) src/sage/graphs/base/sparse_graph.pyx: for l_int in self._cg.all_arcs(u_int, v_int)] src/sage/graphs/base/sparse_graph.pyx: for l_int in self._cg.all_arcs(u_int, v_int):
comment:26 Changed 15 months ago by
 Branch changed from public/28904reb to public/28904reb2
 Commit changed from bfd25a908e6e13da8280b98794424669c291e1b5 to a46e717377c1cd020b5bda37922cf8ff42e914a6
comment:27 in reply to: ↑ 25 Changed 15 months ago by
Replying to dcoudert:
overall, it's a very nice improvement. I have only a few suggestions.
sparse_graph.pyx
 instead of checking
self.vertices != self.vertices_rev
, we could store a boolean valueself._directed
and have a methodself.is_directed())
.
Done.
 instead of
<size_t>1
, don't we have a constant equal to the max value ? This is something we should do in several places in the code. Could be done in dedicated tickets
This is being cythonized into ((size_t)1L)
and I hope the compiler takes care of it from there. It's maybe not so nice codewise, but I don't think it is any loss.
 in
out_neighbors_unsafe
andin_neighbors_unsafe
, removecdef list l = []
, not used
Done.
 I know that the documentation of cdef methods is not displayed, but it would be cleaner to follow the same standard than in the other parts of the code. For instance
Of course, we can do that in a future ticket. Not a priority hereINPUT:  u, v  nonnegative integers + +  ``u, v``  nonnegative integers
Done. It's a rather large commit though as I went over all the places that "caught my eye".
 methods
out_arcs_unsafe
andin_arcs_unsafe
return lists. Why not changing to an iterator ? If I'm not mistaken, these methods are only used in for loops.
 same for method
all_arcs
which is used in a for loop and in some doctests where we could uselist(...)
.confetti:sage dcoudert$ git grep i "\.all_arcs" src/sage/ src/sage/graphs/base/c_graph.pyx:  :meth:`all_arcs <sage.graphs.base.sparse_graph.SparseGraph.all_arcs>` src/sage/graphs/base/c_graph.pyx: sage: G.all_arcs(0, 1) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(0,1) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(1,2) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(7,3) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(0,1) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(1,2) src/sage/graphs/base/sparse_graph.pyx: sage: S.all_arcs(5,4) src/sage/graphs/base/sparse_graph.pyx: sage: G.all_arcs(1,2) src/sage/graphs/base/sparse_graph.pyx: num_arcs = self.all_arcs_unsafe(u, v, arc_labels, size) src/sage/graphs/base/sparse_graph.pyx: sage: G.all_arcs(0,1) src/sage/graphs/base/sparse_graph.pyx: sage: G.all_arcs(0,1) src/sage/graphs/base/sparse_graph.pyx: for l_int in self._cg.all_arcs(u_int, v_int)] src/sage/graphs/base/sparse_graph.pyx: for l_int in self._cg.all_arcs(u_int, v_int):
The last two I would leave for future tickets. But you are right. We really iterate over a structure that is already there (either over a bitset or a tree). So there is no use in wasting resources for creating this list.
comment:28 Changed 15 months ago by
 Reviewers set to David Coudert
 Status changed from needs_review to positive_review
OK to let the change from list to iterator for another ticket.
LGTM.
Thank you.
comment:29 Changed 15 months ago by
Thanks.
comment:30 Changed 14 months ago by
 Branch changed from public/28904reb2 to a46e717377c1cd020b5bda37922cf8ff42e914a6
 Resolution set to fixed
 Status changed from positive_review to closed
TBtw, there is a bug that leads to a crash with:
This ticket also fixes this (the method
in_degree
in the CGraph backend assumes_cg_rev
to be initialized).