Skip to content

Conversation

@kilianvolmer
Copy link
Contributor

@kilianvolmer kilianvolmer commented Nov 6, 2025

Changes and Information

Please briefly list the changes (main added features, changed items, or corrected bugs) made:

  • Added a lazy_add_edge function that does neither check for uniqueness nor sort the added edges.
  • Added a sort_edges and a make_edges_unique function
  • Added a reserve_edges function to reserve space for edges

If need be, add additional information and what the reviewer should look out for in particular:

Merge Request - Guideline Checklist

Please check our git workflow. Use the draft feature if the Pull Request is not yet ready to review.

Checks by code author

  • Every addressed issue is linked (use the "Closes #ISSUE" keyword below)
  • New code adheres to coding guidelines
  • No large data files have been added (files should in sum not exceed 100 KB, avoid PDFs, Word docs, etc.)
  • Tests are added for new functionality and a local test run was successful (with and without OpenMP)
  • Appropriate documentation within the code (Doxygen) for new functionality has been added in the code
  • Appropriate external documentation (ReadTheDocs) for new functionality has been added to the online documentation
  • Proper attention to licenses, especially no new third-party software with conflicting license has been added
  • (For ABM development) Checked benchmark results and ran and posted a local test above from before and after development to ensure performance is monitored.

Checks by code reviewer(s)

  • Corresponding issue(s) is/are linked and addressed
  • Code is clean of development artifacts (no deactivated or commented code lines, no debugging printouts, etc.)
  • Appropriate unit tests have been added, CI passes, code coverage and performance is acceptable (did not decrease)
  • No large data files added in the whole history of commits(files should in sum not exceed 100 KB, avoid PDFs, Word docs, etc.)
  • On merge, add 2-5 lines with the changes (main added features, changed items, or corrected bugs) to the merge-commit-message. This can be taken from the briefly-list-the-changes above (best case) or the separate commit messages (worst case).

Closes #1410

@kilianvolmer kilianvolmer requested a review from HenrZu November 6, 2025 14:23
@kilianvolmer kilianvolmer changed the title 1410-Accelerate-add_edge-for-large-graphs 1410-Accelerate add_edge for large graphs Nov 6, 2025
@kilianvolmer kilianvolmer changed the title 1410-Accelerate add_edge for large graphs 1410 Accelerate add_edge for large graphs Nov 6, 2025
@codecov
Copy link

codecov bot commented Nov 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.28%. Comparing base (6233316) to head (d14011b).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1411      +/-   ##
==========================================
- Coverage   97.29%   97.28%   -0.01%     
==========================================
  Files         180      180              
  Lines       15646    15663      +17     
==========================================
+ Hits        15223    15238      +15     
- Misses        423      425       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@HenrZu HenrZu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update! Im still not entirely sure about the overall usefulness of the feature.
Do you have some information about the overhead caused by the sorting and checking for dublicates?
if we want to add this feature, we should also extend the documentation of the current add edge function starting in line 169.

* @return Edge<EdgePropertyT>& End of edge vector
*/
template <class... Args>
Edge<EdgePropertyT>& lazy_add_edge(size_t start_node_idx, size_t end_node_idx, Args&&... args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should catch the case, when m_edges is empty

Copy link
Contributor Author

@kilianvolmer kilianvolmer Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Doesn't emplace_back also work on an empty vector?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

He potentially means uninitialized, not empty?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes.

*
* @return Edge<EdgePropertyT>& End of edge vector
*/
Edge<EdgePropertyT>& sort_edges()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why return the reference and not use void?

/**
* @brief Make the edges of a graph unique.
*
* Copies all the unique edges to a new vector and replaces the edge vector of the graph with it. Unique means that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also mention that properties might get lost.
Additionally, if i add two edges with the same start/end combination. Which edges gets deleted? the one that was added latest?

* the start and end node indices are unique. Other edge properties are not checked.
* @return Edge<EdgePropertyT>& End of edge vector
*/
Edge<EdgePropertyT>& make_edges_unique()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it would be clearer, if we use the term "remove dublicates" instead of unique edges.
Also renaming the function to remove_duplicate_edges would add clarity here imo.

* @return Edge<EdgePropertyT>& End of edge vector
*/
Edge<EdgePropertyT>& make_edges_unique()
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe also an option to merge the properties would be nice. However, this would make some restrictions to the property.

{
std::vector<Edge<EdgePropertyT>> unique_edges;
unique_edges.reserve(m_edges.size());
std::ranges::unique_copy(m_edges, std::back_inserter(unique_edges), [](auto&& e1, auto&& e2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do a copy here. Is this really neccessary?

* @param start_node_idx Id of start node
* @param end_node_idx Id of end node
* @param args Additional arguments for edge construction
* @return Edge<EdgePropertyT>& End of edge vector
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

describe the return more in detail. This is to generic.
Currently, its a reference to the latest added edge.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also doxy-comment @tparam ?

* @return Edge<EdgePropertyT>& End of edge vector
*/
template <class... Args>
Edge<EdgePropertyT>& lazy_add_edge(size_t start_node_idx, size_t end_node_idx, Args&&... args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a motivation, why we even need this function? Whats the use case? why not add a bool flag to the existing add_edge function to allow addition without checking or sorting?

also, i would prefer the name add_edge_naive

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe directly unchecked?

EXPECT_THAT(g.out_edges(1), testing::ElementsAreArray(v1));
}

TEST(TestGraph, compare_add_edge_functions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test cases are great so far. However, could you add a few more tests for edge cases (empty graph / calling make_unique... before sorting / testing if the correct properties remain and the expected ones get deleted)

.. dropdown:: :fa:`gears` Working with large graphs

When working with very large graphs, i.e. starting from a few thousand edges, it will be faster to not use the standard ``add_edge`` function. This function always
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, now i see the motivation. Some questions.
Does that really make a noticeable difference? Have you ever quantified it?

Especially in simulations with many edges, we expect a relatively large runtime. Is the overhead caused by checking so much higher that we accept new sources of error (such as forgetting to call the sort function, etc.)?

Copy link
Contributor Author

@kilianvolmer kilianvolmer Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does make a difference of around four orders of magnitude in example simulations. For example, with 1.6 million edges I measure 4.5e+03 seconds with the standard function and 9.6e-01 seconds with the new functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Accelerate add edges for large graphs

4 participants