Speed up simulations #232
Performance testing
Using the following script to test before/after, testing different crystals (hexagonal, cubic and triclinic) and both small and large unit cells (to decrease/increase the number of reflections, which should be where the greatest performance gain is realized):
from diffsims.generators.simulation_generator import SimulationGenerator
from orix.crystal_map import Phase
from orix.quaternion import Rotation
from diffpy.structure import Lattice, Atom, Structure
import numpy as np
import timeit
gold = [Atom("Au", [0, 0, 0])]
hexagonal = Phase("test", 161, structure=Structure(gold,
    Lattice(4, 4, 5, 90, 90, 120)
))
large_hexagonal = Phase("test", 161, structure=Structure(gold,
    Lattice(20, 20, 25, 90, 90, 120)
))
cubic = Phase("test", 221, structure=Structure(gold,
    Lattice(4, 4, 4, 90, 90, 90)
))
large_cubic = Phase("test", 221, structure=Structure(gold,
    Lattice(20, 20, 20, 90, 90, 90)
))
triclinic = Phase("test", 1, structure=Structure(gold,
    Lattice(4, 5, 6, 80, 90, 130)
))
large_triclinic = Phase("test", 1, structure=Structure(gold,
    Lattice(20, 25, 30, 80, 90, 130)
))
gen = SimulationGenerator()
from numpy import random
random.seed(0)
rot = Rotation.random(1000)
kwargs = {
    "rotation": rot,
    "with_direct_beam": False,
    "reciprocal_radius": 5,
}
res = timeit.repeat(
    "gen.calculate_diffraction2d(hexagonal, **kwargs)",
    globals=globals(),
    number=2,
)
print(f"{'hexagonal' :<18}: {np.mean(res) :.2f} ± {np.std(res) :.2f} s")
res = timeit.repeat(
    "gen.calculate_diffraction2d(cubic, **kwargs)",
    globals=globals(),
    number=2,
)
print(f"{'cubic' :<18}: {np.mean(res) :.2f} ± {np.std(res) :.2f} s")
res = timeit.repeat(
    "gen.calculate_diffraction2d(triclinic, **kwargs)",
    globals=globals(),
    number=2,
)
print(f"{'triclinic' :<18}: {np.mean(res) :.2f} ± {np.std(res) :.2f} s")
kwargs["reciprocal_radius"] = 2  # we still get a huge number of reflections
res = timeit.repeat(
    "gen.calculate_diffraction2d(large_hexagonal, **kwargs)",
    globals=globals(),
    number=2,
)
print(f"{'large_hexagonal' :<18}: {np.mean(res) :.2f} ± {np.std(res) :.2f} s")
res = timeit.repeat(
    "gen.calculate_diffraction2d(large_cubic, **kwargs)",
    globals=globals(),
    number=2,
)
print(f"{'large_cubic' :<18}: {np.mean(res) :.2f} ± {np.std(res) :.2f} s")
res = timeit.repeat(
    "gen.calculate_diffraction2d(large_triclinic, **kwargs)",
    globals=globals(),
    number=2,
)
print(f"{'large_triclinic' :<18}: {np.mean(res) :.2f} ± {np.std(res) :.2f} s")
Running on current main branch of diffsims and orix:
And with the full changes in this branch + orix:
So roughly 2-3x speedup. Not sure what I did to get the 5-6x speedup I saw before, probably some mistake.
When adding precession, the speedup is negligible. It might even be slower. Below is the script run with 1 degree precession, run on a different computer than above, so with/without precession times are not comparable.
This branch:
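A note on reading the numbers printed by the script above: `timeit.repeat` returns one total time per repetition, and each total covers `number` executions (2 in the script, with the default of 5 repetitions), so the printed mean is the time for two calls rather than one. A minimal sketch of a per-call variant, using only the standard library, could look like this. The helper name `time_per_call` and the stand-in `work` function are hypothetical and not part of the PR.

```python
import statistics
import timeit


def time_per_call(func, repeat=5, number=2):
    """Return (mean, std) of the per-call runtime of ``func`` in seconds.

    ``timeit.repeat`` yields the total time of ``number`` calls for each
    repetition, so every total is divided by ``number``.
    """
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    # pstdev is the population standard deviation, like np.std's default
    return statistics.mean(per_call), statistics.pstdev(per_call)


def work():
    # Stand-in workload; replace with the call being benchmarked
    return sum(i * i for i in range(10_000))


mean, std = time_per_call(work)
print(f"{'work':<18}: {mean:.6f} ± {std:.6f} s per call")
```

With the objects from the script, the callable could be `lambda: gen.calculate_diffraction2d(hexagonal, **kwargs)`. The before/after ratio is unaffected by this normalization, only the absolute values change.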
@viljarjf I've thought about parallelizing this a couple of times. I profiled the simulation as well but I can't remember what exactly was slow. I think that there are a couple of places we could use numba and be quite a bit faster. I'd be hesitant to add something like dask as that's a bit of a larger dependency. Most of the reason that I haven't gotten around to parallelizing this is that in most cases the crystal structures I've been using have been pretty simple, so the simulation cost is fairly small. I'm all for the idea and can help, especially if there is a good reason to do so.
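The thread does not say which profiler was used. As one option, a short sketch with the standard-library `cProfile` and `pstats` modules can show where the time goes. The helper `profile_top` and the `demo` workload are hypothetical names for illustration.

```python
import cProfile
import io
import pstats


def profile_top(func, n=10):
    """Run ``func`` under cProfile and return a text report of the ``n``
    entries with the largest cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(n)
    return stream.getvalue()


def demo():
    # Stand-in workload; replace with the simulation call
    return sorted(str(i) for i in range(50_000))


report = profile_top(demo)
print(report)
```

To profile the simulation itself, `demo` could be swapped for `lambda: gen.calculate_diffraction2d(hexagonal, **kwargs)` using the objects from the benchmark script.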
Readthedocs build fails since the two orix PRs are not added; I just made a local branch and merged both. All tests pass with that, at least on Python 3.10. I'll update the orix version once it's published as a proper release. There will probably be a merge conflict with #233 too. Other than that, I'm happy to call this finished. Leaving as draft until the mentioned hangups are resolved.
The runtime depends a lot on the number of rotations, the number of atoms in the structure, the resolution/number of reflections in each pattern, etc. From some simple testing, the runtime is spent on rotating vectors, finding intersections with Ewald's sphere, and finding symmetrically unique reflections. I cannot personally think of much beyond parallelization that would reduce the runtime further, at least without a lot of refactoring, but I'll gladly be proved wrong!
@CSSFrancis I refactored the function responsible for calculating excitation error so that it, at least, now uses numba. Parallelization can come later; I think it will need some more refactoring to leverage numba when dealing with ragged data.
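To illustrate the two costs named above (rotating vectors and finding intersections with Ewald's sphere), here is a NumPy sketch of an excitation-error calculation. It is not the code in this branch, which uses numba. The geometry is an assumption: beam along z, a sphere of radius `k` centred at `(0, 0, k)` so that it passes through the origin, and the excitation error taken as the z-distance from a vector to the sphere. The second function shows one way to read the "Rotate Ewald's sphere instead of lattice" commit: since the z-component of a rotated vector `R g` equals the dot product of `g` with the third row of `R`, only the beam axis needs rotating. All function names are hypothetical.

```python
import numpy as np


def excitation_error(vectors, k):
    """z-distance from each vector to a sphere of radius k centred at
    (0, 0, k). Assumes in-plane lengths smaller than k."""
    r2 = vectors[..., 0] ** 2 + vectors[..., 1] ** 2
    z_sphere = k - np.sqrt(k**2 - r2)
    return z_sphere - vectors[..., 2]


def excitation_error_rotated_beam(vectors, rotations, k):
    """Same quantity for every (rotation, vector) pair, without rotating
    the vectors. ``rotations`` has shape (M, 3, 3), ``vectors`` (N, 3)."""
    beam = rotations[:, 2, :]                # R^T z for each rotation, (M, 3)
    z = beam @ vectors.T                     # z-component of R g, (M, N)
    r2 = np.sum(vectors**2, axis=1) - z**2   # rotations preserve |g|
    return k - np.sqrt(k**2 - r2) - z


def rot_x(theta):
    """Rotation matrix about the x axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


rng = np.random.default_rng(0)
vectors = rng.uniform(-2, 2, size=(50, 3))   # arbitrary test vectors
rotations = np.stack([rot_x(t) for t in np.linspace(0, np.pi, 7)])
k = 40.0                                     # arbitrary sphere radius

# Reference: rotate every vector explicitly, then measure
rotated = np.einsum("mij,nj->mni", rotations, vectors)
direct = excitation_error(rotated, k)
fast = excitation_error_rotated_beam(vectors, rotations, k)
print(np.allclose(direct, fast))
```

The second form needs one dot product per (rotation, vector) pair instead of a full 3x3 rotation, and the full rotation then only has to be applied to the vectors that actually intersect the sphere.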
commit 6c9a31c81fa528000a28fb315338a6df40c5a3d2
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Tue Jun 10 11:17:00 2025 +0200
Use new phase.expand_asymmetric_unit
commit 496138e
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:21:10 2025 +0200
Formatting
commit cde8647
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:21:10 2025 +0200
Use numba for speedup
commit cba08e1
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:20:57 2025 +0200
Simplify `DiffractingVector.__getitem__`
commit fce8e2b
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:20:57 2025 +0200
Use Phase.expand_asymmetric_unit
commit 9c06e7d
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:20:39 2025 +0200
Fix test by using symmetrically unique reflections (4 degenerate were missing)
commit 98dc783
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:20:39 2025 +0200
Ensure compatible ordering of unique vectors
commit 54b6f07
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:20:39 2025 +0200
Getting closer, still some tests left
commit be19d4f
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:20:39 2025 +0200
Adress pyxem#234
commit ccdb65a
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:20:16 2025 +0200
Return self.__class__ instead of explicit ReciprocalLatticeVector
commit 44825d5
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:20:16 2025 +0200
Use rotate_with_basis
commit dd5b74e
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:19:44 2025 +0200
Fix bug where direct beam is added twice
commit 38afe7b
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:19:44 2025 +0200
Fix formatting
commit a00445d
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:19:11 2025 +0200
Support direct beam
commit b759b5d
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:19:11 2025 +0200
Working precession, need to test for correctness
commit c7d006a
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:19:10 2025 +0200
Support precession
commit 5c59e4d
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:18:34 2025 +0200
Rotate Ewald's sphere instead of lattice
commit 7aa207e
Author: Viljar Femoen <viljar.femoen@hotmail.no>
Date: Thu Jun 5 15:18:34 2025 +0200
Subclass Phase for faster initialization
@viljarjf ping me when you need a review here. It would be good to get this in to help with the integration into instamatic.
Description of the change
The scope of this has expanded a little. It now fixes a bug where the direct beam was added twice and implements #234, as well as speeding up computations.
This is done in three main ways:
Some other optimizations:
(this can increase runtime when requesting only a few rotations, but massively speed up calculations if the structure has many atoms)
(I haven't actually checked if calculating structure factors is more expensive than determining unique reflections...)
Profiling now indicates that roughly equal runtime is spent on finding intersections, rotating the intersected vectors, and everything else.
"Everything else" consists mostly of DiffractingVector.__getitem__, calculating all the intensities, and finding all reflections within the given d_min.
Progress of the PR
Object3d.unique orix#545
For reviewers
__init__.py.
unreleased section in CHANGELOG.rst.
credits in diffsims/release_info.py and in .zenodo.json.