@ethche ethche commented Nov 11, 2025

Includes the UCB Pattern Search autotuner, which modifies PatternSearch to search through configs using a Gaussian Process and the Upper Confidence Bound (UCB) acquisition function.

  • As in PatternSearch, we generate neighbors from search copies, but we sample random neighbors instead of enumerating the exhaustive set.
  • We then use a fitted Gaussian Process and the Upper Confidence Bound acquisition function to select a fraction of those neighbors to evaluate. This encourages exploration.
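
As a rough illustration of the filtering step described in the bullets above, here is a minimal pure-Python sketch. It is not the code in this PR: the RBF kernel, the `beta` weight, the function names, and the choice to apply UCB to negated, mean-centered latency (so that lower latency scores higher) are all assumptions made for the example. Only the idea of keeping a `frac_selected` fraction of candidates comes from the diff.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two encoded configs."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq / length_scale**2)


def cholesky(m):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_lower(low, b):
    """Solve low @ x = b by forward substitution."""
    x = []
    for i, row in enumerate(low):
        x.append((b[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x


def solve_upper_t(low, b):
    """Solve low.T @ x = b by back substitution."""
    n = len(low)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / low[i][i]
    return x


def gp_posterior(train_x, train_y, query, noise=1e-6):
    """Posterior mean and std of a zero-mean GP at a single query point."""
    k = [
        [rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(train_x)]
        for i, a in enumerate(train_x)
    ]
    low = cholesky(k)
    alpha = solve_upper_t(low, solve_lower(low, train_y))
    k_star = [rbf(a, query) for a in train_x]
    mean = sum(ks * al for ks, al in zip(k_star, alpha))
    v = solve_lower(low, k_star)
    # Clamp at zero to guard against tiny negative values from rounding.
    var = max(rbf(query, query) - sum(vi * vi for vi in v), 0.0)
    return mean, math.sqrt(var)


def select_by_ucb(train_x, train_y, candidates, frac_selected=0.5, beta=2.0):
    """Keep the top fraction of candidates ranked by UCB on negated latency."""
    # Assumption: center and negate latencies so that "higher is better"
    # and the zero-mean GP prior is reasonable.
    mean_y = sum(train_y) / len(train_y)
    reward = [-(y - mean_y) for y in train_y]
    scored = []
    for cand in candidates:
        mu, sigma = gp_posterior(train_x, reward, cand)
        scored.append((mu + beta * sigma, cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    keep = max(1, int(len(candidates) * frac_selected))
    return [cand for _, cand in scored[:keep]]
```

With `beta=0` the ranking is purely by predicted latency. A larger `beta` favors candidates far from anything measured so far, which is the exploration effect mentioned above.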

This improves on PatternSearch in both kernel latency and autotuning wall-clock time on B200 for a set of benchmark kernels. DifferentialEvolution can improve on this further but takes substantially longer. DESurrogate, in #1096, has comparable performance.

Kernel latency:
[Figure: geomean_latency_ratio_vs_pattern]

Autotuning wall-clock speedup:
[Figure: geomean_wallclock_speedup_vs_pattern]

Autotuning convergence time:
[Figure: geomean_convergence_speedup_vs_pattern]

@meta-cla meta-cla bot added the CLA Signed label Nov 11, 2025
@ethche ethche requested a review from jansel November 14, 2025 01:03

def encode_dim(self) -> int:
    """
    Returns the dimension of the output of encode
Contributor

I don't understand what this is? What is it encoding?

Author

This is just the dimension of the encoding

Author

Changed the name to be more clear

    """
    raise NotImplementedError

def encode(self, value: object) -> list[float]:
Contributor

Can we reuse encode_scalar here? If not, I'd like to combine this with encode_scalar since they are solving the same problem.

Author

@jansel This is pretty much identical to encode_scalar for integer and poweroftwo. But the previous encode_scalar did not have any functionality for ListOf or PermutationFragments. How should we handle those two?

Author

I'm happy to rename everything to encode_scalar, but given ListOf and PermutationFragments, I think we should allow encode to output a list of floats
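
To make the trade-off in this thread concrete, here is a small sketch of why a list-valued `encode` is convenient once list and permutation fragments are involved. The `Toy*` classes are hypothetical stand-ins, not the fragment classes in this repository, and the permutation encoding (each entry as a float) is just one possible choice for illustration.

```python
import math
from dataclasses import dataclass


class ToyInteger:
    """Hypothetical scalar fragment: encodes to a single float."""

    def encode_dim(self) -> int:
        return 1

    def encode(self, value: int) -> list[float]:
        return [float(value)]


class ToyPowerOfTwo:
    """Hypothetical power-of-two fragment: encodes the exponent."""

    def encode_dim(self) -> int:
        return 1

    def encode(self, value: int) -> list[float]:
        return [math.log2(value)]


@dataclass
class ToyPermutation:
    """Hypothetical permutation fragment: one float per position."""

    length: int

    def encode_dim(self) -> int:
        return self.length

    def encode(self, value: list[int]) -> list[float]:
        return [float(v) for v in value]


@dataclass
class ToyListOf:
    """Hypothetical list fragment: concatenates the inner encodings."""

    inner: object
    length: int

    def encode_dim(self) -> int:
        return self.inner.encode_dim() * self.length

    def encode(self, value: list) -> list[float]:
        out: list[float] = []
        for item in value:
            out.extend(self.inner.encode(item))
        return out
```

In this shape a scalar fragment is just the one-element case, so a single list-returning method could cover both the scalar and the multi-dimensional fragments.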

Comment on lines 75 to 82
self.cat_dims = []
offset = 0
for spec in self.config_gen.flat_spec:
    n_dims = spec.encode_dim()
    if spec.is_categorical():
        # All dimensions of this encoder are categorical
        self.cat_dims.extend(range(offset, offset + n_dims))
    offset += n_dims
Contributor

This seems kind of error prone, can we have a cleaner way to do it?
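
One possible cleaner shape for the offset bookkeeping in the snippet above, as a sketch only: compute every spec's start offset once with `itertools.accumulate`, then derive the categorical indices from those offsets. `ToySpec` and `categorical_dims` are hypothetical names for illustration; only `encode_dim()` and `is_categorical()` are taken from the diff.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass
class ToySpec:
    """Hypothetical stand-in for an entry of flat_spec."""

    dim: int
    categorical: bool

    def encode_dim(self) -> int:
        return self.dim

    def is_categorical(self) -> bool:
        return self.categorical


def categorical_dims(flat_spec) -> list[int]:
    """Return the encoded-vector indices that belong to categorical specs."""
    widths = [spec.encode_dim() for spec in flat_spec]
    # Start offset of each spec: running sum of the widths before it.
    starts = [0, *accumulate(widths)][:-1]
    return [
        start + i
        for spec, start, width in zip(flat_spec, starts, widths)
        if spec.is_categorical()
        for i in range(width)
    ]
```

Keeping the offsets in one list means the running counter cannot drift out of step with the loop body, which is the failure mode a manually updated `offset` invites.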

@ethche ethche requested a review from jansel November 16, 2025 19:41
# Initialize config encoder
self.frac_selected = frac_selected

# compute offsets from the flat_spec
Author

Is this better?


2 participants