@hxaxd hxaxd commented Nov 7, 2025

Purpose

  • Make small vision models that use DCNv3 easier to deploy across different devices, with the same precompiled package falling back to CPU

Work

  • Provide a CPU fallback by modifying dcnv3.h
  • Modify setup.py to provide a CPU-only build option and enable O2 optimization
  • Fully implement the CPU operators
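
For reviewers, the build selection in setup.py can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the `DCNV3_CPU_ONLY` switch, the `src/...` source layout, and the `WITH_CUDA` macro are assumptions made for the example, and the function only returns a description of the build rather than calling the PyTorch extension classes.

```python
import os


def select_build_config(cuda_available, cuda_home, env=None):
    """Describe which extension type, sources, and flags a build would use."""
    env = os.environ if env is None else env
    # DCNV3_CPU_ONLY is a hypothetical switch used only in this sketch.
    force_cpu = env.get("DCNV3_CPU_ONLY", "0") == "1"

    # Start from a CPU-only build with O2 optimization for the C++ compiler.
    config = {
        "extension": "CppExtension",
        "source_globs": ["src/*.cpp", "src/cpu/*.cpp"],  # assumed layout
        "define_macros": [],
        "extra_compile_args": {"cxx": ["-O2"]},
    }

    # Add the CUDA sources only when a toolkit is present and not overridden.
    if cuda_available and cuda_home is not None and not force_cpu:
        config["extension"] = "CUDAExtension"
        config["source_globs"].append("src/cuda/*.cu")
        config["define_macros"].append(("WITH_CUDA", None))  # assumed macro
        config["extra_compile_args"]["nvcc"] = ["-O2"]
    return config
```

In a real setup.py the two boolean inputs would typically come from `torch.cuda.is_available()` and `torch.utils.cpp_extension.CUDA_HOME`, and the returned names would be mapped onto `CppExtension` or `CUDAExtension`.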

Effects

  • Keep changes to the original code and build process as small as possible
  • Accuracy passes the tests (the original tests, with the CUDA interfaces replaced by the corresponding CPU interfaces)

Issues

  • On a multi-core x86 CPU with SIMD support, the speed is only 0.27x that of the PyTorch CPU version
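
For anyone who wants to reproduce a comparison like this, a minimal timing sketch follows. It assumes the ratio is read as reference time divided by candidate time, so a value below 1 means the candidate is slower. The helper names `best_time` and `speed_ratio` are made up for this example, and the callables passed in would be whatever wraps the two operator implementations.

```python
import time


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time in seconds over several runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speed_ratio(candidate_seconds, reference_seconds):
    """Speed of the candidate relative to the reference (1.0 means equal)."""
    return reference_seconds / candidate_seconds
```

Under that reading, a ratio of 0.27 means the candidate takes about 1 / 0.27 ≈ 3.7 times as long as the reference on the same input.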
