@PanZezhong1725
Collaborator

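Implemented tensor-parallel sharding for Parameter, and fixed a few bugs along the way.
Important changes:

  1. setDevice now ignores requests to switch to CPU unless forced.
  2. Operators now perform the device check and call setDevice during the dispatch stage. Combined with item 1, this means pure-CPU operators can run under any runtime.
  3. So far only rearrange has been changed (because the copy_from function depends on it); the other operators need to be updated later.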
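
The interaction between items 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the InfiniCore implementation: `Runtime`, `Device`, `DeviceType`, the `force` parameter, and the `dispatch` helper are all hypothetical names invented for the example, and the operand check shown (both operands on the same device) is an assumption about what the device check does.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for illustration only; not the InfiniCore types.
enum class DeviceType { CPU, NVIDIA };

struct Device {
    DeviceType type;
    int index;
    bool operator==(const Device &o) const {
        return type == o.type && index == o.index;
    }
};

class Runtime {
public:
    // Item 1: a request to switch to CPU is ignored unless it is forced,
    // so the currently active accelerator stays selected.
    void setDevice(Device d, bool force = false) {
        if (d.type == DeviceType::CPU && !force) {
            return;
        }
        current_ = d;
    }

    Device current() const { return current_; }

private:
    Device current_{DeviceType::CPU, 0};
};

// Item 2: the device check and setDevice call happen at dispatch time,
// immediately before the kernel runs.
template <typename Kernel>
void dispatch(Runtime &rt, Device a, Device b, Kernel kernel) {
    if (!(a == b)) {
        throw std::runtime_error("operands are on different devices");
    }
    rt.setDevice(a); // a no-op when the operands live on the CPU
    kernel();
}
```

Under these assumptions, dispatching a CPU-only operator while an NVIDIA device is active runs the kernel and leaves the active device untouched, which is the behaviour item 2 describes.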

Screenshot of passing tests
infinicore-test --nvidia --test module
image

}

// Initialize source data
std::memset(src_memory->data(), 0xAB, data_size);
Collaborator Author

Nothing checks these values afterwards, and the write only causes a segfault on the address of the set device, so it has been removed.

@PanZezhong1725 PanZezhong1725 merged commit 53f4bc1 into main Dec 3, 2025
10 checks passed
@PanZezhong1725 PanZezhong1725 deleted the issue/682 branch December 3, 2025 01:25

Development

Successfully merging this pull request may close these issues.

[DEV] prameter: support tensor parallelism

3 participants