Conversation

@ddavis-2015 (Member)

@tensorflow/micro

Add support for alternate decompression memory to DECODE operator. Additional unit tests.
Update generic benchmark application and Makefile.

bug=fixes #3212

ddavis-2015 marked this pull request as ready for review November 7, 2025 02:53
ddavis-2015 requested a review from a team as a code owner November 7, 2025 02:53
ddavis-2015 requested a review from veblush November 7, 2025 02:53
@veblush (Collaborator) left a comment:

Thank you for the PR; I have some minor feedback.

Regarding the decompression memory: How is the required size calculated, and do we have tooling for that? Also, when is it preferable to use multiple decompression regions rather than just one?

TF_LITE_MICRO_EXPECT_TRUE(invalid_output == nullptr);
}

TF_LITE_MICRO_TEST(TestSetDecompressionMemory) {
@veblush (Collaborator):

Could you please add comments explaining why these specific state checks are necessary?

@ddavis-2015 (Member, Author):

Fixed.

kAllocateSize, tflite::MicroArenaBufferAlignment()));
TF_LITE_MICRO_EXPECT(p == &g_alt_memory[16]);

// fail next allocation
@veblush (Collaborator):

Please add a comment here clarifying that this fails due to 16-byte alignment overhead (consuming 26 of the 30 bytes after just two allocations). When I first saw this, it was puzzling because 10 x 3 = 30 ;)
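
For reference, a minimal sketch of that arithmetic (assuming a simple bump allocator over a 30-byte region, 10-byte requests, and 16-byte alignment, as in the test; this is an editor's illustration, not the TFLM allocator itself):

```cpp
#include <cstddef>
#include <cstdio>

int main() {
  constexpr size_t kRegionSize = 30;  // bytes available in the region
  constexpr size_t kAlign = 16;       // tflite::MicroArenaBufferAlignment()
  constexpr size_t kRequest = 10;     // bytes per allocation
  size_t offset = 0;
  for (int i = 1; i <= 3; ++i) {
    // Bump the offset up to the next 16-byte boundary before allocating.
    const size_t aligned = (offset + kAlign - 1) & ~(kAlign - 1);
    if (aligned + kRequest > kRegionSize) {
      std::printf("allocation %d fails: needs bytes [%zu, %zu)\n", i, aligned,
                  aligned + kRequest);
      break;
    }
    std::printf("allocation %d at offset %zu\n", i, aligned);
    offset = aligned + kRequest;
  }
  // Prints: allocation 1 at offset 0; allocation 2 at offset 16 (26 of the
  // 30 bytes now consumed); allocation 3 fails (would need bytes [32, 42)).
}
```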

@ddavis-2015 (Member, Author):

Fixed.

TfLiteTensor* output = nullptr;
TfLiteStatus status = kTfLiteOk;

micro_context->ResetDecompressionMemoryAllocations();
@veblush (Collaborator):

Does this mean that there will be only one Decode node in the entire graph? It looks like, if there were two Decode nodes, the second one's Prepare call would wipe out the memory assignments planned by the first one.

@ddavis-2015 (Member, Author):

<discussion moved to alternate venue>

size_t tensor_output_index,
TfLiteTensor* output) {
if (output->data.data != nullptr) {
// If memory has already been assigned to the tensor, leave it be
@veblush (Collaborator):

I'm curious how this would happen?

@ddavis-2015 (Member, Author), Nov 11, 2025:

<discussion moved to alternate venue>

@ddavis-2015 (Member, Author) commented Nov 11, 2025:

> Thank you for the PR; I have some minor feedback.
>
> Regarding the decompression memory: How is the required size calculated, and do we have tooling for that? Also, when is it preferable to use multiple decompression regions rather than just one?

@veblush The required size and number of regions are decided by the application. TFLM makes no assumptions about the processor memory.

There is no tooling associated with alternate decompression memory. An API for the MicroInterpreter is supplied, and it is up to the application developer how they wish to use it. An example of how the API is used is available in the Generic Benchmark application and its associated documentation.
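
To make the shape of that concrete, here is a minimal sketch of how an application might hand a dedicated region to the interpreter before tensor allocation. The names used here (`AlternateMemoryRegion`, `SetDecompressionMemory`, the region size) are assumptions inferred from this PR's tests and may not match the final API exactly; the Generic Benchmark application and its documentation are the authoritative example.

```cpp
#include <cstdint>
#include <vector>

#include "tensorflow/lite/micro/micro_interpreter.h"

// A region the application reserves (e.g. fast SRAM) for decompressed
// tensor data, separate from the main arena. The size, number, and
// placement of regions are entirely the application's decision; TFLM
// makes no assumptions about the processor memory map.
alignas(16) static uint8_t g_decompression_memory[8 * 1024];

// Assumed usage: register the region(s) before AllocateTensors() so the
// DECODE operator's Prepare step can plan allocations into them.
void RegisterDecompressionMemory(tflite::MicroInterpreter& interpreter) {
  const std::vector<tflite::MicroContext::AlternateMemoryRegion> regions = {
      {g_decompression_memory, sizeof(g_decompression_memory)},
  };
  interpreter.SetDecompressionMemory(regions);
  interpreter.AllocateTensors();
}
```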
