Optimize resources while importing pages #105

pchinery · 2019-10-08T14:13:23Z

We came across a PDF file in which every page references a single shared resource dictionary containing all fonts and images. As a result, extracting a single page makes the resulting file very large, because all of those fonts and images are embedded as well. We can provide this file for tests, if desired.

The code changes now treat cloning the resource dictionary differently from cloning other objects: the cloned resources are reduced to those that are actually used in the page content.
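To illustrate the idea, here is a minimal Python sketch. It is not the code in this PR. It assumes a hypothetical model in which the resource dictionary is a plain `dict` of categories (`Font`, `XObject`, ...) mapping names to objects, and the page content is already decoded into `bytes`. The function names `used_resource_names` and `prune_resources` are made up for this example. The scanner only recognizes the standard content stream operators that take a resource name as an operand (`Tf`, `Do`, `gs`, `cs`/`CS`, `scn`/`SCN`, `sh`, `BDC`, `DP`).

```python
import re

# Operators whose operand names a resource, mapped to the resource category.
OPERATOR_CATEGORY = {
    "Tf": "Font", "Do": "XObject", "gs": "ExtGState",
    "cs": "ColorSpace", "CS": "ColorSpace",
    "scn": "Pattern", "SCN": "Pattern",
    "sh": "Shading", "BDC": "Properties", "DP": "Properties",
}
CATEGORIES = set(OPERATOR_CATEGORY.values())
WHITESPACE = b"\x00\t\n\x0c\r "
DELIMITERS = b"()<>[]{}/%"
STOP = WHITESPACE + DELIMITERS
NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")
NAME_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")
END_OF_INLINE_IMAGE = re.compile(rb"\sEI(?=\s|$)")


def used_resource_names(content: bytes) -> dict:
    """Return {category: {names}} referenced by operators in a content stream."""
    used = {}
    names = []  # name operands seen since the last operator
    i, n = 0, len(content)
    while i < n:
        c = content[i:i + 1]
        if c in WHITESPACE:
            i += 1
        elif c == b"%":  # comment: skip to end of line
            while i < n and content[i:i + 1] not in b"\r\n":
                i += 1
        elif c == b"(":  # literal string: balanced parentheses, backslash escapes
            depth = 0
            while i < n:
                ch = content[i:i + 1]
                if ch == b"\\":
                    i += 2
                    continue
                depth += (ch == b"(") - (ch == b")")
                i += 1
                if depth == 0:
                    break
        elif content[i:i + 2] in (b"<<", b">>"):  # inline dictionary delimiters
            i += 2
        elif c == b"<":  # hex string
            end = content.find(b">", i)
            i = n if end < 0 else end + 1
        elif c == b"/":  # name operand; decode #xx escapes
            j = i + 1
            while j < n and content[j:j + 1] not in STOP:
                j += 1
            raw = NAME_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), content[i + 1:j])
            names.append(raw.decode("latin-1"))
            i = j
        elif c in DELIMITERS:  # array and other structural delimiters
            i += 1
        else:  # number, keyword, or operator
            j = i
            while j < n and content[j:j + 1] not in STOP:
                j += 1
            token = content[i:j].decode("latin-1")
            i = j
            if NUMBER.fullmatch(token) or token in ("true", "false", "null"):
                continue
            category = OPERATOR_CATEGORY.get(token)
            if category and names:
                # The resource name is the last name operand (e.g. "/Tag /MC0 BDC").
                used.setdefault(category, set()).add(names[-1])
            if token == "ID":  # inline image data: skip raw bytes up to EI
                m = END_OF_INLINE_IMAGE.search(content, i)
                i = n if m is None else m.end()
            names = []
    return used


def prune_resources(resources: dict, content: bytes) -> dict:
    """Copy `resources`, keeping only entries the content stream refers to."""
    used = used_resource_names(content)
    pruned = {}
    for category, entries in resources.items():
        if category in CATEGORIES and isinstance(entries, dict):
            kept = {k: v for k, v in entries.items() if k in used.get(category, ())}
            if kept:
                pruned[category] = kept
        else:
            pruned[category] = entries  # e.g. ProcSet: copied unchanged
    return pruned
```

The sketch errs on the side of keeping too much rather than too little: names such as `DeviceRGB` are collected but simply find no match in the resource dictionary. It does not descend into the content streams of form XObjects that are kept (those can carry their own resource dictionaries), and it ignores color space names used inside inline image dictionaries.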

A few questions remain open:

Are there (maybe indirect) ways to reference a resource from the content that are not considered here?
Is there a way to re-use the lexer/parser to identify the used resources? (Currently, this is a rather hacky implementation.)
Are there any points that we have not considered properly here?

Any feedback is greatly appreciated, and we'd love to see this ability in the main branch at some point.

Optimize resources while importing pages

a844a1b

ThomasHoevel added the review pending label Jan 13, 2023
