Update Hubspot Source to use API Incrementally #662

kevin-induro · 2025-10-23T17:05:34Z

Tell us what you do here

improving, documenting, or customizing an existing source (please link an issue or describe below)

Short description

The primary motivation is to switch the basic resources (i.e. those created from ALL_OBJECTS with the crm_objects function) from fetching all the Hubspot objects to fetching only those needed based on the incremental load date.
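As a rough illustration of the idea (not the code in this PR), an incremental fetch can be expressed as a Search API request body that filters on the last-modified property. The helper name `build_search_body`, its parameters, and the default property name are assumptions made for this sketch. Contacts, for example, are understood to use `lastmodifieddate` rather than `hs_lastmodifieddate`.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def build_search_body(
    last_modified: datetime,
    properties: List[str],
    modified_property: str = "hs_lastmodifieddate",  # assumption: differs per object type
    limit: int = 100,
    after: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a JSON body for a CRM search request (hypothetical helper).

    Only records modified at or after `last_modified` are requested,
    sorted oldest-first so the caller can advance its incremental cursor.
    """
    # The timestamp is sent as epoch milliseconds (assumed to be accepted by the API).
    since_ms = int(last_modified.timestamp() * 1000)
    body: Dict[str, Any] = {
        "filterGroups": [
            {
                "filters": [
                    {
                        "propertyName": modified_property,
                        "operator": "GTE",
                        "value": str(since_ms),
                    }
                ]
            }
        ],
        "sorts": [{"propertyName": modified_property, "direction": "ASCENDING"}],
        "properties": properties,
        "limit": limit,
    }
    if after is not None:
        # Paging cursor returned by the previous page, if any.
        body["after"] = after
    return body


if __name__ == "__main__":
    cursor = datetime(2025, 10, 1, tzinfo=timezone.utc)
    print(build_search_body(cursor, ["email", "firstname"]))
```

The body would be sent with a POST to the object's search endpoint. The previous behaviour corresponds to listing every object with no filter at all.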

Additional Context

There are still a few different things that can be improved.

I noticed that the *_property_history tables have a column for every property of the object, and all of those columns are null. They should really only have the columns _dlt_id, _dlt_load_id, object_id, property_name, source_id, source_type, timestamp, updated_by_user_id, and value.
I think there's room for improvement on the property history side as well. It'll likely be faster to use the search API to get the recent changes, then pass the IDs from that into the batch API, rather than trying to get all of the objects' histories every time.
I'm not happy with the settings properties. I think they could generally be organized better, and I think I made it worse while trying to follow the previous convention. I also think more settings could be made configurable when declaring the source, and refactoring the settings would make that easier.
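To make the property-history idea above concrete, here is a minimal sketch of the second half: taking the IDs returned by a search and splitting them into batch-read request bodies. The function names are hypothetical, and the body shape (`propertiesWithHistory`, `inputs`) and the batch size of 50 are assumptions about the batch read endpoint, not something taken from this PR.

```python
from typing import Any, Dict, Iterator, List


def chunked(ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids` with at most `size` items each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_history_batches(
    ids: List[str],
    history_properties: List[str],
    batch_size: int = 50,  # assumption: history requests allow fewer inputs per call
) -> Iterator[Dict[str, Any]]:
    """Yield one batch-read request body per chunk of recently changed IDs."""
    for chunk in chunked(ids, batch_size):
        yield {
            "propertiesWithHistory": history_properties,
            "inputs": [{"id": object_id} for object_id in chunk],
        }


if __name__ == "__main__":
    changed_ids = [str(i) for i in range(120)]  # stand-in for IDs from a search
    for body in build_history_batches(changed_ids, ["email"]):
        print(len(body["inputs"]))
```

With this shape, the number of history calls scales with the number of recently changed objects rather than with the total number of objects.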

dat-a-man

Review Summary
Reviewed and confirmed schema-level backward compatibility: table names, columns, and keys remain unchanged.

Behaviorally, the switch to the Search API introduces a few differences:

Associations are fetched per record → may increase latency and rate-limit risk for large datasets.
10k record limit → may trigger recursive fetch or full reload when timestamp overlaps occur.
Archived records always reload in full → gradual performance slowdown as archive grows.
Overall performance becomes variable (faster for small deltas, slower for heavy associations).
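The 10k-limit behaviour can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in this PR: `search` is a hypothetical callable that returns records sorted by ascending modification time, records are plain dicts with `id` and `modified` keys, and an integer offset stands in for the API's paging cursor. When a window is exhausted, the search restarts from the newest timestamp seen and skips records already yielded at that timestamp. If an entire window shares one timestamp, the cursor cannot advance and a full reload is the only option.

```python
from typing import Callable, Dict, Iterator, List, Set

Record = Dict[str, int]  # {"id": ..., "modified": ...} (simplified for the sketch)
SearchFn = Callable[[int, int, int], List[Record]]  # (since, offset, limit) -> page


def iter_modified_since(
    search: SearchFn,
    since: int,
    page_size: int = 100,
    window_cap: int = 10_000,
) -> Iterator[Record]:
    """Yield every record modified at or after `since`, past the window cap."""
    skip_ids: Set[int] = set()  # already yielded, with modified == since
    while True:
        offset = 0
        newest = since
        ids_at_newest: Set[int] = set(skip_ids)
        while offset < window_cap:
            want = min(page_size, window_cap - offset)
            page = search(since, offset, want)
            for rec in page:
                if rec["modified"] > newest:
                    newest = rec["modified"]
                    ids_at_newest = set()
                ids_at_newest.add(rec["id"])
                # Records at the boundary timestamp come back again after a restart.
                if rec["modified"] == since and rec["id"] in skip_ids:
                    continue
                yield rec
            offset += len(page)
            if len(page) < want:
                return  # result set exhausted
        if newest == since:
            # Every record in the window shares one timestamp: cannot advance.
            raise RuntimeError("timestamp overlap exceeds window; full reload needed")
        since, skip_ids = newest, ids_at_newest


if __name__ == "__main__":
    data = [{"id": i, "modified": i // 3} for i in range(25)]

    def fake_search(since: int, offset: int, limit: int) -> List[Record]:
        rows = [r for r in data if r["modified"] >= since]
        return rows[offset:offset + limit]

    ids = [r["id"] for r in iter_modified_since(fake_search, 0, page_size=4, window_cap=10)]
    print(len(ids), len(set(ids)))
```

The restart re-reads the boundary records on every window, which is one source of the variable performance noted above.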

Everything else looks solid. Small improvements possible, mainly around improving the documentation to clearly describe these changes and set expectations.

PR looks good to approve.
@anuunchin, could you please have a passing look in case there’s anything else to refine or clarify?

Thanks for the great work @kevin-induro! 🙌

kevin-induro added 5 commits October 15, 2025 14:50

move CRM endpoints to search first

3bfc440

update search endpoint to include associations

e2bf235

allow search to work past 10000

761250f

add function documentation

4394dba

fix empty array bug

ec434b8

adrianbr requested review from adrianbr and dat-a-man October 27, 2025 12:18

add calls, emails, and meetings

0e96158

dat-a-man approved these changes Nov 6, 2025


dat-a-man requested a review from anuunchin November 6, 2025 09:21
