This repository was archived by the owner on Jun 28, 2021. It is now read-only.

Add "column headers detection function" when column headers row is unknown #294

@ev45ive

Is your feature request related to a problem? Please describe.
When the column headers differ between documents and the header row sits on a different line in each one (every document has a different "intro"), I have to parse once to find the header and then parse again starting from that line.
Also, "from" vs. "from_line" becomes confusing when the header is not the first row.

Describe the solution you'd like
I think that, aside from a static row number, there should be an option to pass a function that returns true or an array of headers (column names) when it detects the column header row. Parsing would then continue from that row on, with the columns (object fields) named after the headers.
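
To make the proposed semantics concrete, here is a minimal sketch in plain JavaScript. It is not the csv-parse API: `recordsAfterHeader`, `detectHeader`, and the sample rows are all hypothetical names and made-up data, and the rows are assumed to be already split into arrays of strings. It only illustrates the "return true or an array of headers" idea in a single pass.

```javascript
// Hypothetical helper, NOT part of csv-parse.
// rows: iterable of string arrays; detectHeader: (row) => true | string[] | anything else.
function* recordsAfterHeader(rows, detectHeader) {
  let columns = null; // stays null until the header row has been found
  for (const row of rows) {
    if (columns === null) {
      const result = detectHeader(row);
      if (Array.isArray(result)) {
        columns = result; // the detector supplied the column names itself
      } else if (result === true) {
        columns = row; // use the detected row as the column names
      }
      continue; // skip intro rows and the header row itself
    }
    // Build an object whose fields are named after the headers.
    const record = {};
    columns.forEach((name, i) => {
      record[name] = row[i];
    });
    yield record;
  }
}

// Made-up sample input: two "intro" rows, then the header, then data.
const rows = [
  ['Some intro text'],
  [],
  ['id', 'name'],
  ['1', 'Ada'],
  ['2', 'Grace'],
];

// Hypothetical detector: a row is the header if it contains both expected names.
const isHeader = (row) => row.includes('id') && row.includes('name');

const records = [...recordsAfterHeader(rows, isHeader)];
console.log(records);
```

Because the generator consumes rows lazily, the same shape would in principle work on a stream, which is the one-pass behaviour I'm after.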

Describe alternatives you've considered
I've tried a lot of workarounds. The first was using on_records with a global isHeader boolean and then columns to manually convert rows to named columns. It all felt like reinventing what the library already does well (when the header is always in the same place).

The second workaround was to parse once to locate the header and then start over, parsing from that line. That seems to work, but the code is much more complicated and I cannot just stream the file and do it in one pass.

Let me know how much of a problem it would be to add header detection and, if you are accepting PRs, what the conditions / requirements are for one to be accepted without much back and forth. Maybe I could contribute. Whatever works :-)

Aside from that, great job and awesome library, guys! It helped me a lot with a LOT of huge and nasty CSV files. :-)
