Nice tool. I was really hoping, though, that it could help me tabulate messy copied text ...

bonyt · on Oct 24, 2019

Tabula is a helpful tool for extracting tables from PDFs, although it's more for large tables of data, often spanning many pages, rather than the odd copy-and-paste.

https://tabula.technology

As for your specific example, you can download tables from EDGAR in other formats, like HTML and iXBRL. The HTML table will usually paste into Excel well.

HTML: https://www.sec.gov/Archives/edgar/data/313216/0000313216190...

iXBRL: https://www.sec.gov/Archives/edgar/data/313216/0000313216180...

https://en.wikipedia.org/wiki/XBRL
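
If pasting the HTML table into Excel is not convenient, the same table can also be turned into CSV with a few lines of code. This is only a rough sketch using Python's standard library; the names `TableExtractor` and `html_tables_to_csv` are made up for illustration, and it assumes reasonably well-formed markup with no nested tables, which real filings may not satisfy.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace in the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def html_tables_to_csv(html):
    """Return one CSV string per <table> found in the given HTML text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    results = []
    for table in parser.tables:
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerows(table)
        results.append(buf.getvalue())
    return results


if __name__ == "__main__":
    sample = (
        "<table><tr><th>Item</th><th>2019</th></tr>"
        "<tr><td>Revenue</td><td>1,234</td></tr></table>"
    )
    print(html_tables_to_csv(sample)[0])
```

Cells that contain commas (such as `1,234`) are quoted by the `csv` writer, so the resulting file should still open in Excel with the columns intact.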

dougbarrett · on Oct 24, 2019

I appreciate the feedback!

The unfortunate part is that it parses the data based on the characters it finds in the text being processed. I'm guessing the data in your PDF is positioned using X/Y coordinates, so when you copy it from your PDF reader no consistent delimiter comes along with it, which is why it can't be formatted correctly.

I will definitely look at the document and see if my assumptions are incorrect, and if there is a different delimiter being used then it may be something I can work with.
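
To make the delimiter problem concrete, here is a rough sketch of the general idea, not the tool's actual code. The function names, the candidate list, and the "two or more spaces" fallback are all assumptions chosen for illustration: a delimiter is accepted only if it appears the same nonzero number of times on every line, and text copied from a coordinate-positioned PDF usually fails that check.

```python
import re

# Hypothetical candidate list; a real tool may consider other characters.
CANDIDATES = ["\t", ",", ";", "|"]


def guess_delimiter(text):
    """Return a delimiter that occurs equally often on every non-blank line."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return None
    for delim in CANDIDATES:
        counts = {ln.count(delim) for ln in lines}
        # Exactly one distinct count, and that count is not zero.
        if len(counts) == 1 and counts != {0}:
            return delim
    return None


def tabulate(text):
    """Split text into rows of cells, falling back to runs of 2+ spaces."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    delim = guess_delimiter(text)
    if delim is None:
        # Assumed heuristic: columns are separated by at least two
        # whitespace characters, while words within a cell use one.
        return [re.split(r"\s{2,}", ln) for ln in lines]
    return [[cell.strip() for cell in ln.split(delim)] for ln in lines]


if __name__ == "__main__":
    pasted = "Net income   12   34\nRevenue   1,234   5,678"
    print(guess_delimiter(pasted))
    print(tabulate(pasted))
```

In the sample, the commas are thousands separators and appear on only one line, so no delimiter is detected and the fallback is used. When the PDF reader emits only single spaces between columns, even this fallback has nothing to work with.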

udayrddy · on Oct 24, 2019

Not sure whether this was complicated for others or just easy for us, but I've attached the https://extracttable.com conversion for you.

  Input & output - https://imgur.com/a/5j8Bblo

  Table# 1 csv - https://json-csv.com/c/2faG

  Table# 2 csv - https://json-csv.com/c/WESI

martishrek666 · on Oct 24, 2019

You need someone like the guy in this video to handle the Excel part... it is a skill in itself.

https://youtu.be/poyf3Cnb-MQ?t=2724