Design function returning start/end indices for key sections in SC 13D and SC 13G #69
bdcallen added a commit that referenced this issue on Jan 31, 2020:
- Includes function get_key_indices, which is designed to return a one-row dataframe containing key information on each SC 13D or SC 13G document
- Includes write_indexes_to_table, which writes the results of get_key_indices into edgar.sc13dg_indexes
- Relates to #69
bdcallen added a commit that referenced this issue on Feb 6, 2020:
- Includes code to clean text and get bounded segments of the text
- Relates to #69
bdcallen added a commit that referenced this issue on Feb 7, 2020
bdcallen added a commit that referenced this issue on Feb 19, 2020:
- Some changes made due to mistakes/problems found in outcomes from table
- Relates to #69
bdcallen added a commit that referenced this issue on Mar 11, 2020
Based on the initial comment in issue #62, the vast majority of SC 13D and SC 13G forms seem to follow a common structure: a header, then a title page, then the cover pages with questions 1 to 14 (or 1 to 12 for 13G), then an item section, then signatures, then exhibits. For most forms (something like 90%), the starts and ends of these sections can be found with a small number of key regular expressions, just as in the simpler case of the CUSIP numbers.
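To make the regular-expression idea concrete, here is a minimal Python sketch that locates the first occurrence of each section marker. The patterns and the name `find_section_starts` are illustrative assumptions, not the expressions used in this project; real filings vary a lot in spacing and wording, so a working set would need to be broader.

```python
import re

# Illustrative patterns only (assumptions): one marker per section.
SECTION_PATTERNS = {
    "cover_page": re.compile(r"NAMES?\s+OF\s+REPORTING\s+PERSONS?", re.IGNORECASE),
    "items": re.compile(r"^[ \t]*ITEM[ \t]+1\b", re.IGNORECASE | re.MULTILINE),
    "signature": re.compile(r"^[ \t]*SIGNATURES?\b", re.IGNORECASE | re.MULTILINE),
    "exhibits": re.compile(r"^[ \t]*EXHIBIT\b", re.IGNORECASE | re.MULTILINE),
}


def find_section_starts(text):
    """Return a dict mapping each section name to the index of its first
    match in `text`, or None when the marker is not found."""
    starts = {}
    for name, pattern in SECTION_PATTERNS.items():
        match = pattern.search(text)
        starts[name] = match.start() if match else None
    return starts
```

A section whose marker comes back as `None` is a candidate for the roughly 10% of filings that need separate handling.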
I think we need a function that finds the starting and ending indices of these sections, along with other information such as whether the form is of an alternate style, upper/lower bounds, and so on. It would also help to have a program that writes this function's output to a table in the database, so that we can get key information on cases that do not follow the normal pattern.
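As a rough sketch of what such a function and writer could look like, the Python below builds one flat record per document (start and end index for each section, plus a flag for whether the normal ordering was found) and inserts it into a table. Everything here is an assumption for illustration: the patterns, the names `sketch_section_bounds` and `write_bounds_row`, and the use of an SQLite database as a stand-in for the project's real database.

```python
import re
import sqlite3

# Hypothetical patterns, listed in the order the sections normally appear.
ORDERED_PATTERNS = [
    ("cover_page", re.compile(r"NAMES?\s+OF\s+REPORTING\s+PERSONS?", re.I)),
    ("items", re.compile(r"^[ \t]*ITEM[ \t]+1\b", re.I | re.M)),
    ("signature", re.compile(r"^[ \t]*SIGNATURES?\b", re.I | re.M)),
]


def sketch_section_bounds(file_name, text):
    """Return one flat record with start/end indices for each section."""
    starts = []
    for _, pattern in ORDERED_PATTERNS:
        match = pattern.search(text)
        starts.append(match.start() if match else None)

    # "Normal" here means every marker was found and they appear in order.
    found = [s for s in starts if s is not None]
    normal = len(found) == len(starts) and found == sorted(found)

    record = {"file_name": file_name, "follows_normal_pattern": int(normal)}
    for i, (name, _) in enumerate(ORDERED_PATTERNS):
        start = starts[i]
        if start is None:
            end = None
        else:
            # A section ends where the next located section begins,
            # or at the end of the text if none follows.
            later = [s for s in starts[i + 1:] if s is not None and s > start]
            end = min(later) if later else len(text)
        record[f"{name}_start"] = start
        record[f"{name}_end"] = end
    return record


def write_bounds_row(conn, record, table="sc13dg_indexes"):
    """Insert one record into `table`, creating the table if needed."""
    columns = ", ".join(record)
    placeholders = ", ".join("?" for _ in record)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        tuple(record.values()),
    )
    conn.commit()
```

Filtering the table on `follows_normal_pattern = 0` then gives the list of filings that need a closer look.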