I am looking for a Python developer to extract competitor names from approximately 1,400 corporate 10-K filings from EDGAR. The extracted names must then be matched against another list of firm names.
The application would need to do the following:
1. Identify the names of competitors mentioned in the 10-K filings, which I will provide in .txt format. For example, the attached 10-K sample contains the following statement: “Competitors for our analog and DSP products include Broadcom Corporation, Cirrus Logic Inc., Infineon Technologies, Linear Technology Corporation, Maxim Integrated Products, Inc., National Semiconductor Corporation, Phillips Semiconductor, ST Microelectronics and Texas Instruments, Inc.”
2. Extract the names and store them in an Excel file called SEC: Column A: Year, Column B: Competitor Name. Note that the year is stated in the file name (e.g., xxxx-05-xxxxx = 2005).
Deliverable: Python script that extracts names
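As a rough illustration of steps 1 and 2, the following is a minimal sketch, not a finished extractor. It assumes the competitor sentence has already been isolated from the filing and follows the "Competitors ... include ..." pattern of the sample above. The function names, the suffix list, the example file name, and the rule for expanding the two-digit year are all assumptions made for illustration.

```python
import re

# Assumed file-name shape: digits, a two-digit year, digits (xxxx-05-xxxxx).
ACCESSION_RE = re.compile(r"(\d+)-(\d{2})-(\d+)")

# Assumed list of legal suffixes that may follow a comma inside one firm name.
SUFFIXES = {"inc", "inc.", "corp", "corp.", "ltd", "ltd.", "llc", "co", "co."}


def year_from_filename(filename):
    """Return the four-digit filing year encoded in the file name, or None."""
    match = ACCESSION_RE.search(filename)
    if match is None:
        return None
    yy = int(match.group(2))
    # Assumption: 90-99 means 19xx, anything else means 20xx.
    return 1900 + yy if yy >= 90 else 2000 + yy


def competitors_in_sentence(sentence):
    """Split the list that follows 'competitors ... include' into firm names."""
    match = re.search(
        r"\bcompetitors\b.*?\binclude[sd]?\b\s*(.+)",
        sentence,
        flags=re.IGNORECASE | re.DOTALL,
    )
    if match is None:
        return []
    # Split on commas and on " and "; names that themselves contain " and "
    # would be split incorrectly and need extra handling.
    parts = re.split(r",\s*|\s+and\s+", match.group(1).strip())
    names = []
    for part in parts:
        part = part.strip()
        if not part:
            continue
        if part.lower() in SUFFIXES and names:
            # Re-attach a suffix such as "Inc." to the preceding name.
            names[-1] = names[-1] + ", " + part
        else:
            names.append(part)
    return names


def rows_for_filing(filename, sentence):
    """Build (year, competitor name) rows for columns A and B of the SEC file."""
    year = year_from_filename(filename)
    return [(year, name) for name in competitors_in_sentence(sentence)]
```

The resulting rows could then be written to the SEC workbook with a spreadsheet library such as openpyxl or pandas. Real filings phrase competitor disclosures in many different ways, so a single pattern like this would likely miss a share of them.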
3. Match all rows in the SEC Excel file against a master file named MASTER, which contains several thousand firm names, in order to identify the firms that have been listed as a competitor in step
4. For every match found, update the column ACCESSION NUMBER in the MASTER Excel file with the file name
Deliverable: Python script that matches names
Important Note: the firm names in MASTER will often not be identical to those in the SEC file. For example, MASTER may list a firm name as Broadcom Corp. whereas the SEC Excel file will list it as Broadcom Corporation.
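One possible way to handle such name differences is to normalize both names before comparing them and then accept the closest candidate above a similarity threshold. The sketch below uses `difflib.SequenceMatcher` from the Python standard library. The abbreviation table, the function names, and the 0.9 threshold are assumptions for illustration and would need tuning against a validated sample.

```python
import re
from difflib import SequenceMatcher

# Assumed mapping of long legal forms to a common abbreviation.
LEGAL_WORDS = {
    "corporation": "corp",
    "incorporated": "inc",
    "limited": "ltd",
    "company": "co",
}


def normalize(name):
    """Lower-case, drop punctuation, and abbreviate common legal forms."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    words = [LEGAL_WORDS.get(word, word) for word in cleaned.split()]
    return " ".join(words)


def best_match(sec_name, master_names, threshold=0.9):
    """Return the MASTER name most similar to sec_name, or None if too far off."""
    target = normalize(sec_name)
    best_name, best_score = None, 0.0
    for candidate in master_names:
        score = SequenceMatcher(None, target, normalize(candidate)).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    return best_name if best_score >= threshold else None
```

For example, `best_match("Broadcom Corporation", ["Broadcom Corp.", "Cirrus Logic Inc"])` returns `"Broadcom Corp."`, because both names normalize to `broadcom corp`. For every match, the script would then write the source file name into the ACCESSION NUMBER column of the matched MASTER row.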
ADDITIONAL General Information
Information about accessing EDGAR data can be found at: [login to view URL]
The EDGAR user interface is at [login to view URL] (New Version) or [login to view URL] (Old Version)
An example of an SEC filing: [login to view URL]
Project preferences and deliveries:
• One-time Project: Develop script for academic research project
• Language: Python
• Starting date: Immediately
• Delivery date: will be mutually agreed upon
• Project Stage: Fully Specified
Fixed Price Project with two milestones: (1) after Step 2 has been completed and validated and (2) after Step 4 has been completed and validated. Validation will be based on a random sample and on verification of the accuracy of the script and its output.
Project Type: One-time project
The SEC 10-K filings for step 1 and the MASTER file for step 3 can be downloaded from my Dropbox: https://www.dropbox.com/sh/vae37t0zuzclbe7/AADR8Y9w4RC-i-BUY8Ww0_1wa?dl=0
I am well versed in Python scraping and text processing. The [login to view URL] site seems to have some latency in its responses, but the number of requests you need does not seem to be large, right? I would be glad to discuss more details.
Hello. After reviewing your post, I am very interested in this project because it matches my experience. I am an expert Python web scraper. You will get a perfect result. Best regards.
I have 1-2 years of experience building custom web scrapers with Python, and I have been providing quality data to my clients from a wide range of sources. Message me for further clarification of the project.
Hi. I can create automated scripts to scrape websites, auto-click, and format txt, csv, xls, xlsx, doc, docx, rtf, json, xml, and database files as you request. I can start right now.