Gaming SEC Filings Using Machine Learning to Detect Vectors and Sentiment in Reporting Language

Date
Wednesday, April 10, 2019
Presenters
Steven Cyphers
Presenter Company
GHD
City
Los Angeles
Location
US
Event
FME World Tour 2019
Session Type
User
Industry
AEC (Architecture Engineering and Construction)

Presentation Details

Using FME, we build an API to collect and clean quarterly filings made to the US Securities and Exchange Commission (SEC) from the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) website. With FME quickly pooling the filing data, we perform sentiment analysis on the cleaned, unstructured Management's Discussion and Analysis (MD&A) text. We apply word-to-vector strategies to tokenize the fairly boilerplate text and sort the companies into two groups: changers and non-changers. The grouping relies mainly on graphing deltas in cosine similarity between the tokenized word vectors, and we also use word-count vectors to flag language that is unattractive for investment. The end goal of this analysis is to forecast abnormal returns and find diversification opportunities that align with our existing clients.
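
As a rough illustration of the comparison described above, the following Python sketch builds word-count vectors from successive MD&A texts, computes the cosine similarity between consecutive filings, and counts flagged words. It is not the presenters' FME workflow. The function names, the 0.9 threshold, and the NEGATIVE_WORDS list are all assumptions made for this example; the abstract does not state which threshold or word list was used.

```python
import math
import re
from collections import Counter

# Hypothetical word list for illustration only; the talk does not name one.
NEGATIVE_WORDS = {"litigation", "impairment", "restructuring", "default"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def similarity_series(filings):
    """Similarity of each filing to the one before it, in filing order."""
    vectors = [Counter(tokenize(text)) for text in filings]
    return [cosine_similarity(p, c) for p, c in zip(vectors, vectors[1:])]


def is_changer(filings, threshold=0.9):
    """Assumed rule: a company is a 'changer' if any consecutive pair of
    filings falls below the similarity threshold."""
    return any(s < threshold for s in similarity_series(filings))


def negative_word_count(text):
    """Count occurrences of the flagged words in one filing."""
    counts = Counter(tokenize(text))
    return sum(counts[w] for w in NEGATIVE_WORDS)
```

Calling `similarity_series` on one company's MD&A texts in chronological order gives the quarter-to-quarter values whose changes could then be graphed. A sharp drop suggests the company rewrote its usually boilerplate language.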