Built a machine learning bot that detects whether a given sentence is a joke or not.
Workflow
- Scrape jokes from a joke website and other statements from a different website using pyScrapy
  - We used snipplets.com for factual statements and rinkworks.com for jokes
- Pre-process the data so it is uniform and no single feature dominates
  - For example, removing the "knock-knock" opener from knock-knock jokes
  - Limiting joke length, since our factual statements were quite short
- Split the data into two sets: training data and testing data.
- Devise a model to fit on the training data.
  - We used a bag-of-words model, which represents each sentence by the words it contains and how often each one occurs, ignoring word order
- Test the model using the testing data!
  - We achieved a score of 0.7 using scikit-learn's cross_val_score along with our testing data.
  - Surprisingly high, considering we only used two websites!
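
The pre-processing and bag-of-words steps above could look roughly like the sketch below. It is not our exact code: the `MAX_WORDS` cap, the lowercasing, the regular expressions, and the helper names `preprocess` and `bag_of_words` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical cap on sentence length, in words (the real limit is not stated).
MAX_WORDS = 30

# Matches a leading "knock knock" / "knock-knock" / "Knock, knock." opener.
KNOCK_OPENER = re.compile(r"^\s*knock\W+knock\W*", re.IGNORECASE)


def preprocess(text):
    """Strip the knock-knock opener and lowercase; return None if empty or too long."""
    cleaned = KNOCK_OPENER.sub("", text).strip().lower()
    if not cleaned or len(cleaned.split()) > MAX_WORDS:
        return None
    return cleaned


def bag_of_words(text):
    """Count how often each word occurs, ignoring word order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))
```

Because only word counts are kept, `bag_of_words("the cat saw the dog")` and `bag_of_words("dog the saw cat the")` produce the same result, which is exactly what "ignoring word order" means here.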
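
For the split, model, and scoring steps, here is a minimal scikit-learn sketch. The tiny dataset is made up for illustration, and the choice of `MultinomialNB` as the classifier on top of the bag-of-words counts is an assumption, since the classifier we used is not named above. The split ratio and fold count are also illustrative.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up examples for illustration only; 1 = joke, 0 = factual statement.
jokes = [
    "why did the chicken cross the road to get to the other side",
    "i told my computer a joke but it did not get it",
    "what do you call a fish with no eyes a fsh",
    "why was the math book sad because it had too many problems",
    "what do you call a bear with no teeth a gummy bear",
    "why did the scarecrow win an award he was outstanding in his field",
    "what do you call fake spaghetti an impasta",
    "why do cows wear bells because their horns do not work",
]
facts = [
    "water boils at one hundred degrees celsius at sea level",
    "the earth orbits the sun once every year",
    "honey does not spoil when it is stored properly",
    "the human heart has four chambers",
    "mount everest is the highest mountain above sea level",
    "light travels faster than sound",
    "an octopus has three hearts",
    "the pacific is the largest ocean on earth",
]
texts = jokes + facts
labels = [1] * len(jokes) + [0] * len(facts)

# Hold out a quarter of the data for testing, keeping both classes balanced.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# CountVectorizer builds the bag-of-words counts; the classifier is an assumption.
model = make_pipeline(CountVectorizer(), MultinomialNB())

# 3-fold cross-validation on the training data (default metric: accuracy).
scores = cross_val_score(model, X_train, y_train, cv=3)

# Fit on all training data, then score once on the held-out testing data.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

print("cross-validation scores:", scores)
print("held-out accuracy:", test_accuracy)
```

With a dataset this small the numbers are not meaningful; the point is only to show how the pieces fit together.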