Posts with tag RateMyProfessorBack to all posts
My last post provided a general introduction to the new word embedding of language (WEMs), and introduced an R package for easily performing basic operations on them. It was geared mostly towards people in the Digital Humanities community. This post looks more closely at a single word2vec model I’ve trained, on about 14 million reviews of faculty members from ratemyprofessors.com,
To be precise: it is a 500-dimensional skip-gram model with window of about 12 on lowercased, punctuation-free text using the original word2vec C code. I’ve then heavily culled the vocabulary to remove words that usually appear uppercased, on the assumption that they are proper nouns.