nlp - Predicting next word with text2vec in R


I am building a language model in R to predict the next word in a sentence based on the previous words. The model is a simple n-gram model with Kneser-Ney smoothing. It predicts the next word by finding the n-gram with the maximum probability (frequency) in the training set, and the smoothing offers a way to interpolate lower-order n-grams, which can be advantageous in cases where higher-order n-grams have low frequency and may not offer a reliable prediction. While this method works reasonably well, it fails in cases where the n-gram cannot capture the context. For example, "It is warm and sunny outside, let's go to the..." and "It is cold and raining outside, let's go to the..." suggest the same prediction, because the context of the weather is not captured in the last n-gram (assuming n < 5).
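For illustration, here is a minimal sketch of the kind of frequency-based n-gram lookup described above, without the Kneser-Ney smoothing or back-off; the toy corpus and function names are made up for the example:

```r
# Minimal sketch: frequency-based trigram prediction (no smoothing, no back-off).
tokenize <- function(x) unlist(strsplit(tolower(x), "[^a-z']+"))

build_trigrams <- function(corpus) {
  words <- tokenize(corpus)
  n <- length(words)
  data.frame(w1 = words[1:(n - 2)],
             w2 = words[2:(n - 1)],
             w3 = words[3:n],
             stringsAsFactors = FALSE)
}

predict_next <- function(trigrams, w1, w2) {
  cand <- trigrams[trigrams$w1 == w1 & trigrams$w2 == w2, "w3"]
  if (length(cand) == 0) return(NA_character_)   # the real model would back off here
  names(sort(table(cand), decreasing = TRUE))[1] # most frequent continuation
}

corpus <- c("let's go to the beach", "let's go to the park", "go to the beach")
tg <- build_trigrams(corpus)
predict_next(tg, "to", "the")   # "beach"
```

A lookup like this only ever sees the last n-1 words, which is exactly the limitation described: "warm and sunny" versus "cold and raining" earlier in the sentence cannot influence the prediction.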

I was looking for more advanced methods and found the text2vec package, which allows mapping words to a vector space where words with similar meaning are represented by similar (close) vectors. I have a feeling that this representation can be helpful for next-word prediction, but I cannot figure out how to define the training task. My question is whether text2vec is the right tool to use for next-word prediction and, if yes, what suitable prediction algorithm can be used for this task?
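For reference, training GloVe word vectors with text2vec looks roughly like the sketch below (toy corpus; argument names such as rank have changed between text2vec versions, so check the documentation for your installed version):

```r
library(text2vec)

text <- c("it is warm and sunny outside let's go to the beach",
          "it is cold and raining outside let's go to the cinema")

it         <- itoken(word_tokenizer(tolower(text)), progressbar = FALSE)
vocab      <- create_vocabulary(it)
vectorizer <- vocab_vectorizer(vocab)

# Term co-occurrence matrix within a +/- 5 word window
tcm <- create_tcm(it, vectorizer, skip_grams_window = 5)

# Fit GloVe; older text2vec releases used word_vectors_size/vocabulary arguments instead
glove        <- GlobalVectors$new(rank = 50, x_max = 10)
wv_main      <- glove$fit_transform(tcm, n_iter = 20)
word_vectors <- wv_main + t(glove$components)  # combine main and context vectors
```

These vectors capture the "similar words are close" part, but on their own they do not define a next-word prediction task; that is where a separate sequence model would come in.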

You can try char-rnn or word-rnn (google it a little bit). For the character-level model there is an R/mxnet implementation; take a look at the mxnet examples. It should be possible to extend that code to a word-level model using text2vec GloVe embeddings.
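As one hedged illustration of the embedding side of this (not the RNN itself), cosine neighbours of a word can be inspected with text2vec's sim2(); `word_vectors` refers to a matrix of GloVe vectors such as the one in the sketch above:

```r
library(text2vec)

# Assumes `word_vectors` is a matrix of GloVe vectors with one row per word,
# e.g. the output of GlobalVectors$fit_transform() plus the context vectors.
query <- word_vectors["beach", , drop = FALSE]
sims  <- sim2(word_vectors, query, method = "cosine", norm = "l2")
head(sort(sims[, 1], decreasing = TRUE), 5)  # words closest to "beach"
```

In a word-level RNN, those same vectors would typically serve as the (fixed or trainable) weights of the embedding layer, so the network sees "sunny" and "warm" as nearby points rather than unrelated symbols.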

If you have success, let us know (I mean the text2vec and/or mxnet developers). It would be a very interesting case for the R community. I wanted to run such a model/experiment myself, but still haven't had the time for it.

