Random Vector Accumulator

  • Tried running RVA on a single processor: it processed 0.67% of the corpus in 25 seconds, but it runs out of memory as the corpus size grows.
  • Using the Python module shelve to store the index dictionary on disk and read it back from there. RVA_single.py uses only one processor and takes 180 seconds, which is faster than Word2Vec, which takes 200 seconds when run on 8 processors.
  • The code is currently processing the whole corpus. I think the difference in time between RVA_single.py and RVA.py is due to the multiprocessing proxy dict being slower than a regular Python dictionary.
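
As a rough illustration of the shelve approach mentioned above, here is a minimal sketch of keeping the index dictionary on disk instead of in memory. It is not the actual RVA_single.py code: the layout of the index (one sparse +1/-1 random vector per word), the names DIM, NONZERO, make_index_vector, build_index and accumulate, and the toy corpus are all assumptions made for the example.

```python
import os
import random
import shelve
import tempfile

DIM = 16      # hypothetical vector dimensionality
NONZERO = 4   # hypothetical number of +1/-1 entries per index vector


def make_index_vector(rng):
    """Return a sparse random vector (mostly zeros, a few +1/-1) as a list."""
    vec = [0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1, 1))
    return vec


def build_index(tokens, path, seed=0):
    """Store one random index vector per distinct token in a shelf on disk."""
    rng = random.Random(seed)
    with shelve.open(path) as index:
        for tok in tokens:
            if tok not in index:  # shelf keys must be strings
                index[tok] = make_index_vector(rng)


def accumulate(tokens, path, window=1):
    """Sum the index vectors of neighbouring tokens into context vectors."""
    context = {}
    with shelve.open(path, flag="r") as index:  # read-only access
        for i, tok in enumerate(tokens):
            acc = context.setdefault(tok, [0] * DIM)
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    # Each lookup unpickles the vector from disk.
                    for k, v in enumerate(index[tokens[j]]):
                        acc[k] += v
    return context


tokens = "the cat sat on the mat".split()
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "rva_index")
    build_index(tokens, path)
    vectors = accumulate(tokens, path)
```

In this sketch only the index vectors live on disk; the accumulated context vectors are still held in an ordinary dict, so a larger corpus might need those in a shelf as well. Every shelf lookup goes through pickle and the underlying dbm file, so it trades some speed for a smaller memory footprint.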
