- The RVA code took 1898 seconds to run on 12,487,761 words (0.67% of the corpus), while Word2Vec took only 200 seconds, making RVA roughly 9.5x slower.
- The method of generating sparse random vectors needs to be made more efficient.
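The notes don't show how the sparse random vectors are currently generated, but a common bottleneck is building them element by element in a Python loop. A minimal sketch of a vectorized alternative with NumPy, assuming the usual random-indexing convention of ternary vectors (a few +1/-1 entries, zeros elsewhere); the function name and parameters are illustrative, not taken from the actual RVA code:

```python
import numpy as np

def sparse_random_vector(dim=200, nnz=10, rng=None):
    """Build a ternary sparse index vector: exactly `nnz` nonzero
    entries drawn as +1 or -1, the rest zero.

    All the work is done by vectorized NumPy calls, which is typically
    much faster than filling the vector in a Python-level loop.
    """
    rng = rng or np.random.default_rng()
    vec = np.zeros(dim, dtype=np.int8)
    # Distinct positions for the nonzeros, so the count is exactly nnz.
    idx = rng.choice(dim, size=nnz, replace=False)
    # Random +1/-1 signs for those positions.
    vec[idx] = rng.choice(np.array([1, -1], dtype=np.int8), size=nnz)
    return vec
```

Generating the vectors for the whole vocabulary up front (one `(vocab_size, dim)` int8 array) would also amortize the per-call overhead.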
- When I checked CPU usage while generating Word2Vec embeddings, it was at 100% across all the workers, but for RVA it never crossed 30%. For some reason Python's multiprocessing library is unable to utilize 100% of the CPU.
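One common cause of this pattern is task granularity: if each task handed to the pool is small, workers spend most of their time idle while the parent process pickles arguments and collects results. The notes don't show the actual RVA parallel loop, so this is only a hedged sketch of the usual remedy, batching the corpus into large chunks before dispatching; `count_tokens` is a hypothetical stand-in for the per-chunk RVA accumulation work:

```python
from multiprocessing import Pool

def count_tokens(chunk):
    # Hypothetical stand-in for the real per-chunk RVA work:
    # here it just counts whitespace-separated tokens.
    return sum(len(line.split()) for line in chunk)

def run(lines, n_workers=4, chunk_size=10_000):
    """Split the corpus into large chunks and process them in parallel.

    Large chunks keep the workers busy and amortize the pickling /
    inter-process communication overhead that can cap CPU usage.
    """
    chunks = [lines[i:i + chunk_size]
              for i in range(0, len(lines), chunk_size)]
    with Pool(n_workers) as pool:
        return sum(pool.map(count_tokens, chunks))
```

If chunking alone doesn't help, the work function itself may be the limit: pure-Python inner loops hold the GIL inside each worker anyway, so pushing the accumulation into NumPy operations per chunk is usually the bigger win.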
- Visualizing the RVA embeddings with t-SNE (imported from sklearn), RVA was able to place a few similar concepts together. The embeddings in the image have dimension 200 and were generated using a window size of 10.
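The visualization step above can be sketched roughly as follows. This is a minimal reconstruction, assuming the embeddings are available as a words list plus a matching `(n_words, 200)` array; the function name, the output filename, and the t-SNE settings (perplexity, PCA init) are assumptions, not taken from the original script:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs headless
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_embeddings(words, vectors, perplexity=30, seed=42,
                    out_path="rva_tsne.png"):
    """Project embeddings to 2-D with t-SNE and save a labeled scatter plot.

    `perplexity` must be smaller than the number of points; PCA init
    tends to give more stable layouts than the random default.
    """
    tsne = TSNE(n_components=2, perplexity=perplexity,
                init="pca", random_state=seed)
    coords = tsne.fit_transform(np.asarray(vectors))
    plt.figure(figsize=(10, 10))
    plt.scatter(coords[:, 0], coords[:, 1], s=5)
    for word, (x, y) in zip(words, coords):
        plt.annotate(word, (x, y), fontsize=8)
    plt.savefig(out_path, dpi=150)
    plt.close()
    return coords
```

Since t-SNE's layout depends strongly on perplexity and the random seed, it's worth trying a few perplexity values before concluding how well the embeddings cluster.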