Word2vec on Wave DPU (MIMD)

DPU-accelerated Word2vec training algorithm
Based on Google reference design (CBOW)
  • Converted the floating-point reference design to fixed point
  • Mapped the complex software implementation to the MIMD Wave DPU
Complex MIMD design with
  • approx. 24% device utilization
  • Wave DPU has 16k PEs
  • Word2vec utilizes 7600 PEs
Performance (training time)
  • 4x performance gain over software reference design
Trained model achieved 30% accuracy on Wiki corpus
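
The floating-point to fixed-point conversion mentioned above can be illustrated with a minimal sketch. The slide does not state the number format used on the DPU, so the 16-bit word with 12 fractional bits, the saturation behavior, and all function names here are assumptions chosen for illustration only.

```python
# Minimal fixed-point sketch (assumed format: signed 16-bit, 12 fractional bits).
# These constants are hypothetical; the format used on the Wave DPU is not given.
FRAC_BITS = 12
WORD_BITS = 16
SCALE = 1 << FRAC_BITS
Q_MAX = (1 << (WORD_BITS - 1)) - 1
Q_MIN = -(1 << (WORD_BITS - 1))
HALF = 1 << (FRAC_BITS - 1)  # added before shifting to round to nearest


def saturate(value: int) -> int:
    """Clamp an integer to the representable signed range."""
    return max(Q_MIN, min(Q_MAX, value))


def to_fixed(x: float) -> int:
    """Quantize a float to the assumed fixed-point format."""
    return saturate(int(round(x * SCALE)))


def to_float(q: int) -> float:
    """Convert a fixed-point value back to a float."""
    return q / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values and rescale the product."""
    return saturate((a * b + HALF) >> FRAC_BITS)


def fixed_dot(u: list[int], v: list[int]) -> int:
    """Dot product with a wide accumulator, rescaled once at the end."""
    acc = 0
    for a, b in zip(u, v):
        acc += a * b
    return saturate((acc + HALF) >> FRAC_BITS)


if __name__ == "__main__":
    hidden = [to_fixed(0.5), to_fixed(-0.25)]
    output = [to_fixed(0.5), to_fixed(0.5)]
    print(to_float(fixed_dot(hidden, output)))
```

Rescaling once after accumulation, rather than after every multiply, keeps rounding error lower in the dot products that dominate CBOW training; whether the DPU design does this is not stated in the source.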