Abstract: When run under the Parameter Server (PS) architecture, Distributed Stochastic Gradient Descent (D-SGD) incurs significant communication delays and high communication overhead due to the model ...