What
- Matrix multiplication factorization
- Split a large matrix into smaller matrices and multiply each piece separately.
- Run each piece on a different GPU, or run all pieces on the same GPU but separately.
Hence
- Reduces the memory each multiplication needs for both the weights and the activations
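
A minimal sketch of the idea, in plain Python so it runs without a GPU. It assumes the weight matrix is split by columns, which is only one possible way to split; the notes above do not say which axis is used. The helper names (`matmul`, `split_columns`, `sharded_matmul`) are hypothetical and chosen for illustration. Each column block could be placed on its own GPU, or processed one at a time on a single GPU.

```python
def matmul(a, b):
    """Plain nested-list matrix product: (m x k) @ (k x n) -> (m x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(w, parts):
    """Split matrix w into `parts` column blocks of nearly equal width."""
    n = len(w[0])
    bounds = [n * i // parts for i in range(parts + 1)]
    return [[row[lo:hi] for row in w] for lo, hi in zip(bounds, bounds[1:])]


def sharded_matmul(x, w, parts=2):
    """Compute x @ w one column block at a time, then join the partial outputs."""
    # Each partial product only needs one block of the weights and
    # produces one block of the output activations.
    partials = [matmul(x, shard) for shard in split_columns(w, parts)]
    # Concatenate the partial outputs row by row to rebuild the full result.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]


x = [[1, 2], [3, 4]]              # input activations (2 x 2)
w = [[1, 0, 2, 1], [0, 1, 1, 3]]  # weights (2 x 4), split into two 2 x 2 blocks

full = matmul(x, w)
sharded = sharded_matmul(x, w, parts=2)
print(sharded == full)
```

The sharded result matches the unsplit product, while each individual multiplication only touches half of the weight columns and produces half of the output columns.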