$LU$ Factorization Algorithms on Distributed-Memory Multiprocessor Architectures