This section compares articles and methods that apply the Interior Point Method (IPM) to solve SVM problems in parallel; we highlight the articles in the survey with notable findings. Each entry lists the algorithm, the problem type, the parallelism type, the main idea, its pros and cons, and the reference.


Incremental Approximate Matrix Factorization IPM
Problem type: Non-linear, binary classification
Parallelism type: Shared-memory parallelism
Main idea: Employs incremental approximate matrix factorization using ICF and Kronecker factorizations (a minimal ICF sketch follows this entry)
Pros:
  • Reduces memory requirements
  • Strikes a balance between accuracy and computation time
Cons:
  • Suffers from numerical instability
  • Involves a trade-off between speedup and accuracy
  • May be constrained by limited shared memory
Reference: Wu et al. [2006]
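
To make the ICF step above concrete, here is a minimal NumPy sketch of a pivoted incomplete Cholesky factorization that approximates a kernel matrix K by H @ H.T with a user-chosen low rank. The RBF kernel, the rank parameter, and the function names are illustrative assumptions; this is not the implementation of Wu et al. [2006].

```python
import numpy as np

def rbf_kernel(X, i, j, gamma=0.5):
    """Single RBF kernel value k(x_i, x_j); the kernel choice is an assumption."""
    d = X[i] - X[j]
    return np.exp(-gamma * np.dot(d, d))

def incomplete_cholesky(X, rank, gamma=0.5):
    """Greedy pivoted incomplete Cholesky: returns H (n x rank) with K ~= H @ H.T,
    computing only the kernel entries it actually needs."""
    n = X.shape[0]
    H = np.zeros((n, rank))
    diag = np.array([rbf_kernel(X, i, i, gamma) for i in range(n)])  # residual diagonal
    for k in range(rank):
        i = int(np.argmax(diag))          # pivot: largest remaining diagonal residual
        if diag[i] <= 1e-12:              # residual is negligible, stop early
            return H[:, :k]
        H[i, k] = np.sqrt(diag[i])
        for j in range(n):
            if j != i:
                kji = rbf_kernel(X, j, i, gamma)
                H[j, k] = (kji - H[j, :k] @ H[i, :k]) / H[i, k]
        diag -= H[:, k] ** 2              # update diagonal of the residual K - H H^T
        diag[i] = 0.0
    return H
```

An IPM solver can then work with the thin factor H (n x rank) instead of the full n x n kernel matrix, which is where the memory savings claimed above come from.
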
Hybrid MPI-OpenMP Linear IPM
Problem type: Linear, binary classification
Parallelism type: Hybrid shared/distributed-memory parallelism
Main idea: Distributes data among parallel nodes using a block structure and applies factorization for numerical stability (a minimal sketch of the block distribution follows this entry)
Pros:
  • Higher classification accuracy than LibLinear
  • Removes a dense matrix from the computation by reformulating the problem
  • Good parallelizability
  • Faster than a pure MPI implementation
  • Better numerical stability than similar approaches
  • Efficient implementation of the computationally expensive operations
Cons:
  • Supports only linear classification
  • Requires loading all samples into memory
Reference: Woodsend et al. [2009]
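
As a rough illustration of the block distribution, the mpi4py sketch below gives each process one block of training rows and reduces only a small n_features x n_features matrix across nodes; the intra-node (OpenMP) parallelism would come from a multithreaded BLAS underneath the local products. All variable names and the reduced system itself are assumptions made for illustration, not the exact formulation of Woodsend et al. [2009].

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

n_local, n_features = 1000, 20                 # rows held by this process, feature count
rng = np.random.default_rng(rank)
X_local = rng.standard_normal((n_local, n_features))  # this process's block of rows
scale_local = rng.uniform(0.1, 1.0, n_local)          # per-sample IPM scaling (assumed)

# Local contribution to the small reduced system; no n x n matrix is ever formed.
M_local = X_local.T @ (X_local * scale_local[:, None])
M = np.empty_like(M_local)
comm.Allreduce(M_local, M, op=MPI.SUM)          # combine the block contributions

if rank == 0:
    # Each IPM iteration then only needs a small dense solve on one node.
    step = np.linalg.solve(M + np.eye(n_features), np.ones(n_features))
```

Run with, e.g., `mpiexec -n 4 python sketch.py`; only the small reduced matrix crosses the network, never the raw sample blocks.
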
PSVM
Problem type: Non-linear, binary classification
Parallelism type: Distributed HPC architectures
Main idea: Reduces the problem size using matrix factorization and partitions the problem into row-based data blocks (a sketch of the reduced-system solve follows this entry)
Pros:
  • Reduces memory requirements
  • Higher speedup than similar approaches
  • Good scalability in the number of cores
  • Reduces the problem size
  • Efficient computation of the matrix inversion
  • Only a small fraction of each iteration is sequential
Cons:
  • May suffer from synchronization and communication overheads
  • Reducing the problem size hurts classification accuracy
  • The suggested reduced problem size is not optimal in many cases
  • Sub-linear speedup due to the sequential steps
Reference: Chang et al. [2009]
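
In this family of methods, the "efficient computation of matrix inversion" is usually obtained by pairing the low-rank factor with the Sherman-Morrison-Woodbury identity. The sketch below applies that identity to a system of the form (D + H H^T) x = b with diagonal D and a thin H; the function name and the test sizes are illustrative, and the exact per-iteration system in Chang et al. [2009] differs in its details.

```python
import numpy as np

def smw_solve(D_diag, H, b):
    """Solve (D + H H^T) x = b with diagonal D and a thin H (n x p, p << n) via the
    Sherman-Morrison-Woodbury identity, so only a p x p system is ever factorized."""
    Dinv_b = b / D_diag                        # D^{-1} b
    Dinv_H = H / D_diag[:, None]               # D^{-1} H
    small = np.eye(H.shape[1]) + H.T @ Dinv_H  # I + H^T D^{-1} H   (p x p)
    return Dinv_b - Dinv_H @ np.linalg.solve(small, H.T @ Dinv_b)

# Quick check on random data; p is chosen near sqrt(n), in the spirit of the reduced
# problem size discussed in the entry above.
n, p = 500, 22
rng = np.random.default_rng(0)
H = rng.standard_normal((n, p))
D_diag = rng.uniform(1.0, 2.0, n)
b = rng.standard_normal(n)
x = smw_solve(D_diag, H, b)
assert np.allclose((np.diag(D_diag) + H @ H.T) @ x, b)
```
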
SVM-PCG
Problem type: Linear, binary classification
Parallelism type: Distributed HPC architectures
Main idea: Uses a preconditioned linear solver to compute each iteration (a minimal PCG sketch follows this entry)
Pros:
  • Performance competitive with SVMLight
  • Higher speedup than other similar approaches
  • Reduces the problem size
  • Efficient matrix inversion and matrix operations
  • Open access
Cons:
  • No support for non-linear kernels
  • Contains sequential operations that could be parallelized
  • Benefits are unclear for large-scale problems with many features
Reference: Gertz et al. [2010]
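
The preconditioned conjugate gradient (PCG) skeleton below shows how each inner linear system can be solved with matrix-vector products only, so the normal matrix never has to be formed explicitly. The example system and the Jacobi (diagonal) preconditioner are placeholders chosen for illustration and are not taken from Gertz et al. [2010].

```python
import numpy as np

def pcg(matvec, b, precond, tol=1e-8, max_iter=200):
    """Preconditioned conjugate gradient for M x = b, where M is supplied only
    through the callable matvec and the preconditioner through precond."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Mp = matvec(p)
        alpha = rz / (p @ Mp)
        x += alpha * p
        r -= alpha * Mp
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Example: a linear-SVM-like normal matrix M = X^T D X + lam * I applied implicitly.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 50))
d = rng.uniform(0.5, 1.5, 2000)
lam = 1.0

def matvec(v):                                   # two thin products; M is never built
    return X.T @ (d * (X @ v)) + lam * v

diag_M = np.einsum('ij,i,ij->j', X, d, X) + lam  # diagonal of M -> Jacobi preconditioner
x = pcg(matvec, X.T @ np.ones(2000), lambda r: r / diag_M)
```
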
GPU-cluster SVM
Problem type: Non-linear, binary classification
Parallelism type: GPU parallelism
Main idea: Accelerates cross-validation by performing multiple tasks in parallel (a sketch of the task-level parallelism follows this entry)
Pros:
  • Higher speedup (90x) than LIBSVM on a single node
  • Higher speedup (3x) than CPU implementations
  • Scales well with the number of training samples
  • Efficient memory access
  • Efficient data transfer between host and device
  • Efficient parameter tuning via cross-validation
  • Avoids re-computation of computationally expensive tasks
Cons:
  • Does not take full advantage of GPU features such as pinned memory and unified virtual addressing
  • May suffer from limited memory
Reference: Li et al. [2013]
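
The cross-validation acceleration is essentially task-level parallelism over independent (C, gamma, fold) combinations. The sketch below uses a CPU process pool as a stand-in for the GPU cluster, and `train_and_score` is a hypothetical placeholder for the per-task SVM training; in Li et al. [2013] each task is mapped onto GPU hardware and computationally expensive results are reused rather than recomputed.

```python
from itertools import product
from multiprocessing import Pool

def train_and_score(C, gamma, fold):
    """Hypothetical placeholder: train an SVM with (C, gamma) on all folds except
    `fold` and return its validation accuracy (a dummy value here)."""
    return (C, gamma, fold, 0.0)

if __name__ == "__main__":
    Cs = [0.1, 1.0, 10.0]
    gammas = [0.01, 0.1]
    folds = range(5)
    tasks = list(product(Cs, gammas, folds))     # every task is independent
    with Pool() as pool:                         # stand-in for the GPU cluster nodes
        results = pool.starmap(train_and_score, tasks)
    best = max(results, key=lambda r: r[3])      # pick the best (C, gamma) setting
```
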
HPSVM
Problem type: Non-linear, binary classification
Parallelism type: GPU parallelism
Main idea: Implements a heterogeneous parallel SVM and accelerates the computationally expensive matrix operations with an approximate factorization (a sketch of the dual-buffer overlap follows this entry)
Pros:
  • Low memory requirements
  • Fast convergence
  • Good parallelizability
  • Takes full advantage of CPU-GPU parallelism via a dual-buffer, three-stage pipeline
  • Higher speedup (11x) than the CPU implementation
  • Improves on Li et al. [2013] in data transfer and memory access
  • Efficient hierarchical memory access
  • Good scalability in the number of training samples
  • High training accuracy
Cons:
  • Contains some sequential operations
  • May suffer from limited memory
Reference: Li et al. [2016]
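
The dual-buffer pipeline is, at its core, double buffering: while one buffer is being consumed by the compute stage, the other is being refilled, so data preparation and transfer overlap with computation instead of alternating with it. The threading sketch below collapses HPSVM's three stages into two for brevity and substitutes plain CPU work for the host-to-device transfers and GPU kernels; every name in it is illustrative and nothing is taken from the HPSVM code.

```python
import queue
import threading
import numpy as np

empty, filled = queue.Queue(), queue.Queue()
for _ in range(2):                       # exactly two buffers circulate (dual buffer)
    empty.put(np.zeros(1024))

def producer(n_chunks):
    """Stage-1 stand-in: prepare and 'transfer' the next chunk into a free buffer."""
    rng = np.random.default_rng(0)
    for i in range(n_chunks):
        buf = empty.get()                # block until the consumer frees a buffer
        buf[:] = rng.standard_normal(1024)
        filled.put((i, buf))
    filled.put(None)                     # end-of-stream sentinel

def consumer():
    """Stage-2 stand-in: 'GPU' compute on whichever buffer is ready."""
    total = 0.0
    while (item := filled.get()) is not None:
        _, buf = item
        total += float(buf @ buf)        # pretend work on the filled buffer
        empty.put(buf)                   # recycle the buffer for the producer
    return total

t = threading.Thread(target=producer, args=(8,))
t.start()
result = consumer()                      # overlaps with the producer thread
t.join()
```
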