Electrical and Computer Engineering
EECE 5645: Parallel Processing for Data Analytics
Lecture - 4 credits
- Covers the fundamentals of parallel machine-learning algorithms, tailored specifically to learning tasks involving large data sets.
- Reviews methods for dealing with both large and high-dimensional data sets, emphasizing distributed implementations.
- In addition to the theory behind statistical data analysis, the course takes a hands-on approach, using Apache Spark as the development platform for parallel learning.
- Topics include Apache Spark fundamentals, multithreaded/cluster execution, resilient distributed data structures, map-reduce operations, key-value pairs, joins, convex optimization, gradient descent, linear regression, the Gauss-Markov theorem, ridge and lasso regularization, feature selection, cross-validation, the bias-variance trade-off, classification, logistic regression, ROC curves and AUC, matrix and tensor factorization, graph-parallel algorithms and sparsity, the perceptron algorithm, and deep neural networks.