About the Applied Machine Learning in Python course
This course introduces students to applied machine learning, focusing more on techniques and methods than on the statistics behind them. The course begins with a discussion of how machine learning differs from descriptive statistics, followed by a tutorial introduction to the scikit-learn toolkit. It then addresses data dimensionality and the task of clustering data, including how to evaluate the resulting clusters. Supervised approaches to building predictive models are described next, and students will learn to apply scikit-learn's predictive modeling methods while understanding the issues that affect how well a model generalizes to new data (e.g., overfitting, and cross-validation as a way to assess it). The course concludes with a discussion of more advanced techniques, such as building ensembles, and of the practical limitations of predictive models. By the end of the course, students will be able to distinguish between supervised (classification) and unsupervised (clustering) techniques, determine which technique to apply for a particular dataset and need, engineer features to meet that need, and write Python code to carry out the analysis.
This course should be taken after Introduction to Data Science with Python and Applied Plotting, Graphing, and Presenting Data with Python, and before Applied Text Analysis with Python and Applied Social Analytics with Python.
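
As a rough illustration of the supervised versus unsupervised distinction and of cross-validation mentioned in the description above, here is a minimal sketch using scikit-learn. The dataset (iris), the k-nearest-neighbors classifier, and the parameter values are assumptions chosen for brevity; they are not taken from the course materials.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Small built-in dataset: 150 samples, 4 features, 3 classes (illustrative choice).
X, y = load_iris(return_X_y=True)

# Supervised (classification): the labels y are used to fit the model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)

# Comparing accuracy on seen vs. unseen data is one simple check for overfitting.
train_acc = clf.score(X_train, y_train)
test_acc = clf.score(X_test, y_test)

# 5-fold cross-validation gives a less split-dependent estimate of generalization.
cv_scores = cross_val_score(clf, X, y, cv=5)

# Unsupervised (clustering): the labels y are never shown to the model.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = km.fit_predict(X)

print("train accuracy:", train_acc)
print("test accuracy:", test_acc)
print("mean CV accuracy:", cv_scores.mean())
print("distinct clusters found:", len(set(cluster_ids)))
```

The classifier is scored against known labels, whereas the cluster assignments have no inherent "correct" numbering, which is why evaluating clusters calls for different methods than evaluating a classifier.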