This course is designed for students who are new to data science. After an introduction to basic arithmetic, variables, and data structures in Python, students learn how to collect and extract data from real data sets. Data analysis skills using control flow and Python packages (e.g., NumPy, SciPy, and Pandas) are then introduced. To address the needs of big data processing, distributed computing frameworks (e.g., Spark) and Python visualization tools are discussed. Students may also apply basic learning algorithms from Python packages (e.g., scikit-learn) to extract knowledge from data.
apply Python language fundamentals, including basic syntax, variables, and control flow, to write their first program
apply functions and import packages to work with complex and/or large data sets
apply scientific packages (e.g., NumPy and SciPy) to perform useful computations
process text files using external packages (e.g., tabula)
apply data visualization tools to large data sets