Python, a multi-paradigm programming language, has become the language of choice among data scientists for data analysis, visualization, and machine learning.
In this video course, you will explore two of the most important Python packages used by data analysts. You will start by learning how to set up the right environment for data analysis with Python. Here, you’ll learn to install the right Python distribution, work with the Jupyter notebook, and set up a database. After that, you will dive into NumPy, Python’s powerful extension package for advanced mathematical functions. You will learn to create NumPy arrays and to apply different array methods and functions. Then, you will explore Python’s Pandas extension, where you will learn to subset your data and to map values using Pandas. You’ll also learn to manage your data sets by sorting and ranking them. Finally, you will learn to index and group your data for sophisticated data analysis and manipulation.
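As a rough preview of the Pandas operations named above (subsetting, mapping, sorting, ranking, and grouping), here is a minimal sketch. The table, its column names, and the region lookup are hypothetical and invented purely for illustration; they are not taken from the course material.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo", "Lima", "Pune"],
    "sales": [120, 90, 150, 60, 200],
})

# Subset: keep only the rows where sales exceed 100.
high = df[df["sales"] > 100]

# Map: derive a new column by looking up each city in a dictionary.
region_lookup = {"Oslo": "Europe", "Lima": "Americas", "Pune": "Asia"}
df["region"] = df["city"].map(region_lookup)

# Sort and rank: order rows by sales, largest first, then number them.
ranked = df.sort_values("sales", ascending=False)
ranked["rank"] = ranked["sales"].rank(ascending=False).astype(int)

# Group: total sales per city.
totals = df.groupby("city")["sales"].sum()

print(high)
print(ranked)
print(totals)
```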
NumPy. This forms the basis for everything else. The central object in NumPy is the NumPy array, on which you can perform a wide range of operations.
The key is that a NumPy array isn’t just a regular array of the kind you’d see in a language like Java or C++; it behaves like a mathematical object such as a vector or a matrix. The most important aspect of NumPy arrays is that they are optimized for speed. So we’re going to do a demo where I show you that a vectorized NumPy operation is faster than the equivalent loop over a Python list.
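A speed comparison of this kind might look like the sketch below. It is not the course’s actual demo: the array size, the choice of a sum of squares as the workload, and the function names are all assumptions made for illustration. Exact timings depend on the machine, so none are quoted here.

```python
import timeit

import numpy as np

n = 100_000  # assumed problem size, chosen only for illustration
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)


def sum_squares_list(values):
    """Sum of squares with an explicit Python loop over a list."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_numpy(values):
    """Sum of squares as a single vectorized dot product."""
    return int(np.dot(values, values))


# Both approaches must agree before the timings mean anything.
list_result = sum_squares_list(py_list)
numpy_result = sum_squares_numpy(np_array)

# Time five runs of each approach.
list_seconds = timeit.timeit(lambda: sum_squares_list(py_list), number=5)
numpy_seconds = timeit.timeit(lambda: sum_squares_numpy(np_array), number=5)

print(f"Python list loop: {list_seconds:.4f} s")
print(f"NumPy vectorized: {numpy_seconds:.4f} s")
```

The vectorized version pushes the loop into NumPy’s compiled code, which is where the speed difference typically comes from.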
Pandas. Pandas is great because it does a lot of things under the hood, which makes your life easier because you don’t need to code those things manually. Pandas makes working with datasets a lot like working in R, if you’re familiar with R. The central object in both R and Pandas is the data frame (called DataFrame in Pandas). We’ll look at how much easier it is to load a dataset using Pandas than to do it manually.
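To make the loading comparison concrete, here is a minimal sketch that reads the same small CSV twice: once by hand with the standard library, and once with Pandas. The CSV contents and column names are hypothetical, and an in-memory string stands in for a file on disk.

```python
import csv
import io

import pandas as pd

# Hypothetical CSV contents, invented for illustration only.
CSV_TEXT = """name,age,score
Ann,34,88.5
Bob,29,92.0
Cy,41,79.5
"""


def load_manually(text):
    """Parse the CSV by hand, converting each column to its type."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": row["name"],
            "age": int(row["age"]),
            "score": float(row["score"]),
        })
    return rows


# Manual route: we write the parsing and the type conversion ourselves.
manual_rows = load_manually(CSV_TEXT)
manual_mean = sum(r["score"] for r in manual_rows) / len(manual_rows)

# Pandas route: one call parses the text and infers the column types.
df = pd.read_csv(io.StringIO(CSV_TEXT))
pandas_mean = df["score"].mean()

print(manual_mean, pandas_mean)
print(df.dtypes)
```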
I like to think of SciPy as an add-on library to NumPy. Whereas NumPy provides basic building blocks, like vectors, matrices, and operations on them, SciPy uses those general building blocks to do specific things. For example, SciPy can do many common statistics calculations, including getting the PDF value, getting the CDF value, sampling from a distribution, and statistical testing.
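The four statistics tasks just listed can be sketched with scipy.stats as follows. The standard normal distribution, the sample size, the seed, and the one-sample t-test are assumptions chosen for illustration; the course may use different distributions or tests.

```python
import numpy as np
from scipy import stats

# PDF and CDF of the standard normal distribution, evaluated at 0.
pdf_at_zero = stats.norm.pdf(0.0)
cdf_at_zero = stats.norm.cdf(0.0)

# Draw 1000 samples from the same distribution (fixed seed for repeatability).
samples = stats.norm.rvs(loc=0.0, scale=1.0, size=1000, random_state=0)

# Statistical test: is the sample mean consistent with a true mean of 0?
t_stat, p_value = stats.ttest_1samp(samples, popmean=0.0)

print(f"pdf(0) = {pdf_at_zero:.4f}, cdf(0) = {cdf_at_zero:.4f}")
print(f"sample mean = {np.mean(samples):.4f}")
print(f"t = {t_stat:.4f}, p = {p_value:.4f}")
```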
NumPy is the fundamental package for scientific computing with Python. It contains among other things:
- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
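Two of the features described above, broadcasting and user-defined data types, can be illustrated with a short sketch. The array values and the record fields (name, age, height) are hypothetical.

```python
import numpy as np

# An N-dimensional array: here, 2 rows by 3 columns.
grid = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]

# Broadcasting: the 1-D offsets are applied to every row of the 2-D grid
# without writing a loop or copying the offsets by hand.
offsets = np.array([10, 20, 30])
shifted = grid + offsets               # [[10, 21, 32], [13, 24, 35]]

# A structured data type lets one array hold records with named fields,
# much like rows in a database table.
record_type = np.dtype([("name", "U10"), ("age", "i4"), ("height", "f8")])
people = np.array([("Ann", 34, 1.70), ("Bob", 29, 1.82)], dtype=record_type)

# Fields are accessed by name and behave like ordinary arrays.
mean_age = people["age"].mean()        # 31.5

print(shifted)
print(people["name"], mean_age)
```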
At the end of this course, you will have a thorough understanding of NumPy’s features and when to use them. NumPy is mainly used in matrix computing. We’ll work through a number of examples specific to matrix computing, which will allow you to see the various scenarios in which NumPy is helpful. There are several numerical computing libraries available for Python, and it’s important to know when to choose one over another. Through rigorous exercises, you’ll experience where NumPy is powerful and develop an understanding of the scenarios in which NumPy is most useful.
- Ability to explain fully why NumPy should be used
- Ability to install NumPy
- Understanding of how to use NumPy
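As a small taste of the matrix computing mentioned above, the sketch below multiplies a matrix by itself, inverts it, and solves a linear system. The matrix and right-hand side are hypothetical values chosen so the results are easy to check by hand; this is not one of the course’s own exercises.

```python
import numpy as np

# Hypothetical system of equations:
#   2x +  y = 3
#    x + 3y = 5
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Solve A @ x = b directly; the solution is x = 0.8, y = 1.4.
x = np.linalg.solve(A, b)

# Matrix product of A with itself.
product = A @ A                # [[5, 5], [5, 10]]

# Inverse of A; multiplying it back by A gives the identity matrix.
inverse = np.linalg.inv(A)

print(x)
print(product)
print(A @ inverse)
```

For solving a system, np.linalg.solve is generally preferred over computing the inverse explicitly, since it is faster and numerically more stable.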