Python Pandas Series basics – Programming digest

Python Pandas:

pandas is a Python library that provides fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the basic high-level building block for doing practical, real-world data analysis in Python. The two primary data structures of pandas, Series (one-dimensional) and DataFrame (two-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other third-party libraries.

pandas is well suited for inserting and deleting columns in a DataFrame, handling missing data (represented as NaN), explicitly aligning data to a set of labels, converting data held in other Python and NumPy data structures into DataFrame objects, intelligent label-based slicing, indexing, and subsetting of large data sets, merging and joining data sets, and flexible reshaping. It also has robust input/output tools for loading data from CSV files, Excel files, databases, and other formats. You must import the pandas library before you can use the functions and data structures it defines.

By convention, pandas is imported under the alias pd.
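As a minimal sketch of that import convention (the sample values are made up for illustration):

```python
import pandas as pd

# Build a tiny Series just to confirm the pd alias is usable
s = pd.Series([1, 2, 3])
print(type(s))
```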

Python Pandas Series:

A Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating-point numbers, Python objects, etc.). The axis labels are collectively referred to as the index. A Series is created with the Series() constructor, whose basic syntax is:

s = pd.Series(data, index=None)

Here, s is the resulting Series, and data can be a Python dict, an ndarray, or a scalar value (like 5). The passed index is a list of axis labels. Both integer and label-based indexing are supported. If no index is passed and data is an ndarray, the index defaults to range(n), where n is the length of data.

Create Series from ndarrays

Import the NumPy and pandas libraries. Create an ndarray (NumPy’s array class) and pass it to Series(), which returns a pandas Series s. You can also specify axis labels for the index, e.g., index=['a', 'b', 'c', 'd', 'e']. When data is an ndarray, the index must be the same length as data. When the ndarray holds floating-point numbers, the dtype of the Series is float64. You can retrieve the index of a Series using the index attribute. The values attribute returns an ndarray containing only the values, without the axis labels. If no index labels are passed, a default index covering [0, …, len(data) - 1] is created.
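The steps above can be sketched as follows. The array values are made up for illustration; the labels are the ones mentioned in the text.

```python
import numpy as np
import pandas as pd

# Hypothetical floating-point data
data = np.array([1.5, 2.5, 3.5, 4.5, 5.5])

# Series with explicit labels; the index length must match the data length
s = pd.Series(data, index=['a', 'b', 'c', 'd', 'e'])
print(s.dtype)    # float64, because the ndarray holds floats
print(s.index)    # the axis labels
print(s.values)   # the underlying ndarray, without labels

# Without an index, a default integer index 0..len(data)-1 is created
t = pd.Series(data)
print(t.index)
```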

Python Pandas Create Series from Dictionaries

A Series can also be created from a dictionary. Create a dictionary and pass it to Series(). By default, the dictionary keys become the index labels. If you also pass an index, the values corresponding to those labels are pulled out of the dictionary, and the order of the labels in the index is preserved. If a label has no matching key in the dictionary, its value is NaN. NaN (not a number) is the standard missing data marker used in pandas.
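A short sketch of both cases, using a hypothetical dictionary:

```python
import pandas as pd

# Hypothetical data: keys become index labels by default
d = {'a': 0.0, 'b': 1.0, 'c': 2.0}
s = pd.Series(d)
print(s)

# With an explicit index, values are looked up by label in the given order;
# 'd' is not a key in the dictionary, so its value is NaN
t = pd.Series(d, index=['b', 'c', 'd', 'a'])
print(t)
```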

Create Series from Scalar data

You can create a pandas Series from a scalar value, for example the value 5. When data is a scalar, pass an index as well: the value is repeated to match the length of the index.
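For instance, a minimal sketch with the scalar 5 and an illustrative five-label index:

```python
import pandas as pd

# The scalar 5 is repeated once per index label
s = pd.Series(5, index=['a', 'b', 'c', 'd', 'e'])
print(s)
```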

Python Pandas Series Indexing and Slicing

You can index or slice a pandas Series by integer position. You can also index a Series with a Boolean array. Multiple indices are specified as a list. An index can be an integer value or a label. When you index by label, the values associated with those labels are extracted. You can check for the presence of a label in a Series using the in operator.
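The original listing for this section is not available, so the following is only a sketch with made-up data. It uses iloc for positional access, which is an assumption about style: with a labeled index, recent pandas versions discourage plain integer keys such as s[0].

```python
import pandas as pd

# Hypothetical data for illustration
s = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])

first = s.iloc[0]            # single element by integer position
head = s.iloc[:3]            # slice by integer position
big = s[s > 25]              # Boolean array indexing
picked = s[['a', 'c', 'e']]  # multiple labels given as a list
by_label = s['b']            # single element by label

has_e = 'e' in s             # the in operator checks index labels
has_z = 'z' in s
print(first, by_label, has_e, has_z)
```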

Python Pandas Working with Text Data

A pandas Series supports a set of string processing methods that make it easy to operate on each element of the array. These methods are accessed through the str attribute, and they generally have the same names as the built-in Python string methods.

Commonly used string methods on a Series include lower(), upper(), len(), strip(), contains(), and replace().
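A brief sketch of the str accessor on a hypothetical Series of strings; each call is applied element by element:

```python
import pandas as pd

# Hypothetical text data with stray whitespace
s = pd.Series(['  Apple', 'banana ', 'Cherry'])

clean = s.str.strip()               # remove leading/trailing whitespace
lower = clean.str.lower()           # lowercase each element
lengths = clean.str.len()           # length of each string
has_an = clean.str.contains('an')   # Boolean Series: does each element contain 'an'?
print(lower)
print(lengths)
print(has_an)
```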
