Split Pandas DataFrame by Rows in Python

Pandas is a powerful, open-source Python library for data manipulation and analysis. It provides data structures and functions suited to tabular data, such as spreadsheets or SQL tables. Pandas is built on top of the NumPy library, so much of its functionality comes from NumPy. It is versatile and easy to use, and data scientists use it to work with structured data in Python, usually in conjunction with other data science libraries: data prepared with Pandas can be plotted with Matplotlib, analysed statistically with SciPy, and passed to machine learning algorithms in Scikit-learn.

The function of Pandas Library:
Visualization of data.

Pandas is conventionally imported under the alias pd. The alias is not required; it simply means less typing and cleaner code. There are two types of data structures provided by pandas:
Pandas Series:
A Pandas Series is a one-dimensional array that can hold data of any type (integer, string, float, Python objects, etc.). The axis labels are collectively called the index. A Series is like a single column in an Excel sheet. The labels need not be unique, but they must be of a hashable type. A Series can be created from lists, dictionaries, scalar values, etc. Let's see how to create a series in Pandas.

Output:
Pandas Series: Series([], dtype: float64)
Pandas Series:
0    p
1    a
2    n
3    d
4    a
dtype: object

Explanation: In this example, the pandas library is imported as pd and the NumPy library as np. An empty series is created with the Series constructor provided by pandas. A NumPy array of characters is then created with NumPy's array() function and passed to Series(), and the resulting series is printed.

Pandas DataFrame:
A DataFrame is like a table in which the values are stored in rows and columns. A DataFrame can be created by loading datasets from SQL databases, CSV files, or Excel files. It can also be created from lists, dictionaries, a list of dictionaries, etc. Let's see how to create a data frame in Pandas.

Output:
Empty DataFrame
Columns: []
Index: []
        0
0    Data
1   Frame
2      in
3  Pandas

Explanation: In this example, the pandas module is imported and an empty DataFrame is created with the DataFrame constructor. A list is then created and passed to the DataFrame constructor, and the resulting data frame is printed.

An import error often occurs when you try to import pandas. This happens when the pandas library is not installed or was installed incorrectly.
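The series and data frame creation described above can be sketched roughly as follows. This is a minimal reconstruction from the printed output, not the article's original listing; the variable names are illustrative, and the empty series is given an explicit dtype because the default dtype of an empty Series differs between pandas versions.

```python
import numpy as np
import pandas as pd

# An empty Series; the dtype is passed explicitly (assumption) so the
# result does not depend on the installed pandas version.
empty_series = pd.Series(dtype="float64")
print("Pandas Series:", empty_series)

# A Series built from a NumPy array of characters.
letters = np.array(["p", "a", "n", "d", "a"])
series = pd.Series(letters)
print("Pandas Series:")
print(series)

# An empty DataFrame, then a DataFrame built from a plain Python list.
empty_df = pd.DataFrame()
print(empty_df)

words = ["Data", "Frame", "in", "Pandas"]
df = pd.DataFrame(words)
print(df)
```

Because no column name is given, the single column of the data frame is labelled 0, and both objects receive the default integer index starting at 0.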
Let's see how to split a Pandas DataFrame. The examples start from the following data frame.

Output:
Create DataFrame:
   Courses    Fee  Discount Duration
0    Spark  22000      1000   35days
1  PySpark  25000      2300   35days
2   Hadoop  23000      1000   40days
3   Python  24000      1200  30 days
4   Pandas  26000      2500   25days

Explanation: In this example, the Pandas module is imported, and a data frame is created from a dictionary. The splits that follow are made with the iloc indexer provided by Pandas.

Split Pandas Dataframe by rows using iloc[]:
The iloc indexer provided by Pandas helps in splitting the dataframe by rows. iloc selects rows and columns by integer position.

Splitting Dataframe by Row:
A row slice returns a specific portion of the DataFrame based on row positions. Let's see how to split the data frame. In the output, the data frame is split into its first two rows and the remaining three rows.

Output:
Create DataFrame:
   Courses    Fee  Discount Duration
0    Spark  22000      1000   35days
1  PySpark  25000      2300   35days
2   Hadoop  23000      1000   40days
3   Python  24000      1200  30 days
4   Pandas  26000      2500   25days
=========================
   Courses    Fee  Discount Duration
0    Spark  22000      1000   35days
1  PySpark  25000      2300   35days
=========================
  Courses    Fee  Discount Duration
2  Hadoop  23000      1000   40days
3  Python  24000      1200  30 days
4  Pandas  26000      2500   25days

Explanation: In this example, the Pandas module is imported and a data frame is created from dictionary data. With the help of iloc, the data frame is split by rows.

Split Dataframe by Columns:
A data frame can also be split by columns with iloc, by keeping all rows and slicing the column positions. Let's see how to split the data frame by columns.
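The iloc-based splits described in this section can be sketched as follows, rows first and then columns. The listing is a reconstruction under assumptions: the dictionary name technologies and the other variable names are illustrative, the data is retyped from the printed output, and the Duration of the Python course is written as "30days".

```python
import pandas as pd

# Sample data retyped from the printed output (names are illustrative).
technologies = {
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Fee": [22000, 25000, 23000, 24000, 26000],
    "Discount": [1000, 2300, 1000, 1200, 2500],
    "Duration": ["35days", "35days", "40days", "30days", "25days"],
}
df = pd.DataFrame(technologies)
print("Create DataFrame:")
print(df)

# Split by rows: positions 0-1 go to the first part, 2-4 to the second.
first_rows = df.iloc[:2, :]
remaining_rows = df.iloc[2:, :]
print("=" * 25)
print(first_rows)
print("=" * 25)
print(remaining_rows)

# Split by columns: keep every row, slice the column positions instead.
left_cols = df.iloc[:, :2]
right_cols = df.iloc[:, 2:]
print("=" * 25)
print(left_cols)
print("=" * 25)
print(right_cols)
```

In iloc[rows, columns] a bare colon keeps everything along that axis, so the only difference between a row split and a column split is which side of the comma carries the slice. Each part keeps the original index labels, which is why the second row split starts at label 2.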
Output:
Create DataFrame:
   Courses    Fee  Discount Duration
0    Spark  22000      1000   35days
1  PySpark  25000      2300   35days
2   Hadoop  23000      1000   40days
3   Python  24000      1200   30days
4   Pandas  26000      2500   25days
=========================
   Courses    Fee
0    Spark  22000
1  PySpark  25000
2   Hadoop  23000
3   Python  24000
4   Pandas  26000
=====================
   Discount Duration
0      1000   35days
1      2300   35days
2      1000   40days
3      1200   30days
4      2500   25days
=====================

Explanation: In this example, the Pandas module is imported and a dataframe is created from a dictionary. All rows are kept, and the columns are split by position into the first two columns and the last two columns.

Conclusion:
Splitting a DataFrame by rows is a common step in data analysis. Pandas offers several ways to split a DataFrame; this article used position-based slicing with iloc, for both rows and columns.
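As one more illustration of splitting by rows with iloc, the sketch below cuts a data frame into consecutive chunks of a fixed number of rows. The helper split_rows and the chunk size of 2 are hypothetical and not part of the original article; the data is a shortened version of the course table.

```python
import pandas as pd


def split_rows(df, chunk_size):
    """Return consecutive row slices of df, each with at most chunk_size rows.

    Hypothetical helper for illustration; it relies only on iloc slicing.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        df.iloc[start:start + chunk_size]
        for start in range(0, len(df), chunk_size)
    ]


df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Fee": [22000, 25000, 23000, 24000, 26000],
})

chunks = split_rows(df, 2)
for chunk in chunks:
    print(chunk)
    print("=" * 25)
```

With five rows and a chunk size of 2, the last chunk holds the single leftover row. Concatenating the chunks with pd.concat gives back the original data frame.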