Unit 4: Programming in Python – Class 10 SEE Computer Science Notes

Importantedunotes.com

Chapter 4: File Handling using Pandas Library

Comprehensive study guide exploring Python file handling, reading and writing CSV files using Pandas, and fully solved textbook exercises.

Welcome to Chapter Four! In this section of our Python journey, we will interact directly with the files on your computer.

File handling empowers programs to read input datasets and save outputs permanently. We will explore native Python file operations, modes of access, and leverage the powerful Pandas library to effortlessly manage CSV files. Below, you will also find complete textbook exercise solutions for your Class 10 SEE preparation.

1. File Handling using Pandas

4.6.1 Concept of File Handling in Python

File handling involves interacting directly with the files on your computer’s hard drive to read data from them or write new data into them.

Benefits of File Handling in Python:

Versatility: You can perform a wide range of operations, such as creating, reading, writing, appending, renaming, and deleting files.

Flexibility: Python allows you to work seamlessly with different file types (e.g., standard text files, binary files, or CSVs).

User-friendly: Python provides a highly intuitive, built-in interface for file handling, making it easy to manipulate files with just a few lines of code.

Cross-platform: Python’s file-handling functions work the same way across operating systems (Windows, Mac, Linux), although file paths and line endings can differ between them.

Difficulties of File Handling in Python:

Error-prone: File operations can crash a program if errors are not handled (e.g., trying to open a file that doesn’t exist, or opening one you don’t have permission to access).

Security Risks: Letting user input decide which files a program opens, modifies, or runs can create serious security vulnerabilities, for example if a malicious file is executed.

Complexity: Working with advanced, highly structured file formats can get complicated and requires careful coding.

Performance: Python’s native file handling can sometimes be slower than lower-level languages (like C++) when dealing with exceptionally massive files.

4.6.2 Modes of File Handling

To do anything with a file in Python, the very first step is to open it. We use Python’s built-in open() function for this.

Syntax:

with open('Filename.txt', 'mode') as file:
    # Perform file operations here

Access Modes in Python

When opening a file, you must tell Python how you intend to use it. Here are the core modes:

Mode	Name	Behavior
`"r"`	Read-only	Allows you to read the file. You cannot change it. Raises an error if the file does not exist.
`"w"`	Write-only	Allows you to write data. Creates the file if it doesn’t exist. If the file already has data, it erases and overwrites everything!
`"a"`	Append-only	Allows you to add new data to the very end of the file. Creates the file if it doesn’t exist. It does not erase existing data.
`"x"`	Create	Creates a brand new, empty file. It throws an error if a file with that name already exists.
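The table above can be tried out with a small sketch like the one below. The file name demo.txt and the temporary folder are assumptions made only for this illustration, so that no real file on your computer is touched.

```python
import os
import tempfile

# Work inside a temporary folder so no real files are touched.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")  # hypothetical file name

# "x" creates a brand new file; it fails if the file already exists.
with open(path, "x") as file:
    file.write("first\n")

# "a" keeps the existing text and adds to the end.
with open(path, "a") as file:
    file.write("second\n")

with open(path, "r") as file:
    after_append = file.read()

# "w" erases the old text before writing the new text.
with open(path, "w") as file:
    file.write("replaced\n")

with open(path, "r") as file:
    after_write = file.read()

# Using "x" a second time raises FileExistsError because the file is there now.
try:
    open(path, "x")
    x_failed = False
except FileExistsError:
    x_failed = True

print(repr(after_append))
print(repr(after_write))
print("x mode failed on existing file:", x_failed)
```

After the append step the file holds both lines, but after the "w" step only the replacement text is left.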

1. Reading a File

To read a file, it must be opened in "r" mode. Python offers a few ways to read the content:

file.read(): Reads the entire content of the file all at once.

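file.read(10): Reads the next 10 characters, which are the first 10 characters when the file has just been opened.

file.readline(): Reads one line, starting with the first line; each further call returns the next line.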

file.readlines(): Reads all the lines and returns them as a list.
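As a rough illustration of the four reading methods, the sketch below first creates a small practice file (the name notes.txt and its three lines are made up for this example) and then reads it in different ways.

```python
import os
import tempfile

# Create a small practice file in a temporary folder (hypothetical name).
path = os.path.join(tempfile.mkdtemp(), "notes.txt")
with open(path, "w") as file:
    file.write("Line one\nLine two\nLine three\n")

# read(): the whole file as one string.
with open(path, "r") as file:
    everything = file.read()

# read(10): only the next 10 characters; reading then continues from there.
with open(path, "r") as file:
    first_ten = file.read(10)
    rest_of_line = file.readline()

# readline(): one line per call.
with open(path, "r") as file:
    first_line = file.readline()
    second_line = file.readline()

# readlines(): every line, returned as a list of strings.
with open(path, "r") as file:
    all_lines = file.readlines()

print(repr(first_ten))
print(repr(rest_of_line))
print(all_lines)
```

Notice that each line keeps its line break (\n) at the end, and that read(10) counts the line break as one character.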

2. Creating and Writing to a File

When you open a file in "w" mode, you can use file.write() to insert text. Remember, "w" will completely erase whatever was in the file previously!

file = open("Test.txt", "w")
file.write("Hello, World!")
file.close()

3. Appending to a File

If you want to add text without destroying the old data, use "a" (Append) mode.

write(): Adds a single string to the end of the file.

writelines(): Allows you to insert multiple strings (like a list of lines) in one go. It does not add line breaks for you, so each string should end with \n if you want separate lines.
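A minimal sketch of appending, assuming a made-up file called fruits.txt in a temporary folder. It starts the file in "w" mode and then adds to it in "a" mode using both write() and writelines().

```python
import os
import tempfile

# Hypothetical practice file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "fruits.txt")

# Start the file with one line.
with open(path, "w") as file:
    file.write("Apple\n")

# Append mode: the old line stays, new text goes to the end.
with open(path, "a") as file:
    file.write("Banana\n")                    # one string
    file.writelines(["Cherry\n", "Mango\n"])  # several strings at once

with open(path, "r") as file:
    lines = file.readlines()

print(lines)
```

The \n at the end of each string is typed by hand because neither write() nor writelines() adds a line break automatically.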

4. Closing a File

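Once you are done reading or writing, you must close the file using file.close(). Closing the file releases the file handle held by your program, makes sure any buffered data is actually written to disk, and frees up system resources. If you open the file using with open(...) as file:, Python closes it for you automatically when the block ends.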
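The sketch below compares closing a file by hand with letting a with block close it. It relies on the closed attribute that every Python file object has; the file name greeting.txt is an assumption for this example.

```python
import os
import tempfile

# Hypothetical practice file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "greeting.txt")

# Manual style: you are responsible for calling close().
file = open(path, "w")
file.write("Hello")
closed_before = file.closed   # still open here
file.close()
closed_after = file.closed    # closed now

# with style: Python closes the file when the block ends.
with open(path, "r") as file:
    text = file.read()
    closed_inside = file.closed
closed_outside = file.closed

print(closed_before, closed_after)
print(closed_inside, closed_outside)
print(text)
```

The with style is usually the safer habit, because the file is closed even if an error happens inside the block.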

4.6.3 Reading and Writing CSV Files (Using Pandas)

CSV stands for Comma Separated Values. It is one of the most common formats for exporting spreadsheets and databases. A CSV is simply a plain text file in which each line is one row of the table and the values within a row are separated by commas.
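To see what this means in practice, here is a small sketch using Python’s built-in csv module. The text in csv_text is made-up sample data that stands in for the contents of a CSV file.

```python
import csv
import io

# Made-up sample data: this is exactly what a CSV file looks like inside.
csv_text = "Name,Age,Grade\nAsha,15,A\nBibek,16,B\n"

# io.StringIO lets us treat the string like an open file.
rows = list(csv.reader(io.StringIO(csv_text)))

header = rows[0]    # the first line holds the column names
records = rows[1:]  # every other line is one row of data

print(header)
print(records)
```

Every value comes back as a string (even the ages), which is one reason a library such as Pandas is convenient for real analysis.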

Why are CSVs so popular?

Portability: Being plain text, they can be opened in virtually any program (Notepad, Excel, Google Sheets).

Simplicity: Their structure is incredibly straightforward, making them easy to generate.

Wide Support: Almost every programming language can read and write CSVs, either built in or through a widely available library.

Using the Pandas Library

While Python has a built-in csv module, Pandas makes importing and analyzing data much easier. Pandas builds on other packages (such as NumPy) to give us a single, convenient toolkit for data analysis.

To use Pandas, you must first install it using your terminal: $ pip install pandas

1. Reading a CSV with Pandas

We use the pd.read_csv() function to instantly load a CSV file into a powerful table structure called a DataFrame.

import pandas as pd

# Load the file into a DataFrame
data = pd.read_csv("Salary_Data.csv", delimiter=',')

# Display the contents
print(data)

Helpful DataFrame Commands:

data.columns: Extracts and displays just the header/field names of the table.

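data.Salary: Extracts and displays all the rows for the specific “Salary” column. This dot form only works when the column name has no spaces or special characters; data['Salary'] works for any column name.

data.info(): Shows a summary of the DataFrame (column names, data types, and the number of non-missing values in each column).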

data.shape: Tells you the exact size of the table (rows, columns).
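The commands above can be tried on a tiny table. In this sketch the names and salaries are invented, and io.StringIO is used in place of a real Salary_Data.csv file so the example runs on its own.

```python
import io
import pandas as pd

# Invented sample data standing in for the contents of a CSV file.
csv_text = "Name,Salary\nAsha,50000\nBibek,42000\nChandra,61000\n"
data = pd.read_csv(io.StringIO(csv_text))

column_names = list(data.columns)  # header names
salaries = data["Salary"]          # same column as data.Salary
table_size = data.shape            # (rows, columns)

print(column_names)
print(salaries)
print(table_size)
data.info()                        # prints its summary directly
```

Here data.shape is (3, 2): three rows of data and two columns. The header line is not counted as a row.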

2. Writing to a CSV with Pandas

If you have data inside Python and want to save it as a brand new CSV file, you use the to_csv() method.

import pandas as pd

# 1. Create a dictionary of data
data = {
    'Product': ['Laptop', 'Smartphone', 'Tablet'],
    'Price': [75000, 15000, 20000]
}

# 2. Convert it into a Pandas DataFrame
df = pd.DataFrame(data)

# 3. Save it as a CSV file (index=False prevents Pandas from writing row numbers)
df.to_csv('products.csv', sep=',', index=False)
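To see what index=False changes, the sketch below calls to_csv() without a file name. In that case Pandas returns the CSV text as a string instead of saving a file, which makes the two versions easy to compare.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['Laptop', 'Smartphone', 'Tablet'],
    'Price': [75000, 15000, 20000]
})

# No file name given, so to_csv() returns the CSV text as a string.
with_index = df.to_csv()                # row numbers are included
without_index = df.to_csv(index=False)  # row numbers are left out

print(with_index)
print(without_index)
```

With the index included, every line starts with an extra value (0, 1, 2) and the header line starts with an empty name before the first comma.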

Exercise 1: Choose the correct answer.


1. Which Python library is particularly useful for simplifying file handling of structured data like CSV files?

a) math

b) random

c) pandas

d) turtle

Correct Answer: c) pandas
Justification: Pandas provides powerful, high-level data structures like DataFrames that make reading, writing, and analyzing structured CSV files incredibly simple.

2. What is the primary function used in Pandas to read data from a CSV file?

a) open()

b) read_file()

c) pd.read_csv()

d) df.read_csv()

Correct Answer: c) pd.read_csv()
Justification: read_csv() is the standard Pandas function utilized to load a CSV file directly into a DataFrame.

3. When reading a CSV file with Pandas, which parameter in read_csv() is used to specify the separator between values?

a) separator

b) sep

c) delimiter

d) value_sep

Correct Answer: c) delimiter
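Justification: The delimiter parameter tells Pandas exactly which character separates the columns of data in the text file. In read_csv(), sep is the main name for this setting and delimiter is an alias for it, so both are accepted; these notes use delimiter in their examples.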
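A short sketch of the separator setting, using made-up data in which the values are separated by semicolons instead of commas. It reads the same text with sep, with delimiter, and with neither.

```python
import io
import pandas as pd

# Made-up data separated by semicolons.
text = "Name;Age\nAsha;15\nBibek;16\n"

with_sep = pd.read_csv(io.StringIO(text), sep=";")
with_delimiter = pd.read_csv(io.StringIO(text), delimiter=";")

# Without the setting, Pandas looks for commas and finds only one column.
without_setting = pd.read_csv(io.StringIO(text))

print(with_sep.equals(with_delimiter))
print(with_sep.shape)
print(without_setting.shape)
```

Both spellings give the same two-column table, while the default comma setting squeezes each line into a single column.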

4. What attribute of a Pandas DataFrame can be used to obtain the header or field names after reading a CSV file?

a) headers

b) columns

c) fields

d) names

Correct Answer: b) columns
Justification: Calling data.columns on a DataFrame returns an index of all the header/field names present in the table.

5. Which Pandas method is used to write a DataFrame to a CSV file?

a) write_csv()

b) to_file()

c) df.to_csv()

d) pd.write_csv()

Correct Answer: c) df.to_csv()
Justification: The to_csv() method is called directly on an existing DataFrame object (like df) to export its contents into a standard CSV file.

6. When writing a DataFrame to a CSV file using to_csv(), what does the parameter index=False do?

a) It includes the index as a column in the CSV.

b) It removes the header row from the CSV.

c) It excludes the index column from the CSV.

d) It sorts the data based on the index.

Correct Answer: c) It excludes the index column from the CSV.
Justification: By default, Pandas will write the row numbers (0, 1, 2…) into the file. Setting index=False prevents these unnecessary index numbers from being saved.

7. What is the default separator used by Pandas when reading or writing CSV files?

a) semicolon (;)

b) tab (\t)

c) comma (,)

d) space ( )

Correct Answer: c) comma (,)
Justification: CSV literally stands for “Comma Separated Values,” making the comma the universal default delimiter.

8. Which mode should be used with the built-in open() function if you want to read a file?

a) “w”

b) “a”

c) “r”

d) “x”

Correct Answer: c) “r”
Justification: The "r" mode stands for Read-only. It allows a program to view file contents without risking accidental modifications.

9. Which mode, when used with the built-in open() function, will overwrite the file if it exists or create a new file if it doesn’t?

a) “r”

b) “a”

c) “w”

d) “x”

Correct Answer: c) “w”
Justification: The "w" (Write) mode completely erases existing data upon opening and allows you to write entirely fresh content.

10. Which Pandas function is used to create a DataFrame from a dictionary or a list of lists?

a) read_csv()

b) to_csv()

c) pd.DataFrame()

d) create_df()

Correct Answer: c) pd.DataFrame()
Justification: pd.DataFrame() is the constructor that builds a tabular DataFrame object in memory out of standard Python data structures like lists or dictionaries.

Exercise 2: Write short answers to these questions.

1. What are the primary advantages of using the Pandas library for file handling in Python? 2 Marks

Pandas makes manipulating, analyzing, and managing large datasets much easier. It hides complex code behind DataFrames (table-like structures) and provides convenient, one-line methods to read and write different data sources such as CSV files, Excel files, and SQL databases.

2. Explain the concept of a CSV file and why it is a common format for data exchange. 2 Marks

A CSV (Comma Separated Values) file is a simple, plain-text file where each piece of data is separated by a comma. It is universally common for data exchange because of its portability (it can be opened by any text editor or spreadsheet program like Excel) and wide support across all programming languages.

3. What is the first step you need to take to use the Pandas library in your Python script for file handling? 2 Marks

You must first import the library into your script by writing the command import pandas as pd at the very top of your file. (Note: If it is not installed on your computer, you must first run pip install pandas in your terminal).

4. Describe the basic syntax for reading a CSV file into a Pandas DataFrame. 2 Marks

The basic syntax utilizes the read_csv() function. You assign the result to a variable to hold the DataFrame: data = pd.read_csv('filename.csv', delimiter=',')

5. How can you access a specific column of data after reading a CSV file into a Pandas DataFrame? Provide an example. 2 Marks

You can access a specific column by appending a dot and the exact column name to your DataFrame variable, or by putting the column name in square brackets.
Example: If your DataFrame is named data and you want the Salary column, you type: data.Salary or data['Salary'].

6. Explain the basic syntax for writing a Pandas DataFrame to a CSV file. 2 Marks

You apply the to_csv() method directly to your DataFrame object. You specify the desired filename and typically set index=False to avoid writing row numbers.
Syntax: df.to_csv('filename.csv', sep=',', index=False)

7. What happens if you try to open a non-existent file in read (“r”) mode using the built-in open() function? 2 Marks

Python will immediately raise a FileNotFoundError, because read mode requires the file to already exist on your system before it can be opened. Unless the program handles this error, it stops running (crashes) at that line.
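One common way to keep the program from crashing is a try/except block. This is only a sketch: the file name no_such_file.txt and the message text are assumptions for the example.

```python
import os
import tempfile

# A path inside a fresh, empty temporary folder, so the file cannot exist.
missing_path = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

try:
    with open(missing_path, "r") as file:
        content = file.read()
except FileNotFoundError:
    # The program lands here instead of crashing.
    content = None
    print("File not found. Please check the file name.")

print(content)
```

The program prints a friendly message and carries on, instead of stopping with an error.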

8. Explain the difference between the write (“w”) mode and the append (“a”) mode when opening a file. 2 Marks

While both modes create a new file if one doesn’t exist, they behave entirely differently with existing files. Write ("w") mode erases all existing content as soon as the file is opened, so only the new data remains. Append ("a") mode preserves existing content and simply adds new data to the very end of the file.

9. Why is it important to close a file after you are done with read or write operations (even when using Pandas)? 2 Marks

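Closing a file using the close() method releases the file handle, frees up system resources, and ensures that any data waiting in the memory buffer is completely written to the hard disk. When you pass a file name to pd.read_csv() or df.to_csv(), Pandas opens and closes the file for you; you only need to close a file yourself if you opened it with open().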

Exercise 3: Long Answer Questions.

1. Write the steps to read data from a CSV file using Pandas in Python. Also, give a simple code to: Import the Pandas library, and Read the CSV file. 4 Marks

To read data from a CSV file using Pandas, you must follow these steps:

Step 1: Ensure Pandas is installed on your machine (pip install pandas).

Step 2: Open your Python script and import the Pandas library, usually utilizing the standard pd alias.

Step 3: Use the pd.read_csv() function, passing the target file name (and its file path if it isn’t in the same folder) as a string argument.

Step 4: Store the result inside a variable so you can interact with the generated DataFrame.

Simple Code:

# 1. Import the Pandas library
import pandas as pd

# 2. Read the CSV file and store it in a DataFrame
df = pd.read_csv("employee_data.csv")

# Optional: Print to verify it worked
print(df)

2. Imagine you have a CSV file named “student_data.csv” with columns “Name”, “Age”, and “Grade”. Write a Python program using the Pandas library to:

a. Read the data from the “student_data.csv” file into a DataFrame.
b. Print the first 5 rows of the DataFrame.
c. Calculate the average age of the students.
d. Create a new DataFrame containing only the students with a “Grade” of “A”. 4 Marks

import pandas as pd

# a. Read the data into a DataFrame
df = pd.read_csv("student_data.csv")

# b. Print the first 5 rows of the DataFrame
# (The .head() method returns the first 5 rows by default)
print("--- First 5 Rows ---")
print(df.head())

# c. Calculate the average age of the students
# (We access the 'Age' column and use the .mean() method)
average_age = df['Age'].mean()
print("\nAverage Age of Students:", average_age)

# d. Create a new DataFrame containing only students with an "A" Grade
# (We filter the DataFrame using a logical condition)
grade_a_students = df[df['Grade'] == 'A']

print("\n--- Students with Grade A ---")
print(grade_a_students)
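If you do not have a student_data.csv file yet, the same steps can be practised with the sketch below. The six students are invented sample data, and io.StringIO stands in for the real file so the program runs on its own.

```python
import io
import pandas as pd

# Invented sample data standing in for student_data.csv.
csv_text = (
    "Name,Age,Grade\n"
    "Asha,15,A\n"
    "Bibek,16,B\n"
    "Chandra,15,A\n"
    "Dipak,17,C\n"
    "Elina,16,A\n"
    "Firoj,15,B\n"
)

# a. Read the data into a DataFrame
df = pd.read_csv(io.StringIO(csv_text))

# b. First 5 rows (the sixth student is left out)
first_five = df.head()
print(first_five)

# c. Average age
average_age = df["Age"].mean()
print("Average Age of Students:", average_age)

# d. Only the students whose Grade is "A"
grade_a_students = df[df["Grade"] == "A"]
print(grade_a_students)
```

With this sample data, head() shows five of the six students and the filter keeps Asha, Chandra, and Elina.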