Unit 4: Programming in Python – Class 10 SEE Computer Science Notes

Importantedunotes.com

Chapter Five: Introduction to Data Visualization & Practical Programs

Comprehensive study guide exploring data visualization (Matplotlib, Seaborn, Plotly) and a full catalogue of 18 practical Python programs.

1. Introduction to Data Visualization

5.1 What is Data Visualization?

Data visualization means showing data using visual elements like charts, graphs, or maps to understand it better. Instead of reading through endless rows of numbers, a visual graph helps our brains instantly see patterns, trends, and connections in a simple and intuitive way.

Importance of Data Visualization

Simplifies Complex Data: It helps people understand massive amounts of data easily by turning abstract numbers into visual shapes.

Reveals Insights: It makes it much easier to spot trends (like rising sales), patterns (like seasonal weather changes), and entirely new ideas.

Improves Decision Making: Today, businesses, doctors, and engineers use data to make crucial decisions. Since a huge amount of data is created every single day, visualization helps professionals make sense of it quickly and share their ideas clearly with others.

5.2 Popular Python Libraries for Visualization

Python is famous for its data science capabilities, and it has many useful libraries designed specifically to make charts. Each library has its own unique features:

Matplotlib: One of the oldest and most widely used Python plotting libraries. It is well suited to creating simple, static graphs like bar charts and line graphs.

Seaborn: Built directly on top of Matplotlib, Seaborn makes creating beautiful, colorful, and complex statistical charts incredibly easy.

Plotly: A modern library used to create highly interactive and dynamic graphs (where you can zoom, hover, and click on the data).

5.3 Deep Dive: Matplotlib

Matplotlib is the foundational Python plotting library. It uses a “low-level” interface, which means it offers you a massive amount of freedom and customization, but it might require you to write a bit more code compared to newer libraries.

Installation: You can install it using your terminal: pip install matplotlib or conda install matplotlib.

Importing: It is typically imported using a standard shortcut: import matplotlib.pyplot as plt

1. Scatter Plot

Scatter plots use individual dots to represent values for two different numeric variables. They are perfect for observing relationships or correlations between data points. We use the plt.scatter() method.

import pandas as pd
import matplotlib.pyplot as plt 

# Reading the dataset
dataset = pd.read_csv("Stu_data.csv")

# Plotting the scatter chart
plt.scatter(dataset['Name'], dataset['Marks']) 
plt.title("Scatter Plot") 
plt.xlabel('Name') 
plt.ylabel('Marks') 
plt.show()
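Since scatter plots are meant for two numeric variables, a version with numbers on both axes may show the idea more clearly. The following is a minimal sketch using made-up data: the hours_studied and marks lists and the output file name are assumptions for illustration and are not taken from Stu_data.csv. It saves the chart as an image file instead of opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # draw to a file, so no window or screen is needed
import matplotlib.pyplot as plt

# Hypothetical sample data: both variables are numeric
hours_studied = [1, 2, 3, 4, 5, 6]
marks = [35, 42, 50, 61, 70, 78]

fig, ax = plt.subplots()
points = ax.scatter(hours_studied, marks)  # one dot per (hours, marks) pair
ax.set_title("Hours Studied vs Marks")
ax.set_xlabel("Hours studied")
ax.set_ylabel("Marks")
fig.savefig("hours_vs_marks.png")  # writes the chart as a PNG image
```

If the dots rise from left to right, as they do with this sample data, the two variables tend to increase together.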

2. Bar Chart

A bar chart represents data categories using rectangular bars. The height or length of the bar directly corresponds to the data value it represents. We use the plt.bar() method.

import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("tips.csv")
# First argument: categories on the x-axis, second argument: bar heights
plt.bar(data['day'], data['tip'])
plt.title("Bar Chart")
plt.xlabel('Day')
plt.ylabel('Tip')
plt.show()
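When a dataset has many rows for the same category (for example, many bills on the same day), it often helps to summarise the data first so that each category gets exactly one bar. The sketch below is one possible way to do this with Pandas groupby(); the small inline table is hypothetical sample data in the style of a tips file, and the output file name is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # draw to a file, so no window or screen is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows (not the real tips.csv)
data = pd.DataFrame({
    "day": ["Sat", "Sat", "Sun", "Sun", "Thur"],
    "tip": [2.0, 4.0, 3.0, 5.0, 1.5],
})

# Average tip for each day: one value per category
avg_tip = data.groupby("day")["tip"].mean()

fig, ax = plt.subplots()
bars = ax.bar(list(avg_tip.index), list(avg_tip.values))
ax.set_title("Average Tip by Day")
ax.set_xlabel("Day")
ax.set_ylabel("Average tip")
fig.savefig("avg_tip_by_day.png")
```

Here the bar height is the average tip for the day, so the chart compares days rather than individual bills.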

5.4 Deep Dive: Seaborn

Seaborn is a library designed to make beautiful, highly informative charts with minimal effort. It is built on top of Matplotlib and works seamlessly with Pandas DataFrames. It is especially useful for showing complex patterns through line plots, bar plots, and heatmaps.

Installation: pip install seaborn

Line Plot

To plot a line graph in Seaborn, we use the lineplot() method.

import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt 

dataset = pd.read_csv("Stu_data.csv") 

# Creating the line plot
sn.lineplot(x='Name', y='Marks', data=dataset) 
plt.show()

5.5 Deep Dive: Plotly

Plotly is an open-source Python library used to make interactive charts. Unlike static images, Plotly graphs let you click, zoom, hover over data points for exact numbers, and edit charts directly in your web browser or Jupyter notebook. It is perfect for 3D charts, scientific graphs, and financial dashboards.

Installation: pip install plotly

1. Interactive Scatter Plot

Plotly Express makes it easy to generate interactive scatter plots using plotly.express.scatter().

import pandas as pd
import plotly.express as px

dataset = pd.read_csv("tips.csv") 
# color='smoker' automatically color-codes the dots based on that category
graph = px.scatter(dataset, x="total_bill", y="size", color='smoker') 
graph.show()

2. Interactive Line Chart

Line plots in Plotly are highly accessible and easy to style. We use px.line().

import plotly.express as px
import pandas as pd

data = pd.read_csv("Stu_data.csv")
fig = px.line(data, x='Name', y='Marks', color='Gender')
fig.show()

3. Interactive Bar Chart

With Plotly Express, you can create interactive bar charts effortlessly using the px.bar() method.

import plotly.express as px
import pandas as pd

data = pd.read_csv("Stu_data.csv")
fig = px.bar(data, x='Name', y='Marks', color='Gender')
fig.show()

Exercise 1: Choose the correct answer.

1. What is the primary goal of data visualization?

a) To store data efficiently.

b) To perform complex calculations on data.

c) To understand data through visual context.

d) To secure data from unauthorized access.

Correct Answer: c) To understand data through visual context.
Justification: Data visualization turns raw, hard-to-read numbers into charts and graphs, making it much easier for our brains to understand patterns, trends, and context at a glance.

2. Which of the following is a popular Python library for creating basic graphs like line charts and bar charts?

a) Seaborn

b) Plotly

c) Matplotlib

d) Pandas

Correct Answer: c) Matplotlib
Justification: While Seaborn and Plotly are great for advanced or interactive charts, the text specifies that Matplotlib is the foundational library specifically suited for creating basic graphs like line charts and bar charts.

3. Which type of plot uses dots to represent relationships between variables?

a) Bar chart

b) Line chart

c) Scatter plot

d) Pie plot

Correct Answer: c) Scatter plot
Justification: A scatter plot places individual dots on an X and Y axis to show how two different variables relate to or correlate with each other.

4. What type of chart uses rectangular bars to represent data categories?

a) Scatter plot

b) Line chart

c) Bar chart

d) Pie plot

Correct Answer: c) Bar chart
Justification: A bar chart uses rectangular bars where the length or height of the bar directly corresponds to the numeric value of the data category it represents.

5. Which data visualization library in Python is built on Matplotlib and offers more advanced statistical visualizations?

a) Plotly

b) Pandas

c) Seaborn

d) GGPlot

Correct Answer: c) Seaborn
Justification: Seaborn is explicitly built on top of Matplotlib. It simplifies the code needed to create beautiful, complex statistical charts and works seamlessly with Pandas.

6. Which Plotly method is used to create a scatter plot?

a) scatter()

b) line()

c) bar()

d) pie()

Correct Answer: a) scatter()
Justification: In the Plotly Express module, the scatter() method (e.g., plotly.express.scatter()) is called to generate an interactive scatter plot.

7. Which Plotly Express function is used to create a line chart?

a) px.scatter()

b) px.line()

c) px.bar()

d) px.pie()

Correct Answer: b) px.line()
Justification: If Plotly Express is imported as px, the function used to generate a line chart is px.line().

8. Which Matplotlib function is commonly used to create a pie chart?

a) plt.scatter()

b) plt.plot()

c) plt.bar()

d) plt.pie()

Correct Answer: d) plt.pie()
Justification: In Matplotlib (imported as plt), the plt.pie() function takes an array of sizes and turns them into a circular pie chart representing percentages.

9. Which Plotly Express function is used to create a bar chart?

a) px.scatter()

b) px.line()

c) px.bar()

d) px.histogram()

Correct Answer: c) px.bar()
Justification: In the Plotly Express module (px), the px.bar() function is explicitly used to construct interactive bar charts.

Exercise 2: Write short answers to these questions.

a) Define data visualization in your own words. 2 Marks

Data visualization is the process of translating raw data and numbers into visual graphics, like charts, graphs, or maps. This makes it much easier for people to quickly understand patterns, trends, and complex information.

b) Why is data visualization considered important for businesses and analysts? 2 Marks

Businesses generate massive amounts of data daily. Visualization is critical because it turns that overwhelming data into clear, understandable insights. This allows professionals to spot market trends, identify problems, and make smarter, faster business decisions.

c) Name three popular Python libraries for data visualization. 2 Marks

Three popular libraries are Matplotlib, Seaborn, and Plotly.

d) What is the key characteristic of Matplotlib that offers both freedom and the need for more code? 2 Marks

Matplotlib is a “low-level” library with a MATLAB-like interface. This means it makes few assumptions for you: it gives you the freedom to customize almost every element of your graph, but requires you to write more lines of code to achieve it.

e) What type of data is typically represented using a bar chart? 2 Marks

A bar chart is typically used to represent categorical data. It is perfect for comparing different distinct groups or categories against each other (e.g., total sales across different months, or populations of different cities).

f) What is Seaborn built upon, and what type of visualizations does it primarily focus on? 2 Marks

Seaborn is built directly on top of the Matplotlib library. It primarily focuses on creating beautiful, informative statistical visualizations, such as complex line plots, bar plots, and heatmaps that show patterns and relationships.

g) What is a key feature of Plotly that distinguishes it from Matplotlib? 2 Marks

The key distinguishing feature of Plotly is interactivity. While Matplotlib mostly generates static (unmoving) images, Plotly creates dynamic graphs that allow the user to zoom in, click on categories, and hover over data points to see exact numbers directly in their web browser.

h) In Matplotlib, what is the role of plt.xlabel() and plt.ylabel()? 2 Marks

The plt.xlabel() function is used to add a descriptive text label to the horizontal X-axis of a graph, while plt.ylabel() adds a descriptive text label to the vertical Y-axis, helping viewers understand what the graph is measuring.

i) What type of data is best represented using a pie plot? 2 Marks

A pie plot (or circular chart) is best used to represent parts of a whole. It is ideal for showing percentages or proportional distributions (e.g., showing the market share of different smartphone brands adding up to 100%).

j) What is the purpose of the color argument in Plotly Express plotting functions? 2 Marks

The color argument automatically groups and color-codes the data points on the graph based on a specific category or column in your dataset (e.g., color='Gender' will automatically color male and female data points differently and generate a legend).

Exercise 3: Long Answer Questions.

1. Explain the importance of data visualization in the process of data analysis. 4 Marks

In the process of data analysis, analysts must comb through thousands or even millions of rows of raw data. Human brains are not naturally equipped to find mathematical patterns in giant spreadsheets. Data visualization acts as a translation tool, converting complex numerical datasets into visual context (shapes, colors, lines). Its importance lies in three areas:

Speed: It allows analysts to instantly identify trends (like a sudden spike in website traffic) and outliers (abnormal data points).

Storytelling: It helps analysts communicate their findings clearly and convincingly to non-technical business leaders or clients.

Exploration: It allows analysts to interact with the data from different angles, leading to new insights and better, data-driven decision-making.

2. Compare and contrast the Matplotlib and Plotly libraries for data visualization in Python. Discuss their key features, strengths, and weaknesses. 4 Marks

Matplotlib is a foundational, low-level plotting library.

Strengths: It is highly customizable, deeply integrated into the Python ecosystem, and excellent for generating standard, static 2D plots (like basic line, bar, and pie charts) for academic papers or printed reports.

Weaknesses: Because it is low-level, it requires significantly more lines of code to make charts look beautiful. The charts are also mostly static.

Plotly is a modern, high-level, open-source library.

Strengths: Its primary feature is interactivity. Plotly generates dynamic charts where users can hover, zoom, and click elements on a web browser. It is incredibly easy to use (especially with Plotly Express) and excels at 3D and financial charts.

Weaknesses: It can be resource-heavy for extremely massive datasets and might be overkill if you only need a simple, printable image for a document.

3. Explain how to create a basic line chart, bar graph, and pie chart using the Matplotlib library in Python. Include the import statement, example data, and key functions used. 4 Marks

# 1. Import the library
import matplotlib.pyplot as plt

# Example Data
months = ['Jan', 'Feb', 'Mar']
sales = [150, 200, 180]

# --- A. Line Chart ---
# Uses plt.plot() to draw a line connecting the data points
plt.plot(months, sales)
plt.title("Monthly Sales Trend")
plt.show()

# --- B. Bar Graph ---
# Uses plt.bar() to create rectangular bars for categorical comparison
plt.bar(months, sales)
plt.title("Monthly Sales Comparison")
plt.show()

# --- C. Pie Chart ---
# Uses plt.pie() to show parts of a whole (percentages)
# 'labels' assigns names to the slices, 'autopct' displays the percentages
plt.pie(sales, labels=months, autopct='%1.1f%%')
plt.title("Sales Distribution")
plt.show()

4. Imagine you have a Pandas DataFrame containing sales data for different product categories over the past year. Outline the steps and Python code (using Plotly) to create a line chart, a bar graph, and a pie plot to analyze this data. 4 Marks

To accomplish this, we first import pandas and plotly.express. We then read the dataset into a DataFrame and pass that DataFrame into the specific Plotly Express functions (px.line, px.bar, px.pie).

import pandas as pd
import plotly.express as px

# Step 1: Assume the data is loaded into a Pandas DataFrame
# (In a real scenario, this would be: df = pd.read_csv("sales.csv"))
data = {
    'Month': ['Jan', 'Feb', 'Mar', 'Jan', 'Feb', 'Mar'],
    'Category': ['Tech', 'Tech', 'Tech', 'Home', 'Home', 'Home'],
    'Sales': [5000, 6000, 5500, 3000, 3200, 3100]
}
df = pd.DataFrame(data)

# a) Line Chart showing the sales trend over the months
# 'color' draws a separate line for each category
fig_line = px.line(df, x='Month', y='Sales', color='Category', title="Sales Trend")
fig_line.show()

# b) Bar Graph comparing total sales for each product category
fig_bar = px.bar(df, x='Category', y='Sales', title="Category Sales Comparison")
fig_bar.show()

# c) Pie Plot showing the percentage contribution of each category
fig_pie = px.pie(df, names='Category', values='Sales', title="Category Contribution")
fig_pie.show()
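If one total per category is needed before plotting, the rows can be summed with Pandas first. This is a minimal sketch that reuses the same example data; the variable name totals is an assumption chosen for illustration.

```python
import pandas as pd

# Same example data as in the answer
df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Jan", "Feb", "Mar"],
    "Category": ["Tech", "Tech", "Tech", "Home", "Home", "Home"],
    "Sales": [5000, 6000, 5500, 3000, 3200, 3100],
})

# Add up the sales of all months for each category
totals = df.groupby("Category", as_index=False)["Sales"].sum()
print(totals)
```

The resulting table has one row per category (Home: 9300, Tech: 16500) and could be passed to px.bar() or px.pie() in the same way as df.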

Part 3: Python Practical Questions (18 Programs)

1. Write a Python program to take two numbers as input and print their sum, difference, product, and quotient.

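a = float(input("Enter first number: "))
b = float(input("Enter second number: "))
print("Sum:", a + b)
print("Difference:", a - b)
print("Product:", a * b)
# Dividing by zero raises an error, so check the second number first
if b != 0:
    print("Quotient:", a / b)
else:
    print("Quotient: undefined (cannot divide by zero)")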

2. Write a program that takes a user’s name as input and prints a greeting message using the print() function.

name = input("Enter your name: ")
# An f-string avoids the extra space that print() would put before "!"
print(f"Hello there, {name}! Welcome to Python.")

3. Create a Python program that takes an integer input from the user and checks whether it is positive, negative, or zero using an if-elif-else statement.

num = int(input("Enter an integer: "))
if num > 0:
    print("The number is positive.")
elif num < 0:
    print("The number is negative.")
else:
    print("The number is exactly zero.")

4. Write a program that accepts two numbers from the user and swaps their values without using a third variable.

x = int(input("Enter X: "))
y = int(input("Enter Y: "))
# Python allows elegant simultaneous swapping!
x, y = y, x 
print("After swapping: X =", x, "and Y =", y)

5. Create a program that takes a sentence as input and counts the number of vowels in it.

sentence = input("Enter a sentence: ").lower()
vowel_count = 0
for char in sentence:
    if char in 'aeiou':
        vowel_count += 1
print("Number of vowels:", vowel_count)
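The same counting logic can also be wrapped in a function so that it can be reused and checked with fixed inputs. This is only a sketch; the function name count_vowels is an assumption and does not come from the textbook.

```python
def count_vowels(text):
    """Return how many characters in text are vowels (a, e, i, o, u)."""
    # sum() adds 1 for every character that is a vowel
    return sum(1 for char in text.lower() if char in "aeiou")

print(count_vowels("Hello World"))    # 3
print(count_vowels("PYTHON is fun"))  # 3
```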

6. Write a Python program to create a new file called “data.txt” and write the sentence “Hello, this is a test file.” into it.

file = open("data.txt", "w")
file.write("Hello, this is a test file.\n")
file.close()
print("File created successfully.")

7. Modify the above program to append the sentence “This is an appended line.” to the same file.

file = open("data.txt", "a") # Open in Append mode
file.write("This is an appended line.\n")
file.close()
print("Line appended successfully.")

8. Write a program that reads a file and prints its content to the console.

file = open("data.txt", "r")
content = file.read()
print("--- File Contents ---")
print(content)
file.close()
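Programs 6 to 8 can also be written with the with statement, which closes the file automatically at the end of the block. The sketch below is one way to combine the three steps; it works inside a temporary folder (an assumption made here so that no existing data.txt is overwritten).

```python
import os
import tempfile

# Temporary folder so that no real file is overwritten
folder = tempfile.mkdtemp()
path = os.path.join(folder, "data.txt")

# "w" creates the file (or empties it) and writes the first line
with open(path, "w") as file:
    file.write("Hello, this is a test file.\n")

# "a" keeps the existing content and adds to the end
with open(path, "a") as file:
    file.write("This is an appended line.\n")

# "r" opens the file for reading
with open(path, "r") as file:
    content = file.read()

print(content)
```

No file.close() call is needed here, because each with block closes its file as soon as the block ends.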

9. Write a Python program using the csv module to create a CSV file “students.csv” with columns “Name” and “Marks”, and add three student records.

import csv

data = [
    ["Name", "Marks"],
    ["Aarosh", 85],
    ["Subigya", 92],
    ["Binay", 78]
]

with open("students.csv", "w", newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)
print("students.csv created successfully.")

10. Write a program that reads data from “students.csv” and displays the content.

import csv

with open("students.csv", "r", newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
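When the column names are needed, csv.DictReader reads each row as a dictionary keyed by the header row. The following sketch assumes the same three student records; it reads them from a text string (csv_text, a stand-in for students.csv) so that it runs on its own.

```python
import csv
import io

# Stand-in for the contents of students.csv
csv_text = "Name,Marks\nAarosh,85\nSubigya,92\nBinay,78\n"

reader = csv.DictReader(io.StringIO(csv_text))

# Values read from CSV are text, so Marks is converted to int
students = [{"Name": row["Name"], "Marks": int(row["Marks"])} for row in reader]

for student in students:
    print(student["Name"], "scored", student["Marks"])

average = sum(s["Marks"] for s in students) / len(students)
print("Average marks:", average)
```

With a real file, io.StringIO(csv_text) would be replaced by the file object returned by open().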

11. Using Pandas, read a CSV file and print the first five rows.

import pandas as pd
# Using the file we created in question 9
df = pd.read_csv("students.csv")
print(df.head()) # .head() prints the first 5 rows by default

12. Write a Pandas program to create a DataFrame from a dictionary and save it as a CSV file.

import pandas as pd

my_dict = {
    'Item': ['Laptop', 'Mouse', 'Keyboard'],
    'Stock': [15, 100, 45]
}
df = pd.DataFrame(my_dict)
df.to_csv("inventory.csv", index=False)
print("Dictionary saved to inventory.csv")

13. Using Pandas and Matplotlib, read data from a CSV file and plot a pie chart for product sales.

import pandas as pd
import matplotlib.pyplot as plt

# Create a small sample CSV file so the program can run on its own
sample = {'Product': ['Laptop', 'Mouse', 'Keyboard'], 'Sales': [5000, 1500, 2500]}
pd.DataFrame(sample).to_csv("sales.csv", index=False)

# Read the sales data from the CSV file
df = pd.read_csv("sales.csv")

plt.pie(df['Sales'], labels=df['Product'], autopct='%1.1f%%')
plt.title("Product Sales Distribution")
plt.show()

14. Write a Python function that takes two numbers as input and returns their sum.

def get_sum(num1, num2):
    return num1 + num2

print("Sum is:", get_sum(10, 20))

15. Create a function that calculates the area of a circle given the radius as a parameter.

import math

def circle_area(radius):
    return math.pi * (radius ** 2)

r = float(input("Enter radius: "))
print("Area of circle:", circle_area(r))

16. Write a function that checks if a number is prime or not.

def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

num = int(input("Enter a number to check if prime: "))
if is_prime(num):
    print("It is a prime number.")
else:
    print("It is not a prime number.")
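A quick way to check such a function is to run it over a small range and compare the result with the known primes. The sketch below repeats the same is_prime() logic so that it is self-contained; the range 1 to 20 is just an example.

```python
def is_prime(n):
    if n <= 1:
        return False
    # A factor larger than the square root would need a smaller partner,
    # so checking up to the square root is enough
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

primes = [n for n in range(1, 21) if is_prime(n)]
print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19]
```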

17. Write a function that takes a list of numbers as input and returns the maximum number from the list.

def find_max(number_list):
    # Python has a built-in max() function!
    return max(number_list)

my_list = [4, 99, 23, 1, 85]
print("The maximum number is:", find_max(my_list))
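If the question expects the logic to be written by hand rather than with max(), a loop can keep track of the largest value seen so far. This is a minimal sketch; the name find_max_manual and the choice to raise ValueError for an empty list are assumptions.

```python
def find_max_manual(number_list):
    if not number_list:
        raise ValueError("The list is empty.")
    largest = number_list[0]          # start with the first number
    for number in number_list[1:]:    # compare with every other number
        if number > largest:
            largest = number
    return largest

print(find_max_manual([4, 99, 23, 1, 85]))  # 99
```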

18. Write a program that reads a file and handles the exception if the file does not exist.

try:
    file = open("missing_file.txt", "r")
    print(file.read())
    file.close()
except FileNotFoundError:
    print("Error: The file you are looking for does not exist on this computer!")
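The same idea can be placed in a small function that returns None when the file is missing, so the calling code decides what message to show. This is a sketch only; the function name read_text_file and the example path are hypothetical.

```python
def read_text_file(path):
    """Return the text in the file, or None if the file does not exist."""
    try:
        # 'with' closes the file even if reading fails
        with open(path, "r") as file:
            return file.read()
    except FileNotFoundError:
        return None

content = read_text_file("no_such_folder/missing_file.txt")
if content is None:
    print("Error: The file does not exist.")
else:
    print(content)
```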