Unit 2.3 Extracting Information from Data, Pandas
Lab will perform alterations on images, manipulate RGB values, and reduce the number of pixels. College Board requires you to learn about Lossy and Lossless compression.
Files To Get
Save this file to your _notebooks folder
Save these files into a subfolder named files in your _notebooks folder
wget https://raw.githubusercontent.com/nighthawkcoders/APCSP/master/_notebooks/files/data.csv
wget https://raw.githubusercontent.com/nighthawkcoders/APCSP/master/_notebooks/files/grade.json
Save this image into a subfolder named images in your _notebooks folder
wget https://raw.githubusercontent.com/nighthawkcoders/APCSP/master/_notebooks/images/table_dataframe.png
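If wget is not available on your system, the same downloads can be sketched in Python. This is only a minimal sketch: the helper names `destination` and `download_all` are hypothetical, and it assumes you run it from the folder that contains _notebooks and that you have network access.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL and relative paths are taken from the wget commands above.
BASE = "https://raw.githubusercontent.com/nighthawkcoders/APCSP/master/_notebooks/"
RELATIVE_PATHS = ["files/data.csv", "files/grade.json", "images/table_dataframe.png"]


def destination(relative_path, notebooks_dir="_notebooks"):
    """Return the local path that mirrors the repository layout (hypothetical helper)."""
    return Path(notebooks_dir) / relative_path


def download_all(notebooks_dir="_notebooks"):
    """Download each file into its subfolder, creating the folders if needed."""
    for rel in RELATIVE_PATHS:
        target = destination(rel, notebooks_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(BASE + rel, str(target))


# Nothing is downloaded until you call download_all() yourself.
```

Calling `download_all()` should place data.csv and grade.json in the files subfolder and table_dataframe.png in the images subfolder.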
Pandas and DataFrames
In this lesson we will be exploring data analysis using Pandas.
- College Board talks about ideas like:
  - Tools. "the ability to process data depends on users capabilities and their tools"
  - Combining Data. "combine county data sets"
  - Status on Data. "determining the artist with the greatest attendance during a particular month"
  - Data poses challenges. "the need to clean data", "incomplete data"
- From Pandas Overview -- When working with tabular data, such as data stored in spreadsheets or databases, pandas is the right tool for you. pandas will help you to explore, clean, and process your data. In pandas, a data table is called a DataFrame.
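Two of the College Board ideas in the list above, combining data sets and finding the artist with the greatest attendance, can be sketched with small DataFrames. The artists, months, and attendance numbers below are made up for illustration; they are not from the lesson's data files.

```python
import pandas as pd

# Hypothetical attendance records for two counties (made-up values).
county_a = pd.DataFrame({
    "artist": ["Ava", "Ben"],
    "month": ["May", "May"],
    "attendance": [1200, 950],
})
county_b = pd.DataFrame({
    "artist": ["Ava", "Cleo"],
    "month": ["May", "May"],
    "attendance": [300, 1400],
})

# Combining data: stack the two county tables into one DataFrame.
combined = pd.concat([county_a, county_b], ignore_index=True)

# Total attendance per artist in May, then the artist with the largest total.
may_rows = combined[combined["month"] == "May"]
may_totals = may_rows.groupby("artist")["attendance"].sum()
top_artist = may_totals.idxmax()

print(combined)
print(top_artist)
```

Here `pd.concat` is one way to combine tables that share the same columns; tables that share only a key column would more likely call for `merge`.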
'''Pandas is used to gather data sets through its DataFrames implementation'''
import pandas as pd
# grade.json was saved into the files subfolder of _notebooks
df = pd.read_json('files/grade.json')
print(df)
# What part of the data set needs to be cleaned?
# Everything that is not a valid grade.
# From PBL, when is a good time to clean data? Hint: remember "Garbage in, Garbage out"?
# Clean data as you work, before analyzing it, so that bad values do not pile up.
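The answers above can be tried out on a tiny example. This is a minimal sketch of one possible cleaning approach, assuming a hypothetical table with a `grade` column; the rows below are invented and are not the contents of grade.json.

```python
import pandas as pd

# Hypothetical messy records (NOT the contents of grade.json).
raw = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "grade": ["10", "eleven", None, "12"],
})

# Convert grades to numbers; values that cannot be parsed become NaN.
raw["grade"] = pd.to_numeric(raw["grade"], errors="coerce")

# Drop the rows whose grade is missing or could not be parsed.
cleaned = raw.dropna(subset=["grade"]).copy()
cleaned["grade"] = cleaned["grade"].astype(int)

print(cleaned)
```

Dropping rows is only one choice; depending on the question being asked, it may be better to correct or fill in the bad values instead.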