Pandas Jupyter Notebook

Background

Analyze the data for an independent gaming company for their most recent fantasy game Heroes of Pymoli as a Lead Analyst.

Like many others in its genre, the game is free-to-play, but players are encouraged to purchase optional items that enhance their playing experience. As a first task, the company would like to generate a report that breaks down the game's purchasing data into meaningful insights.

The final report should include each of the following:

Player Count

Total Number of Players

#get the count
playerCount = df.SN.nunique()

#create table
playerCount_df = pd.DataFrame()
playerCount_df["Total Players"] = [playerCount]

#print
playerCount_df

Purchasing Analysis (Total)

Number of Unique Items
Average Purchase Price
Total Number of Purchases
Total Revenue

#get the number
uniqItem = df["Item Name"].nunique()
avePrice = df.Price.mean()
numPurchasing = df["Purchase ID"].count() 
totRev = df.Price.sum()


#create summary table
purchasingAnalysis = pd.DataFrame()
purchasingAnalysis["Number of Unique Items"] = [uniqItem]
purchasingAnalysis["Average Price"] = [avePrice]
purchasingAnalysis["Number of Purchases"] = [numPurchasing]
purchasingAnalysis["Total Revenue"] = [totRev]

#format the number
purchasingAnalysis["Average Price"] = purchasingAnalysis["Average Price"].map("${:.2f}".format)
purchasingAnalysis["Total Revenue"] = purchasingAnalysis["Total Revenue"].map("${:.2f}".format)

purchasingAnalysis

Gender Demographics

Percentage and Count of Male Players
Percentage and Count of Female Players
Percentage and Count of Other / Non-Disclosed

people_df = df.groupby(['SN','Gender']).size().reset_index().rename(columns={0:'count'})
#make the columns first
genderCount = people_df.Gender.value_counts()
genderPerc = genderCount / len(people_df)

#make the table
genderTable = pd.DataFrame()
genderTable["Total Count"] = genderCount
genderTable["Percentage of Players"] = genderPerc * 100

#format
genderTable["Percentage of Players"] = genderTable["Percentage of Players"].map("{:.2f}%".format)

#print
genderTable

Purchasing Analysis (Gender)

The below each broken by gender
- Purchase Count
- Average Purchase Price
- Total Purchase Value
- Average Purchase Total per Person by Gender

#get the columns
genderSize = df.groupby("Gender").size()
genderAvgPrice = df.groupby("Gender").Price.mean()
genderTotPrice = df.groupby("Gender").Price.sum()
genderAvgPerPerson = df.groupby("Gender").Price.sum() / genderTable["Total Count"]
#make the table
genderPurchasingTable = pd.DataFrame()
genderPurchasingTable["Purchase Count"] = genderSize
genderPurchasingTable["Average Purchase Price"] = genderAvgPrice
genderPurchasingTable["Total Purchase Value"] = genderTotPrice
genderPurchasingTable["Avg Total Purchase per Person"] = genderAvgPerPerson

#format
genderPurchasingTable["Average Purchase Price"] = genderPurchasingTable["Average Purchase Price"].map("${:,.2f}".format)
genderPurchasingTable["Total Purchase Value"] = genderPurchasingTable["Total Purchase Value"].map("${:,.2f}".format)
genderPurchasingTable["Avg Total Purchase per Person"] = genderPurchasingTable["Avg Total Purchase per Person"].map("${:,.2f}".format)

#print
genderPurchasingTable

Age Demographics

The below each broken into bins of 4 years (i.e. <10, 10-14, 15-19, etc.)
- Purchase Count
- Average Purchase Price
- Total Purchase Value
- Average Purchase Total per Person by Age Group

# Create bins in which to place values based upon player ages
bins = [0, 9, 14, 19, 24, 29, 34, 39, 50]

# Create labels for these bins
group_labels = [" <10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Slice the data and place it into bins
df["Age Group"] = pd.cut(df["Age"], bins, labels=group_labels).astype(str)
df.head()
people_df2 = df.groupby(['SN','Age Group']).size().reset_index().rename(columns={0:'count'})
people_df2
# Create bins in which to place values based upon player ages
bins = [0, 9, 14, 19, 24, 29, 34, 39, 50]

# Create labels for these bins
group_labels = [" <10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Slice the data and place it into bins
df["Age Group"] = pd.cut(df["Age"], bins, labels=group_labels).astype(str)


ageGroupTable = people_df2.groupby("Age Group").size().reset_index().rename(columns={0:'Total Count'})
ageGroupTable["Percentage of Players"] = ageGroupTable["Total Count"] / len(people_df2) * 100

#format
ageGroupTable["Percentage of Players"] = ageGroupTable["Percentage of Players"].map("{:.2f}%".format)

ageGroupTable = ageGroupTable.set_index("Age Group")
ageGroupTable
#get the columns
ageSize = df.groupby("Age Group").size()
ageAvgPrice = df.groupby("Age Group").Price.mean()
ageTotPrice = df.groupby("Age Group").Price.sum()
ageAvgPerPerson = df.groupby("Age Group").Price.sum() / ageGroupTable["Total Count"] # NOTE THE AGE COUNT USE HERE - HAVE TO GET UNIQUE PEOPLE

#make the table
agePurchasingTable = pd.DataFrame()
agePurchasingTable["Purchase Count"] = ageSize
agePurchasingTable["Average Purchase Price"] = ageAvgPrice
agePurchasingTable["Total Purchase Value"] = ageTotPrice
agePurchasingTable["Avg Total Purchase per Person"] = ageAvgPerPerson

#format
agePurchasingTable["Average Purchase Price"] = agePurchasingTable["Average Purchase Price"].map("${:,.2f}".format)
agePurchasingTable["Total Purchase Value"] = agePurchasingTable["Total Purchase Value"].map("${:,.2f}".format)
agePurchasingTable["Avg Total Purchase per Person"] = agePurchasingTable["Avg Total Purchase per Person"].map("${:,.2f}".format)

#print
agePurchasingTable

Top Spenders

Identify the the top 5 spenders in the game by total purchase value, then list (in a table):
- SN
- Purchase Count
- Average Purchase Price
- Total Purchase Value

#get the columns
sNSize = df.groupby("SN").size()
sNAvgPrice = df.groupby("SN").Price.mean()
sNTotPrice = df.groupby("SN").Price.sum()

#make the table
sNPurchasingTable = pd.DataFrame()
sNPurchasingTable["Purchase Count"] = sNSize
sNPurchasingTable["Average Purchase Price"] = sNAvgPrice
sNPurchasingTable["Total Purchase Value"] = sNTotPrice

#format
sNPurchasingTable["Average Purchase Price"] = sNPurchasingTable["Average Purchase Price"].map("${:,.2f}".format)
sNPurchasingTable["Total Purchase Value"] = sNPurchasingTable["Total Purchase Value"].map("${:,.2f}".format)

#print
sNPurchasingTable = sNPurchasingTable.sort_values(by="Total Purchase Value", ascending=False)
sNPurchasingTable.head()

Most Popular Items

Identify the 5 most popular items by purchase count, then list (in a table):
- Item ID
- Item Name
- Purchase Count
- Item Price
- Total Purchase Value

#get the columns
itemSize = df.groupby(["Item ID", "Item Name"]).size()
itemAvgPrice = df.groupby(["Item ID", "Item Name"]).Price.mean()
itemTotPrice = df.groupby(["Item ID", "Item Name"]).Price.sum()

#make the table
itemPurchasingTable = pd.DataFrame()
itemPurchasingTable["Purchase Count"] = itemSize
itemPurchasingTable["Item Price"] = itemAvgPrice
itemPurchasingTable["Total Purchase Value"] = itemTotPrice

#sort
itemPurchasingTable = itemPurchasingTable.sort_values(by="Purchase Count", ascending=False)

#format
itemPurchasingTable["Item Price"] = itemPurchasingTable["Item Price"].map("${:,.2f}".format)
itemPurchasingTable["Total Purchase Value"] = itemPurchasingTable["Total Purchase Value"].map("${:,.2f}".format)

#print
itemPurchasingTable.head()

Most Profitable Items

Identify the 5 most profitable items by total purchase value, then list (in a table):
- Item ID
- Item Name
- Purchase Count
- Item Price
- Total Purchase Value

#get the columns
itemSize = df.groupby(["Item ID", "Item Name"]).size()
itemAvgPrice = df.groupby(["Item ID", "Item Name"]).Price.mean()
itemTotPrice = df.groupby(["Item ID", "Item Name"]).Price.sum()

#make the table
itemPurchasingTable = pd.DataFrame()
itemPurchasingTable["Purchase Count"] = itemSize
itemPurchasingTable["Item Price"] = itemAvgPrice
itemPurchasingTable["Total Purchase Value"] = itemTotPrice

#sort
itemPurchasingTable = itemPurchasingTable.sort_values(by="Total Purchase Value", ascending=False)

#format
itemPurchasingTable["Item Price"] = itemPurchasingTable["Item Price"].map("${:,.2f}".format)
itemPurchasingTable["Total Purchase Value"] = itemPurchasingTable["Total Purchase Value"].map("${:,.2f}".format)

#print
itemPurchasingTable.head()

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
HeroesOfPymoli /Resources		HeroesOfPymoli /Resources
HeroesOfPymoli		HeroesOfPymoli
Images		Images
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Pandas Jupyter Notebook

Background

Player Count

Purchasing Analysis (Total)

Gender Demographics

Purchasing Analysis (Gender)

Age Demographics

Top Spenders

Most Popular Items

Most Profitable Items

About

Uh oh!

Releases

Packages

Languages

jammilo/Pandas-Jupyter-Notebook

Folders and files

Latest commit

History

Repository files navigation

Pandas Jupyter Notebook

Background

Player Count

Purchasing Analysis (Total)

Gender Demographics

Purchasing Analysis (Gender)

Age Demographics

Top Spenders

Most Popular Items

Most Profitable Items

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages