Data Visualization with Plotly
What is Data Visualization?
Data visualization is the practice of presenting information in a graphical or pictorial format so that insights and findings are easy to understand. It has become increasingly important in recent years as the volume of data we generate each day keeps growing. That volume can be overwhelming, and visualization helps us make sense of it: by converting data into visual representations, we can identify patterns, relationships, and trends that would be difficult to see in raw form. Because it makes complex findings accessible and engaging to a wider audience, data visualization is an essential tool for data analysis and decision-making.
Plotly
Plotly is a popular open-source data visualization library that provides a comprehensive range of tools for creating interactive, web-based visualizations. It is widely used by data scientists, analysts, and developers to create high-quality visualizations with ease. Plotly is known for its easy-to-use interface, wide range of customizable options, and ability to create interactive and animated visualizations.
Plotly is an excellent tool for data visualization for several reasons. Firstly, it offers a wide range of pre-built visualizations, including bar charts, line graphs, scatter plots, and more, making it easy to create visualizations quickly. Secondly, Plotly allows you to customize the visualizations to suit your needs, with options to change colours, labels, annotations, and more. Additionally, Plotly provides interactive features, such as hover text and zoom functionality, which make the visualizations more engaging. Finally, Plotly’s web-based platform allows you to publish your visualizations online, making it easy to share and collaborate with others.
In short, Plotly is a versatile, user-friendly, and powerful tool for data visualization that is widely used in data science and analysis.
The Visualization Above
The visualization, "By Birth Year Generation Populations with Rankings by Most Number of People Alive", is a bar chart that presents the population of each generation by birth year and ranks the generations by the number of people alive.
The purpose of this visualization is to provide a quick and easy way to compare the populations of different generations and understand how they rank in terms of size.
In the visualization, the x-axis represents the birth year of each generation, and the y-axis represents the population size. The bar chart is used to display the population of each generation, with the height of each bar representing the number of people in that birth year. The chart is also labelled with the generation name and its corresponding birth year range.
Data Preparation
For this visualization, we’ll be utilizing the 2019 USA Census data, which includes information on the age and corresponding total population of the United States. The data can be accessed here.
# importing pandas
import pandas as pd
# reading a CSV file into a pandas dataframe
df = pd.read_csv("2019_Population_Estimates.csv")
# applying filter
df = df[df.SEX == 'Total']
# selecting required columns (copy so the next assignment does not warn)
df = df.loc[:, ['AGE', 'Population']].copy()
# calculating year of birth
df['birthYear'] = 2019 - df.AGE
# sorting values
df = df.sort_values('birthYear')
# display() is available in Jupyter notebooks; use print(df.head()) elsewhere
display(df.head())
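If you don’t have the CSV to hand, the same steps can be tried on a tiny in-memory table. This is only a sketch: the rows below are invented, and the column names (SEX, AGE, Population) are assumed to match the file used above.

```python
import pandas as pd

# Invented rows that mimic the assumed layout of the census file
raw = pd.DataFrame({
    "SEX": ["Total", "Male", "Female", "Total", "Male", "Female"],
    "AGE": [0, 0, 0, 1, 1, 1],
    "Population": [100, 51, 49, 110, 56, 54],
})

# Keep the combined rows and the two columns we need;
# .copy() avoids pandas' SettingWithCopyWarning on the next assignment
prepared = raw.loc[raw["SEX"] == "Total", ["AGE", "Population"]].copy()

# Someone aged N in 2019 was born in roughly 2019 - N
prepared["birthYear"] = 2019 - prepared["AGE"]

# Earliest birth year first
prepared = prepared.sort_values("birthYear").reset_index(drop=True)
print(prepared)
```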
Creating the Visualization
Ladies and Gentlemen, buckle up, we’re about to embark on a wild ride of data visualization! And what’s better? We’re going to do it with just THREE lines of code! Hold on to your hats, this is going to be legendary!
# importing plotly
import plotly.express as px
# plotting bar graph
fig = px.bar(df, x=df.birthYear, y=df.Population,
             title="BY BIRTH YEAR GENERATION POPULATIONS")
fig.show()
Well folks, we’ve done it! We’ve created a bar chart in just three lines of code! But let’s be honest, it’s not exactly a work of art, is it? It’s functional, it gets the job done, but let’s just say it’s not winning any awards for aesthetics.
Fear not, my friends! With a few extra lines of code, we can turn this bland bar chart into a visually stunning masterpiece. So let’s add some pizazz and make this bar chart worthy of hanging in the Louvre!
Adding Pizazz
Well folks, it’s time to start off with a bang! Or should I say, with a colour! We’re going to begin our journey into the world of Plotly by updating the background colour.
# step 1 - update background
fig.update_layout(
    plot_bgcolor='#DCD2C4',
    paper_bgcolor='#DCD2C4'
)
Step two, folks! We’re on a roll now. After sprucing up the background with some colour, it’s time to make our data points stand out with some marker (bar) updates.
First, we’ll use the pd.cut method to create bins and provide a label for each bin. These labels will become the colour of each marker. Then, using the magic of update_traces, we’ll update not only the marker’s colour but also its border line width, border line colour, and opacity!
# step 2 - update marker
bins = [1918, 1927, 1945, 1964, 1980, 1996, 2012, 2019]
labels = ["#00C3A9", "#00AD90", "#00C3A9", "#00AD90", "#00C3A9", "#00AD90",
          "#00C3A9"]
colors = pd.cut(df.birthYear, bins=bins, labels=labels, ordered=False)
fig.update_traces(
    marker_color=colors,
    marker_line_color='#DCD2C4',
    marker_line_width=0.5,
    opacity=0.5
)
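To see what pd.cut is doing here, the sketch below runs the same bins and labels on a handful of hand-picked birth years near the bin edges. The years are chosen purely for illustration, and bar_colors is a name used only in this example.

```python
import pandas as pd

bins = [1918, 1927, 1945, 1964, 1980, 1996, 2012, 2019]
labels = ["#00C3A9", "#00AD90", "#00C3A9", "#00AD90",
          "#00C3A9", "#00AD90", "#00C3A9"]

# A few hand-picked birth years around the bin edges
years = pd.Series([1918, 1927, 1928, 1964, 1965, 2019])

# ordered=False is required because the labels repeat
bar_colors = pd.cut(years, bins=bins, labels=labels, ordered=False)

# Print each year next to the colour it was assigned
for year, color in zip(years, bar_colors):
    print(year, color)
```

By default pd.cut builds right-closed intervals such as (1918, 1927], so each bin includes its upper edge but not its lower one. A birth year exactly equal to the first edge (1918) therefore falls outside every bin and comes back as NaN; passing include_lowest=True is one way to include it.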
And after these two simple steps, let’s take a look at how our Plotly figure has transformed.
I hope you’ve got your sunglasses ready because it’s about to get bright and beautiful in here. Let’s see if our background colour and marker updates have taken our figure to the next level!
Hold up, wait a minute! Something doesn’t look quite right here… those grid lines are just all over the place, and the bar gap is too wide for my liking.
But no worries, we’ve got this! With just a few tweaks to the update_layout and update_yaxes options, we’ll have those grid lines under control and those bars snug as a bug in a rug.
# step 3 - Deleting grid line and zeroline
fig.update_yaxes(
    showgrid=False,
    zeroline=False,
)
# step 4 - Adjusting Bar Gap
fig.update_layout(
    bargap=0.0001
)
And now, it’s time to give our Plotly figure a little bit of context. Because let’s be real, a figure without titles is like a cake without frosting.
With just a few updates through update_xaxes, update_yaxes, and update_layout, we’ll add a title to both the x- and y-axes, along with a centred chart title. And just like that, our figure will be ready to take centre stage and shine!
So let’s grab our glitter glue and add some sparkle to our chart titles!
# step 5a - Update Titles
fig.update_xaxes(
    title=dict(
        text='Birth Year',
        font=dict(size=18, family="Trebuchet MS"),
        standoff=20
    ),
)
fig.update_yaxes(
    title=dict(
        text='Population',
        font=dict(size=18, family="Trebuchet MS"),
        standoff=20
    ),
)
fig.update_layout(
    title=dict(
        text='BY BIRTH YEAR GENERATION POPULATIONS',
        font=dict(size=28, family="Trebuchet MS"),
        x=0.5,
    )
)
And with that, folks, we’ve reached the end of this tutorial, but not the end of adding some pizazz to our chart! We’ve only covered half the distance, and there’s still so much more to explore and add to make this chart pop. Who needs boring bar charts anyway? Stay tuned for the next episode where we’ll add some sparkle and shine to this masterpiece!
You can view the final visualization above by visiting the following link. I hope you’ve found this guide informative and helpful in creating visually appealing data visualizations with Plotly.
Please feel free to leave a comment below. Any suggestions or questions will help us improve future content and better serve your needs. Thank you for reading!