This article is the “P” installment in a series of tips and formats, in alphabetical order, to help you convey your data visualization messages in the best way possible. Starting with A for “Area Charts” and going all the way through W for “Word Clouds,” the series explains and illustrates the pros and cons of each data visualization format. This installment starts with the basics of pie charts and continues through pin maps, scatter plots, and regression lines.
A pie chart is a circular chart divided into sections that illustrate percentages. The entire circle, or pie, represents 100%, and each segment is a portion of that 100%. While simple, the pie chart lets the viewer see the variations in size among categories. The most easily understood pie charts use fewer than seven categories.
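As a rough illustration of the arithmetic behind a pie chart, the following Python sketch converts raw counts into shares of 100% and folds the smallest categories into a single slice so the chart stays under seven segments. The function name pie_shares, the max_slices parameter, the "Other" label, and the sample survey counts are all hypothetical, chosen only for this example.

```python
def pie_shares(counts, max_slices=6):
    """Convert raw category counts into percentage shares for a pie chart.

    Keeps the largest categories and folds the remainder into a single
    "Other" slice so that at most max_slices segments are returned.
    Assumes no input category is already named "Other".
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")

    # Largest categories first, so the small ones are the ones merged.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

    if len(ranked) > max_slices:
        kept = ranked[:max_slices - 1]
        other = sum(value for _, value in ranked[max_slices - 1:])
        ranked = kept + [("Other", other)]

    # Each slice is its share of the whole, expressed out of 100.
    return {name: 100.0 * value / total for name, value in ranked}


# Hypothetical survey responses, invented for illustration.
survey = {"Email": 40, "Search": 25, "Social": 15, "Referral": 8,
          "Print": 5, "Radio": 4, "Events": 2, "Unknown": 1}

for name, share in pie_shares(survey).items():
    print(f"{name}: {share:.1f}%")
```

Because every share is computed against the same total, the slices always add up to 100%, whatever categories were left in or out of the input.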
Pie charts appear in a variety of visual presentations. They can also be used to misrepresent when specific data is left out, because the remaining slices still appear to make up a whole. A challenge in using pie charts is that it is difficult to compare sections with one another. People are not good at reading angles, which is why labels are often added to the sections. Used in the right way, though, pie charts can be powerful because they offer a visual simplicity that gets the message across.
In an attempt to improve on this, some organizations use 3D pie charts. But the added perspective often gets in the way: comparing angled slices is already difficult, and tilting the pie distorts their apparent sizes. It’s not the best use of 3D. Stick with simplicity in your pie charts.
Pin maps, also referred to as location maps or pinpoint maps, display the locations of specific things. They are becoming more popular, especially in connection with GPS, smartphones, and social media.
One type of pin map is a searchable online map that indicates the exact location of specific items, people, and so on. GPS units use pin map techniques to show unit positions. Connection maps are a close cousin of pin maps, except that the points are connected. They are used to show graphs or network data that has a geospatial layout. They are useful for abstract connections such as social media replies or phone calls, with each connection drawn as a straight line or an arc on the map.
Scatter Plots and Regression Lines
A scatter plot displays two variables of a data set using Cartesian coordinates. Scatter plots are made up of a vertical axis, a horizontal axis, and a collection of points. Each point corresponds to one item in the data set and is positioned according to its value on each of the axes. In this way, the scatter plot displays two dimensions within one visual image.
The scatter plot is popular for identifying the relationship between two variables in a data set. The relationship is referred to as the ‘correlation’, and the closer the points come to forming a straight line, the stronger the correlation. Analyzing scatter plots requires that the viewer look for the strength and slope of the pattern in the data. The ‘slope’ refers to the direction of change in one variable as the other gets bigger. ‘Strength’ refers to how tightly the points concentrate around the line. Scatter plots are often used to reveal patterns, clusters, or outliers that might be hidden in a standard table.
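One common way to put a number on this is Pearson's correlation coefficient. The article does not name a specific measure, so treat the choice as an assumption. The sign of the result reflects the direction of the slope, and values closer to 1 or -1 mean the points sit closer to a straight line. The function name pearson_r and the sample values below are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired values.

    The result lies between -1 and 1. Its sign matches the direction of
    the slope; its magnitude reflects how closely the points follow a line.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # How the two variables vary together, and how each varies alone.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)

    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")

    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical data: hours studied versus exam score.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 61, 70, 74, 83]
print(f"r = {pearson_r(hours, scores):.3f}")
```

Note that this measures only straight-line relationships. Clusters or curved patterns that are obvious in the plot can still produce a value near zero, which is one reason to look at the scatter plot itself.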
Annotations on scatter plots that show the overall trend of a data set are referred to as regression lines. Linear regression methods work nicely with scatter plots because both deal with two variables. The goal of linear regression is to create a mathematical model so that the viewer can predict the value of ‘y’ when the value of ‘x’ is known.
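A minimal sketch of that idea, assuming ordinary least squares as the fitting method (one common choice; the article does not specify one): the fit yields a slope and an intercept, and a prediction is then just y = slope * x + intercept. The function names fit_line and predict and the sample numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("cannot fit a line when all x values are identical")

    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = covariation / spread_x
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Estimate y for a known x using the fitted line."""
    return slope * x + intercept


# Hypothetical data: advertising spend versus units sold.
spend = [10, 20, 30, 40, 50]
units = [25, 41, 62, 79, 103]

slope, intercept = fit_line(spend, units)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
print(f"predicted units at spend 60: {predict(60, slope, intercept):.1f}")
```

Predictions are most trustworthy for x values inside the range of the plotted data. Extending the line well beyond the observed points assumes the same trend continues.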
Image credit: Bradhoc https://www.flickr.com/photos/bradhoc/