Wednesday, April 10, 2024

Cartography Module 4, Data Classification

Greetings everyone,

Data classification methods are the name of the game this week. This is part of our next couple of modules, which investigate different types of thematic maps. This week, though, we aren't focusing as much on the thematic type as on the data classification method used to display the data set. Additionally, we continue to build upon the design principles of the previous modules. The key learning objectives revolve around demonstrating the differences between the 4 classification methods presented below.

The subject area is Miami-Dade County in southern Florida, and the subject matter is the percentage of the population above age 65 per census tract. The raw number of residents above age 65, normalized by square mile, is also presented. Before we look into those, a touch on the 4 classification methods.

Equal Interval - This classification method takes the range of data values (maximum minus minimum) and divides it by the desired number of classes. For example, values of 0 to 100 with 5 desired classes would give each class a width of 20. One thing this doesn't take into account is how the data is distributed along a number line. It can result in classes with no values in them and classes with far more values than the others. On the plus side, there are no gaps in the legend, although the map may show visual gaps where a class has no values.
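
To make the arithmetic concrete, here is a minimal Python sketch of equal interval breaks. The function names and the sample values are made up for illustration and are not taken from the lab data; GIS software may round or label the breaks differently.

```python
def equal_interval_breaks(values, num_classes):
    """Return the upper bound of each class for an equal interval scheme."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_classes  # range divided by number of classes
    return [lo + width * i for i in range(1, num_classes + 1)]


def count_per_class(values, breaks):
    """Count how many values fall at or below each upper bound."""
    counts = [0] * len(breaks)
    for v in values:
        for i, upper in enumerate(breaks):
            if v <= upper:
                counts[i] += 1
                break
    return counts


# Hypothetical sample: most values are low, two are very high.
sample = [0, 12, 15, 18, 95, 100]
breaks = equal_interval_breaks(sample, 5)
print(breaks)                           # [20.0, 40.0, 60.0, 80.0, 100.0]
print(count_per_class(sample, breaks))  # [4, 0, 0, 0, 2]
```

The three middle classes come out empty, which is the weakness described above.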

Quantiles - This classification method separates your data so that each class holds an equal number of observations. First, the data is rank ordered from lowest to highest (or vice versa), and then observations are assigned to the classes in that order until every class holds the same number of observations. One potential problem is that this method does not take into account data clustering or the natural break points you can see when placing the data on a number line. A positive is that there will be no empty classes within your map.
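
A rough sketch of the same idea in Python, assuming a simple even split of the sorted values. The function name and sample values are hypothetical, and real GIS tools handle ties and leftover observations in their own ways.

```python
def quantile_classes(values, num_classes):
    """Split rank-ordered values into classes of (nearly) equal size."""
    ordered = sorted(values)
    n = len(ordered)
    classes = []
    for i in range(num_classes):
        start = i * n // num_classes
        end = (i + 1) * n // num_classes
        classes.append(ordered[start:end])
    return classes


# Hypothetical sample with an obvious cluster around 50 and one outlier.
sample = [1, 2, 3, 4, 50, 51, 52, 90]
print(quantile_classes(sample, 4))
# [[1, 2], [3, 4], [50, 51], [52, 90]]
```

Every class holds two observations, but 52 is separated from 50 and 51 and grouped with 90, which shows how quantiles can ignore natural clusters.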

Standard Deviation - This classification method is best for data that is approximately normally distributed along the number line. Provided the data follows the bell curve model, or at least has roughly equal amounts of data on both sides of the mean, it will be represented well. If the data is not roughly symmetric about the mean, you're likely to get a skewed presentation with empty or misrepresented color classes.
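
As a hedged illustration, the sketch below places class breaks at whole standard deviations from the mean. The choice of one-standard-deviation steps out to two deviations, the use of the population standard deviation, and the sample values are all assumptions for the example; mapping software offers other step sizes and may center the classes differently.

```python
import statistics


def std_dev_breaks(values, max_devs=2):
    """Return break values at mean + k * sd for k = -max_devs .. max_devs."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumed)
    return [mean + k * sd for k in range(-max_devs, max_devs + 1)]


# Hypothetical sample with mean 5 and standard deviation 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(std_dev_breaks(sample))
```

For this sample the breaks land at about 1, 3, 5, 7, and 9, so each class describes how far a value sits from the mean rather than its raw magnitude.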

Jenks Natural Breaks - This classification method also takes into account where the data falls along the number line, and it tries to group values based on where they cluster. It attempts to place like values together and unlike values in separate classes through a best-fit algorithm.
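
The sketch below illustrates the goal of natural breaks with a brute-force search: it tries every way of cutting the sorted values into classes and keeps the one with the smallest total squared deviation from each class mean. This is not the optimized algorithm that GIS software uses, and it is only practical for tiny data sets; the function names and sample values are hypothetical.

```python
from itertools import combinations


def sum_sq_dev(group):
    """Sum of squared deviations of a group from its own mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


def natural_breaks_bruteforce(values, num_classes):
    """Try every split of the sorted values and keep the tightest classes."""
    ordered = sorted(values)
    n = len(ordered)
    best_groups, best_cost = None, float("inf")
    for cuts in combinations(range(1, n), num_classes - 1):
        bounds = (0,) + cuts + (n,)
        groups = [ordered[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sum_sq_dev(g) for g in groups)
        if cost < best_cost:
            best_groups, best_cost = groups, cost
    return best_groups


# Hypothetical sample with two clusters and one outlier.
sample = [1, 2, 3, 4, 50, 51, 52, 90]
print(natural_breaks_bruteforce(sample, 3))
# [[1, 2, 3, 4], [50, 51, 52], [90]]
```

The clusters stay together and the outlier gets its own class, which is the behavior that makes this method attractive for capturing the highest data class.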

Presentation 1.

Presentation 2.

After comparing the two approaches above, I came to the following conclusion. Using the population above age 65 normalized by square mile provides a more accurate picture of this population. An argument similar to the one for question 8 can be made for this normalized view of the data. The Jenks Natural Breaks method is still the most desirable because of how it captures the highest data class. What makes this view more useful is that it can show you an actual accounting of how many citizens could be reached per tract. In this case, the central area has tracts that mostly contain 3,500 to 7,120 people. This could then also be used to judge return on investment from discussions with this demographic.

This method also eliminates some of the skew seen in some of the surrounding tracts. With the number of individuals being presented, you can better account for the actual tract population for the desired demographic. With the percentage method, a tract could have a much smaller total population while its senior citizens make up a larger percentage. For example, a tract with 100 people in it, 60 of whom are senior citizens, would fall in the highest percentage class under the percent method. However, a tract with 1000 people and only 300 seniors would show a lower percentage but, by the number method, would be more valuable to target.
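
The two example tracts above can be worked through in a few lines of Python. The tract names and the helper function are hypothetical and exist only to show the arithmetic.

```python
def percent_seniors(total_population, seniors):
    """Percentage of a tract's population that is above age 65."""
    return 100 * seniors / total_population


# (total population, seniors) for the two example tracts in the text.
tracts = {"small tract": (100, 60), "large tract": (1000, 300)}

for name, (total, seniors) in tracts.items():
    pct = percent_seniors(total, seniors)
    print(f"{name}: {pct:.0f}% seniors, {seniors} seniors in total")
# small tract: 60% seniors, 60 seniors in total
# large tract: 30% seniors, 300 seniors in total
```

The small tract ranks higher by percentage, while the large tract holds five times as many seniors.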
