Reject Discreteness

May 17, 2024

#blog

Your final grade is 89.5%. You beg and plead for your instructor to round it up to 90 so you can get an A- instead of a B+. He says no, because it wouldn't be fair.

You beg and plead a little more. Still no dice.

Technically, 89.5 does round to 90. But you can go even further. 89.45 rounds to 89.5, which in turn rounds to 90. 89.445 rounds to 89.45, which roun-- RecursionError: maximum recursion depth exceeded. This joke isn't even funny, move on! Oops.
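If you actually wanted to run the gag, floats are the wrong tool (Python's built-in `round` does round-half-to-even, and 89.45 isn't even exactly representable in binary). A minimal sketch with the `decimal` module instead, using half-up rounding; the function name `rounding_chain` is my own invention:

```python
from decimal import Decimal, ROUND_HALF_UP

def rounding_chain(score: str) -> list:
    """Round away one decimal place at a time (half-up), recording each step."""
    d = Decimal(score)
    steps = [d]
    while d.as_tuple().exponent < 0:
        # Quantize to one fewer decimal place than d currently has.
        target = Decimal(1).scaleb(d.as_tuple().exponent + 1)
        d = d.quantize(target, rounding=ROUND_HALF_UP)
        steps.append(d)
    return steps

print(rounding_chain("89.445"))  # 89.445 -> 89.45 -> 89.5 -> 90
```

Each intermediate value legitimately rounds to the next one, which is exactly why the instructor has to draw the line somewhere.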

Point is, I've taken my fair share of college courses where grade boundaries are set and people on the edge find it unfair. Fundamentally, the idea of converting a continuous score (a percentage) to something discrete (a letter grade, or even just rounding to the nearest integer) simplifies the task of assigning a grade. It's a lot easier to say "here are two B students" than "here's a student who had a score of 86.3957 and here's another who had a score of 84.4963 even though the quality of their work is similar enough that it doesn't really matter how many decimal places I go."

Now grading is a whole rabbit hole that Zoe Bee has covered excellently, but it exemplifies the problem: people on the wrong side of a boundary are always going to feel hard done by. Maybe your local rapid transit system uses a zonal fare scheme - those who live near the edge of two zones will pay more than they think is fair. It's even been shown that living close to a time zone border can be dangerous. People born in the mid-90s can't agree on whether they're millennials or Gen Z.

The point is, discretization sucks. Boundaries suck. They abstract a lot of complexity and punish people or things that are close to the boundaries themselves. Even if that hypothetical professor decided to round the student's 89.5 to a 90, now someone who got an 89.3 is going to feel bad because they were so close. You can't satisfy everyone with a system. Sure, keeping continuous quantities as-is may be harder to understand, but there's no rigging or transforming going on. You get the number in its purest form.

Discretization also makes equal intervals unequal. For example, suppose you took a bunch of people's heights and sorted them into buckets for every 10 cm, truncating instead of rounding. You're telling me that the difference between 169 cm and 170 cm matters that much more than the difference between 167 cm and 168 cm? It's the same absolute difference in height either way; it only looks like a big deal because we count in decimal. If we counted in, say, seximal instead, tens wouldn't be a clean and elegant way to form the buckets. You'd have to use sixes or twelves instead. That changes everything!
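The height example above fits in a few lines. A minimal sketch (the `bucket` helper is hypothetical, not anyone's real API): truncation puts 167 and 168 in the same 10 cm bucket but splits 169 and 170, and switching the bucket width moves the cliff somewhere else entirely.

```python
def bucket(height_cm: int, width: int = 10) -> int:
    """Truncate a height down to the floor of its bucket."""
    return height_cm // width * width

for a, b in [(167, 168), (169, 170)]:
    same = bucket(a) == bucket(b)
    print(a, b, "same bucket" if same else "different buckets")

# Counting in sixes instead of tens redraws every boundary:
# 169 and 170 now share a bucket (both truncate to 168).
print(bucket(169, 6), bucket(170, 6))
```

The 1 cm gap never changes; only which side of an arbitrary line it straddles does.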

Discrete data is easier to understand, yes, but it can also be used to deceive and obfuscate. I would much rather look at a KDE plot than a histogram. Maybe I should write about KDEs. They're super cool. Then again, I'd hate to turn this blog from a structureless queer paradise into Yet Another Medium.com Data Science Tutorial (just kidding, some of those are really well-written and deserve credit).
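For the curious: the core of a KDE is tiny. A pure-Python sketch of a Gaussian KDE evaluated at a single point, side by side with the histogram's answer (function names and the bandwidth of 2.0 are my own arbitrary choices):

```python
import math

def kde_at(x, data, h=1.0):
    """Sum a Gaussian bump centered on every data point; no bucket edges anywhere."""
    norm = len(data) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data) / norm

def hist_bucket(x, width=10):
    """The histogram's answer: which bucket are you in?"""
    return int(x // width) * width

heights = [167, 168, 169, 170]
print(kde_at(169.5, heights, h=2.0))      # a smooth density, no cliff at 170
print(hist_bucket(169), hist_bucket(170))  # 160 vs 170: a hard cliff
```

The KDE varies smoothly as you slide `x` around; the histogram jumps every time you cross a bucket edge, which is exactly the boundary problem all over again.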

And yes, I support the abolition of time zones.



If you wish to leave a comment on a post, please drop an email.