December 14, 2008

Technical Example: Ratings

I took a little time to run an experiment comparing ratings from a standard 10-point system with the result that would occur on melative.

Here is the whitesheet and a second model (though lacking proper detail).

The most disappointing part of this example is that assumptions were required in order for the system to work. Firstly, it is assumed that every user has exactly 10 levels, whereas in the real implementation users have lists of varying size. Secondly, it is assumed that every user has exactly 10 experiences, and more precisely, one experience per level. This forces a uniform distribution in the relative computation. In short, all users have equal rating power, which is only true in this ideal system, or in systems which are not relative.

Compared to the linear and weighted model of the source data, the relative model produces a more modest grade: 7.02 and 7.71 (melative) vs 8.3 and 8.23. For this example, the melative rrs can be considered a heavy weighting, but this is only when relating to a 10.0 scale. Within melative, the two highlighted areas, point-based and percent-based, offer insight into the rating, especially when comparing to another item in the system.

Point-Based

The point-based value yields an absolute value, or total awarded grade across the system. This is special in that there is no cap on the value, and each item in the system is graded equally when it appears on a list. Since the specific grade/rating depends on the system state, we will consider this value to be an energy; for any snapshot of the system, there is a total amount of energy, and each item contributes. If this value is rather large, it is a good indication of popularity, though not solely in event instances or activity. Rather, this value expresses both solid viewership and grading.

From this value, it is also necessary to create a derived value: points divided by the total number of appearances. This point-average disregards popularity, and expresses an item's average grading spread solely across the users who have experienced it. The point-average is important, but it is based on a subset of the users, and in that way, it is similar to the percent-based value.
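As a rough sketch of the two point-based values, assume each appearance is recorded as a (level, levels) pair, where level 1 is the top of a user's list and levels is the size of that list, and assume a simple linear degrade where the top level of an n-level list awards n points and the bottom awards 1. The function names and the data layout are hypothetical, and the linear scoring is only the simplest stand-in, not the model melative actually uses.

```python
# Hypothetical sketch: point-value and point-average for one item.
# Assumption: an appearance is (level, levels), level 1 = top of the list.
# Assumption: linear degrade, top of an n-level list = n points, bottom = 1.

def linear_points(level, levels):
    """Points awarded for sitting at `level` in a list with `levels` levels."""
    return levels - level + 1

def point_value(appearances):
    """Uncapped total of points the item earned across every list it is on."""
    return sum(linear_points(level, levels) for level, levels in appearances)

def point_average(appearances):
    """Point-value divided by the number of appearances (ignores popularity)."""
    return point_value(appearances) / len(appearances)

# One item appearing on three 10-level lists, at levels 1, 3 and 4.
appearances = [(1, 10), (3, 10), (4, 10)]
print(point_value(appearances))    # 10 + 8 + 7 = 25
print(point_average(appearances))  # 25 / 3
```

Adding a fourth appearance always raises the point-value (the energy grows), but it only raises the point-average if the new grade is above the current average.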

Percent-Based

If the point-value was an energy, the percent-value will be considered a potential. Similar to the point-average, the percent-value is a percentage of potential: how high an item appears on lists system-wide. This value is also based on the point-value, but rather than dividing by a head count, we divide by the maximum points that were available, that is, the maximum given the current state of each list on which the item appears.
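The percent-value can be sketched the same way. Here we again assume (level, levels) pairs with level 1 at the top and a linear degrade, so the maximum available on an n-level list is n points. The function name and the scoring are hypothetical and only illustrate the division by available points rather than by head count.

```python
# Hypothetical sketch: percent-value (potential) for one item.
# Assumption: an appearance is (level, levels), level 1 = top of the list.
# Assumption: linear degrade, so the most a list of n levels can award is n.

def percent_value(appearances):
    """Points earned as a percentage of the points that were available."""
    earned = sum(levels - level + 1 for level, levels in appearances)
    available = sum(levels for _level, levels in appearances)
    return 100.0 * earned / available

# Levels 1, 3 and 4 on three 10-level lists: 25 of 30 available points.
print(percent_value([(1, 10), (3, 10), (4, 10)]))

# An item at the top of every list it appears on reaches full potential,
# regardless of how long each list is.
print(percent_value([(1, 10), (1, 5)]))  # 100.0
```

This sketch does not model ties within a level, so it cannot express the extra condition of an item being alone in the top level.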

Statistically, a 100% potential would require every appearance of an item to be alone in the top tier/level. Potential can be related more closely to a weighted, non-relative 10.0-scale system, though the inner workings are entirely different, starting from the basis of calculation.

Looking closely at the example, percent-value and point-average appear to be exactly the same, but this comes from the assumptions of the test (10 levels for each user). In the case of point-average, the maximum number of points is disregarded, and when there is power variation between users, the point-average will yield a different value from the percent-value.
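A small sketch shows why the two values only coincide under the test's assumptions. As before, the (level, levels) layout, the function names, and the linear degrade are hypothetical choices made for illustration.

```python
# Hypothetical sketch: point-average vs percent-value.
# Assumption: an appearance is (level, levels), level 1 = top of the list.
# Assumption: linear degrade, top of an n-level list = n points.

def points(level, levels):
    return levels - level + 1

def point_average(appearances):
    return sum(points(l, n) for l, n in appearances) / len(appearances)

def percent_value(appearances):
    earned = sum(points(l, n) for l, n in appearances)
    available = sum(n for _l, n in appearances)
    return 100.0 * earned / available

# Every user has 10 levels: the values differ only by the factor of 10.
uniform = [(1, 10), (3, 10)]
print(point_average(uniform), percent_value(uniform))  # 9.0 90.0

# Users with different list sizes (different rating power): they diverge.
mixed = [(1, 5), (2, 20)]
print(point_average(mixed), percent_value(mixed))      # 12.0 96.0
```

In the mixed case the point-average exceeds 10 because the longer list awards more points, while the percent-value stays bounded at 100 because it accounts for what each list could have awarded.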

Well, it was fun, and for reference we will be using the second model as it degrades more smoothly. A linear degrade is optional, but for now we stress that higher items are more significant.