Methodology

Understanding the algorithms and methods behind our revenue estimations

The Boxleiter Method

Overview

The Boxleiter Method is the industry-standard approach for estimating Steam game revenues, developed by Mike Boxleiter. It rests on the empirical observation that a game's review count correlates strongly with its number of owners, so owners can be estimated by multiplying the review count by a fixed multiplier that has been refined through extensive data analysis.

Core Formula

// Step 1: Estimate Owners
Estimated Owners = Total Reviews × 35
// Step 2: Calculate Gross Revenue
Gross Revenue = Estimated Owners × Average Price
// Step 3: Apply Deductions
Net Revenue = Gross Revenue × (1 - Total Deductions)
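
The three steps above can be sketched in Python as follows. This is a minimal illustration, not SteamRev's actual implementation: the function name `boxleiter_estimate` and its parameter names are hypothetical, and the 65% default is simply the total listed under Revenue Deductions Breakdown on this page.

```python
REVIEW_MULTIPLIER = 35  # owners per review, as given in the Core Formula


def boxleiter_estimate(total_reviews, average_price, total_deductions_pct=65):
    """Return (owners, gross, net) for one game.

    total_deductions_pct is a whole-number percentage; 65 is the total this
    page lists, not a universal constant.
    """
    if total_reviews < 0 or average_price < 0:
        raise ValueError("reviews and price must be non-negative")
    if not 0 <= total_deductions_pct <= 100:
        raise ValueError("deductions must be between 0 and 100 percent")

    owners = total_reviews * REVIEW_MULTIPLIER        # Step 1: estimate owners
    gross = owners * average_price                    # Step 2: gross revenue
    net = gross * (100 - total_deductions_pct) / 100  # Step 3: apply deductions
    return owners, gross, net


owners, gross, net = boxleiter_estimate(1000, 30.0)
print(f"{owners:,} owners, ${gross:,.0f} gross, ${net:,.0f} net")
```

With 1,000 reviews and an average price of $30, this yields 35,000 owners, $1,050,000 gross and $367,500 net.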

The Review Multiplier

The multiplier of 35 is derived from:

  • Analysis of thousands of Steam games where actual sales data was available
  • Statistical correlation between review counts and known owner counts
  • The observation that only ~3% of players leave a review (1/35 ≈ 2.9%)

Revenue Deductions Breakdown

Deduction Type          Percentage   Reason
Steam Cut               30%          Platform fee (reduces to 25% after $10M, 20% after $50M)
VAT/Sales Tax           10%          Average global tax rate
Regional Pricing        10%          Lower prices in certain regions
Discounts & Sales       10%          Seasonal sales, launch discounts
Refunds & Chargebacks   5%           Steam's refund policy impact
Total Deductions        65%          Net Revenue ≈ 35% of Gross
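
The Steam Cut row mentions lower rates past $10M and $50M, while the core formula uses a flat 30%. The sketch below shows one way the tiers could be applied. It assumes each rate applies only to the revenue that falls inside its tier (a marginal scheme), which is an interpretation of the table rather than something this page states. The names `STEAM_TIERS` and `steam_fee` are hypothetical.

```python
# (upper bound of the tier in USD, fee percent); None means "no upper bound".
# Thresholds and rates are the ones listed in the Steam Cut row.
STEAM_TIERS = [(10_000_000, 30), (50_000_000, 25), (None, 20)]


def steam_fee(gross_usd: int) -> int:
    """Platform fee in whole dollars under the assumed marginal-tier scheme."""
    fee = 0
    lower = 0  # start of the tier currently being processed
    for upper, pct in STEAM_TIERS:
        if gross_usd <= lower:
            break  # no revenue reaches this tier
        top = gross_usd if upper is None else min(gross_usd, upper)
        fee += (top - lower) * pct // 100  # fee on the slice inside this tier
        lower = top
    return fee


for gross in (1_000_000, 20_000_000, 60_000_000):
    print(f"${gross:,} gross -> ${steam_fee(gross):,} platform fee")
```

For games grossing under $10M the result is the same flat 30% the table uses, so the tiers only matter for the largest titles.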

Example Calculation

For 1,000 reviews at an average price of $30, the formula gives:

Estimated Owners: 35,000
Gross Revenue: $1,050,000
Net Revenue: $367,500

Limitations & Considerations

Important Notes:

  • Less accurate for games with fewer than 30 reviews
  • May overestimate for games given away for free or in bundles
  • Doesn't account for DLC, microtransactions, or in-game purchases
  • Review rates can vary significantly by genre and target audience
  • Most accurate for traditional premium games on Steam

Wilson Score Confidence Interval

Overview

The Wilson Score confidence interval is a statistical method used to calculate confidence bounds for a binomial proportion. In the context of Steam reviews, it helps determine the "true" positive rating of a game by accounting for the sample size and providing confidence intervals.

Mathematical Formula

For a 95% confidence interval (z = 1.96):

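Wilson Score = (p̂ + z²/(2n) ± z√[p̂(1-p̂)/n + z²/(4n²)]) / (1 + z²/n)

Where:

  • p̂ = observed proportion of positive reviews
  • n = total number of reviews
  • z = z-score for confidence level (1.96 for 95%)

The minus sign gives the lower bound of the interval and the plus sign gives the upper bound. The example scores on this page use the lower bound.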
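
As a minimal sketch, the formula translates to Python almost term by term. The function name `wilson_interval` and its parameters are illustrative and not taken from SteamRev's code.

```python
import math


def wilson_interval(positive: int, total: int, z: float = 1.96):
    """Return (lower, upper) Wilson bounds for the share of positive reviews."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= positive <= total:
        raise ValueError("positive must be between 0 and total")

    p_hat = positive / total                      # observed proportion
    denom = 1 + z * z / total                     # 1 + z²/n
    center = (p_hat + z * z / (2 * total)) / denom
    margin = z * math.sqrt(
        p_hat * (1 - p_hat) / total + z * z / (4 * total * total)
    ) / denom
    return center - margin, center + margin


lower, upper = wilson_interval(positive=10, total=10)
print(f"lower={lower:.3f}, upper={upper:.3f}")
```

For 10 positive reviews out of 10, the lower bound comes out at about 0.722.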

Applications in SteamRev

Review Score Ranking

Used to rank games by review score while accounting for the number of reviews. Games with few reviews have wider confidence intervals, so their lower bound, and therefore their ranking score, is lower.

Quality Score Calculation

Provides a more accurate quality score by considering both the positive percentage and the statistical confidence in that percentage.

Example Calculation

Game A

10 reviews, 100% positive

Wilson: 72.2%

Game B

100 reviews, 90% positive

Wilson: 82.6%

Game C

10,000 reviews, 85% positive

Wilson: 84.3%

Notice how Game A with 100% positive but only 10 reviews scores lower than Game B with 90% positive from 100 reviews, demonstrating the confidence adjustment.
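
Game A's figure can be checked by hand, assuming the score shown is the lower bound of the interval (the minus branch of the ± in the formula). When every review is positive, the term under the square root reduces to z²/(4n²) and the expression simplifies considerably:

```latex
\begin{aligned}
\text{lower} &= \frac{\hat{p} + \frac{z^2}{2n} - z\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}}{1 + \frac{z^2}{n}} \\[4pt]
\hat{p} = 1:\quad \text{lower} &= \frac{1 + \frac{z^2}{2n} - \frac{z^2}{2n}}{1 + \frac{z^2}{n}} = \frac{n}{n + z^2} \\[4pt]
n = 10,\ z = 1.96:\quad \text{lower} &= \frac{10}{10 + 3.8416} \approx 0.722
\end{aligned}
```

The n/(n + z²) form makes the penalty explicit: a perfect rating only approaches a score of 100% as n grows large relative to z² ≈ 3.84.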

Advantages

  • Handles small sample sizes: Provides meaningful scores even for games with few reviews
  • Statistical validity: Based on solid statistical theory with proven reliability
  • Fair comparison: Enables fair ranking between games with vastly different review counts

Inverse Wilson Score (Worst Games)

Overview

The Inverse Wilson Score is an adaptation of the Wilson Score algorithm specifically designed to identify and rank the worst-performing games. Instead of calculating confidence in positive reviews, it calculates confidence in negative reviews, providing a statistically sound method for finding truly poorly-received games.

How It Works

The algorithm inverts the proportion:

Standard Wilson: p̂ = positive_reviews / total_reviews
Inverse Wilson: p̂ = negative_reviews / total_reviews

Then applies the same Wilson Score formula to get confidence in the negative rating.
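
A minimal sketch of the inversion, assuming the score used for ranking is the lower bound of the interval. The helper names `wilson_lower_bound` and `inverse_wilson` are hypothetical.

```python
import math


def wilson_lower_bound(successes: int, total: int, z: float = 1.96) -> float:
    """Lower Wilson bound for successes/total; 0.0 when there are no reviews."""
    if total == 0:
        return 0.0
    p_hat = successes / total
    denom = 1 + z * z / total
    center = (p_hat + z * z / (2 * total)) / denom
    margin = z * math.sqrt(
        p_hat * (1 - p_hat) / total + z * z / (4 * total * total)
    ) / denom
    return center - margin


def inverse_wilson(positive: int, negative: int, z: float = 1.96) -> float:
    """Same formula as the standard score, with negative reviews as the 'successes'."""
    return wilson_lower_bound(negative, positive + negative, z)


print(f"{inverse_wilson(positive=0, negative=5):.3f}")
```

The only change from the standard score is which count is passed as the numerator. For 5 negative reviews out of 5, the result is about 0.566.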

Primary Use Case

Finding the Worst Games

Used in the "Most Negative Games" section to identify games that are genuinely poorly received, not just games with a few bad reviews.

Without Inverse Wilson:

A game with 2 reviews (both negative) would rank as "worst"

With Inverse Wilson:

Requires statistical confidence, favoring games with many negative reviews
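
The contrast can be illustrated by sorting the same games two ways. The two games below are illustrative inputs only (a 2-review game with both reviews negative, and a 5,000-review game at 70% negative), and the names `Game` and `negative_lower_bound` are hypothetical. The score is again assumed to be the lower Wilson bound.

```python
import math
from typing import NamedTuple


class Game(NamedTuple):
    name: str
    negative: int
    total: int


def negative_lower_bound(game: Game, z: float = 1.96) -> float:
    """Lower Wilson bound on the game's share of negative reviews."""
    p_hat = game.negative / game.total
    denom = 1 + z * z / game.total
    center = (p_hat + z * z / (2 * game.total)) / denom
    margin = z * math.sqrt(
        p_hat * (1 - p_hat) / game.total + z * z / (4 * game.total ** 2)
    ) / denom
    return center - margin


games = [Game("Tiny", negative=2, total=2), Game("Large", negative=3500, total=5000)]

# Raw percentage: the 2-review game looks worst.
by_percentage = sorted(games, key=lambda g: g.negative / g.total, reverse=True)
# Inverse Wilson: the heavily reviewed game ranks worst.
by_inverse_wilson = sorted(games, key=negative_lower_bound, reverse=True)

print([g.name for g in by_percentage])
print([g.name for g in by_inverse_wilson])
```

Sorting by raw percentage puts the 2-review game first, while sorting by the lower bound puts the 5,000-review game first.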

Comparison Example

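Game          Reviews   Negative %   Inverse Wilson   Rank
Bad Game A    5         100%         56.6%            3rd
Bad Game B    500       75%          71.0%            1st (Worst)
Bad Game C    5000      70%          68.7%            2nd

Despite its 100% negative rating, Game A ranks last because five reviews provide little confidence. Game B ranks as worst: its 500 reviews are enough to support a lower bound of 71.0%, just ahead of Game C's 68.7% from 5000 reviews.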

Benefits

Statistical Confidence

Ensures that games labeled as "worst" have enough reviews to be statistically significant, avoiding spurious "worst" labels caused by small sample sizes.

Fair Negative Ranking

Provides a balanced approach to identifying poorly-received games, considering both the percentage and volume of negative feedback.

Important Disclaimer

All revenue estimates are approximations based on publicly available data. Actual revenues may vary significantly due to factors not captured in these models. These methods should be used for general market analysis and trends, not for precise financial calculations.