Hax In Pokemon Battles


jrrrrrrr

wubwubwub
Tangerine, for someone who has such a fallacious argument, you sure are angry at other people's posts....

You are under the assumption that luck is dependent on the skill of the player, and that for some reason, a better player will always be impacted more by luck. In reality, this is not the case. If two players were using the same teams and using the same moves on every turn, they would have an equal chance of haxing. That is what is meant by "luck will even out".

Since we know that people don't use the same teams, they will not have the same chance of haxing. This raises the question: what makes a "good team" good? Would you use a paraflinch team in an ST5 match? No, obviously, and if you did you would likely lose. Good teams are built specifically to reduce the player's dependence on luck. They are also built to reduce the effect that luck has on the outcome of the match. I never used SpDef CBTar until I was crit OHKOd by a Celebi's Grass Knot. Now, I can take a crit and still nail it back with a Crunch. That is just one example of how skill/experience can turn the tides of luck into something as insignificant as possible.

The most prominent flaw in the support of this formula is that people think that good players are affected more by luck. They are not. Good players are actually less impacted by luck than bad players, since their teams are built to not need luck to win and to reduce the bad effects of luck as much as possible. Changing the rating system to lessen the effect of luck would be completely irrelevant to the sometimes unlucky but skilled player. When I was talking about how Pokemon is a game of odds manipulation, I was assuming that people would take the next step and understand that the better players manipulate the odds to better suit their chances of victory. If someone gets to a point where one critical hit can win the game, obviously they are doing something right, regardless of what their perceived skill level is. Players with a higher ranking are already judged less harshly based on luck than worse players because the nature of the game implies that good players see less "meaningful" luck.

The purpose of any rating system is to rate the player's skill. The current rating system definitely does not consider this at all - while win is a win, how well you win matters. The current formulas don't have this at all. I think this is a noble endeavor and I'd love to help out tweak some stuff but sad that I'm too busy at the moment (I haven't posted in stark in a while x_x).
You would be correct about the current rating system being wrong, if the bolded part of this quote was actually true. There is no reason to believe that how "well" you win matters, and there is nothing to determine what a "good win" even is. My main problem with this is that the rating system being proposed does not actually place more value on skill; it just deters people from using potentially gamechanging moves because they could be deemed too lucky. Pokemon is a game of manipulating luck, and trying to soften the punishment of people who are victims of luck is just as bad as rewarding those who do get lucky. I am in agreement with Colin here. I don't buy the "some wins are worse than others" argument when the objective of the game is as clear as it is in Pokemon: knock them out before they knock you out.

it is because pokemon is a game of probability management that we need to take into consideration probability into the ranking system.
We already do. Probability is taken into consideration every step of the way, from when we are in the team builder to when we decide who is the winner and loser. The element of luck is part of the game. If you don't want to get luckfucked, then don't use moves that can miss, use bulky pokemon that can take hits, and don't take stupid risks. These are things that good players already do that account for probability. There is no need to change the rating system because of your perception that luck and skill are opposites.

To anyone against this concept - we're not changing Pokemon. We're changing how we, as a competitive community should look upon games and how they matter to ranking. Wouldn't you like to know how "good" you are to measure yourself against the others? It's obvious that our current system is flawed - obviously because we have just adopted a ranking system that's designed for some other game without even considering "will this rating system accurately measure skill in Pokemon". It's like saying the adoptation of Rating systems to Chess have changed the game of chess - give me a fucking break - adding a rating system and a way to measure "hax" isn't going to change pokemon no more than we have already done.
Whether or not the current rating system is the best, I don't see how it is not at least a fairly accurate representation of battler's skill. Would I like to know how good I am? Yes, and since Pokemon is a game that includes hax, it would be unreasonable to take the hax out of the game just to say that you are better in haxless Pokemon (a game that doesn't exist). Haxing is out of the player's control, and devaluing wins and losses because of a random element like this is arbitrary and self-serving. This is especially true since most of the prestigious battles are in tournament form, where luck plays a huge role in a single elimination format. Would we have to devalue "luck wins" in tournaments, too, or is this only relevant on the ladder?

Instead of implementing this formula, why don't we just invoke a "hax clause" that removes all haxes from ranked matches? It would be much easier than this IMO, and at least it would be consistent and fair to players on every level of the ladder.

I'm not going to say that the current rating system is perfect, but this is not the way to go about improving it.

To complain against such a system you guys will have to complain against ranking players as a whole -
what? How does "I don't want to turn a significant part of the game into a disadvantage" translate to "I don't want anybody to be ranked"?
 
Tangerine said:
First, to anyone who claims that hax evens out in the long run is obviously wrong. In fact, I'm surprised no one bothered to correct such a ridiculous fallacy that's been proclaimed in the forums for the longest time now and I'm more disturbed that so called "intelligent" members of the community have accepted this. Consider Pokemon without hax, with Player 1 and Player 2. Player 1 is a better player, in fact, will beat Player 2 80% of the time. Consider the game with "luck", which obviously applies to both sides. Let's fix this rate at some arbitrary number, 10%, meaning that you will win 10 % of your game because of luck. Then Player 1 will win because of luck 2% of the time, Player 2 will win 8% of the time. Evens out in the long run? Please - then again this isn't the only thing regarding statistics that you guys are completely oblivious of.
I considered this in my post (though apparently it's arguable but I'll let other people take issue with your actual math/logic) and judged it entirely irrelevant because it has no bearing on player hierarchy whatsoever. Under the current system, you will get the placement you deserve if you play enough matches; the only difference in an ideal situation would be, what, slightly inflated rankings? Please explain to me how this appeals to anything other than ego, because just like literally everyone else in support of this idea, that's the one and only benefit I can draw from any of your arguments.


To anyone who claims that "this is impossible" - please shut up unless you guys are able to give a convincing reasoning why such a concept is "impossible" - just because you guys cannot see it doesn't mean that it is impossible. If you think it's flawed, then it is your job to point it out and improve upon the concept (you guys are in PR after all) or else you guys all might as well get blogs and start bitching about policy there with no suggestions to improvement or with waterdowned reasoning. My words are harsh, but that's because I'm thoroughly disappointed with the posts within this thread.
I'm just going to assume that this isn't addressed to me, because if it is then I have no way to make any sense of it other than to just point you back up to my gigantic post up there.

To anyone against this concept - we're not changing Pokemon. We're changing how we, as a competitive community should look upon games and how they matter to ranking.
Yeah ok there

Please do not give me this sort of garbage when you and I both know what the proposed concept we're supposed to be discussing actually is-- a direct, overt alteration of Pokemon's win condition.
X-Act said:
If Hax is higher than some threshold value, then the game is deemed to have had too much hax, and hence the opposing player would get the win instead.
Throughout my posts I have been lenient enough to consider the possibility that people on both sides of the argument will eventually decide that this is ridiculous and take a stance more similar to yours ("let's change the rating system and that's it"), but don't act as if I'm somehow being unreasonable by treating this idea for what it literally currently is; make an actual effort to legitimize your point of view and have it "officially" recognized (or at least universally agreed upon) instead of pretending that I should be able to predict which of your assumptions will suddenly become "fact."

Tangerine said:
To anyone against this concept - we're not changing Pokemon. We're changing how we, as a competitive community should look upon games and how they matter to ranking. Wouldn't you like to know how "good" you are to measure yourself against the others? It's obvious that our current system is flawed - obviously because we have just adopted a ranking system that's designed for some other game without even considering "will this rating system accurately measure skill in Pokemon". It's like saying the adoptation of Rating systems to Chess have changed the game of chess - give me a fucking break - adding a rating system and a way to measure "hax" isn't going to change pokemon no more than we have already done. And to anyone who don't feel that this is necessary because "pokemon is a game of probability management" reason that I have pointed out so many times - please, it is because pokemon is a game of probability management that we need to take into consideration probability into the ranking system. To complain against such a system you guys will have to complain against ranking players as a whole - your arguments are pathetic and really it's just a "OMG THIS IS NEW I DONT WANT TO THINK ABOUT IT" nonsense - because your arguments can be applied to ANY RANKING SYSTEM - any ranking system changes the weight of wins. Are we changing pokemon now?
Please stop saying stuff like "our current system is flawed" and "obviously because we," when I have already given specific examples suggesting otherwise that could do to be addressed.

For one thing, there are a number of factors that could cause our current formula to work out poorly in the long run, such as the fact that alt accounts are not only pretty much "standard," but have now literally become encouraged under certain circumstances. Why should "hax," something that at worst compresses the rankings a little bit, be the culprit we're focusing on right now?

Aside from the fact that hax is pretty much by definition trivial in the long run (compared to a number of other possible things we could be addressing instead), the bottom line is that any competitive community only needs a formula which measures each player's ability to win games, and Pokemon is no different. Yes, we're changing what a win "means" when we take both consistency and relative ranking into account, and you're correct when you say that neither of these scenarios necessarily constitute changing the game itself. The difference is that you have completely failed to produce any sort of example in which the game wouldn't be changed as a direct result of this, or any vaguely similar formula-- and I think I've given my fair share of examples in which it would. If you and I are equally good at the game from almost every standpoint (equal number of wins against equally skilled players), but I lose to hax three times as often as you do, you should be ranked higher than me because you're obviously doing something right, but you won't be under the new system. I don't even remember if that's something I posted before or not, but my point is that this is either a lot of sacrifice, or a lot of work to consider when the reward is so... minimal? And I'm not even just saying this because of personal opinion-- I'm saying it because you've really just refused to give me a reason to believe otherwise. "This would be interesting to test," "this sooths my soul when I get flinchhaxed too much," "what's wrong with telling players how 'skillful' they are..." why should any of this matter to me?


I'm really, really glad Nintendo did us a favor and standarized the 'banning pokemon' shit because really, if they didn't do it, I would be more than willing to bet that people here would be whining and bitching about the concept of banning pokemon - hell there are people who are just against the idea of banning moves to balance a game. Please do us a favor, shut up unless you have a good argument against it other than your little purist opinion that "OMG POKEMON SHOULD BE THIS WAY" - it is that kind of mentality that is stopping competitive pokemon from ever being taken seriously because of your stubborn mindsets that can't really even grasp the idea of progress.
What do you mean, "stopping competitive Pokemon from being taken seriously"? There are like 150-350 people on the shoddy server daily, correct? So I assume you mean "from the perspective of other competitive communities," as in, the same competitive communities that probably helped spawn this "purist mentality" in the first place, but don't actually mean anything to us because "Pokemon is such a different game?" I seriously don't know how I'm supposed to make sense of this paragraph.

your arguments are pathetic and really it's just a "OMG THIS IS NEW I DONT WANT TO THINK ABOUT IT" nonsense
why did you post this?
 
First, to anyone who claims that hax evens out in the long run is obviously wrong. In fact, I'm surprised no one bothered to correct such a ridiculous fallacy that's been proclaimed in the forums for the longest time now and I'm more disturbed that so called "intelligent" members of the community have accepted this. Consider Pokemon without hax, with Player 1 and Player 2. Player 1 is a better player, in fact, will beat Player 2 80% of the time. Consider the game with "luck", which obviously applies to both sides. Let's fix this rate at some arbitrary number, 10%, meaning that you will win 10 % of your game because of luck. Then Player 1 will win because of luck 2% of the time, Player 2 will win 8% of the time. Evens out in the long run? Please - then again this isn't the only thing regarding statistics that you guys are completely oblivious of.
Sorry, perhaps I should've specified that when I said hax evens out over time I meant the absolute chance of hax occurring, and not necessarily the effect of hax on your win probability.

I'm prepared to admit that in this context, hax possibly does not completely even out over time, but I can't help but think that you're trying to pull wool over our eyes with that particular example you gave. It seems to assume that hax is a discrete variable and that the chance of favorable hax producing a win out of what would otherwise be a loss is the same for players of all skill levels. On the contrary; I believe that lesser players on average require significantly more hax than skilled players in order to make such a difference on the outcome of the match. Whether this is proportional to the base win ratios I'm not sure, but I think it is definitely a factor that exists that you failed to acknowledge.
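For reference, the 80%/10% example in the quote can be worked through numerically. The sketch below assumes one particular reading of it: luck "decides" a fixed fraction of games by reversing whatever result the skill-only game would have produced. That flip model and the function name are assumptions made for illustration, not something the quoted post spells out.

```python
def win_rate_with_luck(skill_win_rate: float, luck_rate: float) -> dict:
    """Split Player 1's results under a simple 'luck flips the result' model.

    Assumption: in a fraction `luck_rate` of all games, luck reverses the
    outcome that skill alone would have produced.
    """
    # Games Player 1 would have lost on skill but wins through luck.
    p1_lucky_wins = (1 - skill_win_rate) * luck_rate
    # Games Player 1 would have won on skill but loses to luck.
    p2_lucky_wins = skill_win_rate * luck_rate
    # Player 1's overall win rate once both kinds of flip are counted.
    p1_overall = skill_win_rate - p2_lucky_wins + p1_lucky_wins
    return {
        "p1_lucky_wins": p1_lucky_wins,
        "p2_lucky_wins": p2_lucky_wins,
        "p1_overall": p1_overall,
    }


result = win_rate_with_luck(skill_win_rate=0.80, luck_rate=0.10)
for name, value in result.items():
    print(f"{name}: {value:.2%}")
```

Under this reading Player 1 gains 2% and gives up 8%, so the 80% win rate drops to 74%. The model treats every game as equally easy to flip, which is exactly the assumption questioned above: it says nothing about whether a weaker player needs more hax than a stronger one to turn a loss into a win.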

I would go on further, but it appears that j7r has already mentioned most of the stuff I was going to say but a lot better.

PS: Kudos to you Lemmiwinks, the only person to make decent points in this thread. I'll address them when I get time.
Thanks, and I look forward to your responses. tbh I was hoping for more discussion on my input, whether it be positive or negative.

Anyway, here's an interesting PM I just received from petrie911:

petrie911 said:
My idea for measuring hax is what I call "variant damage". It's the difference between the damage you were expected to cause to the opponent, calculated from probabilities, and the actual damage dealt. Each player would have a running total of their variant damage, and at the end of the match, the difference between the two would tell how much the hax favored one side or the other.

To demonstrate this, suppose you and I were playing a hypothetical match, and the following turns took place.

Code:
Case 1
 
Your Magnezone (84%)
My Scizor (87%)
 
Scizor used Bullet Punch
Magnezone lost 10% of its health
Magnezone used Hidden Power
It's Super Effective
Scizor lost 87% of its health
Scizor fainted
 
Suppose Scizor would do 9-10.6%, without critical, and that it actually did exactly 10% to Magnezone.
Assume Magnezone's HP Fire is a guaranteed KO.  In this case, Scizor's expected damage is (10.6+9)/2 *
(1+1/16) = 10.4%. So my variant damage is -.4%.  Meanwhile, Magnezone had no chance of doing less
than 87% damage, so your variant damage is 0.  Since there was no hax this round, the variant
damage is quite small.
 
 
Case 2
 
Your Magnezone (84%)
My Scizor (87%)
 
Scizor used Bullet Punch
A critical hit!
Magnezone lost 20% of its health
Magnezone used Hidden Power
A critical hit!
It's Super Effective
Scizor lost 87% of its health
Scizor fainted
 
Now my variant damage is 9.6%, while yours is still 0%.  Once again, you had essentially no hax, so your
variant damage is 0.  I had a minor hax in my favor, so my variant damage is mildly positive.
 
 
Case 3
 
My Heatran (100%) (Scarfed)
Your Dugtrio (54%) (LO) (Arena Trap)
 
Heatran used Fire Blast
Heatran's Attack Missed
Dugtrio used Earthquake
Heatran Fainted
 
Suppose Heatran's attack was a guaranteed KO if it hit.  Then Heatran's expected damage was 54%*.85
= 45.9%, and my variant damage is then -45.9%.  Additionally, since you only had a 15% chance of
attacking this turn, your expected damage was 100%*.15 = 15%, so your variant damage for the turn is 85%.
 
This turn has a very large difference in variant damage (131%), which reflects the important nature
of this hax.
 
 
Case 4
 
My Gengar (100%)
Your Swampert (100%)
 
Gengar used Hypnosis
Swampert fell asleep
Swampert is fast asleep
 
You had selected Surf and expected to do 56% damage, so your variant damage is -56% for this
turn.  Meanwhile, I would have done no damage regardless, so my variant damage is 0.
 
This moderate hax has a moderate effect on variant damage.
 
Also, if this case were replaced with an Ice Beam freeze, the chance of freeze would have to be taken
into account as well as the chance to instantly unfreeze.
 
 
Case 5
 
My Aerodactyl (1%)
Your Jolteon (100%) (Specs)
 
Jolteon used Thunderbolt
It's Super Effective!
Aerodactyl lost 1% of its health.
Aerodactyl fainted.
 
In this case, the result was decided on a speed tie.  Supposing EQ would have KO'd Jolteon, then my
variant damage is -50%, as I had a 50% chance of dealing 100% of your HP.  Your variant damage
is .5%, as you had a 50% chance of dealing 1% damage to Aerodactyl.
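
The arithmetic in Cases 1 to 3 can be checked with a short script. This is only a sketch: the function and variable names are made up for illustration, and it reuses the assumptions already implicit in the Case 1 calculation (a 1/16 critical-hit chance, a critical hit counted as double damage, and the expected roll taken as the midpoint of the damage range).

```python
CRIT_RATE = 1 / 16  # assumed critical-hit chance, as used in Case 1


def variant_damage(actual: float, expected: float) -> float:
    """Actual damage minus expected damage, both in % of the target's HP."""
    return actual - expected


# Cases 1 and 2: Bullet Punch rolls 9-10.6% without a critical hit.
# A critical hit doubles damage, so the average multiplier is 1 + CRIT_RATE.
scizor_expected = (9.0 + 10.6) / 2 * (1 + CRIT_RATE)
case1_scizor = variant_damage(10.0, scizor_expected)  # no critical hit
case2_scizor = variant_damage(20.0, scizor_expected)  # critical hit

# Case 3: Fire Blast (85% accurate) would have KO'd Dugtrio at 54% HP.
heatran_expected = 54.0 * 0.85
case3_heatran = variant_damage(0.0, heatran_expected)  # the attack missed

# Dugtrio only gets to attack if Fire Blast misses (15% of the time).
dugtrio_expected = 100.0 * 0.15
case3_dugtrio = variant_damage(100.0, dugtrio_expected)  # Earthquake KO

case3_swing = case3_dugtrio - case3_heatran

print(f"Case 1 Scizor:  {case1_scizor:.2f}")
print(f"Case 2 Scizor:  {case2_scizor:.2f}")
print(f"Case 3 Heatran: {case3_heatran:.2f}")
print(f"Case 3 Dugtrio: {case3_dugtrio:.2f}")
print(f"Case 3 swing:   {case3_swing:.2f}")
```

The unrounded figures are -0.41 and 9.59 for Scizor in Cases 1 and 2, and a swing of 130.90 in Case 3, which match the rounded values quoted in the cases.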

One problem is that I'm having trouble thinking of simple ways to deal with Burn, Poison, and Paralysis, though. While a full para is easy enough to deal with in the above model, if said paralysis was caused by Tbolt rather than Twave, should it have more weight? If so, how to implement that? Burn damage/attack loss has a similar problem, as does Poison damage when not caused by Toxic Spikes, as even Toxic has a chance to miss. A non-simple way to handle this is to assign to every pokemon the "probability that they'd have this status condition at this point in the match", which is calculable, and then factor that into the expected values. It's messy, but it would work. A similar thing could be used for stat-boosting moves, although Meteor Mash and Charge Beam are probably the only relevant ones.

Additionally, this doesn't take into account the fact that people play differently when badly haxed. Now that I'm without Heatran in Case 3, I'll have more difficulty switching into Celebi, for example. I'm not sure if it's even possible to account for that, but the issue is there. Added computational complexity is also an issue, but it seems unavoidable if we want any sort of reliable estimate.

Still, I think this is the correct way to go about measuring hax. The variant damage gives you an idea of how random draws affected the damage you did to your opponent. Thus, it would be easy to say "If the winner's variant damage minus the loser's is more than X%, then the match is declared 'too haxy', and thus void.", where X is a number to be determined. And, if we did do the previous, a viable definition could be made for banning something based on randomness--if a suspect significantly raises the rate of matches being voided, then it is clearly greatly increasing the effect of hax.
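
The running total and the "too haxy" rule described here could be sketched as follows. The class name, the function name, and the threshold values are all hypothetical: the PM leaves X undetermined, so the 100 and 150 used below are placeholders only.

```python
from dataclasses import dataclass


@dataclass
class HaxTally:
    """Running total of one player's variant damage over a match."""

    total: float = 0.0

    def record(self, actual: float, expected: float) -> None:
        # Variant damage for one attack: actual minus expected,
        # both measured in % of the target's HP.
        self.total += actual - expected


def is_too_haxy(winner: HaxTally, loser: HaxTally, threshold: float) -> bool:
    """True if the winner's variant damage exceeds the loser's by more than X."""
    return winner.total - loser.total > threshold


# Replaying only the turn from Case 3, with the Dugtrio player as the winner.
winner = HaxTally()
loser = HaxTally()
winner.record(actual=100.0, expected=15.0)  # Earthquake after the miss
loser.record(actual=0.0, expected=45.9)     # Fire Blast missed

print(is_too_haxy(winner, loser, threshold=100.0))  # placeholder X = 100
print(is_too_haxy(winner, loser, threshold=150.0))  # placeholder X = 150
```

With the Case 3 turn alone the gap is 130.9, so the match would be voided at X = 100 but not at X = 150. Choosing X is the real design question and is left open here, as it is in the PM.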
I don't have time to dissect this properly right now, but his points look very well thought out so I'm posting it on his behalf for all to see.

On first impressions he brings up a number of interesting points, including stuff that I glossed over previously such as the damage RNG and the overall severity of critical hits with regards to absolute damage incurred. It seems like this analysis would complement my idea of ratings manipulation very well, although I still feel that when it comes to missing with less than 100% accuracy moves, it is partially the fault of the user and should therefore carry less weight when applying such instances to a hypothetical hax evaluator. Again, I'd like to hear what others think about this issue.

As for the original formula, I'm still fundamentally against the principle of changing the win condition for the adverse effects it could have on 'competitive' battling. I have yet to come up with a solution to remove this possibility without removing changes to the win condition entirely, but it's good to remain optimistic on these things until it has been proven beyond reasonable doubt that the idea is unworkable.

I'll probably post more tomorrow, but for now I'll let this thread develop overnight. However, I urge more people to try and make some more meaningful contributions as to how to improve this idea, or at least come up with more convincing reasons as to why this endeavor is a lost cause, as Tangerine alluded to.

P.S: Sorry if my grammar is a little bit off, I'm rather tired atm.
 
In general I'm very surprised to see this formula get any support as it in general limits the kind of teams that are usable in the metagame, if anything.

For example, originally I would have thought that Explosion teams would have taken a big hit from this kind of formula, but ggGerbil (Lemmiwinks or w/e) showed that it actually makes them far more viable, which is even worse. Stall becomes more viable also since it has more pokemon at the end, and it just lowers the variety of pokemon that I thought this community was striving to achieve.
 

Ancien Régime

washed gay RSE player
"but I lose to hax three times as often as you do, you should be ranked higher than me because you're obviously doing something right, but you won't be under the new system."

Can you give concrete examples as to "what I'm obviously doing right" in this scenario?

Also, Articanus, you are directly contradicting jrrrrr's response to dealing with luck - you are arguing "no luck" would make stall the only viable strategy, whereas he seems to be implying that the best way to deal with luck is to...run stall.
 
Can you give concrete examples as to "what I'm obviously doing right" in this scenario?
You might be building a team that doesn't fall apart as easily if one of its members gets haxxed, or you could simply be playing better when your ordinary team strategy happens to fall through.

If we're both running teams that can steamroll most of the competition, but my team's strategy is rendered far less effective once a member meets an untimely KO/burn/etc., that's an area of improvement I definitely should be considering, even if those individual matches happened to favor my opponent for no good reason. Likewise, if I just dismiss a match when my Skarmory gets frozen or whatever, I'm not going to be as well-equipped to handle that sort of scenario when it happens in the future.
 
Also, Articanus, you are directly contradicting jrrrrr's response to dealing with luck - you are arguing "no luck" would make stall the only viable strategy, whereas he seems to be implying that the best way to deal with luck is to...run stall.
I Ctrl+F'ed both of J7r's posts and I don't see the word "stall" in either of them, in any sense. However, assuming you are correct and he is implying that the best way to deal with luck is to stall, then I couldn't disagree more. By running stall without this "hax clause" on, you are inclined to take hits, in which (most likely) you will find yourself screwed over by a crit or side-effect. With the clause on, you can freely play stall and if you find yourself screwed by that crit you can shrug it off and go "oh well, hax against me". And not only that, generally stall teams don't attack as much as other teams. They are designed to take hits, so the chance of them being "lucked against" increases. And since they rely on set damage such as SS, SR, and Spikes (along with Perish Song as a last-poke finisher), they don't usually get very lucky themselves.
 

jrrrrrrr

wubwubwub
"but I lose to hax three times as often as you do, you should be ranked higher than me because you're obviously doing something right, but you won't be under the new system."
This basically sums up my objection to this. If you lose three times as often as I do, how could it possibly make sense to rank you higher than me? "I had a higher probability of winning" does not mean "I should have won", as if the probability actually has a direct influence on how the random number generator works. If you are one Fire Blast away from winning, its 85% accuracy means that you will lose the match about 15 times out of 100 on average. Why should we lessen the punishment for taking risks that the player KNOWS are there?

Also, Articanus, you are directly contradicting jrrrrr's response to dealing with luck - you are arguing "no luck" would make stall the only viable strategy, whereas he seems to be implying that the best way to deal with luck is to...run stall.
I never said that stall is the only or best viable way of dealing with luck, since that is obviously not true. Offensive teams (example: suicide lead + sweepers) that limit the number of turns in a battle are also a perfectly legitimate way of "dealing with luck". Stall-ish pokemon that have long lists of resistances also deal with luck by being able to take lucky hits from lots of different pokemon and not have it matter as much...if your Swampert's Surf crits my Celebi, chances are it's not a big deal. I was just making one quick suggestion there.
 

DougJustDoug

Knows the great enthusiasms
I have been fascinated with the concept of using game data to increase the reliance on skill in competitive pokemon. X-Act came to me a long time ago, and showed me the prototype for his formula. At that time, I had been working on collecting more information about moves and damage during battle. I was doing it simply for increasing the quality of statistics I publish each month. But, when X-Act presented his formula to me, I realized a much more valuable purpose for the data I was collecting.

X-Act's formula was the seed that has grown into a much more comprehensive hax computation. I have been working on it feverishly over the past few weeks. I've been avoiding posting in this thread, because I wanted to get more work completed privately before mentioning anything in public.

By utilizing more detailed statistical data gathered during battles, we can actually keep very close track of how much luck affects the final outcome of a battle.

After cleaning up some of the data hooks I designed, I presented the information to a few of the analytical wizards here in Smogon, and asked how they might use this information to quantify "hax" in battle. From those discussions, we developed an algorithm that could be evaluated at the completion of a battle that would quantify if the winner received an inordinate amount of luck during the battle.

I then coded that algorithm into a special clause in Shoddy Battle -- I call it the "Anti-Hax Clause". The clause does not alter ANY game mechanics whatsoever. It simply computes a numeric representation of how luck affected the battle outcome. Depending on the magnitude of that number, it can determine the "True Winner" of the battle. I have a working version of the clause right now.

In the discussion presented here in PR, one complication has arisen that we did not foresee -- the possibility that players may use detailed knowledge of the win formula to actually game the battle outcome. If that were the case, it would defeat the whole purpose of calculating a "true winner" in the first place -- since the formula is intended to be an objective evaluation of normal battle strategy. If the formula actually became part of a battler's play strategy -- then it would likely alter the quality of the calculation itself. It's not unlike Heisenberg's Uncertainty Principle. I'm not quite sure how to handle this possibility -- but the authors of the formula are discussing this VERY actively right now. Expect to see more information on this in the near future.

For now, I won't go into all the specifics of the formula. Actually, I'm not even the right person to explain most of the math behind this thing. But, I am familiar with all the general functional elements, so I'll present those here. The clause utilizes the following battle data to compute a "Win".

Pokemon left standing at the end of the battle
This is, by far, the single biggest factor for determining the winner -- as it should be. Without the formula, this is the only factor for determining a winner. However, the number of pokemon remaining is very significant in the formula. For example, it is mathematically impossible for the formula to determine that a player with six pokemon remaining, lost the battle. In fact, it's also impossible for a 4-0 or 5-0 result to be a loss. I'm fairly sure that 3-0 can't be lost either, but I think Caelum recently proved that there is some remote mathematical chance that it could *technically* occur. But, I think there's a higher probability that lightning will strike your computer and electrocute you. So, for all practical purposes, the formula is only significant in cases where the battle is close. Most likely, a 1-0 or 2-0 battle.

Rating probability of one player to beat the other player
This is all based on the stuff X-Act presented in the OP. Caelum has taken that initial formula and made some changes, but I'm not sure how much he changed it or why. I'll let him explain that part of it, if he has time later.

Offensive move status effects
Certain moves have small chances to impart status on the target -- like Thunderbolt has a 10% chance to Paralyze the target. This portion of the formula is primarily driven by the type of status, the number of times it occurs, the percentage chance that it could occur, and the effect that it has on the target (for example, Burning a Guts pokemon with Flamethrower isn't exactly "bad luck" for the target. It is bad luck for the attacker). Moves whose purpose is to impart status (like Thunder Wave) are not counted in this part of the formula.

Offensive move boosts
This is similar to the move status component, but is computed from moves that can cause stat boosts and/or drops (like when Meteor Mash gives Metagross an attack boost). This portion of the formula is primarily driven by the stat or stats changed, the number of times it occurs, the percentage chance that it could occur, and the effect that it has on the attacker and/or target based on pokemon type and build. Moves whose purpose is for boosts or drops (like Swords Dance or Screech) are not counted in this part of the formula.

Evasion and Accuracy
Actually, this is two separate components -- "Lack of accuracy" and "Lack of evasion" -- but most people think of these as one thing. Basically, these parts of the formula evaluate hits and misses -- if you miss your opponent when you "should" have hit, and if your opponent hits you when they "should" have missed. I italicize "should" because it is all based on the normal mathematical probability of a hit, and then it is compounded with multiple uses of moves. So, if Fire Blast misses once, that is not particularly significant in terms of hax. But, if Fire Blast misses three times in a row? --- that's crummy luck for the attacker. Likewise, if you get hit by Hypnosis four times in a row -- that's crappy luck for the target.​
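To put rough numbers on the streak intuition above, here is a minimal sketch. It is not the clause's actual code: `streak_probability` is a hypothetical helper, and the accuracy values (85% for Fire Blast, 70% for Hypnosis) are assumptions used only for illustration. Each use of a move is treated as an independent roll.

```python
def streak_probability(p_event: float, streak_length: int) -> float:
    """Chance that an independent event with probability p_event
    happens streak_length times in a row."""
    if not 0.0 <= p_event <= 1.0:
        raise ValueError("p_event must be between 0 and 1")
    if streak_length < 0:
        raise ValueError("streak_length must be non-negative")
    return p_event ** streak_length


# Assumed accuracies, for illustration only.
FIRE_BLAST_ACCURACY = 0.85
HYPNOSIS_ACCURACY = 0.70

one_miss = streak_probability(1 - FIRE_BLAST_ACCURACY, 1)
three_misses = streak_probability(1 - FIRE_BLAST_ACCURACY, 3)
four_hypnosis_hits = streak_probability(HYPNOSIS_ACCURACY, 4)

print(f"One Fire Blast miss:        {one_miss:.4f}")
print(f"Three Fire Blast misses:    {three_misses:.4f}")
print(f"Four Hypnosis hits in a row: {four_hypnosis_hits:.4f}")
```

Under these assumptions a single miss is an ordinary event, while three misses in a row is rare enough that a formula could reasonably treat it as notable bad luck.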

Those are the factors that we have firmed up so far, but there are a few other very important elements that will be added soon. We are trying to figure out a way to numerically weight the impact of significant critical hits in battle -- but that's proving quite hard to nail down. We'll get it sorted out, but it's hard. Also, we will likely have a separate component for quantifying streaks of full paralysis and self-hits of confusion. We just haven't determined the exact mathematical weighting to assign to those. We're also debating assigning a positive or negative value to hax created by certain items like Brightpowder or Scope Lens. I'm personally not thrilled about this part of the formula, but I'll save that rant for later. It may not even make it into the clause anyway.

We're also very likely to include a component based on a suggestion from Brain. Here is Brain's initial proposal of an idea for an "Alternate damage accumulator", since he can explain it better than I can:

Brain said:
Basically, you want to weight wins according to the expectation of the score rather than the real (noisy) score. Some progress can be made in that direction simply by taking the expectation of damage rather than the real damage towards the win statistic. For every pokemon, you have two HP meters: one that works as normal and one that is depleted at a rate corresponding to the expectation of damage. For example, if you have an attack that can do 30, 60, 90 or 120 damage with uniform probability, regardless of what is truly dealt, you count the mean (75 damage) on the "haxless" hp meter. If a pokemon is confused, you always deal half of the normal damage to the foe and half of whatever a pokemon hitting itself does. An asleep pokemon would deal damage proportional to the probability that it would wake up that turn. The game still goes on normally, but for every pokemon you tally an additional hp meter that is always depleted by the expectation of damage (maybe augmented by its variance). Even if a pokemon faints, you still count the expectation of the damage it would have dealt, multiplied by the probability that it would have survived. At the end of the game, you simply discard all the real hp meters and you use the "expected" meters to compute who "should" have won.

This system already handles a lot of sources of "hax" naturally: critical hits, misses, paralysis, confusion, etc. It doesn't handle random burn and freeze, stat boost "hax" etc. but in theory you could also have a probability distribution over status and stat boosts, which would cover them.

The important point that needs to be made is that the system gives us greater confidence in who carried out a particular battle with the most skill and "repairs" statistical errancy due to stochastic aspects of the game.
Brain has been busy lately, so we haven't formalized this part of the clause yet. I have built a basic accumulator that works like he described, but I need to get some more details from the other authors before I actually drop this code into the formula.
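The two-meter idea in Brain's proposal can be sketched roughly as follows. This is a simplified illustration, not the accumulator that was actually built: `expected_damage` and `DualHpMeter` are hypothetical names, and the sketch ignores the confusion, sleep, and "would have survived" adjustments Brain mentions.

```python
def expected_damage(outcomes):
    """Mean damage of an attack given (damage, probability) pairs.

    The probabilities are expected to sum to 1.
    """
    total_probability = sum(p for _, p in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(damage * p for damage, p in outcomes)


class DualHpMeter:
    """Tracks the real HP and a 'haxless' HP for one pokemon."""

    def __init__(self, max_hp):
        self.real_hp = float(max_hp)
        self.expected_hp = float(max_hp)

    def take_hit(self, actual_damage, outcomes):
        # The real meter loses whatever was actually rolled.
        self.real_hp = max(0.0, self.real_hp - actual_damage)
        # The haxless meter always loses the expectation instead.
        self.expected_hp = max(0.0, self.expected_hp - expected_damage(outcomes))


# Brain's example: 30, 60, 90 or 120 damage with uniform probability.
uniform_attack = [(30, 0.25), (60, 0.25), (90, 0.25), (120, 0.25)]

meter = DualHpMeter(max_hp=300)
meter.take_hit(actual_damage=120, outcomes=uniform_attack)  # a high roll

print(expected_damage(uniform_attack))  # 75.0
print(meter.real_hp, meter.expected_hp)  # 180.0 225.0
```

The gap between the two meters at the end of a battle is the kind of quantity the formula could then weigh when deciding who "should" have won.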


I have opened a private server for some basic playtesting to work out a few kinks with the clause. I am giving the server address and port to a few PR members who seem like they are analytical and open-minded to the basic premise of this whole endeavor. I'm not disclosing the address publicly, since I have not built this server to handle a bunch of random public traffic. But, expect to see feedback in this thread from some of the playtesters. I'll fix the bugs as soon as they are found, and then we'll decide what we want to do with all this data we are now able to collect.
 
Let me get one thing straight first: I am totally opposed to this formula. I find it completely ridiculous that we are going against what happens in link battles. In link battles you can get lucky, and if you win through luck, you don't get a little message saying you lost. If we are trying to replicate link battles, surely we need to use the same route, and say that a win is a win, regardless. Whatever anyone says, I seriously doubt anybody can come up with an argument for changing what we have always gone for in trying to "replicate link battles". Honestly, I didn't want to get involved with this potential change but as far as I am concerned it has seriously gone too far.

Due to my annoyance, I asked Doug for access to this additional server to get a further insight into this. I guess it seemed okay as I haven't really seen any hiccups so far, but I still vehemently protest against this formula. To just change a win to a loss (regardless of how you put it, that's how I see it) is ridiculous and shouldn't be done.
 

august

you’re a voice that never sings
After gaining access to Doug's private server, my opinion still stands that this is the right way to go. I fought around 5 or 6 battles on the ladder system Doug set up, and watched the other battles that occurred, and the formula isn't even gamebreaking; and this is coming from one of the few users who's had the win-loss outcome changed to his opponent's favor.

I really don't see why this is upsetting people so much. Luck may be a naturally occurring effect, but that doesn't mean luck isn't 2 sided. Pokemon is a game to most people, but when you are playing competitively against someone you are considerably better than and lose to something like 4 straight Fire Blast misses, I'm sure that you would be rather angry too.

Lastly, very nice job on the formula and variables so far to Doug, Brain, caelum, and X Act and anyone working on this behind the scenes. I can't wait to see the final outcome.
Honestly, I didn't want to get involved with this potential change but as far as I am concerned it has seriously gone too far.
How? Changing a game that isn't based around luck to benefit someone that obviously outplays the other player but loses from hax isn't going too far in my book.
 

jrrrrrrr

wubwubwub
and the formula isn't even gamebreaking; and this is coming from one of the few users who's had the win-loss outcome changed to his opponent's favor.
I don't know how getting a win when you lost isn't gamebreaking....

I really don't see why this is upsetting people so much. Luck may be a naturally occurring effect, but that doesn't mean luck isn't 2 sided. Pokemon is a game to most people, but when you are playing competitively against someone you are considerably better than and lose to something like 4 straight Fire Blast misses, I'm sure that you would be rather angry too.
Luck is a naturally occurring effect in the game, which is evidenced by the fact that it occurs in the game. And you're right about luck not being two sided. It is on the side of the better player.

How? Changing a game that isn't based around luck to benefit someone that obviously outplays the other player but loses from hax isn't going too far in my book.
1) The game is based around luck.

2) How can a computer determine who outplays the other person? Some decisions that would look retarded to a computer have won me countless matches. Things like bluffing items and making sacrifices to get free switches are all kinds of strategies used that would not be accounted for in a formula like this. "I would have won 3-0, but I sacrificed and then accidentally got lucky on the last turn so I lost instead" is not something I want to deal with, personally.

3) What is this obsession with people that "should have" won? Last I checked, if you "should have won" and lost, then you still lost. If you irresponsibly place yourself in a situation where even one crit could cost you the match, you "should have" been aware of that possibility and avoided the situation...

By utilizing more detailed statistical data gathered during battles, we can actually keep very close track of how much luck affects the final outcome of a battle.
I'm looking forward to this data, but I want to at least look at the data before we seriously try to adopt a formula like this. (This isn't directed at you, djd; it's just more of a thinking out loud thing.)

I then coded that algorithm into a special clause in Shoddy Battle -- I call it the "Anti-Hax Clause". The clause does not alter ANY game mechanics whatsoever. It simply computes a numeric representation of how luck affected the battle outcome. Depending on the magnitude of that number, it can determine the "True Winner" of the battle. I have a working version of the clause right now.
Altering the win condition is quite the change in game mechanics. It's pretty important that we keep "true winner" in quotation marks.
Pokemon left standing at the end of the battle
This is, by far, the single biggest factor for determining the winner -- as it should be. Without the formula, this is the only factor for determining a winner. However, the number of pokemon remaining is very significant in the formula.
The only way that the amount of pokemon left standing should have an impact on who wins is if one of us has 0, in which case that person loses.

Rating probability of one player to beat the other player
This is all based on the stuff X-Act presented in the OP. Caelum has taken that initial formula and made some changes, but I'm not sure how much he changed it or why. I'll let him explain that part of it, if he has time later.
"ladder rank means nothing". Maybe I am looking at the ratings the wrong way, but to me, they are not a measure of how good you actually are...they are a measure of how good you have been in the past. If I make an alt and start at 100 rating and beat people with ~1200 ratings like I "should" based on my other accounts, how is that fair to them? It could completely screw up those people that have lower ratings.

Offensive move status effects
Certain moves have small chances to impart status on the target -- like Thunderbolt has a 10% chance to Paralyze the target. This portion of the formula is primarily driven by the type of status, the number of times it occurs, the percentage chance that it could occur, and the effect that it has on the target (for example, Burning a Guts pokemon with Flamethrower isn't exactly "bad luck" for the target. It is bad luck for the attacker). Moves whose purpose is to impart status (like Thunder Wave) are not counted in this part of the formula.
I'm assuming this is what flinches would fall under..so...if I were to attempt a bulky paraflinch Togekiss, which is a completely legit strategy, would all of those FPs and flinches that I am accumulating wind up working against me in the end if the match was close? Zapdos has a decent chance of losing to a bulky TWave/Air Slash/Roost Togekiss, but if I'm reading this right, that legit strategy could be compromised..especially if the match is close in the end.

Offensive move boosts
This is similar to the move status component, but is computed from moves that can cause stat boosts and/or drops (like when Meteor Mash gives Metagross an attack boost). This portion of the formula is primarily driven by the stat or stats changed, the number of times it occurs, the percentage chance that it could occur, and the effect that it has on the attacker and/or target based on pokemon type and build. Moves whose purpose is for boosts or drops (like Swords Dance or Screech) are not counted in this part of the formula.
Again, would this punish users who purposely attempt to abuse moves for the "luck value"? There is a Charge Beam Jolteon peer edit posted right now that clearly abuses that move for the offensive boost. Would such a set become a disadvantage if this formula were implemented?

Evasion and Accuracy
Actually, this is two separate components -- "Lack of accuracy" and "Lack of evasion" -- but most people think of these as one thing. Basically, these parts of the formula evaluate hits and misses -- if you miss your opponent when you "should" have hit, and if your opponent hits you when they "should" have missed. I italicize "should" because it is all based on the normal mathematical probability of a hit, and then it is compounded with multiple uses of moves. So, if Fire Blast misses once, that is not particularly significant in terms of hax. But, if Fire Blast misses three times in a row? --- that's crummy luck for the attacker. Likewise, if you get hit by Hypnosis four times in a row -- that's crappy luck for the target.
If this formula does work, this component of it would surely eliminate the need for Evasion Clause, OHKO Clause in addition to the removal of Freeze Clause via the "move effects" section. Those changes would be exciting, I'll admit.
We're also debating assigning a positive or negative value to hax created by certain items like Brightpowder or Scope Lens. I'm personally not thrilled about this part of the formula, but I'll save that rant for later. It may not even make it into the clause anyway.
I'm also not thrilled about this part of the formula. If someone uses Brightpowder or Scope Lens, it is obvious that luck is purposely being used as part of a strategy. Negative values for items like this would punish people for using perfectly viable strategies, even if those strategies are inferior to others on a competitive level.

I have opened a private server for some basic playtesting to work out a few kinks with the clause.
I look forward to testing this formula and hearing the results ^__^
 

reachzero

the pastor of disaster
After a little bit of "laddering" on the hax-free server, I've found the results rather anticlimactic. Not as many battles come down to 2-0 or 1-0 as you would think, and none of my 7-8 battles so far had enough hax to trigger the hax clause; one battle had a significant critical hit and my Tyranitar was burned switching in to a Blissey Flamethrower, and the outcome was not changed (I lost). Assuming that the formula is functioning properly (and I certainly trust Doug/X-Act/Caelum that it is), this is hardly the earth-shaking formula of doom that it's being made out to be by some here.
 
After a little bit of "laddering" on the hax-free server, I've found the results rather anticlimactic. Not as many battles come down to 2-0 or 1-0 as you would think, and none of my 7-8 battles so far had enough hax to trigger the hax clause; one battle had a significant critical hit and my Tyranitar was burned switching in to a Blissey Flamethrower, and the outcome was not changed (I lost). Assuming that the formula is functioning properly (and I certainly trust Doug/X-Act/Caelum that it is), this is hardly the earth-shaking formula of doom that it's being made out to be by some here.
It's all well and good to trust that the formula is working as intended, but I have to ask: have you actually seen the formula and understand it? Because I don't know about you, but I personally would never even consider battling competitively unless I knew EXACTLY what was required of me to win a match. For Pokemon as it is now, this is easy: faint all the opponent's Pokemon first. But with all the extra complications brought about by such a formula, I would have no choice but to scrutinize it for every detail so that I knew what condition had to be met to win, as satisfying the old condition would no longer be a 100% guarantee.

Simply put, I need to see this formula in its entirety before I can lend any credence to the idea. I'm particularly curious as to how the bias towards closer matches can at all be justified.
 
I've been following this thread for a while, and I would like to say that maybe it would be best if, instead of having the formula negate and reverse a win, it just negated a loss. Speaking purely from a laddering standpoint, it is ridiculous to be up high (1650 up-ish) and then lose to a bunch of bullshit freezes that drops your rating down an hour's work. I know I would like laddering so much more if that didn't always happen to me.

From the small bits I've played on the private server I'm highly inclined to agree with reachzero. I haven't seen a single match overturned yet in maybe 15-20 battles. In theory that means it's working, since nobody (except me! maybe!) gets badly haxxed enough for the formula to overturn a match every 20 battles or so.
 
It's all well and good to trust that the formula is working as intended, but I have to ask: have you actually seen the formula and understand it? Because I don't know about you, but I personally would never even consider battling competitively unless I knew EXACTLY what was required of me to win a match. For Pokemon as it is now, this is easy: faint all the opponent's Pokemon first. But with all the extra complications brought about by such a formula, I would have no choice but to scrutinize it for every detail so that I knew what condition had to be met to win, as satisfying the old condition would no longer be a 100% guarantee.

Simply put, I need to see this formula in its entirety before I can lend any credence to the idea. I'm particularly curious as to how the bias towards closer matches can at all be justified.
I feel you are exaggerating the formula's ability to snatch a win from a player. After playing a few matches on the private server, I too am supporting this formula's implementation. Nothing is really changing apart from games that are really haxy. Hax effects that are expected to happen will not be adversely affected by the formula. I, like Elevator music and reachzero, have not seen a single match overturned because of the formula, so all complaints against it are somewhat unnecessary and invalid as of now. It seems that the formula is working fine now, and with the continuing development of it by Doug / X-Act / Caelum / Brain, it should perform much better in the future.
 

jrrrrrrr

wubwubwub
I've been following this thread for a while, and I would like to say that maybe it would be best if instead of having it so the formula negates and reverses a win, it should just negate a loss.
If your hax loss doesn't count, why would it make sense for a hax win to count? This is completely self-serving, basically a safety net so that someone can leave themself open and make poor moves to cost them the match in the end. Putting yourself in a situation where you are likely to get haxed is not a good strategy, dudes.

Speaking purely from a laddering standpoint, it is ridiculous to be up high (1650 up-ish) and then lose to a bunch of bullshit freezes that drops your rating down an hour's work. I know I would like laddering so much more if that didn't always happen to me.
"I don't want to lose points when I lose" isn't really a valid argument when you know the risks are there in the first place...believe me, I know that losing to luck is a pain in the nuts but seriously, it's part of the game that you play. If you got a bunch of bad rolls in a row in Monopoly, would you request to pay a lower rent to your opponents? The bad luck we experience only occurs when you play the game for a large number of matches. A pokemon match takes like 5 minutes, tops, and most of the people in this forum have probably done 2 or 3 matches at the same time; all of those matches add up. People perceive more luck than there actually is because the sample size of matches is deceptively large. I'm all for a serious, fair way to go about something like this..but let's be real here and use a tactic other than "I dont wanna lose points :'(".
 

Jimbo

take me anywhere
I tried laddering on the new server too and I don't think it was very different at all. None of my battles were overturned in either direction. Having this formula installed actually put my mind at ease. Like I said before, I usually feel really bad when haxxing people, bad enough to make me leave the room, because I know how hard it is to ladder sometimes. One specific battle where I was thankful for this formula was against August. I ended up crit Bullet Punching his Infernape, basically winning me the game.

Usually I would've left the game since it was kind of lame, but I wanted to see what would happen. The game wasn't overturned, and it made me feel a bit better knowing that the "math" said the hax wasn't crucial enough.
 

Blue Kirby

Never back down.
I will admit that I was skeptical, but I have now logged some battles on the new test server, and I had a pretty similar experience to Jimbo. No battles were turned over against me, and I did get a win awarded where my opponent (not naming names) agreed it was deserved. In that instance, I had been cleaning up until a timely freeze followed by two critical hits in coming turns. I'm going to keep at this for a bit, and I'll post some logs a little later.

EDIT: Some more people need to get on the server so we can play!
 

Tangerine

Where the Lights Are
An obligatory, but really late response

First, I can sum up jrrrrrrr's post with the statement that "Better Players are more resistant to luck". That statement is obviously true, and I don't think anyone can deny that - Better players are more able to create situations more advantageous for them with more consistency. However, no matter how good you are, you can't make yourself completely immune to luck. What are you going to do if a Togekiss Encores and then uses Ancient Power twice, only to find out that it got its boost twice in a row and your Zapdos is now dead? Or you can argue "the better player will go to Skarmory and roar it out after the second Ancient Power", but that results in a crippled Skarmory which means that you're now more vulnerable to physical attacks, congratulations!

There are just happenstances in the game that you can't do anything about - those are the situations I speak of, and that is what I mean by "luck". It's obvious our definitions differ - especially by your statement regarding manipulating luck - but we can define luck by some measurement of deviation from the expected outcome created by ingame mechanics.

To people saying we should be rewarding "luck based" strategies - why should we? Secondly, why should we consider banning things when this formula fixes the problem? After all, what we are interested in is this - a rating system that will measure how good the player is, not how lucky they are or how much of a streak they can get by playing poor players.

Instead of implementing this formula, why don't we just invoke a "hax clause" that removes all haxes from ranked matches? It would be much easier than this IMO, and at least it would be consistent and fair to players on every level of the ladder.
First, you assume that the formula won't be fair and consistent. The formula is fair - no matter how you put it - and consistent. The question is whether or not the formula is relevant.

Secondly, the hax clause would remove risk/reward on many moves - what if someone prefers Fire Blast? What would the hax clause do then? With this formula, we're able to handle this - we can find the expected number of hits for a given move, and compare that with the actual number, to find out how wide the gap is and use that to show "how much luck influenced this battle".
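A rough sketch of that expected-versus-actual comparison, assuming each use of the move is an independent roll. The function name `hit_luck_gap` and the 85% accuracy figure are assumptions for illustration only; the real formula's weighting has not been published in this thread.

```python
import math


def hit_luck_gap(accuracy: float, attempts: int, actual_hits: int):
    """Compare actual hits with the expected number of hits.

    Returns (expected_hits, gap, z_score), where gap is actual minus
    expected and z_score expresses the gap in binomial standard deviations.
    """
    expected_hits = accuracy * attempts
    std_dev = math.sqrt(attempts * accuracy * (1.0 - accuracy))
    gap = actual_hits - expected_hits
    z_score = gap / std_dev if std_dev > 0 else 0.0
    return expected_hits, gap, z_score


# Assumed example: an 85%-accurate move used four times, missing every time.
expected_hits, gap, z_score = hit_luck_gap(accuracy=0.85, attempts=4, actual_hits=0)
print(f"expected {expected_hits:.2f} hits, gap {gap:+.2f}, z = {z_score:.2f}")
```

A gap near zero means the move behaved about as its accuracy predicts; a large negative gap is the kind of deviation that would count as bad luck for the attacker.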

Needless to say, the formula is looking quite good - I've been running around with an Ancient Power Togekiss team (as mentioned above) to see the formula in action and I think with a few tweaks, we'll be ready to test it soon :)
 

Chou Toshio

Over9000
How about, instead of a system that assigns a winner and loser, a system that throws out battles deemed "junk"? A junk battle could be defined based on certain degrees of hax, however you want to define it. For instance, if "x # Crits" + "y activations of extra-effects (f-thrower burns, shadowball sp.def drops, etc.)" + "z hits/misses by inaccurate moves (fire blast, hypnosis etc.)" surpasses some threshold, or if said hax effects happen to a much greater degree for one player than another.

Anyway, probably using a more sophisticated and detailed equation(s)/measure(s)/criteria, a "junk" battle could be outlined however we deemed fit and the result would be to simply not include them for rating purposes.
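As a minimal sketch of the "junk battle" idea, assuming the simplest possible scoring (every hax event counts as one point). The function names and both threshold values are hypothetical placeholders, not anything proposed by the formula's authors.

```python
def hax_score(crits: int, extra_effects: int, accuracy_upsets: int) -> int:
    """Unweighted count of hax events that went in one player's favor."""
    return crits + extra_effects + accuracy_upsets


def is_junk_battle(score_a: int, score_b: int,
                   total_threshold: int = 6, gap_threshold: int = 3) -> bool:
    """Flag a battle as junk if there was too much hax overall,
    or if the hax was too lopsided toward one player."""
    total = score_a + score_b
    gap = abs(score_a - score_b)
    return total > total_threshold or gap > gap_threshold


# Player A got four crits and one burn; player B got nothing.
lopsided = is_junk_battle(hax_score(4, 1, 0), hax_score(0, 0, 0))

# One crit each, nothing else.
ordinary = is_junk_battle(hax_score(1, 0, 0), hax_score(1, 0, 0))

print(lopsided, ordinary)  # True False
```

A flagged battle would simply be left out of the rating update rather than having its result reversed.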

While I am making a totally subjective statement here, I think I (and others out there) would be happier knowing the equation was throwing out battles rather than assigning winners and losers.

In any case, while we might try to say how much is or is not significant hax, the fact is that whether hax was significant or not is a case-by-case occurrence that cannot be measured by numbers alone. For instance, you can count # of crits, but there is a big difference between scizor's bullet punch critting/not critting a heatran switch in, or scizor critting/not critting a +1dd salamence. You can count misses, but there's a big difference between Rotom's W-O-W hitting/not hitting a Blissey switch in, and Mix-Mence's Draco Meteor hitting/not hitting a +2ATK B-Pass Gliscor. Is it possible for the equation to read these kinds of situational differences?
 

Ancien Régime

washed gay RSE player
This formula has worked better than I would have thought. From what I can tell, the hax has to be truly *blatant* to actually overturn a win or loss. I still think it should apply only to ladder ranking, but the chaos others have predicted does not seem to be there. I'm not going to go out and say "implement it on the main server right away", but I really do believe that it is workable.
 

tennisace

not quite too old for this, apparently
Jimbo invited me to try the new server, and like most, I was a bit skeptical at first. However, in our first battle I critted him like 4 times and froze him at a key point. I really didn't deserve the win because he could have easily swept me.

That right there got me thinking. Tennis is considered a competitive sport, correct? In tennis, pros play in nice sunny weather outside, with perfect courts. However at my school, we play in windy, cold weather, and there are cracks all over the courts. What's happened before is that balls get blown all over the place by wind, no matter what you do during the point. The same goes for cracks; I can play the point perfectly, rally to the point where I'm about to finish the opponent, and then still lose due to a lucky shot and a luckier bounce on a crack. Why shouldn't we fix the proverbial cracks and make it completely fair for everyone?

To those that say we aren't simulating link battles: sleep clause and evasion clause aren't either. What makes this different exactly? The main argument for those two clauses is that it makes the game come down to hax and makes the game "less competitive".
 

Jimbo

take me anywhere
I was actually kind of glad to have that battle with you, tennisace, because I wanted to see the formula in action (that was the first time a battle was overturned in one of my games). It got me thinking though: if that had been a real battle on Smogon Uni and I had been trying to ladder, I would've actually been really mad (I mean come on, you froze my Tyranitar with Tri Attack from Porygon-Z ._.). I completely agree with tennisace that if there are imperfections in the game (his "cracks") we should fix them.
 

DougJustDoug

Knows the great enthusiasms
Several hours ago, I updated the Anti-Hax clause on the private server. I apologize for rebooting the server without warning, and disconnecting a few of you with battles in progress. Next time, I'll be sure to wall a warning to everyone before I bring the server down.

I updated the clause with the following "finalized" variables:

Alternate Damage Accumulator
This is the damage counter proposed initially by Brain. We worked out a few areas that were vague, and we now get a tally of the normalized damage that would be accumulated by each team. The delta between the regular damage counter and the alternate damage counter is now one of three "starting points" in calculating a win (along with the Pokemon Remaining and the Rating Probability). Read my earlier quote of Brain's proposal for more information on this accumulator.

Significant Critical Hits
This was a tough one to get sorted out from the data. We did not want to count all criticals equally, because if you get a crit on a pokemon with 1% health, the critical is completely irrelevant. So for hax weighting purposes, we wanted to count only criticals that were "significant", meaning they did meaningfully more damage to the pokemon than a non-critical hit would have. We added additional significance weighting for criticals that occur against a pokemon with defensive boosts, since critical hits bypass those boosts. The hard part was adding a bunch of data monitors on the battle to track non-critical damage. It turned out that I could reuse some of the stuff from the damage accumulator, so once we figured out the right weighting, the coding was easier than I thought it would be.

As you can see from the battle result between Jimbo and Tennisace -- I guess the "significant criticals code" has already come into play during testing. I'll be sure to go back through that log and verify the numbers. Thanks for posting, guys.
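
To illustrate the "significant critical" idea, here is a rough Python sketch. It is not the server's actual code, and the real weighting is not given in this post. The `CritEvent` fields, the `crit_significance` function, and the `boost_bonus` value of 0.25 are all made up for the example. The assumption is that a crit only matters by the extra HP it removed compared to the same hit without the crit, with some extra weight per defensive boost stage it bypassed.

```python
from dataclasses import dataclass


@dataclass
class CritEvent:
    """One critical hit, with the data needed to judge it (hypothetical fields)."""
    crit_damage: int        # damage the critical hit rolled, before capping at remaining HP
    non_crit_damage: int    # damage the same hit would have done without the crit
    target_hp_before: int   # target's HP just before the hit
    target_max_hp: int      # target's maximum HP
    defense_boost_stages: int = 0  # defensive boost stages the crit ignored


def crit_significance(event: CritEvent, boost_bonus: float = 0.25) -> float:
    """Weight a crit by the extra fraction of max HP it removed.

    boost_bonus is an assumed value, not a number from the real formula.
    """
    # Damage actually inflicted can never exceed the HP the target had left.
    actual = min(event.crit_damage, event.target_hp_before)
    expected = min(event.non_crit_damage, event.target_hp_before)
    # Extra HP lost because of the crit, as a fraction of max HP.
    extra = (actual - expected) / event.target_max_hp
    # Additional weighting when the crit bypassed defensive boosts.
    bonus = 1.0 + boost_bonus * max(event.defense_boost_stages, 0)
    return extra * bonus
```

With this sketch, a crit on a pokemon at 1% health scores zero, because the non-critical hit would have knocked it out anyway, while a crit that doubles the damage on a healthy, boosted pokemon scores much higher.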

Item Hax
I've been opposed to this variable from the very beginning -- since I think that hax items are largely unused, and if they are used, they are strategic. I really don't want the formula to discourage the use of any item, even those that are deemed "cheap". But, I was outvoted on this one, and I think we reached a decent compromise. The formula takes into account the number of times certain items "fire" and affect the battle. Stuff like Brightpowder, Scope Lens, Quick Claw (not implemented currently, but if it gets done, it's counted here), and King's Rock are the sort of items tracked in this variable.

We placed a very low weighting on most items, since the effect of the item usually causes other variables to be impacted. For example, if Brightpowder causes a miss, then the miss will impact the "Lack of Accuracy/Evasion" variables. I don't think it's necessary to "double count" the miss by tracking the item trigger. But, by shifting the numbers around and tracking two sets of RNGs in most cases where items are in play, I think we have a nice compromise that does not discriminate unfairly against someone using a "cheap item" -- while still applying a proper correction for situations where the item triggers an unusually large number of times with respect to the item's percentage chance to trigger.
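
A rough Python sketch of the "more triggers than the percentage chance predicts" idea follows. It is an illustration only, not the server's code. The function names, the 0.1 weight, and the example numbers are assumptions, and the sketch does not model the two sets of RNGs mentioned above.

```python
def item_trigger_excess(triggers: int, opportunities: int, chance: float) -> float:
    """Triggers beyond what the item's chance predicts (never negative)."""
    if not 0.0 <= chance <= 1.0:
        raise ValueError("chance must be between 0 and 1")
    if triggers > opportunities:
        raise ValueError("an item cannot trigger more often than it had the chance to")
    expected = opportunities * chance
    # Only an unusually high trigger count is corrected; bad luck with an item is ignored.
    return max(0.0, triggers - expected)


def item_hax_score(records, weight: float = 0.1) -> float:
    """Sum the excess triggers over all tracked items, scaled by a low weight.

    records is a list of (triggers, opportunities, chance) tuples.
    The weight of 0.1 is an assumed stand-in for the "very low weighting".
    """
    return weight * sum(item_trigger_excess(t, n, p) for t, n, p in records)
```

For example, an item with an assumed 10% chance that fires 3 times in 10 opportunities has an excess of 2 triggers, while the same item firing once or not at all contributes nothing.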

Full Paralysis, Flinch, Infatuation, and Confusion Immobility
We lump all of these together into one variable. It tracks the number of times a pokemon is unable to act due to the conditions listed above. We have a slightly larger multiplier on Confusion self-hits, since it not only immobilizes the pokemon, but it also does damage. Flinches caused by Fake Out are not counted.
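
The lumped immobility variable could be tallied along these lines. This is a minimal Python sketch under stated assumptions, not the actual implementation: the event format, the cause names, and the 1.5 confusion multiplier are invented for the example, since the post only says the multiplier is "slightly larger".

```python
# Causes that count as "unable to act" in this sketch (names are hypothetical).
IMMOBILITY_CAUSES = {"full_paralysis", "flinch", "infatuation", "confusion"}


def immobility_score(events, confusion_multiplier: float = 1.5) -> float:
    """Tally lost turns from a list of (cause, source_move) tuples.

    source_move is the move that caused a flinch, or None for other causes.
    confusion_multiplier is an assumed value for the larger confusion weight.
    """
    score = 0.0
    for cause, source_move in events:
        if cause not in IMMOBILITY_CAUSES:
            continue  # not one of the tracked conditions
        if cause == "flinch" and source_move == "Fake Out":
            continue  # Fake Out flinches are guaranteed, so they are not hax
        # Confusion self-hits also deal damage, so they weigh a bit more.
        score += confusion_multiplier if cause == "confusion" else 1.0
    return score
```

A team that suffers one full paralysis, one Fake Out flinch, one Air Slash flinch, and one confusion self-hit would score 1 + 0 + 1 + 1.5 = 3.5 in this sketch.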


Together with the variables I listed earlier, this now completes the formula used by the Anti-Hax Clause. We do not anticipate adding any new variables, but we may tweak the ones already there, depending on the results of playtesting on the private server.

To all playtesters: Please make particular note of battles where you feel "extreme hax" occurred, and make a subjective determination of whether the formula properly identifies the winner or not. If you think the formula needs adjustment, let us know and we can discuss it. I appreciate those of you that have taken the time to log on and battle, and I value your continued feedback.
 