Today I come to you with a very important announcement...and a story. This is probably the most ... important thing that has happened in quite some time...
We, Fraeven and I, believe we have isolated and fixed the infamous payload cart bug. If you're a SCG regular of any age, you know about it. The bug is pretty simple:
Occasionally a round of payload will start with the HUD reporting an incorrect position for the cart. The cart is often unable to reach the end of the track, leading to either a crash or simply an unwinnable scenario.
An example can be found here :
The bug appears to have been caused by poor programming in the donator recognition plugin: its donator array was offset, so when it tried to delete donator sprites at the end of rounds/games it deleted entities essentially at random. Through debugging, we caught the following instances of incorrectly deleted entities.
Debug - T name : zz_red_koth_timer, C name : team_round_timer
Server cvar 'mp_timelimit' changed to 25
[SCG] A fatal error has occured with the KOTH timers, the round was reset to prevent a crash!
Debug - T name : minecart_path_146, C name : path_track
Debug - T name : minecart_path_153, C name : path_track
Debug - T name : , C name : logic_auto
Debug - T name : , C name : tf_wearable
Deleting entities at random is what has caused numerous map issues. Most notable here is the first example, in which the zz_red_koth_timer entity is deleted and triggers the band-aid koth timer fix plugin.
After extensive research, this bug has been found to be related to a number of issues.
A fix has been developed and installed on both Chocolate and Rainbow that should prevent anything but the donator sprites from being deleted.
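For the curious, here is a rough Python simulation of that failure mode and of the kind of guard the fix implies. It is not the plugin's actual code (the real plugin is a Sourcemod plugin and isn't shown in this post). It assumes the stored sprite indexes were shifted by one, and every name in it (`World`, `cleanup_buggy`, `cleanup_guarded`, the `env_sprite` classname, the entity numbers) is made up for illustration.

```python
# Toy model of an entity table: index -> (targetname, classname).
# Illustrative only; this is not the real plugin or the engine API.

class World:
    def __init__(self):
        self.entities = {}

    def spawn(self, index, targetname, classname):
        self.entities[index] = (targetname, classname)

    def classname(self, index):
        ent = self.entities.get(index)
        return ent[1] if ent else None

    def kill(self, index):
        # Remove the entity and return what was deleted (or None).
        return self.entities.pop(index, None)


def build_world():
    world = World()
    world.spawn(100, "zz_red_koth_timer", "team_round_timer")
    world.spawn(101, "minecart_path_146", "path_track")
    world.spawn(102, "", "env_sprite")  # assumed donator sprite
    world.spawn(103, "", "env_sprite")  # assumed donator sprite
    return world


def cleanup_buggy(world, indexes):
    """Delete whatever the stored indexes point at, no questions asked."""
    return [world.kill(i) for i in indexes if i in world.entities]


def cleanup_guarded(world, indexes, sprite_class="env_sprite"):
    """Delete an entity only if it still looks like a donator sprite."""
    deleted = []
    for i in indexes:
        if world.classname(i) == sprite_class:
            deleted.append(world.kill(i))
    return deleted


# Assumed failure: the stored indexes are off by one, so the first entry
# points at a track node instead of a sprite.
stale_indexes = [101, 102]

buggy_world = build_world()
buggy_deleted = cleanup_buggy(buggy_world, stale_indexes)

guarded_world = build_world()
guarded_deleted = cleanup_guarded(guarded_world, stale_indexes)

for name, cls in buggy_deleted:
    print(f"Debug - T name : {name}, C name : {cls}")
```

In this toy run the unguarded cleanup removes a path_track node along with a sprite, while the guarded version leaves the track alone. A real plugin would more likely hold on to entity references rather than raw indexes, but the classname check is the simplest way to show the idea.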
Now you may be wondering: why is this so important exactly? It's happened before, you guys just reset the round, correct? This is important because this is the oldest bug SCG has ever had. It has followed us through server installs, host switches, and has been around for almost the entirety of SCG.
In the earlier days, it used to crash the server if left unchecked. This doesn't happen anymore because whenever it happens we've just immediately reset the round. This bug is similar to the koth timer problem, a problem I was able to "fix" by writing a plugin that detects a broken koth clock, resets the round automatically, then restores the time left. But due to the random occurrence of the cart bug, I never had much to go on, nor could I dedicate too much time to it.
Fast forward to today: Fraeven wanted to help understand and fix the problem, or at least develop a similar band-aid to find broken payload carts. Over the last couple of weeks, we've been doing tests here and there, poring over Valve wiki pages and copies of the TF2 source code to try and understand where things go wrong.
Our first goal was to try and recreate the bug consistently. During Friday night gaming and other nights of playing when it'd happen, I had tools running to check and see what TF2 was doing. We attempted to see what maps were causing it, under what conditions, etc.
Trial and error led to some interesting discoveries but ultimately nothing useful. Very quickly I was able to rule out it being a hardware problem, as well as a configuration problem. SCG servers have always had this problem be it Windows or Linux. I compared the server settings against a Valve stock server and found no major differences that would cause any sort of issues.
In our core tests, we've been running Chocolate with an elevated timescale, with and without bots, on all sorts of payload maps to attempt to make the bug happen. Nothing would ever happen until last night, when we finally saw it happen by ourselves. It's actually a funny story: we almost missed it entirely. I wasn't paying attention to my screen and caught it at the last moment during the waiting for players screen on pl_halfacre. With coffee in my mouth I made a silly noise and slammed down on my screenshot button:
This was big news, we made it happen finally among ourselves. Soon after, we learned how to reproduce it! And here is where things get weird...
We know the following details.
While not super precise, our method wound up being this.
While playing on koth_arctic_b3 (with two players), winning one round and then forcing a map change (using mp_timelimit 1 and mp_match_end_at_timelimit 1) to pl_halfacre produces the broken cart shown in the screenshot above during the waiting for players screen.
We did a few other small tests and found different results :
koth_arctic_b3 -> pl_upward : Cart is broken around 45% mark on the track (Also the bug persisted through waiting for players and reached a point on the track where it would no longer move)
koth_arctic_b3 -> pl_thundermountain : No problems found
koth_lazarus -> pl_upward : No problems found
koth_sawmill -> pl_upward : Cart is broken around the 10% mark on the track
Now that we had a way to reliably reproduce this problem (the bug occurred in almost 100% of tests, with 3 tests per scenario for each variation studied, such as the teams used), it was time to find out what was causing it. Very quickly we narrowed it down to a problem that occurs only on SCG with Sourcemod turned on.
The sheer entropy of the bug has always made this difficult to fix. I have toyed with the possibility of a plugin causing it before but have never discovered any glaringly wrong coding problems with any of them that would lead me to think oh yes, this is the one!
Ultimately, after removing plugins and re-adding them until we narrowed it all down, the culprit was one of the donator plugins: specifically, the one that puts sprites above people's heads at the end of a round. Removing this plugin stopped the bug from appearing in all of our tests. Adding it back in with the sprite code stripped out did the same.
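The remove-and-re-add process is essentially a search over the plugin list. A rough Python sketch of one way to do it is below, assuming a single culprit and a test that reproduces reliably. The post doesn't say in what order plugins were actually removed, so the halving strategy is an assumption; `reproduces_bug` stands in for running the map-change test by hand, and the plugin names are invented.

```python
def find_culprit(plugins, reproduces_bug):
    """Halve the set of enabled plugins until one suspect remains.

    Assumes exactly one plugin causes the bug and that the test is
    reliable enough to trust a single run.
    """
    candidates = list(plugins)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Run the test with only this half of the suspects enabled.
        if reproduces_bug(half):
            candidates = half
        else:
            candidates = candidates[len(half):]
    return candidates[0]


# Hypothetical plugin names, for illustration only.
plugins = [
    "koth_timer_fix",
    "donator_chat",
    "donator_sprites",
    "map_vote",
    "admin_tools",
]


def reproduces_bug(enabled):
    # Stand-in for the manual test: koth map -> payload map, check the cart.
    return "donator_sprites" in enabled


culprit = find_culprit(plugins, reproduces_bug)
print(culprit)
```

With a bug this random, each "test" in practice means several runs per plugin set, which is where the 3 tests per scenario come in.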
It makes sense really. The plugin is old, one of the oldest ones inside of SCG. May even be pre-SCG! It's not a commonly used one, and it isn't up to modern Sourcemod standards. The bug being random also aligns with the random number of donators on at a given time.
"hmm. i know its been a while since anybody's posted, but i have a persistent map bug that seems to only effect payload and doomsday maps. im searching for the problem plugin, and only have about 4 donor plugins and all the Sourcemod plugins i need running."
I will admit, it's overwhelmingly embarrassing to have to talk about this with my tail between my legs. Something so annoying under my nose this whole time and never having been able to fix it. Then suddenly...fixed! Almost a decade of this nonsense...
I want to give a big thanks to Fraeven as well as anyone else who has ever attempted to pitch ideas or help me test this over the last decade. I'll have more news for you soon but I'm just gonna take a moment to relax haha