FYI

Tactical advice, How-to, Post-mortem, etc.
spideycw
Posts: 7512
Joined: Sun Jul 06, 2003 7:00 am

Post by spideycw »

VOODOO DAAAAAAANCE
I'm sorry I don't remember any of it. For you the day spideycw graced your squad with utter destruction was the most important day of your life. But for me, it was Sunday
Idanmel wrote:QUOTE (Idanmel @ Mar 19 2012, 05:54 AM) I am ashamed for all the drama I caused, I have much to learn on how to behave when things don't go my way.

My apologies.
MrChaos
Posts: 8352
Joined: Tue Mar 21, 2006 8:00 am

Post by MrChaos »

* removed for unloading on a specific person, and apparently misunderstanding what the hell the post meant *

Updated by: Cary Wiedemann at Oct 07 00:01

Hello again David,

First please let me again apologize for the delay in communicating our further efforts to stamp out this trouble completely. Please be assured that we have been and will continue to actively work on this issue until it is no longer service impacting.

Additionally, I have just read over this entire case from the beginning and agree that the support you've received has been dismal. I assure you that the service you've received on this ticket is far from our usual prompt and thorough solutions. The trouble in this ticket is caused by the multitude of perplexing factors that this individual case presents, namely intermittent networking issues from certain locations at specific times. This is certainly not an excuse for the way this ticket has been handled but please understand that the myriad of symptoms and involvement of our core networking equipment has resulted in a greatly delayed resolution.

For the past few weeks I have been monitoring our service reliability at the "NYIIX" internet exchange in New York City. This has been the principal focus of the initial investigation as it appeared that our connection from New York, NY to McLean, VA (where your server is located) was being saturated. The traceroutes that both you and your users have provided seemed to confirm this. However, my week-long ping tests to the ISC's (Internet Systems Consortium) NYIIX housed router showed not a single packet exceeded 50ms of latency for the entirety of the test (over one week and 600,000 packets). As this pipe is the same one shared by all other NYIIX peering it seems that something larger may be occurring.
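As a rough sanity check on those numbers, here is a small Python sketch. It is an illustration only, not the tool used for the test: the function names and the sample data are made up, and it assumes the test ran for roughly one week with lost packets recorded as `None`.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def probe_interval_seconds(packets, duration_seconds=SECONDS_PER_WEEK):
    """Average spacing between probes for a test of the given length."""
    return duration_seconds / packets


def summarize(rtts_ms, threshold_ms=50):
    """Summarize round-trip times in ms; None marks a lost packet."""
    received = [r for r in rtts_ms if r is not None]
    return {
        "sent": len(rtts_ms),
        "lost": len(rtts_ms) - len(received),
        "over_threshold": sum(1 for r in received if r > threshold_ms),
        "worst_ms": max(received) if received else None,
    }
```

Under that assumption, 600,000 packets in a week works out to about one probe per second, so a burst lasting even a few seconds should have left a mark in the results if it had crossed that path.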

I have also configured a "ping monitor" on your server and have the results currently being sent to my personal email account (and by extension my BlackBerry) in order to attempt to catch any other troubles in real time to be able to investigate what may be occurring. Thus far this monitor has produced no alerts and has been configured since 10/1. You can see this monitor and configure additional monitors by visiting the "Server Monitor" section of your myCP control panel.

I have just also gone back through the 'SolapTraces@yahoo.com' email box and the two relevant forum threads in order to re-analyze any and all traceroutes that were provided. Out of the numerous traceroutes provided only a small fraction were both usable and showed the trouble "in action." I will reproduce several of these below:

This traceroute is among the most damning as it clearly shows a spike at our NYIIX/DCA2 hop and fluctuating latency after that point:

1 47 ms 47 ms 47 ms 192.168.1.99
2 58 ms 57 ms 9 ms tkueur1.fi.elisa.net [85.156.192.1]
3 56 ms 53 ms 54 ms ge1-1-2.tkutur-p1.fi.elisa.net [139.97.9.18]
4 56 ms 67 ms 56 ms so5-3-0.helpa-p1.fi.elisa.net [139.97.29.137]
5 55 ms 56 ms 55 ms ae2.heltli-gw1.fi.elisa.net [139.97.6.246]
6 55 ms 55 ms 55 ms ae1-10.bbr2.hel1.fi.eunetip.net [213.192.191.49]
7 165 ms 164 ms 118 ms so2-3-0-0.bbr1.nyc1.us.eunetip.net [213.192.191.174]
8 790 ms 911 ms 968 ms nyc1.ge11-2.core2.dca2.hopone.net [66.36.224.209]
9 170 ms 169 ms 224 ms vl3.msfc1.distb2.dca2.hopone.net [66.36.224.245]
10 216 ms 172 ms 236 ms 66.36.240.92

This next traceroute appears to be very similar to the previous, but upon closer inspection the connection "all the way through" to your server seems to be optimal. With an actual sustained network event all hops after a problematic router will carry the latency of the affected hop, along with any new delays incurred. This traceroute seems to suggest that the delay was incurred by how long it took the router to process the UDP ping (traceroute ping) response as opposed to how long it took to forward the packet. If the latency was actually 800ms on hop #7, hops 8 and 9 should show a minimum of 800ms:

1
2 80 ms 79 ms 84 ms cr1.cmpri.uk.easynet.net [87.87.251.186]
3 75 ms 76 ms 74 ms 80.238.46.161
4 92 ms 95 ms 82 ms bu4.er10.txlon.ov.easynet.net [89.200.135.142]
5 81 ms 82 ms 81 ms bu4.gr10.bllon.uk.easynet.net [89.200.135.143]
6 149 ms 149 ms 150 ms te0-0-0-1.gr10.bwnyc.us.easynet.net [87.86.77.105]
7 * 807 ms 803 ms nyc1.ge11-2.core2.dca2.hopone.net [66.36.224.209]
8 157 ms 155 ms 157 ms vl2.msfc1.distb1.dca2.hopone.net [66.36.224.228]
9 155 ms 157 ms 155 ms 66.36.240.92
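For anyone who wants to check their own traces for this pattern, here is a minimal Python sketch. It is not from the ticket: the function names, the 200 ms margin, and the use of the median are all assumptions, and the sample is simply the easynet trace above. It flags a hop only when a later hop answers much faster, which is the sign that the delay was in the router's own reply and not in forwarding.

```python
import re
from statistics import median

# One Windows tracert hop line: hop number, three probes ("NN ms" or "*"), host.
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:\d+\s*ms|\*)\s+){3})(\S.*)$")

SAMPLE = """\
1
2 80 ms 79 ms 84 ms cr1.cmpri.uk.easynet.net [87.87.251.186]
3 75 ms 76 ms 74 ms 80.238.46.161
4 92 ms 95 ms 82 ms bu4.er10.txlon.ov.easynet.net [89.200.135.142]
5 81 ms 82 ms 81 ms bu4.gr10.bllon.uk.easynet.net [89.200.135.143]
6 149 ms 149 ms 150 ms te0-0-0-1.gr10.bwnyc.us.easynet.net [87.86.77.105]
7 * 807 ms 803 ms nyc1.ge11-2.core2.dca2.hopone.net [66.36.224.209]
8 157 ms 155 ms 157 ms vl2.msfc1.distb1.dca2.hopone.net [66.36.224.228]
9 155 ms 157 ms 155 ms 66.36.240.92
"""


def parse_hops(text):
    """Return a list of (hop_number, [latencies in ms], host) tuples."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # blank lines and hops with no data
        times = [int(t) for t in re.findall(r"(\d+)\s*ms", m.group(2))]
        if times:
            hops.append((int(m.group(1)), times, m.group(3).strip()))
    return hops


def local_only_spikes(hops, margin_ms=200):
    """Hops whose median latency is far above the best later hop.

    Latency along a path only adds up, so a hop that looks much slower
    than a hop behind it was slow to answer, not slow to forward.
    """
    flagged = []
    for i, (num, times, host) in enumerate(hops[:-1]):
        downstream_best = min(median(t) for _, t, _ in hops[i + 1:])
        if median(times) - downstream_best > margin_ms:
            flagged.append((num, host))
    return flagged
```

Calling `local_only_spikes(parse_hops(SAMPLE))` flags only hop 7. A trace with a genuinely congested link would flag nothing here, because every hop after the slow one would be slow too.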

The next traceroute is nearly identical to the one above:

1 1 ms 1 ms 1 ms 192.168.1.3
2 3 ms 1 ms 1 ms 192.168.0.1
3 27 ms 39 ms 26 ms fe0-0-c5.BG.YU.yubc.net [212.124.160.37]
4 26 ms 26 ms 32 ms ge-0-2-0-0-j0.BG.YU.yubc.net [212.124.160.62]
5 * 68 ms 29 ms YUBC-M10.telekom.yu [195.178.34.21]
6 29 ms 30 ms 35 ms 212.200.232.57
7 29 ms 23 ms 23 ms 212.200.227.249
8 83 ms 37 ms 31 ms PO9-0.bud-001-access-300.interoute.net [84.233.170.165]
9 132 ms 142 ms * xe-3-1-0-0.bud-001-score-1-re0.interoute.net [84.233.147.93]
10 131 ms 154 ms 129 ms ae2-0.prg-001-score-2-re0.interoute.net [84.233.138.213]
11 899 ms 132 ms 187 ms ae0-0.prg-001-score-1-re0.interoute.net [84.233.138.205]
12 131 ms 128 ms 141 ms ae2-0.fra-006-score-2-re0.interoute.net [84.233.138.210]
13 148 ms 129 ms 133 ms ae0-0.fra-006-score-1-re0.interoute.net [84.233.207.93]
14 131 ms 128 ms 135 ms ae1-0.ams-koo-score-2-re0.interoute.net [84.233.190.49]
15 128 ms 128 ms 132 ms ae0-0.ams-koo-score-1-re0.interoute.net [84.233.190.1]
16 133 ms 128 ms 130 ms ae1-0.lon-001-score-1-re0.interoute.net [84.233.190.58]
17 131 ms 127 ms 133 ms Gi0-0-0.lon-001-access-2.interoute.net [84.233.218.162]
18 133 ms 129 ms 127 ms PO6-0.nyc-002-access-1.interoute.net [212.23.43.149]
19 131 ms 129 ms 129 ms Gi7-0.nyc-002-access-3.interoute.net [212.23.43.138]
20 781 ms 777 ms 780 ms nyc1.ge11-2.core2.dca2.hopone.net [66.36.224.209]
21 134 ms 134 ms 135 ms vl3.msfc1.distb1.dca2.hopone.net [66.36.224.244]
22 138 ms 135 ms 135 ms 66.36.240.92

As is this one:

1 2 ms 1 ms 2 ms 192.168.1.1
2 5 ms 2 ms 2 ms 213.101.209.65
3 17 ms 4 ms 3 ms htg0-ncore-1.gigabiteth1-4.swip.net [130.244.205.125]
4 2 ms 2 ms 2 ms htg0-core-1.tengigabiteth1-0-0.swip.net [130.244.52.129]
5 2 ms 2 ms 2 ms kst-core-1.tengigabiteth8-0-0.swip.net [130.244.218.154]
6 8 ms 8 ms 8 ms gbg-core-1.pos8-0-0.swip.net [130.244.39.142]
7 25 ms 23 ms 23 ms 130.244.205.150
8 98 ms 99 ms 98 ms nyc9-core-1.pos8-0-0.swip.net [130.244.218.214]
9 777 ms 752 ms 746 ms nyc1.ge11-2.core2.dca2.hopone.net [66.36.224.209]
10 107 ms 105 ms 105 ms vl2.msfc1.distb2.dca2.hopone.net [66.36.224.229]
11 106 ms 106 ms 105 ms 66.36.240.92

By far the most telling and interesting traceroute I have seen thus far has to be this last one:

1 14 ms 9 ms 8 ms 10.42.64.1
2 22 ms 19 ms 11 ms osr01sand-v15.network.virginmedia.net [62.30.254.161]
3 17 ms 18 ms 18 ms osr02wolv-tenge74.network.virginmedia.net [62.30.254.77]
4 21 ms 15 ms 21 ms win-bb-b-ge-300-0.network.virginmedia.net [195.182.178.69]
5 21 ms 21 ms 20 ms gfd-bb-a-so-120-0.network.virginmedia.net [212.43.162.205]
6 24 ms 25 ms 21 ms gfd-bb-b-ae0-0.network.virginmedia.net [213.105.172.6]
7 19 ms 17 ms 21 ms redb-ic-1-as0-0.network.virginmedia.net [62.253.185.78]
8 114 ms 106 ms 108 ms ge1-1-9.core1.iad1.hopone.net [66.36.224.129]
9 757 ms 757 ms 769 ms ge11-1.core2.dca2.hopone.net [66.36.224.53]
10 124 ms 117 ms 105 ms vl3.msfc1.distb1.dca2.hopone.net [66.36.224.244]
11 110 ms 116 ms 105 ms 66.36.240.92

In this particular traceroute the packets enter our network at the hop "ge1-1-9.core1.iad1.hopone.net" which is a completely separate router in Ashburn, VA. From here they are sent (approximately 20 miles) to our router in McLean, VA. The trip from our router in Ashburn, VA to McLean, VA still seems to produce latency identically large to that which comes over the NYIIX connection, but this particular path doesn't go through New York at all.

The affected hop in this example is ge11-1.core2.dca2.hopone.net, and the affected hop in all of the other examples is nyc1.ge11-2.core2.dca2.hopone.net

The "ge11" in this string means "card #11" in router core2.dca2.hopone.net, which happens to be a 3x Gigabit Ethernet card. It is possible that there is something physically wrong with this card that only manifests itself when certain other conditions are met. I have already started running advanced diagnostics on this aspect and hope to have more information by tomorrow.
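To show how both affected hop names point at the same card, here is a small Python sketch. The naming scheme it assumes ("ge", then the card number, then one or more port numbers, then the router name) is inferred from the hostnames in the traces and is not a documented convention; the function name is made up.

```python
import re

# Assumed scheme: "ge<card>-<port...>.<router>", optionally prefixed by a
# label such as "nyc1.". Inferred from the traces, not an official format.
GE_RE = re.compile(r"(?:^|\.)ge(\d+)-[\d-]+\.(.+)$")


def card_and_router(hostname):
    """Return (card_number, router_name), or None if no "ge" label is found."""
    m = GE_RE.search(hostname)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)
```

Both `nyc1.ge11-2.core2.dca2.hopone.net` and `ge11-1.core2.dca2.hopone.net` come back as card 11 on `core2.dca2.hopone.net`, while the Ashburn entry hop `ge1-1-9.core1.iad1.hopone.net` comes back as card 1 on a different router.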

Please note that every affected traceroute seems to use private peering contacts instead of global Tier 1 providers. For example all traces from Level3, GLBX, or other tier 1 transit providers which hand off directly to our core routers are unaffected. The specific peering sessions which seem to be affected by this are with:
easynet.net
eunetip.net
virginmedia.net
interoute.net
swip.net
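A quick way to tally which network hands each trace off to hopone.net is sketched below in Python. This is an illustration under assumptions: the helper name is made up, "network" is naively taken as the last two labels of the hostname, and the hostnames are copied from the traces above.

```python
from collections import Counter


def handoff_network(hostnames, own_domain="hopone.net"):
    """Domain of the last named hop before the trace enters own_domain."""
    previous = None
    for host in hostnames:
        if host.endswith("." + own_domain):
            return previous
        labels = host.split(".")
        # Skip bare IP addresses; keep the last hop that had a name.
        if len(labels) >= 2 and not labels[-1].isdigit():
            previous = ".".join(labels[-2:])
    return None


# Last few hops of each affected trace, hostnames only.
TRACES = [
    ["ae1-10.bbr2.hel1.fi.eunetip.net", "so2-3-0-0.bbr1.nyc1.us.eunetip.net",
     "nyc1.ge11-2.core2.dca2.hopone.net"],
    ["te0-0-0-1.gr10.bwnyc.us.easynet.net", "nyc1.ge11-2.core2.dca2.hopone.net"],
    ["Gi7-0.nyc-002-access-3.interoute.net", "nyc1.ge11-2.core2.dca2.hopone.net"],
    ["130.244.205.150", "nyc9-core-1.pos8-0-0.swip.net",
     "nyc1.ge11-2.core2.dca2.hopone.net"],
    ["redb-ic-1-as0-0.network.virginmedia.net", "ge1-1-9.core1.iad1.hopone.net",
     "ge11-1.core2.dca2.hopone.net"],
]

peers = Counter(handoff_network(trace) for trace in TRACES)
```

With these five traces, `peers` holds one entry for each of the five networks listed above.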

If we cannot quickly resolve this trouble we can certainly start removing peering sessions from our NYIIX based router. This will cause a small increase in latency (as the routing will no longer connect directly into our network but rather force traffic to a global tier 1 transit provider) but should completely eliminate the instability and packet loss.

Before proceeding further, however, I would like to give an opportunity for the advanced diagnostics to run on our Cisco router. I should have more detail tomorrow.

Unfortunately the other trouble you've experienced (drops from the East Coast, etc.) seems to be completely unrelated to this "European peering" trouble and will need to be investigated separately.

Again please let me apologize for the delay in sending this response. We have been trying many different methods to alleviate your trouble without taking extreme measures, but it is now obvious that extreme measures will be required.

This ticket will certainly be left open and another update will be coming shortly.

As always if you have any further questions or requests please don't hesitate to ask.

Thanks!

- Cary

Updated by: Customer at Oct 02 19:14

Cary

Any additional news?

Updated by: Customer at Sep 29 01:15

Cary

Thanks for the update, and I just reread the entire matter from the beginning.

I am honestly hearing nothing but endless grief on my side from the more vocal users, and the frustration has leaked onto the help ticket and unfortunately onto you as well. So fwiw I'm a bit red-faced that I didn't keep a wall between them and the issue ticket in my discussions with you. My sincere apologies for anything that made your job harder.

The kind words are still meant, believe it or not, and I want to personally thank you for the efforts expended on one little server, in one little rack, in one room, of your company.

*sigh*
The complaints have expanded to include packet loss and a "bounce" in ping times for all users, including North America. The users are experiencing a fluctuation from, for example, 40 to 180 ms randomly for 30 to 60 secs, and this sometimes goes on for hours. Those with ping times as low as 10 ms (holy crap 10 ms!) cannot connect to the server, or time out.

I am aware that many times the issue is on their end with their ISP and packet shaping. I'm also aware that 100ms pings are far from bad coming out of Seattle to VA. In addition the CPU usage was spiking the other night while I was idling on the server watching performance... and this should not be happening given past experiences with the applications and services running on the server. While no major update went live recently I cannot swear to any minor changes, and will look into the matter further on my end.

I'm letting you know this in a share-the-information vein and for full disclosure.

I look forward to your diagnosis
David
Last edited by MrChaos on Fri Oct 09, 2009 2:57 am, edited 1 time in total.
Ssssh
MrChaos
Posts: 8352
Joined: Tue Mar 21, 2006 8:00 am

Post by MrChaos »

I would like to add I have spent a number of evenings sitting on the $#@!ing game server watching pings and in the NOAT lobby. I asked a number of people who were experiencing a fluctuation in ping times and packet losses to please provide a traceroute to the email addy. NOT ONE PERSON DID IT, NOT ONE. Now I was a hider since, well, I don't want to be bitched at, but still, who the $#@! did they think was asking them to provide trace routes but an admin for Solap?

Really the fact most of you couldn't be assed to even send a trace route kind of has me a wee bit ticked off.

I love the $#@!ing game and the community sometimes but ffs it's free, run by volunteers, and really once in awhile you might have to do something like send a trace route to an email address

Thanks I Feel Better Now and I Still Want To Make Man Babies With Spidey (and Brood)
David


edit: apparently people are doing it according to the pm from Spidey. Please send them rather than holding them back, man babies comment expanded to Nuke
Last edited by MrChaos on Fri Oct 09, 2009 3:07 am, edited 1 time in total.
Ssssh
FreeBeer
Posts: 10902
Joined: Tue Dec 27, 2005 8:00 am
Location: New Brunswick, Canada

Post by FreeBeer »

hmmm... that's all we need... Broody Chaotic Arachnids.

chown -R us base
_SRM_Nuke
Posts: 1189
Joined: Sat Apr 17, 2004 7:00 am
Location: In Washington for now

Post by _SRM_Nuke »

MrChaos wrote:QUOTE (MrChaos @ Oct 8 2009, 10:50 PM) Thanks I Feel Better Now and I Still Want To Make Man Babies With Spidey (and Brood)

edit: apparently people are doing it according to the pm from Spidey. Please send them rather than holding them back, man babies comment expanded to Nuke
I'll try and send you a trace of exactly when it happens. 99% of the time there is no packet loss but there are random points in the early evening (east coast US time) where it seems quite a few people lose packets. Slipshod, Spidey, Brood, Viru, etc. were all experiencing it. I normally have a 25ms ping but every now and then it loses packets and you jump halfway across a sector or your rip sits on 1s forever. I think Drizzo experienced this firsthand as my gunner today haha. It's frustrating and it's happening to people in disparate places (not just Euros who send packets via NYC) but to people like myself who are literally 5 miles away from the server. Also, I don't recall losing any packets in the first few weeks of the server being up, but I could be wrong. Anyway, what I'm trying to say in my incoherent babble is that the server is fine the vast majority of the time but for some people, at some times, packets get lost.
MrChaos to Sharpfish wrote:QUOTE (MrChaos to Sharpfish @ Oct 2 2011, 08:55 AM) Damn there went my hope you died in a couch fire.
notjarvis
Posts: 4629
Joined: Tue Jun 03, 2008 11:08 am
Location: Birmingham, UK

Post by notjarvis »

<--- Is disappointed MrC doesn't want to make Man Babies with him or Beatrice........
back to crying into the pillow every night I suppose


:lol:

I'll see if I can grab some more traces soon, but I've stopped collecting for a time as my machine has been erratic, and I thought my script may have something to do with it (it doesn't actually).
Last edited by notjarvis on Fri Oct 09, 2009 8:30 am, edited 1 time in total.
MrChaos
Posts: 8352
Joined: Tue Mar 21, 2006 8:00 am

Post by MrChaos »

_SRM_Nuke wrote:QUOTE (_SRM_Nuke @ Oct 9 2009, 03:09 AM) I'll try and send you a trace of exactly when it happens. 99% of the time there is no packet loss but there are random points in the early evening (east coast US time) where it seems quite a few people lose packets. Slipshod, Spidey, Brood, Viru, etc. were all experiencing it. I normally have a 25ms ping but every now and then it loses packets and you jump halfway across a sector or your rip sits on 1s forever. I think Drizzo experienced this firsthand as my gunner today haha. It's frustrating and it's happening to people in disparate places (not just Euros who send packets via NYC) but to people like myself who are literally 5 miles away from the server. Also, I don't recall losing any packets in the first few weeks of the server being up, but I could be wrong. Anyway, what I'm trying to say in my incoherent babble is that the server is fine the vast majority of the time but for some people, at some times, packets get lost.

I agree with everything written by you Nuke, and it's why I've been idling in NOAT. It's random, intermittent, and hard to catch. Please note at the bottom I informed him of this using Rhino's (the 10 ms comment) and Brood's experiences as examples to him. The point being they can't seem to catch it even when running random pings at the server. I'm wondering if it's not just the connection to the server.

We've installed monitors on our end without profit, rebooted a few times iirc, and have every patch and upgrade we know of installed on the server. The machine is dedicated and exists solely for your and everyone's gaming pleasure (the only things running are Allegiance things and absolutely nothing else, including friggin email), the amount of monthly traffic utilization is laughably small (in the single digits), the provider is working on their end to catch the issue... so far nada about the packet dropping issue. Oh, they also did maintenance on the router affecting the Euros issue, since I saw the general announcement to those with servers with them.

It's been like trying to catch a fart in a jar in a swirling wind storm. So I spend a number of evenings as the hider Chapucero (means shoddy in Spanish), sitting there randomly pming people in game to send a trace route, i.e. go do it rtfn so we can capture it. Most can't seem to be assed to do it, but oh boy do they have the time to bitch. I've been becoming increasingly frustrated with the community, as evidenced by the tantrum thrown last night.

What got lost in the tantrum is that I appreciate immensely those who have been incredibly helpful. Those who smacked some heads for clue too. Those who sent in their traceroutes. Solap will never make you jizz with happiness because everyone's latency is near zero and a packet never gets lost. If you're personally experiencing an issue, tough @#(! that's the intertubes. If a small group of people is having issues, well then you've got to know we are trying to fix it, but sometimes thems the breaks.

MrChaos


READ ME PLEASE:

So when you have a trace route that you think is bad, paste it in an email and send it to: SolapTraces@yahoo.com. I draw the line at being your secretary because you're too lazy to cut and paste it into an email. If you don't know how, or legitimately don't understand the traceroute, post it here, but you're still expected to copy, paste and send it in an email.

your->you're
Last edited by MrChaos on Fri Oct 09, 2009 12:26 pm, edited 1 time in total.
Ssssh
sgt_baker
Posts: 1510
Joined: Wed Oct 20, 2004 7:00 am
Location: London, UK.
Contact:

Post by sgt_baker »

Ironically I live on one of the 'problem peer' networks mentioned above, yet have had no issue with connecting to Solap. They're doing the right thing insofar as suspecting there might be some gnarly underlying cause which has yet to be identified.
Granary Sergeant Baker - Special Bread Service (Wurf - 13th Oct 2011)
spideycw
Posts: 7512
Joined: Sun Jul 06, 2003 7:00 am

Post by spideycw »

QUOTE The notJarvis's script is hosted here http://www.mediafire.com/?fr2ttenzujn.

It is a .bat file. After downloading it, you need to run it.
It will create a directory on C:\ where it will store, in a .txt file, the result of a tracert to Solap,
executed every 30 minutes, while your PC is on.[/quote]
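For anyone who would rather not run a downloaded .bat, the same idea can be sketched in Python. To be clear, this is not notJarvis's script, whose contents aren't shown in this thread. The target address is the final hop seen in the traces posted earlier and is only assumed to be the Solap server, and the folder name is hypothetical.

```python
import subprocess
import time
from datetime import datetime
from pathlib import Path

TARGET = "66.36.240.92"            # assumption: final hop in the traces above
LOG_DIR = Path("C:/SolapTraces")   # hypothetical folder name
INTERVAL_SECONDS = 30 * 60         # every 30 minutes, as in the quoted post


def run_tracert(target):
    """Run the Windows tracert command and return its text output."""
    result = subprocess.run(["tracert", target], capture_output=True, text=True)
    return result.stdout


def format_entry(when, output):
    """One log entry: a timestamp header followed by the raw trace."""
    return f"=== {when:%Y-%m-%d %H:%M:%S} ===\n{output.rstrip()}\n\n"


def log_once(log_dir=LOG_DIR, target=TARGET, runner=run_tracert):
    """Run one trace and append it to a per-day .txt file."""
    log_dir.mkdir(parents=True, exist_ok=True)
    when = datetime.now()
    path = log_dir / f"tracert-{when:%Y%m%d}.txt"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(format_entry(when, runner(target)))
    return path


def main():
    """Loop until interrupted (Ctrl+C), tracing once per interval."""
    while True:
        log_once()
        time.sleep(INTERVAL_SECONDS)
```

Call `main()` to start logging. The timestamp on each entry matters because the problem is intermittent: a trace is only useful if it can be matched to when the lag actually happened.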

if anyone was looking for this post ^^
Adam4
Posts: 2144
Joined: Sun Sep 03, 2006 9:05 am
Location: England

Post by Adam4 »

sgt_baker wrote:QUOTE (sgt_baker @ Oct 9 2009, 02:31 PM) Ironically I live on one of the 'problem peer' networks mentioned above, yet have had no issue with connecting to Solap. They're doing the right thing insofar as suspecting there might be some gnarly underlying cause which has yet to be identified.