Loss of Connectivity to BINP in Novosibirsk, Russia
Les Cottrell. Page created: May 23 2003.
As we can see in our logs, since May 21st, 02:16 local time (that is GMT+7) we've lost the ESNet connectivity. Few traces in both directions:

-- this is traceroute from noric01.slac:

[noric01] ~ > traceroute sky.inp.nsk.su
traceroute to sky.inp.nsk.su (193.124.167.84), 30 hops max, 38 byte packets
 1  rtr-farmcore-farm0 (134.79.87.9)  0.256 ms  0.181 ms  0.158 ms
 2  rtr-dmz1-ger (134.79.135.15)  0.230 ms  0.197 ms  0.200 ms
 3  slac-rt4.es.net (192.68.191.146)  0.291 ms  0.236 ms  0.266 ms
 4  snv-pos-slac.es.net (134.55.209.1)  0.635 ms  0.617 ms  0.673 ms
 5  chicr1-oc192-snvcr1.es.net (134.55.209.54)  48.883 ms  48.826 ms  48.871 ms
 6  aoacr1-oc192-chicr1.es.net (134.55.209.58)  68.883 ms  68.833 ms  68.873 ms
 7  aoapr1-ge0-aoacr1.es.net (134.55.209.110)  68.981 ms  68.937 ms  68.875 ms
 8  * * *
 9  * * *

-- this is a traceroute from www.slac

traceroute to CSD-CC.inp.nsk.su (193.124.167.209): 3-30 hops, 38 byte packets
 3  192.68.191.146 (192.68.191.146)  0.610 ms (ttl=252!)
 4  snv-pos-slac.es.net (134.55.209.1)  0.843 ms (ttl=251!)
 5  chicr1-oc192-snvcr1.es.net (134.55.209.54)  49.0 ms (ttl=250!)
 6  aoacr1-oc192-chicr1.es.net (134.55.209.58)  69.1 ms (ttl=249!)
 7  aoapr1-ge0-aoacr1.es.net (134.55.209.110)  69.1 ms (ttl=248!)
 8  *
 9  *

-- this is a trace in other direction, from BINP:

mx:belov {111} traceroute ping.slac.stanford.edu
traceroute to ns-ext2.slac.stanford.edu (134.79.18.21), 64 hops max, 40 byte packets
 1  cisco-1 (193.124.167.254)  0.346 ms  0.339 ms  0.394 ms
 2  Rtc-gw (193.124.167.5)  0.587 ms  0.518 ms  0.877 ms
 3  192.153.114.137 (192.153.114.137)  108.356 ms  106.759 ms  106.622 ms
 4  130.87.43.2 (130.87.43.2)  107.282 ms  106.805 ms  106.494 ms
 5  keksw1-ns.kek.jp (130.87.4.34)  106.589 ms  106.791 ms  106.533 ms
 6  kekgw.kek.jp (130.87.4.1)  106.974 ms  106.839 ms  107.217 ms
 7  KEK-P6-0.sinet.ad.jp (150.99.197.125)  385.765 ms  419.214 ms  405.685 ms
 8  JT-tokyo-S1-P10-0.sinet.ad.jp (150.99.197.33)  373.749 ms  394.104 ms  403.642 ms
 9  nii-S1-P4-0.sinet.ad.jp (150.99.197.22)  376.77 ms  379.923 ms  501.592 ms
10  nii-gate2-P2-0.sinet.ad.jp (150.99.199.174)  392.255 ms  395.85 ms  373.816 ms
11  nii-gate3-P0-0.sinet.ad.jp (150.99.199.178)  369.830 ms  360.757 ms  397.259 ms
12  * *^C

At the same time there still is connectivity between SLAC and KEK:

[noric01] ~ > ping www.kek.jp
PING ccwww.kek.jp (130.87.104.100) from 134.79.86.51 : 56(84) bytes of data.
64 bytes from ccwww.kek.jp (130.87.104.100): icmp_seq=1 ttl=240 time=262 ms
64 bytes from ccwww.kek.jp (130.87.104.100): icmp_seq=2 ttl=240 time=262 ms

bsunsrv1[52]% traceroute www.slac.stanford.edu
traceroute to www4.slac.stanford.edu (134.79.18.136), 30 hops max, 40 byte packets
 1  130.87.224.201 (130.87.224.201)  5 ms  1 ms  1 ms
 2  ns1ka.kek.jp (130.87.5.10)  2 ms  2 ms  2 ms
 3  keksw1-ns.kek.jp (130.87.4.34)  2 ms  2 ms  2 ms
 4  kekgw.kek.jp (130.87.4.1)  2 ms  2 ms  2 ms
 5  KEK-P6-0.sinet.ad.jp (150.99.197.125)  2 ms  2 ms  2 ms
 6  JT-tokyo-S1-P10-0.sinet.ad.jp (150.99.197.33)  3 ms  3 ms  3 ms
 7  nii-S1-P4-0.sinet.ad.jp (150.99.197.22)  43 ms  169 ms  480 ms
 8  nii-gate2-P2-0.sinet.ad.jp (150.99.199.174)  4 ms  4 ms  4 ms
 9  nii-gate3-P1-0.sinet.ad.jp (150.99.199.182)  195 ms  195 ms  195 ms
10  aoa-sinet.es.net (198.124.216.121)  195 ms  195 ms  195 ms
11  aoacr1-ge0-aoapr1.es.net (134.55.209.109)  195 ms  195 ms  195 ms
12  chicr1-oc192-aoacr1.es.net (134.55.209.57)  215 ms  216 ms  215 ms
13  snvcr1-oc192-chicr1.es.net (134.55.209.53)  275 ms  276 ms  276 ms
14  slac-pos-snv.es.net (134.55.209.2)  276 ms  276 ms  276 ms
15  rtr-dmz1-vlan400.slac.stanford.edu (192.68.191.149)  264 ms  264 ms  264 ms
16  * * *
17  www4.slac.stanford.edu (134.79.18.136)  334 ms  *  277 ms

Should something be fixed?
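The traces above are read by finding the last hop that still answers before the run of `*` timeouts. As a rough illustration, the sketch below does that mechanically for classic traceroute text. The function name `last_responding_hop` and the `SAMPLE` excerpt (copied from the noric01 trace) are chosen here for illustration and are not part of any tool used in the report.

```python
import re

# A hop line starts with the hop number, followed by the hop's output.
HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")

def last_responding_hop(trace_text):
    """Return (hop_number, first_token) for the last hop that answered, or None."""
    last = None
    for line in trace_text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header lines such as "traceroute to ..." are skipped
        hop = int(match.group(1))
        tokens = match.group(2).split()
        if not tokens or all(tok == "*" for tok in tokens):
            continue  # every probe for this hop timed out
        last = (hop, tokens[0])
    return last

# Excerpt of the noric01 trace quoted in the report.
SAMPLE = """\
traceroute to sky.inp.nsk.su (193.124.167.84), 30 hops max, 38 byte packets
 5  chicr1-oc192-snvcr1.es.net (134.55.209.54)  48.883 ms  48.826 ms  48.871 ms
 6  aoacr1-oc192-chicr1.es.net (134.55.209.58)  68.883 ms  68.833 ms  68.873 ms
 7  aoapr1-ge0-aoacr1.es.net (134.55.209.110)  68.981 ms  68.937 ms  68.875 ms
 8  * * *
 9  * * *
"""

print(last_responding_hop(SAMPLE))
```

For the SLAC-side trace the last answering hop is aoapr1-ge0-aoacr1.es.net, which is also the second ESnet hop in the working KEK-to-SLAC trace.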
However, I am doubtful that this is the actual cause, for a couple of reasons. First, the reported start times differ. Second, the filter was configured to log all packets that it rejected, and the log doesn't appear to contain any packets relevant to the connection to sky.inp.nsk.su.
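The filter discussed on this page is described as a loose unicast RPF filter. In loose mode, a packet is accepted if its source address matches any prefix in the router's routing table, whichever interface it arrived on. The sketch below illustrates that check together with a log of rejected sources, which is the kind of log the second argument above relies on. The routing table `ROUTES` is invented for illustration and says nothing about the routes ESnet actually carried. Real implementations also differ in details such as whether a default route counts as a match.

```python
import ipaddress

def loose_urpf_accepts(src, routes):
    """Loose-mode check: accept if any known prefix covers the source address."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in routes)

def rejected_sources(sources, routes):
    """Return the sources the filter would drop, i.e. what a reject log would show."""
    return [src for src in sources if not loose_urpf_accepts(src, routes)]

# Hypothetical routing table, NOT ESnet's real one.
ROUTES = [
    ipaddress.ip_network("134.79.0.0/16"),
    ipaddress.ip_network("130.87.0.0/16"),
]

# Source addresses taken from the traces on this page.
log = rejected_sources(["134.79.86.51", "193.124.167.84"], ROUTES)
print(log)
```

With this made-up table the BINP address would appear in the reject log. An empty log for BINP traffic, as reported above, would suggest the filter was not dropping those packets.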
Looking at the raw PingER data recorded for SLAC to BINP, it appears that connectivity was lost after 18:30 and before 19:00 GMT on May 20, 2003 (between 11:30 a.m. and noon PDT), which is in reasonable agreement with the installation of the loose unicast RPF filter.
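The times on this page are given in three zones, so it can help to put them on one scale. The following sketch converts the BINP-reported time (May 21, 02:16 at GMT+7) to GMT and compares it with the PingER window. It assumes fixed offsets of +7 hours for Novosibirsk, as stated in the report, and -7 hours for PDT. The variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

BINP_LOCAL = timezone(timedelta(hours=7))   # "GMT+7", as stated in the report
PDT = timezone(timedelta(hours=-7))         # Pacific Daylight Time

# Time at which BINP's logs show the loss of ESnet connectivity.
reported = datetime(2003, 5, 21, 2, 16, tzinfo=BINP_LOCAL)
reported_gmt = reported.astimezone(timezone.utc)

# Window in which PingER first shows loss from SLAC to BINP.
window_start = datetime(2003, 5, 20, 18, 30, tzinfo=timezone.utc)
window_end = datetime(2003, 5, 20, 19, 0, tzinfo=timezone.utc)

print("BINP report in GMT:", reported_gmt.strftime("%Y-%m-%d %H:%M"))
print("PingER window in PDT:",
      window_start.astimezone(PDT).strftime("%H:%M"), "to",
      window_end.astimezone(PDT).strftime("%H:%M"))
print("BINP report after window end by:", reported_gmt - window_end)
```

Under these assumptions the BINP-reported time falls on May 20 at 19:16 GMT, 16 minutes after the end of the PingER window.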