Bug #143: Tunnel Outage #2 - All external services unreachable

Added by Redmine Admin 4 days ago. Updated about 8 hours ago.

Status: Resolved
Priority: Urgent
Assignee: -
Start date: 12/12/2025
Due date:
% Done: 0%
Estimated time:
ERPNext Project:
AI Reason:
AI Classified: No
Post to Mattermost: No
Stream: 🔧 Infrastructure
Horizon: ❄️ Icebox
Mattermost Message:

Description

All 12 SSH tunnels are down. Second incident in 2 days. Related: #142


Related issues 2 (2 open, 0 closed)

Related to Bug #142: All tunnels down - external URLs unreachable (404), New, 12/11/2025
Related to Feature #145: Sish Server Timeout Analysis & Recommendations, New, 12/12/2025

Updated by Redmine Admin 4 days ago #1

Root Cause Analysis

Primary cause: Sish server timeout

The SSH tunnel logs show:

Connection to cicd.eu1.asd.engineer closed by remote host.
Transferred: sent 1134740, received 233696 bytes, in 39913.7 seconds

The sish server closed the connection after ~11 hours (39913 sec). This appears to be a server-side timeout.
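As a quick check of the ~11 hour figure, the duration can be read from the `Transferred:` line and converted to hours. This is a minimal sketch: the helper name `session_hours` is hypothetical, and the regular expression assumes the summary line has exactly the format shown in the log above.

```python
import re

# Summary line copied from the tunnel log above.
LOG_LINE = "Transferred: sent 1134740, received 233696 bytes, in 39913.7 seconds"

def session_hours(line: str) -> float:
    """Return the session duration in hours from an ssh 'Transferred:' summary line."""
    match = re.search(r"in ([0-9.]+) seconds", line)
    if match is None:
        raise ValueError("no duration found in line")
    return float(match.group(1)) / 3600

print(f"{session_hours(LOG_LINE):.2f} hours")  # prints "11.09 hours"
```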

Secondary cause: Watchdog could not restart the tunnels

The tunnel watchdog detected the outage correctly but could not restart the tunnels:

/home/nwuite/project/asd-pma/scripts/watchdog/tunnel-watchdog.sh: line 90: bun: command not found

Cron jobs run in a restricted shell environment where bun is not on the PATH.

Resolution

  1. Watchdog fixed: the script now uses the absolute path to bun (/home/nwuite/.bun/bin/bun)
  2. Tunnels restarted: all 12 tunnels are working again
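The idea behind the watchdog fix can be sketched as a lookup that does not depend on the caller's PATH. This is an illustrative Python sketch, not the actual change to tunnel-watchdog.sh: `resolve_bun` and `CRON_LIKE_PATH` are hypothetical names, and the `/usr/bin:/bin` value is only an assumption about a typical cron PATH. The absolute path is the one given in the resolution above.

```python
import os
import shutil

# Absolute path from the resolution note above.
BUN_FALLBACK = "/home/nwuite/.bun/bin/bun"
# Assumption: a typical minimal PATH for cron jobs; the real value may differ.
CRON_LIKE_PATH = "/usr/bin:/bin"

def resolve_bun(search_path: str, fallback: str = BUN_FALLBACK) -> str:
    """Return a usable bun path: PATH lookup first, then the absolute fallback."""
    found = shutil.which("bun", path=search_path)
    if found:
        return found
    if os.path.isfile(fallback) and os.access(fallback, os.X_OK):
        return fallback
    raise FileNotFoundError(f"bun not found on PATH ({search_path}) or at {fallback}")

if __name__ == "__main__":
    try:
        print(resolve_bun(CRON_LIKE_PATH))
    except FileNotFoundError as err:
        # Fail with an explicit reason instead of a bare "command not found".
        print(f"watchdog cannot restart tunnels: {err}")
```

Checking the fallback explicitly means a missing or moved bun install produces a clear error in the watchdog log rather than a silent failure to restart.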

Prevention

  • Investigate the sish server timeout configuration
  • Consider autossh with automatic reconnect
  • Monitor tunnel uptime metrics
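The automatic reconnect idea can be sketched as a small supervisor that restarts the tunnel command whenever it exits, waiting a little longer after each quick failure. This is a rough illustration of the technique, not how autossh itself is implemented: `backoff_delay`, `supervise`, and all default values are hypothetical.

```python
import subprocess
import time
from typing import Callable, Sequence

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Capped exponential backoff: base, 2*base, 4*base, ... up to cap seconds."""
    return min(cap, base * (2 ** attempt))

def supervise(command: Sequence[str], max_restarts: int,
              healthy_after: float = 60.0,
              sleep: Callable[[float], None] = time.sleep) -> int:
    """Run command, restarting it whenever it exits; return the number of restarts."""
    attempt = 0
    restarts = 0
    while True:
        started = time.monotonic()
        subprocess.run(command)  # blocks until the tunnel process exits
        if restarts >= max_restarts:
            return restarts
        # A long-lived session resets the backoff; quick failures escalate it.
        if time.monotonic() - started >= healthy_after:
            attempt = 0
        sleep(backoff_delay(attempt))
        attempt += 1
        restarts += 1
```

In practice the command would be the ssh invocation for one tunnel, for example `["ssh", "-N", "-o", "ServerAliveInterval=30", "-R", "<forward spec>", "<sish host>"]`, where the forward spec and host are placeholders and the keepalive interval is an untested guess. With `healthy_after` at 60 seconds, a session that lasts ~11 hours before the server closes it would be restarted after the shortest delay.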

Updated by Redmine Admin 4 days ago #2

  • Related to Bug #142: All tunnels down - external URLs unreachable (404) added

Updated by Redmine Admin 4 days ago #3

Incident resolved:

  • Tunnels restarted and all 12 endpoints reachable
  • Watchdog script fixed (BUN_PATH added)
  • Related to ticket #142

Updated by Redmine Admin 4 days ago #4

Incident resolved:

  • Tunnels restarted and all 12 endpoints reachable
  • Watchdog script fixed (BUN_PATH added)
  • Related to ticket #142

Updated by Redmine Admin 4 days ago #5

  • Related to Feature #145: Sish Server Timeout Analysis & Recommendations added

Updated by Redmine Admin 4 days ago #6

  • Status changed from New to Resolved

Updated by Redmine Admin about 8 hours ago #7

  • Stream set to 🔧 Infrastructure
  • Horizon set to ❄️ Icebox