Task #454: srs5100 gets a kernel panic and reboots - Bereich Administration Rechentechnik - Aufgabenverwaltung StuRa HTW Dresden

Task #454

open

srs5100 gets a kernel panic and reboots

Added by MatthiasJakobi about 6 years ago. Updated about 6 years ago.

Status: New
Priority: Normal
Assignee: Bereich Administration Rechentechnik
Category: srs5100
Start date: 15.05.2019
Due date:
% Done:
Estimated time: 5:00 h
Spent time: 0:30 h

Description

The hardware has been switched off since 2019-05 because normal operation was no longer possible.

The system boots normally but panics after a short time (the FreeNAS menu is on the monitor for less than a minute).

The panic apparently occurs while zfs scrub is running. (It seems to get stuck at about 2 TB.) The panic message mentions the function zfs_recover.

Updated by MatthiasJakobi about 6 years ago

Subject changed from "SRS5100 kernel panic!" to "SRS5100 gets kernel panic caused by zfs! SYSTEM is switched off"

Updated by PaulRiegel about 6 years ago

Subject changed from "SRS5100 gets kernel panic caused by zfs! SYSTEM is switched off" to "srs5100 gets a kernel panic and reboots"
Description updated

Updated by HartmutFournes about 6 years ago

panic: Solaris(panic): blkptr at 0xfffff8011f14f428 has invalid type 99

Picture (to follow) as attachment

Updated by PaulRiegel about 6 years ago

To be able to investigate in peace, the automatic scrub runs were silenced first.

First, the running scrub was paused right away:

zpool scrub -p back

zpool status

  pool: back
 state: ONLINE
  scan: scrub paused since Mon May 20 15:04:21 2019
    scrub started on Mon May 20 12:25:45 2019
    2.78T scanned, 2.51T issued, 3.07T total
    0 repaired, 81.83% done
config:

    NAME                                                STATE     READ WRITE CKSUM
    back                                                ONLINE       0     0     0
      raidz2-0                                          ONLINE       0     0     0
        gptid/32739ff9-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0
        gptid/3a6802ac-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0
        gptid/438bc2f3-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0
        gptid/4bbd76d2-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:04:13 with 0 errors on Mon May 20 12:02:54 2019
config:

    NAME        STATE     READ WRITE CKSUM
    freenas-boot  ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        da4p2   ONLINE       0     0     0
        da5p2   ONLINE       0     0     0

errors: No known data errors

In addition, the recurring scrub task was disabled altogether:

Web UI: Tasks -> Scrub Tasks -> for the pool back, Enabled set to no

Updated by PaulRiegel about 6 years ago

A checkpoint was created so that there is a saved state to "play" with.

zpool checkpoint back

In principle, the checkpoint has to be removed again (in the long run).

zpool checkpoint --discard back
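
The checkpoint life cycle could look roughly like the following sketch. The helper names `run`, `rewind_to_checkpoint` and `discard_checkpoint` and the `DRY_RUN` switch are hypothetical and exist only for illustration; the `zpool` subcommands and options are those of the regular command line tool. Rewinding requires the pool to be exported first and throws away everything written after the checkpoint. On FreeNAS the web interface normally handles import and export (including attaching the encrypted .eli providers), so this is a sketch under those assumptions, not the procedure that was used here.

```shell
# Hypothetical wrapper: with DRY_RUN set, commands are only printed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Roll the pool back to its checkpoint.
# This discards every change made after `zpool checkpoint` was run.
rewind_to_checkpoint() {
    pool="$1"
    run zpool export "$pool" &&
        run zpool import --rewind-to-checkpoint "$pool"
}

# Remove the checkpoint once it is no longer needed.
discard_checkpoint() {
    run zpool checkpoint --discard "$1"
}
```

Calling `DRY_RUN=1 rewind_to_checkpoint back` only prints the two commands, which allows checking them before anything touches the pool.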

Updated by PaulRiegel about 6 years ago

zfs set canmount=off back/backup

cannot unmount '/mnt/back/backup/lrs0x539': Device busy

zfs set canmount=off back/backup/lrs0x539

cannot unmount '/mnt/back/backup/lrs0x539': Device busy

zfs unmount back/backup/lrs0x539

cannot unmount '/mnt/back/backup/lrs0x539': Device busy

zfs unmount -f back/backup/lrs0x539

It still seems to panic!

Updated by PaulRiegel about 6 years ago

Perhaps one of the mass-storage devices (hard disks) is simply broken. To check this, every device should be removed once, one after the other (at most two at a time).

The faulty device would then simply have to be removed (or replaced, see the FreeNAS Handbook: Replacing an Encrypted Disk).

Updated by PaulRiegel about 6 years ago

PaulRiegel wrote:

Perhaps one of the mass-storage devices (hard disks) is simply broken. To check this, every device should be removed once, one after the other (at most two at a time).

When the 3rd device (of the pool, counted from the left when standing in front of the server) was pulled, the server booted again for the time being. (The scrub is now running, from the beginning.)

zpool status back

  pool: back
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub in progress since Wed May 22 00:31:43 2019
    624G scanned at 1.05G/s, 204G issued at 354M/s, 3.07T total
    0 repaired, 6.51% done, 0 days 02:21:45 to go
config:

    NAME                                                STATE     READ WRITE CKSUM
    back                                                DEGRADED     0     0     0
      raidz2-0                                          DEGRADED     0     0     0
        gptid/32739ff9-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0
        gptid/3a6802ac-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0
        3944232543987694242                             UNAVAIL      0     0     0  was /dev/gptid/438bc2f3-4e5f-11e9-b33f-d4ae528f1318.eli
        gptid/4bbd76d2-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0

errors: No known data errors
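
To spot missing or faulted members without reading the whole table, the status output can be filtered. The following is a minimal sketch; the function name `unhealthy_devices` is hypothetical, and it assumes the column layout shown above (one NAME/STATE header row per pool, ended by the errors: line).

```shell
# Read `zpool status` output on stdin and print the name and state
# of every entry in the config tables whose state is not ONLINE.
unhealthy_devices() {
    awk '
        # The header row marks the start of a config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The errors: line ends the section of the current pool.
        /^errors:/ { in_config = 0 }
        # Blank lines have no fields and are skipped by the NF check.
        in_config && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}
```

Used as `zpool status back | unhealthy_devices`, the output above would yield the three lines `back DEGRADED`, `raidz2-0 DEGRADED` and `3944232543987694242 UNAVAIL`.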

Updated by PaulRiegel about 6 years ago

PaulRiegel wrote:

PaulRiegel wrote:

Perhaps one of the mass-storage devices (hard disks) is simply broken. To check this, every device should be removed once, one after the other (at most two at a time).

When the 3rd device (of the pool, counted from the left when standing in front of the server) was pulled, the server booted again for the time being. (The scrub is now running, from the beginning.)

zpool status back
[...]

This had been preceded by the attempt to "simply" bring the device back online in the ZFS pool.

zpool online back 3944232543987694242

warning: device '3944232543987694242' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present

zpool status

  pool: back
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub in progress since Wed May 22 00:31:43 2019
    574G scanned at 684M/s, 3.25M issued at 27.0K/s, 3.07T total
    0 repaired, 0.00% done, no estimated completion time
config:

    NAME                                                STATE     READ WRITE CKSUM
    back                                                DEGRADED     0     0     0
      raidz2-0                                          DEGRADED     0     0     0
        gptid/32739ff9-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0
        gptid/3a6802ac-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0
        3944232543987694242                             UNAVAIL      0     0     0  was /dev/gptid/438bc2f3-4e5f-11e9-b33f-d4ae528f1318.eli
        gptid/4bbd76d2-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:04:13 with 0 errors on Mon May 20 12:02:54 2019
config:

    NAME        STATE     READ WRITE CKSUM
    freenas-boot  ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        da3p2   ONLINE       0     0     0
        da4p2   ONLINE       0     0     0

errors: No known data errors

zpool scrub back

cannot scrub back: currently scrubbing; use 'zpool scrub -s' to cancel current scrub

Updated by PaulRiegel about 6 years ago

zpool status

  pool: back
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub in progress since Wed May 22 00:31:43 2019
    2.51T scanned at 311M/s, 2.50T issued at 310M/s, 3.07T total
    0 repaired, 81.40% done, 0 days 00:32:11 to go
config:

    NAME                                                STATE     READ WRITE CKSUM
    back                                                DEGRADED     0     0     0
      raidz2-0                                          DEGRADED     0     0     0
        gptid/32739ff9-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0
        gptid/3a6802ac-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0
        3944232543987694242                             UNAVAIL      0     0     0  was /dev/gptid/438bc2f3-4e5f-11e9-b33f-d4ae528f1318.eli
        gptid/4bbd76d2-4e5f-11e9-b33f-d4ae528f1318.eli  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:04:13 with 0 errors on Mon May 20 12:02:54 2019
config:

    NAME        STATE     READ WRITE CKSUM
    freenas-boot  ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        da4p2   ONLINE       0     0     0
        da5p2   ONLINE       0     0     0

errors: No known data errors

And (shortly) afterwards: Down! Damn!
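
The scrub had reached 81.83% when it was paused on May 20 and 81.40% in the output above, shortly before the machine went down. To record how far a scrub gets before a crash, the percentage can be extracted from the status output. This is a minimal sketch; the function name `scrub_percent` is hypothetical, and it assumes the "% done" wording of the scan section shown in this ticket.

```shell
# Read `zpool status` output on stdin and print the scrub
# completion percentage (without the % sign) of the first pool
# that reports one.
scrub_percent() {
    awk '/% done/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /%$/) { sub(/%$/, "", $i); print $i; exit }
    }'
}
```

A call such as `zpool status back | scrub_percent` could be repeated at intervals and written to a log, so that the last recorded value indicates roughly where the scrub was when the panic occurred.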
