physicaldrivegetter: add storcli2 physical drive getter#58
Open
ezekiel-alexrod wants to merge 3 commits into
Open
physicaldrivegetter: add storcli2 physical drive getter#58ezekiel-alexrod wants to merge 3 commits into
ezekiel-alexrod wants to merge 3 commits into
Conversation
storcli2 and perccli2 do not use the process exit code as a reliable success signal: some failures (e.g. an invalid drive) exit non-zero while still writing their JSON error payload to stdout, whereas others (invalid controller, invalid VD) exit zero. cmd.Output() turned the former into a Go error and discarded the payload, so storcli2.Decode never saw the in-JSON message and callers got a bare "exit status N". Return stdout as-is when the command exits non-zero but produced a payload, and let storcli2.Decode determine success/failure from the JSON. A non-zero exit with no stdout (or any other exec failure) stays a hard error. Verified against a live MegaRAID 9660-16i: "/cN/eX/sY show all" on a missing drive exits 255 with a "Drive not found" JSON payload. Align the storcli2 package docs (envelope.go, decode.go), which still claimed the CLI "exits 0 even on failure", with this corrected contract.
Implement ports.PhysicalDrivesGetter for storcli2 / perccli2, mirroring the ssacli getter and the megaraid v1 physical-drive parsing. PhysicalDrives() reads the whole inventory from a single "/cN/eall/sall show all" (which already carries each drive's detailed information); PhysicalDrive() reads one drive via its "/cN/eX/sY" selector (or "/cN/sY" without an enclosure, as in megaraid v1). Each "Drives List" entry maps to a PhysicalDrive: vendor/serial/WWN from Drive Detailed Information, model and type (Med) from Drive Information, size via utils.ConvertSizeBytes, and status from the State/Status pair. The State/Status mapping follows the drive-state legend printed by the binary: unlike storcli1, storcli2 has no "Failed" drive state (failure is reported in Status), and every base state (UConf, Conf, JBOD, GHS, DHS) comes in suffixed variants (Shld, Sntz, Dgrd), so states are matched by family rather than exactly. A "Failed", "Offline" or "Missing" status takes precedence over any state so a non-functioning drive is never reported as in use or available; configured, JBOD and hot-spare drives map to used; unconfigured drives are available unless bad or unsupported. Device paths are only computed for healthy JBOD drives, so a dead JBOD drive whose device node is gone does not fail the whole inventory. Verified against a live MegaRAID 9660-16i (24-drive inventory and single-drive reads). One implementation serves both binaries through the injected commandrunner.CommandRunner. Section structs live in the getter file and consume the shared storcli2.Decode + utils.UnmarshalToSlice.
Cross-checking the implementation against the StorCLI2 User Guide v1.1 surfaced three gaps: - storcli2 reports an NVMe drive with "SSD" media and the NVMe transport in its interface (the guide's sample shows Intf "NVMe", Med "SSD"), so deriving the disk type from "Med" alone could never yield NVMe. The type now checks "Intf" first. - "Unusable" is a documented drive status (alongside Good, Bad, Replace, Offline, Failed, Online, Missing); an unconfigured drive with that status was reported unassigned-good, i.e. selectable for VD creation. It now maps to unassigned-bad, like "Bad". - Per the guide, showing a nonexistent object reports success, so an absent "Drives List" section is an empty inventory, not an error (mirroring the logical-volume getter's zero-VD contract). Also pins "Replace" (a drive being copied back) as in-use in the status table.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Implements
ports.PhysicalDrivesGetterfor storcli2 / perccli2 (physicaldrivegetter/storcli2.go), mirroring the ssacli getter and the megaraid v1 physical-drive parsing. Also fixes the storcli2/perccli2 command runners to surface in-JSON errors when the binary exits non-zero, and aligns the stalestorcli2package docs with that corrected exit-code contract.How
PhysicalDrives(ctrl)→ a single/cN/eall/sall show all(itsDrives Listalready carries each drive's detailed information).PhysicalDrive(meta)→/cN/eX/sY show all(or/cN/sYwithout an enclosure, as in megaraid v1), with a not-found guard on an empty list.Drives Listentry maps to aPhysicalDrive: vendor/serial/WWN (0x-prefixed as in megaraid v1) fromDrive Detailed Information, model and type (Med) fromDrive Information, size viautils.ConvertSizeBytes, status from theState/Statuspair, andReasoncarrying the raw status (as in the ssacli getter).commandrunner.CommandRunner. Section structs live in the getter file and consume the sharedstorcli2.Decode+utils.UnmarshalToSlice.State/Status mapping
The mapping follows the drive-state legend printed by the binary, verified on a live
MegaRAID 9660-16i. Two storcli2 specifics drive the design:Faileddrive state (storcli1 had one): failure is reported inStatus. AFailed,OfflineorMissingstatus therefore takes precedence over any state, so a non-functioning drive is never reported as in use — or worse, available for LV creation.Shldshielded diagnostics,Sntzsanitize,Dgrddegraded — e.g.ConfDgrd,JBODSntz,UConfShld), so states are matched by family rather than exactly. The lone exception isShld(“JBOD Shielded”), which belongs to the JBOD family without the prefix.Resulting map:
Conf*/JBOD*/Shld/GHS*/DHS*→ used;UConf*→ unassigned good, unless the status isBador the state isUConfUnsp(unsupported) → unassigned bad;Failed/Offline/Missingstatus → failed; anything else (Unusbl,Various, …) → unknown with the raw status inReason.Device paths are computed only for healthy JBOD drives (
ComputePathsreads the real filesystem): a dead JBOD drive whose device node is gone must not fail the whole inventory. A synthetic-payload test pins this guard at the entity level (the captured fixtures contain no JBOD drive).Command runner fix
storcli2/perccli2 do not use the exit code as a reliable success signal: some failures (invalid drive) exit non-zero while still writing their JSON error payload to stdout, others (invalid controller/VD) exit zero.
cmd.Output()turned the former into a bareexit status Nand discarded the payload. The runners now return stdout as-is on a non-zero exit with a payload, lettingstorcli2.Decodesurface the in-JSON error ("No PD found."); a non-zero exit with no stdout stays a hard error. Thestorcli2package docs (envelope.go,decode.go), which still claimed the CLI "exits 0 even on failure", are reworded accordingly.Tests
Table-driven
storcli2_test.go, reusing the testify mock runner andtestdata/storcli2/fixtures (real server captures: 24-drive inventory, single drives,UConf/Good, invalid-drive failure payload). Cases: full inventory, single drive (configured / unconfigured / not found), empty-list guard, JBOD entity-level mapping with the ComputePaths guard, and unit tables for theState/Statusmapping (families,Failed/Offline/Missingprecedence, hot spares, unsupported), JBOD family detection, disk type, WWN formatting and selector building. Runner tests cover both non-zero-exit paths. The full flow was also validated end-to-end against the live 9660-16i captures.UG v1.1 alignment (second commit)
Cross-checking against the StorCLI2 User Guide v1.1 surfaced three gaps, fixed in
physicaldrivegetter: align storcli2 typing and statuses with UG v1.1:Med=SSDand the transport inIntf(the guide's sample showsIntf NVMe, Med SSD), so deriving the type fromMedalone could never yield NVMe. The type now checksIntffirst.Unusablestatus: a documented drive status; an unconfigured drive with it was reported unassigned-good (selectable for VD creation). It now maps to unassigned-bad, likeBad.Drives Listsection is an empty inventory, not an error (mirroring the logical-volume getter's zero-VD contract).Replace(copyback) is also pinned as in-use in the status table.