
diskdiscover_and_configraid

Yuan Bai edited this page Nov 24, 2015 · 2 revisions

Overall:

This mini-design adds two new RAID support features. Both can be shipped in xCAT-genesis-scripts and run inside the xCAT genesis system:

  1. Discover disk devices under the xCAT genesis system;
  2. Configure RAID, including creating RAID arrays and deleting RAID arrays;

Part1: Discover disk devices
----------------------------

Command: diskdiscover


1. Input parameter: PCI_ID or nothing

2. Description: Use this command to get an overview of the disks and RAID arrays on a compute node. The output is useful input for configuring RAID. The optional input parameter PCI_ID consists of the PCI vendor ID and device ID. For example, the POWER8 SAS adapter is described at http://pci-ids.ucw.cz/read/PC/1014/034a: 1014 is the vendor ID and 034a is the device ID of the PCI-E IPR SAS Adapter.
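To illustrate the PCI_ID format, the following Python sketch splits a value such as 1014:034a into its vendor and device parts. The names `PciId` and `parse_pci_id` are hypothetical and not part of diskdiscover; the sketch only assumes the two four-digit hexadecimal fields described above.

```python
import string
from typing import NamedTuple


class PciId(NamedTuple):
    vendor: str  # e.g. "1014"
    device: str  # e.g. "034a"


def parse_pci_id(text: str) -> PciId:
    """Split 'vendor:device' into two lowercase 4-digit hex strings."""
    vendor, _, device = text.strip().lower().partition(":")
    for part in (vendor, device):
        # A missing ':' leaves device empty, so it is rejected here too.
        if len(part) != 4 or any(c not in string.hexdigits for c in part):
            raise ValueError(f"not a PCI_ID: {text!r}")
    return PciId(vendor, device)
```

Normalizing to lowercase makes later comparisons independent of how the ID was typed.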

3. The framework process:
    If the input parameter PCI_ID is not null:
         diskdiscover reads PCI_ID and finds the related PCI_SLOT_NAME;
         it uses different functions to get the disk devices, their Resource_Path, status, and descriptions, and an overview of the RAID arrays;
         it combines this output into a matrix;
    else:
         diskdiscover finds all advanced function disks and their related info, including PCI_ID, PCI_SLOT_NAME.....; it also shows the RAID arrays;
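One possible way to map a PCI_ID to its PCI_SLOT_NAME values on Linux is to read the `uevent` file of each PCI device in sysfs, which carries `PCI_ID=` and `PCI_SLOT_NAME=` lines. The sketch below is an assumption about how this step could be done, not a description of the actual diskdiscover script; all function names are hypothetical.

```python
import glob
import os
from typing import Dict, List


def parse_uevent(text: str) -> Dict[str, str]:
    """Turn KEY=VALUE lines from a sysfs uevent file into a dict."""
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    return {key: value for key, value in pairs}


def slots_for_pci_id(pci_id: str, uevents: List[str]) -> List[str]:
    """Return the PCI_SLOT_NAME of every device whose PCI_ID matches."""
    wanted = pci_id.lower()
    slots = []
    for text in uevents:
        info = parse_uevent(text)
        # Compare case-insensitively: sysfs reports the ID in uppercase hex.
        if info.get("PCI_ID", "").lower() == wanted and "PCI_SLOT_NAME" in info:
            slots.append(info["PCI_SLOT_NAME"])
    return sorted(slots)


def read_all_uevents(root: str = "/sys/bus/pci/devices") -> List[str]:
    """Read the uevent file of every PCI device (Linux only)."""
    texts = []
    for path in glob.glob(os.path.join(root, "*", "uevent")):
        with open(path) as handle:
            texts.append(handle.read())
    return texts
```

On a node, `slots_for_pci_id("1014:034a", read_all_uevents())` would list the matching slots; keeping the parsing separate from the file access lets the matching logic be checked without hardware.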

4. Work process:

     1). Start the xCAT genesis system on the P8 system so that it enters the xCAT genesis shell.
     2). From the xCAT management node, execute ``xdsh nodename "diskdiscover 1014:034a"`` or ``xdsh nodename "diskdiscover"``.
     3). The output is formatted as a matrix. The columns can include PCI_ID, PCI_SLOT_NAME, Resource_Path, Device, Description, and Status.

      Reference example: 

       --------------------------------------------------------------------------
       PCI_ID     PCI_SLOT_NAME  Resource_Path  Device  Description   Status
       ------     -------------  -------------  ------  -----------   ----------------
       1014:034a  0001:08:00.0   0:0:0:0        sg0     10 Array Member Active
       1014:034a  0001:08:00.0   0:0:1:0        sg1     10 Array Member Active
       -------------------------------------------------------------------
       Get ipr RAID arrays by PCI_SLOT_NAME: 0001:08:00.0
       -------------------------------------------------------------------
       Name   PCI/SCSI Location         Description               Status
       ------ ------------------------- ------------------------- -----------------
       sda    0001:08:00.0/0:2:0:0       RAID 10 Disk Array        Optimized
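A minimal sketch of how such a matrix could be rendered, assuming each disk is available as a tuple of six strings in the column order of the reference example. The helper `format_table` is hypothetical and only illustrates the column alignment; it is not taken from diskdiscover.

```python
from typing import List, Sequence

HEADERS = ("PCI_ID", "PCI_SLOT_NAME", "Resource_Path",
           "Device", "Description", "Status")


def format_table(headers: Sequence[str], rows: List[Sequence[str]]) -> str:
    """Left-align every column to the width of its longest cell."""
    widths = [max(len(str(cell)) for cell in column)
              for column in zip(headers, *rows)]

    def fmt(cells: Sequence[str]) -> str:
        padded = (str(cell).ljust(width) for cell, width in zip(cells, widths))
        return "  ".join(padded).rstrip()

    lines = [fmt(headers), fmt(["-" * width for width in widths])]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


if __name__ == "__main__":
    disks = [
        ("1014:034a", "0001:08:00.0", "0:0:0:0", "sg0", "10 Array Member", "Active"),
        ("1014:034a", "0001:08:00.0", "0:0:1:0", "sg1", "10 Array Member", "Active"),
    ]
    print(format_table(HEADERS, disks))
```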

Part2: Configure RAID
---------------------

Script: configraid

Function: delete RAID arrays, create RAID arrays

Command format:
  
    configraid delete_raid=[all|raid_array_list|null] stripe_size_in_kb=[16|64|256] create_raid = rl#[0,10,5,6] | [PCI_ID#<num>|PCI_SLOT_name#<pci_slot_name>]|disk_num#<number of disks for one raid> create_raid = rl#[0,10,5,6] | [PCI_ID#<num>|PCI_SLOT_name#<pci_slot_name>]|disk_num#<number of disks for one raid> ...

Description:

1. Input parameters:

   delete_raid:
        1). delete_raid lists the RAID arrays to remove. If its value is all, all detected RAID arrays are deleted.
        2). If its value is a list of RAID array names, those arrays are deleted. Names are separated by ``#``. If its value is null, no RAID array is deleted. If delete_raid is not given, it defaults to null.
        3). The format is: delete_raid = [all|raid_array_list|null]
        4). Example:
            delete_raid = sda#sdd
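The three cases of delete_raid can be sketched in Python as follows. This is only an illustration of the rules above; the function name `parse_delete_raid` and its return shape are assumptions, not part of configraid.

```python
from typing import List, Optional, Tuple


def parse_delete_raid(value: Optional[str]) -> Tuple[bool, List[str]]:
    """Return (delete_all, array_names) for a delete_raid value.

    A missing value is treated like "null": nothing is deleted.
    """
    text = (value or "null").strip()
    if text == "null":
        return (False, [])
    if text == "all":
        return (True, [])
    # Otherwise the value is a '#'-separated list of RAID array names.
    return (False, [name for name in text.split("#") if name])
```

For the example above, `parse_delete_raid("sda#sdd")` yields the two names `sda` and `sdd` with the delete-all flag unset.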

   create_raid:

       1). To create a RAID array, add a line beginning with create_raid.
       2). The format is: create_raid="rl#<raidlevel>|[pci_id#<num>|pci_slot_name#<pci_slot_name>|disk_names#<sg0>#..#<sgn>]|disk_num#<number>" ...
       3). rl is the RAID level. It can be any RAID level the given adapter supports, such as 0, 10, 5, or 6.
       4). pci_id is the PCI vendor and device ID; refer to http://pci-ids.ucw.cz/read/PC/1014/034a.
       5). disk_num is the number of disks this RAID array will contain.
       6). pci_slot_name is the specified PCI location. If pci_slot_name is specified, the RAID array is created using disks from this PCI slot.
       7). If pci_id is specified, configraid detects all disks under that pci_id.
       8). If disk_names is specified, configraid configures the RAID array using the specified disks.
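As a sketch of how one create_raid value could be taken apart, the Python below splits the fields on ``|`` and each field on its first ``#``. The class `CreateRaidSpec` and the function `parse_create_raid` are hypothetical names for illustration; the real configraid script may parse its input differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

KNOWN_KEYS = {"rl", "pci_id", "pci_slot_name", "disk_names", "disk_num"}


@dataclass
class CreateRaidSpec:
    raid_level: str
    pci_id: Optional[str] = None
    pci_slot_name: Optional[str] = None
    disk_names: List[str] = field(default_factory=list)
    disk_num: Optional[int] = None


def parse_create_raid(value: str) -> CreateRaidSpec:
    """Parse e.g. 'rl#10|pci_slot_name#0001:08:00.0|disk_num#2'."""
    fields: Dict[str, str] = {}
    for part in value.split("|"):
        # Split on the first '#' only: disk_names keeps its inner '#'.
        key, sep, val = part.strip().partition("#")
        if not sep or not val or key.lower() not in KNOWN_KEYS:
            raise ValueError(f"malformed create_raid field: {part!r}")
        fields[key.lower()] = val
    if "rl" not in fields:
        raise ValueError("create_raid needs an rl#<raidlevel> field")
    return CreateRaidSpec(
        raid_level=fields["rl"],
        pci_id=fields.get("pci_id"),
        pci_slot_name=fields.get("pci_slot_name"),
        disk_names=fields["disk_names"].split("#") if "disk_names" in fields else [],
        disk_num=int(fields["disk_num"]) if "disk_num" in fields else None,
    )
```

Keys are matched case-insensitively here because the command format and the parameter description spell them with different capitalization.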

   stripe_size_in_kb:
       1). Currently supported stripe sizes in KB are 16, 64, and 256.
       2). If the stripe size is not specified, it defaults to the recommended stripe size for the selected RAID level.

2. Framework main process
   1). Load input parameters;
   2). If delete_raid is not null:
            if delete_raid is all, collect all the RAID arrays;
            delete all RAID arrays, or the specified RAID arrays from the delete_raid list;
       else:
            report that no RAID arrays need to be deleted;
   3). If there are several create_raid input parameters, handle each create_raid in turn: find the target number of disks and create the RAID array;

       If PCI_ID is given and PCI_SLOT_name is not:
             find all ipr IOA PCI locations based on PCI_ID
             reorder the ipr IOA PCI locations by their primary and secondary state
             return the ordered group, named pci_slot_group, of PCI_SLOT_name values
       If PCI_SLOT_name is given, or both PCI_SLOT_name and PCI_ID are given:
             use PCI_SLOT_name: empty pci_slot_group
             and put PCI_SLOT_name into pci_slot_group
       If neither PCI_ID nor PCI_SLOT_name is given, but disk_names is:
             find the specified disks;
       If pci_slot_group is not empty:
             for each PCI_SLOT_name in pci_slot_group:
                  find all member disks on the IPR RAID adapter; this yields multiple lines, sloc_of_disks_group, such as <pciloc_of_ioa1>=<sloc_of_disk1>,<sloc_of_disk2>......<pciloc_of_ioa2>=<sloc_of_disk1>,<sloc_of_disk2>......

        for each create_raid:
             if sloc_of_disks_group is not null:
                  find its <pciloc_of_ioa1>=<sloc_of_disk1>,<sloc_of_disk2>,
                  sort the disk devices by resource path
                  find the required number of disks (disk_num) among the af (advanced function) disks and save them as af_disks_group
                  if af_disks_group is not null:
                       if disk_num is not null:
                            if there are not enough disks on the adapter, break the current loop;
                            if all target disks are already in use, reuse this array and break the current loop;
                       else:
                            disk_num is the number of all disks in af_disks_group
                       pick up af disks to form a list
                       create an array using the specified raid_level, stripe_size_in_kb, and list of af disks;
                       check whether the array is ready.
   4). Log files are saved in /tmp/
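The disk selection step of the main process can be sketched as below. It assumes that a resource path is a colon-separated list of decimal numbers, as in the diskdiscover reference example (0:0:1:0), and that each advanced function disk is given as a (resource_path, device_name) pair. The names `resource_path_key` and `pick_disks` are hypothetical, and the sketch leaves out the check for disks already in use.

```python
from typing import List, Optional, Tuple


def resource_path_key(path: str) -> Tuple[int, ...]:
    """Sort key that compares path components numerically, not as text."""
    return tuple(int(part) for part in path.split(":"))


def pick_disks(af_disks: List[Tuple[str, str]],
               disk_num: Optional[int] = None) -> List[str]:
    """Return the device names to use for one RAID array.

    Disks are ordered by resource path. With disk_num unset, all disks
    are used. An empty list means the adapter has too few disks.
    """
    ordered = sorted(af_disks, key=lambda disk: resource_path_key(disk[0]))
    if disk_num is None:
        disk_num = len(ordered)
    if len(ordered) < disk_num:
        return []
    return [name for _, name in ordered[:disk_num]]
```

Sorting on numeric tuples rather than raw strings keeps 0:0:2:0 ahead of 0:0:10:0, which a plain string sort would reverse.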

3. Other common functions called by main process:

   a). load configuration file
   b). log utils
   c). delete ipr arrays
   d). create ipr arrays
   e). check ipr device status
   f). wait for ipr device status
   g). check disk format, af or jbod
   h). order resource path
   i). convert between disk scsi location and device_name
   j). sort disk devices by resource path
   k). reorder ipr ioa pci locations by its primary and secondary state
   l). format jbod disks into advanced format
   m). pick up specified number of disks from af_disks_group
   n). other

4. Work process:

   1). Execute commands on the xCAT MN. For example, on ppc64, to delete all RAID arrays and create a RAID 10 array from the first 2 disks on PCI slot 0001:08:00.0:

     nodeset <compute_node> cmd=configRAID delete_raid=all  create_raid=rl#10|pci_slot_name#0001:08:00.0|disk_num#2,shell
     rpower <compute_node> reset

   2). Use xdsh to monitor the progress of the RAID build.
   3). At the current stage, we focus on the framework process; monitoring of the RAID build can be considered in the future.
