Posts tagged "RAID" on りおてく

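This explanation didn't make it into a manuscript I wrote for a certain magazine, but I personally found the experiment fun, so here it is.
The idea: build a RAID with 4 KB chunks, then write 4095 bytes of data into a file so that, together with the trailing newline, it comes to exactly 4 KB. Write that to the RAID device, hexdump the member devices, and the data should be visible as-is. A few things need care, though: use ext2 as the file system, make sure no other process writes to it, mount with noatime, and so on.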

Create a RAID 5 array with a 4-kilobyte chunk size and the left-symmetric layout:

# mdadm --create --auto=yes /dev/md2 --level=5 --layout=ls --chunk=4 --raid-devices=3 /dev/sdc1 /dev/sdd1 /dev/sde1
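
To get a feel for what --layout=ls means before looking at the dumps, here is a small shell sketch of the left-symmetric placement as I understand it from the md driver: parity starts on the last member and moves one device toward the first on each stripe, and the data chunks start on the device right after the parity and wrap around. The function name ls_layout and the 0-based disk numbering are my own, for illustration only.

```shell
# Print where parity and data land for one stripe of an n-disk
# left-symmetric RAID 5 (disks and stripes are numbered from 0).
ls_layout() {
    n=$1        # number of member devices
    stripe=$2   # stripe number
    parity=$(( (n - 1) - stripe % n ))
    line="stripe $stripe: parity on disk $parity, data on"
    k=0
    while [ "$k" -lt $(( n - 1 )) ]; do
        # data chunk k sits k+1 devices after the parity, wrapping around
        line="$line disk $(( (parity + 1 + k) % n ))"
        k=$(( k + 1 ))
    done
    echo "$line"
}

for s in 0 1 2 3; do
    ls_layout 3 "$s"
done
```

With /dev/sdc1 as disk 0, /dev/sdd1 as disk 1 and /dev/sde1 as disk 2, the parity rotation 2, 1, 0, 2, ... is the same cycle the unprintable chunks follow in the dumps further down.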

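Example command that creates a 4-kilobyte file containing nothing but "1" characters (4095 of them, plus the newline that echo appends):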

# for i in `seq 1 4095`; do buf=$buf"1" ; done ;echo $buf > file_1
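
The dumps show chunks filled with "1" through "8", so presumably eight such files were written. A loop-free way to produce them, as a sketch: the function name make_chunk_file and the temporary directory are my own choices, and head -c is assumed to be available (it is in GNU coreutils).

```shell
# Write 4095 copies of one character plus a trailing newline,
# 4096 bytes in total, to the given path.
make_chunk_file() {
    ch=$1
    out=$2
    # head emits 4095 NUL bytes; tr turns each into the fill character
    head -c 4095 /dev/zero | tr '\0' "$ch" > "$out"
    printf '\n' >> "$out"
}

# Scratch directory for the demo; on the real system this would be
# the mounted RAID file system.
workdir=$(mktemp -d)
for n in 1 2 3 4 5 6 7 8; do
    make_chunk_file "$n" "$workdir/file_$n"
done
wc -c "$workdir"/file_*
```

Each file should come out at exactly 4096 bytes, i.e. one chunk.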
[/dev/sdc1]

04100000  03 03 03 03 03 03 03 03  03 03 03 03 03 03 03 03  |................|
*
04100ff0  03 03 03 03 03 03 03 03  03 03 03 03 03 03 03 00  |................|
04101000  33 33 33 33 33 33 33 33  33 33 33 33 33 33 33 33  |3333333333333333|
*
04101ff0  33 33 33 33 33 33 33 33  33 33 33 33 33 33 33 0a  |333333333333333.|
04102000  36 36 36 36 36 36 36 36  36 36 36 36 36 36 36 36  |6666666666666666|
*
04102ff0  36 36 36 36 36 36 36 36  36 36 36 36 36 36 36 0a  |666666666666666.|
04103000  0f 0f 0f 0f 0f 0f 0f 0f  0f 0f 0f 0f 0f 0f 0f 0f  |................|
*
04103ff0  0f 0f 0f 0f 0f 0f 0f 0f  0f 0f 0f 0f 0f 0f 0f 00  |................|


[/dev/sdd1]
04100000  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31  |1111111111111111|
*
04100ff0  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 0a  |111111111111111.|
04101000  34 34 34 34 34 34 34 34  34 34 34 34 34 34 34 34  |4444444444444444|
*
04101ff0  34 34 34 34 34 34 34 34  34 34 34 34 34 34 34 0a  |444444444444444.|
04102000  03 03 03 03 03 03 03 03  03 03 03 03 03 03 03 03  |................|
*
04102ff0  03 03 03 03 03 03 03 03  03 03 03 03 03 03 03 00  |................|
04103000  37 37 37 37 37 37 37 37  37 37 37 37 37 37 37 37  |7777777777777777|
*
04103ff0  37 37 37 37 37 37 37 37  37 37 37 37 37 37 37 0a  |777777777777777.|


[/dev/sde1]
04100000  32 32 32 32 32 32 32 32  32 32 32 32 32 32 32 32  |2222222222222222|
*
04100ff0  32 32 32 32 32 32 32 32  32 32 32 32 32 32 32 0a  |222222222222222.|
04101000  07 07 07 07 07 07 07 07  07 07 07 07 07 07 07 07  |................|
*
04101ff0  07 07 07 07 07 07 07 07  07 07 07 07 07 07 07 00  |................|
04102000  35 35 35 35 35 35 35 35  35 35 35 35 35 35 35 35  |5555555555555555|
*
04102ff0  35 35 35 35 35 35 35 35  35 35 35 35 35 35 35 0a  |555555555555555.|
04103000  38 38 38 38 38 38 38 38  38 38 38 38 38 38 38 38  |8888888888888888|
*
04103ff0  38 38 38 38 38 38 38 38  38 38 38 38 38 38 38 0a  |888888888888888.|
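
The unprintable chunks are the parity. RAID 5 parity is the XOR of the data chunks in the same stripe, which can be checked against the dumps with plain shell arithmetic (xor_hex is just a helper name I made up):

```shell
# XOR two hex bytes and print the result as two hex digits.
xor_hex() {
    printf '%02x\n' $(( 0x$1 ^ 0x$2 ))
}

xor_hex 31 32   # chunks "1" and "2"
xor_hex 33 34   # chunks "3" and "4"
xor_hex 35 36   # chunks "5" and "6"
xor_hex 37 38   # chunks "7" and "8"
xor_hex 0a 0a   # the two trailing newlines
```

The results, 03, 07, 03, 0f and 00, match the parity bytes in the dumps: 03 on /dev/sdc1 at 04100000, 07 on /dev/sde1, 03 on /dev/sdd1, 0f on /dev/sdc1, and 00 at the end of each parity chunk where the two 0a newlines cancel out.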

The motherboard has an ICH10R and there are two 1 TB HDDs, so I did the usual thing (^^ゞ
The dmraid package doesn't come with much information; this really is all there is. A bit too offhand...

# rpm -ql dmraid
/sbin/dmraid
/sbin/dmraid.static
/usr/lib64/libdmraid.so.1.0.0.rc13
/usr/share/doc/dmraid-1.0.0.rc13
/usr/share/doc/dmraid-1.0.0.rc13/CHANGELOG
/usr/share/doc/dmraid-1.0.0.rc13/CREDITS
/usr/share/doc/dmraid-1.0.0.rc13/KNOWN_BUGS
/usr/share/doc/dmraid-1.0.0.rc13/LICENSE
/usr/share/doc/dmraid-1.0.0.rc13/LICENSE_GPL
/usr/share/doc/dmraid-1.0.0.rc13/LICENSE_LGPL
/usr/share/doc/dmraid-1.0.0.rc13/README
/usr/share/doc/dmraid-1.0.0.rc13/TODO
/usr/share/doc/dmraid-1.0.0.rc13/dmraid_design.txt
/usr/share/man/man8/dmraid.8.gz
/var/lock/dmraid

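Hmm, it isn't proprietary, so I suppose the attitude is "just read the code, it's not doing anything difficult, German science is the best in the world!" Anyway, for a start: run it with -s. And to activate, use -ay.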

# dmraid -s
*** Group superset isw_cgejcgagfj
--> Subset
name   : isw_cgejcgagfj_Vol0
size   : 1953519616
stride : 128
type   : mirror
status : ok
subsets: 0
devs   : 2
spares : 0

# dmraid -ay
RAID set "isw_cgejcgagfj_Vol0" was activated
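
Assuming dmraid reports size and stride in 512-byte sectors (the output does not label the unit), the numbers convert as follows. sectors_to_gib is a hypothetical helper name.

```shell
# Convert a count of 512-byte sectors to GiB with two decimals.
sectors_to_gib() {
    awk -v s="$1" 'BEGIN { printf "%.2f\n", s * 512 / (1024 * 1024 * 1024) }'
}

sectors_to_gib 1953519616            # "size" from dmraid -s
echo "$(( 128 * 512 / 1024 )) KiB"   # "stride" from dmraid -s
```

That gives about 931.51 GiB, which is in line with a mirrored pair of 1 TB drives, and a 64 KiB stride.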

# ll /dev/mapper/
total 0
brw-rw---- 1 root disk 253,  1 Nov 29 22:31 VolGroup01-LogVol00
brw-rw---- 1 root disk 253,  0 Nov 29 22:31 VolGroup01-LogVol01
crw------- 1 root root  10, 63 Nov 29 22:31 control
brw-rw---- 1 root disk 253,  2 Nov 29 22:32 isw_cgejcgagfj_Vol0

Ah, a device node has appeared. So the next step is either fdisk or pvcreate/vgcreate/lvcreate. I'll go with LVM for now (on a single partition created with fdisk first).

# fdisk /dev/mapper/isw_cgejcgagfj_Vol0 
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.


The number of cylinders for this disk is set to 121600.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-121600, default 1): 
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-121600, default 121600): 
Using default value 121600

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 22: Invalid argument.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.

# partprobe /dev/mapper/isw_cgejcgagfj_Vol0 
# ll /dev/mapper/isw_cgejcgagfj_Vol0*
brw-rw---- 1 root disk 253, 2 Nov 29 22:33 /dev/mapper/isw_cgejcgagfj_Vol0
brw-rw---- 1 root disk 253, 3 Nov 29 22:33 /dev/mapper/isw_cgejcgagfj_Vol0p1

# pvcreate /dev/mapper/isw_cgejcgagfj_Vol0p1 
  Physical volume "/dev/mapper/isw_cgejcgagfj_Vol0p1" successfully created

# vgcreate -s 32M VolGroup00 /dev/mapper/isw_cgejcgagfj_Vol0p1 
  Volume group "VolGroup00" successfully created

# vgdisplay 
  --- Volume group ---
  VG Name               VolGroup01
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               74.31 GB
  PE Size               32.00 MB
  Total PE              2378
  Alloc PE / Size       2378 / 74.31 GB
  Free  PE / Size       0 / 0   
  VG UUID               kcEq05-mAM4-CEQ4-z9t6-HYDm-9Vrs-zvvw5v
   
  --- Volume group ---
  VG Name               VolGroup00
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               931.50 GB
  PE Size               32.00 MB
  Total PE              29808
  Alloc PE / Size       0 / 0   
  Free  PE / Size       29808 / 931.50 GB
  VG UUID               iLmukb-Sq3i-buph-6doM-kkGq-SoW3-jedrfH
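
The VG sizes are simply Total PE multiplied by PE Size, which is also where the 29808 passed to lvcreate comes from. A quick check, treating the "GB" that vgdisplay prints as GiB (my assumption; pe_to_gib is a made-up helper):

```shell
# Size in GiB of a number of physical extents.
# $1 = extent count, $2 = extent size in MiB
pe_to_gib() {
    awk -v pe="$1" -v mib="$2" 'BEGIN { printf "%.2f\n", pe * mib / 1024 }'
}

pe_to_gib 29808 32   # VolGroup00
pe_to_gib 2378 32    # VolGroup01
```

This prints 931.50 and 74.31, the same figures vgdisplay shows for the two volume groups.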

# lvcreate -l 29808 -n LogVol00 VolGroup00
  Logical volume "LogVol00" created

# lvdisplay 
  --- Logical volume ---
  LV Name                /dev/VolGroup01/LogVol01
  VG Name                VolGroup01
  LV UUID                LI52WN-bwWq-rsme-9YG5-Ko10-n7tJ-0yt4E8
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                72.31 GB
  Current LE             2314
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
   
  --- Logical volume ---
  LV Name                /dev/VolGroup01/LogVol00
  VG Name                VolGroup01
  LV UUID                cTWWR8-Npzf-A2ve-BKJQ-DP9O-zyA3-lSWkXz
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                2.00 GB
  Current LE             64
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1
   
  --- Logical volume ---
  LV Name                /dev/VolGroup00/LogVol00
  VG Name                VolGroup00
  LV UUID                0DTn9b-deMD-hv3l-JePf-XOfW-B1hQ-tmMEG0
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                931.50 GB
  Current LE             29808
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:4

Well, after that, mke2fs and an entry in fstab should do it.

# mke2fs -j /dev/VolGroup00/LogVol00 
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
122093568 inodes, 244187136 blocks
12209356 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
7452 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 
	102400000, 214990848

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 29 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
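
The figures mke2fs prints hang together arithmetically. A sketch that re-derives them from the block count; fs_geometry is a name I made up, and the 5% reservation is taken from the output above rather than from any knowledge of how mke2fs rounds.

```shell
# Re-derive the mke2fs summary from the block count.
# $1 = total blocks, $2 = blocks per group, $3 = inodes per group
fs_geometry() {
    blocks=$1
    bpg=$2
    ipg=$3
    groups=$(( blocks / bpg ))
    echo "block groups: $groups"
    echo "inodes: $(( groups * ipg ))"
    echo "reserved blocks: $(( blocks * 5 / 100 ))"
}

fs_geometry 244187136 32768 16384

# 29808 extents of 32 MiB, expressed in 4 KiB file system blocks
echo "LV blocks: $(( 29808 * 32 * 1024 / 4 ))"
```

The output reproduces 7452 block groups, 122093568 inodes and 12209356 reserved blocks, and the last line shows that the logical volume is exactly 244187136 blocks of 4 KiB.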

Just to be sure, check iostat. Looks like the writes are going through.

# partprobe /dev/sda /dev/sdb

# iostat 
Linux 2.6.18-120.el5 (rhel5.rio.st) 	11/29/08

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.58    0.14    4.24   14.82    0.00   80.21

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              43.42         4.22     43091.95       2486   25411751
sda1             26.76         0.29     27359.45        170   16134144
sdb              46.54         4.24     43312.50       2499   25541815
sdb1             22.81         0.19     21382.59        114   12609528
sdc              25.96       771.38       103.56     454892      61070
sdc1              0.28        21.00         0.02      12386         14
sdc2             25.65       749.90       103.54     442226      61056
dm-0             41.45       746.74       103.54     440362      61056
dm-1              0.21         1.64         0.00        968          0
dm-2           6526.75         1.42     52207.40        837   30787223
dm-3           6526.07         0.35     52207.38        209   30787215
dm-4           6525.94         0.18     52207.36        104   30787200

Oops, almost forgot: better run mkinitrd.

# mkinitrd -v ./initrd-2.6.18-120.el5.img `uname -r`
Creating initramfs
Looking for deps of module ehci-hcd
Looking for deps of module ohci-hcd
Looking for deps of module uhci-hcd
Looking for deps of module ext3: jbd 
Looking for deps of module jbd
Looking for driver for device sdc2
Looking for deps of module pci:v00008086d00002822sv00001043sd000082D4bc01sc04i00: scsi_mod libata ahci 
Looking for deps of module scsi_mod
Looking for deps of module sd_mod: scsi_mod 
Looking for deps of module libata: scsi_mod 
Looking for deps of module ahci: scsi_mod libata 
Looking for deps of module usb-storage: scsi_mod 
Looking for deps of module ide-disk
Looking for deps of module dm-mod
Looking for deps of module dm-mirror: dm-mod dm-log 
Looking for deps of module dm-log: dm-mod 
Looking for deps of module dm-zero: dm-mod 
Looking for deps of module dm-snapshot: dm-mod 
Using modules:  /lib/modules/2.6.18-120.el5/kernel/drivers/usb/host/ehci-hcd.ko /lib/modules/2.6.18-120.el5/kernel/drivers/usb/host/ohci-hcd.ko /lib/modules/2.6.18-120.el5/kernel/drivers/usb/host/uhci-hcd.ko /lib/modules/2.6.18-120.el5/kernel/fs/jbd/jbd.ko /lib/modules/2.6.18-120.el5/kernel/fs/ext3/ext3.ko /lib/modules/2.6.18-120.el5/kernel/drivers/scsi/scsi_mod.ko /lib/modules/2.6.18-120.el5/kernel/drivers/scsi/sd_mod.ko /lib/modules/2.6.18-120.el5/kernel/drivers/ata/libata.ko /lib/modules/2.6.18-120.el5/kernel/drivers/ata/ahci.ko /lib/modules/2.6.18-120.el5/kernel/drivers/usb/storage/usb-storage.ko /lib/modules/2.6.18-120.el5/kernel/drivers/md/dm-mod.ko /lib/modules/2.6.18-120.el5/kernel/drivers/md/dm-log.ko /lib/modules/2.6.18-120.el5/kernel/drivers/md/dm-mirror.ko /lib/modules/2.6.18-120.el5/kernel/drivers/md/dm-zero.ko /lib/modules/2.6.18-120.el5/kernel/drivers/md/dm-snapshot.ko
/sbin/nash -> /tmp/initrd.XL3918/bin/nash
/sbin/insmod.static -> /tmp/initrd.XL3918/bin/insmod
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/usb/host/ehci-hcd.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/ehci-hcd.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/usb/host/ohci-hcd.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/ohci-hcd.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/usb/host/uhci-hcd.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/uhci-hcd.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/fs/jbd/jbd.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/jbd.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/fs/ext3/ext3.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/ext3.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/scsi/scsi_mod.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/scsi_mod.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/scsi/sd_mod.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/sd_mod.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/ata/libata.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/libata.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/ata/ahci.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/ahci.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/usb/storage/usb-storage.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/usb-storage.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/md/dm-mod.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/dm-mod.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/md/dm-log.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/dm-log.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/md/dm-mirror.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/dm-mirror.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/md/dm-zero.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/dm-zero.ko' [elf64-x86-64]
copy from `/lib/modules/2.6.18-120.el5/kernel/drivers/md/dm-snapshot.ko' [elf64-x86-64] to `/tmp/initrd.XL3918/lib/dm-snapshot.ko' [elf64-x86-64]
/sbin/lvm.static -> /tmp/initrd.XL3918/bin/lvm
/etc/lvm -> /tmp/initrd.XL3918/etc/lvm
`/etc/lvm/lvm.conf' -> `/tmp/initrd.XL3918/etc/lvm/lvm.conf'
/sbin/dmraid.static -> /tmp/initrd.XL3918/bin/dmraid
/sbin/kpartx.static -> /tmp/initrd.XL3918/bin/kpartx
Adding module ehci-hcd
Adding module ohci-hcd
Adding module uhci-hcd
Adding module jbd
Adding module ext3
Adding module scsi_mod
Adding module sd_mod
Adding module libata
Adding module ahci
Adding module usb-storage
Adding module dm-mod
Adding module dm-log
Adding module dm-mirror
Adding module dm-zero
Adding module dm-snapshot

Facts about hard disk drive failure rates

The part saying "no difference in reliability between SATA, SCSI, and Fibre Channel drives can be discerned" is a bit of a shock. Huh, the manufacturers' sites tell you that SATA is for consumers.

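Fujitsu releases a notebook PC with two built-in HDDs that supports RAID 1. Oh, there's definitely going to be demand for this. I wonder if it just wasn't feasible until now because of space constraints?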
