Installing Oracle 10g RAC on Solaris 9

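1. The official Oracle installation documents (doc 2 and doc 3 below) contain a number of
   errors. The latest Release Notes (August 2006) corrected some of them, but many remain.
2. Read the three scripts in this article carefully before using them. The author is not
   responsible for any system damage caused by improper use.
3. The installation steps follow Oracle's official installation documents as closely as
   possible. The one exception is "Configuring UDP parameters", where a different method
   had to be used because the one described in the official documents does not work.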


Part1  Do the Pre-installation Tasks

This is a guide for installing Oracle 10g RAC (Enterprise Edition) with Oracle Clusterware
on Solaris 9 for SPARC (64-bit).
It follows the instructions in these documents:

1. Oracle Database Release Notes
   10g Release 2 (10.2) for Solaris Operating System (SPARC 64-Bit)  Part Number B15689-03
2. Oracle Clusterware and Oracle Real Application Clusters Installation Guide
   10g Release 2 (10.2) for Solaris Operating System  Part Number B14205-06
3. Oracle Database Installation Guide
   10g Release 2 (10.2) for Solaris Operating System (SPARC 64-Bit)  Part Number B15690-02

The Hardware:
Two Ultra-2 Enterprise servers. cpu:300MHz x 2, memory:1024mb. storage:Sun A1010 disk array
A Sun fiber channel host adapter(X1057A) is installed on each node to connect to the A1010
with a fiber cable. A Sun multi-pack is connected to one of the Ultra-2 servers.
A 10BT network card is installed on each node for the private network.
A crossover network cable connects the two 10BT network cards.
(According to doc 2, Oracle 10g does not support crossover network cables.)
An SBus frame buffer (X359A) is installed on both nodes.
Because the hardware barely meets the requirements, it is only good for testing.

The OS:
Install Solaris9 4/04(64-Bit) on both nodes, and install the latest patch cluster.
Configure TCP Wrappers and NTP on the two nodes. Solaris 9 includes both packages.
Install openssh-4.3p2-sol9-sparc-local on the two nodes.
Secure and harden the systems by following some procedures in the doc
SANS Solaris Security Step by Step Version 2.0:
http://www.depts.ttu.edu/tosm/se ... actices/solaris.pdf

The Media for Installation:
Oracle Database 10g Release2 10.2.0.1.0 for Solaris SPARC 64-bit Enterprise Edition
If possible, download 10.2.0.2 or later, so you will have fewer patches to install afterwards.
I had to copy the two DVDs to local hard disks, since the Ultra-2 does not have a DVD drive.
You might want to configure a NFS server on node1 and an auto client on node2,
so you can run the Cluster Verification Utility on node2 as well.

The steps of the installation/configuration:
1. Do the pre-installation tasks
2. Install Oracle clusterware
3. Test/verify the clusterware
4. Install Oracle Database software only with RAC.
5. Configure ASM and make sure it is running on both nodes.
6. Create a RAC database with DBCA.

Disk & shared storage configuration:
----------------------------------------------------------------------------------------------
Item           Required   Actual   Type   Mount point        Partition(s)
----------------------------------------------------------------------------------------------
DB software    4GB        4GB      UFS    /u01/app/oracle    c0t1d0s0 (not shared)
DB datafiles   1.2GB      12GB     ASM                       c3t0d2s7, c3t1d2s7, c3t3d2s7
OCR            100MB      128MB    raw                       c3t0d4s6, c3t2d4s6, c3t4d4s6
Voting disk    20MB       32MB     raw                       c3t0d4s7, c3t2d4s7, c3t4d4s7
swap           400MB      1GB                                c0t0d0s1
----------------------------------------------------------------------------------------------
The clusterware will be installed in its own home directory on node1.
The redundancy level of the OCR and voting disk will be "normal".
Because this is for testing only, no space is allocated for flash recovery or log archiving.

Pre-Installation Tasks

A.
Network configuration
This part is done manually. Here are the files on node1 as an example:

/etc/hosts
127.0.0.1       localhost      
# node1
192.168.1.64   rac1  rac1.abc.com    loghost
10.10.10.1      rac1-priv rac1-priv.abc.com
192.168.1.69   rac1-vip  rac1-vip.abc.com
# node2
192.168.1.77   rac2 rac2.abc.com
10.10.10.2      rac2-priv rac2-priv.abc.com
192.168.1.76   rac2-vip  rac2-vip.abc.com

/etc/inet/netmasks
192.168.0.0     255.255.255.0
10.10.10.0      255.255.255.0

/etc/hostname.hme0
rac1

/etc/hostname.le0
10.10.10.1

/etc/defaultrouter
192.168.1.1

/etc/hosts.allow
ALL:    192.168.1. 127.0.0.1 10.


Run these commands to bring up the network interface le0 and test it.

# chown root:root /etc/hostname.le0
# ifconfig le0 plumb
# ifconfig le0 10.10.10.1 netmask 255.255.255.0 up
# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.1.64 netmask ffffff00 broadcast 192.168.1.255
        ether 8:0:20:82:47:d4
le0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 10.10.10.1 netmask ffffff00 broadcast 10.10.10.255
        ether 8:0:20:82:47:d4
#

To check the network setup, run the CVU now, or run it after all the tasks are done.
The command is:
/ora10.dvd2/clusterware/cluvfy/runcluvfy.sh comp nodecon -n rac1,rac2 -verbose
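
Before running the CVU, a quick check of the hosts file can catch a missing name early.
The sketch below assumes the naming pattern used in /etc/hosts above (node, node-priv,
node-vip) and a POSIX shell such as ksh. The function names has_host_name and
check_hosts_entries are made up for this example; they are not part of any Oracle tool.

```shell
# Sketch: report any public, -priv or -vip name missing from a hosts file.

has_host_name() {
    # $1 = hosts file, $2 = name that must appear as a whole field
    while read -r addr names; do
        case $addr in
            ''|'#'*) continue ;;      # skip blank lines and comment lines
        esac
        for n in $names; do
            [ "$n" = "$2" ] && return 0
        done
    done < "$1"
    return 1
}

check_hosts_entries() {
    # $1 = hosts file, remaining arguments = node names
    hosts_file=$1; shift
    rc=0
    for node in "$@"; do
        for name in "${node}" "${node}-priv" "${node}-vip"; do
            if has_host_name "$hosts_file" "$name"; then
                echo "ok:      $name"
            else
                echo "MISSING: $name"
                rc=1
            fi
        done
    done
    return $rc
}

# Example: check_hosts_entries /etc/hosts rac1 rac2
```

The function returns a non-zero status when any name is missing, so it can be used as a
guard in front of the CVU command.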

B.
Other pre-installation tasks except Configuring SSH
Most of these tasks are done by a shell script, pre.install.ora10.conf.sh.
The script performs the pre-installation tasks for the clusterware and most of the
pre-installation tasks for the Oracle database. It is meant to be run only on a fresh
installation of Solaris 9. Run it on both nodes as root.

pre.install.ora10.conf.sh
------------------------------------------------------------
#!/bin/sh
# Pre-installation conf. on Solaris9 for installing Oracle 10g R2 with RAC.
# Written by susbin@chinaunix.net  072406

sc_name=pre.install.ora10.conf.sh

ORACLE_BASE=/u01/app/oracle;  export ORACLE_BASE
CRS_BASE=/u01/crs/oracle/product/10;  export CRS_BASE
ORACLE_HOME=${CRS_BASE}/crs;  export ORACLE_HOME
PATH=$PATH:/usr/ccs/bin:/usr/local/bin;  export PATH

echo ===============================================================
echo $sc_name started at `date`.
echo " "

echo " "
echo "=============================================="
echo "Creating Required Operating System Groups and Users"
echo " "

echo "Creating groups: dba, osdba, and oinstall."
groupadd -g 201 dba
groupadd -g 202 oinstall
groupadd -g 203 osdba
echo "Check them with the command: grep 20 /etc/group"
grep 20 /etc/group
echo " "

echo "Check if 'nobody' exists on the system with: id -a nobody"
echo ""
id -a nobody
echo " "

echo "Creating the directory ORACLE_BASE, which is set to $ORACLE_BASE"
mkdir -p $ORACLE_BASE
echo "Check it with the command: ls -l /u01/app "
echo ""
ls -l /u01/app
echo " "

echo "Creating the user account 'oracle':"
useradd -u 1005 -g 202 -G 201,203 -d $ORACLE_BASE -m -s /bin/ksh oracle
echo "Check the line in /etc/passwd with: grep oracle /etc/passwd"
grep oracle /etc/passwd
echo "Set the password of account oracle:"
passwd -r files oracle
# Owner oracle, group oinstall (gid 202).
chown -R oracle:oinstall ${ORACLE_BASE}
chmod -R 775 $ORACLE_BASE
echo " "

echo "Check if the oracle account has required groups with: id -a oracle "
echo " "
id -a oracle
echo " "

echo " "
echo "=============================================="
echo "Configuring Kernel Parameters"
echo " "

echo "Save a copy of /etc/system and append nine lines to it."
echo "Need to reboot the system so the new parameters can take effect."
cp -p /etc/system /etc/system.orig
chmod 644 /etc/system

/bin/cat << EOF >> /etc/system
set noexec_user_stack=1
set semsys:seminfo_semmni=100
set semsys:seminfo_semmns=1024
set semsys:seminfo_semmsl=256
set semsys:seminfo_semvmx=32767
set shmsys:shminfo_shmmax=4294967295
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=100
set shmsys:shminfo_shmseg=10
EOF

echo " "
echo "Check /etc/system with the command: tail -9 /etc/system"
tail -9 /etc/system
echo " "

echo " "
echo "=============================================="
echo "Identifying Required Software Directories"
echo "ORACLE_BASE is set to $ORACLE_BASE, and the size of it should be 3GB or bigger."
echo "Check it with the command: $ df -h $ORACLE_BASE"
echo " "
mount /dev/dsk/c0t1d0s0 $ORACLE_BASE
df -h $ORACLE_BASE
echo " "

echo " "
echo "=============================================="
echo "Configuring the Oracle User's Environment"

# The quoted 'EOF' keeps $PATH and ${CRS_BASE} unexpanded, so they are
# evaluated when oracle logs in rather than when root runs this script.
/bin/cat << 'EOF' > ${ORACLE_BASE}/.profile
if [ -t 0 ]; then
   stty intr ^C
fi
umask 022
ORACLE_BASE=/u01/app/oracle;  export ORACLE_BASE
# for crs
CRS_BASE=/u01/crs/oracle/product/10;  export CRS_BASE
ORACLE_HOME=${CRS_BASE}/crs;  export ORACLE_HOME
PATH=$PATH:/usr/local/bin:.:/bin:/usr/sbin:/usr/ucb;  export PATH
# end of crs
# for oraDB
#ORACLE_SID=rac1; export ORACLE_SID
# end of oraDB
EDITOR=vi;  export EDITOR
EXINIT='set nu showmode';  export EXINIT
EOF

chown -R oracle:oinstall ${ORACLE_BASE}
echo " "

echo " "
echo "=============================================="
echo "Configuring oracle clusterware home directory, which is set to"
echo " ${CRS_BASE}/crs "
mkdir -p ${CRS_BASE}/crs                                             
chown -R root:oinstall /u01/crs
chmod -R 775 /u01/crs/oracle
echo " "
ls -l $CRS_BASE

echo ""
echo "=============================================="
echo "Configuring UDP parameters by creating the script S70ndd under"
echo "/etc/rc2.d, which sets the two ndd values to 65536 at boot."
echo " "

/bin/cat << EOF > /etc/rc2.d/S70ndd
#!/sbin/sh
PATH=/usr/sbin;export PATH
ndd -set /dev/udp udp_recv_hiwat 65536
ndd -set /dev/udp udp_xmit_hiwat 65536
exit 0
EOF

chown root:sys /etc/rc2.d/S70ndd
chmod 755 /etc/rc2.d/S70ndd
echo "Check S70ndd with 'ls -l S70ndd' and 'cat S70ndd'"
echo " "
ls -l /etc/rc2.d/S70ndd
echo " "
cat /etc/rc2.d/S70ndd

echo " "
echo "=============================================="
echo "Verify that the /etc/hosts file is used for name resolution"
echo "with the command: grep hosts: /etc/nsswitch.conf | grep files "
echo " "
grep hosts: /etc/nsswitch.conf | grep files

echo ""
echo "=============================================="
echo "Verify that the host name has been set with: hostname"
echo " "
hostname

echo ""
echo "=============================================="
echo "Verify that the domain name has NOT been set with: domainname"
echo " "
domainname

echo ""
echo "=============================================="
echo "Verify that the hosts file contains the fully qualified host name"
echo "with the command: grep \`hostname\` /etc/hosts "
echo " "
grep `hostname` /etc/hosts

echo " "
echo "The pre-installation configuration tasks are done on this node."
echo "Reboot the system so the new parameters can take effect."
echo " "
echo $sc_name ended at `date`.
echo ===============================================================

------------------------------------------------------------
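
The script above appends to /etc/system every time it runs, which is why it should only
be run once on a fresh system. If you ever need to re-run it, a guard like the one below
avoids duplicate lines. This is only a sketch: append_once is a made-up helper name, and
it assumes a grep that understands -F and -x (on Solaris 9 that may mean
/usr/xpg4/bin/grep rather than /usr/bin/grep).

```shell
# Sketch: append a line to a file only if that exact line is not there yet.
append_once() {
    file=$1
    line=$2
    # -F: treat the line as a fixed string, -x: match the whole line only
    if grep -F -x -e "$line" "$file" >/dev/null 2>&1; then
        echo "already present: $line"
    else
        echo "$line" >> "$file"
        echo "appended:        $line"
    fi
}

# Example (as root):
#   append_once /etc/system "set shmsys:shminfo_shmseg=10"
```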


Check the current values of the kernel parameters after rebooting the system.
Log in as user oracle, and then run these commands on all nodes:
/usr/sbin/sysdef | grep SEM
/usr/sbin/sysdef | grep SHM
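
sysdef shows what the running kernel uses. To confirm that /etc/system itself holds the
intended values, something like the sketch below can be used as a complement. The helper
name check_system_param is made up for this example, and it only understands plain
"set name=number" lines as written by the script above.

```shell
# Sketch: compare one "set <param>=<number>" entry in a file with the
# value you expect. If the parameter occurs more than once, the last
# occurrence in the file is the one that is reported.
check_system_param() {
    file=$1; param=$2; want=$3
    have=$(sed -n "s/^set  *$param *= *\([0-9][0-9]*\).*/\1/p" "$file" | sed -n '$p')
    if [ -z "$have" ]; then
        echo "not set: $param"
        return 1
    elif [ "$have" = "$want" ]; then
        echo "ok: $param=$have"
    else
        echo "differs: $param=$have (expected $want)"
        return 1
    fi
}

# Example:
#   check_system_param /etc/system semsys:seminfo_semmni 100
#   check_system_param /etc/system shmsys:shminfo_shmmax 4294967295
```

The values are compared as strings rather than numbers, so a large value such as
4294967295 does not depend on the shell's integer size.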

Check the env variables of oracle account for installing the Clusterware:
$ env | grep ORACLE
ORACLE_BASE=/u01/app/oracle
ORACLE_HOME=/u01/crs/oracle/product/10/crs

$

C.
Configuring SSH on All Cluster Nodes
Solaris 9 ships with Sun_SSH_1.1, which has a bug. It is discussed here:
http://forum.sun.com/jive/thread.jspa?threadID=94357
The instructions for configuring SSH in doc 2 are based on OpenSSH 3.x. Doc 2 also
points out that Oracle NetCA and DBCA require scp and ssh to be located in
/usr/local/bin. For these reasons, I chose to install openssh-4.3p2-sol9-sparc-local
on the two nodes. You also need to set "StrictModes" to "no" in
/usr/local/etc/sshd_config; otherwise ssh will prompt for a password even after all
the SSH configuration tasks are done. (This is presumably because StrictModes rejects
the keys when the home directory is group-writable, and the script above sets
/u01/app/oracle to mode 775.)

Two scripts are provided for configuring ssh. Run them as follows:
1. Put ssh.conf1.ksh in the home directory of user oracle on all nodes.
2. Run ssh.conf1.ksh on node1.
3. Edit ssh.conf1.ksh on node2 as described in its comments, and then run it.
4. Run ssh.conf2.ksh on node1.
5. On all nodes, run the command: chmod 600 .ssh/authorized_keys
6. Test the configuration on all nodes. The command is: ssh node1 [node2] date
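
The test in step 6 can be wrapped in a small loop so that it fails instead of waiting at
a password prompt. This is a sketch, not part of the original scripts: check_ssh_nodes is
a made-up name, and SSH defaults to the OpenSSH client under /usr/local/bin used in this
guide. Note that a host whose key is not yet in known_hosts may also be reported as
failed, because ssh cannot ask for confirmation in batch mode.

```shell
# Sketch: confirm that every node can be reached without a password prompt.
SSH=${SSH:-/usr/local/bin/ssh}

check_ssh_nodes() {
    failed=0
    for node in "$@"; do
        # BatchMode=yes makes ssh give up instead of prompting for a password.
        if "$SSH" -o BatchMode=yes "$node" date >/dev/null 2>&1; then
            echo "ok:     $node"
        else
            echo "FAILED: $node"
            failed=1
        fi
    done
    return $failed
}

# Example, run as user oracle on each node:
#   check_ssh_nodes rac1 rac2
```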

ssh.conf1.ksh
------------------------------------------------------------
#!/bin/ksh
# Run this script as user oracle on node1, and then on node2.
# Make sure the package ssh is installed under /usr/local.
# Written by susbin@chinaunix.net       071906

# Put the hostname of the two nodes below
node1=rac1
node2=rac2

sc_name=ssh.conf1.ksh
home_dir=/u01/app/oracle
key_dir=${home_dir}/.ssh
ssh_base=/usr/local/bin

echo ================================================================
echo $sc_name started at `date`.
echo " "
echo "You need to run this script on $node1 and $node2."
echo "Make changes on this script before you run it on $node2."
echo " "

/bin/rm -r $key_dir
/bin/mkdir $key_dir
/bin/chmod 700 $key_dir
${ssh_base}/ssh-keygen -t rsa
echo " "
${ssh_base}/ssh-keygen -t dsa
/bin/touch ${key_dir}/authorized_keys
echo " "
echo "Now save the keys into the file authorized_keys."
echo " "
## comment out the lines when you run it on node2.
${ssh_base}/ssh $node1 cat ${key_dir}/id_rsa.pub >> ${key_dir}/authorized_keys
${ssh_base}/ssh $node1 cat ${key_dir}/id_dsa.pub >> ${key_dir}/authorized_keys
## end of the lines

## uncomment the lines below when you run it on node2.
#${ssh_base}/ssh $node2 cat ${key_dir}/id_rsa.pub >> ${key_dir}/authorized_keys
#${ssh_base}/ssh $node2 cat ${key_dir}/id_dsa.pub >> ${key_dir}/authorized_keys
#${ssh_base}/ssh $node1 cat ${key_dir}/id_rsa.pub >> ${key_dir}/authorized_keys
#${ssh_base}/ssh $node1 cat ${key_dir}/id_dsa.pub >> ${key_dir}/authorized_keys
#${ssh_base}/scp ${key_dir}/authorized_keys ${node1}:${key_dir}
## end of the lines
echo " "
echo "It is done."
echo " "
echo $sc_name ended at `date`.
echo ==============================================================


ssh.conf2.ksh
------------------------------------------------------------
#!/bin/ksh
# Run this script after you have run ssh.conf1.ksh on both nodes.
# Run this script as user oracle on node1 only.
# Written by susbin@chinaunix.net           071906

# Put the hostname of the two nodes below
node1=rac1
node2=rac2

sc_name=ssh.conf2.ksh
home_dir=/u01/app/oracle
key_dir=${home_dir}/.ssh
ssh_base=/usr/local/bin

echo ===========================================================
echo $sc_name started at `date`.
echo " "
echo "You only need to run this script on $node1."
echo " "
${ssh_base}/ssh $node2 cat ${key_dir}/id_rsa.pub >> ${key_dir}/authorized_keys
${ssh_base}/ssh $node2 cat ${key_dir}/id_dsa.pub >> ${key_dir}/authorized_keys
${ssh_base}/scp ${key_dir}/authorized_keys ${node2}:${key_dir}
echo " "
echo "You need to run the command '/bin/chmod 600 ${key_dir}/authorized_keys'"
echo "on all nodes and then test the ssh configuration with the command"
echo "  ssh node1 [node2] date"
echo " "
echo $sc_name ended at `date`.
echo ============================================================
echo " "
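echo "Starting ssh-agent in a sub-shell. Run ${ssh_base}/ssh-add there to load the keys."
exec ${ssh_base}/ssh-agent $SHELL

## "exec" replaces this script with ssh-agent, which spawns a sub-shell, so no
## command placed after the exec line would ever run. ssh-add therefore has to
## be run by hand; the rest of your login session runs within the sub-shell.
## end of ssh.conf2.ksh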

---------------------------------------------------------------
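
If ssh still asks for a password after the scripts have run, the modes of the key
directory and of authorized_keys are worth checking first. The sketch below compares the
permission string printed by "ls -ld" with the one you expect; check_mode is a made-up
helper name, not something the scripts above provide.

```shell
# Sketch: warn when the mode of a file or directory is not the expected one.
check_mode() {
    path=$1
    want=$2    # permission string as printed by ls, e.g. drwx------
    have=$(ls -ld "$path" 2>/dev/null | awk '{ print $1 }')
    case $have in
        "$want"*) echo "ok:      $path ($want)" ;;
        '')       echo "missing: $path"; return 1 ;;
        *)        echo "check:   $path is $have, expected $want"; return 1 ;;
    esac
}

# Example, matching the chmod commands used in this section:
#   check_mode /u01/app/oracle/.ssh drwx------
#   check_mode /u01/app/oracle/.ssh/authorized_keys -rw-------
```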

D.
Configuring clusterware and database storage (ASM installation)

After installing the host adapter (X1057A) on both nodes, run the "format" command on them
to make sure the shared disks have the same controller number on both nodes.

Format the disks on node1. For disks used by ASM, create a single slice that covers the
whole disk except cylinder 0 (that is, a slice starting at cylinder 1); otherwise ASM
will NOT recognize these disks as ASM candidates.

# format
...
selecting c3t0d2
[disk formatted]
format>
...
Free Hog partition[6]? 7
Enter size of partition '0' [0b, 0c, 0.00mb, 0.00gb]: 1c
Enter size of partition '1' [0b, 0c, 0.00mb, 0.00gb]: 0
Enter size of partition '3' [0b, 0c, 0.00mb, 0.00gb]: 0
Enter size of partition '4' [0b, 0c, 0.00mb, 0.00gb]: 0
Enter size of partition '5' [0b, 0c, 0.00mb, 0.00gb]: 0
Enter size of partition '6' [0b, 0c, 0.00mb, 0.00gb]: 0

partition> p
Current partition table (sun4g):
Total disk cylinders available: 3880 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0 unassigned    wm       0 -    0        1.05MB    (1/0/0)       2160
  1 unassigned    wu       0               0         (0/0/0)          0
  2     backup    wu       0 - 3879        4.00GB    (3880/0/0) 8380800
  3 unassigned    wu       0               0         (0/0/0)          0
  4 unassigned    wu       0               0         (0/0/0)          0
  5 unassigned    wu       0               0         (0/0/0)          0
  6 unassigned    wm       0               0         (0/0/0)          0
  7 unassigned    wu       1 - 3879        4.00GB    (3879/0/0) 8378640

partition>

Okay to make this the current partition table[yes]? yes
...

#

Copy the partition table from c3t0d2 to other disks.

# for disks in c3t1d2s0 c3t3d2s0
> do
> prtvtoc /dev/rdsk/c3t0d2s0 | fmthard -s - /dev/rdsk/$disks
> done
fmthard:  New volume table of contents now in place.
fmthard:  New volume table of contents now in place.
#
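
To double-check the copy, the partition tables can be dumped with prtvtoc and compared.
The sketch below works on saved prtvtoc listings and ignores the header lines, which
start with "*" and contain the device name. same_vtoc is a made-up helper name and the
/tmp file names in the example are arbitrary.

```shell
# Sketch: compare two saved prtvtoc listings, ignoring the "*" header lines.
same_vtoc() {
    a=$(grep -v '^\*' "$1")
    b=$(grep -v '^\*' "$2")
    [ "$a" = "$b" ]
}

# Example (as root):
#   prtvtoc /dev/rdsk/c3t0d2s0 > /tmp/c3t0d2.vtoc
#   prtvtoc /dev/rdsk/c3t1d2s0 > /tmp/c3t1d2.vtoc
#   same_vtoc /tmp/c3t0d2.vtoc /tmp/c3t1d2.vtoc && echo "partition tables match"
```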

Format the disks for the OCR and voting disks. It is a good idea to put them on slices 3-7.
Slice 0 is not a good candidate, because here it starts at cylinder 0, which holds the
disk label.

# format
...
selecting c3t0d4
[disk formatted]
...
partition> p
Current partition table (sun2g):
Total disk cylinders available: 2733 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0 unassigned    wm       0 - 2515        1.82GB    (2516/0/0) 3824320
  1 unassigned    wu       0               0         (0/0/0)          0
  2     backup    wu       0 - 2732        1.98GB    (2733/0/0) 4154160
  3 unassigned    wu       0               0         (0/0/0)          0
  4 unassigned    wu       0               0         (0/0/0)          0
  5 unassigned    wu       0               0         (0/0/0)          0
  6 unassigned    wm    2516 - 2688      128.40MB    (173/0/0)   262960
  7 unassigned    wu    2689 - 2732       32.66MB    (44/0/0)     66880

partition> q
...

#
# prtvtoc /dev/rdsk/c3t0d4s0 | fmthard -s - /dev/rdsk/c3t2d4s2
fmthard:  New volume table of contents now in place.
# prtvtoc /dev/rdsk/c3t0d4s0 | fmthard -s - /dev/rdsk/c3t4d4s2
fmthard:  New volume table of contents now in place.
#
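
The sizes shown by format can be checked against the requirements (100MB for the OCR,
20MB for the voting disk) from the block counts. A small sketch, assuming 512-byte
blocks, so that 1MB is 2048 blocks; blocks_to_mb is a made-up helper name:

```shell
# Sketch: convert a block count from the "Blocks" column to megabytes,
# assuming 512-byte blocks (1 MB = 2048 blocks).
blocks_to_mb() {
    echo "$1" | awk '{ printf "%.2f\n", $1 / 2048 }'
}

blocks_to_mb 262960    # slice 6, used for the OCR
blocks_to_mb 66880     # slice 7, used for the voting disk
```

The two calls print 128.40 and 32.66, which agrees with the Size column in the partition
table above.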

On all nodes, set the owner, group and permissions on the raw devices, which are the
slices for ASM, OCR and voting disks.

# cd /
# for rawdevs in c3t0d2s7 c3t1d2s7 c3t3d2s7 \
> c3t0d4s6 c3t2d4s6 c3t4d4s6 c3t0d4s7 c3t2d4s7 c3t4d4s7
> do
> echo $rawdevs; chown oracle:dba /dev/rdsk/$rawdevs; chmod 660 /dev/rdsk/$rawdevs
> ls -l `ls -l /dev/rdsk/$rawdevs | awk -F" " '{ print $11 }'`
> done
c3t0d2s7
crw-rw----   1 oracle   dba      118, 63 Jul 25 10:54 ../../devices/sbus@1f,0/SUNW,soc@0,0/SUNW,pln@a0000000,752ee9/ssd@1,2:h,raw
c3t1d2s7
crw-rw----   1 oracle   dba      118,127 Jul 21 12:55 ../../devices/sbus@1f,0/SUNW,soc@0,0/SUNW,pln@a0000000,752ee9/ssd@3,0:h,raw
c3t3d2s7
crw-rw----   1 oracle   dba      118,143 Jul 25 10:55 ../../devices/sbus@1f,0/SUNW,soc@0,0/SUNW,pln@a0000000,752ee9/ssd@3,2:h,raw
c3t0d4s6
crw-rw----   1 oracle   dba      118, 38 Aug  9 12:07 ../../devices/sbus@1f,0/SUNW,soc@0,0/SUNW,pln@a0000000,752ee9/ssd@0,4:g,raw
c3t2d4s6
crw-rw----   1 oracle   dba      118,118 Aug 16 10:21 ../../devices/sbus@1f,0/SUNW,soc@0,0/SUNW,pln@a0000000,752ee9/ssd@2,4:g,raw
c3t4d4s6
crw-rw----   1 oracle   dba      118,198 Jul 25 10:55 ../../devices/sbus@1f,0/SUNW,soc@0,0/SUNW,pln@a0000000,752ee9/ssd@4,4:g,raw
c3t0d4s7
crw-rw----   1 oracle   dba      118, 39 Aug 16 10:19 ../../devices/sbus@1f,0/SUNW,soc@0,0/SUNW,pln@a0000000,752ee9/ssd@0,4:h,raw
c3t2d4s7
crw-rw----   1 oracle   dba      118,119 Aug 16 10:19 ../../devices/sbus@1f,0/SUNW,soc@0,0/SUNW,pln@a0000000,752ee9/ssd@2,4:h,raw
c3t4d4s7
crw-rw----   1 oracle   dba      118,199 Aug 16 10:19 ../../devices/sbus@1f,0/SUNW,soc@0,0/SUNW,pln@a0000000,752ee9/ssd@4,4:h,raw
#

On node1, run CVU to check if all shared disks are available across all nodes.

$ cd /ora10.dvd2/clusterware/cluvfy
$ ./runcluvfy.sh comp ssa -n rac1,rac2 -s \
> /dev/rdsk/c3t0d2s7,/dev/rdsk/c3t1d2s7,/dev/rdsk/c3t3d2s7,\
> /dev/rdsk/c3t0d4s6,/dev/rdsk/c3t2d4s6,/dev/rdsk/c3t4d4s6,\
> /dev/rdsk/c3t0d4s7,/dev/rdsk/c3t2d4s7,/dev/rdsk/c3t4d4s7

Verifying shared storage accessibility

Checking shared storage accessibility...

"/dev/rdsk/c3t0d2s7" is shared.

"/dev/rdsk/c3t1d2s7" is shared.

"/dev/rdsk/c3t3d2s7" is shared.

"/dev/rdsk/c3t0d4s6" is shared.

"/dev/rdsk/c3t2d4s6" is shared.

"/dev/rdsk/c3t4d4s6" is shared.

"/dev/rdsk/c3t0d4s7" is shared.

"/dev/rdsk/c3t2d4s7" is shared.

"/dev/rdsk/c3t4d4s7" is shared.

Shared storage check was successful on nodes "rac2,rac1".

Verification of shared storage accessibility was successful.

$
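
With nine devices it is easy to overlook one in the output. If the CVU output is saved to
a file, a sketch like the following can confirm that every device was reported as shared.
It relies only on the message format shown above ("<device>" is shared.); all_shared is a
made-up helper name and /tmp/ssa.log is an arbitrary file name.

```shell
# Sketch: confirm that a saved CVU log reports every listed device as shared.
all_shared() {
    log=$1; shift
    rc=0
    for dev in "$@"; do
        if grep "\"$dev\" is shared" "$log" >/dev/null; then
            :    # reported as shared, nothing to do
        else
            echo "not reported as shared: $dev"
            rc=1
        fi
    done
    return $rc
}

# Example:
#   ./runcluvfy.sh comp ssa -n rac1,rac2 -s /dev/rdsk/c3t0d2s7 > /tmp/ssa.log
#   all_shared /tmp/ssa.log /dev/rdsk/c3t0d2s7 && echo "all devices are shared"
```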

Pre-installation tasks are done. The next step is to install Oracle clusterware.