Saturday 26 September 2015

E-Business Suite 12.2 adop issues on AIX 6.1 on NFS mount

We did a fresh 12.2 installation and applied the 12.2.4 patch.
After applying 12.2.4, and before "Section 8: Apply Additional Critical Patches", we ran adop phase=prepare.

It failed with this message:
=============================
Could not find the pattern...
=============================

Executing SYSTEM command: perl /opt02/app/oracle/bisapp/fs1/EBSapps/comn/adopclone_server1/bin/adclone.pl java=/opt02/app/oracle/bisapp/fs1/EBSapps/comn/adopclone_server1/FMW/t2pjdk mode=fscloneapply stage=/opt02/app/oracle/bisapp/fs1/EBSapps/comn/adopclone_server1 component=ohsConfig appctx=/opt02/app/oracle/bisapp/fs1/inst/apps/qa_server1/appl/admin/qa_server1.xml appctxtg=/opt02/app/oracle/bisapp/fs2/inst/apps/RBISQA_ios1901e/appl/admin/qa_server1.xml
EXIT STATUS: 1

======================================
Inside copyCloneLogsToFSNE()...
======================================

Creating the directory: /opt02/app/oracle/bisapp/fs_ne/EBSapps/log/adop/5/prepare_20150226_065021/qa_server1/TXK_SYNC_migrate_Thu_Feb_26_07_23_03_2015/ohsConfig_apply

Copying the directory
---------------------
SOURCE : /opt02/app/oracle/bisapp/fs1/inst/apps/qa_server1/admin/log/clone
TARGET : /opt02/app/oracle/bisapp/fs_ne/EBSapps/log/adop/5/prepare_20150226_065021/qa_server1/TXK_SYNC_migrate_Thu_Feb_26_07_23_03_2015/ohsConfig_apply

/opt02/app/oracle/bisapp/fs1/EBSapps/comn/adopclone_server1/bin/adclone.pl did not go through successfully.
LOG DIRECTORY: /opt02/app/oracle/bisapp/fs_ne/EBSapps/log/adop/5/prepare_20150226_065021/qa_server1/TXK_SYNC_migrate_Thu_Feb_26_07_23_03_2015/ohsConfig_apply.
*******FATAL ERROR*******
PROGRAM : (/opt02/app/oracle/bisapp/fs1/EBSapps/appl/ad/12.0.0/patch/115/bin/txkADOPPreparePhaseSynchronize.pl)
TIME : Thu Feb 26 08:02:04 2015
FUNCTION: main::migrateCloneComponentApply [ Level 1 ]


========================================================
The cause: during the prepare phase, adop creates several temporary files and deletes them once its jobs are done. On the NFS mount, hidden .nfs* files are created for these files, and when the adop process tries to remove them, it cannot. Later, when adop recreates the directories, the old files and directories are still there, so the creation fails and adop exits.

We used several workarounds to avoid this issue. They worked most of the time, but not 100% of the time.
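
To see whether such leftovers are present before retrying, a small helper like the one below can list them. This is only a sketch: the function name list_nfs_leftovers is hypothetical, and it assumes the leftovers follow the usual .nfs* naming. The directory to scan would be the t2pjdk directory from the log above.

```shell
# List leftover NFS placeholder files (.nfs*) under a directory.
# Usage: list_nfs_leftovers /path/to/dir
# (hypothetical helper, not part of adop)
list_nfs_leftovers() {
    dir="$1"
    # Refuse anything that is not an existing directory.
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 2; }
    # Print every file or directory whose name starts with ".nfs".
    find "$dir" -name '.nfs*' -print
}
```

For example: list_nfs_leftovers /opt02/app/oracle/bisapp/fs1/EBSapps/comn/adopclone_server1/FMW/t2pjdk
If it prints anything, a delete of that directory tree is likely to fail in the way described above.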


Workaround(s) used:

1) Continuous monitoring, copying bin and jre to /FMW/t2pjdk when required
2) When it fails, delete or move t2pjdk and kick off the adop cycle again
3) Updated txkADOPPreparePhaseSynchronize.pl with the value below

$java_loc = $stage_loc . TXK::OSD->trDirPathToBase('/FMW/t2pjdk/jdk64');
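
Workaround 2 (moving t2pjdk out of the way) can be scripted roughly as follows. This is a sketch, not what adop itself does: the function name move_aside and the ".old.<timestamp>" suffix are our own choices for illustration, and it assumes a rename within the same file system succeeds even when a delete does not.

```shell
# Move a directory aside under a timestamped name instead of deleting it.
# Prints the new name on success. (hypothetical helper)
move_aside() {
    target="$1"
    # Nothing to do if the directory does not exist.
    [ -d "$target" ] || return 0
    backup="${target}.old.$(date +%Y%m%d_%H%M%S)"
    mv "$target" "$backup" && echo "$backup"
}
```

For example: move_aside /opt02/app/oracle/bisapp/fs1/EBSapps/comn/adopclone_server1/FMW/t2pjdk
The renamed copy can be removed later, once nothing holds its files any more.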

Our final findings:
- .nfs files are created by the NFS server; when the rm command runs and the files are loaded as shared libraries, the adop process cannot delete them.
- For 32-bit programs this is already handled by the following environment variable setting (a prerequisite for the 12.2 installation):
LDR_CNTRL=PRIVSEG_LOADS@MAXDATA=0xB0000000@DSA
- However, the LDR_CNTRL environment variable works only for 32-bit programs and will not work for 64-bit programs. A fix for 64-bit has to be provided by IBM.

Unfortunately, the NFS team states: "There is no such NFS feature to unload shared library at user level".
So what is the solution?

There is no solution yet, but there is a workaround:
slibclean

We run slibclean constantly during the prepare phase, i.e. every 5 seconds, via a cron job.
Script used:
==
# Run slibclean every 5 seconds to unload unused shared libraries.
while true; do
    sudo -u root /etc/slibclean
    sleep 5
done
==
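
The loop above runs until it is killed by hand. A variant that stops by itself when the adop session ends could look like the sketch below. The function name run_while_alive is hypothetical, and it assumes the loop runs as the same user as adop (or as root), since kill -0 only tests whether the process can be signalled and sends nothing.

```shell
# Run a command repeatedly for as long as a given process is alive.
# Usage: run_while_alive PID INTERVAL_SECONDS COMMAND [ARGS...]
# (hypothetical helper)
run_while_alive() {
    pid="$1"; interval="$2"; shift 2
    # kill -0 sends no signal; it only checks that the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        "$@"
        sleep "$interval"
    done
}
```

For example, with the PID of the running adop session in a variable ADOP_PID (an assumed name): run_while_alive "$ADOP_PID" 5 sudo -u root /etc/slibclean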

This resolved the issue for now; we no longer see it with the workaround in place.
Oracle and IBM are not yet ready to provide a solution for this.




We also had multiple other issues with adop on AIX (e.g. during fs_clone).
Note: These issues are specific to AIX on an NFS mount (GPFS file system).






   