Full Port Log Rhel Six Sixty Four Bit

Full ports are unique explorations into the unknown--have a bottle of
Absinthe and a low moral standard.

This is a record of the work necessary to perform a full port of HTCondor
to RHEL 6 x86_64. I started with the full port from RHEL 5 x86_64.

---------------------------------
Section 0: Perform a clipped port
---------------------------------

Do whatever is necessary to make this happen.

----------------------------------------------
Section 1: Preparation of a new glibc external
----------------------------------------------

*******************************
Locate and see how glibc builds
*******************************

We must locate the glibc revision that comes standard with the
platform.  This is different than locating the "official" glibc
release since which is distro independent. It turns out each linux
distro patches glibc according to its wishes and we need to preserve
those patches. Historically, the is enough difference between the x86
and x86_64 patches or builds that the same external CANNOT be used for
both. It is very time intensive to discover if a vendor patched glibc
source tree can be built/run properly on the another architecture.

1. Find the glibc source rpm associated with rhel 6 x86_64.

	yumdownloader --source glibc

	"Install" it as a regular user, not root:

	rpm -ivh glibc-2.12-1.25.el6.src.rpm

2. Prepare it so the patches are produced saving the output:

	cd /home/psilord/rpmbuild/SPECS
	rpmbuild -bp glibc.spec

3. Save the resulting tarball:

	cd /home/psilord/rpmbuild/BUILD
	cp -Rp glibc-2.12-2-gc4ccff1 glibc-2.12-2-x86_64
	tar czf glibc-2.12-2-x86_64.tar.gz glibc-2.12-2-x86_64

4. Now, build the glibc source package in order to find out how it is being
   configured. The purpose of this is we have to compile glibc the same way
   in our externals. Or at least as close as we can get it. We intend to
   discover the configure line which builds glibc itself.

	cd /home/psilord/rpmbuild/SPECS
	rpmbuild -bc glibc.spec |& tee build.out

*************************
Create the glibc external
*************************

0. Figure out how cmake detects the GLIBC_VERSION variable, make sure it is
	right.
1. Edit CMakeLists.txt to add a new case for the detected GLIBC_VERSION.
	Ensure no conflicts with previous version numbers happen. Fill it
	in similarly to the last recent one, taking into consideration
	the new configure flags determined by inspection of build.out
	from the previous step. You may need to comment out the GLIBC_PATCH step
	for now.
2. Create a directory in the externals based upon the name of the GLIBC_VERSION.
	Your patches, if any, will go here.
3. Place the required tarball into /p/condor/repository/externals, ensure it
	has permissions of 644.
4. Test build it with 'make glibc' after the build has been configured.

****************************************************************************
Patch glibc so it honors --enable-static-nss & other features specific to us
****************************************************************************

In the RHEL 5 port of HTCondor, we had patched glibc to make --enable-static-nss
function and also turned off the ncsd support. This is because we needed a
truly static binary, and the ncsd socket was kept permanently open which
deferred check pointing permanently.

So, we need to determine how much of this is still true and still needs
patching in the new glibc.

1. It turns out the patchfile to enable static nss from glibc external 2.7-18
	works for the rhel6 glibc.

2. Turns out glibc 2.12 apparently fixed the nscd bug so I don't need the
	patch.

3. TODO Add a patch which removes the warning about using nss function in
	statically linked executables. It sucks and just causes rust.
	[I never did this after I finished the port, alas.]

**************************
Check glibc external build
**************************

Ok, now we have to inspect the produced libraries and ensure that we are
shipping everything we should be shipping. This include libc.a and a pile
of resolver libraries like libnss_files.a, libnss_dns.a, and libresolv.a.

Since glibc is an evolving library, these names may change or there might
be additional libraries to ship along with it. So some of the above steps
may be redone upon future knowledge discovery.

***************************************************************
Start a build of HTCondor and see what breaks. Fix incrementally.
***************************************************************

Ok, lots of stuff broke. The gist of what broke (prototypes being
different, etc) looks to be because wherever we had GLIBC27 as a
preprocessor check we likely also need GLIBC212. I need to check to see
if the GLIBC27 behavior still applies to GLIBC212.

1. GLIBC212 addition steps:
	Every place I found a reference to GLIBC27, I checked to see if
	it was valid on GLIBC212 and added GLIBC212 if it was.

Oh, it franks. That's quicker than I thought. Meh, onward ho.

**********************************
Fixing the Test Suite so it builds
**********************************

Problem 1:
	condor_compile/ld let through a pile of new flags that need dealing with.
	Such as --as-needed, --no-as-needed, -ldl, and -dynamic-linker <arg>.

	Also, -lm and -lcrypt can't be found in a static linking
	context. This one might be a distro problem. I can find it via
	other means although it is a hack to do so...

	NOTE: Hrm, it seem "yum install glibc-static" will provide them
	for me, this needs to be installed on any machine wishing to
	condor_compile and produce static binaries.

	After some hacking only in our ld, all of the tests compile. I
	was surprised by this since at this time there were no multiply
	defined symbol errors or missing symbol errors, which is often
	something that can go wrong at this stage of the port.

	--as-needed and friend, -ldl, -dynamic-linker <arg> all were gotten rid
	of since they deal with dynamic linking, which we aren't doing.

*********************************************
Running the test suite and seeing what breaks
*********************************************

./batch_test -b -c -d .

Problem 1:
	All the standard universe jobs go on hold with a version mismatch.
	That's curious. So, I've updated the logging information in the shadow
	log to be more explicit about the version when they fail so I can
	determine what's wrong.

	I added debugging info into the log line to be more verbose.

	SOLUTION: It seems the test suite was brain dead and was mixing
	my newly created binaries with binaries from the local install on
	the batlab machine which are decidedly not a native full port.
	I set my path correctly and the problem went away. Oh well,
	I'll still leave the debugging in there and maybe even make it
	more precise for the message that ends up in the jobad. Can't
	hurt to have more correct information in there...


Problem 2:
	The test suite uses whatever HTCondor binaries it finds in its path
	instead of the ones you just built. The solution is obviously to
	put the ones you just built into the path. However, the hold reason
	from HTCondor about the version mismatch between the shadow and the
	stduniv job was poor, so I fixed it up to contain the version of the
	shadow, the job, and the full path to the shadow. Now, when a clueful
	user sees this, they can get a good idea of what went wrong and why.


********************************
Test Suite Passes. Code complete
********************************

Commit and push.

Done.