Wednesday, August 12, 2015

The VMware Compatibility Guide is your friend!

Fairly simple one today folks!

I was trying to install ESXi 5.1 U1 on a Cisco C240 M4 and kept getting a really pretty purple screen saying: NOT_IMPLEMENTED bora/vmkernel/hardware/apic.c:1048

After a frustrating call to VMware support, I threw my hands up and went to the web to fix it myself (I Googled it of course). 


What pointed me in the right direction came from this link.

Steve_Delassus pointed at a KB article and then said something simple...."Check the compatibility matrix and see if your server is in it". 



Well, heck....there it was....plain as day....




I kicked myself a couple of times, then downloaded 5.1 U3 and it installed with no issues....


So what have I learned Boys and Girls? The VMware Compatibility Guide is your friend! 

Thursday, August 6, 2015

SRM 6 NIGHTMARE!!!

***************************
Update 2

This case is now closed. Only advice that I can give at this point is be VERY careful when you are updating the Certificates in the MOB. If you do it wrong, you will be rebuilding your environment.

The notes below do work for fixing the MOB; just make sure you know exactly what your syntax is supposed to be and which Certificates you are replacing...


Back to the drawing board for me.....


*****SIGH*****


$#@%^&*&%$##^%^&!!!


***************************
UPDATE!!!

After going through all of this again (time has finally permitted me to get back to this), I have finally got MOST of this working. I say most because I reset my PSC and vCenter Certs to the same thing and now I have to call support to see if I can change this! *Yes, I am an idiot* I will update again as I get this last part figured out.


VMware has updated KB2121701 so many times over the last couple of months that they must really be sick of that page.

The ROOT of the problem is the certificates that are associated with the PSC and the vCenter servers. If they get changed, for some reason VMware does not change them properly in the MOB *or wherever the info is stored* and it then falls on you, the Admin, to be clever enough to know this is the issue......


If you follow the instructions (very carefully, I might add), the KB walks you through how to view the certificate that the MOB currently has listed for your PSCs and your vCenters, how to download a copy of those Certs to get a Thumbprint, how to download your current Certificates, and finally how to use the ls_update_certs.py script *you have to install a new copy of it from the KB article* to modify what is in the MOB pages. Below is the example from the article of the script you will run. I want to point out that if you have multiple PSCs and vCenters, you will need to do this for ALL of them! You also have to run this from the PSC server.



"%VMWARE_PYTHON_BIN%" ls_update_certs.py --url https://psc.vmware.com/lookupservice/sdk --fingerprint 13:1E:60:93:E4:E6:59:31:55:EB:74:51:67:2A:99:F8:3F:04:83:88 --certfile c:\certificates\new_machine.crt --user Administrator@vsphere.local --password Password


You would need to do the above for:

1.) Your Production PSC *get the thumbprint for the old cert and download the new cert to a central location*
2.) Your Production vCenter *get the thumbprint for the old cert and download the new cert to a central location*
3.) Your DR PSC *get the thumbprint for the old cert and download the new cert to a central location*
4.) Your DR vCenter *get the thumbprint for the old cert and download the new cert to a central location*
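
To keep the four runs straight, here is a rough Python sketch (mine, not from the KB) of the two fiddly parts: working out a SHA-1 thumbprint in the colon-separated form shown above, and lining up one command per server. The host names, file paths, and helper names are hypothetical, the flags are only the ones that appear in the example above, and I am assuming each site's registrations are updated through that site's own PSC lookup service URL. It only prints the commands; it does not run anything.

```python
import hashlib
import ssl


def thumbprint_from_der(der_bytes):
    """Return the SHA-1 thumbprint of DER bytes as colon-separated uppercase hex."""
    digest = hashlib.sha1(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def thumbprint_from_pem(pem_text):
    """Decode a single PEM certificate and return its SHA-1 thumbprint."""
    return thumbprint_from_der(ssl.PEM_cert_to_DER_cert(pem_text.strip()))


def build_update_command(psc_host, old_thumbprint, new_cert_path, user, password):
    """Assemble the ls_update_certs.py arguments seen in the KB example.

    Hypothetical helper: it only builds the argument list, nothing is executed.
    """
    return [
        "ls_update_certs.py",
        "--url", f"https://{psc_host}/lookupservice/sdk",
        "--fingerprint", old_thumbprint,
        "--certfile", new_cert_path,
        "--user", user,
        "--password", password,
    ]


if __name__ == "__main__":
    # Hypothetical names and placeholders; substitute your own values.
    # Each entry: (label, PSC to talk to, OLD cert thumbprint, NEW cert file).
    servers = [
        ("Production PSC", "prodpsc.example.com", "<old prod PSC thumbprint>", r"c:\certificates\prod_psc.crt"),
        ("Production vCenter", "prodpsc.example.com", "<old prod vCenter thumbprint>", r"c:\certificates\prod_vc.crt"),
        ("DR PSC", "drpsc.example.com", "<old DR PSC thumbprint>", r"c:\certificates\dr_psc.crt"),
        ("DR vCenter", "drpsc.example.com", "<old DR vCenter thumbprint>", r"c:\certificates\dr_vc.crt"),
    ]
    for label, psc_host, old_tp, new_cert in servers:
        cmd = build_update_command(psc_host, old_tp, new_cert,
                                   "Administrator@vsphere.local", "<password>")
        print(f"# {label}")
        print(" ".join(cmd))
```

The thumbprint you feed to --fingerprint is the one for the OLD cert, so run thumbprint_from_pem on the copy you saved from the MOB, not on the new certificate. Prefix the printed command with the Python interpreter the same way the KB example does.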

I don't know how to make KB 2121701 easier to read but there has to be a way....it is a wealth of knowledge but....it is not easy to obtain that knowledge! 


****************************


I am trying to love VMware vSphere 6 and Site Recovery Manager 6 (SRM). I am trying to show my confidence in VMware. It's not working though.....and I know, I broke the cardinal rule of IT “never adopt early.”

VMware has been my favorite technology for a long time! I drank the Kool-Aid and in my mind there is not another company that is doing the kinds of things that they are. Let's face it though....nobody is perfect.

I have now had a case open with them since June 8th about Site Recovery Manager 6 and vCenter 6, about 2 months. I have talked to some great techs there at VMware, but I am beginning to sense that there is a lot of confusion among their ranks about the new products. I have had techs tell me that I had to have the same certificate for both the protected and recovery site in order for things to work, and yet their install and configure manual clearly says otherwise. I have had technicians who did not know what the VMCA is or what it does, going as far as to tell me that I needed to do individual certs for each of my vCenter servers, Platform Services Controller (PSC) servers and my ESXi servers. I still have not gotten a good answer as to whether SRM and the VMCA work together or if they will sometime in the future. Heck, the first month of my case was spent calling and begging their support team to call me back; it wasn’t until my VP called and started screaming that I started getting any serious traction on the case.

The frustrating part? I have done a bog standard install of SRM. I have set up my environment with VMware’s best practices. I have even gone so far as to ask the technicians to verify the install.

The PSCs are External. The vCenter Servers and the SRM servers are stand-alone VM servers. I made my VMCAs into subordinate Certificate Authorities to my in-house Certificate Authority so that all of my clients would trust the sites and we would not have issues.




It is exactly as VMware shows it in a standard Two-Site Topology with one vCenter Server instance per Platform Services Controller (PSC).

 














My issue?? Here goes: when I go to Site Recovery > Sites from my production server, I immediately get the below message:





Error: Failed to connect to Lookup Service at HTTPS://DRPSCSERVER.DOMAIN.COM:443/lookupservice/sdk.
Reason:
com.vmware.vim.vmomi.core.exception.CertificateValidationException: Server certificate chain not verified.

Simple, right? The certificates on my vCenter must not trust that PSC chain, right? One of the servers must not have the chain or the certificate for the DR site….but they do. VMware has verified they do. I can go to the DR PSC server from my Production vCenter Server and it shows the site as trusted…
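
For anyone who wants to check the chain without relying on what a browser says, here is a minimal Python sketch (mine, not something VMware gave me) that attempts a TLS handshake against a host and reports whether the chain verifies against a given CA bundle. The function name, host name, and bundle path are hypothetical, and it assumes Python 3.7 or later. Keep in mind that it tests trust from whatever machine you run it on, using the CA file you hand it, which is not necessarily the trust store SRM itself consults.

```python
import socket
import ssl


def check_chain(host, port=443, ca_file=None, timeout=5.0):
    """Attempt a TLS handshake and report whether the server chain verifies.

    Returns a (status, detail) tuple where status is one of:
      "trusted"   - handshake succeeded, chain and host name verified
      "untrusted" - certificate verification failed (chain or host name)
      "error"     - connection or other TLS/socket failure
    """
    # With ca_file=None the system default trust store is used;
    # otherwise only the CAs in the given PEM bundle are trusted.
    context = ssl.create_default_context(cafile=ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return ("trusted", tls.version())
    except ssl.SSLCertVerificationError as err:
        # Must be caught before OSError, since it is a subclass of it.
        return ("untrusted", err.verify_message)
    except OSError as err:
        return ("error", str(err))
```

Something like check_chain("drpsc.example.com", ca_file="inhouse_root_chain.pem"), run from the production vCenter and again from the production SRM server, would at least show whether each box can build the chain on its own.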

VMware has combed the logs, and “We ain’t found….”












Now, if I try the same exact thing from the DR side, what happens, you ask? Same exact thing, but the error message says that the certificate chain is not valid for the Production PSC server. Which is really weird….because I can see both vCenter servers on both the Production and DR sites. Oh, and once again I can go to the Production PSC from the vCenter server and it shows the site as trusted as well.

Ahh….so it must be the PSCs don’t trust each other…..NOPE. I can go to each of the PSCs and they both trust the other.

Well so that leaves the SRM servers right? One of them must be the culprit. Well, as before …the vCenter servers all look trusted, and so do the PSC servers. The certificates that the SRM servers have are actually from the parent CA. So they are trusted all the way through….

I am bumfuzzled….

If anyone has any advice on this PLEASE speak up! Once I get a solution I promise I will append it to this entry….