Yesterday, it worked fine. Today it does not. It might have nothing to do with BigFix (e.g., fixcentral is down) - but how would I know?
And I like drinking coffee - but not Turkish coffee - so drinking coffee does not help with “reading the future and/or understanding the past and present”.
When it worked, the plugin log looked like this:
[root@hostname opt]# cat ./BESServer/DownloadPlugins/AIXProtocol/logs/AIXPlugin_2017-6-14_12-43-55.log
[Wed Jun 14 12:43:55 2017] AIX Download Plugin for Bigfix version 4.0.0
[Wed Jun 14 12:43:55 2017] Please make sure you have the latest version of this utility.
[Wed Jun 14 12:43:56 2017] Running plugin with DLoad::LWPUAIface
[Wed Jun 14 12:43:56 2017] Requesting FixPack information for 7100-04-04-1717.
[Wed Jun 14 12:43:56 2017] Setting base Technology Level as 7100-00.
[Wed Jun 14 12:44:41 2017] MetaData file: /var/opt/BESServer/wwwrootbes/bfmirror/downloads/ActiveDownloads/indexed_186_1 created!
[Wed Jun 14 12:44:41 2017] Download completed. Total runtime: 00:00:46
And today it fails with a shorter:
[root@hostname opt]# cat ./BESServer/DownloadPlugins/AIXProtocol/logs/AIXPlugin_2017-6-15_6-36-31.log
[Thu Jun 15 06:36:31 2017] AIX Download Plugin for Bigfix version 4.0.0
[Thu Jun 15 06:36:31 2017] Please make sure you have the latest version of this utility.
[Thu Jun 15 06:36:32 2017] Running plugin with DLoad::LWPUAIface
[Thu Jun 15 06:36:32 2017] Requesting FixPack information for 7100-04-04-1717.
[Thu Jun 15 06:36:32 2017] Setting base Technology Level as 7100-04.
[Thu Jun 15 06:36:44 2017] Download completed. Total runtime: 00:00:13
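The visible difference between the two runs is the “MetaData file: … created!” line, which only the working log contains. A minimal sketch of checking for it automatically follows; `plugin_log_status` is a hypothetical helper name, and the idea that a missing line means the plugin fetched nothing is an assumption drawn only from these two logs.

```shell
# Hypothetical helper: classify an AIXPlugin log by whether it contains
# the "MetaData file:" line seen only in the working run above.
# Assumption: a missing line means the plugin produced no metadata.
plugin_log_status() {
    if grep -q 'MetaData file: .* created!' "$1"; then
        echo "metadata-created"
    else
        echo "no-metadata"
    fi
}

# Usage: check every plugin log in the directory shown above.
# for f in ./BESServer/DownloadPlugins/AIXProtocol/logs/AIXPlugin_*.log; do
#     printf '%s %s\n' "$(plugin_log_status "$f")" "$f"
# done
```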
Now, I must confess that the sandbox I am testing/learning on ran out of space yesterday, and action #186 never completed. I moved files out of the filesystem to create space, manually removed “cache” files elsewhere to restore free space on the system (yesterday), and let it run for 12 hours before I stopped any actions, to give it a chance to recover and continue wherever it was.
I also made the BESGather value much smaller as part of the “cleanup” activity (it was 30G, but that space just isn’t there; I went down to 5G, then 4G, to see if I could trigger a cleanup of its own areas), but nothing seems to be working. Setting it back to 30G did not help either, so for now I am letting it stay at 4G until I understand which directory this actually affects.
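Since a full filesystem started all of this, a quick free-space check before re-running an action may save some guessing. This is only a sketch: `free_mb` and `check_free` are hypothetical helper names, and the threshold is whatever you consider safe.

```shell
# Print the free space, in whole MB, of the filesystem holding a directory.
# df -P gives POSIX single-line output; column 4 is available 1K blocks.
free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Hypothetical guard: warn when less than a chosen amount is free.
# The threshold passed in is illustrative only.
check_free() {
    dir=$1 min_mb=$2
    avail=$(free_mb "$dir")
    if [ "$avail" -lt "$min_mb" ]; then
        echo "LOW: ${avail} MB free under $dir (want ${min_mb} MB)"
        return 1
    fi
    echo "OK: ${avail} MB free under $dir"
}
```

For example, `check_free /var/opt/BESServer 5120` would report whether at least 5G is free on the filesystem holding the server directory.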
So, maybe something about “my server” is still broken - but I cannot find what.
Help is much appreciated!
p.s. One example of what I would like to be able to do is manually use curl or wget to try to download the file, or have a way to get the “Action” to output its actions, similar to ksh -x script.ksh
Another example would point me at the “BigFix internals” way, but those are still a bit vague atm.
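For the curl half of that wish, a rough sketch is below. The thread does not show which URL the AIX plugin actually requests, so the URL has to be supplied by hand; `probe_url` is a hypothetical helper name. It only reports the HTTP status and elapsed time and does not keep the body.

```shell
# Sketch only: the URL must be supplied by you; the thread does not show
# which URL the AIX plugin actually requests.
# Prints the HTTP status code and elapsed time without keeping the body.
probe_url() {
    curl -sS -o /dev/null --max-time 30 \
         -w '%{http_code} %{time_total}s\n' "$1"
}

# Shell scripts can be traced the same way as with ksh -x:
#   sh -x ./script.sh        # trace a whole script
#   set -x; ...; set +x      # trace only part of one
# For curl itself, -v prints the request and response headers.
```

A status other than 200, or a timeout, would at least separate “the remote side is unreachable from this server” from “something inside BigFix is stuck”. Whether an Action can be made to trace itself this way is not something this sketch answers.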
Would still like to improve my download skills - but I think, after reboot of the server - “things” are working again. The global log shows activity at least.
At 09:23:13 -0400 -
DownloadCRCPing command received
DownloadCRCPing command received
At 09:23:14 -0400 -
DownloadCRCPing command received
DownloadCRCPing command received
At 09:23:15 -0400 -
DownloadCRCPing command received
DownloadCRCPing command received
DownloadCRCPing command received
At 09:23:16 -0400 -
DownloadCRCPing command received
DownloadCRCPing command received
At 09:23:17 -0400 -
DownloadCRCPing command received
At 09:23:18 -0400 -
DownloadCRCPing command received
DownloadCRCPing command received
At 09:23:19 -0400 -
DownloadCRCPing command received
DownloadCRCPing command received
At 09:23:20 -0400 -
DownloadCRCPing command received
DownloadCRCPing command received
^Z
[1]+ Stopped tail -f /var/opt/BESClient/__BESData/__Global/Logs/$(date +"%Y%m%d").log
There is an awful lot of download activity going on for this to be showing “downloading”, but it is a step forward from nothing.
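To put a number on “an awful lot”, the pings can be counted per minute. This sketch assumes the log layout shown above (an “At HH:MM:SS -0400 -” header followed by the messages logged at that time); `pings_per_minute` is a hypothetical helper name.

```shell
# Count "DownloadCRCPing command received" lines per minute in a BESClient
# log. Assumes the layout shown above: an "At HH:MM:SS <offset> -" header
# followed by the messages logged at that time.
pings_per_minute() {
    awk '
        $1 == "At" && $2 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            minute = substr($2, 1, 5)
            next
        }
        /DownloadCRCPing command received/ { count[minute]++ }
        END { for (m in count) print m, count[m] }
    ' "$1" | sort
}

# Usage:
# pings_per_minute /var/opt/BESClient/__BESData/__Global/Logs/$(date +"%Y%m%d").log
```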
looking deeper
Here you can see my “overkill” that probably filled the file system in the first place:
[root@hostname opt]# cat ./BESServer/DownloadPlugins/AIXProtocol/logs/AIXPlugin_2017-6-15_9-23-51.log
[Thu Jun 15 09:23:51 2017] AIX Download Plugin for Bigfix version 4.0.0
[Thu Jun 15 09:23:51 2017] Please make sure you have the latest version of this utility.
[Thu Jun 15 09:23:51 2017] Running plugin with DLoad::LWPUAIface
[Thu Jun 15 09:23:51 2017] Requesting FixPack information for 7100-04-04-1717.
[Thu Jun 15 09:23:51 2017] Setting base Technology Level as 7100-04.
[Thu Jun 15 09:24:07 2017] Download completed. Total runtime: 00:00:17
[Thu Jun 15 09:24:07 2017] Requesting FixPack information for 6100-09-09-1717.
[Thu Jun 15 09:24:07 2017] Setting base Technology Level as 6100-00.
[Thu Jun 15 09:24:18 2017] Download completed. Total runtime: 00:00:28
[Thu Jun 15 09:24:18 2017] Requesting FixPack information for 7100-04-04-1717.
[Thu Jun 15 09:24:18 2017] Setting base Technology Level as 7100-00.
[Thu Jun 15 09:24:27 2017] Download completed. Total runtime: 00:00:37
But I thought I had “stopped” all these activities (except the last) well before I did the reboot.
So now I have deleted all the files that were downloaded yesterday (with dynamic_*), and that freed up 5G. Now what? Pause…
At 11:01:41 -0400 -
MFE: Turning FileIOError into MessageFileError in NotationFile::Write (FileIOError)
At 11:01:42 -0400 -
Error building or posting report: FileIOError
At 11:04:26 -0400 -
MFE: Turning FileIOError into MessageFileError in NotationF
Now looks like:
At 11:01:40 -0400 -
MFE: Turning FileIOError into MessageFileError in NotationFile::Write (FileIOError)
At 11:01:41 -0400 -
MFE: Turning FileIOError into MessageFileError in NotationFile::Write (FileIOError)
At 11:01:42 -0400 -
Error building or posting report: FileIOError
At 11:04:26 -0400 -
MFE: Turning FileIOError into MessageFileError in NotationFAt 11:28:01 -0400 -
Report posted successfully
At 11:28:52 -0400 -
DownloadCRCPing command received
At 11:28:53 -0400 -
DownloadCRCPing command received
At 11:28:54 -0400 -
DownloadCRCPing command received
At 11:28:55 -0400 -
DownloadCRCPing command received
DownloadCRCPing command received
At 11:28:57 -0400 -
DownloadCRCPing command received
At 11:29:05 -0400 -
DownloadCRCPing command received
At 11:29:14 -0400 -
DownloadCRCPing command received
DownloadCRCPing command received
At 11:29:15 -0400 -
DownloadCRCPing command received
So, related question: what directory, if any of these, is “monitored” by the setting _BESGather_Download_cacheLimitMB ?
I am guessing _BESRelay_HTTPServer_ServerRootPath is not part of that and/or the “actions” created by the download do not re-evaluate the Gather value (even after an error).
So, I tried waiting for “silence” on the BigFix server, then removed all files from …/downloads/sha1 and …/downloads/ActiveDownloads, as I could think of nowhere else to “clean up”.
And then, the BigFix server downloads the files again into …/downloads/sha1 and then goes quiet.
About all I can conclude is that:
a) I certainly made a user error
b) I am missing the “Howto activate self-healing” external site @me
So, now the question: is there any fix short of reinstalling the Linux server (after first saving the license and activationmast)?
Curious about what I have learned - besides a lot of filepaths I won’t forget for a while.
<record>
<date>2017-06-14T12:44:02.125-0400</date>
<level>SEVERE</level>
<class>com.ibm.ecc.connectivity.ServiceProviderUpdater</class>
<method>httpsDownload()</method>
<jvmid>a27b70d9278ae2ad:5267a2dc:15ca77d7a2d:-8000</jvmid>
<sequence>6</sequence>
<thread>1</thread>
<environment>ecc version: 1.1101. ecc build date: 02/08/2013 12:18 PM. Java version: IBM Corporation 1.8.0. OS version: Linux 3.10.0-327.36.1.el7.x86_64</environment>
<message>DOWNLOADED SERVICE PROVIDER FILE /var/opt/BESServer/DownloadPlugins/AIXProtocol/ecc/serviceProviderIBM.tmp</message>
</record>
This (above) is, I expect, an entry error - some destination not correct - as I was learning how to use the interface.
The next day (yesterday) I started getting different messages AFTER this one.
I hope this provides the hint someone smarter than I needs to help me figure out how to uninstall the AIXPlugin and
install it again, and/or just get it to refresh whatever it is unhappy about.
<record>
<date>2017-06-15T05:24:23.068-0400</date>
<level>SEVERE</level>
<class>com.ibm.ecc.connectivity.ConnectivityService</class>
<method>openPathImpl()</method>
<jvmid>f64db547fcb31035:-72e93fbc:15cab114291:-8000</jvmid>
<sequence>10</sequence>
<thread>1</thread>
<environment>ecc version: 1.1101. ecc build date: 02/08/2013 12:18 PM. Java version: IBM Corporation 1.8.0. OS version: Linux 3.10.0-327.36.1.el7.x86_64</environment>
<exception>
<error>Conn.DestinationNotFound: The caller specifies that the service destination must be a registered service and the URL is not found in the service provider file, or a service destination alias name is provided and the destination is not found in the service provider file.</error>
...
from: /var/opt/BESServer/DownloadPlugins/AIXProtocol/ecc/log/
"eccTrace0.0.log" line 8004 of 146799 --5%-- col 4
The sequence 10 messages just go on and on - until I pressed stop for the last “startup”.
<record>
<date>2017-06-15T19:31:43.339-0400</date>
<level>SEVERE</level>
<class>com.ibm.ecc.connectivity.ConnectivityService</class>
<method>openPathImpl()</method>
<jvmid>0c7feb2b90588e5f:384825cc:15cae1909c4:-8000</jvmid>
<sequence>10</sequence>
<thread>1</thread>
<environment>ecc version: 1.1101. ecc build date: 02/08/2013 12:18 PM. Java version: IBM Corporation 1.8.0. OS version: Linux 3.10.0-327.36.1.el7.x86_64</environment>
<exception>
<error>Conn.DestinationNotFound: The caller specifies that the service destination must be a registered service and the URL is not found in the service provider file, or a service destination alias name is provided and the destination is not found in the service provider file.</error>
<frame>
"eccTrace0.0.log" line 146733 of 146799 --99%-- col 4
Correction: one more entry - literally one - in this logfile:
<record>
<date>2017-06-15T06:26:29.550-0400</date>
<level>SEVERE</level>
<class>com.ibm.ecc.connectivity.ConnectivityService</class>
<method>openPathImpl()</method>
<jvmid>2d033c3161737781:-78c90ff3:15cab4a1a21:-8000</jvmid>
<sequence>10</sequence>
<thread>1</thread>
<environment>ecc version: 1.1101. ecc build date: 02/08/2013 12:18 PM. Java version: IBM Corporation 1.8.0. OS version: Linux 3.10.0-327.36.1.el7.x86_64</environment>
<exception>
<error>Conn.DestinationNotFound: The caller specifies that the service destination must be a registered service and the URL is not found in the service provider file, or a service destination alias name is provided and the destination is not found in the service provider file.</error>
<frame>
<class>com.ibm.ecc.connectivity.ConnectivityService</class>
"eccTrace1.0.log" line 999 of 1059 --94%-- col 8
But I do not understand the logic in the timestamps of eccTrace1 and eccTrace0.
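One way to see which trace file covers which time span is to pull out every `<date>` element and print the earliest and latest per file. A sketch, with `ecc_date_range` as a hypothetical helper name; it assumes each `<date>` element sits on one line, as in the records above, and that all timestamps carry the same UTC offset so they sort correctly as plain strings.

```shell
# For each eccTrace file, print the file name, the number of <date>
# elements, and the earliest and latest timestamp found in it.
# Assumption: timestamps share one UTC offset, so string sort is enough.
ecc_date_range() {
    for f in "$@"; do
        sed -n 's|.*<date>\(.*\)</date>.*|\1|p' "$f" | sort |
        awk -v name="$f" '
            NR == 1 { first = $0 }
            { last = $0 }
            END { if (NR) print name, NR, first, last
                  else    print name, 0 }
        '
    done
}

# Usage:
# ecc_date_range /var/opt/BESServer/DownloadPlugins/AIXProtocol/ecc/log/eccTrace*.log
```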
p.s. - suggestions are REALLY appreciated!
Last logfile:
-rw-r--r--. 1 root root 445 Jun 15 19:31 AIXPlugin_2017-6-15_19-31-34.log
[root@hostname logs]# cat AIXPlugin_2017-6-15_19-31-34.log
[Thu Jun 15 19:31:34 2017] AIX Download Plugin for Bigfix version 4.0.0
[Thu Jun 15 19:31:34 2017] Please make sure you have the latest version of this utility.
[Thu Jun 15 19:31:34 2017] Running plugin with DLoad::LWPUAIface
[Thu Jun 15 19:31:34 2017] Requesting FixPack information for 7100-04-03-1643.
[Thu Jun 15 19:31:34 2017] Setting base Technology Level as 7100-00.
[Thu Jun 15 19:31:43 2017] Download completed. Total runtime: 00:00:09
So, I hope you can make better sense of the ecc timestamps.
I have been “targeting” the NFS repo, and the result has been the same. So, just in case I had “just forgot”, I tried the same action, but now the target is the BigFix Server (and somehow the action is supposed to know how to get this done on the NFS repo).
The good news:
a) the AIX target is correct, so that part of the process has been correct
b) the action sees it is not relevant - and mentions that!
Progress, it seems: maybe it is finally timing out (but from where?)
[root@hostname downloads]# du -h
4.0G ./sha1
0 ./ActiveDownloads
4.0G .
[root@hostname downloads]# ls sha1 | wc
748 748 30668
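The 4.0G in sha1 happens to match the 4G the cache limit was last set to, which hints at, but does not prove, a connection. A sketch for watching that over time is below; `cache_report` is a hypothetical helper name, and whether sha1 is the directory that _BESGather_Download_cacheLimitMB governs remains an assumption the thread leaves open.

```shell
# Report the size in MB and the number of files of a directory, and compare
# the size with a limit in MB. Whether the sha1 directory is what
# _BESGather_Download_cacheLimitMB governs is an assumption; the thread
# leaves that question open.
cache_report() {
    dir=$1 limit_mb=$2
    used_mb=$(du -sk "$dir" | awk '{ print int($1 / 1024) }')
    files=$(find "$dir" -type f | wc -l | tr -d ' ')
    state=within
    [ "$used_mb" -gt "$limit_mb" ] && state=over
    echo "$dir: ${used_mb} MB in ${files} files, ${state} limit of ${limit_mb} MB"
}

# Usage, with the limit set to 4G as described earlier:
# cache_report /var/opt/BESServer/wwwrootbes/bfmirror/downloads/sha1 4096
```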
Nothing new on the nfsrepo host, however.
Better than the alternative (which may still happen), which would be to uninstall and reinstall BigFix. Sort of like re-installing Windows because you cannot find “the cause or error”, self-inflicted or not.
I’m sorry but I don’t have the answer for your technical issue. But if I could offer some suggestions for “BigFix netiquette” - I would say that creating a post then replying to yourself with 6 more lengthy posts in the same day is a bit overwhelming.
Most of us do not actually work for IBM, so we try to help out when we can, but we have our day jobs to get back to. So helping someone out in 10, 30, even 60 minutes is manageable, but your issue seems larger than that (maybe it isn’t, but the number of lengthy posts didn’t help).
Also, I don’t think the number of people on the forums using AIX is that large. So that may also be causing some people who would normally help out to not answer due to lack of experience with AIX.
Last of all - based on the possible complexity of your issue, I suspect most (like me) are assuming someone from IBM is going to step in and suggest that you file a PMR.
@aixtools as @Sean pointed out, deep technical problems like this usually do better with a PMR, as logs and other possibly sensitive information need to be transferred.
The forum could give simpler answers about situations like this if a customer has had a specific experience that relates, so opening the initial post with a “This is my problem” kind of statement could have helped. I am not totally familiar with how AIX patches use an NFS repo (which is why this is in Patch), and I’m still not sure what the base problem is, whether it’s a disk space issue on the server or something else.
Please do file a PMR so the patch team can help you directly.
Conn.DestinationNotFound: The caller specifies that the service destination must be a registered service and the URL is not found in the service provider file, or a service destination alias name is provided and the destination is not found in the service provider file.
I have the same message in one environment. It seems that the network security of your environment (maybe a proxy) is blocking that file from being downloaded because it’s not in the “white list” of service providers. It seems IBM must provide a “service provider file” so that the network security team can add it to the “white list” of service providers. I am not 100% sure, but I have read 1 or 2 PMRs and that seems to be the case; however, I have had no time to open my own PMR and start digging into this with L2 or L3 support of BigFix.
If you got the answer/fix, would you kindly share it? It would be awesome to have, since I need it too!