Hello,
I’m trying to create a fixlet that looks for zombie processes and clears them by ending their parent process. The issue I’m facing is described below.
The fixlet runs fine and does all the intended work, but it gets stuck on the wait command and never moves forward. I left it for 2 days, and the machine stopped reporting because of this.
I’m pasting the code below; any help would be greatly appreciated.
//CODE STARTS
action parameter query "ErrorFolder" with description "Please enter the name of folder and path" with default value ""
if { not exists folder (parameter "ErrorFolder")}
folder create {parameter "ErrorFolder"}
endif
delete __createfile
createfile until EOF
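#!/bin/bash
# Assumption: the blank-line echoes below are meant to land in output.txt
LOG_FILE="{parameter "ErrorFolder"}/output.txt"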
log_message() {
echo "$(date +"%Y-%m-%d %H:%M:%S") - INFO - $1" >> {parameter "ErrorFolder"}/output.txt
}
log_error() {
echo "$(date +"%Y-%m-%d %H:%M:%S") - ERROR - $1" >> {parameter "ErrorFolder"}/error.txt
}
# Get all zombie process IDs
ZOMBIES=($(ps -e -o pid,stat | grep 'Z' | awk '{{print $1}'))
if [ ${{#ZOMBIES[@]} -gt 0 ]; then
log_message "Zombie processes found: ${{ZOMBIES[*]}"
echo "" >> $LOG_FILE
for PID in "${{ZOMBIES[@]}"; do
# Find the parent process ID (PPID)
PARENT_PID=$(ps -p $PID -o ppid= | tr -d ' ')
if [ -n "$PARENT_PID" ]; then
log_message "Attempting to notify parent process $PARENT_PID to clean up zombie $PID"
echo "" >> $LOG_FILE
# Send SIGCHLD signal to the parent process to notify it of child status change
#kill -SIGCHLD $PARENT_PID > /dev/null 2>&1 || true
#pkill -9 -P $PARENT_PID > /dev/null 2>&1
# Sleep for 4 seconds to allow the parent process to handle the zombie
#sleep 4
# Check if the zombie process still exists
if ps -p $PID > /dev/null 2>&1; then
log_message "Zombie process $PID not cleaned up by parent process $PARENT_PID, forcefully terminating it"
echo "" >> $LOG_FILE
# Forcefully terminate the parent so that init can reap the zombie.
# Test kill directly: with "|| true" appended, $? is always 0.
if kill -9 "$PARENT_PID" > /dev/null 2>&1; then
log_message "Successfully terminated parent process $PARENT_PID of zombie $PID."
echo "" >> $LOG_FILE
else
log_error "Failed to terminate parent process $PARENT_PID of zombie $PID."
fi
else
log_message "Zombie process $PID cleaned up by parent process $PARENT_PID."
echo "" >> $LOG_FILE
fi
else
log_error "Unable to find parent process ID for zombie $PID."
echo "" >> $LOG_FILE
fi
done
echo "" >> $LOG_FILE
log_message "Going into sleep state for 60 seconds before rechecking for zombie processes..."
#sleep 60
# Check again for any remaining zombie processes
ZOMBIES=($(ps -e -o pid,stat | grep 'Z' | awk '{{print $1}'))
if [ ${{#ZOMBIES[@]} -gt 0 ]; then
log_message "Additional zombie processes found after 60 seconds: ${{ZOMBIES[*]}"
echo "" >> $LOG_FILE
log_message "Attempting to clean them up..."
echo "" >> $LOG_FILE
for PID in "${{ZOMBIES[@]}"; do
PARENT_PID=$(ps -p $PID -o ppid= | tr -d ' ')
if [ -n "$PARENT_PID" ]; then
log_message "Attempting to notify parent process $PARENT_PID to clean up zombie $PID"
echo "" >> $LOG_FILE
# Send SIGCHLD signal to the parent process
#kill $PARENT_PID > /dev/null 2>&1 || true
# Sleep for 4 seconds to allow the parent process to handle the zombie
#sleep 4
# Check if the zombie process still exists
if ps -p $PID > /dev/null 2>&1; then
log_message "Zombie process $PID not cleaned up by parent process $PARENT_PID, forcefully terminating it"
echo "" >> $LOG_FILE
# Forcefully terminate the parent so that init can reap the zombie.
# Test kill directly: with "|| true" appended, $? is always 0.
if kill -9 "$PARENT_PID" > /dev/null 2>&1; then
log_message "Successfully terminated parent process $PARENT_PID of zombie $PID."
echo "" >> $LOG_FILE
else
log_error "Failed to terminate parent process $PARENT_PID of zombie $PID."
echo "" >> $LOG_FILE
fi
else
log_message "Zombie process $PID cleaned up by parent process $PARENT_PID."
echo "" >> $LOG_FILE
fi
else
log_error "Unable to find parent process ID for zombie $PID."
echo "" >> $LOG_FILE
fi
done
else
log_message "No additional zombie processes found after 60 seconds."
echo "" >> $LOG_FILE
fi
else
log_message "No zombie processes found."
echo "" >> $LOG_FILE
fi
EOF
delete "{parameter "ErrorFolder"}/cpu.sh"
move __createfile "{parameter "ErrorFolder"}/cpu.sh"
wait chmod 775 "{parameter "ErrorFolder"}/cpu.sh"
override wait
wait /bin/bash "{parameter "ErrorFolder"}/cpu.sh"
if {exists file "error.txt" of folder (parameter "ErrorFolder")}
Exit 100
endif
//CODE ENDS
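For the hang at the wait step, one option is to bound how long the script may run. The following is only a minimal sketch, assuming GNU coreutils `timeout` is installed on the endpoint; `run_with_limit` is a hypothetical helper name, not part of the fixlet above. It is shown as plain bash, so inside a createfile block each `{` would need to be doubled as in the script above.

```shell
#!/bin/bash
# Hypothetical helper: run a command, but give up after LIMIT seconds.
# Assumes GNU coreutils `timeout` is on the PATH.
run_with_limit() {
    local limit="$1"
    shift
    # Send SIGTERM at the limit, then SIGKILL 5 seconds later if the
    # command is still alive. timeout exits with 124 when SIGTERM was enough.
    timeout -k 5 "$limit" "$@"
}

# Example use (path is illustrative only):
# run_with_limit 300 /bin/bash /tmp/cpu.sh
```

An exit status of 124 would then indicate that the limit was hit rather than that the script finished on its own.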
I have also tried using the run command instead of wait, but I need exit code 0 with a Fixed status, which I don’t get with run. I have also tried plain kill instead of kill -9, but that shuts down my client. Any help would be appreciated.
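Since killing a parent can take the client down, it may be worth filtering which parents are eligible before any signal is sent. The sketch below is one possible way to do that, not a verified fix: `zombie_parents` is a hypothetical helper that reads `pid ppid stat` rows and prints each zombie's parent PID once, skipping any PID passed as an argument. It assumes that the client is the script's direct parent (`$PPID`), which has not been confirmed here. As plain bash, the `{` characters would need doubling inside createfile.

```shell
#!/bin/bash
# Hypothetical helper: read "pid ppid stat" rows on stdin and print the
# unique parent PIDs of zombie rows, skipping every PID given as an argument.
zombie_parents() {
    awk -v skip="$*" '
        BEGIN {
            # Build a lookup table of PIDs that must never be printed.
            n = split(skip, s, " ")
            for (i = 1; i <= n; i++) protected[s[i]] = 1
        }
        # Zombie row, parent not protected, parent not printed before.
        $3 ~ /^Z/ && !($2 in protected) && !seen[$2]++ { print $2 }
    '
}

# Example use (assumption: PID 1, this shell, and its parent must survive):
# ps -e -o pid=,ppid=,stat= | zombie_parents 1 "$$" "$PPID"
```

Matching on the stat column alone, rather than grepping the whole line for `Z`, also keeps the selection tied to the process state.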
