Updated on 6/8/20
This version is the unabridged version. For the abridged version, you may click here.
Problem scenario
A Puppet manifest is not working, but there are no obvious error messages. When running the puppet agent command, you use the -d flag for debugging. In your manifest, you use the logoutput => true attribute on your exec resources. But still, you cannot figure out why your manifest is not working.
You tried this command: # puppet parser validate nameOfManifest.pp
The above command had no output. On the Puppet Master server, you looked in /var/log/ to find clues. When the Puppet Agent runs and communicates with the Puppet Master server, the activity on the back end of the Puppet Master should be logged, at least in part, to /var/log/puppetlabs/puppetserver/puppetserver.log. You see no error messages anywhere that correspond to a puppet agent run. You want to fix the manifest that is not producing the changes or effects that you want. Debugging can be difficult when Puppet appears to work perfectly. It seems hard to solve a manifest problem with no runtime errors and no errors in the logs.
How do you debug or otherwise troubleshoot a silently failing manifest that will not apply correctly?
Answer
Here are 28 possible solutions, tactics or strategies for handling a Puppet manifest that is not working when there are no explicit errors. For this posting, we define "not working" as discrepant from your desires or expectations for the manifest. There should be some suggestions to get a manifest to work (regardless of whether you are using Puppet with purely Linux, purely Windows, or have a heterogeneous environment with Puppet).
A bug in a manifest can be difficult to detect. Buggy manifests may not be readily reproducible either. Possible solution #28 is a collection of individual links and external articles for more tips and tricks to troubleshoot, fix and resolve your hidden or silent Puppet manifest problem. We recommend patiently reading the possible solutions that pertain to the OS of your Puppet agent having the problem.
Is your Puppet Agent running on Linux? If the answer is yes, every possible solution may help you except #10, #12, and #15.
Possible solution #1: Are you sure the manifest is not doing what it is actually supposed to be doing (as opposed to what you think it ought to do)? The manifest could have logic that, against your wishes, ensures the manifest will apply to certain servers but not the server you want.
The "puppet parser validate nameOfManifest.pp" (when run from the Puppet Master server) will find open braces for which there is no closing brace. Unmatched braces will result in a message. However this above command will not find the "problem" of having braces that are terminated in a way that excludes certain resource declarations. Here is an example:
class continualIntegration {
  file { 'c:\Programs\something.txt':
    ensure => present,
  }
}
exec { 'dosomething':
  ...
}
Here the exec {} declaration in the example above is outside of the class braces. The parser accepts a resource declaration outside of the braces of a class, so the validate command finds no problem. Misplaced braces in a Puppet manifest will not throw an error as long as every open brace has a matching close brace. These problems (such as a brace that closes the class too early and thus excludes a section of the manifest) can be hard to find. If the Puppet agent is a Windows server, experienced PowerShell users may find that the symptoms are consistent with not being administrator when the .ps1 file executes. The actual problem may be that the .ps1 file is not executing at all because the "exec" resource declaration is not in the class in the manifest. Manually look for extra commas or missing braces too. Imperfect logic can be a big cause of errorless Puppet manifest runs not doing what is intended. (See also #3 and #5 for logic-related issues.)
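For comparison, here is a minimal sketch of the same manifest with the exec moved inside the class braces. The class name, the command and the marker file are hypothetical stand-ins (the original example elides the exec body), and the paths use forward slashes, which Puppet accepts for Windows file resources.

```puppet
# Hypothetical class name; lowercase names are the safe convention.
class continual_integration {
  file { 'c:/Programs/something.txt':
    ensure => present,
  }

  # Inside the class braces now, so it is applied whenever the class is.
  exec { 'dosomething':
    # Hypothetical command, used only to make the sketch complete.
    command => 'C:/Windows/System32/cmd.exe /c echo done > c:/Programs/ran.txt',
    # Skip the exec once the marker file exists.
    creates => 'c:/Programs/ran.txt',
    require => File['c:/Programs/something.txt'],
  }
}
```

Both versions pass "puppet parser validate", which is why reading the brace structure by eye remains necessary.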
Possible solution #2: If you are using Puppet 3.x, verify you have a site.pp file in the /etc/puppet/manifests directory. If you are using Puppet 5.x, verify site.pp is in /etc/puppetlabs/code/environments/production/manifests. Different locations for site.pp can work, but you must configure Puppet properly. With a basic deployment of Puppet Master, manifests under other names will not work. If you want to have manifests under different names in addition to site.pp, the site.pp file must be configured to do so. Having no site.pp file in this directory can cause Puppet Agent operations to fail with no errors.
Possible solution #3: A manifest can contain the incorrect word "requires" (commonly confused with "require") in a resource declaration. Puppet's parser validate command will not find an error if you have the word "requires" instead of the word "require".
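As an illustration, a sketch of the correct spelling follows. The package and service names are assumptions chosen only for the example.

```puppet
package { 'ntp':
  ensure => installed,
}

service { 'ntpd':
  ensure  => running,
  # The metaparameter is "require" (singular). "requires" is not a
  # metaparameter, and a syntax-only check does not look at attribute names.
  require => Package['ntp'],
}
```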
Possible solution #4: If you recently installed Puppet agent on a server, or changed the configuration of the Puppet agent, make sure you configured it with the FQDN of the Puppet Master server. The IP address of the Puppet Master server will not work. The /etc/hosts file or configuring DNS are options to enable the FQDN to work.
Possible solution #5: Make sure the Puppet agent server has been classified properly in the ENC (External Node Classifier) with the Puppet Master. Log into the web UI Puppet dashboard if applicable. Go to "Nodes" (on the left) then go to "Classification." On the "Rules" tab go down to "Certname." Below this section should be a text field for "Node name." Type in the FQDN of the Puppet Agent server. Click the "Pin node" button on the right. Click "Commit 1 change" at the bottom.
Without a Puppet dashboard, the Puppet Master must sign the certificate from the Puppet Agent before anything will work. The lack of logging could be caused by something like this.
Possible solution #6: If the manifest has a node definition in it, you may be working with the wrong Puppet Agent server. This could give you a manifest that throws no errors. The puppet agent service could run without errors too.
class continualIntegration {
  node 'north.continualintegration.com' {
    file { '/var/newfile':
      ensure => present,
    }
  }
}
A manifest with the " 'north.continualintegration.com' { " definition will only apply to a server with such a DNS name. This node definition could exclude the "Puppet agent" server that you want the manifest to apply to. You may be on a server with the DNS name of "south."
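One way to make this situation visible is a default node definition that announces itself. The following is a sketch for site.pp, assuming Puppet 4 or later (for the $trusted variable); the notify title and message are hypothetical.

```puppet
node 'north.continualintegration.com' {
  file { '/var/newfile':
    ensure => present,
  }
}

# Any agent that matches no other node definition lands here.
node default {
  notify { 'unmatched-node':
    message => "No node definition matched ${trusted['certname']}",
  }
}
```

If the message shows up in a "puppet agent -t" run on the server named "south", the manifest is working as written; it is simply scoped to a different node.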
Possible solution #7: The manifest that you are trying to apply has not been classified properly on the Puppet Master server (e.g., in the ENC). The behavior of such a failing manifest would be that there were no errors to help you diagnose the problem when the puppet agent service runs on the Puppet Agent server. Assuming that you used "puppet module generate yourname-newmod" to create the module (associated with the manifest you are trying to run) on the back end of the Puppet Master server, you must go to the Puppet Master's web UI (also known as the dashboard). Then go to Nodes -> Classification -> Classes. There is a refresh link in the web UI that you may want to click. Then in the "Add New Class" field, type in "newmod" (the name of the module with no quotes and without "yourname"). There should be a suggestion of a matching name beneath the field. Click it then click "Add." Commit the change. The module will not be found if there is an error. On the back end of Puppet Master, run "puppet parser validate init.pp" (where init.pp is the new manifest). Make sure that there are no errors. If there are no errors, you should be able to add the class.
Possible solution #8: Temporarily modify the manifest purely for the sake of debugging. One way to diagnose the problem is to use native exec commands (e.g., whoami or env) and redirect the output to a file. Then after the puppet agent runs, go back and review the content of the file. Sometimes learning about the user account that issues the command can help you debug the problem. If there is a security restriction on a resource (e.g., a directory) on the Puppet Agent node, running whoami > /path/to/debug.txt may elucidate the problem. Using the date command multiple times can help you understand the flow of execution and the timing of the manifest. Another way to learn more about a complex yet quietly failing manifest is to use Puppet's notify resource type or its notice function. These can help you obtain more detailed messages without using regular Bash or PowerShell commands. To learn more about adding debugging details, see this external link. Be aware that some Bash commands invoked by non-Bash (e.g., Python) programs work when a user runs the non-Bash program but not when puppet agent calls the non-Bash program. Theoretically Puppet is a declarative CM tool. But the reality is that I.T. organizations like to use procedural calls and imperative techniques.
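The debugging techniques above might look like the following sketch for a Linux agent. The resource titles and the /tmp/puppet_debug.txt path are hypothetical; remove these resources once the problem is found.

```puppet
# Record which user, time and environment the agent runs commands with.
exec { 'debug-whoami':
  command  => 'whoami > /tmp/puppet_debug.txt; date >> /tmp/puppet_debug.txt; env >> /tmp/puppet_debug.txt',
  path     => ['/usr/bin', '/bin'],
  # The shell provider hands the command to /bin/sh, so redirection works.
  provider => shell,
}

# notify shows up in the agent's own output.
notify { 'debug-marker':
  message => 'Reached the debug marker in this manifest',
  require => Exec['debug-whoami'],
}

# notice() is evaluated when the catalog is compiled, so for agent runs its
# message is logged where compilation happens (on the Puppet Master).
notice('Compiling the manifest that contains the debug marker')
```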
Possible solution #9: Look at the manifest's syntax closely. Find the resource declaration that is supposed to do what you are not seeing. Are you sure you did not comment out a key stanza or entire resource declaration? In mirrored/replicated environments, multiple people can work on the manifests. In collaborative environments, someone may have been doing testing (more commonly in development and quality assurance environments). No error will be thrown if a manifest has a section that is commented out.
Possible solution #10: If the Puppet Agent node is a Windows server, remember that the manifest will be applied under the security context of the user who ran the "puppet agent -t -d" command from a PowerShell prompt [opened as administrator]. A Puppet manifest can launch a Scheduled Task, with evidence in the logs that it ran at a specific time, yet the effects did not happen as expected. It could work if the local system user runs "puppet agent -t -d". For the automatic runs (by default they happen every 30 minutes on a Windows Server or when the server is rebooted), the user is "nt authority\system." This is the local system user with greater rights than a local administrator user. The user initiating the puppet agent -t -d command may not have permissions as broad as those of the local system user. Therefore you should try rebooting, or log off and wait 30 minutes, to allow the automated run of Puppet agent to happen. Ultimately it can benefit your troubleshooting efforts to see what a manifest does running as "nt authority\system" as well as under your domain credentials; you may see a difference between an automated run and the results of manually opening PowerShell as administrator and invoking "puppet agent -t -d". The differences may give insight as to why the Puppet manifest is not applying even though there is nothing in the relevant logs and no visible errors at run time.
Possible solution #11: If the manifest works intermittently, this suggestion is highly recommended, because without it the manifest may fail with no errors. Use "require =>" in your manifest. This is particularly useful for a manifest that transfers a script then runs it. The resources in a Puppet manifest are normally applied from the top to the bottom. But when an exec {} resource in a Puppet manifest has the "require =>" attribute, the resource named in that attribute will always be applied first. This can ensure the order happens the way you want it. Please note that PuppetLabs says that exec statements must be idempotent. Despite this best practice we know in the real world that manifests with exec statements that are not idempotent can work fine for certain business needs.
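A sketch of the transfer-then-run pattern with an explicit "require" might look like this. The module name (contint), the script path and the marker file are assumptions; the sketch also assumes /opt/scripts already exists on the agent.

```puppet
file { '/opt/scripts/setup.sh':
  ensure => file,
  mode   => '0755',
  source => 'puppet:///modules/contint/setup.sh',
}

exec { 'run-setup':
  command => '/opt/scripts/setup.sh',
  # Guarantees the script has been transferred before it is run.
  require => File['/opt/scripts/setup.sh'],
  # Guard against reruns; assumes the script creates this marker file.
  creates => '/opt/scripts/.setup_done',
}
```

The "creates" guard is optional, but it is one simple way to move an exec toward the idempotence that the paragraph above mentions.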
Possible solution #12: For Windows servers that are Puppet agents, a manifest may have a command that refers to a network share. Such a network location may use a "\\" constructor in a PowerShell command. Getting the back slashes to work correctly on the Windows Puppet agents can be complicated with little to no logging or error reporting. The manifest on the Linux Puppet Master should have a command like this (to be applied on a Windows server) if you are using back slashes:
command => 'echo "copy-item \\\serverName/path/to/integration.txt c:/temp/destination/file.txt" > c:/temp/continual.ps1'
The above has three backslashes because, in a single-quoted Puppet string, the first two backslashes translate to only one on the Windows server and the third is kept as it is. The command above will result in a file c:\temp\continual.ps1 with this as the content:
copy-item \\serverName/path/to/integration.txt c:/temp/destination/file.txt
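To check how a given string is unescaped before wiring it into an exec, a throwaway notify can print it. This is a sketch with hypothetical resource titles; under Puppet's single-quote rules (a doubled backslash becomes one, any other backslash is literal) both messages should begin with two backslashes.

```puppet
# Three backslashes: the first pair collapses to one, the third stays literal.
notify { 'three-backslashes':
  message => '\\\serverName/path/to/integration.txt',
}

# Four backslashes: each pair collapses to one.
notify { 'four-backslashes':
  message => '\\\\serverName/path/to/integration.txt',
}
```

Running this with puppet apply on a test machine shows the exact text Puppet will hand to the command, which removes one layer of guesswork.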
Possible solution #13: Did you forget to use Puppet apply? A puppet agent -t -d run will normally use the catalog that the Puppet Master compiles from its manifests. Try using puppet apply with the manifest in question. Then try the puppet agent -t -d command again.
Possible solution #14: If the manifest is failing to transfer a file from the Puppet Master server to a Puppet Agent server, look at the "source" declarations. Sometimes we write manifests based on other manifests and forget to change the directory path. Here is an example:
source => 'puppet:///modules/contint/foo.bar',
The corresponding "contint" directory must actually have a subdirectory named "files." This directory "files" is not explicit in the manifest's source attribute. The "files" subdirectory should be a sibling of the "manifests" directory. Both of these directories should have a parent directory of "contint" (for this example). If the manifest you copied from was for a module named "coolfun", that name may accidentally be where "contint" should be. In other words, you may have forgotten to change a word from a previous manifest that was the source of the copy you are now working with. Normally the manifest should only take files from the "files" directory for its corresponding module. It should not cross over into other modules. Scrutinize the "source" attribute if that is the part of your manifest that is not working. Verify that the source file exists at the location the puppet:/// URI points to, keeping in mind the nonintuitive convention that the "files" directory is not written in the URI itself. If the directory path is wrong altogether, or strict permissions were introduced on the Linux server so the Puppet Master cannot retrieve the source file, this could be the root cause of your problem.
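The mapping between the URI and the directory layout can be sketched as follows. The destination path /tmp/foo.bar is hypothetical, and <modulepath> stands for whatever module directory your Puppet Master is configured to use.

```puppet
# Layout on the Puppet Master:
#   <modulepath>/contint/manifests/init.pp
#   <modulepath>/contint/files/foo.bar
class contint {
  file { '/tmp/foo.bar':
    ensure => file,
    # "files" is implied: this URI maps to <modulepath>/contint/files/foo.bar
    source => 'puppet:///modules/contint/foo.bar',
  }
}
```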
Possible solution #15: One reason logging may be limited is that your manifest has a long duration process. Some manifests initiate a complex PowerShell script. The logging may last for a few seconds starting when the Puppet agent runs despite the PowerShell script taking minutes to complete. To have continuous logging, see this posting.
Possible solution #16: This solution is for Puppet manifests that do not work properly yet have no timeout errors and no messages in the logs and have the following two traits:
- The manifest has an exec resource that invokes a Bash/PowerShell command or launches a PowerShell or Unix/Linux shell script.
- Running "puppet agent -t" eventually results in a message similar to "Catalog applied after approximately 300 seconds."
This problem can happen with or without "timeout => 3600" stanzas in the Puppet manifest. The amount of time the long-duration PowerShell/Bash script or PowerShell/Bash command takes may vary depending on environmental factors.
For Linux/Unix, use this paragraph. First determine (e.g., by estimating or by testing) the longest number of minutes the script will take. Next, use the "sleep" command. There are two ways to use the "sleep" command for this solution if the Puppet manifest transfers and runs a Bash (or other shell) script. One way is to edit the script itself to have a "sleep 3m" command on the last line (where three minutes is a long estimate of the number of minutes the script would take). A second way, and the only method if the long-duration Bash command does not involve a script, is to modify the exec resource in the Puppet manifest. Create a compound Bash statement on the "command" stanza like this: <original command> ; sleep 3m
For PowerShell, use this paragraph. First determine (e.g., by estimating or by testing) the longest number of minutes the script will take. Next, use the "start-sleep" command. There are two ways to use the "start-sleep" command for this solution if the Puppet manifest transfers and runs a PowerShell script. One way is to edit the script itself to have a "start-sleep 180" command on the last line (where three minutes is a long estimate of the number of minutes the script would take). A second way, and the only method if the long-duration PowerShell command does not involve a script, is to modify the exec resource in the Puppet manifest. Create a compound PowerShell statement on the "command" stanza like this: <original command> ; start-sleep 180
This is a workaround. It is unclear why Puppet will sometimes ignore "timeout => 3600" stanzas in manifests.
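For a Linux agent, the second approach might be sketched like this. The script path is hypothetical, and the shell provider is an assumption made here so that the ";" is interpreted by /bin/sh rather than passed to the script as an argument.

```puppet
exec { 'long-running-script':
  # /opt/scripts/long_job.sh is a hypothetical script path.
  command  => '/opt/scripts/long_job.sh ; sleep 3m',
  path     => ['/usr/bin', '/bin'],
  # The shell provider passes the whole command line to /bin/sh.
  provider => shell,
  timeout  => 3600,
}
```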
Possible solution #17: Take the manifest file and put it locally on the node server (e.g., /tmp/test.pp). Then run it locally: puppet apply /tmp/test.pp. This may not work if the manifest is complex and refers to modules and variables that are on the Puppet Master server. But it could potentially help you.
Possible solution #18: Does your Puppet manifest rely on a Puppet module? The modules that you installed may not be in the correct directory. Puppet agents may run with no errors, but the modules you installed on Puppet Master will not be deployed with runs of Puppet agent. One solution is to reinstall the modules on Puppet Master. Did you recently upgrade the Puppet Master server? For upgrading the Puppet Master server, it is advisable to migrate to a new server rather than upgrade what you already have (in place). Puppet's website says "[w]e recommend that you set up a new VM or physical system to start fresh." For directions on reinstalling Puppet agent or Puppet master, see "Possible solution #24" below.
Possible solution #19: You can try rebooting the Puppet agent server and Puppet master server. If you cannot reboot the Puppet master server, read the following as an alternative. On the Puppet master server, sometimes the "puppet master" service does not stop and start properly. A zombie process silently persists. Run this: sudo ps -ef | grep puppet | grep master
Make a note of the time this process started based on the results of this command. Run the commands to stop and start puppet master. If they don't work to stop it properly (based on the results of the sudo ps -ef command), you may need to use sudo kill -9 <pid> (where <pid> is the process ID found from the sudo ps -ef | grep puppet | grep master command). Then restart the puppet master service.
Possible solution #20: This possible solution may count as a logged activity (and not in keeping with the errorless theme of this guide), but it can be an overlooked source of an error message. On the Puppet master server the /var/log/puppetlabs/mcollective/mcollective.log file may register Puppet agent nodes when typical puppet agent -t -d runs happen on those clients. Even though you do not run mco, this log on the Puppet master server could help. The information recorded on the Puppet master server in /var/log/puppetlabs/mcollective/mcollective.log may or may not help you diagnose a Puppet manifest that does not run properly.
Possible solution #21: The puppet certificate has not been signed on the Puppet master server yet. Have manifests ever worked on the Puppet agent server? If not, the problem could be that the certificate has not yet been signed. Go to the Puppet master server and run this (as the user who started the puppet master service): puppet cert list. If you used sudo to start the service, then run sudo puppet cert list. If you started it with the user named puppet, try this command: sudo -u puppet puppet cert list. You will get output if there is an unsigned certificate. The output should clearly have the hostname of the Puppet agent server if this is the problem. An unsigned certificate request could be the reason the Puppet manifest has no errors yet produces no results (and does not work). You may want to try this command too: puppet cert list --all
Possible solution #22: On the Puppet master server run this command to see where Puppet agent will look for the manifests: sudo puppet master --configprint manifest
If the result is "No manifest" see this posting. If the result is a directory but site.pp is not in that directory, that is the problem. You must have a site.pp file in the directory this command produces. If site.pp is in that directory, try this command (with or without sudo) to see what you can see: puppet config print
Run this command and see if anything looks out of the ordinary in its results: sudo puppet agent --configprint all
You may want to go to the Puppet master server and see what you see from this command: sudo puppet master --configprint all
Possible solution #23: Analyzing the code with one of the following tools can help you identify problems. Use https://validate.puppet.com/, puppet-lint or SonarQube to analyze the .pp code for potential problems. For puppet-lint you can go here for more information. For SonarQube, the installation is considerably more involved compared to puppet-lint. SonarQube has a community edition to install; if you need instructions for installing it on Red Hat, see this link. If you need instructions for installing SonarScanner on RHEL, see this link. The plugin to analyze .pp files (Puppet DSL code), can be obtained here. As of 6/8/20, this plugin has not been maintained for years. It may be of some value for motivated Puppet professionals.
Possible solution #24: Reinstall Puppet agent and Puppet master. We have seen partially defective installations. The puppet -V command works, but Puppet is not actually working properly. Reinstalling Puppet is extreme troubleshooting, but it is an option to get a Puppet manifest to finally work. If you do not want to hire someone outside, and you have exhausted other options, this may be viable. To install Puppet master, click on the link according to your operating system: - CentOS/RHEL/Fedora - Ubuntu - Debian - SUSE. To install Puppet agent, click on the link according to your operating system: - CentOS/RHEL/Fedora - Debian or Ubuntu
Possible solution #25: Is something wrong with PuppetDB? If PuppetDB has incorrect/discrepant information, the puppet agent -t -d run may fail without an error. PuppetDB is also called PDB. To troubleshoot it, see this link.
Possible solution #26: Are you 100% sure that the Puppet manifest is not actually doing what you want it to do? This one is not a logic problem. This could arise from a verification problem. You may see no errors because it is working. Re-verify the "failure".
In some instances you may have a way of verifying if the Puppet manifest worked. It could be properly running, and your method of verification is showing you a false negative [that the Puppet manifest did not run properly]. The Puppet manifest could be successful despite what you thought.
Possible solution #27: On a Linux/Unix machine, run echo $? immediately after your command. Does it return a non-zero value? A non-zero value such as "1" indicates the command was not successful. This external page has more information. The command is definitely not working if it returns a non-zero value. If it returns "0", then as far as the OS is concerned, it worked. Hopefully another possible solution can help you if that is the case.
Possible solution #28:
- Another posting of interest is this one from ContinualIntegration.com. This has six miscellaneous tips on Puppet.
- For general information about the free version of Puppet, see this web page. For the enterprise edition, you may want to open a support case.
- For an official article from Puppet named "Which Logs Should I Check When Things Go Wrong?" see this link.
- See this link for an article written in 2009 for more debugging information.
- Here are some slides from a 2016 "Puppet Troubleshooting" presentation by Thomas Uphill.
- To publicly ask a technical Puppet question, some of the forums include expertsexchange.com (this may cost money), stackoverflow.com, serverfault.com, or for Puppet on Linux/Unix (as opposed to Windows), try unix.stackexchange.com. For non-private interactive support sanctioned by Puppet, you may want to try https://puppet.com/community/ or slack.puppet.com.
- For Puppet manifest recommended practices, you have several options to avoid difficult debugging sessions. There are links from Devops.com, DigitalOcean.com, Puppet.com, and Radiant3.ca. You may also want to purchase a book or two related to Puppet.
- For Puppet documentation, you may want to view this page or this archived page ask.puppet.com.
- See this link for a list of Puppet books.
Still stuck? Respectfully remember that just because a professional is certified does not mean he or she is efficient at solving real-world problems in a cost-effective way for your business. Businesses that use Puppet may be trying fairly ambitious security or automation solutions at scale. There may be many moving parts and different individuals involved.
I find it very difficult to debug puppet. How can such a “mature” product lack such basic features. Ansible is no better. The world needs better software.