CONTINUAL INTEGRATION – Page 158 – A Technical I.T./DevOps Blog

How Do You Find the Health Status of a Jira Cluster in the Web UI?

Problem scenario
When you logged into the web UI for Jira, you saw one or more error messages. How do you find out if the error is still valid?

You want to find the web UI page in Jira that will tell you about the health of the cluster for one of the following error categories:

-Supported platforms
-Database
-File system
-Indexing
-Attachments
-Cluster
-Secondary Storage

Where do you go to view the status of detected problems or potentially detected problems?

Solution
For the categories listed above, do these four steps:
1. Log into the web UI of Jira as an administrator.
2. Go to the gear symbol in the upper right hand corner and click on it.
3. Go to "System" in the menu that drops down from the gear symbol.
4. Go to "Troubleshooting and support tools" on the left hand side.

For the cluster itself do this:
1. Log into the web UI of Jira as an administrator.
2. Go to the gear symbol in the upper right hand corner and click on it.
3. Go to "System" in the menu that drops down from the gear symbol.
4. Go to "Cluster Info" on the left hand side.

How Do You Troubleshoot a Process in Linux That Seems to Hang Forever?

Problem scenario
You have a long-running process that seems to not be progressing. The problem is reoccurring. How do you troubleshoot it?

Possible solution #1
Start a duplicate terminal session. Use "sudo ps -evf | grep foobar" where "foobar" is the name of the process. Try the "top" command to see if there are memory or CPU constraints. Other commands that may be useful are: strace, dbux, lsof (list open files), and sar (system activity report).
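
The same check can be scripted. Here is a minimal sketch, assuming a Linux server with /proc mounted; the function names are made up for illustration. It reads the one-letter state of each process whose command name contains "foobar". A state of "D" suggests the process is stuck waiting on I/O, and "T" suggests it has been suspended.

```python
import os

# One-letter process states as reported in /proc/<pid>/stat on Linux.
STATE_NAMES = {
    "R": "running",
    "S": "sleeping (interruptible)",
    "D": "waiting on I/O (uninterruptible)",
    "T": "stopped (suspended)",
    "Z": "zombie",
}


def parse_stat_line(line):
    """Return (pid, command, state) from the text of /proc/<pid>/stat."""
    # The command sits in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    command = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, command, state


def find_processes(name, proc_dir="/proc"):
    """Yield (pid, command, state description) for commands containing name."""
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # Only numeric directories are processes.
        try:
            with open(os.path.join(proc_dir, entry, "stat")) as handle:
                pid, command, state = parse_stat_line(handle.read())
        except OSError:
            continue  # The process exited while we were scanning.
        if name in command:
            yield pid, command, STATE_NAMES.get(state, state)
```

To use it, loop over find_processes("foobar") and print each result. Run it a few times; a process that stays in the same "D" or "T" state is worth a closer look with strace or lsof.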

Possible solution #2
Reboot the server and try again.

Possible solution #3
Scroll down. Sometimes a terminal appears to be at the very bottom, but you may have accidentally scrolled up to view something, and what appears to be hanging is not hanging.

Possible solution #4
Is there a deadlock happening? If two processes are started and will not stop until the other one is complete, then they may stay active. You may need to manually intervene. Suspending or interrupting a process may be in order. Is it possible that another administrator is suspending the process? To learn more about managing suspended Linux commands, see this link.

Possible solution #5
If for security reasons you cannot create a duplicate session and the other suggestions above did not work, can you set up a monitoring solution? Using Dynatrace, Zabbix, or Nagios can help you look into the details of the problematic server. See these postings for how to set up a monitoring solution and an agent:

How Do You Troubleshoot the NFS Message About “Transaction order is cyclic”?

Problem scenario
You try to start the NFS service with a command like this: sudo systemctl start nfs.service

However, you get an error like this: "Failed to start nfs.service: Transaction order is cyclic. See system logs for details. See system logs and 'systemctl status nfs.service' for details."

Possible Solution #1
Modify the /etc/fstab file. Is it corrupt? You may need to remove recently added entries and reboot the server.
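
Before rebooting, a quick script can flag entries that look malformed. This is only a rough sketch: the rule that each entry has four to six whitespace-separated fields is a simplifying assumption, and the function name is made up for illustration.

```python
def check_fstab(text):
    """Return a list of (line_number, problem) for suspicious fstab lines."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # Blank lines and comments are fine.
        fields = line.split()
        # Assumed layout: device, mount point, type, options, then the
        # optional dump and fsck-pass numbers.
        if not 4 <= len(fields) <= 6:
            problems.append(
                (number, "expected 4 to 6 fields, found %d" % len(fields)))
            continue
        for value in fields[4:]:
            if not value.isdigit():
                problems.append(
                    (number, "dump/pass field is not a number: %r" % value))
    return problems


if __name__ == "__main__":
    with open("/etc/fstab") as handle:
        for number, problem in check_fstab(handle.read()):
            print("line %d: %s" % (number, problem))
```

A clean result does not prove the file is correct. It only means no entry has an obviously wrong shape.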

Possible Solution #2
You may want to look at the systemd unit file for nfs.service. Was it recently modified? You may want to add this line to its [Unit] section:
DefaultDependencies=no
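
systemd reports a cyclic transaction when the ordering rules between units (such as After= and Before=) loop back on themselves. The sketch below shows the idea with a small depth-first search. The unit names and the dependency map are hypothetical and are not read from a real system.

```python
def find_cycle(after):
    """Return one ordering cycle as a list of unit names, or None.

    `after` maps each unit to the units it must start after.
    """
    visiting, done = [], set()

    def visit(unit):
        if unit in done:
            return None
        if unit in visiting:
            # This unit is already on the current path: that is a cycle.
            return visiting[visiting.index(unit):] + [unit]
        visiting.append(unit)
        for dependency in after.get(unit, ()):
            cycle = visit(dependency)
            if cycle:
                return cycle
        visiting.pop()
        done.add(unit)
        return None

    for unit in after:
        cycle = visit(unit)
        if cycle:
            return cycle
    return None


# Hypothetical ordering rules, for illustration only.
units = {
    "nfs.service": ["remote-fs.target"],
    "remote-fs.target": ["mnt-share.mount"],
    "mnt-share.mount": ["nfs.service"],
}
print(find_cycle(units))
```

Removing any one rule in the loop breaks the cycle. That is why dropping a recent /etc/fstab entry or removing a unit's default dependencies can clear the error.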

How Do You Instantiate a Class in Python?

Problem scenario
You inherited code from a different programmer. You want to test out an object. How do you create an object of a pre-made class?

Solution
Look at how many parameters the class's __init__ method takes (not counting "self"). Take this code for example:

class Cartesian(object):
    def __init__(self,a = 0,b = 0):
        self.a = a
        self.b = b

    def distance(self):
        return (self.a**2 + self.b**2) ** 0.5 # Pythagorean theorem.

You find parameters "a" and "b", each with a default value of 0. So you can supply up to two arguments when you create the object. Here is a line of code that should appear in the same program beneath the class definition:

contint = Cartesian(4, 1)

To test it out, place this line of code at the bottom:

print(contint.distance())

This program may show how it all works more clearly:

class Cartesian(object):
    def __init__(self,x = 0,y = 0):
        # Store each argument under its own name.
        self.x = x
        self.y = y

    def distance(self):
        # Difference between the two stored values.
        return (self.x - self.y)

contint = Cartesian(4, 1)

print(contint.distance())
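
Because both parameters have default values, the arguments are optional. The short sketch below reuses the first Cartesian class to show three ways to create an object. The variable names are only for illustration.

```python
class Cartesian(object):
    def __init__(self, a=0, b=0):
        self.a = a
        self.b = b

    def distance(self):
        # Distance from the origin (Pythagorean theorem).
        return (self.a ** 2 + self.b ** 2) ** 0.5


origin = Cartesian()        # both defaults are used: a=0, b=0
on_axis = Cartesian(b=2)    # keyword argument; a keeps its default
point = Cartesian(3, 4)     # positional arguments: a=3, b=4

print(origin.distance())    # 0.0
print(on_axis.distance())   # 2.0
print(point.distance())     # 5.0
```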

How Do You Troubleshoot the Error “rsync mkdir failed… rsync error in rsync protocol data stream (code 12)”?

Problem scenario
You try to use rsync or you use Ansible's synchronize module. It fails with this message: "rsync mkdir failed... rsync error in rsync protocol data stream (code 12)"

Solution
Is the destination directory writable? Verify that the destination exists and that the user the Ansible playbook connects as has permission to write to it.
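
Here is a minimal sketch of that check. It assumes you run it on the destination server as the same user the playbook uses; the function name is made up for illustration. It also looks at the parent directory, because rsync normally creates only the last component of the destination path.

```python
import os


def check_destination(path):
    """Describe why creating or writing to `path` might fail."""
    parent = os.path.dirname(os.path.abspath(path))
    if os.path.isdir(path):
        if os.access(path, os.W_OK | os.X_OK):
            return "ok: directory exists and is writable"
        return "destination exists but is not writable by this user"
    if not os.path.isdir(parent):
        return "parent directory is missing: " + parent
    if not os.access(parent, os.W_OK | os.X_OK):
        return "parent directory is not writable: " + parent
    return "ok: destination can be created inside " + parent
```

For example, print(check_destination("/path/to/destination")) reports which of the four cases applies to that path.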

How Do You Address a Problem Message about a Directory Being Read-only when It Is Not Read-only?

Problem scenarios
One or both of the following applies to your situation:

#1 In Linux you try to move a directory that is not read only (e.g., its permissions are rwxrwxrwx or 777). You get an error about it being read-only. Your user is the owner of the directory.

#2 You are running an Ansible playbook and trying to change the attributes of the file. You get a failure message like this: "Err no 30: read-only" when you run the playbook.

What should you do?

Possible Solution #1
With shared directories you can get an error about the directory being read-only when it is not read-only. Try to copy the directory and its contents to a different location (e.g., cp -R /path/of/source /path/to/destination). Delete or remove the problem directory. Now use the backed-up directory in the new location for whatever purposes you need.

Possible Solution #2
Stop sharing of the directory momentarily. Then you may be able to perform the operation.

Possible Solution #3
Reboot the server that appeared to be the cause of the problem. Perhaps there is a corruption issue or the apparent permissions need to be updated. Rebooting may fix the problem or help you diagnose the problem.
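
Error number 30 on Linux is EROFS ("Read-only file system"). It refers to how the file system is mounted, not to the rwx permission bits, so a directory with 777 permissions can still produce it. The sketch below is one way to check, assuming a Linux server where /proc/mounts is readable. The function names are made up for illustration, and mount points containing spaces are not handled.

```python
import os


def mount_for(path, mounts_text):
    """Return (mount_point, options) of the mount that contains `path`."""
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # Skip lines that are not mount entries.
        mount_point, options = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Keep the longest match; on a tie the later entry wins.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, options)
    return best


def is_read_only(path, mounts_text):
    """Return True if the mount holding `path` carries the 'ro' option."""
    found = mount_for(os.path.abspath(path), mounts_text)
    return found is not None and "ro" in found[1]


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as handle:
        print(is_read_only(".", handle.read()))
```

If the result is True, the fix lies in how the share or file system is mounted rather than in chmod or chown.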

How Do You Solve an Ansible Problem about an SSL Certificate Error?

Problem scenario
You run an Ansible playbook. You receive an error about the SSL certificate not being valid. What should you do to get the playbook to transfer a file from a website using SSL or TLS to a managed node?

Possible Solution #1
This may apply if the problem pertains to retrieving a file from a website URL. If the certificate error occurs between the managed node and the website, and not between the control node and the website, use the get_url module to download the file to the Ansible control node instead. It can be easy to get one server (the control node) to have the SSL certificate. To learn how to bypass the managed node when bringing the file down from the website (e.g., in situations where the control node has the SSL certificate), this posting has explicit directions for using the get_url module.
Once the file is brought down to the control node, the playbook could then use the copy module to transfer the file to the managed node. There is an intermediate copy step, but this can save time if you have many managed nodes that need a big file.

Possible Solution #2
Modify the playbook and use "validate_certs: false" with the get_url module (place it underneath the module name, indented like the module's other parameters).

Possible Solution #3
Update the certificate on the managed node.

What Should You Do When Linux Is Not Finding Your USB Camera?

Problem scenario
In Linux you looked in the /dev/ directory for the camera and could not find a file associated with your web camera. You verified the locally attached camera is connected via a USB cable to your physical computer.

When you try to use it, you see this message:

"--- Opening /dev/video0...
stat: No such file or directory"

What should you do?

Solution
1. Verify the camera is turned on. If the camera is turned off, you may not see it.
2. Verify the camera works on another computer. If it does, try a different USB port on the Linux computer with the problem. If it does not, the camera may need to be replaced.
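
After each step above, you can re-check whether a device file has appeared. Here is a minimal sketch, assuming the camera would show up as /dev/video0, /dev/video1, and so on; the function name is made up for illustration.

```python
import glob
import os


def video_devices(dev_dir="/dev"):
    """Return the sorted video device nodes found under dev_dir."""
    return sorted(glob.glob(os.path.join(dev_dir, "video*")))


if __name__ == "__main__":
    devices = video_devices()
    print(devices if devices else "no /dev/video* nodes found")
```

Run it once with the camera unplugged and once with it plugged in. If the list does not change, Linux is not creating a device file for the camera.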

How Do You Get a Jira Instance to Work When the Web UI Is Not Working but the Listener Is Active on the TCP/IP Port?

Problem scenario
A Jira deployment is failing. The web services are listening on port 8080 or whichever port you configured. But the web UI does not load. What should you do?

Solution
Root cause: "...JIRA is not designed to import data from a new version into an older version." (This quote is taken from Atlassian's website.)

Has the database been used in the past? If you downgraded your Jira instance, the database needs to be empty or restored from a backup. Once Jira has been upgraded and connected to a SQL database, that database will not work for older versions of Jira.
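
To confirm the symptom from the problem scenario (the port accepts connections but the web UI does not answer), a small probe can separate the two cases. This is a sketch only. The function name is made up, and the host and port in the usage note are assumptions that you should replace with your own values.

```python
import http.client
import socket


def probe(host, port, timeout=2.0):
    """Classify a web endpoint as closed, listening-only, or answering HTTP."""
    try:
        # Step 1: can we open a TCP connection at all?
        socket.create_connection((host, port), timeout=timeout).close()
    except OSError:
        return "closed: nothing accepts TCP connections on this port"

    # Step 2: does the listener answer an HTTP request?
    connection = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        connection.request("GET", "/")
        status = connection.getresponse().status
    except (OSError, http.client.HTTPException):
        return "listening, but no valid HTTP response"
    finally:
        connection.close()
    return "HTTP %d" % status
```

For example, print(probe("localhost", 8080)) shows whether Jira is only listening or is actually returning pages. An HTTP error status or no HTTP response while the port is open is consistent with a startup failure such as the database version mismatch described above.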

How Do You Troubleshoot Kibana Not Working Correctly?

Problem scenario
Elasticsearch works but Kibana does not work. When you run the command to start Kibana, you get a warning about kibana-monitoring and a cluster_block_exception error saying the request was blocked because the service is unavailable. What should you do?

Possible Solution
Reboot the server.