CONTINUAL INTEGRATION – Page 118 – A Technical I.T./DevOps Blog

How Do You Use Azure Repositories?

Problem scenario
You want to use Git in an Azure Repository. What do you do?

Prerequisites
You have a Linux VM with Git installed.

Procedures
1. Log into Azure.
2. Go to https://dev.azure.com/
3. Click on "New Project"
4. Enter a project name. For your first project, you should probably have "Project visibility" set to "Private."
5. For the prompt "What service would you like to start with?" Choose "Repos".
6. You should see the project name and at the top of the page it will say "is empty." Click "Generate Git Credentials".
7. You will see a username and password. You will only get one opportunity to store this password. Copy it now.
8. Copy the HTTPS URL for your Git repository.
9. Go to your server with Git. Run these commands:

git clone https://URL_COPIED_FROM_ABOVE
cd NAME_OF_CLONED_DIRECTORY
date > file.txt
git add .
git commit -m "Added a file"
git push origin master

How Do You Install Spring Framework with Docker?

Problem scenario
You want to deploy the Spring framework with Docker. How do you do this?

Solution
Warning: The last step in this procedure is not a security "recommended practice." Only follow these directions (with sudo docker run…) if the server is not that important or you are on a very secure network. One published book says you can use "sudo docker …" as long as the server is not in production (page 43 of Docker Up and Running).

Prerequisites
This assumes that you have installed Docker. If you need assistance, see this posting.

Procedures
1.a. Install Spring framework with these directions.
1.b. Run these commands:

sudo ln -s /home/ec2-user/gs-spring-boot-docker/complete/target/dependency/BOOT-INF /home/ec2-user/gs-spring-boot-docker/complete/BOOT-INF

sudo ln -s /home/ec2-user/gs-spring-boot-docker/complete/target/dependency/META-INF /home/ec2-user/gs-spring-boot-docker/complete/META-INF

2. Move into the "complete" directory. (To find it, use sudo find / -name complete -type d )
Here is an example: cd /home/jdoe/gs-spring-boot-docker/complete/target/dependency/BOOT-INF/lib/

3. Run this command: docker build -t "ricepaper:contint" .

4. Make a note of the image ID shown by this command: docker images

5. Create the container with this command, but substitute abcd1234 with the image ID (found in step #4):
docker create -it abcd1234 bash

6. Find the container ID with this command: docker ps -a

7. Start the container with a command like this, but replace efgh5678 with the container ID:
docker start efgh5678

8. Run this command:

sudo docker run -e "SPRING_PROFILES_ACTIVE=prod" -p 8080:8080 -t springio/gs-spring-boot-docker

9. From a web browser, go to the server's IP address on port 8080. That is, compose a URL like this, where x.x.x.x is the external IP address of the server: http://x.x.x.x:8080

Open that URL in a web browser. You should see the browser window displaying "Hello Docker World".
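
Steps 4 through 7 can also be scripted so that the image and container IDs do not have to be copied by hand. This is a minimal sketch, assuming the image was tagged ricepaper:contint as in step 3. The function name start_from_image is hypothetical and not part of Docker.

```shell
# Hypothetical helper: create and start a container from a tagged image.
# Assumes the Docker CLI is installed and the image tag already exists.
start_from_image() {
  image_tag="$1"

  # "docker images -q" prints only the image ID for the given tag.
  image_id=$(docker images -q "$image_tag")
  if [ -z "$image_id" ]; then
    echo "No image found for $image_tag" >&2
    return 1
  fi

  # "docker create" prints the ID of the new container.
  container_id=$(docker create -it "$image_id" bash)

  # Start the container; Docker echoes the ID back on success.
  docker start "$container_id"
}
```

You could then run it like this: start_from_image ricepaper:contint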

How Do You Troubleshoot the HDFS Error “failed on connection exception: java.net.ConnectException: Connection refused;”?

Problem scenario
You have a multi-node Hadoop cluster running Hadoop version 3. You run this command: hdfs dfsadmin -report

You receive an error that includes this message: "failed on connection exception: java.net.ConnectException: Connection refused; "

What should you do?

Potential Solution
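Run these three commands. Be aware that the second command reformats the NameNode, which erases the existing HDFS metadata, so use this only on a cluster whose data you can afford to lose:

bash /usr/local/hadoop/sbin/stop-dfs.sh
hdfs namenode -format
bash /usr/local/hadoop/sbin/start-dfs.sh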
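
Before running a step as drastic as reformatting the NameNode, it may help to confirm which address the client is trying to reach and whether anything is listening there. This is a rough sketch. It reads fs.defaultFS with hdfs getconf and probes the port with nc, which we assume is installed. The helper names namenode_host and namenode_port are hypothetical.

```shell
# Hypothetical helpers: split an fs.defaultFS value (for example
# hdfs://namenode.example:9000) into its host and port parts.
namenode_host() {
  printf '%s\n' "$1" | sed -e 's|^[a-z]*://||' -e 's|[:/].*$||'
}

namenode_port() {
  # Prints nothing when the URI does not name a port explicitly.
  printf '%s\n' "$1" | sed -n -e 's|^[a-z]*://[^:/]*:\([0-9][0-9]*\).*$|\1|p'
}

# Only probe when the hdfs and nc commands are actually available.
if command -v hdfs >/dev/null 2>&1 && command -v nc >/dev/null 2>&1; then
  fs_uri=$(hdfs getconf -confKey fs.defaultFS)
  host=$(namenode_host "$fs_uri")
  port=$(namenode_port "$fs_uri")
  if [ -z "$port" ]; then
    echo "fs.defaultFS ($fs_uri) names no explicit port to probe"
  elif nc -z "$host" "$port"; then
    echo "$host:$port is accepting connections"
  else
    echo "$host:$port refused the connection or is unreachable"
  fi
fi
```

If the port is not accepting connections, the NameNode process may simply not be running, which is worth ruling out first.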

What is a Secret in Kubernetes?

Question
What is a Secret in Kubernetes?

Answer
It is an object similar to a ConfigMap, but intended for sensitive data, with values that are encoded in Base64 text. What is a ConfigMap? It is a Kubernetes object, usually defined in a .yaml file with a special format, that holds configuration as key-value pairs in a "data:" section. In the "data" section of a Secret, the keys appear in regular text and the values are Base64-encoded. This encoding supports binary data as well as plain text. Note that Base64 is an encoding and not encryption.

To create a Secret, you can use kubectl create secret, or you can Base64-encode the values yourself with a tool such as base64 or openssl. This posting indirectly gives information on using openssl.

To create a Kubernetes secret, you may want to use Chapter 5 or page 216 from the book Kubernetes in Action by Luksa.

You may want to see the posting What is a ConfigMap in Kubernetes?.
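
To make the Base64 encoding concrete, here is a small sketch that encodes two example values and prints a Secret manifest containing them. The values and the name demo-secret are made up for illustration. We assume the base64 utility from coreutils is available.

```shell
# Example values only; never commit real credentials.
username='admin'
password='password'

# printf avoids the trailing newline that echo would add to the encoded value.
encoded_user=$(printf '%s' "$username" | base64)
encoded_pass=$(printf '%s' "$password" | base64)

# Build a Secret manifest; the name "demo-secret" is hypothetical.
manifest=$(cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: demo-secret
type: Opaque
data:
  username: $encoded_user
  password: $encoded_pass
EOF
)

printf '%s\n' "$manifest"

# Decoding shows that Base64 is an encoding, not encryption.
decoded_user=$(printf '%s' "$encoded_user" | base64 -d)
echo "Decoded username: $decoded_user"
```

The printed manifest could be saved to a file and applied with kubectl apply -f. Alternatively, kubectl create secret generic with --from-literal does the encoding for you.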

How Do You Troubleshoot the AWS Error “could not get token: NoCredentialProviders: no valid providers in chain. Deprecated.”?

Problem scenario
You run this command: kubectl get svc

You receive this:
" could not get token: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors"

What should you do?

Solution
Install and configure the AWS CLI. If you need assistance with this, see this posting.
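
The error means that no credentials were found anywhere in the provider chain. As a quick check, the sketch below looks at the two most common sources: environment variables and the shared credentials file that aws configure writes. It does not cover other sources such as instance roles. The function name has_static_aws_credentials is hypothetical.

```shell
# Hypothetical helper: report whether static AWS credentials are visible
# through environment variables or the shared credentials file.
has_static_aws_credentials() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    return 0
  fi
  # aws configure writes to this file unless AWS_SHARED_CREDENTIALS_FILE overrides it.
  credentials_file="${AWS_SHARED_CREDENTIALS_FILE:-$HOME/.aws/credentials}"
  [ -s "$credentials_file" ]
}

if has_static_aws_credentials; then
  echo "Static AWS credentials found"
else
  echo "No static AWS credentials found; run: aws configure"
fi
```

Once the AWS CLI is configured, running aws sts get-caller-identity is one way to confirm that the credentials actually work before retrying kubectl get svc.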

What is a Service, a Deployment, an HPA, a Pod, an Ingress, a Secret, and a Node in Kubernetes?

Question
In Kubernetes you have heard of these things: service, deployment, HPA, pod, ingress, secret, configmap, and node. What are they?

Answer
Click on the relevant question for its answer:
What is a Secret in Kubernetes?
What is a ConfigMap in Kubernetes?
What is a Deployment in Kubernetes?
What is an Ingress Resource in Kubernetes? There is a second article that contrasts it with a Service here.
What is an HPA in Kubernetes?
What is a Service in Kubernetes? There is a separate article that contrasts it with an ingress resource here.
What Is the Difference between a Node and a Pod in Kubernetes?

How Do You Deploy AKS (Azure’s Kubernetes) Using the GUI?

Problem scenario
You want to use Azure's Kubernetes PaaS offering using the GUI. What do you do?

Solution

1. Sign into the Azure portal.
2. Click on "Services" on the left.
3. Click on "Containers".
4. Click on "Kubernetes services".
5. Click "Add".
6. Fill out the required fields.
7. Click the "Review + create" button.
8. Click the "Create" button.
9. You are done. If you remember the name you chose for the cluster, you can run "az aks" commands on the cluster if you set up the Azure CLI. If you need assistance with setting up the Azure CLI, see this posting. If you gave the name foobar for the cluster, this command should work: az aks list | grep foobar

If you do not see one of the options that you need in the steps above, you may need to register your Azure account for additional services. If you want to see the details of what you need, see this external posting.

What Are the Recommended Practices of Monitoring?

Question
AWS's Five Pillars of Well-Architected Framework recommend monitoring as a component of three of the pillars (operational excellence, reliability and performance efficiency).

There is no question that monitoring is important inside or outside of a public cloud. What patterns or traits might a good centralized monitoring system have (consistent with a phrase such as "best practices")?

Answer
Here are 16 characteristics of good monitoring.

1. Alert thresholds of various metrics should follow a Goldilocks approach as opposed to a cry-wolf approach. You do not want to be notified of small disturbances. Normally disk space utilization above 50% of what is available is not a concern. But if disk space on a server has reached 90% utilization, you may want to begin to take action. A study of I.T. professionals indicated that reducing false alarms in monitoring was a challenge for 79% of those questioned.

You must define or ascertain SLIs and SLOs. SLOs are closely related to the threshold that you will be notified about. SLIs will be the metrics themselves. Certain quantifiable levels should notify you before your SLOs are lost so you have time to react.
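
As an illustration of the Goldilocks idea in item 1, here is a minimal sketch of a disk-space check with two levels. The 90% critical level comes from the text above. The 75% warning level is our assumption, standing in for a notification that arrives before the SLO is lost. The function name classify_disk_usage is hypothetical.

```shell
# Hypothetical helper: map a disk-usage percentage to an alert level.
# 90% follows the text; the 75% warning level is an assumption.
classify_disk_usage() {
  used="$1"
  if [ "$used" -ge 90 ]; then
    echo "CRITICAL"
  elif [ "$used" -ge 75 ]; then
    echo "WARNING"
  else
    echo "OK"
  fi
}

# "df -P" prints POSIX-format output; the fifth column of the second line
# is the usage percentage of the filesystem holding "/".
used_pct=$(df -P / | awk 'NR == 2 { sub("%", "", $5); print $5 }')
echo "/ is ${used_pct}% full: $(classify_disk_usage "$used_pct")"
```

Anything below the warning level stays silent, which is the point: small disturbances should not page anyone.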

2. Ensure redundancy for the monitoring solution. A single point of failure can be a problem. If there is only one centralized monitoring server, you will want it to have two NICs, two power supplies connected to different UPSes, two CPUs, two RAM chips, and a RAID to ensure the server is up. If the monitoring system itself goes down, will you be notified of that? If not, you may mistake the resulting silence for healthy systems (a false negative).

3. Test the monitoring solution itself. From time to time you should manually create an artificial event to verify that emails/pages/text messages are being sent. To generate large amounts of web traffic, we recommend Gatling. To generate a CPU or RAM load on a server we recommend using Bash. While some people prefer Ruby or Python for infrastructure automation, these languages have useful tools to stop the program from consuming too many resources. Higher level languages are therefore not as equipped to generate an artificial load as Bash.

Chaos engineering is the practice of robustly testing resilience and high availability. The term is associated with a tool that Netflix designed and used called "Chaos Monkey." This program deliberately terminates random servers, in production, on an ongoing basis. This tests the monitoring and alerting, and it keeps people on their toes. The professionals do not necessarily know if the problem arose from Chaos Monkey or from human operations. Management could manually trigger some chaos to see if protocols are followed correctly.
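
The artificial CPU load mentioned in item 3 can be sketched in plain shell. This is only an illustration under our own assumptions, not a tuned load tool. The function names burn_cpu and generate_cpu_load are hypothetical, and the load stops by itself after the requested number of seconds.

```shell
# Hypothetical helper: keep one CPU core busy for roughly the given number of seconds.
burn_cpu() {
  duration="$1"
  end=$(( $(date +%s) + duration ))
  while [ "$(date +%s)" -lt "$end" ]; do
    # Inner busy loop so most time goes to arithmetic rather than to calling date.
    i=0
    while [ "$i" -lt 20000 ]; do
      i=$((i + 1))
    done
  done
}

# Run one busy loop per requested core in the background, then wait for all of them.
generate_cpu_load() {
  cores="$1"
  duration="$2"
  n=0
  while [ "$n" -lt "$cores" ]; do
    burn_cpu "$duration" &
    n=$((n + 1))
  done
  wait
}
```

For example, generate_cpu_load 2 60 would keep two cores busy for about a minute, which should be long enough to see whether a CPU alert fires.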

4. Have a method of deploying new servers that ensures they will be monitored from the beginning. When you deploy a new server, it must receive configuration (e.g., an agent installed and configured) so the monitoring system will work with it. Certain firewall rules must allow for this communication. Once installed and configured it will alert you to monitored events as they happen. With public clouds this is less of a concern because alerts are often from PaaS offerings.

5. Consider performance with your frequent checks. Some automated SQL commands can lock tables or otherwise generate a load on the database server. The client agent by itself on the monitored server could use some RAM and CPU depending on what it is checking for. Network communication between the client and server can contribute to congestion depending on how much is sent and how frequently it is sent. Setting interval durations and configuring the minimum data retrieval as necessary can help ensure you allow the servers to perform well while monitoring the precious servers carefully. For reading logs on a disk (as a part of monitoring), you may want to consider putting such logs on a dedicated disk as they may be written to frequently. A performance bottleneck could be prevented based on the location of where the logs are stored.

6. There should be multiple ways of viewing the monitoring system. You will want a website that the professionals can go to for visual checks. A dashboard can have colors to give an overview of the status of the critical systems. There should be a way to notify individuals via phone call or text message. It can also be desirable to have a physical flat panel monitor in a common area so that the employees and management can see the health of the systems that they are monitoring. According to The DevOps Handbook, public telemetry helps create institutional learning and a productive atmosphere (page 203).

7. Use monitoring in Development and QA environments. The overhead and dependencies should be similar for testing purposes anyway. Parity in lower environments with the upper environments is one of the factors in 12 factor apps.

8. Document procedures for when problems occur and methods for turning off alerts. (While we technical bloggers are biased toward favoring documentation, we are not alone. You can read more about this here.)

The Agile Manifesto says that the signatories found value in comprehensive documentation, even though they valued working software more. Having human operating procedures in place can help with large and growing teams so people know what to expect. AWS's Five Pillars of Well-Architected Framework recommend you annotate documentation. Outsourcing and ambitious automation can be easier with good documentation.

9. Ensure that the group responsible for deploying infrastructure packages and the group responsible for the CI/CD pipeline can leverage a monitoring solution. For security and pragmatic professional productivity reasons, leveraging a centralized monitoring tool for both infrastructure and code deployment can be advantageous. Some special work may go into configuring the monitoring of a build and release pipeline, but it is highly recommended.

10. Network latency can degrade SaaS performance. Packet collisions and retransmissions can degrade network performance significantly. Tuning and monitoring the network (e.g., with Cacti) can be one part of monitoring your servers. Some events are triggered because of network congestion. This congestion can happen independent of the server's OS, application or databases.

11. High-quality monitoring involves SIEM (security information and event management). Access logs should be analyzed for certain patterns such as HTTP status codes (according to page 271 of Expert Python Programming). IDSes and IPSes can change their threshold sensitivity rules upon certain events. With large amounts of data, statistical operations can enable anomaly detection with artificial intelligence and big data. Apache Metron is an ambitious project with multiple use cases. One such utilization is to collect monitoring data from network sensors (that collect network packet data). There can be a blurry line between logging and monitoring (see this posting for logging).
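
As a small example of analyzing access logs for HTTP status codes, the sketch below counts responses per status code. It assumes the widely used "combined" access-log format, where the status code is the ninth whitespace-separated field. The sample lines are made-up data and the function names are hypothetical.

```shell
# Sample lines in the "combined" access-log format (made-up data).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"
203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /admin HTTP/1.1" 403 153 "-" "curl/8.0"
198.51.100.7 - - [10/Oct/2023:13:55:38 +0000] "GET /missing HTTP/1.1" 404 153 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:55:39 +0000] "POST /login HTTP/1.1" 500 0 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:55:40 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0"
EOF

# Count how many requests ended with each status code.
count_status_codes() {
  awk '{ counts[$9]++ } END { for (code in counts) print code, counts[code] }' "$1" | sort
}

# Count only server-side errors (5xx), a common alerting signal.
count_server_errors() {
  awk '$9 ~ /^5/ { n++ } END { print n + 0 }' "$1"
}

count_status_codes "$sample_log"
echo "5xx responses: $(count_server_errors "$sample_log")"
```

A real SIEM does far more than this, but a sudden jump in 403 or 5xx counts is the kind of pattern such tools look for.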

12. Remember to do black-box monitoring. New Relic monitors at the application level which is quite different from obsessing over abstract SLIs. Application level monitoring focuses on business performance metrics and not on theoretical CPU/memory consumption levels. Sometimes theoretical monitoring tools can report that everything is fine when a critical application is totally unusable to a customer. Robotic Process Automation can do pattern recognition to deal with images and legacy technologies. Selenium is not recommended for monitoring, but it can be used in this way.
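
A very simple black-box check can be sketched with curl: request the page the way a customer would and judge only the result. This is a minimal illustration, not a replacement for an application monitoring product. The function name check_url is hypothetical and the 10-second timeout is an assumption.

```shell
# Hypothetical helper: probe a URL the way a customer would and report UP or DOWN.
check_url() {
  url="$1"
  # -s silences progress output, -o /dev/null discards the body,
  # --max-time bounds the wait, and -w prints only the HTTP status code.
  status=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url")
  case "$status" in
    2??|3??) echo "UP $status $url" ;;
    *)       echo "DOWN $status $url"; return 1 ;;
  esac
}
```

A scheduler such as cron could run check_url against a critical page and raise an alert whenever the function returns a non-zero status.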

13. Evaluate different monitoring tools thoroughly before you choose one. There are advantages and disadvantages of many options. Some companies want close integration with a ticketing system such as ServiceNow. In today's world of hybrid (private-public) clouds, one key SLO may be to stay under a certain budget. Certain monetary costs may need to be closely monitored. Many organizations like hosted, subscription services for monitoring such as Datadog or PagerDuty. You may want to use CloudWatch if you are using AWS. The big offerings such as AWS, Azure, GCP, DigitalOcean and Rackspace all have their own solutions for monitoring their customers' cloud resources. These options cost money, and you could monitor services using more affordable methods. If you host your own solution, you will want to be notified if the monitoring solution itself fails. There are many considerations to weigh.

For monitoring servers you may consider AppDynamics, collectd, Dynatrace, Ganglia, Icinga, Instana, LogicMonitor, Monit, Nagios, Sensu, Service Assurance, Spotlight, Sumo Logic, Sysdig, Zabbix, or Zenoss. Open source options may have unsupported bugs, but they can scale without licensing costs. Open source tools can also be more flexible for your organization to modify and design specialized features. Systems' non-functional requirements are monitored by tools that are reliable, scalable and maintainable; ideally your choice will enable you to analyze historical trends without being monetarily expensive. Every enterprise and use-case has different priorities.

If you want to try out Nagios (an open source tool), click here if you are using a Red Hat derivative or click here if you are using a Debian distribution of Linux. If you want to try out Zabbix (an open source tool), click here. You will likely want to consider multiple options as many monitor servers but do not monitor network traffic. For monitoring network traffic you may want to look at Cacti or SolarWinds.

14. Remember to ensure you monitor containers. For Kubernetes and Docker container monitoring, we recommend Prometheus. You may also want to try Opvizor, a separate and sophisticated proprietary tool. CrashLoopBackOff is a Pod status in Kubernetes rather than a YAML setting; configure the related restart and readiness settings wisely as they pertain to monitoring; see this posting for more information.

15. Architect ample observability (and possibly measurability) in the systems.

"You can't manage what you can't measure." -Peter Drucker
https://www.contractguardian.com/blog/2018/you-cant-manage-what-you-cant-measure.html

The book The Mythical Man-Month (on page 243) says "[m]ethods of designing programs so as to eliminate or at least illuminate side effects can have an immense payoff in maintenance costs." For coding (as opposed to systems administration and operations work), use metaprogramming to increase observability. "Metaprogramming is a technique of writing computer programs that can treat themselves as data, so they can introspect, generate, and/or modify itself while running." (This was taken from page 158 of Expert Python Programming. The newer 4th edition is available here.) Observability increases the number of "users," so to speak, of a given back-end service. The Mythical Man-Month (on page 242) says "[m]ore users find more bugs." Linus' law as defined in The Cathedral and the Bazaar (on page 30) says that "[g]iven enough eyeballs, all bugs are shallow." Developers, operations professionals, QA, and system designers benefit from observability for debugging sessions and root cause analysis. There is a cost to engineering minute observability, but we recommend some costs be accepted for the sake of monitoring.

The allegory of the cave in Plato's Republic illustrates limitations to understanding and the nature of learning new things. Some cave-dwelling people may be accustomed to seeing shadows and hearing echoes because of the fact that there is only one wall they can see and hear from. What they observe is different from what people outside the cave observe. Increasing observability is like going outside the cave. You cannot forget that observability may accompany false positives and false negatives of individual components. Remember that monitoring may not be a substitute for communication and humans verifying that services are up or down; complete system tests are less theoretical than individual component tests.

16. Have a disaster recovery plan that does not rely on monitoring. One reason is that on December 7, 2021, AWS had a serious outage. If Amazon, arguably the most trusted public cloud provider in the world, can have such a problem, it can surely happen to other companies.

You must have a way to fix systems without monitoring, and your system needs an ability to communicate with customers when an outage happens. It is not cheap to have such DR means in place, but it is advisable and complementary to good monitoring. Here are some quotes from Amazon's statement about the outage:

...

This congestion immediately impacted the availability of real-time monitoring data for our internal operations teams, which impaired their ability to find the source of congestion and resolve it.

...

Our Support Contact Center also relies on the internal AWS network, so the ability to create support cases was impacted from 7:33 AM until 2:25 PM PST.
https://aws.amazon.com/message/12721/

Finally, you may want to read The Art of Monitoring by James Turnbull because it was cited in The DevOps Handbook. You may also want to read Practical Monitoring by Mike Julian.

How Do You Make a Web UI Service Accessible to Outside Web Browser Traffic?

Problem scenario
Sometimes you have a web service listening on the loopback IP address on a non-standard port.

When you run nmap -Pn localhost, you see a service is listening on a given port (e.g., 9200). When you run nmap -Pn on an internal or external IP address, you do not see a service listening on that given port. You want to direct traffic to this listening service (e.g., 127.0.0.1:9200 or localhost:9200).

How do you make a web service on a server accessible on its external IP address? How can you get web browsers to go to the web service when it is only available from curl commands run from the back-end?

Solution
This example will be for port 9200. You can use other ports as you wish.

1. Either turn off SELinux or set it to "Permissive". Run this to find out its status: sudo getenforce
If you get "sudo: getenforce: command not found", then SELinux has not been installed and you can go to step #2. The following command would set it to the "Permissive" state: sudo setenforce Permissive

2. Install Nginx. If you need assistance, see this posting for Debian/Ubuntu Linux or this posting for CentOS/RHEL/Fedora.

3. Modify nginx.conf. Replace the server {} block in nginx.conf with this:

    server {
      listen 80;
      listen [::]:80;

      server_name _;

      location / {
          proxy_pass http://127.0.0.1:9200/;
      }
    }

The entire nginx.conf file should look like this:

# For more information on configuration, see:
#   * Official English Documentation: http://nginx.org/en/docs/
#   * Official Russian Documentation: http://nginx.org/ru/docs/

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;
    server {
      listen 80;
      listen [::]:80;

      server_name _;

      location / {
          proxy_pass http://127.0.0.1:9200/;
      }
    }


# Settings for a TLS enabled server.
#
#    server {
#        listen       443 ssl http2 default_server;
#        listen       [::]:443 ssl http2 default_server;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        ssl_certificate "/etc/pki/nginx/server.crt";
#        ssl_certificate_key "/etc/pki/nginx/private/server.key";
#        ssl_session_cache shared:SSL:1m;
#        ssl_session_timeout  10m;
#        ssl_ciphers PROFILE=SYSTEM;
#        ssl_prefer_server_ciphers on;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        location / {
#        }
#
#        error_page 404 /404.html;
#            location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#            location = /50x.html {
#        }
#    }

}
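
After editing nginx.conf, the new configuration only takes effect once Nginx reloads it. Here is a minimal sketch of that last step, assuming a systemd-based distribution where the service is named nginx. The function name apply_nginx_config is hypothetical.

```shell
# Hypothetical helper: validate the edited configuration, then reload Nginx.
# Assumes a systemd-based distribution where the service is named "nginx".
apply_nginx_config() {
  # "nginx -t" checks the syntax; the reload only happens when the check passes.
  sudo nginx -t && sudo systemctl reload nginx
}
```

After calling apply_nginx_config, a request to the server's external IP address on port 80 should be passed through to the service on 127.0.0.1:9200.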

What Is The Immutable Bit vs. The Sticky Bit?

Problem scenario
You have heard of the immutable bit and want to know how it is different from the sticky bit. What is the immutable bit versus the sticky bit? What are the differences between the two?

Solution
We like the term "immutable flag" as opposed to "immutable bit" to help distinguish the two. We have three parts to explain this.

Part 1: What is the immutable flag?

If you have a file called foobar, run this command: lsattr foobar

You should see something like this: -------------e-- foobar

Now set the immutable flag: sudo chattr +i foobar

List the attributes again by running this: lsattr foobar

You will now see something like this: ----i--------e-- foobar

The immutable flag keeps even the root user from deleting a file. Any rm command on foobar will fail -- even with the sudo before the command. To remove the immutable flag, run this command first:

sudo chattr -i foobar

Now the file foobar can be deleted. Using the ls -lh command on the file before and after you set the immutable flag, you will find that the sticky bit is never set or unset. The immutable flag will prevent the file from being modified, renamed, or deleted. The attribute itself does not appear in ls -lh command results; use lsattr to see it.

Part 2: What is the sticky bit?
It can describe a quality called a "restricted deletion directory" (as The Linux Bible does on page 268). To learn how it works, run this command on a given file (where foobar is the name of the file): ls -lh foobar

You will see permissions like this: -rw-rw-r--

Now set the sticky bit with this command: chmod o+t foobar

Run this command on a given file (where foobar is the name of the file): ls -lh foobar

You will see the permissions look like this: -rw-rw-r-T

Now remove the sticky bit with this command: chmod o-t foobar

Using the lsattr in between setting and unsetting the sticky bit, you will see no change to the immutable flag.

The sticky bit has a practical effect on Linux only when it is set on a directory. Files inside a directory with the sticky bit set can be deleted or renamed only by the file's owner, the directory's owner, or root, even if other users have write permission on the directory; /tmp is the classic example. On a regular file such as foobar above, Linux accepts the bit but ignores it. The sticky bit is ideal if you manage other users on the Linux server and want a shared, writable directory in which users cannot delete each other's files. Because it shows up in ls -lh output, it is also convenient for users who know how to use ls -lh but may not be inclined to use lsattr commands.

The sticky bit can be known as "the saved-text bit" (according to page 300 of The Linux Programming Interface).

Part 3: Contrasting the two
The sticky bit, to be effective, must be set on the parent directory, and even then the file's owner, the directory's owner, and root can still delete the file. To circumvent the immutable flag, one must first remove it with chattr -i, which requires root privileges; otherwise the file cannot be deleted. For preserving a file from deletion, the stronger method would be using the immutable flag. There are times when the sticky bit is more practical for a systems administrator's needs.
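
To see how the sticky bit looks on a directory, where it matters most, here is a small sketch that works in a throwaway directory and needs no root access. The directory names are made up for illustration.

```shell
# Work in a throwaway directory so nothing on the system is changed.
scratch=$(mktemp -d)
mkdir "$scratch/shared"

# 1777 = world-writable plus the sticky bit, the same mode /tmp normally has.
chmod 1777 "$scratch/shared"
shared_mode=$(ls -ld "$scratch/shared" | cut -c1-10)
echo "$shared_mode"

# Without execute permission for "others", the sticky bit shows as a capital T.
chmod 1770 "$scratch/shared"
private_mode=$(ls -ld "$scratch/shared" | cut -c1-10)
echo "$private_mode"

rm -r "$scratch"
```

The lowercase t in the first listing and the capital T in the second both indicate the sticky bit; the case only reflects whether "others" also have execute permission.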