CONTINUAL INTEGRATION – Page 142 – A Technical I.T./DevOps Blog

What Does __name__ in Python Do?

Problem scenario
What does the __name__ syntax signify?

Solution
In the Python interpreter, __name__ records how a file was loaded: it is "__main__" when the file is run directly, and it is the module's name when the file is imported. Rather than explain this one only in words, do the following:

1. Create foo.py with the following line (the name of the next file without its extension):

import bar

2. In the same directory as foo.py above, create bar.py with the following lines:

print("Hello World from %s!" % __name__)

if __name__ == '__main__':
    print("Hello World again from %s!" % __name__)

3. Run these two commands and examine the output:

python foo.py

It prints out this:
Hello World from bar!

python bar.py

It prints out this:

Hello World from __main__!
Hello World again from __main__!
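
The experiment above can also be reproduced from a single script. The following is a minimal sketch, not part of the original procedure: it writes the same bar.py content to a temporary directory and uses the standard library's runpy.run_path to execute it twice, once with __name__ set to "bar" (as an import would set it) and once with "__main__" (as python bar.py would set it). The helper name run_bar is hypothetical.

```python
import contextlib
import io
import runpy
import tempfile
from pathlib import Path

# Same content as bar.py in step 2 above.
BAR_SOURCE = '''print("Hello World from %s!" % __name__)

if __name__ == '__main__':
    print("Hello World again from %s!" % __name__)
'''


def run_bar(run_name):
    """Run bar.py with __name__ set to run_name; return the printed lines.

    run_bar is a hypothetical helper for this sketch.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "bar.py"
        path.write_text(BAR_SOURCE)
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            runpy.run_path(str(path), run_name=run_name)
    return captured.getvalue().splitlines()


# __name__ is "bar" when the file is loaded as a module named bar.
print(run_bar("bar"))
# __name__ is "__main__" when the file is run as a script.
print(run_bar("__main__"))
```

Only the second call prints the "again" line, because the if block is skipped whenever __name__ is not "__main__".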

How Do You Troubleshoot the Kubernetes Problem “connection to the server localhost:8080 was refused”?

One of the following applies (with #1 being related to Kubernetes anywhere and #2 only being relevant to running Kubernetes in GCP).

Problem scenario #1 (any Kubernetes)
You run a command like this: kubectl get svc

You get an error like this: The connection to the server localhost:8080 was refused - did you specify the right host or port?

What should you do?

Problem scenario #2 (in GCP only)
You created a Google Kubernetes cluster. From a server with kubectl, you run a kubectl command. Here is an example:

kubectl cluster-info dump

You receive this error: "The connection to the server localhost:8080 was refused - did you specify the right host or port?" What should you do?

Possible Solution #1
What is your $KUBECONFIG variable set to? Run this: echo $KUBECONFIG

This may remedy the problem, but replace "verycool" with the name of your cluster:

export KUBECONFIG=$KUBECONFIG:~/.kube/config-verycool

(You may want to do this: cd ~/.kube and run ls -lh.)
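
kubectl typically falls back to localhost:8080 when it cannot find a usable kubeconfig file, so it can help to confirm which files the KUBECONFIG variable actually points to. The following is a minimal sketch under that assumption. The function name kubeconfig_report is hypothetical, and it only checks that each file exists, not that its contents are valid.

```python
import os
from pathlib import Path


def kubeconfig_report(kubeconfig_value, home):
    """Return (path, exists) pairs for each kubeconfig file that would be consulted.

    kubeconfig_report is a hypothetical helper. An empty or unset KUBECONFIG
    means the default file, ~/.kube/config, is used.
    """
    if kubeconfig_value:
        # KUBECONFIG is a list of paths separated like PATH (":" on Linux and macOS).
        entries = [entry for entry in kubeconfig_value.split(os.pathsep) if entry]
    else:
        entries = [os.path.join(home, ".kube", "config")]
    report = []
    for entry in entries:
        # The shell normally expands "~" during export; handle it here in case it was quoted.
        if entry == "~" or entry.startswith("~/"):
            entry = home + entry[1:]
        report.append((entry, Path(entry).is_file()))
    return report


if __name__ == "__main__":
    value = os.environ.get("KUBECONFIG", "")
    for path, exists in kubeconfig_report(value, str(Path.home())):
        print(("found   " if exists else "MISSING ") + path)
```

If every entry is reported as missing, the export command above is pointing at a file name that does not exist in ~/.kube.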

Possible Solution #2 (for GCP only)
1. Log into the web UI for GCP.
2. Search for "kubernetes engine".
3. Find the clusters that exist.
4. Note the name of the cluster (e.g., standard-cluster-1) and its location (e.g., the zone us-central1-a).
5. In the upper righthand corner click on the icon ">_" (to activate the shell).
6. Stop the VM.
7. Click on the hyperlinked name of the VM.
8. Click "Edit"
9. Check the option to "Allow full access to all Cloud APIs" (unless the VM was created with the configuration such that it can run gcloud commands governing the Kubernetes engine).
10. Click Save.
11. Turn the VM on.
12. From the VM run this command (but substitute the name of your cluster for standard-cluster-1 and your zone for us-central1-a): gcloud container clusters get-credentials standard-cluster-1 --zone us-central1-a

Possible Solution #3 (for EKS)
If you are running EKS (in AWS), see this posting.

How Do You Find the Security Group ID Values in AWS?

Problem scenario
You want to list the Security Group ID value in AWS.

Solution
Prerequisite
Install and configure the AWS CLI; if you need assistance, see this posting if you can use the pip command and you are using Ubuntu. If you cannot use the pip command or you are not using Ubuntu, see this posting.

Procedures
Run this command:
aws ec2 describe-security-groups | grep GroupId
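
As an alternative to grep, the JSON that the command prints can be parsed so that each ID is paired with its group name. This is a minimal sketch that assumes the AWS CLI is using its JSON output format; the function name security_group_ids is hypothetical and the sample IDs are made up for illustration.

```python
import json


def security_group_ids(describe_output):
    """Return (GroupId, GroupName) pairs from describe-security-groups JSON text."""
    data = json.loads(describe_output)
    return [
        (group["GroupId"], group.get("GroupName", ""))
        for group in data.get("SecurityGroups", [])
    ]


# Made-up sample in the shape the CLI prints; these IDs are not real.
SAMPLE = """
{"SecurityGroups": [
    {"GroupId": "sg-0123456789abcdef0", "GroupName": "default"},
    {"GroupId": "sg-0fedcba9876543210", "GroupName": "web"}
]}
"""

for group_id, name in security_group_ids(SAMPLE):
    print(group_id, name)
```

To use it with real data, the output of aws ec2 describe-security-groups --output json could be saved to a file and its text passed to the function.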

How Do You Troubleshoot a kubectl Command That Returns “The connection to the server localhost:8080 was refused – did you specify the right host or port?” when using AWS and EKS?

One of the following problems pertains to you.

Problem scenario #1
You have a kubectl server. You created some new EKS clusters. How do you get the kubectl server to interface with, control, or manage the new EKS clusters?

Problem scenario #2
You are running EKS. You run this command:

kubectl get cluster-info

But you receive this message: "The connection to the server localhost:8080 was refused - did you specify the right host or port?"

What should you do?

Solution

1. Find the names of your EKS clusters. Run this command: aws eks list-clusters
2. The results will be names that you can use for this command:

aws eks update-kubeconfig --name foobar

(Replace foobar with the name of a cluster from the results of step 1.)

3. If you are still getting the error, see this posting or this posting. If you are not using GCP or AWS, you may want to see this posting. If you are using GCP, see this posting.
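
When there are several clusters, the two steps above can be chained. The following is a minimal sketch that assumes the JSON output format of aws eks list-clusters; the function name update_kubeconfig_commands is hypothetical, and the sketch only builds the commands rather than running them.

```python
import json


def update_kubeconfig_commands(list_clusters_output):
    """Build one `aws eks update-kubeconfig` command per cluster name."""
    names = json.loads(list_clusters_output).get("clusters", [])
    return [["aws", "eks", "update-kubeconfig", "--name", name] for name in names]


# Made-up sample in the shape of `aws eks list-clusters` JSON output.
SAMPLE = '{"clusters": ["foobar", "staging"]}'

for command in update_kubeconfig_commands(SAMPLE):
    print(" ".join(command))
```

Each command is returned as an argument list so that it could be passed to subprocess.run without shell quoting concerns.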

What Is Knowledge of Branching Strategies?

Problem scenario
You are preparing for an interview for a job (build/release engineer or DevOps engineer) where a requirement is "knowledge of branching strategies." There are different patterns, models, paradigms, workflows and even philosophies associated with branches in repositories. What is knowledge of branching strategies in the context of code versioning systems and the CI/CD pipeline?

Solution

***Updated in January of 2022.***

Background
Open source projects may have different branching strategies from enterprises developing in-house proprietary software. Code versioning systems hold the repositories that are subject to branching. They can be referred to as SCMs (Source Code Managers or Source Control Managers), VCSes (Version Control Systems), or DVCSes (Distributed Version Control Systems). Distributed version control systems contribute to self-organizing teams (according to page 394 of Continuous Delivery), which is desirable from an Agile perspective.

Some companies have big development teams whereas others have small teams. The tolerance for unstable repositories varies from organization to organization. Some companies allow for unstable repositories where many developers have the ability to write changes. Other enterprises give read-only access to the majority of the employees to ensure the repositories are stable; this would normally require a dedicated release manager (in the form of an individual such as a lead developer, or a merge team; see Continuous Delivery, page 407). The internal dynamics may dictate whether you will create a branch of a source code repository and how that branch will exist. Professionally, we advise you to be flexible and to work the way the company asks you to.

If your professional position requires you to choose the branching strategy, note that as of 2022 there are two major categories of branching strategies: trunk-based development (which includes mainline development and branching by abstraction) and regular branching (by feature, team, release, etc.).

Trunk-based Development / Mainline Development without Branches
"It only takes two branches performing a refactoring in a tightly coupled codebase to bring the entire team to a halt when one of them merges. It bears repeating that branching is fundamentally antithetical to continuous integration" (Continuous Delivery, page 410).

One option is "trunk-based development." A quick, but technically inaccurate, description of this strategy is to not have other branches. Code can be committed directly to the mainline (or main branch of the repository). Trunk-based development involves all branches ultimately being merged to the main branch (Continuous Delivery, page 405). (Technically you can still have branches with trunk-based development, but the branches you would have would be ephemeral.)

The DevOps Handbook cites Gary Gruver (on pages 144 and 145) and his success with not having another branch. In its simplest form, trunk-based development is the lack of a second branch. Forking a repository is creating a second repository; it is like cloning one, but "fork" connotes the publication of an independent copy of a repository. A branch is an independent copy of a repository within a repository for the purpose of possibly merging it with the trunk. "Mainline development" is another phrase that describes trunk-based development. It can be worthwhile to forgo a branch for a single source of truth. Working without a second branch can deter developers from committing code if a commit would mean a significant amount of work for them, so they may do more testing before they commit their code. This strategy may be accepted by the business if there is little to no need for a multi-approval process; some development projects lend themselves to this. Code versioning systems by their very nature preserve archived versions, thus trunk-based development may work for numerous businesses.

Trunk-based development can be ideal for small teams because each person must have knowledge of the implications. Additionally it can be ideal for developers who are responsible for managing the production environment or those developers who work on low-risk and iterative changes. The DevOps Handbook recommends it for large teams because as the number of developers increases, the amount of work for merging branches goes up exponentially (page 147). The DevOps Handbook says that trunk-based development is associated with greater productivity (page 151).

Some CI/CD pipelines reject code commits if any build or deployment tests fail. Individual code commits or PRs that fail are rejected or undone in what are called "gated commits" (page 148 of The DevOps Handbook). Gated commits are implemented to ensure that the main branch is always releasable. They are a feature in some trunk-based development environments.
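
The gating idea can be illustrated with a toy model. This is a minimal sketch, not how any particular CI/CD product implements it: the mainline is a list, each check is a function, and a commit is appended only when every check passes. The names gated_commit and the two example checks are hypothetical.

```python
def gated_commit(mainline, commit, checks):
    """Append commit to mainline only if every check passes.

    Returns (accepted, names_of_failed_checks). checks is a list of
    (name, function) pairs; each function takes the commit and returns a bool.
    """
    failed = [name for name, check in checks if not check(commit)]
    if failed:
        return False, failed
    mainline.append(commit)
    return True, []


# Hypothetical checks standing in for real build and test stages.
checks = [
    ("build", lambda commit: commit["builds"]),
    ("unit tests", lambda commit: commit["tests_pass"]),
]

mainline = []
print(gated_commit(mainline, {"id": "a1", "builds": True, "tests_pass": True}, checks))
print(gated_commit(mainline, {"id": "b2", "builds": True, "tests_pass": False}, checks))
print([commit["id"] for commit in mainline])
```

Only commit a1 reaches the mainline; b2 is turned away with the name of the failing check, so the mainline stays releasable.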

The DevOps Handbook (page 186) cites Paul Hammant who gives mixed treatment toward trunk-based development. Paul Hammant, depending on the circumstances, can recommend "branching by abstraction" (a term he coined). As branching by abstraction is a strategy that does not have a repository branch, we mention it here in the section of trunk-based development. It is a strategy that is ideal for major refactoring of an application (according to page 350 of Continuous Delivery).
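
Branching by abstraction can be sketched in code rather than in repository branches. In this minimal, hypothetical example (the class and function names are made up for illustration), callers depend on an abstraction, the legacy implementation keeps running, and the replacement is developed on the mainline behind a flag until it is ready.

```python
from abc import ABC, abstractmethod


class ReportRenderer(ABC):
    """The abstraction layer that callers depend on (hypothetical example)."""

    @abstractmethod
    def render(self, rows):
        ...


class LegacyRenderer(ReportRenderer):
    """Existing implementation that stays in service during the refactoring."""

    def render(self, rows):
        return ";".join(str(row) for row in rows)


class NewRenderer(ReportRenderer):
    """Replacement built incrementally on the mainline, off by default."""

    def render(self, rows):
        return "\n".join(str(row) for row in rows)


def make_renderer(use_new=False):
    """Pick the implementation from a configuration flag, not from a repository branch."""
    return NewRenderer() if use_new else LegacyRenderer()


print(make_renderer().render([1, 2, 3]))
print(make_renderer(use_new=True).render([1, 2, 3]))
```

Once the new implementation is proven, the flag default can be flipped and the legacy class deleted, all without a long-lived branch.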

If you are skeptical about trunk-based development (and you think it would hinder using a repository from a pragmatic perspective), see the section of the book Continuous Delivery (by Humble and Farley) "Keeping Your Application Releasable" on page 346.

Trunk-based development is ideal for Terraform development and operations. An O'Reilly publication on Terraform says "…for any shared environment (e.g., Stage, Prod), always deploy from a single branch." (Taken from page 304 of Terraform: Up & Running, 2nd Edition by Yevgeniy Brikman (O'Reilly), Copyright 2019, 978-1-492-04690-5.) The DevOps Handbook says (on page 117) that failures in code migrations are more commonly attributed to differences between the source and destination environments than to problems with the code itself (e.g., a lack of robustness or sufficient exception handling).

A big advantage of trunk-based development is that all code is continually integrated (according to page 405 of Continuous Delivery). To learn more about trunk-based development, you may want to read these links:

Other Branching Strategies (with Long-Lived Branches)
A second option (of branching strategies) is to have branches. A branch to a repository can be ideal in a large team with a new developer. A new developer can thoroughly refactor code, making major changes, and commit the changes to a branch other than the main branch. Experimentation can be boundless with a branch that will be ignored. If the developer's workstation is lost, the changes can be safe in the repository.

Before the parameter files and source code of a branch are merged into (and thus overwrite) the existing files in the main branch, the professional can create a pull request. This is a request that will usually require another person (e.g., a more tenured employee who knows the business's coding conventions and idiosyncrasies) to approve the merge of code manually. A branch can prevail in a merge to the main branch if the pull request is approved (e.g., by at least one other person besides the original committer, but it depends on the configuration of the repository). Pull request approval requirements vary from group to group.

Companies with high turnover and large numbers of developers may prefer to have branches. The branches do not have to be deployable during the development process. Code repositories have archived versions of the code. Some companies find a formal code branch merging approval process tedious for the developers. The DevOps Handbook admits that trunk-based development is controversial (on page 151); the book Continuous Delivery concurs that trunk-based development is controversial (on its page 36). In 2022, it is still common for enterprises to not use trunk-based development. An advantage of having long-lived branches is that you are not restricted to incremental development and some legacy code (or suboptimal code) may be tightly-coupled and lend itself to non-incremental refactoring.

Branching by Feature and/or Function
A branch may be created to isolate the development of a specific feature. Once the feature has been regression tested, the feature branch can be merged with the main branch. Early and deferred branching patterns can be readily associated with feature or function branching. Early branching and deferred branching have advantages and disadvantages (as page 390 of Continuous Delivery explains).

Branching by Component
Logical subportions of code or an application can be considered a component. These modules may support a non-functional requirement. Their definition is somewhat nebulous and subjective. However certain software systems may have numerous components, and branching for these units may enhance the teamwork associated with branching. Early and deferred branching patterns can be readily associated with component branching. Early branching and deferred branching have advantages and disadvantages (as page 390 of Continuous Delivery explains).

Branching by Story
In an agile environment, a developer may be given a user story. The story's content may be in the penumbra of a bug fix or a new feature, or it may have nothing to do with either. In some environments a developer may want to create a branch associated with a user story (e.g., a case number in a Jira system). It can be a good way to record, for posterity, precisely what problem or improvement was being attempted. These branches would be merged with the main branch only after enough testing stages to provide confidence that the change is ready for production.

Branching by Environment or Platform Technology
A repository may have Dockerfiles, PowerShell scripts, Bash scripts, and other code that must run in certain pods or on servers with a particular operating system. Having branches for a given platform (e.g., Docker, Windows, or Linux), can allow developers to work in parallel and merge their code back into the main branch when they are ready. This branching strategy would lend itself to having different CI/CD pipelines for each branch; the pod or server doing the building/compilation would have to be appropriate for the architecture involved. Dependency fulfillment is crucial to testability.

Branching by Team (Organizational Branching)
A branch can be created for a team within a larger business. The members of the team can commit their code to this branch, and then the branch can be merged into the main branch. A team branch would likely have an owner to manage policies that govern who can check code into it (Continuous Delivery, page 414). For release cycles that are more than two weeks, you would likely have a CI/CD pipeline for each team branch; but be warned that having multiple developers making changes can be difficult to reconcile when it is time to do a Pull Request and merge the team branch with the main branch.

Branching by Release (x.y Releases That Never Merge with the Main Branch)
Some code is used by various departments or customers. For non-hosted code a development team may want to have different branches for different versions of the software. The branch names can be those of the versions (e.g., 1.1, 1.2, and 1.3). This is one of the few cases where maintaining a long-lived, unmerged branch is advisable. Continuous Delivery, on page 382, says that these branches are never merged with the main branch. You would likely have a CI/CD pipeline for each branch if you use this strategy.

Git Flow (a Branching Strategy)
One common way to have branches is to use Git Flow. Git Flow is more commonly used than trunk-based development. Git Flow is a strategy where you have five different branches (according to this external site): main, develop, feature, release and hotfix. Some of the non-main branches will receive merges from the main branch with the main branch's changes prevailing. That is, the Git administrator will merge changes from the main branch to a given feature branch. This way your organization can keep working on a feature, and this non-main feature branch can receive updates such as critical hotfixes. Git Flow is considered a set of guidelines (rather than firm rules), according to this video. You can read more about the details here.

A disadvantage of branching is that code cannot go through the CI/CD process until you merge to main unless you have a pipeline for each branch. Another disadvantage is that merging code is inherently error-prone (according to page 400 of Continuous Delivery).

Conclusion
Ideally you will have confidence that a repository has production readiness from automated testing in lower environments. But automated testing has its limitations and lower environments are often not exactly identical to production. Having the computing resources for a different CI/CD pipeline for each branch can be expensive. Moreover having identical lower environments to the production environment is also an expense that some companies do not want to pay for.

The benefits of having a shared single source of truth are many (and described in detail in The DevOps Handbook on pages 290 through 292). To generalize the benefits of an accessible repository that numerous professionals can work on, we can see how "trunk-based development" has its place (without extra branches which can be confusing for parallel development of code). A code commit to a repository could be the impetus for the CI/CD process. With automated testing following a triggering event, the promotion of the code to higher environments can happen safely. You can have more relevant testing, in theory, with trunk-based development. Merge conflicts should be avoided with commits to the mainline.

Long-lived branches can be an efficient alternative way for programmers to collaborate. A developer can merge code changes to a number of different files with an approved pull request. The submission of a pull request will send a notification that a request for approval of the code changes has been made. The approver will see the changes via the diffs. This can be an efficient approach to team-based development with clear ownership of code before it is integrated.

Code versioning systems support parallel collaboration of different individuals. The concurrent coding efforts can be ultimately merged (unless a release branch remains a release branch or unless you commit code all to the main branch). Code versioning systems record a history of revisions also known as changesets. Revision history of a merge will display a left and right revision. A merge or potential merge can have conflicts described as text-based, semantic, or syntactic. (To learn more about these types of conflicts, see this external posting or pages 367-369 of Elements of Programming Interviews in Python.) If new code introduces a flaw, a developer should be confident that recovery is possible by retrieving a previous version. This confidence can conduce to frequent commits or merges.

One tactic with a local copy of a Git repository is to do a "rebase." When someone rebases onto the remote branch, changes from the remote repository that they did not make are brought down to the local copy, and their own local commits are replayed on top of those changes. This action is collaborative because it allows a programmer to keep developing in a workspace while not diverging from the remote, central Git repository. The integration of code (merging the work of two or more developers) can involve conflict resolution. Rebasing periodically enables individuals to defer upstream conflicts while working on an up-to-date code base. Rebasing could readily happen with trunk-based development or when using a branch by team strategy. The programmer can then merge her changes upstream when she is ready to contribute at a time when it would maximize productivity.
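
The effect of a rebase on commit history can be illustrated with a toy model. This is a minimal sketch, not Git's actual algorithm: histories are plain lists of commit labels, conflicts are not represented, and the function names are hypothetical.

```python
def common_prefix_length(first, second):
    """Count how many leading commits two histories share."""
    count = 0
    for a, b in zip(first, second):
        if a != b:
            break
        count += 1
    return count


def rebase(local, remote):
    """Return the local history after replaying local-only commits onto remote.

    Histories are lists of commit labels, oldest first. A trailing apostrophe
    marks a replayed commit, since a rebased commit gets a new identity.
    This toy model does not represent conflicts.
    """
    shared = common_prefix_length(local, remote)
    local_only = local[shared:]
    return remote + [label + "'" for label in local_only]


remote = ["A", "B", "C"]      # someone else pushed C
local = ["A", "B", "X", "Y"]  # local work X and Y started from B
print(rebase(local, remote))
```

After the rebase the local history contains C from the remote, followed by the replayed local commits, so the local copy no longer diverges from the remote.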

Different branching strategies have their place depending on the environment, the team's size, goals, discipline, and workload with respect to the CI/CD process, and how formal the approval process is for pull requests and releases to production. Defining a branching strategy can help you communicate with your team as well as govern team behavior. Having carefully designed methods can help you rapidly evolve a software product and handle merge conflicts near a release. Team productivity as opposed to individual productivity is usually the goal. You want to be confident that your code has production readiness at the time of the release and minimize untimely merge conflicts. Whatever paradigm you choose, consider the implications carefully.

For Further Reading / Miscellaneous
The DevOps Handbook and Continuous Delivery were both co-authored by Jez Humble; he seems to have had a smaller role in The DevOps Handbook, but a substantial role in Continuous Delivery.

To learn more about Git branching strategies, read these articles:

https://www.creativebloq.com/web-design/choose-right-git-branching-strategy-121518344
https://nvie.com/posts/a-successful-git-branching-model/
https://gitversion.net/docs/learn/branching-strategies/
https://git-scm.com/book/en/v2/Git-Branching-Branching-Workflows
https://medium.com/free-code-camp/an-introduction-to-git-merge-and-rebase-what-they-are-and-how-to-use-them-131b863785f
https://docs.microsoft.com/en-us/azure/devops/repos/git/git-branching-guidance?view=vsts
https://stackoverflow.com/questions/2428722/git-branch-strategy-for-small-dev-team
https://barro.github.io/2016/02/a-succesful-git-branching-model-considered-harmful/
https://hackernoon.com/a-branching-and-releasing-strategy-that-fits-github-flow-be1b6c48eca2
https://www.perforce.com/blog/vcs/best-branching-strategies-high-velocity-development
https://pradeeploganathan.com/git/git-branching-strategies/
https://launchdarkly.com/blog/git-branching-strategies-vs-trunk-based-development/

This git command "git bisect" is relevant to reconciling different branches:
https://git-scm.com/docs/git-bisect

To view a list of Git books, see this page.

To learn more about branching strategies in general, see these links:
https://martinfowler.com/articles/branching-patterns.html
https://www.agileconnection.com/article/picking-right-branch-merge-strategy
https://www.youtube.com/watch?v=U_IFGpJDbeU

Chapter 10 of Expert Python Programming, 3rd Edition delves into branching strategies and managing code. The 4th Edition may be even better.

Remember that some code changes can have a problem that is not captured until run-time. Code may compile fine, but fail when it is run. So you must have a CI/CD pipeline with automated testing of the final product; static code analysis and code linting are not enough.

For resolving conflicts of merge attempts, with older non-git technologies, there was a question about optimistic and pessimistic locking techniques. In 2022, we tend to think that these locking techniques no longer come into play for resolving code merge conflicts.

To disambiguate multiway branching in coding from code versioning branching strategies, we want to assert that conditional logic in programming is a completely separate topic. Code versioning systems can support branches of directories of files, whereas branching in coding (such as multiway branching) refers to if/then logic or other arithmetic-based flow control for processing a program.

How Do You Create a GCP Bucket?

Problem scenario
You want to create a GCP bucket. What should you do?

Solution

1. Log into GCP. Go here:
https://console.cloud.google.com/storage/browser?_ga=2.100337182.-1663846808.1542719582
2. Click "Create bucket".
3. Enter a name. Configure settings as desired.
4. Click "CREATE".

How Do You Troubleshoot the Ansible Playbook Error Associated with “[Errno 2] No such file or directory”?

Problem scenario
When running an Ansible playbook using the java_cert module you receive a message "[Errno 2] No such file or directory". How do you fix this?
(If you were using a file linking step in the playbook and not the java_cert module, see this posting.)

Solution
Use the "executable" attribute of the java_cert module (https://docs.ansible.com/ansible/latest/modules/java_cert_module.html) to specify the full path of the keytool file. It may be that you have Java installed, but the keytool file is in a location that is not in the PATH environment variable of the OS. To find where the keytool is, log on to the managed node. Run this command: sudo find / -name keytool
The above location (the result of the command above) should be the value of the "executable" attribute underneath the java_cert module.
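
If Python is available on the managed node, the same search can be scripted. This is a minimal sketch: it first checks whether keytool is already on the PATH and then walks a directory tree. The function name find_keytool is hypothetical, and /usr/lib/jvm is only an assumed common location for Java installations on Linux.

```python
import os
import shutil


def find_keytool(search_root, name="keytool"):
    """Return candidate paths for keytool: the PATH hit first, then a walk of search_root."""
    found = []
    on_path = shutil.which(name)
    if on_path:
        found.append(on_path)
    for directory, _subdirs, files in os.walk(search_root):
        if name in files:
            candidate = os.path.join(directory, name)
            if candidate not in found:
                found.append(candidate)
    return found


if __name__ == "__main__":
    # /usr/lib/jvm is an assumption; adjust the root for your system.
    for path in find_keytool("/usr/lib/jvm"):
        print(path)
```

If the first check finds nothing but the walk does, that supports the diagnosis above: keytool is installed but not on the PATH, so the full path belongs in the "executable" attribute.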

What Are The Advantages of a Service over an Ingress in Kubernetes?

Problem scenario
You have read that Ingresses have benefits compared to Services. You know the two are different for routing external traffic to reach a Kubernetes pod. When would you want to use a Service instead of an Ingress?

Possible Answers

When you want to direct traffic to Pods based on a selector and not an IP address. Ingresses use IP addresses*, but Services use pod selectors**. Labels can be designed arbitrarily. For performance and functionality, you may want to design a nuanced assignment of web traffic to specific pods. For more information about labels and selectors, see this internal page or this external page. (Ingress resources can leverage Services to not use IP addresses, according to pages 145 and 146 in Kubernetes in Action, but we think bypassing an Ingress resource and going straight to the Service would have performance benefits.)
When you want to keep the Ingress Controller off.*** For security reasons you may want to keep the Ingress Controller off. To create an Ingress resource, the Ingress Controller must be turned on. All things constant, Ingress resources would be safer than Services, but we cannot deny that keeping a component (the Ingress Controller) off has its security benefits too. Some Kubernetes clusters may have a policy to keep the Ingress Controller off for various reasons.
If you have a hardened Kubernetes deployment, network traffic at and above layer 5 of the OSI model may be substantially filtered/blocked. Thus Ingress controllers would not work; they operate at layer 7 of the OSI model. Services with NodePort operate at layer 4 of the OSI model.
If you want to leverage BGP with networking on baremetal. If this is a requirement, a Service of the LoadBalancer type would be preferred over an Ingress. The source link for this one is here.
Services pass the traffic to an Endpoints resource, not to pods directly (according to page 325 of Kubernetes in Action). (Endpoints is plural even when you would think it is singular.)
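
The way a Service's selector picks Pods by label can be modeled in a few lines. This is a minimal sketch of equality-based label matching, not Kubernetes code: the function name select_pods and the pods and labels are hypothetical.

```python
def select_pods(selector, pods):
    """Return names of pods whose labels contain every key/value pair in selector."""
    return [
        pod["name"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Hypothetical pods and labels for illustration.
pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "labels": {"app": "web", "tier": "canary"}},
    {"name": "db-1", "labels": {"app": "db"}},
]

print(select_pods({"app": "web"}, pods))
print(select_pods({"app": "web", "tier": "canary"}, pods))
```

Adding a second label to the selector narrows the match from both web pods to the canary pod alone, which is the kind of nuanced traffic assignment described in the first answer above.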

For more information on the differences between the two, see this posting.

*"The Nginx installations route directly to the pods' IP addresses, bypassing the associated Service's virtual IP. " Taken from https://www.joyfulbikeshedding.com/blog/2018-03-26-studying-the-kubernetes-ingress-system.html

** "The set of Pods targeted by a Service is usually determined by a selector…" Taken from https://kubernetes.io/docs/concepts/services-networking/service/#service-resource

*** "…Ingress controllers are not started automatically with a cluster." Taken from https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/

How Do You Find if Your EC2 Servers Are in a VPC?

Problem scenario
You want to know if your EC2 servers are in a VPC or not. What do you do to determine if your EC2 servers are in a VPC?

Possible Solution #1
To use the web console, see this external posting.

Possible Solution #2
This is a programmatic, character-based solution.

Prerequisite
This assumes that you have installed the AWS CLI. If you need assistance, see this posting.

Procedures
This assumes that your AWS CLI has been configured for the region that you want to query. You may need to repeat this process for each region of the EC2 servers that you are concerned about.

Run this command:
aws ec2 describe-instances | grep -i vpc
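
The grep output does not show which instance each VPC ID belongs to. As a minimal sketch, assuming the CLI's JSON output format, the following pairs each instance with its VPC. The function name instance_vpcs is hypothetical and the sample IDs are made up.

```python
import json


def instance_vpcs(describe_output):
    """Map each InstanceId to its VpcId, or None when the instance has no VpcId."""
    data = json.loads(describe_output)
    result = {}
    for reservation in data.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            result[instance["InstanceId"]] = instance.get("VpcId")
    return result


# Made-up sample in the shape of `aws ec2 describe-instances` JSON output.
SAMPLE = """
{"Reservations": [
    {"Instances": [
        {"InstanceId": "i-0123456789abcdef0", "VpcId": "vpc-0a1b2c3d4e5f67890"},
        {"InstanceId": "i-0fedcba9876543210"}
    ]}
]}
"""

for instance_id, vpc_id in instance_vpcs(SAMPLE).items():
    print(instance_id, vpc_id if vpc_id else "no VPC ID reported")
```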

How Do You Find the Subnet ID Values in AWS?

Problem scenario
You want to list the subnet ID values in AWS. What do you do?

Solution
Prerequisite
Install and configure the AWS CLI; if you need assistance, see this posting if you can use the pip command and you are using Ubuntu. If you cannot use the pip command or you are not using Ubuntu, see this posting.

Procedures
Run this command:
aws ec2 describe-subnets | grep -i subnet
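
Because grep -i subnet may also match other keys that contain the word, parsing the JSON gives a cleaner list. This is a minimal sketch assuming the CLI's JSON output format; the function name subnet_ids is hypothetical and the sample values are made up.

```python
import json


def subnet_ids(describe_output):
    """Return (SubnetId, VpcId, CidrBlock) tuples from describe-subnets JSON text."""
    data = json.loads(describe_output)
    return [
        (subnet["SubnetId"], subnet.get("VpcId"), subnet.get("CidrBlock"))
        for subnet in data.get("Subnets", [])
    ]


# Made-up sample in the shape of `aws ec2 describe-subnets` JSON output.
SAMPLE = """
{"Subnets": [
    {"SubnetId": "subnet-0123456789abcdef0", "VpcId": "vpc-0a1b2c3d4e5f67890", "CidrBlock": "10.0.1.0/24"},
    {"SubnetId": "subnet-0fedcba9876543210", "VpcId": "vpc-0a1b2c3d4e5f67890", "CidrBlock": "10.0.2.0/24"}
]}
"""

for subnet_id, vpc_id, cidr in subnet_ids(SAMPLE):
    print(subnet_id, vpc_id, cidr)
```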