CONTINUAL INTEGRATION – Page 131 – A Technical I.T./DevOps Blog

How Do You Use a YAML File or a YAML Manifest in Kubernetes?

Problem scenario
Kubernetes can use YAML files for configuration. The book Kubernetes in Action by Luksa refers to these files as manifests (page 148), YAML manifests (page 155) or "pod manifests" (page 451). The Kubernetes website refers to this YAML file as "the PodSpec" here. Pod templates are defined inside these .yaml files (as a subset of the file itself). How do you use these YAML files?

Solution
1. To create one, know that YAML is built from maps and lists. Maps are key-value pairs. Lists are sequences of items, each introduced by a dash. Quotes around the keys and values are usually optional. Indent with spaces only; tab characters are not allowed for indentation. To learn more about the formatting, you can see this external posting.

2. You can make changes to resources that already exist with these .yaml files. You would run a command like this: kubectl apply -f filename.yaml

The resource specified in the .yaml file named "filename.yaml" will be updated accordingly.
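To make the map and list idea concrete, here is a minimal sketch in Python, where dicts stand in for YAML maps and lists stand in for YAML lists. The names "demo-pod", "web" and "nginx:1.25" are hypothetical and not taken from any real cluster. The sketch prints the manifest as JSON, a format kubectl also accepts.

```python
import json

# A Pod manifest built from Python built-ins.
# dict  -> YAML map (key-value pairs)
# list  -> YAML list (a sequence of items)
# The pod name, container name, and image below are hypothetical examples.
manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-pod", "labels": {"app": "demo"}},
    "spec": {
        # "containers" is a list; each item in it is a map.
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
            },
        ],
    },
}


def render_manifest(data):
    """Return the manifest as indented JSON text."""
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    print(render_manifest(manifest))
```

If you save the printed output as filename.json, a command like kubectl apply -f filename.json should treat it the same way as the equivalent .yaml file.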

How Do You Troubleshoot an Error like This in Groovy “groovy.lang.MissingMethodException”?

Problem scenario
You run a Groovy program like this: groovy foobar.groovy

You see this message:

Caught: groovy.lang.MissingMethodException: No signature of method: java.lang.Boolean.call() is applicable for argument types: (foobar$_run_closure5$_closure6) values: [foobar$_run_closure5$_closure6@6150c3ec]
Possible solutions: wait(), any(), wait(long), and(java.lang.Boolean), each(groovy.lang.Closure), any(groovy.lang.Closure)
groovy.lang.MissingMethodException: No signature of method: java.lang.Boolean.call() is applicable for argument types: (foobar$_run_closure5$_closure6) values: [foobar$_run_closure5$_closure6@6150c3ec]
Possible solutions: wait(), any(), wait(long), and(java.lang.Boolean), each(groovy.lang.Closure), any(groovy.lang.Closure)
at foobar$_run_closure5.doCall(foobar.groovy:56)

What should you do?

Solution
Are you returning a variable with a closure? Look at the relevant return statement itself. Returns should be like this:

return foobar

Return statements should not be like this:
return ${foobar}

When You Have Two Branches in a Git Repo, How Can You Reconcile the Differences When You Merge Them?

Problem scenario
When you have created a branch of a Git repo, and there have been changes made to the master branch and this second branch you created, what are your options?

Possible Solution #1. You can merge the two branches (e.g., with a pull request). If both branches changed the same lines, Git reports a conflict, and you decide which branch's version prevails (or combine the two by hand). You can resolve conflicts this way with GitLab, Atlassian Bitbucket, or regular Git.

Possible Solution #2. You can use git rebase. This replays the commits of your second branch on top of the current state of the upstream master branch, so the tip of master becomes the new base of your branch. If no changes happened to the master branch since the second branch was created, git rebase has nothing to replay, and merging the second branch into master afterward is a simple fast-forward in which the second branch's changes prevail. In all instances, the branch you are working in will have its changes preserved after a git rebase, although conflicting commits may need to be resolved as they are replayed.

OpenStack and Virtualization API Quiz with Answers

(For the DevOps and ETL Quiz, click here. For the answers to the DevOps and ETL Quiz, click here. For the Python Quiz, click here. For the answers to the Python Quiz, click here.)

1. What are the HTTP operations associated with the acronym CRUD?

Answer: CRUD stands for Create, Read, Update, Delete. The corresponding HTTP methods (or "verbs") are POST, GET, PUT, and DELETE. Despite the letter "p" being absent from CRUD, the PATCH HTTP request is also associated with Update. PATCH is a partial update. For more information, see this blog. Increasingly the CRUD acronym, originally associated with SQL database DML commands, is being associated with HTTP-based API commands. Introduction to JavaScript Object Notation by Lindsay Bassett (ISBN 978-1-491-92948-3, published by O'Reilly Media Inc., in 2015) refers to web APIs as having CRUD operations on page 51. Here is another example of the CRUD acronym being associated with APIs.
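The mapping in this answer can be written down as a small lookup table. This is only an illustrative sketch; the function name http_methods_for is hypothetical, and individual APIs may choose their verbs differently.

```python
# Conventional mapping of CRUD operations to HTTP methods for a REST-style API.
CRUD_TO_HTTP = {
    "create": ["POST"],
    "read": ["GET"],
    # PUT replaces the whole resource; PATCH changes only part of it.
    "update": ["PUT", "PATCH"],
    "delete": ["DELETE"],
}


def http_methods_for(operation):
    """Return the HTTP methods conventionally used for a CRUD operation."""
    return CRUD_TO_HTTP[operation.lower()]


if __name__ == "__main__":
    for op in ("Create", "Read", "Update", "Delete"):
        print(op, "->", ", ".join(http_methods_for(op)))
```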

2. Does the API log for glance capture glance activity directly from the CLI (that bypasses horizon)? As a hint, the default location of the glance log is this: /var/log/glance/api.log

Answer: Yes. The glance api.log captures CLI glance commands that bypass Horizon. The glance api.log normally captures Horizon-originated glance operations such as a Horizon GUI click (which would invoke an API call). The horizon_access.log captures the front end operations, HTML behavior, and relevant response codes from the OpenStack components for the individual operations. For more information about OpenStack CLI and API logging, see this article.

3. The "Create Image" feature in Horizon is an API call with an underlying "Post" call. When a user clicks the "Create Image" button, will this underlying Post activity be captured in the /var/log/apache2/horizon_access.log ?

Answer: Yes. Similarly, Glance commands will be captured in the api.log file in /var/log/glance/. For more information about OpenStack CLI and API logging, see this article.

4. How is neutron different from nova-network?

Answer: Neutron is the current networking component for OpenStack. While nova-network is still around, it is considered legacy. (1) Why is this an OpenStack API question? "Neutron itself is an API..." according to Dan Radez (on page 38 of his book OpenStack Essentials 1st edition). The second edition of OpenStack Essentials was released in 2016.

(1) OpenStack Operations Guide (written by Fifield, Fleming, Gentle, Hochstein, Proulx, Toews and Topjian) says that nova-network is a "legacy networking option."

5. Are OpenStack APIs for Swift most commonly written in the Swift programming languages?

Answer: No. OpenStack APIs can be developed RESTfully and independent of one language. The Swift programming languages are separate and have not been adopted in any substantial way for OpenStack Swift API calls.

To be clear about the four main "swift" technologies in computing today, please note the following disambiguation:

i. OpenStack Swift is a component for object storage.

ii. The interpreted Swift programming language (supported by the National Science Foundation, the University of Chicago, and other groups) is a "simple tool for fast, easy scripting on big machines." This quote was taken from Swift's webpage here (which has information about this Swift language): http://swift-lang.org/main/

iii. The separate, compiled Swift programming language created at Apple is a programming language for iOS and is designed to integrate well with Objective-C. For more information see Apple's webpage.
https://developer.apple.com/swift/

iv. SWIFT can stand for The Society for Worldwide Interbank Financial Telecommunication. SWIFTNet Link is an API for banks to use for secure communication. SWIFT infrastructure refers to servers on a secure network supporting inter-bank connectivity and relevant APIs. To read more about SWIFT standards and technology, you can try links from Microsoft, Wikipedia, or SWIFT itself.

6. VMware API Question
vRealize Orchestrator APIs cover what percentage of vSphere APIs?

Answer: 100%. The source was https://www.vmware.com/files/pdf/products/vrealize/VMware-vRealize-Orchestrator.pdf.

7. VMware API Question
vRealize Orchestrator APIs cover what percentage of VMware vCloud Director APIs?

Answer: 100%. The source was https://www.vmware.com/files/pdf/products/vrealize/VMware-vRealize-Orchestrator.pdf.

8. VMware API Question
What is vAPI?

Answer: It is VMware's newest (as of 2016) feature to bring control of its technologies into one API. To read more about vAPI, read this link and this link.

9. OpenStack API Question
When someone on an OpenStack API conference call refers to "moss" (as in green moss on a tree) what might she be referring to?

Answer: The word "moss" is a homophone of three acronyms, MOS (Mirantis OpenStack), MaaS (Metal-as-a-Service) and MaaS (Model-as-a-Service). MOS and Metal-as-a-Service are frequently used in the OpenStack community. Model-as-a-Service is a newer term that is not necessarily related to OpenStack; you used to be able to read more about it at http://blog.syncsort.com/2016/08/big-data/expert-interview-series-cyber-security-with-apache-metron-and-storm/.

10. Generic Virtualization Question
What is virtualization on virtualization called (when a guest virtual machine's host is a virtual server itself)?

Answer: Nested virtualization. For more information, see this link.

11. OpenStack API Question
Which of the following three are the most common types of request and response parameters involved with OpenStack APIs?

a) XSD Lists
b) XSD Strings
c) XSD Dicts
d) JSON strings
e) JSON integers
f) Python strings
g) Python integers

Answer: A, B, and C. For more information, see this link.

12. API Virtualization Question
Which AWS tool primarily functions to monitor and record API activity in a given AWS account?

a) Amazon CloudWatch
b) Amazon CloudFormation
c) Amazon CloudTrail
d) Amazon AppStream

Answer: C. For more info see this link.

13. Generic API Question
Django's REST framework includes API requests with responses that are in unbrowsable JSON exclusively?

True
False

Answer: False. www.django-rest-framework.org reports "[t]he Web browsable API is a huge usability win for your developers."

14. OpenStack API Question
Does the OpenStack API include the Patch HTTP request?
Yes
No

Answer: Yes. For example, the Image service v2 API updates an image with a PATCH request. http://developer.openstack.org/api-ref-image-v2.html

15. AWS API Question
Can AWS Gateway API endpoints (URLs) be isolated within a Virtual Private Cloud and thus hidden from the Internet?
Yes
No

Answer: Yes. Amazon API Gateway endpoints can be accessible to the internet but otherwise hidden from the public. They could be configured to rely on Direct Connect. See this FAQ link or this Amazon article for more information.

16. General API Question
Service virtualization is not useful during development yet useful in production?
Yes
No

Answer: No. When isolated from the Internet for development and risky testing, virtualizing an API endpoint in a development environment is very useful. A RESTful API endpoint on a local web server can simulate the functionality of a web-exposed endpoint. When development is complete and the application can be promoted to production, the real URL can be substituted. See this link for more information.

17. When automating AWS operations with the AWS API Gateway, as with AWS SDKs and the AWS CLI, there is no need to sign the request?
True
False

Answer: False. The AWS SDKs and the AWS CLI sign requests for you, so you do not have to sign them yourself. Direct AWS API calls do require you to sign the requests. See this link for more information.
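As a rough illustration of what "signing" involves, the sketch below derives a signing key the way AWS Signature Version 4 describes, using only the Python standard library. It assumes Signature Version 4 is the scheme in use; the secret key, date, region and service values are placeholders, and the string to sign is a stand-in because building the real canonical request is not shown here.

```python
import hashlib
import hmac


def _hmac_sha256(key, message):
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key, date_stamp, region, service):
    """Derive a Signature Version 4 signing key.

    date_stamp is the request date in YYYYMMDD form.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(string_to_sign, signing_key):
    """Return the hex signature for an already-built string to sign."""
    return hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()


if __name__ == "__main__":
    # All values below are placeholders, not real credentials.
    key = derive_signing_key(
        "EXAMPLE-SECRET-KEY", "20160101", "us-west-2", "execute-api"
    )
    # A real string to sign is built from the canonical request (not shown).
    print(sign("placeholder-string-to-sign", key))
```

The SDKs and the CLI carry out these steps internally, which is why you never see them when you use those tools.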

18. For AWS, what is the difference between a t2 instance and a t1 instance?

a. t1 instances are more compute optimized
b. t2 instances are more compute optimized
c. t1 instances are "previous generation instances"
d. t2 instances are "previous generation instances"

Answer: C
See either of these links for more information: https://aws.amazon.com/ec2/instance-types/ or
https://aws.amazon.com/ec2/previous-generation/

19. What is the name of the project to standardize REST APIs?

a. The Swagger Specification
b. The OpenAPI Initiative
c. The Reverb Initiative
d. The Wordnik Initiative
e. The REST Model

Answer: B. See this link for more information.

20. What technology supports XML and JSON for Restful APIs that has the initials "HAL"?

a. Hypertext Application Language
b. Hypermedia Application Language
c. Hardware Abstraction Layer
d. Hybrid Automation Layer

Answer: A
For more information see these links:
https://dzone.com/articles/introduction-hypertext-0
http://stateless.co/hal_specification.html

21. There are three levels of RESTful web services. Which level is concerned with self-documentation, verbs, and resources?

Level 1 is concerned with ______________?
Level 2 is concerned with ______________?
Level 3 is concerned with ______________?

Answer:
Level 1 is concerned with resources.
Level 2 is concerned with verbs.
Level 3 is concerned with self-documentation.

For more information, see these links:
https://dzone.com/articles/introduction-hypertext-0
https://www.infoq.com/news/2010/03/RESTLevels

22. What is the virtualization API tool that supports the LXC Linux container system, the Xen hypervisor on Linux servers, and the KVM/QEMU Linux hypervisor?

a. cffi
b. Bhyve
c. oVirt
d. libvirt

Answer: D. See http://libvirt.org/

23. Which of the following is a tool that provides centralized management for virtual servers?

a. XPCOM
b. Bhyve
c. oVirt
d. CORBA
e. Requests

Answer: C. See http://searchservervirtualization.techtarget.com/definition/oVirt

24. virsh is part of which tool?

a. XPCOM
b. Bhyve
c. oVirt
d. CORBA
e. libvirt

Answer: E. http://www.ibm.com/developerworks/library/os-python-kvm-scripting1/

25. What does the term REST stand for?

Answer: REpresentational State Transfer
See http://www.acronymfinder.com/REST.html

26. What is an HTTP library for Python?

a. XPCOM
b. Bhyve
c. oVirt
d. CORBA
e. Requests

Answer: E. Taken from http://docs.python-requests.org/en/latest/user/advanced/#advanced

27. What does BaaS stand for?

a. Backend-as-a-Service
b. Balancer-as-a-Service
c. Box-as-a-Service
d. Bytes-as-a-Service

Answer: A. Source https://www.techopedia.com/definition/29428/backend-as-a-service-baas

28. REST APIs support direct bilateral communication. Is this true or false?

Answer: False. See the answer to the second question on this PDF. REST APIs receive and respond to requests. But true duplexed communication is not possible with REST APIs.


How Do You Find the Storage Space Displaced by a Directory and All of Its Files and Subdirectories on a Linux server?

Problem scenario
How do you find the space used on the disk (e.g., hard disk, SAN or NAS) from the files and subdirectories of a given directory via a Linux command prompt?

Solution
Run a command such as this: sudo du -sh /path/to/subdirectory

It will show you how much space is being consumed (or utilized) by the "subdirectory".
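If you want a similar figure from inside a Python script, a minimal sketch looks like the following. The function names are hypothetical. Note one assumption: this adds up the apparent size of each file, whereas du reports the disk blocks actually allocated, so the two numbers can differ somewhat.

```python
import os


def directory_size_bytes(path):
    """Sum the apparent sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            # Skip symbolic links so their targets are not counted.
            if not os.path.islink(full_path):
                total += os.lstat(full_path).st_size
    return total


def human_readable(num_bytes):
    """Format a byte count with a unit suffix, in the spirit of du -h."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


if __name__ == "__main__":
    # "." is the current directory; substitute the directory you care about.
    print(human_readable(directory_size_bytes(".")))
```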

How Do You Write the Equivalent of a “hello world” Program with Machine Learning in Python?

Problem scenario
You want to be able to say you ran a machine learning program. You know some Python basics. What do you do to write a very simple machine learning program?

Solution
Prerequisite
This assumes that pip has been installed. If you need assistance see this posting.

Procedures

1. Run this command:
sudo pip install numpy scipy scikit-learn

2. Create a file like this called ml.py:

# This program was adapted from Google Developers here: https://www.youtube.com/watch?v=cKxRvEZd3Mw
from sklearn import tree

# Each sample is a pair of numeric features; each label is the sample's class.
features = [[140, 1], [130, 1], [150, 0], [170, 0]]
labels = [0, 0, 1, 1]

# Train a decision tree on the four samples.
clf = tree.DecisionTreeClassifier()
clf = clf.fit(features, labels)

# Ask the trained tree to classify each of the training samples.
print("This program uses a decision tree that is part of the sklearn package")
print(clf.predict([[140, 1]]))
print("Above is the pair with the 140 value and below is the pair with the 130 value")
print(clf.predict([[130, 1]]))
print(clf.predict([[150, 0]]))
print("Above is the pair with the 150 value and below is the pair with the 170 value")
print(clf.predict([[170, 0]]))

3. Run it like this: python ml.py

How Do You Troubleshoot Cassandra when It Hangs on the Message “ColumnFamilyStore.java Initializing”?

Problem scenario
You start Cassandra with this command: ./bin/cassandra
You see one of the following messages:

INFO [MigrationStage:1] 2018-04-06 19:01:07,144 ColumnFamilyStore.java:391 - Initializing system_auth.resource_role_permissons_index
INFO [MigrationStage:1] 2018-04-06 19:01:07,163 ColumnFamilyStore.java:391 - Initializing system_auth.role_members

No progress is happening. What should you do?

Solution
Possible Solution #1. Try rebooting the server. This could help the problem.

Possible Solution #2. This next one is merely a workaround. It is not a best practice.

Edit the ColumnFamilyStore.java file. To find it use this:
sudo find / -name ColumnFamilyStore.java

Comment out line 389. It should look like this after you comment it out:
//logger.info("Initializing {}.{}", keyspace.getName(), name);

Start Cassandra again. But know that you modified the source code without a thorough quality assurance process. This could have serious ramifications in the future.

Possible Solution #3. Wait 60 minutes. If you are patient enough, it is possible that the problem will go away on its own.

How Do You Troubleshoot the Java Program Message “com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: Unknown database foobar”?

Problem scenario
Your Java program returns this message: "com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: Unknown database foobar"

What should you do?

Possible solution #1
Are you using Amazon Aurora? AWS has a MySQL PaaS offering. You may have a section of the Java code that looks like this:

Connection con = DriverManager.getConnection("jdbc:mysql://foobar-us-west-2b.abcdefghijk.us-west-2.rds.amazonaws.com:3306/foobar");

Remove the last "foobar" from the end. The connection string should look like this for Aurora databases:

jdbc:mysql://foobar-us-west-2b.abcdefghijk.us-west-2.rds.amazonaws.com:3306/

Recompile the program. Execute it again. (We know this is contrary to the documented and canonical construction of MySQL strings for JDBC connectivity. The root cause of your problem may have been that Aurora MySQL databases are different in this respect.)

Possible solution #2 The database name "foobar" does not exist. Did you type the server or database name incorrectly? Have you created the database? Did you connect to the correct server?

In Python, What Are Some Advantages of Calling a Function as a New Thread?

Question
Python supports the creation of new threads for [bound or unbound] functions. They can help with concurrency. New threads are ideal for non-blocking operations like serving a GUI. If you want a server to begin certain operations in parallel with others, you may want to use new threads as opposed to new processes (which can provide the same parallel processing benefit). What are some advantages of using a thread to call a function?

Answer
The main reason you use threads is to leverage the capability of the server when you have complex operations (e.g., separate Python functions) that should run simultaneously. (Technically only one thread can execute at a time. But with small amounts of delay, non-blocking operations can happen in a way that will seem concurrent to the user.) Here are the advantages of using new threads (as opposed to forking a process):

The server should perform better by not having the overhead of separate processes. A thread is not a separate process.
It will produce no new processes; therefore you will not have to clean up zombie processes. Systems administrators have less work when there are fewer zombie processes.
Different threads have access to the same memory addresses of the process (page 186 of Programming Python by Mark Lutz). Global variables in a Python program with multiple threads can make programming complex things simple. See "How Do You Troubleshoot a Python Error "UnboundLocalError: local variable 'x' referenced before assignment"?"
New threads are platform agnostic. This is different from Python's operating system forks (using os.fork would create a new process for a function call, and os.fork is not available on Windows). If you are developing software with Python for Windows and Linux/Unix, using new threads is preferable to using system forks.
A program can perform independently after the thread starts. This can allow for simultaneous execution of both the program and new thread. See "How Are Processes and Threads Different from Each Other in Python?" or "How Is a Process Different from a Daemon in Linux?"
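The points above can be sketched with the standard threading module. This is a minimal illustration rather than a recommended design; the function names square and run_in_threads are hypothetical. Each call to square runs in its own thread, and all threads append to the same list because they share the memory of the one process.

```python
import threading

# Shared by every thread in this process; no copying between processes needed.
results = []
results_lock = threading.Lock()


def square(number):
    """Compute a square and store it in the shared list."""
    with results_lock:  # guard the shared list against concurrent updates
        results.append(number * number)


def run_in_threads(numbers):
    """Call square() once per number, each call in a new thread."""
    results.clear()
    threads = [threading.Thread(target=square, args=(n,)) for n in numbers]
    for thread in threads:
        thread.start()  # the main program keeps running after each start
    for thread in threads:
        thread.join()  # wait for every thread to finish
    return sorted(results)


if __name__ == "__main__":
    print(run_in_threads([1, 2, 3]))
```

No new processes are created here, so there is nothing to reap afterward, and the same code runs unchanged on Windows and Linux/Unix.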

How Do You Troubleshoot the Message “ERROR: but there is no HDFS_DATANODE_USER defined.”?

Problem scenarios
One of the following applies to you.

Situation 1:
You run "start-dfs.sh" and it seems to work, but the "jps" command does not show that "DataNode" is running.

Situation 2:
You run "sudo bash start-dfs.sh" but you receive this message:

ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.

Solution
1. Modify start-dfs.sh. Find the comment "e.g., HDFS_DATANODE_USER=root HADOOP_SECURE_DN_USER=hdfs". Underneath the comment section, place these four stanzas:

HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

2. Modify stop-dfs.sh. Find the comment "e.g., HDFS_DATANODE_USER=root HADOOP_SECURE_DN_USER=hdfs". Underneath the comment section, place these four stanzas:

HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

3. Try "sudo jps". Without sudo, jps could return fewer services than are actually running. In most cases, you would not use "sudo" to start dfs. To learn how to set up a multi-node cluster of open source Hadoop that can be administered with a user without sudoer rights, see this posting.