To see whether Docker has started, run this command:
ps -ef | grep -i docker
If that returns only the grep process itself, then Docker is not running. Occasionally the Docker service won't start through traditional methods, but some users have found that this command works reliably:
docker daemon &
The "&" allows for the next prompt to return. This method is explicit to new users of Docker too. This method provides more verbose informational messages to print to the console as compared to "systemctl docker start."
How do you install two or more RPM packages when they depend on each other?
Question: How do you solve circular dependency problems when installing RPMs in RedHat Linux?
Problem Scenario: You keep trying to install different RPMs, but each one requires another package to be installed first. By exhaustively going through the dependencies, you find a circle of dependencies. This is sometimes called a circular (or mutual) dependency.
Root cause: Human error.
Solution: The way to resolve circular dependencies is with a single yum localinstall command with a list of each of the RPM packages afterward. For example, if packageA.rpm depends on packageB.rpm being installed, packageB.rpm depends on packageC.rpm being installed, and finally packageC.rpm depends on packageA.rpm being installed, what do you do? Put packageA.rpm, packageB.rpm, and packageC.rpm in the local directory. Then do this:
yum localinstall packageA.rpm packageB.rpm packageC.rpm
For people new to patching RedHat derivatives, learning to apply the different packages simultaneously solves the circular-dependency problem. If there is still an error message and it seems impossible to solve, look closely at the message. The "Requires:" portion of the message may name a subversion (e.g., a subtle .5 after a number) that is slightly higher than one of the versions you are trying to install. Certain combinations of .rpm files can be finicky about the exact versions involved. The solution is possible, however. Once you have the correct versions of the (potentially long-named) .rpm files, do the following:
Step #1: Go to the directory where your .rpm files are.
Step #2: Issue one of the following:
yum localinstall *.rpm
rpm -ivh *.rpm
Any persistent error message may be telling you something. There may be a version incompatibility.
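If you need to see exactly why rpm or yum is complaining, you can query the .rpm files directly before installing; a quick sketch using the illustrative package names from above:
rpm -qpR packageA.rpm
rpm -qp --provides packageB.rpm
The first command lists what packageA.rpm requires; the second lists what packageB.rpm provides. Comparing the two outputs will expose a subtle version mismatch like the ".5" described above.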
Possible Problems With Rendering PHP with Apache
Problem scenario
Your PHP code is being displayed in a raw fashion. It is not being rendered as it should be: you see the raw PHP text when you open the .php page in a web browser. What should you do?
Solution
If the PHP page is blank (or completely white), see this posting.
Is your PHP program using Linux Bash commands? If so, see this posting.
If a PHP page is retrievable from a web browser but shows only text, namely the actual PHP source code, Apache is at least somewhat working. What is the root cause? One potential root cause is that PHP has not been installed. If you are running Ubuntu Linux, run this command to remedy two potential root causes: sudo apt-get -y install php7.0 libapache2-mod-php
Another root cause of this problem may be that the Apache web server has not been configured correctly. Find the httpd.conf file (assuming you are not running Ubuntu; with Ubuntu there will be an apache2.conf file instead). You may want to consult this external posting. Otherwise, find the "AddType" section of the httpd.conf file. Ensure that this line is present: AddType application/x-httpd-php .php
The second stanza to look for is the LoadModule one. In the LoadModule section of the httpd.conf file, ensure that a line like this is present (assuming you are using PHP5):
LoadModule php5_module modules/libphp5.so
You may need to replace "modules" above with the absolute path to the libphp5.so file. The libphp5.so file may not be present after a Drupal installation and configuration. Proper Drupal* installations run the ./configure script with a flag like this: --with-apxs2=<pathToAPXS>
The value of <pathToAPXS> should be the result of this command: which apxs
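For example, a sketch of the invocation, assuming which apxs returns /usr/bin/apxs on your system:
./configure --with-apxs2=/usr/bin/apxs
Equivalently, you can substitute the command's output inline:
./configure --with-apxs2=$(which apxs)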
If this which apxs command returns nothing, try one of these commands:
- If you are using a RedHat derivative OS: # echo "extension=apc.so" > /etc/php.d/apc.ini
- If you are using a Debian distribution: # echo "extension=apc.so" > /etc/php5/conf.d/apc.ini
The flag would be --with-apxs (with no "2") if the Apache version is older.
*Drupal is a content management system. To find out how it is pronounced, go here.
How To Potentially Solve an HTTP 403 Error on An Apache Server
Problem scenario: You are trying to access a file on a website, but you get a 403 Forbidden error every time. What are some different things to look for to fix this problem?
Solution:
If you do not have access to the back-end of the web server, try these:
- Clear the cache/history from your web browser.
- Clear the cookies from your web browser.
- If you are using wget, try this:
wget -U firefox http://continualintegration.com/
(where continualintegration.com is the URL of the website)
- Verify the URL is correct. Have you entered it with correct case sensitivity? Some URLs can be case sensitive.
If you have access to the web server, try these steps:
- Verify that read permissions are given to the user trying to access the file. For example, -r--r--r-- would be the minimum permissions needed. You could use this command: "chmod 644 file.txt" (with no quotes, where file.txt is the file you are trying to retrieve). If the parent directories of the file on the Apache server have strict permissions (at the regular OS user level on the back end of the server), users on the web front end may encounter the 403 "Forbidden" message in their web browsers or from a wget command; see the first sketch after this list for a way to audit the path.
- Another cause of this problem is that the <Directory> </Directory> section of the httpd.conf file could be configured with a "Require all denied" stanza. If this line appears, the DocumentRoot directory will be locked down. That is, no web page will be visible beyond the default "Testing 123" Apache page. To open up the DocumentRoot directory tree, change the "Require all denied" to "Require all granted" (see the second sketch after this list). This way the Apache server will present a web page to requesters without a "Forbidden" or 403 error. By default, if you install the Apache web server and change the DocumentRoot directory to something besides the default /var/www/html, other references to that very same directory will not be changed. Therefore you need to change those references in httpd.conf. To find httpd.conf, try this Linux command: "find / -name httpd.conf" (with no quotes).
- The problem could be intermittent. Read this for more information.
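Two hedged sketches that follow up on the items above. First, to audit the permissions of every directory along a file's path (the path is illustrative), the namei utility prints each component's permissions so an overly strict parent directory stands out:
namei -l /var/www/html/docs/file.txt
chmod o+x /var/www/html/docs
Second, a minimal sketch of the httpd.conf stanza described above, assuming an illustrative DocumentRoot of /var/www/newroot:
DocumentRoot "/var/www/newroot"
<Directory "/var/www/newroot">
    Require all granted
</Directory>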
OpenStack Sahara Documentation
Some open source projects don't always listen to contributors' feedback. We reported a couple of errors that we found in OpenStack documentation to openstack.org. Here are the errors we saw (as of 2/2/17):
#1 If you go to this link, you'll find two "Storm EDP" links:
http://specs.openstack.org/openstack/sahara-specs/
One points to this link: http://specs.openstack.org/openstack/sahara-specs/specs/liberty/storm-scaling.html
We see no reason why the title/header of the above page is "Storm EDP" and not "Storm Scaling." Our attempted contribution was to eliminate one of the two "Storm EDP" links at the first link in this post.
#2 We found this ungrammatical sentence here (which needs the word "needs" instead of "need"):
"Sahara need more flexible way to work with security groups." This was taken from: http://specs.openstack.org/openstack/sahara-specs/specs/juno/cluster-secgroups.html
The OpenStack Foundation probably has limited resources. But if it had a way to act on each person's contributions, progress would be more rapid. The more requests are ignored, the less likely future contributions become.
SaltStack Technology and Terminology
SaltStack provides for more complex configuration management than Ansible (another Python-based configuration management tool). Some people have criticized Salt for having too many new vocabulary words. Like all complex technologies, it takes time to get used to. To help you learn about Salt, I thought I'd provide an overview.
An SLS file is a SaltStack State file. This file is the basis for determining the desired configuration of the client servers, which are called Salt Minions. A pre-written State file is called a formula in the world of SaltStack. Just as sodium and chloride can be the basis of other compounds, formulas can be the basis of complex desired-state configurations. Grains, in SaltStack terminology, are data about a Salt Minion. A grain may include information such as the OS type of a Minion server; this data is generated from the Minion. Pillars are data about Salt Minion servers too, but pillars are stored on the Salt Master server. Pillars are encrypted, and they are ideal for storing sensitive data that should only go to certain Salt Minion servers. Pillar sls files hold data like a state tree (a collection of sls files), except that pillar data is only available to servers that match a given "matcher" type.
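To make the vocabulary concrete, here is a minimal sketch of an SLS state file (the state ID and package name are illustrative):
install_web_server:
  pkg.installed:
    - name: httpd
Applying this state to a Minion ensures the httpd package is installed there.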
Beacons, in the context of SaltStack, are constant listeners that wait for a condition to be met. If used properly, a beacon can have a corresponding action taken by the "reactor system." A reactor sls file matches the beacon's event and triggers an action in response.
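A hedged sketch of the reactor wiring on the Salt Master: the event tag below is the standard minion-start event, and the sls path is illustrative. In the master configuration file:
reactor:
  - 'salt/minion/*/start':
    - /srv/reactor/highstate.sls
The referenced /srv/reactor/highstate.sls would then define the action to take whenever a Minion starts.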
The first two paragraphs were a combination of original content and content paraphrased from these two links: Pillar and Highstate. The final paragraph was paraphrased from this link on Reactors.
Containerization Has Its Advantages Over Virtualization
Containers, such as Docker containers, communicate with each other through a shared kernel. Guest virtual machines communicate with each other through the hypervisor or host operating system. Containers enjoy faster communication: staying within a shared kernel allows for more rapid communication than leaving a virtual machine and going out to a hypervisor (or host operating system) to reach another virtual machine. Containers also allow for isolation of processes with fewer operating system licenses than a comparable solution built on virtual machines; virtual machines can separate processes, but they require an operating system license for every virtual machine.
DevOps and ETL Quiz
Extract-Transform-Load workflows involve considerable architecture, including a workflow over a network to take data from flat files and ingest it into a database. Automation is one way to manage the ETL support system. DevOps engineers commonly support database installations and configurations, and they commonly support continual delivery pipelines. This automated process (involving automatic deployments) is often similar to automating an ETL process. DevOps engineering, build and release engineering, automation development, and ETL design are all interdisciplinary fields of information technology. This is a quiz related to both DevOps and ETL topics.
1. What is the DevOps tool for databases?
a. QuerySurge
b. Beehive
c. Stratos
d. DBMaestro
2. What does mung mean?
___________________________________________________________________________
3. What does idempotent mean?
___________________________________________________________________________
4. What is the process of actively preparing data for serialization (e.g., data that was not otherwise logically contiguous on disk, destined for a buffer) called? This process may include modifying data from one programming language or interface so it is compatible with a different programming language or interface.
a. Almquist variation
b. inmoning
c. scrum transition
d. marshalling
5. How is an imperative process different from a declarative process?
___________________________________________________________________________
6. What is a common tool that both ETL Developers and DevOps Engineers use?
___________________________________________________________________________
7. Which of the following can you not create an AWS Data Pipeline with?
a. AWS Management Console
b. AWS Command Line Interface
c. AWS SDKs
d. AWS APIs
e. None of the above
8. Mesos Clusters cannot work with both HDFS and Digital Ocean?
True
False
9. Hadoop YARN cannot act as a scheduler for OpenShift?
True
False
10. Which of the following Apache products can create ETL jobs?
a. Accumulo
b. Pig
c. Stanbol
d. Lucene
11. Which of the following is not an ETL product?
a. IBM InfoSphere Datastage
b. Oracle Warehouse Builder
c. Business Objects XI
d. SAS Enterprise ETL server
e. Stratos
f. Informatica
g. Apache Hadoop
h. Talend Big Data Integration
12. In Informatica, are mapplets only able to be used once without logic?
Yes
No
13. Which of the tools below are designed to aid ETL process testing and the validation of data warehouses themselves?
a. QuerySurge by Real-Time Technology Solutions
b. DBMaestro
c. Apache Cassandra
d. Apache Stratos
e. ETL Validator by datagaps inc.
14. What is an example of cooked data in the context of ETL/Devops?
a. Machine-corrupted data (e.g., from disk failure)
b. Content that was corrupted maliciously
c. Cleansed data
d. Intentionally masked data (to hide identities)
15. What is the technique that divides a table of a database into different subcomponents, such as partitioning columns, to improve read and write performance?
a. data marting
b. impedance matching
c. sharding
d. redis
16. What tool allows you to designate when Docker containers process ETL jobs without manual configuration?
a. Pachyderm
b. Chronos
c. Overwatch
d. emerge-sync
17. Which of the following can readily be used as a superior ETL platform?
a. Hadoop
b. Teradata
c. Proxmor
d. Note Beak
18. There is consensus that small companies should use Informatica or a supported, proprietary ETL tool as opposed to an in-house developed tool.
True
False
19. Which of the following has an open source version:
a. Talend Integration Suite
b. Pentaho Kettle Enterprise
c. CloverETL
d. All of the above
e. None of the above
20. What is a data lake?
a. A synonym of data warehouse
b. A buffer of streamed data
c. An archive of metadata about previous real-time data streams
d. A pool of unstructured data
21. What is a data swamp?
a. A dense data lake
b. A severely degraded data lake
c. A synonym of a data warehouse
d. A pool of unstructured data
e. An archive of metadata about previous real-time data streams
22. Snappy is the name of which two concepts?
a. The REST API for SnapChat
b. A data compression and decompression library with bindings for several languages
c. A Linux package management system
d. An automation scheduler for Informatica
e. An open source component to migrate SSIS packages to PostgreSQL
23. In a SQL database, you have a left table with four rows and a right table with seven rows. What is the highest number of rows that can be returned by an inner join?
a. 0
b. 4
c. 11
d. More than 11
24. Which of the following provide Sqoop-based connectors (choose all that apply)?
a. Teradata
b. Talend Open Studio
c. Informatica (modern versions)
d. Pentaho
25. What is a continuous application?
a. The namesake of CA traded on the Nasdaq as CA
b. An application that encompasses data streaming (e.g., ETL processes) from start to finish that adapts itself to the data stream(s) in real-time
c. An application that leverages ETL processing
d. An application receiving continuous integration (or continual integration)
e. An application receiving continuous delivery (or continual delivery)
f. An application receiving continuous deployments (or continual deployments)
g. An application that is always available through fault tolerance and load balancing
26. DevOps expert Gene Kim got his start with a security product called Tripwire, known for its emphasis on changes to files. There is a tool that keeps track of changes to a database. Which product below concerns itself with tracking changes of database schemas?
a. MongoDB
b. DBVersion
c. Databasegit
d. Liquibase
27. Which product enables you to quickly make copies of SQL Server databases for your Test, QA or development environments? Choose the most accurate answer.
a. Canonical's Juju
b. RedGate's SQL Provision
c. Apache Hamster
d. Apache Numa
28. Your SQL Server database backups are not working, or you get false positives that your backup solution is successfully backing them up. What should you use for a practical backup solution?
a. Write your own PowerShell script that backs up the database
b. Implement AlwaysOn Availability Groups
c. Implement RedGate's Toolbelt
d. Implement Apache Impala
29. Which AWS tools can perform ETL jobs? Choose two.
a. DMS (Database Migration Services)
b. DMS (Data Manipulation Service)
c. Glue
d. Cognito
e. Federation
30. Test Kitchen works for which of the following?
a. Chef
b. Terraform
c. PowerShell DSC
d. All of the above
*** See answers to quiz. ***
How do you use the source keyword in Puppet’s DSL (when writing a manifest)?
When writing a Puppet manifest, you can use the "content" reserved word. You then put quotes around the actual text content of the file, right in the manifest itself. This works for a file that you want to create on a Puppet Agent server, as long as the content is roughly one line of text. For a binary file this will not work (binary content cannot appear in the manifest). The "source" reserved word instead allows you to point to a specific file on the Puppet Master server. The puppet URI has three slashes after the colon. Here is an example of the "source" reserved word and the puppet URI:
source => puppet:///modules/goodFolder/foo.bar
The following non-obvious facts are important to know.
1) The corresponding goodFolder must actually have a subdirectory named "files," and foo.bar must reside in it. This "files" directory is not explicit in the source field declaration.
2) Puppet.conf must have a [main] section that tells Puppet where to look for the "modules" subdirectory, for example:
[main]
modulepath = /etc/puppet/modules/
3) The path to the modules including the subdirectories (named goodFolder in this example and files itself) must have permissions that allow the Puppet process to access them. This is true of the file foo.bar too.
4) Some subdirectory besides "files" must exist in "modules" to house "files." The goodFolder in the example satisfies this.
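Putting these facts together, here is a minimal sketch of a manifest that uses the source reserved word (the destination path, module name, and file name are the illustrative ones from above; on the Puppet Master the file must live at /etc/puppet/modules/goodFolder/files/foo.bar):
file { '/tmp/foo.bar':
  ensure => file,
  source => 'puppet:///modules/goodFolder/foo.bar',
}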
Once you know these four facts, you can use the valuable source reserved word. On a final note, if the destination of the file in the manifest is written for a different operating system than the Puppet Master's OS (e.g., c:/temp/foo.bar is the destination of the manifest file transfer yet the Puppet Master is running on Linux, or the destination is /tmp/foo.bar and the Puppet Master server is running on Windows), you may get an ignorable error when you compile the manifest. This is only true if your manifest doesn't specify nodes or classes to keep Puppet from applying the manifest to a non-applicable OS. When the catalog is cached, Puppet will complain that the path "must" be fully qualified. Compilation will still be successful, and the manifest will run when the Puppet Agents connect. So don't be surprised when this ignorable message is displayed.
Update on 12/28/16: For troubleshooting manifests that are not doing what you expect despite no messages or few errors in the logs, see this posting.
Six Puppet Configuration Tips
Deploying Puppet Master and Puppet Agents for the first time can involve a significant amount of troubleshooting. In this post, I want to review six miscellaneous points that may arise. These are somewhat random, but they can help in the rudimentary stages of quickly getting a proof of concept established.
1. With a default configuration, Puppet Master on Linux will run manifests with only one name in only one location: /etc/puppet/manifests/site.pp
Many DevOps engineers do use manifests with different names. However, absent special configuration, this is the only file name and location that will work.
2. A reality of efficient I.T., particularly in non-production environments with open source technologies, is ignoring certain error messages. If you compile a manifest (e.g., with the puppet agent -t command), you may be able to ignore the subsequent error if it pertains to the Puppet Master FQDN:
"Error: Could not retrieve catalog from remote server: Error 400 on Server: Could not find default node or by name with ..."
3. To find an error in a Puppet Manifest, try this command: puppet parser validate nameOfManifest.pp
It will find errors such as uppercase class names. But it will not catch an error such as a resource declaration that uses "requires => ..." The correct Puppet DSL reserved word for a given resource declaration is "require" with no "s."
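As an illustration, here is a minimal sketch of the correct spelling in a resource declaration (the file and package names are hypothetical):
file { '/tmp/app.conf':
  ensure  => file,
  require => Package['httpd'],
}
The same manifest written with "requires" would pass puppet parser validate but fail when the catalog is applied.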
4. Network Time Protocol (ntp) must be configured and running on the Puppet Master and Puppet Agent servers. The time difference between a Puppet Master server and a Puppet Agent node may seem insignificant to a person, but Puppet's SSL certificate validation is sensitive to clock skew between the two servers. To see if ntp is running, try this:
sudo ps -ef | grep ntp
If ntp is not running on a Puppet Agent, manifests may appear to compile and run without errors on either the Puppet Master or the Puppet Agent server, yet the Agent can still fail because of clock skew. Here is how to get ntp to start automatically.
First, go into the /etc/crontab file. Second, add this entry: * * * * * root service ntpd start
Third, save the file and exit. Cron will now attempt to start ntpd every minute, regardless of who is logged in, so the service will be revived within a minute if it ever stops.
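On a distribution with systemd, a cleaner alternative is to enable the service at boot (the unit may be named ntpd or ntp depending on the distribution):
sudo systemctl enable ntpd
sudo systemctl start ntpd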
5. /etc/puppet/puppet.conf can, by default, have the same content on the Agent nodes as the Master nodes. One entry should be like this in the [main] section:
server = FQDNofPuppetMasterServer
This tip clarifies how multiple servers may have the same file and how that file governs the inter-server configuration of Puppet.
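As a concrete illustration, here is a hedged sketch of a puppet.conf [main] section that the Master and every Agent could share (the FQDN is illustrative):
[main]
server = puppetmaster.example.com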
6. Problem scenario: Facter does not pick up the correct value from a Puppet Agent node with Windows Server.
Solution: Go to the Puppet Agent node. Open PowerShell. Run this: puppet facts
If the result says something like "no default action," go to the Control Panel -> Uninstall Programs. See if Puppet is installed. If it is, verify it says "Puppet Agent." Puppet Master could be installed, but that will not give you facter.
Update on 12/28/16: For troubleshooting manifests that are not doing what you expect despite no messages or few errors in the logs, see this posting.