Monday, May 27, 2019

Quantum Computing


(Image: spinning coin - the curvature is due to the CMOS rolling shutter)
One of the biggest conceptual and architectural changes to computing is quantum computing - using the nature of quantum mechanics to change the way calculations are done. Modern computers already rely on quantum mechanics through semiconductors, but mainly to create logic gates that mirror the gates once built with vacuum tubes and relays. In a quantum computer, the logic directly relies on two key features of quantum mechanical probabilities: superposition and entanglement of states. Quantum computers are believed to be able to quickly solve certain problems that the fastest modern computers struggle with - so-called quantum supremacy (of quantum computers over classical computers for a type of problem). Theoretical work shows that widely used asymmetric key cryptography would be easily broken, meaning data such as credit card or password information would no longer be secure. There are limits, though: quantum computers won't solve problems that normal computers cannot solve at all. Richard Feynman, the noted physicist, first mentioned quantum computers in 1959, and in the 1980s Paul Benioff and Yuri Manin developed the ideas for these computers.

Some Background 

Probability is the likelihood of something happening. Flip a coin and the probability of it landing heads up is 1/2. That's the same probability as landing tails up. Added together, the probability of heads or tails is 1 (assuming the coin won't land and stay on its edge). We think of probabilities as values between 0 and 1: 0 means no chance of the event occurring, while 1 means it will certainly occur. Usually, the value is somewhere in between. With quantum mechanics, this is more complicated - probability in these systems comes from squaring the probability amplitude. That amplitude can be positive or negative, and combining positive and negative amplitudes creates regions of stronger or weaker amplitude (it's even more interesting because the amplitudes are complex numbers). Square those combined amplitudes to get probabilities. This is how atoms bond together into molecules, liquids or solids - the electrons (quantum mechanical wave-particles) of an atom can have positive and negative amplitudes that, combined with the electrons of other atoms, create bonding and anti-bonding combinations. In mathematical terms, classical probability looks like:
Sum(p_i) = 1
where each p_i >= 0.
In quantum mechanical terms, probability is like this:
Sum(|a_i|^2) = 1
where each a_i is a probability amplitude: a complex number whose real and imaginary parts can be negative
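To make the difference concrete, here's a small, purely illustrative Python sketch (the numbers are made up; the point is that amplitudes can be negative and can cancel):

import math

# Classical: probabilities are non-negative and sum to 1.
p = [0.5, 0.5]
print(sum(p))                                # 1.0

# Quantum: amplitudes can be negative (or complex); probabilities are |a|^2.
a = [1 / math.sqrt(2), -1 / math.sqrt(2)]
print(sum(abs(x) ** 2 for x in a))           # 1.0

# Interference: amplitudes add *before* squaring, so they can cancel.
print((a[0] + a[1]) ** 2)                    # ~0.0 - this outcome is suppressed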

Superposition is an important concept in understanding quantum mechanical systems. It's the concept behind Schrodinger's Cat if you've heard of that. A quantum mechanical system can be composed of multiple states all at the same time. It's like a spinning coin - heads, tails, heads, tails. Only when you measure the system/state or check the coin do you get a definite answer. Measurement causes a collapse of the superposition of states into one state. Before measuring, the 'spinning' or unchecked state of the coin could be seen like this:
|Total State> = (|head> + |tail>)/sqrt(2)
Squaring gives the probabilities (the cross head-tail terms drop out when you measure):
1/2 Heads + 1/2 Tails = 1 (as the total probability needs to be 1).
While the measured state would be, if the coin is heads up:
|Total State> = |head>
with probability being 1 and Heads = 1.
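As a rough, illustrative Python sketch (not how a real quantum computer is programmed), repeatedly 'measuring' the spinning-coin state gives heads about half the time and tails the other half, with each measurement collapsing to a single definite answer:

import math, random

amp = {"head": 1 / math.sqrt(2), "tail": 1 / math.sqrt(2)}   # the spinning-coin state
probs = {k: abs(v) ** 2 for k, v in amp.items()}             # 0.5 each

counts = {"head": 0, "tail": 0}
for _ in range(10000):
    outcome = random.choices(list(probs), weights=list(probs.values()))[0]
    counts[outcome] += 1        # each 'measurement' collapses to one state
print(counts)                   # roughly {'head': 5000, 'tail': 5000}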

Entanglement is when two quantum systems combine so that, again, the total system can be described as a sum of states of the two. Imagine that we had two coins and each must show the opposite of the other; if one is heads, the other has to be tails. These two coins would be entangled - while the coin analogy is simplistic, this does happen with electrons and something called spin. Two electrons can be in the same 'state' as long as one is spin up and the other is spin down. (I put 'state' in quotes as the total state of each electron actually includes the spin.) If you measure one of the coins - check whether it's heads or tails - then you instantly know what the other coin is. Einstein referred to this as 'spooky action at a distance' (see a description of EPR). Recent experiments have shown this holds for entangled particles separated by hundreds of miles. Entropy (information theory entropy) can also be used as a measure of entanglement (the more entropy, the more entanglement). To say this another way: for entangled, think correlated in probability - the outcomes are related. If two probabilities are uncorrelated, then the outcome is simply the combined probability (P_coin1(heads) times P_coin2(tails)). Correlated outcomes aren't that simple and can't be decomposed like that. Entanglement makes it an inseparable whole. If heads is 1 and tails is 0, then the entangled state would be: |01> + |10> (up to normalization).
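A small, illustrative Python sketch of that last point - sampling the entangled state |01> + |10> only ever gives opposite coins, while two independent coins would produce all four combinations:

import math, random

amps = {"01": 1 / math.sqrt(2), "10": 1 / math.sqrt(2)}      # entangled pair
probs = {k: abs(v) ** 2 for k, v in amps.items()}

for _ in range(5):
    pair = random.choices(list(probs), weights=list(probs.values()))[0]
    print(pair[0], pair[1])     # always opposite: measuring one fixes the other

# Uncorrelated coins would give 00, 01, 10, 11, each with
# P_coin1 * P_coin2 = 0.25 - that product form is exactly what
# entanglement breaks.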

If this is a little confusing, remember Feynman's comment: "I think I can safely say that nobody understands quantum mechanics." That probably isn't entirely true, as Feynman understood it, but it's not easy to understand. The important part is that quantum mechanics allows parts of a system to interact in such a way that 'calculations' by the system can use the entangled probabilities to reach a solution much faster than a classical computer.


Quantum Computing

Normal computers have bits, 0s and 1s, in registers or memory and use these to perform calculations with normal digital logic gates. These 0s and 1s are held in memory with electricity and have to be refreshed frequently to keep them at 0 or 1. In a quantum computer, the qubit is the basic bit. As above, a qubit doesn't hold a 0 or 1, but a superposition of 0 and 1 - it holds those amplitudes mentioned above. With more than one qubit, the qubits can be entangled, and entangling them means the group of qubits can hold all possible states at once. Two qubits can hold the states 00, 01, 10, 11 all at once. A classical computer would have to hold these as 4 separate states, requiring more bits. Ultimately, the number of possible states that a quantum computer can keep at once scales as 2^n, where n is the number of qubits, while a normal computer would need 2^n bits of memory for the same number of states. A quantum computer with two qubits allows a superposition of 4 states, three qubits allows 8 states, ..., n qubits allow 2^n states. The qubits hold the state of the whole interaction while reducing the memory required.
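A quick way to see the 2^n growth is to look at how much memory a classical simulation of n qubits needs - a sketch with numpy (illustrative only):

import numpy as np

n = 3
state = np.zeros(2 ** n, dtype=complex)   # one amplitude per state |000> ... |111>
state[0] = 1.0                            # start in |000>
print(state.size)                         # 8; at n = 50 this is ~10^15 amplitudes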

The result of a quantum calculation with n qubits only returns an n 'bit' answer even though the calculation ran over a 2^n-sized space. Again, with n bits in a classical computer, n bits of information can be encoded; a quantum computer will span a 2^n space but only return n bits of information.
This rapid increase in possible states and the ability to cover all of the states is what leads to the potential speed up in quantum computers.

Calculations are done by 'loading' the information into the qubits, influencing their state, much like setting the state of memory in a normal computer even if the method is very different. Depending on the type of quantum computer, this loading is done with microwaves or lasers. Operations are reversible and operate on the quantum mechanical analog of a normal n-bit register; classical computer gates are generally not reversible - you can't know the input from the output. (In normal computers, the NOT gate is reversible; also, normal computers could be built with Toffoli gates, which can be reversed using extra memory.) One of the most well-known quantum computing gates is the Hadamard gate, which transforms specific states into a superposition of states and back again. As an analogy, in a classical computer running a probability calculation (random/stochastic/probability/transition matrices), the goal is to preserve the property that the sum of the probabilities = 1. In a quantum computer, operations like the Hadamard gate (unitary matrices) preserve the property that the sum of the squares of the amplitudes = 1. Positive and negative amplitudes combine to increase or decrease the probabilities of some outcomes while keeping the overall probability correct. The entanglement of the qubits means that affecting one affects all - in the two coins example above, setting one coin to heads meant the other was tails. Another analogy is pulling a red ball out of a bag of red and green balls - a normal computer would pull one, but quantum entanglement means that pulling one is like pulling two or more out.
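Here's a small numpy sketch of the Hadamard gate to show both properties mentioned above - it's unitary (the squared amplitudes still sum to 1) and applying it twice gets you 'back again':

import numpy as np

H = np.array([[1, 1],
              [1, -1]]) / np.sqrt(2)        # the Hadamard gate (a unitary matrix)

zero = np.array([1, 0], dtype=complex)      # the |0> state
superposed = H @ zero                       # (|0> + |1>)/sqrt(2)
print(np.abs(superposed) ** 2)              # [0.5 0.5] - probabilities still sum to 1
print(np.real(H @ superposed))              # [1. 0.] - back to |0>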

Again, the quantum computer qubits hold all values until measured; a classical computer would hold one value in the same number of bits. Measurements are then done (again with microwaves or lasers) to 'collapse' the 2^n states into one that is the result with n bits of info (the original number of qubits). Measurement is not reversible.

Issues with Quantum Computers
Due to the nature of quantum computing, the outcomes are probabilistic - they only provide a valid solution with some probability. This is true in normal computers, too, but decades of work have gone into reducing the error to near 0 so that we trust the outcome; almost no one thinks about error correction when using a normal computer these days. Error correction in quantum computing is in its infancy. The computers have to be kept super cold and isolated as much as possible, as any noise can affect the calculation. Realistically, calculations need to be run several times to be certain of the result. The errors aren't just value errors as in a classical computer, but errors in the way the qubits interact. While a quantum computer is built to have its qubits interact with each other as much as possible (entanglement), it's also built to have them interact with the environment as little as possible! The noise can cause decoherence, which means the qubits stop interacting as one group. Environmental issues like heat, noise, electromagnetic or other radiation all work against the system. The redundancy of calculations needed for error correction has been estimated to be as high as hundreds, thousands or even millions of times; fixing this is a key area of research. A recent advance reduced the estimated number of qubits needed to crack 2048-bit PKI, including error correction overhead, from billions to millions. Another challenge is scaling the lifetime of the qubits before decoherence - in 2017, IBM achieved 90 microseconds for 50 qubits.

Scaling quantum computers is another challenge, beyond eliminating environmental factors. They need to be scaled in coherence time and number of qubits as well as programming techniques. A small number of qubits easily fit close to each other, but when the number of qubits grows very large, the distances between them become an issue. In terms of programming, the same challenge facing classical computers exists - simple hardware with complex languages or complex hardware with simple languages (or the worst of both). The difficulty with quantum computers is the complexity of understanding the operations. A computer with n qubits might require n^3 logic 'gates' or operations to achieve the result.

Current Work

There are two main types of quantum computers now: ion/atom traps and superconducting circuits. Ion traps use lasers to write, read and trap ions suspended in electromagnetic fields. The ions in the computer interact in a way that you could think of as vibrating together. Similarly, atoms can be trapped in an optical lattice.



The other main type is superconductor based. These are often modeled on superconducting quantum interference devices (SQUIDs), small rings of current. Oscillating microwaves are used to set and read values. The devices are connected together via radio frequency channels which could be part of a printed circuit board. A company called D-Wave has been working on quantum computers for years. They structure the computer to match the problem, so it requires almost no code; their computers don't connect all qubits together and can't solve all types of problems. Their computers resemble annealing systems, using transverse tunneling fields (analogous to temperature in real annealing). Google and IBM are working on more generic computers that link all qubits. In 2017, IBM produced systems with up to 50 qubits. In 2018, Google raised that to 72 qubits. Compared to classical computers doing simulated annealing or quantum Monte Carlo calculations, Google says quantum annealing is 100M times faster. IBM offers a test quantum computing environment if you want to sign up.

Quantum supremacy is when a quantum computer is able to solve a classical computer science problem faster than a supercomputer (or any classical computer). It's uncertain how large a quantum computer will have to be to achieve this; some expect it soon, while others doubt it will happen. To be clear, quantum computers solve the same types of problems that classical computers can - they're both Turing machines - but they'll solve some of them much faster. Problems a classical computer can't solve at all, like the halting problem, remain unsolvable by a quantum computer, and NP-complete problems aren't expected to become efficiently solvable either. In acronyms, quantum computers work in BQP (the bounded-error, quantum, polynomial time set), which is thought to be bigger than BPP (the bounded-error, probabilistic, polynomial time set). In particular, quantum computers should be good at factoring large numbers, calculating optimized paths (like courier routes), quantum search (Grover's algorithm), and physics and chemistry problems around the quantum nature of the universe - nature isn't 'classical', it's quantum mechanical.

With the ability to factor large numbers quickly, quantum computers will crack traditional cryptography easily. In 1994, Shor developed an algorithm that can quickly factor large numbers or compute discrete logarithms. This algorithm, based on modular exponentiation and the quantum Fourier transform, can recover the factors behind public key cryptography (PKI, for public key infrastructure) and render PKI insecure. In 2016, NIST requested submissions of quantum-resistant cryptography algorithms; currently, 26 algorithms are in the later stages of evaluation. The key weakness of PKI is the asymmetric key portion, which relies on large prime numbers or discrete logarithms being difficult for classical computers to factor or solve. Symmetric keys are much more resistant to quantum computing, but Grover's algorithm does make them susceptible - thankfully, doubling the symmetric key size is often enough in a post-quantum world.
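To give a feel for why factoring is the target, here's a hedged Python sketch of the classical reduction Shor's algorithm uses - the period search below is brute force, standing in for the step a quantum computer would do with modular exponentiation and the quantum Fourier transform:

from math import gcd

def factor_via_period(N, a):
    if gcd(a, N) != 1:
        return gcd(a, N)                  # lucky guess: a already shares a factor
    r = 1
    while pow(a, r, N) != 1:              # find the period r of a^x mod N (brute force)
        r += 1
    if r % 2 or pow(a, r // 2, N) == N - 1:
        return None                       # unlucky choice of a - pick another
    return gcd(pow(a, r // 2) - 1, N)     # yields a non-trivial factor of N

print(factor_via_period(15, 7))           # 3, since 7^4 = 1 mod 15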

Some approaches to post-quantum cryptography include one time keys (hashes or OTP), simultaneous/multivariate equations, lattice-based and error correcting codes. All rely on the difficulty of decoding the information without all the input or the time it takes to decode. Current PKI also relies on the hardness of solving problems. It's possible a classical computer can crack a PKI key as well, but the algorithms have been designed to make it take thousands of years.

The drive to improve quantum computers is growing, as being the first with a viable quantum computer is important financially and strategically. Microsoft has entered the race with Google, IBM, D-Wave and researchers around the world. There's a course from Microsoft and Google on learning to program quantum computers, IBM has a system online for learning more, Google has its own playground and scripting language, Microsoft has announced a language called Q#, and researchers at labs are driving this forward.

Sunday, April 28, 2019

Sizes of Large Directories with Gluster, SSHFS, NFS

This post covers a few things: checking the number of entries or rough sizes of a directory, a look at the behavior of Gluster, NFS, SSHFS, and SFTP for remote directory sizes, and some info on mounting remote file systems using NFS or SSHFS (Gluster is a topic for another day).

First, how to check the rough size of a directory. We know that checking the number of files in a large directory locally can be slow. Doing that check over a remote mount like Gluster can be much slower and even cause the Gluster mount to crash.
Generally, under Linux/Unix, you can get a rough estimate of the size of a directory by looking at the output of ls -l in the parent directory of that directory. For example, let's use a test directory to check behavior: ~/test. We've added two subdirectories dir_small with no files in it and dir_large with 5000 files in it. ls -l ~/test gives:
ls -l ~/test
drwxrwxr-x. 2 jm jm 143360 Apr 26 16:41 dir_large
drwxrwxr-x. 2 jm jm   4096 Apr 26 16:56 dir_small

Here dir_small has the smallest size possible and dir_large is larger due to the 5000 files in it. Remember that in Linux/Unix, a directory is just a special type of file that keeps information about the list of files in that directory. 

To count the files, or just to get a guide as to which is the largest directory, the safest options are:
ls -l ../ # only a rough guide, check parent directory to see directory 'size'
find dir_large -type f -print | wc -l # lists files one by one
echo * | awk -F ' ' '{print NF}' # echo lists the files on one line; awk counts them
ls | wc -l # performs more ops than echo * (as checked by strace)
If you think you have a large directory, don't do ls -l as that will stat each file requiring far more ops and time.
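If you'd rather not rely on shell quirks, a small Python sketch using os.scandir does roughly the same thing - it reads the directory entries without stat-ing each file (it counts all entries, not just regular files):

import os

def count_entries(path):
    n = 0
    with os.scandir(path) as it:          # iterates directory entries only
        for _ in it:
            n += 1
    return n

print(count_entries(os.path.expanduser("~/test/dir_large")))   # ~5000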
 
I point all of this out as this was tried using Gluster, with some issues. In case you don't know, Gluster or GlusterFS is a clustered file system. We've used it for standalone clusters within our data centers (DCs). It can do cross-DC replication, although we've found too many problems with it, so we don't use that feature.

We had an issue the other day with a system mounting a Gluster volume and we wanted to check on files in a large collection of directories.

Interestingly, Gluster's directory sizes don't show the usual file size info. For the same directories as above, it showed
drwxrwxr-x. 2 jm jm   4096 Apr 26 16:41 dir_large
drwxrwxr-x. 2 jm jm   4096 Apr 26 16:56 dir_small


Running other commands like ls and find can be painful with a large Gluster volume. Some have given it a reputation for being almost unusable for certain interactive operations like ls, find, du, etc. Ultimately, the solution was to use find and its one-file-at-a-time checking; the alternative is to run these commands on the Gluster server rather than the client system.

Since Gluster has a different meaning for directory size, we thought we'd see whether that comes from Gluster or from the FUSE layer that it uses. Interestingly, the Gluster brick, the file system being exported for remote mounting, shows the normal directory size. That's expected as it's a normal Linux file system. We also tried NFS, which passes through the usual directory sizes:
drwxrwxr-x. 2 jm jm 143360 Apr 26 16:41 dir_large
drwxrwxr-x. 2 jm jm   4096 Apr 26 16:56 dir_small

NFS doesn't use FUSE though so this only confirms that normal directory sizes are passed through with other remote file systems. 

We also tried SSHFS which uses FUSE for mounting. Just like NFS, SSHFS showed the right directory sizes:
drwxrwxr-x. 2 jm jm 143360 Apr 26 16:41 dir_large
drwxrwxr-x. 2 jm jm   4096 Apr 26 16:56 dir_small

By the way, SFTP also showed the normal directory sizes, but you might have guessed that by now.

So it looks like Gluster is the issue here: it has chosen to re-interpret the meaning of a directory's size.

Information on mounting NFS and SSHFS file systems
Instructions for SSHFS:
SSHFS isn't installed on all systems. On my test system (Fedora 29), I needed to install with
sudo dnf install fuse-sshfs
# make a mount point
mkdir ~/test/mount_point
# to mount, do:
sshfs remote_user@remote_host:/remote_directory ~/test/mount_point
# to unmount, do:
 fusermount -u ~/test/mount_point

Instructions for NFS:
#edit the export file and add your export directory
vi /etc/exports
# add  /home/jm/test localhost(ro) *.local.domain(ro)
#start nfs-server
sudo systemctl start nfs-server
sudo exportfs -a
# check the exported file system is available from the local or client system:
showmount -e test-nfs-server.local.domain
showmount -e localhost
# make a mount point
mkdir /tmp/mount_point
sudo mount -t nfs test-nfs-server.local.domain:/home/jm/test /tmp/mount_point
 #to unmount, use the usual umount:
sudo umount /tmp/mount_point
# on the nfs server, un-export the fs:
sudo exportfs -ua
#stop nfs if you want:
systemctl stop nfs-server
 

Monday, April 22, 2019

MySQL Galera Split Brain

The Galera cluster option for MySQL is one advantage MySQL has over Postgres. The clustering allows high availability and good performance. However, it's not without its issues, and one of those is the split-brain problem. There are two different kinds of split brain: poorly configured clusters that can't achieve a quorum for the split you're worried about and, more rarely, an update issue between the nodes, which is interesting if frustrating.

MySQL Galera clusters have worked well and provided good uptime. The normal configuration is to have more active nodes in the primary location or data center. In the event of the link between the primary and secondary sites failing, the primary cluster should continue to run. It will continue running provided it has the majority of quorum votes on its own. It is important, therefore, to make sure that the quorum is achievable without the secondary location. One option is to keep the number of active nodes higher in the primary location. Another is to adjust the pc.weight of each node to make sure that the weight is larger in the primary location. See the Galera docs about setting the weight of a node. Either of these options makes the primary location safe from failures of the other locations, but still presents a problem if the primary data center has an issue. The remaining option is to use a 3rd location or witness to break ties or provide a quorum - you could do that with your own software, with a full set of servers in a 3rd location or use Galera's own solution. Galera's solution is garbd, the Galera Arbitrator, which acts as a witness or voting system when you only really have two main locations.
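For illustration only (check the Galera documentation for the exact syntax for your version), weighting the primary site more heavily might look roughly like this in each node's MySQL configuration:

# primary-site node (illustrative my.cnf snippet)
[mysqld]
wsrep_provider_options="pc.weight=2"

# secondary-site node
[mysqld]
wsrep_provider_options="pc.weight=1"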

The second split brain issue is more interesting - i.e. it isn't a simple configuration or quorum issue. In this one, a Galera cluster shuts down on its own after detecting an issue. All of the active nodes except one would report something like this:

 "Duplicate entry 'entry_value_being_inserted' for key 'Key_for_column', Error_code: 1062 "
It might include "handler error HA_ERR_FOUND_DUPP_KEY" as well.

The issue here is that Galera replication has pushed updates to every node. The replication pushes the change synchronously but applies it asynchronously - flow control is used to prevent nodes from getting too far behind (the Galera docs describe this as committing asynchronously). As you'd expect, it's RBR - row-based replication, not statement-based. What happens here is a case of a node falling behind. Each of the other nodes sees an inconsistency and, in order to protect the cluster, shuts itself down. Unfortunately, that can mean every node shuts down except the one lagging, inconsistent node. With only one node active, Galera realizes it doesn't have the majority of votes needed to maintain the cluster and shuts the remaining node down. In order to recover the cluster, you need to find the last node that was running and start it with the bootstrap option, then start every other node as normal. This issue doesn't happen often, but it's good to understand it and how to recover when it does happen. By the way, there are some related issues that can end the same way - see this link for how to fix this and related problems.
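As a rough sketch of that recovery on a systemd-based MariaDB/Galera install (commands differ between distributions and Galera builds, so treat this as illustrative):
cat /var/lib/mysql/grastate.dat   # compare seqno across nodes; pick the most advanced one
sudo galera_new_cluster           # bootstrap that node
sudo systemctl start mariadb      # then start each remaining node normally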

Saturday, April 20, 2019

ActiveMQ Network of Brokers Again

Network of Brokers with ActiveMQ
In the past, I've written about the ActiveMQ network of brokers and some of the issues that can come up (see for example: https://www.randomparallels.com/2012/07/network-of-brokers-revisited.html).

That was some years ago, so what is it like now? Overall, it's very good and works reliably between different data centers over a WAN link. It also provides for an important use case: the ability to write to one location and read anywhere. This write-read is done via the broker 'local' to the apps, so that if both the producer and consumer are in the same location, it's handled 'locally' without having to commit to all brokers (or a quorum) everywhere first. That can have important performance benefits. However, there does seem to be a limit to the number of queues that can be bridged across a network of brokers - I don't know the number or what affects it, but assume a few hundred at most. The good news is that you can have a number of network connectors between two brokers, which means you can bridge many more than a few hundred queues between the same pair of brokers. The key is to create new networkConnectors as needed to handle groups of related (or similarly named) queues or topics. Here's an example of different groups of queues per networkConnector within the networkConnectors part of the ActiveMQ configuration XML:

<networkConnectors>
        <networkConnector name="QUEUE_SET_1" duplex="true" uri="static:(tcp://192.x.x.x:61616)">
                <dynamicallyIncludedDestinations>
                        <queue physicalName="stock.nasdaq.>" />
                </dynamicallyIncludedDestinations>
        </networkConnector>
        <networkConnector name="QUEUE_SET_2" duplex="true" uri="static:(tcp://192.x.x.x:61616)">
                <dynamicallyIncludedDestinations>
                        <queue physicalName="stock.nyse.>"/>
                </dynamicallyIncludedDestinations>
        </networkConnector>
  </networkConnectors>


Splitting out groups of queues like this allows the brokers to bridge more queues (or topics, the same applies for them) between the brokers without any issues. For another example, look at the Duplex Connector example on the ActiveMQ docs or in this article.

Sunday, March 3, 2019

Java vs C Benchmarks


(Written a few years ago, but moving to my main blog - original date March 2014)
Benchmarks are much like statistics - it's easy to mislead people about what they mean. They are useful for comparing systems, programs, etc., and it is not uncommon, when a new system becomes available, that everyone wants to try out their app to see how fast it will be. This was common at universities, where the latest big system needed to be tested with every individual's idea of the 'most important' programs. Of course, a benchmark result is really a function of a number of factors - hardware, compiler, configuration, libraries, etc.

A common benchmark these days is between computer languages. It used to be that C and FORTRAN were always considered the fastest languages with C++ close behind.  With the increase in the number of languages and investment in making ones like Java much faster than the first iteration (Java 1.1 was about as fast as bash for running a program, but now it is close to C in many benchmarks), comparisons between languages are extra fuel on the fire of programming language wars/discussions.

There is a great site for such comparisons: Computer language benchmark games (previously shootout)

This shows that C, C++, and now Rust (in place of FORTRAN) still generally hold the favored positions, although not for every micro-benchmark. Sometimes Java or another language will win out. The benchmark games also include source code and compile commands if you want to repeat the tests yourself.

Another fun tool is SciMark2, a Java benchmarking tool that also has a C version.

Running these two versions on one of my older computers (AMD Athlon II 3200) showed that Java (OpenJDK 1.7 in this case) won by a small margin, even when using profiling and various GCC compiler options for performance. This result surprised me since the JVM is written in C++ and seems to be compiled with GCC/G++ in this case - OK, yes, the JVM can do JIT (just-in-time compilation, i.e. runtime-based optimizations), but actually beating C was surprising. (Not all JVMs are written in the same language - some are C, and an early IBM version was in Smalltalk. GCC has been written in C++ since 2012, so I'm using C/C++ a little interchangeably.)

The slowest C benchmark was the SOR method, with the C version scoring ~560 and the Java version about 780 - the other micro-benchmarks were either in C's favor or equal. Looking at the SOR.c file, it had a simple loop over three vectors in a 2D matrix/array. There was little to improve on ... at first sight, and I suspect this is why Java was winning: it was doing optimizations that GCC wouldn't produce despite options such as -funroll-loops. Manually unrolling the loop by a factor of two, so that every loop iteration performed two SOR steps, raised the C version's score to 3200, and unrolling to 4 steps per iteration raised it to 4800. The lesson is clear: with a little experience in optimizing C, turning a well-written piece of code into a fast piece of code can be quite easy. Did loop unrolling make a difference with Java? Yes, but only a little. The Java version improved to 800 (from ~780), but that's a small difference and hardly worth the tuning effort, as Java really did do most of the work for you.

Now, what does this tell you? C is probably still the champion - after all, a Java program is compiled to bytecode which is then run on a JVM, which is itself a C++ program, so making Java truly faster than C/C++ would mean making C/C++ faster than C/C++. The difference is that the C++ program called the JVM puts more effort into optimizing the running program than does 'run of the mill' C (and GCC compilation). If you know C and are willing to optimize it, you'll most likely get some benefit; if not, Java may be a good choice!

What does all of this tell you - that benchmarks are easily twisted to suit what you want to say if you're just willing to find the right one.  Anyway, have some fun with the benchmarks and results above to tell your own story.

Thursday, January 3, 2019

Ansible Examples

Ansible is an excellent tool for system automation. It's similar in some ways to Puppet, Chef, and Salt. However, unlike Puppet and Chef, Ansible doesn't require a local agent to be running on the system being managed. This is an advantage as it saves time and initial configuration (of course, there are reasons you might want a local agent running...). By the way, the name Ansible comes from the book Rocannon's World by Ursula K. Le Guin.

Ansible uses the idea of playbooks to coordinate activities - a playbook might have several plays and they could be chosen individually or in some combination. Plays contain tasks, which use modules. Roles are another fundamental concept in Ansible and refer to the way a system might be used or configured. Finally, there are hosts. So, a playbook like "webservers.yml" would use the "webservers" role to make the "webservers" hosts (the target hosts) into web servers. You might also have a "base" role that you'd want applied to the "webservers" first to get them into a basic configuration.

Let’s start with listing out the hosts that we want to manage. Create an ansible directory and cd into it:
mkdir ansible && cd ansible

Create and edit a hosts file and put in entries like this:
192.168.0.22
localhost

That adds two hosts to our list - localhost and another server at 192.168.0.22

To make sure that’s right, run this command to list the hosts:
ansible all --list-hosts -i ./hosts #list all hosts using the hosts file

Normally, Ansible likes to use /etc/ansible/hosts as its file, but I’d like to keep it local to the work.

To make sure that everything can communicate correctly, try running this:
ansible all -m ping -i ./hosts #this runs an ad hoc ansible command using the ping module
Output:
192.168.0.22 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
and another (uses the default command module):
ansible all -a /bin/date -i ./hosts #this runs a single command rather than a module

By the way, if you haven’t set up your SSH keys and want to use a simple username and password pair, try this:
ansible all -a /bin/date -i ./hosts -u username -k #set username to the username you want to use

When the remote system doesn’t have Python installed, it's useful to use the raw module:
ansible all -m raw -a "ls -l"

One that shows all of the ansible_facts visible by default is:
ansible all -m setup -i ./hosts #loads of info (use as {{ ansible_facts['node'] }})

Ansible also has a flag to test actions without running them: -C. Take the above example with /bin/date:
ansible all -C -a /bin/date -i ./hosts
Instead of connecting to the remote system and showing the date, it produces:
192.168.0.22 | SKIPPED

Let’s update the hosts file to group the host(s).
[cloudserver]
192.168.0.22
[local]
localhost

Test this if you want by running something like this:
ansible cloudserver -m ping -i ./hosts

One-off commands are fun, but the power of ansible comes from the playbooks. Here’s a simple playbook, let’s call it simple_playbook.yml
---
- name: check ping and date
  hosts: cloudserver
  remote_user: username
  vars:
    file_value: "Here are the file contents"

  tasks:
    - name: get date
      command: /bin/date

    - name: copy file to remote server using template
      template: src=template/play_file.j2 dest=./play_file.txt

    - name: list files
      shell: ls -l play_file.txt

    - name: start http
      service: name=httpd state=started

Create a local directory called template and add a file to it called play_file.j2. In that file, add a Jinja2 variable placeholder:
{{ file_value }}

Run this playbook:
ansible-playbook simple_playbook.yml -i ./hosts -k #adding the -k option in case SSH keys aren't set

If you want to check the syntax first, run this:
ansible-playbook simple_playbook.yml -i ./hosts --syntax-check

There’s also a --verbose option that will add much more detail to the output.

That is it; a short set of examples of Ansible.

Thursday, March 30, 2017

When Apache webserver won't start & semaphores

Our alerting system started telling us one of our Apache systems (a Graphite front end) wasn't running. After a while, I joined in the diagnosis of why it still wasn't running and how to get it running again. Trying to restart Apache only produced this:

service httpd start  ; tail -f error_log
Starting httpd:                                            [  OK  ]
[Tue Mar XX 06:21:41 2016] [notice] Digest: done
Configuration Failed
[Tue Mar XX 06:23:33 2016] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Tue Mar XX 06:23:33 2016] [notice] Digest: generating secret for digest authentication ...
[Tue Mar XX 06:23:33 2016] [notice] Digest: done
Configuration Failed
[Tue Mar XX 06:24:39 2016] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Tue Mar XX 06:24:39 2016] [notice] Digest: generating secret for digest authentication ...
[Tue Mar XX 06:24:39 2016] [notice] Digest: done
Configuration Failed

It looked fine for a fraction of a second and then died with "Configuration Failed". A quick check with apachectl -t showed that the conf file was fine and that no one had changed it, so it wasn't that. I also checked directory permissions, disk space and inodes available - all looked fine. One option was to move all of the conf/vhosts files aside and try a clean start, but that wouldn't explain the problem.

Going deeper instead, I ran strace (it's possible I could have used httpd -e DEBUG instead of strace):

strace -f -o /tmp/apache.trace /usr/sbin/httpd

34583 read(10, "FGF95a\t\timage/unknown\n#\n# GRR 95"..., 4096) = 4096
34583 read(10, " The contributor claims:\n#   I c"..., 4096) = 851
34583 read(10, "", 4096)                = 0
34583 close(10)                         = 0
34583 open("/etc/httpd/conf/magic", O_RDONLY|O_CLOEXEC) = 10
34583 fcntl(10, F_GETFD)                = 0x1 (flags FD_CLOEXEC)
34583 fcntl(10, F_SETFD, FD_CLOEXEC)    = 0
34583 read(10, "# Magic data for mod_mime_magic "..., 4096) = 4096
34583 read(10, "o figure out what's inside.\n\n# s"..., 4096) = 4096
34583 read(10, "FGF95a\t\timage/unknown\n#\n# GRR 95"..., 4096) = 4096
34583 read(10, " The contributor claims:\n#   I c"..., 4096) = 851
34583 read(10, "", 4096)                = 0
34583 close(10)                         = 0
34583 write(2, "[Tue Mar 07 00:29:06 2017] [noti"..., 92) = 92
34583 open("/dev/urandom", O_RDONLY)    = 10
34583 read(10, "#\27\356'm[7T\266\265\373\374\203\16/_\375\236\10\200", 20) = 20
34583 close(10)                         = 0
34583 write(2, "[Tue Mar 07 00:29:06 2017] [noti"..., 49) = 49
34583 mmap(NULL, 500008, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x7f1cd2f02000
34583 semget(IPC_PRIVATE, 1, IPC_CREAT|0600) = -1 ENOSPC (No space left on device)
34583 write(2, "Configuration Failed\n", 21) = 21
34583 select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
34583 close(9)                          = 0
34583 close(8)                          = 0
34583 close(7)                          = 0
34583 munmap(0x7f1cd2f02000, 500008)    = 0
34583 close(6)                          = 0
34583 close(5)                          = 0
34583 munmap(0x7f1cc8690000, 2248032)   = 0
34583 munmap(0x7f1cc82ea000, 3823040)   = 0
....

That looked weird. These two lines were interesting:
34583 semget(IPC_PRIVATE, 1, IPC_CREAT|0600) = -1 ENOSPC (No space left on device)
34583 write(2, "Configuration Failed\n", 21) = 21

Checking ipcs -s showed a large number of apache owned semaphores still in the system. Since apache wasn't running and wouldn't run, we decided to clear these down with a simple:
ipcs -s | grep apache | awk '{print $2}' | while read id
do
   ipcrm sem "$id"
done

This cleared up the semaphore list and allowed apache to restart. Under normal working conditions, apache only uses 2-4 semaphores. Looking on the web, this appears to be related to abrupt stops of Apache. Off to check if the team had been doing this and not realizing :)
Hope this helps anyone else trying to solve this problem.