Arrfab's Blog : Linux tips and tricks

31 Oct 2013

Debug for the winners !

Recently I had to dive back into Ansible playbooks I wrote (quite) some time ago. I had to add some logic to generate different application templates based on facts/packages being installed on the managed nodes. Long story short (I'll not describe the use case here as it's quite complex), I decided that injecting some logic directly in the Jinja2 templates would be enough .. but it wasn't.

Let's take a very simplified example here (don't even look at the tasks themselves but rather at the logic explaining how to get there ; once again this is a 'stupid' playbook) :

---
- hosts: localhost
  connection: local
  user: root
  vars:
    - myrole: httpserver

  tasks:
    - name: registering a variable only if myrole is httpserver
      command: /bin/rpm -q --qf '%{version}' httpd
      register: httpd_version
      when: myrole == 'httpserver'
    - name: pushing the generated template
      template: src=logic.txt.j2 dest=/tmp/logic.txt

  handlers:

Now let's have a look at the (very) simple logic.txt.j2 :

{% if httpd_version is defined -%}
You're using an Apache http server version : {{ httpd_version.stdout }}
{% else %}
You're not using an http server, or not defined in the ansible machine role
{% endif -%}

Easy, and it seemed to work when myrole was indeed httpserver :

cat /tmp/logic.txt
You're using an Apache http server version : 2.2.15

But things didn't work as expected when myrole was something else, for example dbserver :

TASK: [registering a variable only if myrole is httpserver] *******************
skipping: [localhost]

TASK: [pushing the generated template] ****************************************
fatal: [localhost] => {'msg': "One or more undefined variables: 'dict' object has no attribute 'stdout'", 'failed': True}

hmm, as the register: task was skipped, I was wondering why it then complained about httpd_version.stdout : I thought that httpd_version wasn't defined .. but I was wrong : even if 'skipped', the variable still exists for that host. I quickly discovered that when adding a debug task in between the other tasks in my playbook :

- debug: msg="this is http_version value {{ httpd_version }}"

Now let's see what it reports :

TASK: [debug msg="this is http_version value {{httpd_version}}"] **************
ok: [localhost] => {"msg": "this is http_version value {u'skipped': True, u'changed': False}"}

Very interesting : so even when skipped, the variable httpd_version is still "registered" by the register: feature but marked as skipped.

So let's change our "logic" in the Jinja2 template accordingly ! :

{% if httpd_version.skipped -%}
You're not using an http server, or not defined in the ansible machine role
{% else %}
You're using an Apache http server version : {{ httpd_version.stdout }}
{% endif -%}

And now it works in all cases ..
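
Side note : if you'd rather not rely on the 'skipped' key being there, a small variant of the same template (nothing more than that) is to test directly for the attribute you actually need :

{% if httpd_version.stdout is defined -%}
You're using an Apache http server version : {{ httpd_version.stdout }}
{% else %}
You're not using an http server, or not defined in the ansible machine role
{% endif -%}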

It's a (very, very, very) simplified example, but you get the idea : using the debug module (don't forget to call ansible-playbook with -vvv to see those messages too !) can quickly show you where your issue is when you have to troubleshoot something. As Patrick Debois was saying : "you gotta love Ansible for its simplicity"
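
For the record, such a verbose run is just a matter of (the playbook and inventory file names here being of course examples) :

ansible-playbook -i hosts logic-playbook.yml -vvv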

Filed under: Ansible, CentOS, SysAdmin
23 May 2013

Rolling updates with Ansible and Apache reverse proxies

It's not a secret anymore that I use Ansible to do a lot of things. That goes from simple "one shot" actions with ansible on multiple nodes to "configuration management and deployment" tasks with ansible-playbook. One of the things I also really like about Ansible is that it's a great orchestration tool too.

For example, in some WSOA flows you can have a bunch of servers behind load balancer nodes. When you want to put a backend node/web server node in maintenance mode (to change configuration/update package/update app/whatever), you just "remove" that node from the production flow, do what you need to do, verify it's up again and put that node back in production. The principle of "rolling updates" is then interesting as you still have 24/7 flows in production.

But what if you're not in charge of the whole infrastructure ? Say, for example, you're in charge of some servers, but not of the load balancers in front of them. Let's consider the following situation, and how we'll use ansible to still disable/enable a backend server behind Apache reverse proxies.

[diagram : two Apache reverse proxies load balancing traffic to four JBoss backend nodes]

So here is the (simplified) situation : two Apache reverse proxies (using the mod_proxy_balancer module) are used to load balance traffic to four backend nodes (JBoss in our simplified case). We can't directly touch those upstream Apache nodes, but we can still interact with them, thanks to the fact that "balancer manager support" is active (and protected !)
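
For reference, the balancer-manager endpoint on those Apache nodes is typically declared like this (just a sketch : the Location path matches the 'hidden' URI we'll use below, and the basic auth setup is an assumption, not the actual configuration of those nodes) :

<Location /balancer-manager-hidden-and-redirected>
    SetHandler balancer-manager
    # basic auth here is an assumption ; any protection scheme works,
    # as long as curl can authenticate against it
    AuthType Basic
    AuthName "Balancer Manager"
    AuthUserFile /etc/httpd/conf/htpasswd
    Require valid-user
</Location>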

Let's have a look at a (simplified) ansible inventory file :

[jboss-cluster]
jboss-1
jboss-2
jboss-3
jboss-4

[apache-group-1]
apache-node-1
apache-node-2

Let's now create a generic (write once / use it many times) task to disable a backend node on the Apache side ! :

---
##############################################################################
#
# This task can be included in a playbook to pause a backend node
# being load balanced by Apache Reverse Proxies
# Several variables need to be defined :
#   - ${apache_rp_backend_url} : the URL of the backend server, as known by Apache server
#   - ${apache_rp_backend_cluster} : the name of the cluster as defined on the Apache RP (the group the node is member of)
#   - ${apache_rp_group} : the name of the group declared in hosts.cfg containing Apache Reverse Proxies
#   - ${apache_rp_user} : the username used to authenticate against the Apache balancer-manager
#   - ${apache_rp_password} : the password used to authenticate against the Apache balancer-manager
#   - ${apache_rp_balancer_manager_uri} : the URI where to find the balancer-manager Apache mod
#
##############################################################################
- name: Disabling the worker in Apache Reverse Proxies
  local_action: shell /usr/bin/curl -k --user ${apache_rp_user}:${apache_rp_password} "https://${item}/${apache_rp_balancer_manager_uri}?b=${apache_rp_backend_cluster}&w=${apache_rp_backend_url}&nonce=$(curl -k --user ${apache_rp_user}:${apache_rp_password} https://${item}/${apache_rp_balancer_manager_uri} |grep nonce|tail -n 1|cut -f 3 -d '&'|cut -f 2 -d '='|cut -f 1 -d '"')&dw=Disable"
  with_items: ${groups.${apache_rp_group}}

- name: Waiting 20 seconds to be sure no traffic is being sent anymore to that worker backend node
  pause: seconds=20

The interesting bit is the with_items line : it will use the apache_rp_group variable to know which Apache servers are upstream (assuming you can have multiple nodes/clusters) and will run that command for every host in the list obtained from the inventory !
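
Small remark if you're reading this with a more recent Ansible : the ${...} notation used above is the old variable syntax. With the Jinja2 syntax that later became the standard, the same loop would look roughly like this (a sketch only, with the long curl command shortened to keep the focus on the loop itself) :

- name: Disabling the worker in Apache Reverse Proxies
  # iterate over every host of the group whose name is stored in apache_rp_group
  local_action: shell /usr/bin/curl -k --user {{ apache_rp_user }}:{{ apache_rp_password }} "https://{{ item }}/{{ apache_rp_balancer_manager_uri }}"
  with_items: "{{ groups[apache_rp_group] }}"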

We can now, in the "rolling-updates" playbook, just include the previous task file (assuming we saved it as ../tasks/apache-disable-worker.yml) :

---
- hosts: jboss-cluster
  serial: 1
  user: root

  tasks:
    - include: ../tasks/apache-disable-worker.yml
    - etc/etc ...
    - wait_for: port=8443 state=started
    - include: ../tasks/apache-enable-worker.yml

But wait ! As you've seen, we still need to declare some variables : let's do that in the inventory, under group_vars and host_vars !

group_vars/jboss-cluster :

# Apache reverse proxies settings
apache_rp_group: apache-group-1
apache_rp_user: my-admin-account
apache_rp_password: my-beautiful-pass
apache_rp_balancer_manager_uri: balancer-manager-hidden-and-redirected

host_vars/jboss-1 :

apache_rp_backend_url: 'https://jboss1.myinternal.domain.org:8443'
apache_rp_backend_cluster: nameofmyclusterdefinedinapache

Now when we use that playbook, we'll have a local action that will interact with the balancer manager to disable that backend node while we do maintenance.

I'll let you imagine (and create) a ../tasks/apache-enable-worker.yml file to enable it again (which you'll call at the end of your playbook).
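
As a starting point, here is what a minimal sketch of that file could look like, simply mirroring the disable task above with dw=Enable instead of dw=Disable (untested as-is, adapt to your setup) :

---
- name: Enabling the worker in Apache Reverse Proxies
  local_action: shell /usr/bin/curl -k --user ${apache_rp_user}:${apache_rp_password} "https://${item}/${apache_rp_balancer_manager_uri}?b=${apache_rp_backend_cluster}&w=${apache_rp_backend_url}&nonce=$(curl -k --user ${apache_rp_user}:${apache_rp_password} https://${item}/${apache_rp_balancer_manager_uri} |grep nonce|tail -n 1|cut -f 3 -d '&'|cut -f 2 -d '='|cut -f 1 -d '"')&dw=Enable"
  with_items: ${groups.${apache_rp_group}}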

Filed under: Ansible, CentOS, Cluster, Linux, SysAdmin
30 Oct 2012

Using Openssh as transport for Ansible instead of default paramiko

You've probably read that Ansible uses paramiko by default for the SSH connections to the host(s) you want to manage. But since 0.5 (quite some time ago now ...) Ansible can use the plain openssh binary as a transport. Why ? For a simple reason : you sometimes have complex scenarios, and you can for example declare a ProxyCommand in your ~/.ssh/config if you need to use a JumpHost to reach the real host you want to connect to. That's fine and I was using that for some of the hosts I have to manage (specifying -c ssh when calling ansible, but having switched to a bash alias containing that string and also -i /path/to/my/inventory for those hosts).

It's great but it can lead to strange results if you don't have a full look at what's happening in the background. Here is the situation I had just yesterday : one of the remote hosts is reachable, but not on the standard port (aka tcp/22), so an entry in my ~/.ssh/config was containing both HostName (for the known FQDN of the host I had to point to, not the host I wanted to reach) and Port.

Host myremotehost
    HostName my.public.name.or.the.one.from.the.bastion.with.iptables.rule
    Port 2222

With such an entry, I was able to just "ssh user@myremotehost" and was directly on the remote box. "ansible -c ssh -m ping myremotehost" was happy, but was in fact not reaching the host I thought : running "ansible -c ssh -m setup myremotehost -vvv" showed me that ansible_fqdn (one of the ansible facts) wasn't the correct one, but instead the one of the host in front of that machine (the one declared with HostName in ~/.ssh/config). The verbose mode showed me that even if you specify the Port in your ~/.ssh/config, ansible will *always* use port 22 :

<myremotehost> EXEC ['ssh', '-tt', '-q', '-o', 'AddressFamily=inet', '-o', 'ControlMaster=auto', '-o', 'ControlPath=/tmp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'Port=22', '-o', 'User=root', 'myremotehost', 'mkdir -p /var/tmp/ansible-1351603527.81-16435744643257 && echo /var/tmp/ansible-1351603527.81-16435744643257']

Hmm, quickly resolved : a quick discussion with people hanging in the #ansible IRC channel (on irc.freenode.net) explained the issue to me : Port is *never* looked up in your ~/.ssh/config, even when using -c ssh. The solution is to specify the port in your inventory file, as a variable for that host :

myremotehost ansible_ssh_port=9999

In the same vein, you can also use ansible_ssh_host, this one corresponding to the HostName of your ~/.ssh/config.
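
So, to completely replace the ~/.ssh/config entry shown above, the full inventory line for that host would be :

myremotehost ansible_ssh_host=my.public.name.or.the.one.from.the.bastion.with.iptables.rule ansible_ssh_port=2222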

Hope it can save you some time if you encounter the same "issue" one day ...

Filed under: Ansible, CentOS, SysAdmin
26 Oct 2012

Ansible as an alternative to puppet/chef/cfengine and others …

I already know that I'll be criticized for this post, but I don't care. Strangely my last blog post (which is *very* old ...) was about a puppet dashboard, so why speak about another tool ? Well, first I got a new job and some prerequisites have changed. I still like puppet (and I'd even want to be able to use puppet, but that's another story ...) but I was faced with some constraints when starting a new project. For that specific project, I had to configure a bunch of new Virtual Machines (RHEL6) coming as OVF files. Problem number one was that I couldn't alter or modify the base image, so I couldn't push packages (from the distro or third-party repositories). The second issue was that I couldn't install nor have a daemon/agent running on those machines. I had a look at the different config tools available but they all require either a daemon to be started, or at least extra packages to be installed on each managed node (so not possible to have puppetd nor puppetrun, or to invoke puppet directly through ssh, as puppet can't even be installed ; same for saltstack). That's why I decided to give Ansible a try. It had been on my "to-test" list for a long time, and it really seemed to fit the bill for that specific project and its constraints : using the already-in-place ssh authorization, no packages to be installed on the managed nodes, and last-but-not-least, a learning curve that is really gentle (compared to puppet and others, but that's my personal opinion/experience).

The other good thing with Ansible is that you can start very easily and then slowly add 'complexity' to your playbooks/tasks. I'm for example still using a flat inventory file, but already organized to reflect what we can do in the future (hostnames included in groups, themselves included in parent groups, aka nested groups). Same for variable inheritance : defined at the group level and down to the host level, host variables overwriting those defined at the group level, etc ...
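
To give you an idea, such a nested inventory looks like this (hostnames and group names being of course fictive here) :

[webservers]
web-1
web-2

[dbservers]
db-1

[production:children]
webservers
dbservers

[production:vars]
# this variable applies to every host of the parent group,
# unless overwritten at a child group or host level
ntp_server=ntp.production.internal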

The YAML syntax is really easy to understand, so you can quickly have your first playbook being played on a bunch of machines simultaneously (thanks to paramiko/parallel ssh). The number of modules is smaller than the number of puppet resources, but is quickly growing. I also just tested tying the execution of ansible-playbook to Jenkins, so that people not having access to the ansible inventory/playbooks/tasks (stored in a VCS, subversion in my case) can use it from a GUI. More to come on Ansible in the future ...

Filed under: Ansible, CentOS, Cluster, Linux, SysAdmin
