A simple way to manage lots of system users in distributed environments

A few years ago I was working on the design of a large cluster of systems to perform some tasks (solving mathematical models, sharding databases…). From the systems point of view, I had to deal with a number of pesky problems. One of them was user management.

Since I had more than one hundred hosts, a number that could grow quickly, and a number of users who needed access to all of them, I needed to find a way to make user management easier. In my opinion, user management is a pain in the butt. If you have a central user directory, you have to deal with a big, fat single point of failure, so you need to build some kind of HA service for that directory. And if you have systems around the world, you need to replicate the user data across different directories and keep them synchronized. If that is not hell, it must be very similar.

Dealing with user management is a royal hassle for system administrators everywhere, but in the cloud (i.e. a number of hosts distributed around the world) it is also a punishment. So I needed to solve this problem, at least in part, before moving forward with my deployment. I did not really need full user management, just a basic UID mapping and a way to authenticate users (for which I could use the old and friendly authorized_keys).

So, how can I manage a large number of users in a uniform way that also works in a distributed environment? That is not a simple question, and of course each implementation has its own solution, from authentication services to suites of scripts. Anyway, I was looking for one that is simple to manage, because I was responsible for managing the entire environment, and I’m lazy too 😉

Thinking about the problem, I imagined a system without real users: let’s imagine that there is just one user, and every other user is just an alias for it. That would be easy to manage, because we only need one UID, but we still need to solve the alias mapping.
This is where libnss_map comes into play. libnss_map is a library designed to be used with the GNU NSS (Name Service Switch). NSS allows the system to get user credentials from many sources, which can be configured easily in the /etc/nsswitch.conf file.

For example, we can configure our system in the following way:

passwd:      files map
shadow:      files map
group:       files map

So, for each user lookup, NSS will search the standard files first and then fall back to the map module (libnss_map). The map module works as the flow diagram shows.

Flow diagram of how credential lookup works with libnss_map

As you can see in the diagram, there are two major steps in the lookup. The first one is responsible for mapping a user to a virtual one. The virtual user is static, and it is defined in /etc/nssmap.conf. This file has the same syntax as passwd. For example, if it defines a virtual user with UID 10000, any user who does not exist in /etc/passwd will be mapped to that one, with UID 10000.

Okay, sounds good, but there are still a lot of questions. What about the password? What about the home directory?

Well, I did not find a good solution for the password, so nssmap returns a masked password (the account is enabled, but the password is unpredictable), and I authenticate the user using other methods via PAM, or public keys via SSH.

The home directory is easier. The home directory field in the user definition (inside the /etc/nssmap.conf file) is used as a prefix, and it is completed with the user name (the name of the user who is trying to log in, not the virtual one). So, for example, for the hypothetical user “sample”, the effective home directory will be “/home/sample”, because “/home/” is the prefix. Please note that the trailing slash is mandatory in the current implementation.
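
The lookup described so far can be sketched in a few lines of Python. This is only an illustration of the logic, not the code of the library (libnss_map itself is an NSS module). The lookup function, the sample PASSWD table and every field of the VIRTUAL entry other than UID 10000 and the “/home/” prefix are assumptions made for the example, and so is the idea that the returned entry keeps the requested user name.

```python
from collections import namedtuple

# Fields follow the passwd(5) layout.
Entry = namedtuple("Entry", "name passwd uid gid gecos home shell")

# Hypothetical contents of /etc/passwd (the "files" source).
PASSWD = {
    "root": Entry("root", "x", 0, 0, "root", "/root", "/bin/bash"),
}

# Hypothetical virtual user, as it could be defined in /etc/nssmap.conf.
# Only UID 10000 and the "/home/" prefix come from the text above; the
# password field is a placeholder for the masked password.
VIRTUAL = Entry("virtual", "x", 10000, 10000, "virtual user",
                "/home/", "/bin/bash")


def lookup(name):
    """Resolve a user name the way 'passwd: files map' would."""
    # Step 1: users found in the standard files win.
    if name in PASSWD:
        return PASSWD[name]
    # Step 2: anyone else is mapped onto the virtual user. The home
    # field is a prefix that is completed with the requested name.
    return VIRTUAL._replace(name=name, home=VIRTUAL.home + name)


if __name__ == "__main__":
    print(lookup("root"))
    print(lookup("sample"))
```

Note how the prefix is simply concatenated with the user name, which is why the trailing slash matters.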

Finally, I needed to solve another big problem: if two users have the same UID, then both can change the same files, or delete the files of another “virtual” user. How can we solve it? There is no single answer, and no easy one after all. In my case I use a special shell, which ensures that the user cannot remove, touch or even read files in any path under /home except their own home directory, but it is not a full solution yet.

Here is an example using nss map:

host ~ # sudo su - test
No directory, logging in with HOME=/
test@host / $ id
uid=10000(virtual) gid=10000(virtual) groups=10000(virtual)

In the meantime, some basic code is available on my github, and I am still researching this kind of authorization. Keep in touch and enjoy! And of course, feedback is welcome 😀

New dtools and a bit more

Last week was crazy. I published a new release of dtools, 4.2, a new web site for the dtools project, and a couple of patches for version 5.0 of collectd.

Over the last months, dtools has become a useful tool for me. I use dtools every day for system administration in large distributed networks. So I decided to improve some functionality and also to test and retest the current features, and in a couple of months I expect to launch a new release of dtools.

In the meantime, I am still working on whistler, an XMPP bot for MUC rooms. I hope that the SleekXMPP library, which is the XMPP engine used by whistler, will make a release soon. At that point we will remove any dependency on the old xmpppy from the code and we will be ready to release version 2.0 of whistler.

Integer conversions in bash

Since version 2, bash supports simple arithmetic operations. Although bash is not a mathematical shell (use bc instead), you can perform certain conversions using bash arithmetic.

For example, you can remove the leading zeroes from a decimal number without requiring any external utility or print format. Let’s suppose that you want to strip the zeroes from the number 007, which is stored in the bond variable. The 10# prefix forces base 10, so the leading zeroes are not taken as an octal prefix.

$ echo $bond
007
$ let nozeros=10#$bond
$ echo $nozeros
7

In many forums and mailing lists, people resort to ugly sed expressions or awk invocations, but with bash it is just that simple 🙂

Using the same trick, you can perform base conversions, for example:

$ let i=0x10
$ echo $i
16
$ let i=2#10000
$ echo $i
16

Or create an easy number check (note that let returns a non-zero status when the result is 0, so this version rejects 0 itself):

$ is_decimal () { let i=10#$1 2>/dev/null; }
$ is_decimal 'a' || echo Nop
Nop
$ is_decimal 56 && echo 'Yep'
Yep


Whistler: a new Jabber MUC bot.

A few days ago I started a new project called whistler. Whistler is a bot written in python using the great xmpppy library, designed to work in XMPP networks (like jabber or GTalk). At first I tried to use the quinoa framework, which is very useful but has some issues for me; for example, you cannot set a different server configuration, which is a problem for GTalk accounts. So, after trying a number of frameworks, I decided to create my own. Probably not the best, but mine 🙂

Whistler is intended to manage the Connectical MUC room, and only basic functionality is provided. Obviously it is still under heavy development.

The code is publicly available in the whistler repository on github, and you can clone it as usual:

$ git clone git://

You need xmpppy and python >= 2.5 to work with whistler. In a few days I will publish the project on pypi too.

Enjoy and remember, any feedback is welcome 😉

Update dot files with git

For years I used custom scripts to keep my dot files updated. I had a local repository in bazaar and a script which checked the differences between the dot files in my home and the files stored in the repository. This solution worked fine for years, but now I want to make some changes…

The first one is moving my dot files to git (and probably pushing them to github), and the second one is creating a git hook to update my dot files. I know that there are a lot of similar solutions, some more complex, others easier, but this one is mine 🙂

So, I created a post-commit hook script for git, which performs the modifications that I need. Now I just follow these steps:

1. Create a new git repo:

mkdir mydots_repo
cd mydots_repo && git init

2. Put the hook:

wget -O .git/hooks/post-commit
chmod 755 .git/hooks/post-commit

Or just put this content into the post-commit hook:

#! /bin/bash
# (c) 2010 Andres J. Diaz <>
# A hook to git-commit(1) to update the home dot files links using this
# repository as base.
# To enable this hook, rename this file to "post-commit".

for dot in "$PWD"/*; do
  # Assumption: the link in $HOME is the file name prefixed with a dot
  # (this assignment was missing from the listing).
  home_dot="$HOME/.${dot##*/}"

  if [ -L "${home_dot}" ]; then
    # Already a symlink: keep it if it points here, else recreate it.
    if [ "${home_dot}" -ef "$dot" ]; then
      echo "[skip] ${home_dot}: is already updated"
    else
      rm -f "${home_dot}" && \
      ln -s "$dot" "${home_dot}" && \
      echo "[done] updated link: ${home_dot}"
    fi
  elif [ -r "${home_dot}" ]; then
    # A real file exists: never overwrite it.
    echo "[keep] ${home_dot}: is regular file"
  else
    ln -s "$dot" "${home_dot}" && \
    echo "[done] updated link: ${home_dot}"
  fi
done

3. Copy old files:

cp ~/old/bzr_repo/* .
git add *

4. Commit and recreate links:

git commit -a -m'initial import'

And it works 🙂

Moving to github

For the past week we have been moving the Connectical servers from their old location in the Virpus datacenter in Texas to our own managed infrastructure, built on top of a GuruPlugs cluster.

We are now discussing how to distribute the infrastructure and how to keep a number of copies in remote locations up to date. We are exploring solutions like elliptics or something similar.

In the meantime I created my github account to keep my projects under development, and also to have a backup of some projects that I really use every day.


New version of dtools

Today I released a new version of dtools. Distributed tools, aka dtools, is a project written in bash that provides a suite of programs for running different UNIX commands in parallel on a list of tagged hosts.


  • Fully written in bash, no third party software required (except ssh, obviously).
  • Based on a modular architecture, easy to extend.
  • Full integration with ssh.
  • Easy to group hosts by tags or search by regular expression.
  • Management of ssh hosts.
  • Parseable but human-readable output.
  • Designed with system administrators in mind, no special development skills required to extend the software.

Short Example

$ dt tag:linux ssh date
okay::dt:ssh:myhostlinux1.domain:Mon Nov 16 23:54:04 CET 2009
okay::dt:ssh:myhostlinux3.domain:Mon Nov 16 23:54:04 CET 2009
okay::dt:ssh:myhostlinux2.domain:Mon Nov 16 23:54:04 CET 2009
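
Since the output is parseable, a script can consume it easily. Here is a minimal sketch in Python, assuming that every line has the colon-separated layout shown in the example above. The parse_line helper and the field names (status, extra, command, module, host, output) are names I made up for this illustration; they are not part of dtools, and the meaning of the empty second field is not covered here.

```python
from collections import namedtuple

# Hypothetical field names for one line of dt output.
Result = namedtuple("Result", "status extra command module host output")


def parse_line(line):
    """Split one dt output line into its fields."""
    # Split on the first five colons only, so colons inside the
    # command output (like the time printed by date) are preserved.
    fields = line.rstrip("\n").split(":", 5)
    if len(fields) != 6:
        raise ValueError("unexpected line format: %r" % line)
    return Result(*fields)


if __name__ == "__main__":
    sample = "okay::dt:ssh:myhostlinux1.domain:Mon Nov 16 23:54:04 CET 2009"
    result = parse_line(sample)
    print(result.host)
    print(result.output)
```

Limiting the number of splits is the important design choice here, because the command output itself may contain colons.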

As usual, you can download the code from the project page or, if you wish, get it via git:

git clone git://


New htop color scheme

For a couple of weeks I have been using htop at work to get a quick view of the system status. htop is an interactive process viewer for Linux, similar to the classic UNIX top but with some enhancements, for example a more configurable view, integration with the strace and lsof programs, and much more.

But (and it’s a big “but” for me) I really dislike the color scheme that it uses by default. htop comes with five color schemes, but I could not find a beautiful one (from my personal point of view, of course), so I decided to make a new scheme. I called it the “blueweb” theme (don’t ask) ;). And here is the result:

htop with blueweb theme

You can download the patch file for the htop source code. And yes, unfortunately you need to patch the code.

Now my htop looks nice 🙂


Python module to handle runit services

Last month I needed to install runit on some servers to supervise a couple of services. Unfortunately my management interface could not handle those services anymore, so I decided to write a small python module to solve this handicap, and this is the result!

With this module you can handle a number of runit scripts from python. I think that it might work for daemontools too, but I have not tested it yet. Let’s see an example 😀

>>> import supervise
>>> s = supervise.Service("/var/service/httpd")
>>> print s.status()
{'action': None, 'status': 0, 'uptime': 300L, 'pid': None}
>>> if s.status()['status'] == supervise.STATUS_DOWN: print "service down"
service down
>>> s.start()
>>> if s.status()['status'] == supervise.STATUS_UP: print "service up"
service up

Personally I use this module with the rpyc library to remotely manage the services running on a host, but it is also easy to build a web interface, for example using bottle:

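import supervise
import simplejson
from bottle import route, run

# The route paths are assumed by analogy with the /service/down/ URL
# mentioned below; the decorators were missing from the listing.
@route('/service/status/:name')
def service_status(name):
    """ Return a json with service status """
    return simplejson.dumps(supervise.Service("/var/service/" + name).status())

@route('/service/up/:name')
def service_up(name):
    """ Start the service and return OK """
    c = supervise.Service("/var/service/" + name)
    c.start()
    return "OK UP"

@route('/service/down/:name')
def service_down(name):
    """ Stop the service and return OK """
    c = supervise.Service("/var/service/" + name)
    c.down()  # assumed method name; check the module for the real stop call
    return "OK DOWN"

from bottle import PasteServer
# Host and port are left at the bottle defaults; adjust them to match
# the URL that you want to use.
run(server=PasteServer)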

Now you can stop a service just by pointing your browser to http://localhost/service/down/httpd (to bring down the httpd service in this case).