...

Tuesday, April 5, 2011

Django + Redis (master/slave) = Pseudo distributed event based caching

This article focuses on the following topology for using Redis as the cache backend for Django, where sets and deletes are sent to a master Redis server while each cluster member, or set of members, also has a local or dedicated Redis server.

  • 'Cluster Slave 1' <- 'Cluster Master 1'
  • 'Cluster Slave 2' <- 'Cluster Master 1'
  • 'Cluster Slave 3' <- 'Cluster Master 1'
  • 'Django Node 1' uses 'Cluster Slave 1' as 'default' and Cluster Master 1 as 'master'
  • 'Django Node 2' uses 'Cluster Slave 2' as 'default' and Cluster Master 1 as 'master'
  • 'Django Node 3' uses 'Cluster Slave 3' as 'default' and Cluster Master 1 as 'master'
All updates from 'Cluster Master 1' are sent to all slaves as quickly as possible; replication is neither atomic nor verified. You should therefore give slave cache keys appropriate timeouts so they expire on their own as a backup.

You should note that using this method does not necessarily change what you should expect of a standard memory cache:
  1. Store/Fetch/Invalidate Keys
  2. Keys can be deleted at any time
  3. The application is ultimately responsible for understanding cache behaviour
  4. Caches in a cluster may hold different values for a key
Number 4 is really only ever seen in a replicated or isolated environment where the backend is a database slave, a database in a master/master relationship, Redis, or when using many different local caches per cluster. In the setup I am testing there is a single local slave at '10.10.0.41' and a single master at '10.10.0.40'. The Redis servers are already configured; here is the Django cache configuration.
CACHES = {
'default': {'BACKEND': 'redis_cache.RedisCache', 'LOCATION': '10.10.0.41:6379',},
'master': {'BACKEND': 'redis_cache.RedisCache', 'LOCATION': '10.10.0.40:6379',},
}
And let's do a little testing:
>>> from django.core.cache import get_cache
>>> slave = get_cache('default')
>>> master = get_cache('master')
>>> master.set('hi','there')
True
>>> slave.get('hi')
'there'
Huzzah.. it replicated!
>>> master.delete('hi')
>>> slave.get('hi')
Huzzah.. it deleted!
>>> master.set('hi','there')
True
>>> slave.set('hi','pony')
True
>>> master.get('hi')
'there'
>>> slave.get('hi')
'pony'
Huzzah.. it can be different!
>>> master.delete('hi')
>>> master.get('hi')
>>> slave.get('hi')
Huzzah.. even though there were differences I can delete the key!

What this means to me is that I can use the master for all changes and isolate all lookups to the slave server, and if I'm feeling frisky I can even update the slave server directly. Keep in mind that with Redis and the Django-Redis-Cache module you can specify a different namespace, or rather database ID, making the following possible.
CACHES = {
'default': { 'BACKEND': 'redis_cache.RedisCache', 'LOCATION': '10.10.0.41:6379', 'OPTIONS': { 'DB': 1, }, },
'master': { 'BACKEND': 'redis_cache.RedisCache', 'LOCATION': '10.10.0.40:6379', 'OPTIONS': { 'DB': 1, }, },
'local': { 'BACKEND': 'redis_cache.RedisCache', 'LOCATION': '10.10.0.41:6379', 'OPTIONS': { 'DB': 2, }, },
}
Huzzah.

The idea behind all of this is to allow the master to fail while the slave continues operating. In a hash-based distribution that is possible; however, whenever a cache member leaves there is no persistence for what was once cached without hacking the distributor to deal with n+1 replication of events. Once a hash member is alive again, all keys that were reallocated to a different member are effectively invalidated, simply because the distributor won't think to look there any more, unless the same member fails again, at which point you are seeing old data that you couldn't possibly expire using 'delete' since you didn't know it existed. This makes deleting data in a hash-based distribution very clumsy.
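
To make the stale-data problem concrete, here is a small Python sketch of the scenario above. It is only an illustration under assumptions: the three members are plain dicts named 'a', 'b' and 'c', and `pick` is a hypothetical modulo-on-SHA-1 distributor, not the one used by any particular cache client.

```python
import hashlib

def pick(key, members):
    """Hypothetical distributor: hash the key and pick one of the live members."""
    digest = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return members[digest % len(members)]

stores = {"a": {}, "b": {}, "c": {}}   # three in-memory cache members
everyone = ["a", "b", "c"]

home = pick("hi", everyone)                      # where 'hi' normally lives
survivors = [m for m in everyone if m != home]   # live members while home is down

# 1. The home member is down, so the key is written to a fallback member.
fallback = pick("hi", survivors)
stores[fallback]["hi"] = "there"

# 2. The home member returns. Sets and deletes are routed to it again,
#    so the copy on the fallback member is never touched.
stores[pick("hi", everyone)]["hi"] = "pony"
stores[pick("hi", everyone)].pop("hi", None)

# 3. The home member fails a second time. Lookups land on the fallback
#    member again, which still holds the value we believed was deleted.
stale = stores[pick("hi", survivors)].get("hi")
print(stale)  # prints the old value 'there'
```

The delete in step 2 only ever reaches the member the distributor currently points at, which is why the old value resurfaces in step 3.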

The above scenario works best if you set a key on the slave first and then send it to the master. This gives the Django cluster node immediate access to data that it is attempting to persist in memory between requests, on what is assumed to be the most appropriate node (given an intelligent entry distribution for the request/requestor pairs). Now the master can fail and come back up without too much of a problem. You can take it one step further and store a local set of keys (command queues would be a lot of trouble, just use keys) for master updates that never made it, to be replayed when the master comes back online and needs to be updated. Redis can be set up using heartbeat and DRBD to mitigate the downtime of a Redis master, and it also supports master/master mode by way of two servers slaving each other. Redis can store the database to disk, which is great. You can write a quick program to replay the master cache to a new slave if you're using dynamic nodes, since Redis supports key lookups. You can even have the slave store values only in memory while the master uses the disk as well. Lots of options.
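
A rough Python sketch of the "slave first, then master, remember what never made it" idea might look like the following. Everything here is an assumption for illustration: `DictCache` is a stand-in for a Django cache backend (in a real project these would be the 'default' and 'master' caches), `SlaveFirstCache` and `flush_pending` are hypothetical names, and a master outage is simulated by raising `ConnectionError`.

```python
class DictCache:
    """In-memory stand-in for a cache backend with set/get/delete."""
    def __init__(self):
        self.data = {}
        self.up = True   # flip to False to simulate an unreachable server

    def _check(self):
        if not self.up:
            raise ConnectionError("cache unreachable")

    def set(self, key, value):
        self._check()
        self.data[key] = value
        return True

    def get(self, key, default=None):
        self._check()
        return self.data.get(key, default)

    def delete(self, key):
        self._check()
        self.data.pop(key, None)


class SlaveFirstCache:
    """Write to the local slave first, then to the master."""
    def __init__(self, local, master):
        self.local = local
        self.master = master
        self.pending = set()   # keys whose master update never made it

    def set(self, key, value):
        self.local.set(key, value)       # this node sees the value immediately
        try:
            self.master.set(key, value)  # the master fans it out to other slaves
            self.pending.discard(key)
        except ConnectionError:
            self.pending.add(key)        # just remember the key, not a command queue

    def flush_pending(self):
        """Replay remembered keys to the master; return True when none remain."""
        for key in list(self.pending):
            value = self.local.get(key)
            try:
                if value is None:
                    self.master.delete(key)   # key expired locally in the meantime
                else:
                    self.master.set(key, value)
            except ConnectionError:
                return False                  # master still down, try again later
            self.pending.discard(key)
        return True


local, master = DictCache(), DictCache()
cache = SlaveFirstCache(local, master)
cache.set("hi", "there")      # reaches both servers
master.up = False
cache.set("hi", "pony")       # master is down, the key is remembered
master.up = True
flushed = cache.flush_pending()
```

Only the key names are stored, so whatever value the slave holds at replay time is what the master receives. That is a design choice of this sketch, not something the setup above requires.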

Friday, December 24, 2010

Initial thoughts on "Thats So Meta" Filesystem development

What should other filesystems offer that TSMFS doesn't need to offer?
point in time recovery? (BTRFS/ZFS)
redundancy? (GLUSTERFS/...)

Initial functionality that I would like to offer.


  • files as directories system structure for filename nodes containing links to the content nodes and all specific file header data
  • hi/lo hash storage for content - hi being a padded hex digest out to 8 bytes that is "just there" in case of collisions and lo being the sha1 content (not header) data
  • generic support for fuse and http/webdav
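
One possible reading of the hi/lo idea, sketched in Python. This is a guess at the design rather than a spec: `content_key` and `collision_index` are hypothetical names, and treating "hi" as a zero-padded 8-byte counter that only changes on a collision is an assumption.

```python
import hashlib

def content_key(content, collision_index=0):
    """Return a hypothetical 'hi/lo' storage key for a blob of content."""
    # lo: SHA-1 of the content bytes only (no file header data).
    lo = hashlib.sha1(content).hexdigest()
    # hi: assumed to be a counter padded to 8 bytes (16 hex digits) that
    # stays at zero unless two different blobs ever share the same lo.
    hi = format(collision_index, "016x")
    return hi + "/" + lo

key = content_key(b"hello world")
print(key)
```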
hmm...

Wednesday, December 15, 2010

Initial work on signature line detection.

Image banding mass.. that's right.. that's what I'm gonna call it.

Take a signature... normalize it (white background, black foreground).. shrink it to 1px wide using your favorite quadratic or bicubic or sinc filter and expand it again for visual effect.

Normalize what you get and create two thresholds: one that determines the major mass and one that determines the middle mass. I chose to use 192 and 128; since the banding is normalized, that should work relatively well.

Now visually inspect the two and you should see that each layer has one major band. The signature line lands pretty darn close to the gap between the bottoms of the two major bands.
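
Here is a minimal pure-Python sketch of the banding steps, under a few assumptions: the image is a list of rows of 0-255 gray values (white background, black ink), the "shrink to 1px wide" step is approximated by a plain row average instead of a bicubic or sinc filter, and the "major band" of a layer is taken to be the longest run of rows darker than the threshold. The function names are made up for this example.

```python
def row_profile(image):
    """Collapse each row to its mean gray level (box filter, 1px-wide image)."""
    return [sum(row) / len(row) for row in image]

def normalize(values):
    """Stretch values so the darkest row is 0 and the lightest is 255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [255.0] * len(values)
    return [(v - lo) * 255.0 / (hi - lo) for v in values]

def major_band(profile, threshold):
    """Return (top, bottom) of the longest run of rows darker than threshold."""
    best = None
    start = None
    for y, value in enumerate(profile + [255.0]):   # sentinel closes a trailing run
        if value < threshold:
            if start is None:
                start = y
        elif start is not None:
            if best is None or (y - start) > (best[1] - best[0] + 1):
                best = (start, y - 1)
            start = None
    return best

def band_bottoms(image, outer=192, inner=128):
    """Return the bottom row of the major band for each threshold."""
    profile = normalize(row_profile(image))
    outer_band = major_band(profile, outer)
    inner_band = major_band(profile, inner)
    return (outer_band[1] if outer_band else None,
            inner_band[1] if inner_band else None)

# Tiny synthetic "signature": light ink, three dense rows, then lighter ink.
rows = [255, 255, 190, 40, 40, 40, 160, 255, 255, 255]
image = [[gray] * 4 for gray in rows]
bottoms = band_bottoms(image)
print(bottoms)  # (6, 5)
```

In this toy image the two bottoms are rows 6 and 5, so the estimated signature line sits between them.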

I've attached some images to the post so you can wonder about my sanity. Each one is a composite of the signature signed against a common line, that line, and the banding groups. It's important to think about each image without the signature line in place. The signature can be anywhere in the image and you will still find an approximate signature line y offset.


Saturday, October 3, 2009

Using OpenVPN and bridges to create a seamless wired/wireless transition

What?

The reason behind this setup is that the 802.11 wireless standards use a different L2 MAC layer protocol than the 802.3 wired ethernet standards, which results in two networks with completely incompatible frames. However, they can transport the same Internet Protocol (IP) easily enough. It is these L2 differences that make 802.1D bridging difficult between wired and wireless devices. This is a huge gotcha for most newbs who want to set up a simple bridge to use their laptops or routers as a client on another wireless network and offer a transparent bridge at the ethernet level. Normally that sort of function is handled by a dynamic Proxy-ARP agent with very specific instructions.

I care because I'm a network geek who likes to unplug his laptop and not have a nasty wireless re-association event happen that eventually changes my IP or, worse, my subnet. This Mini-HOWTO explains how to assign a single virtual MAC address to a laptop or mobile device and put it on the same ethernet segment via two virtual switches, or bridges, that the physical and VPN devices will use, allowing prioritization of one or the other depending on link state, typically the wired interface. I am also not taking into account the evil of any auto-network daemons like NetworkManager, wicd, ifplugd, etc. You can deal with all of those easily enough by using VLANs for static subnets between wireless clients and the router, allowing wpa/wep/dhcp to occur on the base wireless device without changing the VLAN settings. Very handy.

So here's the spinach. I have the following running on subnet 10.42.3.0/24. To avoid using the same subnet for the bridge device that the wireless device would use to set up an OpenVPN connection, I have set aside 10.255.255.0/30:

  1. A Linux based router (Aenema/Debian 5.0) @ 10.42.3.1/10.255.255.1
  2. A Linux based laptop (Lateralus/Debian 5.0) @ dynamic/10.255.255.2
  3. A Wireless access point (WAP/Pretty standard) @ dynamic
At my house I have the WAP on a different VLAN serving out DHCP for a completely different subnet. This helps me reduce broadcast traffic from my wired clients and, via some handy iptables rules, keeps multicast from bleeding over to my wireless network. There are several methods of handling the following, including using the bonding driver or dummy devices and dynamic routing protocols. I will be covering the use of bridging and Spanning Tree Protocol to achieve a seamless transition from wired to wireless on a mobile device.
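
If the /30 arithmetic is not obvious, Python's standard `ipaddress` module can confirm that the link subnet set aside above has exactly two usable hosts and does not overlap the main LAN. This is just a sanity check of the addressing, not part of the setup itself.

```python
import ipaddress

link = ipaddress.ip_network("10.255.255.0/30")   # point-to-point link for OpenVPN
lan = ipaddress.ip_network("10.42.3.0/24")       # the bridged LAN

hosts = [str(host) for host in link.hosts()]
print(hosts)                     # ['10.255.255.1', '10.255.255.2']
print(link.netmask)              # 255.255.255.252
print(link.broadcast_address)    # 10.255.255.3
print(link.overlaps(lan))        # False
```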

First and foremost, the router:

Install OpenVPN and bridge-utils then copy the following:

/etc/network/interfaces
...
auto br0
iface br0 inet static
pre-up /usr/sbin/openvpn --mktun --dev tap0 --dev-type tap
bridge_ports eth0 tap0
bridge_fd 2
bridge_stp on
bridge_pathcost eth0 50
bridge_maxage 5
address 10.42.3.1
netmask 255.255.255.0
broadcast 10.42.3.255

auto br0:0
iface br0:0 inet static
address 10.255.255.1
netmask 255.255.255.252
broadcast 10.255.255.3
...

/etc/openvpn/laptop0.conf

port 4242
local 10.255.255.1
remote 10.255.255.2
dev-type tap
dev tap0
keepalive 10 60

Go ahead and bring up the br0 interface and start OpenVPN with the laptop0.conf.

Now on your laptop set up something similar:

/etc/network/interfaces

...
auto br0
iface br0 inet dhcp
pre-up /usr/sbin/openvpn --mktun --dev tap0 --dev-type tap
bridge_ports eth0 tap0
bridge_fd 2
bridge_stp on
bridge_pathcost eth0 50
bridge_maxage 5

auto wlan0
iface wlan0 inet static
wpa-driver wext
wpa-ssid sexypants
wpa-psk 29413cd1e6f5a9259a697303db905308d9c3702d856f14ebe56d50abc5515f7c
address 10.42.3.56
netmask 255.255.255.0

auto wlan0:0
iface wlan0:0 inet static
address 10.255.255.2
netmask 255.255.255.252
broadcast 10.255.255.3
...

/etc/openvpn/laptop0.conf

port 4242
local 10.255.255.2
remote 10.255.255.1
dev-type tap
dev tap0
keepalive 10 60

Bring up the interfaces and start OpenVPN with the laptop0.conf. The bridge should be active and you should, SHOULD, have just received a DHCP response and IP from your router. With any luck you can also ping 10.255.255.1. If so, then your VPN should be up as well. Check your bridge with 'brctl showstp br0'. You'll see that eth0 has the lowest cost (first priority) for all packets within this little network loop that we've set up. Without Spanning Tree Protocol (STP) we would have formed a switch-to-switch double crossover scenario, which would eventually destroy the CPU in both machines (an exaggeration), though it has been known to fry a few improperly cooled switches.

If it appears as though tap0 is connected then feel free to cast off your wired network and watch the bridge turn off eth0 as an active port and defer to tap0. Huzzah. I can now unplug my laptop from my wired network or docking station and walk into the bathroom and keep typing from the throne as if nothing changed.
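
As a toy model of that failover decision (not an implementation of STP), think of the bridge as forwarding on the cheapest port whose link is up. In this Python sketch `active_port` is a made-up helper, the cost of 50 for eth0 comes from the config above, and the cost of 100 for tap0 is an assumption.

```python
def active_port(ports):
    """Pick the cheapest port whose link is up, or None if all are down."""
    candidates = [port for port in ports if port["link_up"]]
    if not candidates:
        return None
    return min(candidates, key=lambda port: port["cost"])["name"]

ports = [
    {"name": "eth0", "cost": 50, "link_up": True},    # bridge_pathcost eth0 50
    {"name": "tap0", "cost": 100, "link_up": True},   # assumed cost for the VPN tap
]

wired = active_port(ports)       # cable plugged in
ports[0]["link_up"] = False      # unplug the cable
wireless = active_port(ports)    # traffic moves to the VPN tap
print(wired, wireless)           # eth0 tap0
```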

Tuesday, September 29, 2009

Exploiting /etc/debian_chroot

I prefer my naming conventions to be as follows for most of my work machines.

  • gw01.site01.domain.com
  • gw01.site02.domain.com
  • wks01.data.site01.domain.com
  • lpr01.data.site02.domain.com
  • sip01.voip.site01.domain.com
This can get pretty confusing. Logging into 'gw01' at both 'site01' and 'site02' means the default shell in Debian, bash, has a prompt that looks like this at both sites:
  • sysadmin@gw01:~$
Previously I was doing something somewhat out-of-the-box, it seems: naming the hosts `gw01.site01` and `gw01.site02` and changing the `\h` (show hostname up to the first dot) in the bash preference files to `\H` (show entire hostname). That gave me an initial headache when installing a new system, and some programs and DNS utilities failed to honor anything after the first dot of a hostname. Back to the drawing board.

Looking at `/etc/bash.bashrc` and the user bash preference files in `/etc/skel`, I saw that if `/etc/debian_chroot` was available its contents were prepended to the prompt. This is an easy solution to my problem: `echo site01 > /etc/debian_chroot`. Now my prompts look like this:
  • (site01)sysadmin@gw01:~$
Woot.
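
For the curious, here is a small Python model of what the prompt logic does with that file. It only mimics the behaviour described above: `build_prompt` is a hypothetical function, and the demo writes to a temporary file rather than the real `/etc/debian_chroot`.

```python
import os
import tempfile

def build_prompt(user, hostname, cwd, chroot_file="/etc/debian_chroot"):
    """Model of the prompt: optional '(label)' prefix, then user@shorthost:cwd$."""
    label = ""
    if os.path.isfile(chroot_file) and os.access(chroot_file, os.R_OK):
        with open(chroot_file) as handle:
            label = handle.read().rstrip("\n")
    prefix = "(" + label + ")" if label else ""
    short_host = hostname.split(".")[0]   # what \h shows: up to the first dot
    return prefix + user + "@" + short_host + ":" + cwd + "$ "

with tempfile.TemporaryDirectory() as tmp:
    demo_file = os.path.join(tmp, "debian_chroot")
    with open(demo_file, "w") as handle:
        handle.write("site01\n")
    prompt = build_prompt("sysadmin", "gw01.site01.domain.com", "~", demo_file)

print(prompt)  # (site01)sysadmin@gw01:~$
```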