The Memory Leak

Friday, June 22, 2012

Cobbler Install on CentOS 6.2

Cobbler - not the kind you put peaches in, this is an automated install tool

Here's the quick and dirty to get it installed and the web interface working:

CentOS 6.2 install

Basic Server install option
as root, run "setenforce 0" to switch selinux to permissive mode (without this, selinux caused me many headaches with the "cobbler check" command later)
as root, run "vi /etc/selinux/config" and change the SELINUX=enforcing to SELINUX=permissive. This keeps it in permissive mode over reboots.
optional: set up a local user with wheel access, enable wheel sudo access, and set "PermitRootLogin without-password" in /etc/ssh/sshd_config
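
If you'd rather not open vi for a one-line change, the edit to /etc/selinux/config can be scripted. This is only a minimal sketch, assuming GNU sed (which CentOS ships). The function name set_selinux_permissive and the scratch file are made up for illustration, and the demo runs against a temporary copy rather than the live config.

```shell
# Hypothetical helper: rewrite the SELINUX= line of the given file to permissive.
# The path is an argument so the change can be tried on a copy first.
set_selinux_permissive() {
    conf="$1"
    # Only the line that starts with SELINUX= is touched;
    # SELINUXTYPE= and comment lines are left alone.
    sed -i 's/^SELINUX=.*/SELINUX=permissive/' "$conf"
}

# Demo on a scratch file that mimics the layout of /etc/selinux/config.
scratch=$(mktemp)
printf '# comment\nSELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$scratch"
set_selinux_permissive "$scratch"
cat "$scratch"
```

On the real box you would run set_selinux_permissive /etc/selinux/config as root, and getenforce will tell you what mode is active right now.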

add EPEL repo

point browser to: http://fedoraproject.org/wiki/EPEL
right-click the epel-release package link for your release, copy link
on CentOS system (I connect through putty and change to root at this point), run

rpm -ivh <SHIFT+INSERT> (last two keys will paste the link from step 2)

Install Cobbler

yum -y install cobbler cobbler-web koan policycoreutils-python
service cobblerd start
service httpd start
cobbler check

resolve all reported issues (I had about 10)

Configure Cobbler-Web

see cobbler-web wiki page, just remember to try http if https fails

I think you might be able to skip step 3.4 and do that after step 4 if you'd like to have the web gui, since it is available there, but I don't know if you can resolve all the issues from there.

Kudos to Mike DeHaan for a really helpful config checker; wish all software came with something like that.

Saturday, June 16, 2012

iSCSI Performance, round 2

So after turning on Jumbo frames (see my last post about this), I was able to get wonderful speed through the network, but I was having an issue with the storage server at this point; load averages were too high, and none of the RAM on the box was being used for caching.

In reading through the OpenFiler forums, I'd seen people referring to using iSCSI (a blockIO type technology) with fileIO transfer mode. This didn't make sense to me, but I decided to try it with a new storage system I'd brought online.

I'd already mapped the LUN on the new system in the same way as the old system: iSCSI, write-back, blockIO. Since there wasn't anything riding on this one, I just unmapped the LUN and remapped it with write-back/fileIO. VMware didn't bat an eyelash at it (I didn't take the iSCSI service offline) and was able to browse the datastore just fine. I then tested a fresh install of a system, since this is highly IO intensive.

Needless to say, I was very surprised to see the performance improvement. Read and write latencies are now in the single digits, and I had a sustained network transfer during the install of 233Mbps, or 23.3% of my 1GbE connection (info based on VMware's performance reporting). I also saw the memory on the OpenFiler system being used for caching, which was another win.
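
For anyone who wants to check the percentage math on their own numbers, here is a small sketch. It assumes a 1GbE link is counted as 1000 Mbps; the function names are hypothetical and only do the arithmetic.

```shell
# Percent of link capacity in use, given observed and nominal rates in Mbps.
link_utilization() {
    awk -v observed="$1" -v link="$2" 'BEGIN { printf "%.1f\n", observed / link * 100 }'
}

# The same rate expressed in megabytes per second (8 bits per byte).
mbps_to_MBps() {
    awk -v mbps="$1" 'BEGIN { printf "%.1f\n", mbps / 8 }'
}

link_utilization 233 1000   # prints 23.3
mbps_to_MBps 233            # prints 29.1
```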

I immediately shut down my other 9 VMs and flipped my other system to fileIO transfer mode. There was no data loss (again, VMware didn't even notice the change), and I brought up the systems, first two at the same time, and then all the rest at the same time. Latencies stayed in the single digits during the boot, and everything came up as if it was on dedicated hardware.

Also, the load averages on the OpenFiler system had dropped back to where they were before, but I noticed another problem... the cache was using all the RAM on the box.

My OpenFiler systems are DELL 2850s, and when I bought them, I'd only gotten them with 2GB of RAM each. Needless to say, I'm shopping for RAM right now =D.

(ps: I'm using BBU on the PERC cards in the Dells, and I have all my systems on a UPS as well).

So there you have it: iSCSI can be done cheaply and perform well enough to run your virtual infrastructure. In this case, I'm currently running 10 VMs on a DELL 2850 and a DELL 1950, and total cost to me to set this up was under $2K. More to come once I have more RAM =D

Friday, June 15, 2012

Ansible setup

Ansible - def. 1. super-luminal (aka, faster than light)
2. system management automation program on github you wished you were running

Ansible is set up to be very simple, and runs over ssh. Here are my notes from trying to get it installed and working on CentOS 6.2, using the "Running from Checkout" instructions found at http://ansible.github.com/gettingstarted.html, which gets you version 0.5. The RPM from EPEL provides version 0.3.

Here are my super quick instructions; the few issues I ran into are mentioned below:

start with CentOS 6.2
sudo su - root or su - root
install needed packages

# rpm -ivh http://mirror.pnl.gov/epel/6/i386/epel-release-6-7.noarch.rpm
# yum -y install python PyYAML python-jinja2 python-paramiko
# exit

add ansible

$ git clone git://github.com/ansible/ansible.git
$ cd ./ansible
$ source ./hacking/env-setup

configure hosts

$ echo "127.0.0.1" > ~/ansible_hosts
$ export ANSIBLE_HOSTS=~/ansible_hosts

and test:

$ ansible all -m ping -u dewey.garwood
127.0.0.1 | success >> {
"ping": "pong"
}
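
If the test complains before it ever gets to SSH, it's worth confirming that the inventory variable from the "configure hosts" step is still set in your current shell. The check below is a hypothetical helper of my own naming, not something Ansible provides; it only looks at the ANSIBLE_HOSTS variable and the file it points to.

```shell
# Hypothetical preflight check (not part of Ansible): confirm that
# ANSIBLE_HOSTS points at a readable, non-empty inventory file.
check_inventory() {
    if [ -z "${ANSIBLE_HOSTS:-}" ]; then
        echo "ANSIBLE_HOSTS is not set"
        return 1
    fi
    if [ ! -s "$ANSIBLE_HOSTS" ]; then
        echo "inventory $ANSIBLE_HOSTS is missing or empty"
        return 1
    fi
    # Count non-empty lines as a rough host count.
    echo "inventory ok: $(grep -c . "$ANSIBLE_HOSTS") host line(s)"
}

# Demo against a throwaway inventory file.
demo=$(mktemp)
echo "127.0.0.1" > "$demo"
export ANSIBLE_HOSTS="$demo"
check_inventory
```

Remember that the export only lasts for the current shell session, so a new putty window means setting ANSIBLE_HOSTS again.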

you should note the following errors will occur if you aren't paying attention:

if you go looking for paramiko, yum won't find it; you have to use python-paramiko
without the -u option in the test command (step 6), ansible tries to use the root user to log in and you end up with:

$ ansible all -m ping --ask-pass
SSH password:
127.0.0.1 | FAILED => FAILED: Authentication failed.

iSCSI performance

If you've read any of my other posts, you know I'm running OpenFiler as an iSCSI backend for VMware ESXi 4.1.

There are some issues with running it in this manner, and I hope to write out some more instructions later about setting up to use SCST rather than IETD. However, this is for anyone out there who might be trying to get better performance out of your iSCSI infrastructure... hopefully this will help you avoid my "doh!" moment.

If you haven't already done so, find a time to bring your environment down long enough to turn on jumbo frames on your switches. Your VMs and the customers who use them will thank you, not by saying anything, but by not complaining that the performance is really slow.

After having done so, my average write latencies have gone from triple digits to double digits, and my throughput has roughly doubled. Also, my OpenFiler system has gone from load averages that were around 1 to around 4 - 5 (4 is a full load for my system).
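
One way to confirm jumbo frames really work end to end is to ping with the don't-fragment bit set and a payload sized to fill the frame. The sketch below just works out that payload size and prints the Linux command instead of running it. It assumes an MTU of 9000, and 192.0.2.10 is a placeholder address, not one of my systems.

```shell
# Largest ICMP payload that fits in a single frame for a given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
jumbo_payload() {
    echo $(( $1 - 20 - 8 ))
}

size=$(jumbo_payload 9000)
# Print the test command rather than executing it; swap in your storage IP.
echo "ping -M do -s $size -c 3 192.0.2.10"
```

If the ping fails with the large payload but works with a normal one, something in the path is still at the default MTU. On the ESXi side, the equivalent check would be vmkping -d -s 8972 against the storage address.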

So here's a friendly reminder to avoid my face-palm moment X[ and get some decent performance out of your system =D

Tuesday, June 5, 2012

Minor format tweaks to blog

aka: how to make your background image stay put using CSS

Someone mentioned that it would be nice if the background would stay put on my blog, so it was always there, rather than just at the top.

Since I'm in the process of learning html and css, figured I would see if I could do something about that. Care to guess which CSS section I'm learning about right now? =-D

Before:
body {
background: #000000 url(<image_url_here>) repeat-x scroll top center /* Credit for photo here */;
}

After:
body {
background: #000000 url(<image_url_here>) repeat-x fixed top center /* Credit for photo here */;

}

I hope this makes the main blog a bit easier to read, and not seem like you're Lost in Space™ (weeeoooo!) when you scroll down.

I haven't been able to get the mobile working yet, so if you're looking at this on a too-smart-for-your-own-good phone and you know how to fix it, drop me a comment, please. Or be patient; I should be there in a few more chapters :)

Also, I want to take this time to highly recommend www.murach.com. They publish books that are excellent tools for learning technology, and are worth their weight in gold. You won't find a better book for getting up to speed on a topic quickly, provided that they have a book that covers what you're looking for.

So, just in case anyone from murach.com is reading this, a few topics I'd like to request:
Perl, Python, Apache Administration, and testing automation.

In the meantime, if you're interested in those topics, stay tuned, I'll probably end up with something to "leak".

Monday, November 14, 2011

Method for cooking down Pumpkin

Hello again. Been a while, been busy, and thought I'd write something unrelated to systems administration.

If you find this helpful, leave a comment. Thanks for reading!

One of the things that has always frustrated me around Halloween was throwing out the pumpkin that was carved just a night or two ago. If it's a reasonable size to carve, then you're talking about throwing out at least a couple of cans' worth of pumpkin each time.

If you've ever tried to cook pumpkin before, you know the amount of work involved, scraping out the insides (and getting seeds if you enjoy eating them), then cutting up the pumpkin to cook it, and trying to find a way to get the peel off without burning your fingers (I really hate that part), and then trying to turn it into puree. So after trying several different things, here's what my wife and I have come up with.

This method minimizes the amount of work you'll have to put in, as well as any burns you might receive from handling hot pumpkin.

Stats for one pumpkin (reasonable carving size)
Total cook time: 2 hours
Total prep time: 1.5 hours w/ seeds, 1 hour w/o seeds
Total seeds: 1 cup (approx)
Total pumpkin yield: 1 quart

Step one: Cut pumpkin in half, seed, and scrape out stringy insides.

yes, getting the seeds can be a bit of a slimy mess, but if you enjoy eating them like I do, it's worth it. For about 15 mins of work, you end up with about a cup of seeds per pumpkin, and they're easier to get out than sunflower seeds.

If you've carved your pumpkin, you've already gone through the process of cleaning out the inside, so just cut the pumpkin in half.

Step Two: Cut pumpkin into strips no more than 1" thick.

I find that holding the pumpkin with the outer shell towards you and pushing down on the handle end of the knife works well. I also use the largest knife we have when doing this work. Also, cutting a single strip that has the stem and the stub where the flower was (bottom) makes it easy to remove these.

Step Three: With a vegetable peeler, remove the outer shell.

When I finally thought to do this, I was surprised how easy it was. It's a bit more like peeling carrots than potatoes, and removes the shell quickly without much effort. You'll want the peeler at an angle, rather than the whole blade flat on the pumpkin, or it will be harder to get started; once started, it's pretty easy to get under the shell.

Step Four: In a 6 qt pot, put in 1/2 cup water (enough to cover the bottom about 1/4"), and place the pumpkin in. Cook covered for 1 hour over med-low heat. Pumpkin is cooked when it cuts easily with a fork.

This helps to remove the water. You'll start with 1/2 cup, but you might have to drain it a few times to avoid having it boil over. You may also want to cut it into smaller pieces to get it into the pot (4-6" strips).

Step Five: Pack the pumpkin in a blender, mashing out as much water as possible. Then, puree the pumpkin.

You can actually fit 1 whole pumpkin in a blender that holds a quart. It is preferable to have a blender that also has a dispenser on the bottom, since this is the easiest way to get the pureed pumpkin out. I use a potato masher to press the pumpkin in. Also, you'll want to get as much water out now as you can, before you puree the pumpkin.

Step Six: Cook puree uncovered over med-low heat to remove water, stirring occasionally, until it makes a paste about the consistency of semi-thick oatmeal. (about 1 hour)

Your pumpkin is now ready to use in recipes (pie, scones, oatmeal, cookies, butter, etc.)

Friday, July 29, 2011

VTP on Cisco Switches in a Small Company (aka: my network just drops)

Sorry it's been a while. Here's the most recent fun bang-head-here problem I was able to resolve.

Situation:

3 Cisco Switches in an office. 1x 3750, 2x 2960S

Every so often at random intervals, the network connections for all the clients would just vanish; connectivity through the main switch was fine (used Zenoss to monitor, only reported failure of the switches, and a printer beyond them), but couldn't get to any of the clients, and they couldn't use the network, let alone the internet.

Troubleshooting:
I tried everything I could think of to identify this problem. I checked spanning tree, checked logging to see if I could catch it (this was one of those really random problems, highly unpredictable), made sure the VLANs were set correctly, had Zenoss pulling snmp data for interface utilization % on the trunks, etc.

What I noticed was the following: graphs didn't show any vertical breaks, so the interfaces never went down, even though the network connections would drop. This meant the switch was up, and there was no problem with the physical wiring or with the power to the switches.

After I asked someone more knowledgeable than me, he pointed me in the direction of the VTP settings.

What I learned (they probably cover this in CCNA 101): VTP is a proprietary Cisco protocol used to simplify VLAN management on many, many switches (think triple digits or higher), allowing network admins to manage them all from one point. Makes sense; it cuts down the number of mistakes and the time to configure a switch fabric. My problem was that the three devices that were installed in the company had not been configured correctly, and since they were all non-configured when they were added, they all became servers. Apparently, they couldn't decide which switch was the authoritative switch, and when the switch designated as the true master would change, all the VLANs would be deleted off these switches, and then added back. Net result was the switches looked like they were going down. Highly unpredictable, highly annoying (to everyone).
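
To make the rule behind this a bit more concrete, here's a toy model in shell. It only captures the documented basics: a VTP server or client replaces its VLAN database when an advertisement from its own domain carries a higher configuration revision number, while a transparent switch keeps its own VLANs and just passes the advertisement along. The function name and the revision numbers are made up, and domain names and passwords are ignored, so treat it as a sketch rather than a description of exactly what my switches were doing.

```shell
# Toy model of how a switch reacts to a VTP advertisement from its own domain.
# Arguments: switch mode, local revision number, advertised revision number.
vtp_action() {
    mode="$1"; local_rev="$2"; advert_rev="$3"
    if [ "$mode" = "transparent" ]; then
        # Transparent switches do not sync; they only relay the advertisement.
        echo "forward only"
    elif [ "$advert_rev" -gt "$local_rev" ]; then
        # Servers and clients accept the higher revision and replace their VLANs.
        echo "overwrite local VLANs"
    else
        echo "keep local VLANs"
    fi
}

vtp_action server 3 7        # prints: overwrite local VLANs
vtp_action transparent 3 7   # prints: forward only
vtp_action server 7 3        # prints: keep local VLANs
```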

Resolution:

Set the switches to VTP transparent mode. The commands were really as simple as:

log in
config t
vtp mode transparent
end
write mem

some things to remember are to check your vtp status to see where you are on a given switch (show vtp status), and that you need to make sure you are not using vtp pruning when you make the change. The change does not prevent you from connecting to the switch (some reported a delay, but I didn't experience one), but if vtp pruning is in place, it can cause problems getting your clients to connect as you change switches in the environment. Since the environment I'm in is so small, I just set vtp transparent, since I could set the vlans on those switches, and they would still forward vtp packets.

info that I used included the following:
https://supportforums.cisco.com/thread/2029581 (be sure to read the whole forum thread)
http://www.cisco.com/en/US/tech/tk389/tk689/technologies_tech_note09186a0080094c52.shtml (main page about VTP configuration and what it is and does)
http://www.cisco.com/warp/public/473/vtp_flash/
(this really helped with my understanding of VLAN Trunking Protocol, VTP; the first problem discussed is exactly what I was facing, called Problem #1, of all things)

something else I learned (again, probably CCNA 101) was that a good protection technique on making changes where you might possibly lose connectivity to a switch is to start with the following before you make your change:

reload in {mmm|hhh:mm}
<make your change>
reload cancel (after change is complete)

This allows you to work, and if something happens so that you can no longer connect to the switch, it will reload with the saved config that worked before you started (as long as you haven't saved the change with write mem yet).

I'm sure there are more knowledgeable networking folks out there, but this was how I solved this problem for the time being. Simply putting this out there for anyone who could use it; like me when I run into this problem again. (=