jonEbird

September 16, 2009

Server Death

Filed under: adminstration,blogging,linux — jonEbird @ 7:41 pm

I often joke that the only people who read my weblog are bots, so it shouldn’t bother me if my site is down, but it does. Last week the server, which was also doubling as a workstation for the Wife, died. “The computer is not working”, the Wife explained. I didn’t check it out immediately, as I just assumed that X had crashed or that something else was preventing her from using Firefox. Like I said, I’m not overly concerned with my site’s uptime.

But when I finally did check it out, sure enough, it was not looking good. Absolutely no display on the monitor. Considering I had replaced my video card not too long ago and I could no longer ssh into the machine, I’m thinking either the CPU or the motherboard (or both) is dead.

[Photo: Hercules taking a look at the dead PC]

After Hercules and I surveyed the situation, we decided to pull the sheet over its head. It’s had a nice long life (in PC years) since 2004.

I headed to Micro Center today to check out what kind of motherboards, CPUs and even memory they had on sale. If you consider that my last machine was running with only 756M of memory and an aging 2GHz AMD processor on an Abit KV8 motherboard, while happily serving my website and handling the Wife’s Facebook usage, then you can understand I was looking for the smallest, cheapest solution I could find. That solution was looking to be somewhere around $225.

Not willing to rush into a $200+ investment, I instead bought an IDE enclosure, capable of serving my data via USB, for a mere $21.

Now for the Restoration of my Website

I really shouldn’t even be talking about this. I should have had regular MySQL dumps, along with the full web content, backed up to another machine. Aside from a laptop, the other “real” PC in the house is an Acer I bought as a media machine, which sits in my entertainment center. It was never intended to be running 24×7, so I only did on-demand backups of my important files, which were actually outside of my website. Another justification for not having regular backups was that I had two internal Seagate drives configured in a software mirror. I always figured if I had some sort of hardware problem, I’d be able to replace it and in the worst case never really lose my data.
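For the record, here is a minimal sketch of the kind of nightly job I should have had. The paths and the ‘mediapc’ hostname are hypothetical, and MySQL credentials are assumed to live in ~/.my.cnf:

#!/bin/env python
# Nightly WordPress backup sketch. Paths and the 'mediapc' host are
# hypothetical; mysqldump reads credentials from ~/.my.cnf.
import subprocess, time

dump = '/var/backups/wordpress-%s.mysqldump' % time.strftime('%Y%m%d')

# dump the database to a dated file
out = open(dump, 'w')
subprocess.check_call(['mysqldump', 'wordpress'], stdout=out)
out.close()

# mirror the dumps and the web content to another machine
subprocess.check_call(['rsync', '-a', '/var/backups/', 'mediapc:/backups/db/'])
subprocess.check_call(['rsync', '-a', '/var/www/html/', 'mediapc:/backups/html/'])

Dropped into /etc/cron.daily/, that would have made this whole exercise unnecessary.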

So I have my hard drive and am now looking to get my WordPress site back online with the PC in the living room. After plugging in the hard drive, I need to activate the MD device and mount up my filesystem:

[jon@pc ~]$ sudo mdadm --assemble --scan
mdadm: /dev/md/0_0 has been started with 1 drive (out of 2).
[jon@pc ~]$ cat /proc/mdstat
Personalities : [raid1]
md127 : active raid1 sdd1[1]
      241665664 blocks [2/1] [_U]

unused devices: <none>
[jon@pc ~]$ sudo mount /dev/md127 /mnt

My two machines were two Fedora releases apart. I wondered if I could do a chroot, start up MySQL and get a fresh, clean dump of the database…

[jon@pc ~]$ sudo su -
[root@pc ~]# chroot /mnt
[root@pc /]# ls
bin  boot  dev  etc  home  lib  lib64  lost+found  media  mnt
opt  proc  root  sbin  selinux  srv  sys  tmp  usr  var

[root@pc ~]# mount -t proc none /proc
[root@pc ~]# /etc/init.d/mysqld status
mysqld dead but subsys locked
[root@pc ~]# /etc/init.d/mysqld restart
Stopping MySQL:                                            [  OK  ]
Starting MySQL:                                            [  OK  ]
[root@pc ~]# /etc/init.d/mysqld status
mysqld (pid 9394) is running...
[root@pc ~]# mysqldump -u root -p wordpress > wordpress.mysqldump
Enter password:
[root@pc ~]# wc -l wordpress.mysqldump
354 wordpress.mysqldump

Cool!

The rest of the migration involved an rsync of the /var/www/html/ content, adjustments to the default Apache config, granting my WordPress user access to the database, and finally updating my router to direct requests for port 80 to my media PC.
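In rough strokes, the copy and database steps looked something like the sketch below; the WordPress user’s password is hypothetical, and root’s MySQL credentials are again assumed to be in ~/.my.cnf:

#!/bin/env python
# Sketch of the content copy and database grant; 'secret' is hypothetical.
import subprocess

# copy the web content off the rescued mirror into place
subprocess.check_call(['rsync', '-a', '/mnt/var/www/html/', '/var/www/html/'])

# recreate the database and load the fresh dump taken in the chroot
subprocess.check_call(['mysql', '-e', 'CREATE DATABASE IF NOT EXISTS wordpress'])
dump = open('wordpress.mysqldump')
subprocess.check_call(['mysql', 'wordpress'], stdin=dump)
dump.close()

# re-grant access for the WordPress user
subprocess.check_call(['mysql', '-e',
    "GRANT ALL ON wordpress.* TO 'wordpress'@'localhost' "
    "IDENTIFIED BY 'secret'"])

The Apache adjustments and the router change were one-time manual edits.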

At this point, I guess I’ll be running this site from the living room until I decide what to do about my server / workstation. I’ve always wanted to build a slimmed-down, efficient virtual server to host my website and then migrate it between server and laptop during maintenance / patching of my machines, but my AMD processor didn’t support hardware virtualization assistance, so it was painfully slow. I think I’ll keep an eye out for a used, server-class machine. Let me know if you find any, bots. Thanks. ;-)

June 15, 2009

Intern Regiment

Filed under: adminstration,blogging,linux — jonEbird @ 10:06 pm

Today was Patrick Shuff’s first day with our team. He is our intern for the summer, and I actually recommended that we steal him from another team after meeting him last year. From my half-day assessment of him last year, I thought he was much better suited to our Linux team than to the Windows provisioning team. He found my GNU screen, emacs and script automation tricks fascinating, and right there invalidated himself as a legit Windows guy. The Windows experience he picked up last year was no doubt useful, but it’s not something you enjoy returning to. Just like it’s very useful to have learned C as your first programming language, an awesome basis for a solid understanding of the computing innards, but you don’t want to return to it after programming in Python.

I have been trying to brainstorm good ideas for him to work on in the team. I suppose the main reason I want to see his experience be as positive as possible is that I myself was an intern for about three years. One thought I had was to turn the three-month schedule into an intense one-assignment-per-week ordeal, where I throw a new task at him each week intended to inject new insights into all facets of becoming a well-rounded Linux administrator. Of course, one week is not enough time to properly study most of the topics I was thinking of, but it would have a nice organized structure and would be nearly guaranteed to provide an intense experience worthy of writing home about. Okay, if he ends up writing home about it, then we know he’s a dork, but it would also mean he’s probably found a career in which he’d never have to work a day in his life because it’s enjoyable.

I started brainstorming my categories of areas in a quick outline mode. Of course, this list is subject to change, and if we actually end up going through with this I’ll naturally have to report back on the actual topics covered each week and what the assignments were. If nothing else, it should keep my weblog busier than normal, which isn’t hard. So, here is what I think constitutes a well-rounded Linux administrator:

  • Ever improve efficiencies
    • editor
      - Pick one: emacs or vim. Just don’t settle at being able to modestly edit text.
    • shell
      - An essential, stereotypical Linux Admin skill. And yes, it is important. Study up.
  • organizational skills
      - Cannot be overestimated. Aren’t we always improving our organizational skills?
      - Develop consistent habits in note taking. Try reading Getting Things Done.
    • project notes
    • meeting notes
    • hallway conversations
    • company hierarchy
  • technical expertise
    • operating systems
    • programming languages
    • architectural design
    • applications administration
  • staying current
    • awesome rss feeds
    • key social article sharing sites
      - Looking at you, reddit.
    • magazines
    • books
  • soft skills
    • working within a team
    • speech / presentations
    • written communication
      - tech writing, effective email communication
  • career, career stuff
    • resume writing
    • networking
    • staying driven
    • finding your path
Sorry for the lack of details on each of the items but it’s kind of silly to populate it further now. For now it remains an idea for a summer internship. Only once the plan comes to fruition will I report back with juicier details.

November 22, 2008

andLinux

Filed under: linux,usability — jonEbird @ 10:02 am

For years I thought the best way to enhance my Windows experience with the common Unix/Linux tools I’m most comfortable with was Cygwin. That was until now. At work, where we are forced to use Windows, I recently had my laptop rebuilt, and afterwards my re-install of Cygwin wasn’t going too well. Finally fed up, I recalled seeing reviews of coLinux and how it was advertised as being tightly integrated into the Windows experience. Before looking for the install media, I saw that there are a couple of distros built on top of coLinux. One of them, andLinux, is a full Ubuntu release, and I presumed that their layering on top of coLinux would naturally provide additional support and/or features, so I decided to go with andLinux for the install.

After completing the install, I must say I am very pleased and impressed with the work they have done. I’m not sure how much of the credit goes to coLinux and how much goes to andLinux, but they both get an A+ in my book. Here are my top reasons for choosing andLinux over Cygwin:

  1. Full Linux operating system running on top of Windows.
    It is not actually virtualized and is therefore quick. A special patch to the Linux kernel allows this tight integration with winblows.
  2. Each window / app launched takes on the same look-and-feel decorations as every other Windows app.
    Translation: Doesn’t look like crap.
    Also, each app’s icon is properly displayed in the task bar, instead of the same repeated icon used in Cygwin.
  3. Transition from wired to wireless is seamless.
    This was a piece a co-worker asked me to test out, and I’m writing this up while on my wireless network at home. After suspending my laptop at work, I resumed it at home and didn’t have to touch a thing. My existing terminal window could still query hosts, and I even tested a quick package install.
  4. It’s running Ubuntu.
    That means to get additional apps you might be missing, you get to do "apt-get install <missingapp>" instead of re-launching setup.exe.
  5. Clean terminal.
    What is this bullet point doing here, you ask? Well, it is what motivated me to move away from Cygwin in the first place today. I was previously trying to use mrxvt and, after multiple issues, decided to punt.
    The default terminal appears to be gnome-terminal, even though I chose the XFCE version over the KDE version.
  6. Xming X11 server included.
    No need for Hummingbird’s crappy X server. This one is unobtrusive and works very well.
  7. Automatic TAP (bridged) networking configured.
    There are about four screens during the install, and none of them concern networking. It just works.
  8. “cofs” filesystem
    My C:\ drive is mounted at /mnt/win via their ‘cofs’ filesystem. Sweet.

If you are like me and stuck using Windows for whatever reason, I would highly suggest checking out andLinux. It is currently in beta, but I’m okay with dealing with minor hiccups. I have used it for only two days and already feel light-years away from my previous Cygwin days.

November 13, 2007

Reverse Engineering Buddy

Filed under: adminstration,linux,usability — jonEbird @ 10:34 pm

An Idea for a helpful Admin Tool

What if you got a page and/or a ticket for a particular service on an obscure server? The unique problem is that your environment is huge, you’re still relatively new to the company, no co-workers are around to help you, and you have never heard of this server. When logging in, you’re hoping that whoever set it up left a nice RC script under /etc/init.d/, that you can find the app via a “lsof -i:<port>”, find the application’s home and locate some log files. But what if the application install was not that nice and did not conform to the norms you are used to?

To either a small or a very large degree, you will be reverse engineering this application. If you’re really unlucky, the person who supports the application also has no idea about it, nor knows anything about Unix-like machines. So, what if there was a tool which, upon your logging into the server, told you: “In case you are looking for the application binX, which typically listens on port XX, it was most likely started last time by issuing the script /path/to/funky/path/binX.sh”? I’m guessing it would freak you out and immediately flood you with confusion, gratitude and curiosity.

So, would such an application be difficult to write?

  • Watch for any read/write/access events under key dirs, such as /etc/init.d/ and /etc/*.conf (use the inotify interface introduced in Linux kernel 2.6.13; see the sketch after this list)
  • Track users logging into the system (could correlate later)
  • Watch for any new ports being listened on, then record the binary name.
  • Reverse engineer this application to automatically collect interesting data on it.
  • Intelligently parse strace output (note to self, check out: http://subterfugue.org/)
  • Utilize SystemTap for Linux and DTrace for Solaris. Pseudo code: { observe a new socket being opened, then show me the last 10 files opened and executed; correlate the application with its startup script }
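To gauge the difficulty, here is a minimal sketch of the first and third bullets, using the third-party pyinotify module and crudely diffing netstat output; the watched paths and output format are just placeholders:

#!/bin/env python
# Sketch: log events under key config dirs (via pyinotify) and report
# new listening TCP ports (by diffing `netstat -tln`). Placeholder paths.
import subprocess, time
import pyinotify

class Recorder(pyinotify.ProcessEvent):
    def process_default(self, event):
        print '%s %s %s' % (time.ctime(), event.maskname, event.pathname)

def listeners():
    out = subprocess.Popen(['netstat', '-tln'],
                           stdout=subprocess.PIPE).communicate()[0]
    # skip the two header lines; column 4 is the local address:port
    return set(line.split()[3] for line in out.splitlines()[2:] if line.strip())

wm = pyinotify.WatchManager()
mask = pyinotify.IN_ACCESS | pyinotify.IN_MODIFY | pyinotify.IN_CREATE
for d in ('/etc/init.d', '/etc'):
    wm.add_watch(d, mask)

known = listeners()
notifier = pyinotify.Notifier(wm, Recorder(), timeout=1000)
while True:
    if notifier.check_events():        # wait up to a second for fs events
        notifier.read_events()
        notifier.process_events()
    new = listeners() - known          # anything newly listening?
    for addr in new:
        print '%s new listener on %s' % (time.ctime(), addr)
    known |= new

Correlating those two event streams with login records would be the interesting part, but even this much would save some head-scratching.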

Now, if your data was collected in an easily usable format, you could collect similar data from other machines and start to make broader correlations.

The whole process is really about automating the reverse engineering of an application. I do that a lot. I believe others would like a tool which aided in, or performed, the entire reverse engineering for them.

June 21, 2007

Stripping Another Process of its Signal Handler

Filed under: adminstration,linux — jonEbird @ 11:38 pm

Have you ever wanted to send a signal which normally produces a core file, but the process has one of those annoying signal handlers set up to catch the signal you’re sending? The nerve of that application, trying to intelligently handle signals! I actually have a real need to remove the signal handler of a process, which I’ll describe shortly. Normally it is a bad idea to remove another process’s signal handler, and under normal circumstances I do not suggest following the procedure I am going to describe.

I have been struggling with a production issue at work with a process which has been less than cooperative. You see, I have a java process which goes crazy and starts consuming CPU cycles. When you run strace against the process, the only system call you will see is sched_yield(). The java thread is most likely stuck on a spinlock in user space after the process/thread which owned the lock has died, or something similar; but all my runaway process cares about is checking for its lock and yielding execution back to the kernel to schedule another task. Of course, it just gets the CPU again and continues to pound it.

My company pays a lot of money for support, and we have had a case open for eight months now. The problem is we are unable to gather sufficient data for their level-2 and level-3 support teams. They would like a javacore to be generated, which can be done by sending a signal 3 (SIGQUIT) to the java process. In addition to a javacore, they recommend sending a signal 11 (SEGV) to the process to prompt the generation of a normal binary core file. Either one would be invaluable for the support team in ascertaining what is going wrong. Unfortunately, it seems that once the process is stuck in this tight sched_yield() loop, any signals we send to it are ignored. In short, that is my problem.

During my Linux Kernel Internals training with Red Hat, I had the idea of writing a kernel module to strip the signal handler from the java process so I could finally generate that elusive core file. The kernel module sets up an entry under /proc named stripsignal_pid. If you read the value, it will tell you a quick one-liner about using this interface. To use the module, you write a process ID into that file, and that process’s SIGABRT signal handler will be reset to SIG_DFL. At this point, if you send a SIGABRT signal to the process, the result will be the writing out of its core file.
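Usage, then, looks something like this (the PID is hypothetical); the SigCgt bitmask in /proc/<pid>/status makes a handy before-and-after check of which signals the process is catching:

#!/bin/env python
# Sketch of driving the stripsignal module; PID 9394 is hypothetical.
import os, signal

pid = 9394

# before: the SigCgt bitmask lists the signals the process is catching
for line in open('/proc/%d/status' % pid):
    if line.startswith('SigCgt'):
        print line,

# ask the module to reset the process's SIGABRT handler to SIG_DFL
open('/proc/stripsignal_pid', 'w').write('%d\n' % pid)

# now SIGABRT should drop a core file instead of being caught
os.kill(pid, signal.SIGABRT)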

Download the source along with a helper test program here: stripsignal.tar.gz.

But if all you are interested in is reviewing the short source code, then feel free to browse the stripsignal.c source online.

I tell you what, the best thing I learned from the class was just familiarizing myself with the source code and actually learning some new emacs tricks for navigating large source code projects. The next writeup will be about my experiences with the GNU GLOBAL tagging system.

July 31, 2006

Asynchronous Network Programming from an Admin Perspective

Filed under: adminstration,linux — jonEbird @ 9:08 pm

Asynchronous programming is commonplace for developers, but it can often be a mysterious thing to system administrators who merely know enough programming to get by. Since the vast majority of material you will find on the subject caters to developers, it can easily go right over the heads of many administrators. This is for those administrators.

As a system admin myself, I tend to break applications down to the system call level while troubleshooting problems. That is where we live, day in and day out, keeping closed-source applications up and running. The only window we have into the nature of proprietary applications is the system calls they make.

To me, asynchronous programming is nothing more than an exercise in using the select system call. The select system call takes arrays of file handles you want watched for being readable, writable or having an exceptional condition, and it blocks execution of your program until one of them is ready, so as not to waste CPU cycles.

There are a handful of libraries out there to make asynchronous programming easy and painless. These libraries typically use the notion of registering a callback function to be executed when data is available on a particular file handle. This design allows the programmer to focus on the problem they are trying to solve while keeping the program readable. For you python coders, you should check out Twisted if you haven’t already. They make this sort of thing pathetically simple.
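To give you an idea, an echo server in Twisted amounts to little more than this (a from-memory sketch against their Protocol API; the port number is arbitrary):

#!/bin/env python
# Echo server in Twisted: the reactor watches the sockets for you and
# calls dataReceived whenever a client connection has data ready.
from twisted.internet import protocol, reactor

class Echo(protocol.Protocol):
    def dataReceived(self, data):
        self.transport.write(data)   # send it right back

factory = protocol.ServerFactory()
factory.protocol = Echo
reactor.listenTCP(9999, factory)
reactor.run()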

But what kind of tutorial would this be if I simply used a one-liner call from Twisted? No. Instead I’ll create my own version of the Twisted echo server, which is really what I wanted to demonstrate anyway.

Download asyn_echo.py


#!/bin/env python

import socket, select

HOST = ''   # all interfaces
PORT = 9999

client_fd = []     # every fd select() watches, including the listener
fd_to_conn = {}    # map client fd -> socket object

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listen_fd = s.fileno()
client_fd.append(listen_fd)
s.bind((HOST, PORT))
s.listen(100)

while True:
    # block until at least one fd is readable
    r_fds, w_fds, e_fds = select.select(client_fd, [], [])
    for fd in r_fds:
        if fd == listen_fd:
            # activity on the listening socket means a new client
            conn, addr = s.accept()
            print 'Accepted connection from %s:%d [fd: %d]' % \
                  (addr[0], addr[1], conn.fileno())
            fd_to_conn[conn.fileno()] = conn
            client_fd.append(conn.fileno())
        else:
            data = fd_to_conn[fd].recv(1024)
            if not data:
                # zero bytes read: the client closed the connection
                print 'Goodbye'
                fd_to_conn[fd].close()
                del fd_to_conn[fd]
                client_fd.remove(fd)
            else:
                # echo the data straight back
                fd_to_conn[fd].send(data)

I actually pitted my version against the Twisted version. I created a shell script which recursively called itself, fork-bomb style, with each instance finally sending a 1500k postscript file via netcat to the echo server. The most I sent at once was 64 netcat processes totaling approximately 60 megabytes. For the most part they both performed nearly equally. More importantly, they rely on the same select system call to efficiently process the data.

April 18, 2006

getopts morphed into a gui

Filed under: linux,usability — jonEbird @ 10:02 pm

Lately I’ve found myself trying to further learn web technology and have been evaluating various web frameworks. The motivation is to become much more proficient with the most widely accepted user interface: the web browser.

Being a traditional Unix/Linux user, when I think of widely understood, standardized interfaces, I think of the good ol’ shell. Just run your unknown utility with a ‘-h’ or a ‘--help’ to get the impatient user’s guide. That is really nice. A key piece of that consistency is how the options are typically processed via the get-opts library. Most people are familiar with the standard get-opts routines that are readily available in your language of choice… from C to python. For those of you unfamiliar with the get-opts routine, it is the library which standardizes all command line options to a program. That is why passing ‘-fp’ to ps is the same as ‘-f -p’. What I admire about this library is how successful it has been in unifying all sorts of utilities on Unix/Linux for years. Heck, even tar came around after those early years! So why don’t we take get-opts to the next level of convenience, usability and power? I’ve already seen this trend start with python’s replacement of the stock get-opts with optparse (http://www.python.org/doc/2.4/lib/module-optparse.html). Can you guess where I’m going with this?

What if we supplied a bit more information than merely whether or not an option takes an argument, and then used that extra information for maximum power and usability? Imagine developing your latest utility and testing it via the command line, but then turning around and invoking it via a web browser, a GTK window, a Tk window, etc. And why not?

In the simplest of implementations, a web page could provide radio buttons for toggling various options, as well as a text form for any additional options. I envision a metamorphosis of information and interface options while remaining easy to use. After we get the basics working, we can start implementing more advanced features, such as: grouping like options together in sections, allowing the user to toggle the layout to alphabetical, searching the options, keeping advanced options hidden by default, remembering the last options used and much more! Think of what that’ll do for the Linux newbies.
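To make that concrete, here is a rough sketch of walking an optparse parser’s metadata and emitting a crude HTML form. The example options are made up, and a real version would need to handle many more cases:

#!/bin/env python
# Sketch: turn an optparse parser's metadata into a crude HTML form.
# The '--full' and '--pid' options are made up purely for illustration.
from optparse import OptionParser
import cgi

parser = OptionParser()
parser.add_option('-f', '--full', action='store_true', help='full listing')
parser.add_option('-p', '--pid', action='store', help='process id to match')

def parser_to_form(parser, action_url='/run'):
    rows = []
    for opt in parser.option_list:
        name = opt.get_opt_string()          # e.g. '--full'
        label = cgi.escape(opt.help or name)
        if opt.action == 'store_true':
            # boolean flag -> checkbox
            rows.append('<label><input type="checkbox" name="%s"> %s</label>'
                        % (name, label))
        elif opt.action == 'store':
            # option taking an argument -> text field
            rows.append('<label>%s <input type="text" name="%s"></label>'
                        % (label, name))
    return ('<form action="%s" method="post">\n%s\n'
            '<input type="submit" value="Run">\n</form>'
            % (action_url, '\n'.join(rows)))

print parser_to_form(parser)

The same metadata could just as easily drive a GTK or Tk layout, which is exactly the appeal.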

I can not currently call myself a web developer, and honestly I am not anxious to become one either, yet I want to make some of my utilities available via the web. As it stands, I’ll spend an equal amount of time exposing a utility via the web as I spent developing it in the first place! Are you the same? Care to develop the next generation of get-opts?
