jonEbird

November 13, 2007

Reverse Engineering Buddy

Filed under: adminstration,linux,usability — jonEbird @ 10:34 pm

An Idea for a helpful Admin Tool

What if you got a page and/or ticket for an obscure server’s particular service? The unique problem is that your environment is huge, you’re still relatively new to the company, co-workers are not there to help you and you have never heard of this server. When logging in, you’re hoping that the person has a nice RC script under /etc/init.d/, that you can find the app via a “lsof -i:<port>”, find the application’s home and locate some log files. But what if the application install was not that nice and did not conform to the norms that you are used to?

To either a small or very large degree, you will be reverse engineering this application. If you’re really unlucky, the person who supports the application also has no idea how it works and knows nothing about Unix-like machines. So, what if there was an application which, upon your logging into the server, told you, “In case you are looking for the application binX, which typically listens on port XX, it was most likely started last time by issuing the script /path/to/funky/path/binX.sh”? I’m guessing it would freak you out and immediately flood your emotions with confusion, gratitude and curiosity.

So, would such an application be difficult to write?

  • Watch for read/write/access events under key dirs, such as /etc/init.d/ and /etc/*conf (use the inotify API, merged in Linux kernel 2.6.13)
  • Track users logging into the system (could correlate later)
  • Watch for any new ports being listened on, then record the binary name.
  • Reverse engineer this application to automatically collect interesting data on it.
  • Intelligently parse an strace (note to self, check out: http://subterfugue.org/)
  • Utilize systemtap for Linux and DTrace for Solaris. pseudo code { observe new socket being opened, so show me the last 10 files opened and executed. correlate application with startup script }
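The "watch for new listening ports, then record the binary name" step from the list above could be roughed out in plain shell. This is only a sketch under assumptions: the function names and the "port program" snapshot format are made up for illustration, and it relies on lsof being installed (run it as root to see other users' processes).

```sh
#!/bin/sh
# Sketch: report listening sockets that appeared between two snapshots.
# Snapshot format (an assumption): one "port program" pair per line.

# snapshot_listeners: print "port program" for each listening TCP socket.
# Field 9 of lsof output is the NAME column, e.g. "*:22" or "[::1]:631".
snapshot_listeners() {
    lsof -nP -iTCP -sTCP:LISTEN 2>/dev/null |
        awk 'NR > 1 { n = split($9, a, ":"); print a[n], $1 }' |
        sort -u
}

# new_listeners OLD NEW: print lines present in NEW but not in OLD.
# Both files must be sorted, which snapshot_listeners guarantees.
new_listeners() {
    comm -13 "$1" "$2"
}

# Typical use (not run here):
#   snapshot_listeners > /var/tmp/listeners.old
#   sleep 60
#   snapshot_listeners > /var/tmp/listeners.new
#   new_listeners /var/tmp/listeners.old /var/tmp/listeners.new
```

Each line that new_listeners prints is a candidate for the "correlate application with startup script" step, which this sketch does not attempt.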

Now, if your data was collected in an easily usable format, you can collect similar data from other machines and start to make broader correlations.

The whole process is really about automating the process of reverse engineering an application. I do that a lot. I believe others would like an application which aided or performed the entire reverse engineering for them.

January 15, 2007

Python Admin vs. Java Developers

Filed under: adminstration,usability — jonEbird @ 5:31 pm

What is the best programming language for a system administrator? Cue the language war, please. The typical arguments are “your language can’t do this”, “this library doesn’t have a consistent naming convention”, well “my language is faster”, yeah and “your syntax is hideous to read much less use”, blah blah blah. No, I’m not a professional developer but I do spend a significant amount of time doing development as a systems administrator. My programs are not huge year-long projects, will probably never reach a million lines of code and usually never need superb speed. For administrators, the most important aspects of the language of choice are productivity and maintainability.

When choosing your language, I recommend picking one that has a decent user community, is available on numerous platforms, has had significant time to mature and prove itself, and has extensive module/library support. Meeting these requirements will leave you using a language that should keep you efficiently producing solutions to your administrative tasks.

First let’s eliminate some languages based on maintainability. Goodbye Haskell, Lisp, Scheme, Erlang and any other functional languages you have used or know of. I’d venture to say that fewer than 2% of system administrators are comfortable using any one of those languages, and you obviously cannot choose a language which only you are going to be able to maintain. Aside from staying away from the obscure, the program should be intuitive to read. People can argue about the virtues of their favorite language and why it lends itself to writing maintainable code, but writing maintainable code is truly a skill. You can write obfuscated code in any language. It takes practice and a conscious effort to keep your code clean and well organized. Here, “practice makes perfect” is the key.

Secondly, and in my opinion most importantly, the language of choice should keep you efficient. Ideally, each program should be succinct and to the point. I no longer use C/C++ regularly, even though that’s the language I started with, because you simply have to write much more code to do what another language can do with half the work or less. Try looking at the ‘P’s of the LAMP stack and see which one fits you best and which you can see yourself being productive in. That is, evaluate Python, Perl, PHP and Ruby (okay, not a ‘P’ but whatever). Don’t use a language that doesn’t make sense to you. Don’t waste your time.

And finally, time to explain this title and tell a little story about a production incident in which some customer data was delayed. One day, we had a production issue where messages were accidentally dequeued from an IBM WebSphere MQSeries queue. A tool which was meant to grab just one message dequeued all of the messages. To top it all off, the same tool kept segfaulting while trying to requeue those messages. The solution left to us was to manually parse out each of the discrete messages into separate files. Once they were in that state, we had another known tool which could upload the messages separately. There were three developers and myself on the phone and we were all racing to the solution. My language of choice was Python and the rest of the developers used the language that they use professionally, Java. So who reached the solution first? Well, I wouldn’t be writing this if I hadn’t won, would I? For me, Python makes sense and I can efficiently write code which I like to think other people will be able to understand and update. That is what is most important for your language of choice.

[ As un-entertaining as it is, you can view the Python solution. ]

November 6, 2006

Scripting Best Practices

Filed under: adminstration,usability — jonEbird @ 6:35 pm

Nothing too fancy here. Just a list of the most common things I find desirable while writing shell scripts.

  1. Use meaningful variable names
    This point is strictly for the sake of readability. Too often when trying to read somebody’s script I’ll actually do various search & replaces of their variables because they used variables like “w”, “w2”, “w3”. It was quick and dirty for the author, but the inheritor of that script would appreciate it if you had used more meaningful variable names.

  2. Comment your code.
    This goes without saying, really…

  3. Visually separate any optional settings sections.
    Don’t know about you, but sometimes I get lazy and don’t feel like using getopts. Instead, I’ll put what would be optional arguments as hard-coded variables at the top of my script. I think this is fine, but you’ll want to visually segregate these optional variables from the rest of the script.
    I like to use a dashed line of about 50-70 characters and even put the words “do not modify beyond this point” to further emphasize what you’re encouraged to change and what shouldn’t normally be touched.

  4. Use script-relative paths for accessing files that live alongside the script.
    Never assume the user’s cwd is the same as the script’s directory and use “./” to run or source another file. I like to set a variable REL_DIR=`dirname $0` and use it to reference the directory where the script itself lives.
    E.g. You have a functions script you’d like to source, then with that REL_DIR variable you would “. ${REL_DIR}/<some-file>”.
    I’m actually surprised at how often this mistake happens.

  5. Always print a usage statement for improper usage and/or when the -h option is used.
    My code excerpt typically looks like:

    USAGE="Usage: `basename $0` <my options here>"
    if [ -z "$SOME_ARG" ]; then
        # echo is used here because print is a ksh builtin
        echo "$USAGE" 1>&2
        exit 1
    fi

  6. Conscientiously use STDOUT vs. STDERR in different scenarios.
    Not a script faux pas really, but it can help during the development process. Use STDOUT only for informational messages and/or optional debugging info. Then STDERR would only be used for errors. That way, when running the script you can optionally discard stdout (1>/dev/null) and easily check that nothing was printed to STDERR. When the output is mixed you’ll have a greater chance of missing the error.
    One example of this technique in action is when using the tar command. Try leaving out the verbose (‘v’) option when creating or extracting your archive, then you can easily see when you might have had a permissions issue or something else related.

  7. Keep all required variables defined in the script.
    Define the required variables at the top of the script. Even mention that they are REQUIRED. A good example of this is scripts that use Sybase’s isql utility. Anytime I run isql, I like to set something like:

    # required variables for isql
    SYBASE=/some/path/to/sybase
    LD_LIBRARY_PATH=$SYBASE/lib
    export SYBASE LD_LIBRARY_PATH

    What you want to avoid is a situation where the script works only because the required variable happens to be set in your env by one of your dot files.

  8. Cron’ed Scripts.
    Two common principles I like to emphasize here:
    1. Keep all required variable/env settings in the script! cron does NOT source your dot files.
    2. Redirect stdout, but leave stderr unmanaged. This is a cheap technique, but whenever I don’t have time to test for all possible errors I simply set up my .forward file and let cron email me the output produced from the cron script. Though, to be complete, you should really manage your stderr in other fashions.

  9. Keep your exit/return codes categorized.
    Not always important for small scripts, but a good practice.
    For any sort of error checking your script might perform, use a unique error code for each situation in which you decide to exit the shell script. That will make invocations of your script more manageable.

  10. Avoid “Magic” Numbers
    Anywhere you are comparing a value to some seemingly arbitrary number, go ahead and assign that value to a variable with a meaningful name. Then your comparison reads a lot better.
    Using “$CURRENT_VALUE -gt $THRESHOLD” is much better than finding “$CURRENT_VALUE -gt 83” buried in some script and not having any clue what the number 83 signifies aside from the surrounding code.

  11. Use unique temporary files.
    Never do this: /path/to/some/command -option > command.out.
    You are assuming, first, that you are sitting in a directory where you have permissions to create a temporary file and, secondly, that no one will ever be running the same script at the same time you are.
    Many systems make creating temporary files easy with a command such as mktemp. I typically employ a convention where I define my temporary file space as “TEMP=/tmp/.myshellname$$_”. Then let’s say I need a temp file to capture the output from ps. I might redirect it to ${TEMP}raw_ps.
    And finally, at the end of the script, or defined in a shell function, you can clean up each temporary file with one line: rm -f ${TEMP}*.
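As a variation on the temporary-file convention above, here is a minimal sketch that uses mktemp to create a private scratch directory and a trap to remove it on exit. The function names and the "myscript" prefix are placeholders, and it assumes a mktemp that accepts -d with a template (GNU and BSD versions do).

```sh
#!/bin/sh
# Sketch: private scratch directory that is removed when the script exits.

# make_scratch_dir: create a unique directory and print its path.
# "myscript" is a placeholder prefix; TMPDIR is honoured if set.
make_scratch_dir() {
    mktemp -d "${TMPDIR:-/tmp}/myscript.XXXXXX"
}

# cleanup_scratch_dir DIR: remove the directory; do nothing for an empty argument.
cleanup_scratch_dir() {
    if [ -n "$1" ]; then
        rm -rf -- "$1"
    fi
}

# Typical use (not run here):
#   SCRATCH=$(make_scratch_dir) || exit 1
#   trap 'cleanup_scratch_dir "$SCRATCH"' EXIT
#   ps -ef > "$SCRATCH/raw_ps"
```

The trap runs the cleanup even when the script exits early on an error, so there is no need to remember the rm at every exit point.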

In general, well written code/scripts should read well and be organized well. Every principle discussed above has one purpose: maintainability.
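Several of these practices fit together in a few lines. The sketch below is illustrative only: the exit code values, the check_value function and its messages are all invented for the example, with 83 borrowed from the magic-number item above.

```sh
#!/bin/sh
# Sketch: named exit codes, a named threshold, and errors kept on stderr.
# All names and values here are illustrative.

EXIT_OK=0
EXIT_USAGE=64        # bad or missing arguments
EXIT_OVER_LIMIT=65   # value exceeded the threshold

THRESHOLD=83         # named instead of a bare 83 in the comparison

USAGE="Usage: check_value <integer>"

# check_value VALUE: return a distinct code for each failure mode.
check_value() {
    if [ -z "$1" ]; then
        echo "$USAGE" 1>&2
        return "$EXIT_USAGE"
    fi
    if [ "$1" -gt "$THRESHOLD" ]; then
        echo "error: $1 exceeds threshold $THRESHOLD" 1>&2
        return "$EXIT_OVER_LIMIT"
    fi
    echo "ok: $1 is within threshold $THRESHOLD"
    return "$EXIT_OK"
}
```

A caller can then branch on the return code, and running with stdout discarded leaves only genuine errors on the terminal.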

April 18, 2006

getopts morphed into a gui

Filed under: linux,usability — jonEbird @ 10:02 pm

Lately I’ve found myself trying to further learn web technology and have been
evaluating various web frameworks. The motivation is to become much more
proficient with the most widely accepted user interface: The web browser.

Being a traditional Unix/Linux user, when I think of widely understood,
standardized interfaces, I’m going to be thinking of the good ‘ole shell. Just
run your unknown utility with a ‘-h’ or a ‘--help’ to get the impatient
user’s guide. That is really nice. A key piece of that consistency is how the options
are typically processed via the get-opts library.
Most people are familiar with the standard get-opts routines that are readily
available in your language of choice… from C to Python. For those of you
unfamiliar with the get-opts routine, it is the library which makes all
command line options to a program standardized. That is why passing ‘-fp’ to
ps is the same as ‘-f -p’. What I admire about this library is
how successful it has been in unifying all sorts of utilities on Unix/Linux
for years. Heck, even tar came around after those early years!
So why don’t we take get-opts to the next level of convenience, usability and
power? I’ve already seen this trend start with Python’s replacement of the
stock get-opts with optparse (http://www.python.org/doc/2.4/lib/module-optparse.html).
Can you guess where I’m going with this?
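For reference, the option clustering described above can be seen with the shell’s own getopts builtin. This is a small sketch; the flags -f (a boolean) and -p (which takes a value) and the parse_flags function are made up for the demo and are not ps’s real option handling.

```sh
#!/bin/sh
# Sketch: the shell getopts builtin treats "-fp 42" and "-f -p 42" alike.
# The flags -f (boolean) and -p (takes a value) are made up for this demo.

parse_flags() {
    full=no
    pid=
    OPTIND=1    # reset so the function can be called more than once
    while getopts "fp:" opt; do
        case "$opt" in
            f) full=yes ;;
            p) pid=$OPTARG ;;
            *) echo "Usage: parse_flags [-f] [-p pid]" 1>&2; return 1 ;;
        esac
    done
    echo "full=$full pid=$pid"
}
```

Calling parse_flags -fp 42 and parse_flags -f -p 42 prints the same line, which is the consistency the paragraph above is praising.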

What if we supplied a bit more information than merely whether or not an
option takes an argument, and then used that extra information for
maximum power and usability? Imagine developing your latest utility and testing via command line, but then turning around and invoking it via a web browser, GTK window, TK window, etc. And why not?

In the simplest of implementations, a web page could provide radio buttons for
toggling various options as well as text fields for any additional options. I
envision a metamorphosis of information and interface options that stays easy to
use. After we get the basics working, we can start implementing more advanced
features, such as: grouping like options together in sections, allowing the
user to toggle layout to alphabetical, search options, keep advanced options
hidden by default, remember the last options used and much more! Think of what that’ll do for the Linux newbies!
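To make the idea concrete, here is a rough sketch in which one option table drives both the command-line help and a bare-bones web form. Everything in it is an assumption for illustration: the flag|kind|name|help table format, the two sample options and the function names. Checkboxes stand in for the toggles, and the help text is not HTML-escaped.

```sh
#!/bin/sh
# Sketch: one option table drives both the CLI help text and a web form.
# Table format (an assumption): flag|kind|name|help, kind is "bool" or "text".

option_spec() {
    cat <<'EOF'
v|bool|verbose|Print progress messages
o|text|output|File to write results to
EOF
}

# render_usage: print help lines for the command line from the table.
render_usage() {
    option_spec | while IFS='|' read -r flag kind name help; do
        case "$kind" in
            bool) printf '  -%s          %s\n' "$flag" "$help" ;;
            text) printf '  -%s <%s>  %s\n' "$flag" "$name" "$help" ;;
        esac
    done
}

# render_form: print an HTML form with a checkbox per bool option
# and a text box per text option.
render_form() {
    echo '<form method="post">'
    option_spec | while IFS='|' read -r flag kind name help; do
        case "$kind" in
            bool) printf '  <label><input type="checkbox" name="%s"> %s</label>\n' "$name" "$help" ;;
            text) printf '  <label>%s <input type="text" name="%s"></label>\n' "$help" "$name" ;;
        esac
    done
    echo '</form>'
}
```

Grouping, searching and hiding advanced options would then amount to extra columns in the table rather than changes to every utility.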

I cannot currently
call myself a web developer and honestly I am not anxious to become one
either, yet I want to make some of my utilities available via the web. As things stand, I’d spend as much time exposing a utility via the web as I spent developing it in the first place! Are you
the same? Care to develop the next generation of get-opts?
