jonEbird

November 6, 2008

2D Barcodes

Filed under: blogging, usability — jonEbird @ 12:06 am

In anticipation of heading down to PyWorks 2008, I have been thinking about creating a business card for the sake of keeping in contact with people I meet. One of my main goals, while attending and speaking at PyWorks, is to network with people and mark 2008 as the year which I start participating and contributing within the OSS community. While exercising my creativity in designing a nice business card, I have also been reading about Google’s android mobile platform, and I came across a interesting intersection between the two when I saw a demonstration video.

A Google developer, working on the zxing project (pronounced “zebra crossing”), has printed a 2D barcode encoding of his personal information on the back of his business card. With the builtin camera, on his android phone, he can scan in a barcode and immediate use the encoded data. It is an impressive demonstration of integrating technology with our mobile devices. Check out the video which has inspired me to not only do the same but also write this small informational note about 2D barcodes.

If you didn’t catch it, the format of the 2D barcode on his business card is QR Code. Among the other 2D barcode formats, QR Code barcodes are most popular in Japan where it was invented by Denso-Wave. The popularity of QR Codes in Japan has grown to the level of being supported by nearly every mobile device there and that also means finding QR Codes available on a increasing amount of printed media from fliers to magazines and coupons.

There are other competing 2D barcode formats that I could choose from but after doing researching and not seeing any distinctive advantages, I have concluded to follow suit with the google developer and hope that the android phone’s application and popularity will help propel QR Code’s popularity over the other 2D barcode formats.

Since 2D barcodes are nothing more than encoding and decoding data, the first thing to decide is what data we would like to encode. Since I actually do not have a business of my own and furthermore use a work issued phone, the data I encode will probably be a URL of my website. There are other interesting encodings, though, which include email address, sms, geographic locations, etc. See zxing’s wiki about BarcodeContents for a better discussion for suggested format, including their primary suggestion of using the MECARD format which is typically a composite of Name, Address, Phone number and Email address.

Once you know what you would like to encode in your 2D barcode, I’m guessing you will need software to help with that. With the assumption that you are not going to be encoding/decoding barcodes with a large frequency, my suggestion is that you use online utilities to help you. Interestingly, it turns out that the google chart api can now generate QR Code online. That is both convenient for repeated generation of QR Code but also in dynamic generation of barcodes. But alas, Jason Delport has created a google app engine application to record your text and generate the QR Code for you by generating the google chart api link for you. At that point, you can either use the supplied URL or simply download the png image. Finally, for performing online decoding of the barcode I have found the zxing online decoder to be the best and least intrusive one available.

The main reason 2D barcodes have not really taken off here in the States is because people have not yet came up with a really good idea to propel it into the mainstream. That, my friends, is going to be up to you and me to accomplish. Or, wait, we could just let google usher it in for us? But seriously I think support for mobile devices to read 2D barcodes is great step forward. Afterwards, I can envision graphic designers coming up with clever barcode prints and ways of intriguing people to scan the code for more details, but then it would be people like us who come up with new categories of data to be encoded in the barcodes for new, innovative ways to use them.

To learn more, try the collection of interesting links provided again by the zxing folks.

October 25, 2008

PyWorks Stuff

Filed under: adminstration, python, usability — jonEbird @ 12:00 am

For the 2008 PyWorks convention, I will be presenting about LDAP and Python. The presentation is really about demystifying LDAP and encouraging people to use and extend LDAP for their config file needs. In efforts to make my point, the last half of my presentation will be a time for a demo. This entry is your basic landing point where you can download the scripts, presuming you are looking for a copy of the scripts and/or slides after seeing my presentation? (oh! nevermind, your google search landed you here)

PyWorks Speakers Badge

For the demo, I will be leveraging the fail2ban project. It is a python based application which scans typical application logs for security failures and bans IPs from being able to connect again. It also uses the builtin ConfigParser module for reading it’s 30+ config files, which is why I have chosen to use it. For the demo, I have created two scripts:

The first one, configparser2ldap.py is used to process a set of config files and automatically generate LDAP schema as well as LDIF data.

Next, I have my ldapconfig.py module where I extended the ConfigParser module to support making queries to LDAP. I am basically overriding the read() method only and leaving the rest of the module alone. This way the only modifications to the fail2ban application are how it is instantiating the ConfigParser and I won’t have to become a full time fail2ban developer if I want to centralize the configuration data in LDAP.

And that is really the main point of my presentation: The power of centralizing your configuration data and how it can drastically change how you administer your large scale server farm.

Downloads

LDAP + Python Slides.

configparser2ldap.py script to auto-generate LDAP schema and LDIF from ConfigParser compatible config files.

ldapconfig.py python module which inherits the ConfigParser and supports optionally pulling config data from LDAP.

November 13, 2007

Reverse Engineering Buddy

Filed under: adminstration, linux, usability — jonEbird @ 10:34 pm

An Idea for a helpful Admin Tool

What if you got a page and/or ticket for an obscure server’s particular service? The unique problem is that your environment is huge, you’re still relatively new to the company, co-workers are not there to help you and you have never heard of this server. When logging in, you’re hoping that the person has a nice RC script under /etc/init.d/, that you can find the app via a “lsof -i:<port>”, find the application’s home and locate some log files. But what if the application install was not that nice and did not conform to the norms that you are used to?

To either a small or very large degree, you will be reverse engineering this application. If you’re really unlucky, the application who supports it also has no idea about it nor knows anything about Unix-like machines. So, what if there was an application which is polling upon logging into the server, told you, “In case you are looking for the application binX, which typically listens on port XX, it was most likely started last time by issuing the script /path/to/funky/path/binX.sh”. I’m guessing it would freak you out and immediately flood your emotions with confusion, gratitude and curiosity.

So, would such an application be difficult to write?

  • Poll any events for read/write/access under key dirs, such as /etc/init.d/, /etc/*conf ? (use the inotify syscall introduced in Linux kernel 2.6.16)
  • Track users logging into the system (could correlate later)
  • Watch for any new ports being listened on, then record the binary name.
  • Reverse engineer this application to automatically collect interesting data on it.
  • Intelligently parse an strace (note to self, checkout: http://subterfugue.org/)
  • Utilize systemtap for Linux and DTrace for Solaris. pseudo code { observe new socket being opened, so show me the last 10 files opened and executed. correlate application with startup script }

Now, if your data was collected in a easily usable format, you can collect similar data from other machines and start to make broader correlations.

The whole process is really about automating the process of reverse engineering an application. I do that alot. I believe others would like an application which aided or performed the entire reverse engineering for them.

January 15, 2007

Python Admin vs. Java Developers

Filed under: adminstration, usability — jonEbird @ 5:31 pm

What is the best programming language for a system administrator? Queue the language war, please. The typical arguments are “your language can’t do this”, “this library doesn’t have a consistent naming convention”, well “my language is faster”, yeah and “your syntax is hideous to read much less use”, blah blah blah. No, I’m not a professional developer but I do spend a significant time doing development as a systems administrator. My programs are not huge year long projects, will probably never reach million lines of code and usually never need superb speed. For administrators, the most important aspect of the language of choice is productivity and maintainability.

When choosing your language, I recommend picking one that has a decent user community, is available on numerous platforms, has had significant time to mature in proving itself and has an extensive modules/library support. Meeting these requirements will leave you using a language that should keep you efficiently producing solutions to your administrative tasks.

First let’s eliminate some languages based on maintainability. Goodbye Haskell, lisp, scheme, Erlang and any other purely functional languages you have used or know of. I’d venture to say that less than 2% of system administrators are comfortable using any one of those languages. And you can obviously not choose a language which only yourself are going to be able to maintain. Aside from staying away from the obscure, the program should be intuitive to read. People can argue on the virtues of their favorite language and why it lends itself to writing maintainable code, but writing maintainable code is truly a skill. You can write obfusticated code in any language. It takes practice and a conscience effort of keeping your code clean and organized well. Here, practice makes perfect, is the key.

Secondly, and in my opinion the most important aspect of the language of choice is staying efficient. Ideally, each program should be succinct and to the point. I no longer use C/C++ regularly, even though that’s the language I started with, because you simply have to write much more code which another language can do in half or less of work. Try looking at one of the ‘P’s of the LAMP stack and see which fits you better and you can see yourself being productive in. That is, evaluate Python, Perl, PHP and Ruby (okay, not a ‘P’ but whatever). Don’t use a language that doesn’t make sense to you. Don’t waste your time.

And finally, time to explain this title and tell a little story where some customer data was delayed during one day’s production incident. One day, we had a production issue where messages were accidentally dequeued from a IBM Webphere MQSeries queue. A tool which was used to grab just one message dequeued all of the messages. To top it all off, the same tool kept seg faulting while trying to requeue the same messages. The solution left to us was to manually parse out each of the discrete messages into separate files. Once in that state, we had another known tool which could upload the messages separately. There were three developers and myself on the phone and we were all racing to the solution. My language of choice was Python and the rest of the developers used the language that they use professionally, Java. So who reached the solution first? Well I wouldn’t be writing this if I hadn’t won, would I? For me, Python makes sense and I can efficiently write code which I like to think other people will be able to understand and update. That is what is most important for your language of choice.

[ As un-entertaining as it is, you can view the Python solution. ]

November 6, 2006

Scripting Best Practices

Filed under: adminstration, usability — jonEbird @ 6:35 pm

Nothing too fancy here. Just a list of the most common things I find desirable while writing shell scripts.

  1. Use meaningful variable names
  2. This point is strictly for the sake of readability. Too often when trying to read somebodies script I’ll actually do various search & replaces of their variables because they used variables like “w”, “w2″, “w3″. It was quick and dirty for the author, but the inheritor of that script would appreciate if you had used more meaningful variable names.

  3. Comment your code.
  4. This goes without saying, really…

  5. Visually separate any optional settings sections.
  6. Don’t know about you, but sometimes I get lazy and don’t feel like using getopts. Instead, I’ll throw my what would be optional arguments as hard coded variables at the top of my script. I think this is fine, but you’ll want to visually segregate these optional variables from the rest of the script.
    I like to use a — dashed line of about 50-70 characters and even put the words “do not modify beyond this point” to further emphasize what you’re encouraged to change and what shouldn’t normally be touched.

  7. Use relative pathing for accessing files to the script.
  8. Never assume the user’s cwd is the same as the script and use “./” to run or source another file. I like to set a variable REL_DIR=`dirname $0` and use it to reference the directory where the very script is running from.
    E.g. You have a functions script you’d like to source, then with that REL_DIR variable you would “. ${REL_DIR}/<some-file>“.
    I’m actually surprised on how often this happens.

  9. Always print a usage statement for improper usage and/or when -h option used.
  10. My code excerpt typically looks like:

    USAGE="Usage: `basename $0` <my options here>"
    if [ -z "$SOME_ARG" ]; then
        print $USAGE 1>&2
        exit 1
    fi
    
  11. Conscientiously use STDOUT vs. STDERR in different scenarios.
  12. Not a script faux pau really, but it can help during the development process. Use STDOUT only for informational messages and/or optional debugging info. Then STDERR would only be used for errors. That way, when running the script you can optionally turn off stdout (1>&-) and easily check that nothing was printed to STDERR. When the output is mixed you’ll have a greater chance of missing the error.
    One example of this technique in action is when using the tar command. Try leaving out the verbose (’v') option when creating or extracting your archive, then you can easily see when you might have had a permissions issue or something else related.

  13. Keep all required variables defined in the script.
  14. Define the required variables at the top of the script. Even mention that they are REQUIRED. A good example of this is scripts that use Sybase’s isql utility. Anytime I run isql, I like to set something like:

    # required variables for isql
    SYBASE=/some/path/to/sybase
    LD_LIBRARY_PATH=$SYBASE/lib
    

    What you want to avoid is a situation where the script works because you’ve got the required variable set in your env, but only because it’s set in one of your dot files.

  15. Cron’ed Scripts.
  16. Two common principles I like to emphasize here:
    1. Keep all required variable/env settings in the script! cron does NOT source your dot files.
    2. Redirect stdout, but leave stderr unmanaged. This is a cheap technique, but whenever I don’t have time to test for all possible errors I simply setup my .forward file and let cron email me the output produced from the cron script. Though, to be complete, you should really manage your stderr in other fashions.

  17. Keep your exit/return codes categorized.
  18. Not always important for small scripts, but a good practice.
    For any sort of error checking your script might perform, use a unique error code for each situation that you decide to exit the shell script. That will make invocations of your script more manageable.

  19. Avoid “Magic” Numbers
  20. Anywhere you are comparing a value to some, seemingly, arbitrary number, go ahead and set that value to a meaningful variable name. Then your comparison reads alot better.
    Using “$CURRENT_VALUE -gt $THRESHOLD” is much better than finding “$CURRENT_VALUE -gt 83” buried in some script and not having any clue what the number 83 signifies aside from the surrounding code.

  21. Use unique temporary files.
  22. Never do this: /path/to/some/command -option > command.out.
    You are assuming that you are sitting in a directory where you have permissions to create a temporary file and secondly that no one will ever be running the same script at the same time you are.
    Some shells make creating temporary files easy with commands such as mktemp. I typically employ a convention where I define my temporary file space as “TEMP=/tmp/.myshellname$$_“. Then lets say I need a temp file to capture the output from ps. I might redirect it to ${TEMP}raw_ps.
    And finally, at the end of the script, or defined in a shell function, you can cleanup each temporary file with one line: rm -f ${TEMP}*.

In general, well written code/scripts should read well and be organized well. Every principle discussed above has one purpose: maintainability.

April 18, 2006

getopts morphed into a gui

Filed under: linux, usability — jonEbird @ 10:02 pm

Lately I’ve found myself trying to further learn web technology and have been
evaluating various web frameworks. The motivation is to become much more
proficient with the most widely accepted user interface: The web browser.

Being a tradition Unix/Linux user, when I think of widely understood,
standardized interfaces, I’m going to be thinking of the good ‘ole shell. Just
run your unknown utility with a ‘-h’ or a ‘–help’ to get the impatience
user’s guide. That is really nice. A key piece of that consistency is how the options
are typically processed via the get-opts library.
Most people are familiar with the standard get-opts routines that are readily
available in your language of choice… from C to python. For those of you
unfamiliar with the get-opts routine, it is the library which makes all
command line options to a program standardized. That is why passing ‘-fp’ to
ps is the same as ‘-f -p’. What I admire about this library is
how successful it has been in unifying all sorts of utilities on Unix/Linux
for years. Heck, even tar came around after those early years!
So why not we take get-opts to the next level of convenience, usability and
power? I’ve already seen this trend start with python’s replacement of the
stock get-opts with href="http://www.python.org/doc/2.4/lib/module-optparse.html">optparse.
Can you guess where I’m going with this?

What if we supplied a bit more information than merely whether or not an
option takes an argument or not, and then use that extra information for
maximum power and usability? Imagine developing your latest utility and testing via command line, but then turning around and invoking it via a web browser, GTK window, TK window, etc? And why not?

In the simplest of implementations, a web page could provide radio buttons for
toggling various options as well as text form for any additional options. I
envision a metamorphosis of information and interface options while being ease to
use. After we get the basics working, we can start implementing more advanced
features, such as: grouping like options together in sections, allowing the
user to toggle layout to alphabetical, search options, keep advanced options
hidden by default, remember the last options used and much more! Think of what that’ll do for the Linux newbies?

I can not currently
call myself a web developer and honestly I am not anxious to become one
either, yet I want to make some of my utilities available via the web. I’ll spend an equal amount of time exposing my utility via the web as I spent developing it in the first place! Are you
the same? Care to develop the next generation of get-opts?