Using RDP Clients with VirtualBox machines

I’ve been working with VirtualBox a lot recently, but one obstacle I kept banging my head against was getting a Remote Desktop client (like Microsoft Terminal Services Client, for instance) to connect successfully.  It was peculiar because I could ping the VM’s IP address, connect through SSH, and had made sure the Extension Pack was installed, but the RDP client just kept timing out without any helpful explanation.

Finally I went back and read the documentation and figured out my mistake.  I had assumed I should use the IP address of my virtual machine as the target, but according to the documentation (see below) I should have been using the IP address of the host instead.

Chapter 7. Remote virtual machines

Since VRDP is backwards-compatible to RDP, you can use any standard RDP viewer to connect to such a remote virtual machine (examples follow below). For this to work, you must specify the IP address of your host system (not of the virtual machine!) as the server address to connect to, as well as the port number that the VRDP server is using.

I guess VirtualBox on the host provides some sort of proxy/passthrough service, and you set a different port for each virtual machine if you need to switch back and forth between VMs. So you might tell VM #1 to use port 3389, VM #2 to use port 3390, and so on.
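To make the host-versus-guest distinction concrete, here is a small JavaScript sketch of how those connection targets might be written down. The host address (taken from the documentation-only IP range) and the VM names are made up for illustration; only the idea of one host IP plus a distinct port per VM comes from the text above.

```javascript
// Hypothetical values: replace with your own host IP and VM names.
const hostIp = "192.0.2.10"; // the HOST running VirtualBox, not a guest

// One VRDP port per VM, following the guess above (3389, 3390, ...).
const vrdpPorts = {
  "vm-one": 3389,
  "vm-two": 3390,
};

// Build the "server:port" string an RDP client would be pointed at.
function rdpTarget(vmName) {
  const port = vrdpPorts[vmName];
  if (port === undefined) {
    throw new Error("No VRDP port recorded for " + vmName);
  }
  return hostIp + ":" + port;
}

console.log(rdpTarget("vm-one")); // 192.0.2.10:3389
console.log(rdpTarget("vm-two")); // 192.0.2.10:3390
```

Note that the guest's own IP address never appears anywhere in the target.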

Hopefully this will save someone a little time and frustration (or remind me if I should forget) in the future.

Using JavaScript to Solve a Trivia Question

This weekend, I wound up visiting a restaurant with an irritating trivia question on their welcoming chalkboard: “How many numbers between 0 and 200 contain the number ‘1’ in them?”  (e.g. 1, 10, 11, etc.) Puzzles like this tend to stick in my craw, so I tucked it away in my “stuff to figure out later when I have more time” stack.

Since I don’t know a mathematical algorithm to generate a sequence of numbers that contain the number 1 in them, the only way I see to solve this puzzle is “brute force.” Although we could do this on a piece of paper fairly quickly (i.e. write the numbers 0 through 200, circle the ones that have a one in them, and then count the circled numbers), that approach can’t be easily reused with different parameters. It also doesn’t scale well if the range of numbers to be checked grows significantly (e.g. between 0 and 10,000).

. . . .
<script type="text/javascript">

    function howManyXinYthruZ(x, startY, endZ) {

        // declare locals so nothing leaks into the global scope
        const needle = String(x); // the digit(s) we are looking for
        let counter = 0; // initialize a counter

        // iterate from first number to last
        for (let a = startY; a <= endZ; a++) {

            // convert number to string . . .
            const aString = a.toString();

            // so we can search for X
            if (aString.indexOf(needle) !== -1) {

                // if we find one, ding the counter
                counter++;

                // purely optional, for "proof"
                console.log(a);
            }
        }

        // share the final tally of matches
        console.log("count: " + counter);
    }

    // invoke the function to solve the puzzle
    howManyXinYthruZ( 1, 0, 200);

</script>
. . . .

If you run this little script in a blank page in your browser and then crack open the console, you’ll see a list of all the numbers that match the specified criteria as well as the final total of matches (119, in this case).

The nice part of this is that if I ever encounter another trivia question similar to this one (e.g. “How many numbers between 3000 and 5000 have the number 3 in them?”), I can quickly and easily solve it with one function call:

howManyXinYthruZ( 3, 3000, 5000);

// 1,271 in case you’re curious
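For anyone who wants to double-check those totals, here is a hedged variant that returns the count instead of logging it, plus a quick sanity check by counting the complement. The name countContaining is my own invention for this sketch and is not part of the original script.

```javascript
// Count integers in [start, end] whose decimal form contains `digit`.
function countContaining(digit, start, end) {
  const needle = String(digit);
  let count = 0;
  for (let n = start; n <= end; n++) {
    if (String(n).includes(needle)) {
      count++;
    }
  }
  return count;
}

// Complement check for 4000-4999: each of the last three digits has
// 9 choices that avoid 3, so 9^3 of those 1000 numbers contain no 3.
const withThree4000s = 1000 - 9 ** 3; // 271

console.log(countContaining(1, 0, 200));     // 119
console.log(countContaining(3, 3000, 5000)); // 1271

// 3000-3999 all contain a 3 (1000 numbers), and 5000 does not.
console.log(1000 + withThree4000s);          // 1271
```

The brute-force count and the back-of-the-envelope count agree, which is reassuring.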

Not bad, JavaScript.  Not bad.

Working With Fixed Width Text Files

Yesterday I attended GovLoop’s “Open Data Event” and was inspired to revisit a project I’d set aside a few months ago . . . to try a different approach.

For those who don’t know, the Fairfax County Police Department posts a weekly “data dump” of the tickets issued and arrests made.  I know Fairfax Underground has created a search engine based around a collection of this data, but it’s little more than a novelty item to see if anyone you know (e.g. friend, neighbor, blind date) has been arrested or ticketed.  I haven’t seen anyone plot the original data set on a Google Map, let alone mash it up with other data sets such as real estate prices or school test scores. It’s become a curiosity for me, to see if something more significant can be done with that information.

One frustrating challenge with this particular data file is that the data is stored as fixed width text: a very generic format, but not as immediately useful as XML or JSON, for instance.  You have to write code that tells your application to use characters 1-40 for column 1 data, use characters 41-60 for column 2 data, and so on . . . plus you have to hard code the names and number of the data fields, and that just didn’t seem like a particularly robust approach to me.
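Here is roughly what that hard-coded approach looks like, sketched in JavaScript rather than PHP so it can be pasted into a browser console. The layout array, the parseLine name, and the sample values are all made up for illustration; the column boundaries simply mirror the "characters 1-40, characters 41-60" description above.

```javascript
// Hypothetical hard-coded layout: every name and width is baked in.
// "Characters 1-40" becomes the zero-based slice [0, 40), and so on.
const layout = [
  { name: "LName", start: 0, end: 40 },
  { name: "FName", start: 40, end: 60 },
];

// Cut one fixed-width line into named, trimmed fields.
function parseLine(line) {
  const record = {};
  for (const col of layout) {
    record[col.name] = line.slice(col.start, col.end).trim();
  }
  return record;
}

// Build a made-up sample line by padding each value to its column width.
const sample = "DOE".padEnd(40) + "JANE".padEnd(20);

console.log(JSON.stringify(parseLine(sample)));
// {"LName":"DOE","FName":"JANE"}
```

It works, but the layout has to be rewritten by hand for every new file, which is exactly the part I wanted to get away from.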

I wanted to create a PHP function that could, for lack of a better term, “read and analyze” any fixed width text file and dynamically return the names of the fields as well as the start position of each field. I’ve written before about the struggle to come up with reusable algorithms that we can apply to different variations of the same basic principles, but this one took longer than I thought.

I won’t rehash the different approaches and failures; I’ll jump straight to the approach that finally worked for me.

 /* let's open our fixed width text file */
 $handle = @fopen("arrests.txt", "r");

 if ($handle) {
     // read our first line
     $headerRow = fgets($handle, 4096);

     /* parse for header row names, using TWO
      or more spaces for RegEx pattern */
     $headerRowNames = preg_split('/(?:\s\s+|\n|\t)/', $headerRow, 0, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_OFFSET_CAPTURE);

     // display array contents for examination
     echo "<pre>";
     var_dump($headerRowNames);
     echo "</pre>";

     // (the remaining lines of the file would be processed here)

     // close the file when we're done
     fclose($handle);
 }

Basically, we grab the first line of the file and use a regular expression that matches two or more consecutive spaces to figure out where one header row name ends and the next one begins. Since one of the header row names in my sample file has a space in it (i.e. Charge Description), we can’t use a single space as the delimiter. NOTE: There is an edge case where a header row name could be so long that only a single space separates it from the next header row name, so it’s not 100% “bulletproof”, but this was as close as I could get to a solution with only one evening of work.

Here are the results when we display the array:

array(8) {
  [0]=>
  array(2) {
    [0]=>
    string(5) "LName"
    [1]=>
    int(0)
  }
  [1]=>
  array(2) {
    [0]=>
    string(5) "FName"
    [1]=>
    int(40)
  }
  [2]=>
  array(2) {
    [0]=>
    string(5) "MName"
    [1]=>
    int(60)
  }
  [3]=>
  array(2) {
    [0]=>
    string(3) "Age"
    [1]=>
    int(100)
  }
  [4]=>
  array(2) {
    [0]=>
    string(7) "DateArr"
    [1]=>
    int(105)
  }
  [5]=>
  array(2) {
    [0]=>
    string(6) "Charge"
    [1]=>
    int(135)
  }
  [6]=>
  array(2) {
    [0]=>
    string(15) "Charge Descript"
    [1]=>
    int(160)
  }
  [7]=>
  array(2) {
    [0]=>
    string(7) "Address"
    [1]=>
    int(210)
  }
}

This gives us a multidimensional array with both the field names contained in the header row and their offset positions, which will certainly come in handy as we process the remaining lines of the file. Time permitting, I’d like to drop this as a method into a PHP class, alongside methods to convert the extracted data into either a JSON or XML file. That way, I’ll have a utility class that can add value to other fixed width text files in the future.
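As a rough idea of where this could go next, here is a sketch of the same header detection followed by the row slicing, written in JavaScript rather than PHP. The function names (detectFields, parseRow) and the sample data are hypothetical, and the pattern is a stand-in for the preg_split call above rather than a translation of it: it matches runs of non-space characters joined by single spaces, which comes out the same as splitting on two or more spaces for a space-padded header.

```javascript
// Find field names and start offsets in a header line. Names may contain
// single spaces; two or more spaces mark the gap between fields.
function detectFields(headerLine) {
  const fields = [];
  for (const m of headerLine.matchAll(/\S+(?: \S+)*/g)) {
    fields.push({ name: m[0], start: m.index });
  }
  return fields;
}

// Slice a data line: each field runs from its own start offset up to
// the next field's start (or to the end of the line for the last one).
function parseRow(line, fields) {
  const record = {};
  fields.forEach((field, i) => {
    const end = i + 1 < fields.length ? fields[i + 1].start : line.length;
    record[field.name] = line.slice(field.start, end).trim();
  });
  return record;
}

// Made-up sample data, padded to illustrative column widths.
const header =
  "LName".padEnd(40) + "FName".padEnd(20) + "Charge Descript".padEnd(50);
const row =
  "DOE".padEnd(40) + "JANE".padEnd(20) + "SPEEDING".padEnd(50);

const fields = detectFields(header);
console.log(fields.map((f) => f.name + "@" + f.start).join(", "));
// LName@0, FName@40, Charge Descript@60

console.log(JSON.stringify(parseRow(row, fields)));
// {"LName":"DOE","FName":"JANE","Charge Descript":"SPEEDING"}
```

The same single-space edge case applies here: a header name that runs right up to its neighbor would be merged into it.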
