Arduino + Laser Tripwire + Camera

This project makes use of an Arduino and some fairly simple electronics to trigger a DSLR when a person breaks a laser beam. This laser beam is password-protected and can be enabled and disabled using an iPhone/iPod Touch web app. It’s conceptually simple, and makes for really fun party games as well.

The Laser

In my setup, I used several small mirrors to turn the single beam into a grid that covered a room.

Laser beam reflected around a room.

The 5 mW green laser is barely visible under normal conditions; the image above was taken with a long exposure time. The laser is turned on and off by the Arduino, and so this beam is not visible in photos. The laser is from a normal laser pointer, and looks like this:

The laser pointer. It's held in a microphone clip mounted on a tiny tripod.

The laser pointer’s button is held down by the clip, and the power supply to the laser is controlled using a relay.

The Detector

The laser beam is detected by an LDR in a simple voltage divider circuit connected to the Arduino. The Arduino is configured to use an interrupt to process the voltage change – this will be described in detail a little further down. As the laser beam is scattered by the imperfections in the mirrors, it is first collected and focused onto the LDR using a Fresnel lens. Here is a photo of the beam:

Scattered laser beam

And here is the beam collection set-up:

Beam collection & detection

Note the lens on the right. It is positioned to focus the incoming light onto the sensor. This saturates the LDR, allowing its resistance to fall to around 300 Ω (in complete darkness, it is around 30,000 Ω). The circuit diagram is shown below (drawn in Fritzing). Note that the LDR is in series with 5 kΩ of resistance, and that the voltage is being measured by digital input 2, not an analog input.

LDR voltage divider circuit to detect the laser beam.

The simple way of detecting whether the light beam was interrupted would be to use the Arduino’s built-in analog-to-digital converter and the handy analogRead() function to determine the potential across the LDR, and use that to calculate the intensity of the incident light. The problem with this is that it’s slow (about 100 μs per read) and requires constant polling, which prevents the Arduino from doing anything else (strictly speaking, this is not true, but at best the polling frequency would suffer).

By connecting the divider to digital pin 2, we can use the Arduino’s external interrupt pin to call a function when the voltage rises from LOW to HIGH. Instead of setting a voltage threshold (as we would do in the simple method), we set the sensitivity of the trigger in hardware – by changing the resistance from +5 V to the LDR, we can adjust the threshold intensity of light.
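As a rough sanity check of the hardware threshold, the divider voltages can be worked out from the resistances quoted above. This is only a sketch: it assumes the fixed resistor sits between +5 V and pin 2 with the LDR between pin 2 and ground, and it assumes typical 5 V AVR input thresholds (LOW below 0.3 × Vcc, HIGH above 0.6 × Vcc). Neither assumption is confirmed by the circuit diagram text, and the function names are made up for illustration.

```python
# Assumed topology: +5 V -> fixed resistor -> pin 2 node -> LDR -> ground.
V_SUPPLY = 5.0          # volts
R_FIXED = 5_000.0       # ohms, the series resistance from the text
R_LDR_LIT = 300.0       # ohms, beam focused on the sensor
R_LDR_DARK = 30_000.0   # ohms, complete darkness

# Assumed logic thresholds for a 5 V AVR digital input.
V_LOW_MAX = 0.3 * V_SUPPLY
V_HIGH_MIN = 0.6 * V_SUPPLY


def pin_voltage(r_ldr, r_fixed=R_FIXED, v_supply=V_SUPPLY):
    """Voltage across the LDR, which is what the digital pin sees."""
    return v_supply * r_ldr / (r_fixed + r_ldr)


def logic_level(voltage):
    """Classify a voltage against the assumed input thresholds."""
    if voltage <= V_LOW_MAX:
        return "LOW"
    if voltage >= V_HIGH_MIN:
        return "HIGH"
    return "UNDEFINED"


if __name__ == "__main__":
    for label, r_ldr in (("beam present", R_LDR_LIT), ("beam broken", R_LDR_DARK)):
        v = pin_voltage(r_ldr)
        print(f"{label}: {v:.2f} V -> {logic_level(v)}")
```

Under these assumptions the pin sits near 0.28 V while the beam is present and rises to about 4.3 V when it is broken, which is consistent with triggering on a LOW-to-HIGH transition. Raising or lowering R_FIXED shifts how dark the sensor must get before the pin reads HIGH.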

The Camera Trigger

The camera trigger is a modified Yongnuo ML-L3 clone (available on eBay) that has additional female header pins soldered onto the button contacts. The remote’s board has holes drilled in a convenient position. Modifying it is simple – just use a screwdriver to pry off the top sticker, solder on the female headers, cut the sticker and paste it back. The original button remains fully functional.

Camera Trigger

An optoisolator is used (with the two headers connected across the phototransistor end) to trigger the shutter.

The Laser & The Buzzer

Connecting these two is very easy – the buzzer is controlled using a transistor (I used the MPS-2222A, a small NPN transistor), and the laser is controlled using a relay, which is in turn controlled by a transistor.

The only caveat is that the laser diode expects a 3 V power supply, and so the relay is connected to the Arduino’s 3.3 V output. Ideally, the laser would have its own power circuit and surge protection, and it would be connected to an independent power supply. The photo of the breadboarded circuit may be helpful:

The circuit on a breadboard.

Note the shroud for the LDR on the bottom left of the image. It’s cut from a sheet of black paper, and it allows you to greatly increase the sensitivity of the circuit.

The Code

The code is split up into three sections: An iPhone web app, a Python server running on a computer and finally the Arduino code. In the current version of the software, users can enable and disable the laser beam remotely using a very simple interface:

iPhone/iPod Touch app.

The code is simple and commented, and can be downloaded here.

If you do make your own version, link to it in the comments and I’ll feature it here.

Review: Densitron DD-6448BE-2A (OLED panel)

The good folks over at element14 sent me a really, really small OLED display for an upcoming project: the Densitron DD-6448BE-2A, with the accompanying evaluation board. It’s a blue/black OLED screen, 64 pixels wide by 48 pixels high, with good visibility and pretty decent contrast. It uses 3 V logic, but needs a 9 V supply to drive the display. Here’s a photo of the screen mounted on the evaluation board:

Densitron DD-6448BE-2A

And here’s a closer view of the screen itself; it’s only 18.5 mm wide and 18.1 mm tall:

Closer Look

More details (and the review proper) in a few weeks’ time!

Persistence of Vision prototype

Recently, I saw an impressive business card how-to that uses Persistence of Vision to display floating text. Take a look at it.

I was considering the plausibility of a version of this card that can display the text and graphics in color. The basic mechanism (Persistence of Vision) is still the same, but the new card uses either 3-color RGB SMD LEDs (which are prohibitively expensive), or three rows of LEDs (red, green and blue each get their own row).

Row Alignment

In a normal PoV card, the text is displayed along a single arc (the yellow path). Once multiple rows of LEDs are introduced, it becomes necessary to align them such that all their paths coincide in order to display a single color image. Without any alignment, there is an obvious separation in the paths. In the diagram below, observe how the paths appear to coincide closest to the vertical position, but are far apart at the ends:

For the rows to be successfully aligned, they must converge at the point of the elbow (assuming, of course, that the person holding the card holds his elbow in a fixed position while viewing these cards). The alignment must be like this:

Where r is the distance from the pivot point (your elbow) to the lowest LED, and θ is the angular displacement between two adjacent rows. Notice that the arcs do overlap, but have an angular displacement. This is unavoidable, and can be (reasonably) easily corrected in software. The exact method used to correct this is discussed in the next section. In the final model, it is desirable to make θ as small as possible. Also, you must know the value of r before you can build the card.

Calculating the exact placement of each of the LEDs in such an arrangement only requires basic trigonometry (or even geometry), and is left as an exercise for the reader. The main constraining variables are r, d, and the minimum distance between LEDs. Assuming 0.5 m (a child’s arm), 5 cm and 8 mm (respectively), θ = 0.9° and the topmost LEDs will be 9 mm apart – a close-to-parallel construction. In fact, using the above values for d and minimum distance, we can quickly plot the difference in spacing (in mm) against the arm length (in meters):

Displacement against r
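The numbers quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes that d is the radial length of an LED row (from the lowest LED to the topmost one) and that the minimum distance applies between the lowest LEDs of adjacent rows; the post does not define d explicitly, so treat that reading as a guess. It also uses the small-angle approximation (arc length = radius × angle), and the function name is hypothetical.

```python
import math


def row_geometry(r_mm, d_mm, min_gap_mm):
    """Return (theta in degrees, gap between topmost LEDs in mm).

    r_mm:       distance from the elbow to the lowest LED
    d_mm:       assumed radial length of one LED row
    min_gap_mm: spacing between the lowest LEDs of adjacent rows
    """
    theta_rad = min_gap_mm / r_mm            # small-angle approximation
    top_gap_mm = (r_mm + d_mm) * theta_rad   # same angle, larger radius
    return math.degrees(theta_rad), top_gap_mm


if __name__ == "__main__":
    # Difference in spacing (top gap minus bottom gap) for several arm lengths.
    for r_m in (0.3, 0.4, 0.5, 0.6, 0.7):
        theta_deg, top_gap = row_geometry(r_m * 1000, 50, 8)
        print(f"r = {r_m:.1f} m: theta = {theta_deg:.2f} deg, "
              f"difference = {top_gap - 8:.2f} mm")
```

For r = 0.5 m this gives θ of roughly 0.92° and a top spacing of 8.8 mm, in line with the rounded figures in the text.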

It’s possible to make a card with moving sections that allow a person to pick their height, but that would simply be overkill. Here’s a design:

If you implement it, do tell me! The combination of precise cutting and moving electronics puts this out of the reach of all but the most determined hobbyists.

I have an initial prototype of a single-color device, made using cardboard, copper tape and an Arduino Uno (not mounted on the card itself – for obvious reasons). This is the card with two LEDs lit:

Prototype

And this is the circuit diagram sketched out on the cardboard, with components placed appropriately:

Sketch

iPad- and Kinect-Controlled Car

This project extends a simple remote-controlled car, allowing it to be controlled by an iPad or by hand gestures. This project builds on the Arduino project, the Kinect and certain HTML5 features (WebSockets, DeviceMotionEvent, Canvas). The final product is this:

Overview

There are two different versions of this project – one for the HTML5 web app, and the other for the Kinect. In the HTML5-based version, the web application uses DeviceMotionEvent to get accelerometer readings and determine what the car has to do. This action is encoded in the format expected by the Arduino sketch, and is then sent over a WebSocket to a simple server written in Python. The Python script simply forwards the received data to the Arduino via the serial port. The Arduino toggles its output to close and open switches on the remote controller (using optical isolators). The car moves correspondingly.

The Kinect-based version functions in a nearly identical manner – the only difference is that the Kinect data is received and processed in the same C# application that dispatches instructions over the serial port. You can download the full source code here.

Now, to take a closer look at each section of this project, from the bottom-up:

The Arduino

The Arduino receives commands from its Serial interface and toggles its output to control the car’s remote controller.  For a controller that supports only one speed, the circuit looks like this:

Fritzing image of circuit.

Each output pin controls current passing through an opto-isolator, which isolates the circuit of the Arduino from that of the car’s controller. This allows the Arduino to control the car, despite both circuits having different electrical potentials. The switches at the top of the above diagrams are placeholders for the actual control mechanism of the car.  A current-limiting resistor is chosen so as to provide a current within the operating parameters of the opto-isolator. The breadboarded circuit looks like this:

Photo of Arduino Circuit

The sketch that the Arduino runs is very simple:

int pinRight = 11;
int pinLeft = 10;
int pinForward = 9;
int pinReverse = 8;

void setup() {
  Serial.begin(9600);
  pinMode(pinRight, OUTPUT);
  pinMode(pinLeft, OUTPUT);
  pinMode(pinForward, OUTPUT);
  pinMode(pinReverse, OUTPUT);
}

void loop() {
  if(Serial.available() > 0){
    int tmpByte = Serial.read();
    switch(tmpByte){
      case 'w': // Move car FORWARDS
        digitalWrite(pinReverse, LOW);
        digitalWrite(pinForward, HIGH);
        break;
      case 's': // Move car in REVERSE
        digitalWrite(pinForward, LOW);
        digitalWrite(pinReverse, HIGH);
        break;
      case 'a': // Turn steering wheels LEFT
        digitalWrite(pinRight, LOW);
        digitalWrite(pinLeft, HIGH);
        break;
      case 'd': // Turn steering wheels RIGHT
        digitalWrite(pinLeft, LOW);
        digitalWrite(pinRight, HIGH);
        break;
      case '_': // STOP ALL motion
        digitalWrite(pinReverse, LOW);
        digitalWrite(pinForward, LOW);
        // The missing break; here is entirely intentional.
      case 'x': // Move steering wheels STRAIGHT
        digitalWrite(pinRight, LOW);
        digitalWrite(pinLeft, LOW);
        break;
      default:
        break;
    }
  }
}

Notice that the digitalWrite(pin, LOW); command always precedes the digitalWrite(pin, HIGH); command? This is to prevent conflicting commands from being sent to the car (e.g. forwards and backwards simultaneously).

And now, on to the Web App-based controller:

Web Application Controller

This controller comes in two parts. One is the actual client, which is served as a single HTML file (with optional additions – discussed later), and the other is the server, which is a Python script that simply copies all data sent over a WebSocket to the Arduino over a serial port. This is what the client interface looks like:

The web app on an iPad.

The server is based on this Python script (if you want to use a test server, grab this code – the response headers adhere to the Same-Origin Policy). pyserial 2.5 is used to send output to the Arduino.

The web-based client, on the other hand, is much more complex and interesting. A single file (index.html) draws the GUI (using Canvas), reads the tilt angle of the device (using DeviceMotionEvent), calculates the action that the car has to perform and sends the action to the Python script over a WebSocket connection. HTML5 is certainly useful, isn’t it?
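To make the "calculates the action" step concrete, here is a minimal sketch of how tilt readings might be turned into the single-character commands that the Arduino sketch understands ('w', 's', 'a', 'd', '_', 'x'). It is written in Python rather than the JavaScript used by the actual client, and the dead zone, the sign conventions and the function name are all assumptions made for illustration, not values taken from the download.

```python
DEAD_ZONE_DEG = 10.0  # hypothetical: tilts smaller than this count as "level"


def commands_for_tilt(pitch_deg, roll_deg, dead_zone=DEAD_ZONE_DEG):
    """Map tilt angles to a two-character command string.

    Assumed conventions: positive pitch means "tilted forwards",
    positive roll means "tilted to the right".
    The drive command comes first because '_' on the Arduino also
    straightens the wheels; the steering command that follows then
    sets the final wheel position.
    """
    if pitch_deg > dead_zone:
        drive = "w"   # forwards
    elif pitch_deg < -dead_zone:
        drive = "s"   # reverse
    else:
        drive = "_"   # stop

    if roll_deg > dead_zone:
        steer = "d"   # right
    elif roll_deg < -dead_zone:
        steer = "a"   # left
    else:
        steer = "x"   # straight

    return drive + steer


if __name__ == "__main__":
    print(commands_for_tilt(20, 0))
    print(commands_for_tilt(0, -15))
```

Each returned character would be sent over the WebSocket and forwarded unchanged to the serial port, one byte per command.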

The JavaScript has been extensively commented, and is included (along with the optional files) in the download above.

The purpose of the optional files (cache_manifest.php and date.php) is explained here. They are not essential to the functioning of the code.

Kinect Controller

The Kinect controller is written using Code Laboratories’ CL NUI SDK instead of the more commonly used (and official) OpenNI or OpenKinect/libfreenect. The primary motivation in choosing CL NUI over the other SDKs is that CL NUI makes writing code in C# very easy, and serial communication is trivially easy in C#. The tradeoff of writing in managed C# is that (1) threading is inevitable, which adds to the complexity of the code, and (2) the image processing code runs painfully slowly. The software looks like this:

Kinect view.

The bar on the left is not relevant to this project – it is just a quick way to move the Kinect up and down (using the built-in motor) and to read and graph the angle over time. The algorithm used to detect the position of the hand has been deliberately kept simple – working with System.Drawing.Bitmap objects is very slow. Here is the algorithm:

// System.Drawing.Bitmap bmpVideoData contains the current frame.
// double xbar, ybar contain the running average of the points that lie
// 	in the desired depth range. Tweak the incremented values until satisfied
// 	with the accuracy and speed trade-off. The choice of 10 is arbitrary.
for(int i=0; i<bmpVideoData.Width; i+=10)
    for(int j=0; j<bmpVideoData.Height; j+=10){
        c = bmpVideoData.GetPixel(i, j);
        // Check to see if color of pixel in depth map corresponds to the desired
        // depth range. This was manually calculated beforehand, but can be
        // automated if the range needs to be varied.
        if (c.R == 0 && c.B == 255 && c.G < 192) {
            // Live, numerically stable, mean calculation:
            ++count;
            xbar += (i - xbar) / count; // Update the mean x value
            ybar += (j - ybar) / count; // Update the mean y value
        }
    }
// xbar and ybar will be used to calculate the position of the hand onscreen,
// and the action to be performed by the robot.

 

As algorithms go, this is among the simplest. It gives surprisingly robust tracking, though. Do note that you need to install Code Laboratories’ CL NUI SDK before you can run the code included in the download above. Once you have done that, copy CLNUIDevice.cs, CLNUIDevice.dll and NUIImage.cs into the project folder, replacing the existing files. (As per their SDK license requirements, I cannot distribute these files directly).
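The same idea can be tried without a Kinect. The sketch below is a Python restatement of the sampling loop and the running-mean update, not the C# code from the download; the is_hand predicate stands in for the depth-color test, and both function names are invented for this example.

```python
def matching_points(width, height, step, is_hand):
    """Yield every sampled (x, y) on a coarse grid that passes the test."""
    for x in range(0, width, step):
        for y in range(0, height, step):
            if is_hand(x, y):
                yield x, y


def running_centroid(points):
    """Incrementally compute the mean position of the given points.

    Uses the update mean += (value - mean) / count, which avoids
    accumulating a large sum before dividing.
    """
    count = 0
    xbar = ybar = 0.0
    for x, y in points:
        count += 1
        xbar += (x - xbar) / count
        ybar += (y - ybar) / count
    return count, xbar, ybar


if __name__ == "__main__":
    # Hypothetical "hand": a rectangle in a 100x100 frame, sampled every 10 px.
    def in_rectangle(x, y):
        return 20 <= x <= 40 and 10 <= y <= 30

    print(running_centroid(matching_points(100, 100, 10, in_rectangle)))
```

A larger step trades accuracy for speed in the same way as the increments of 10 in the loop above.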

Future Expansion

  1. Use the Ethernet shield and write a sketch that allows the Arduino to act as a WebSocket server. This will remove the need for having a computer as an intermediary (to forward WebSocket data to the Arduino over the serial port).
  2. Modify the web app code to automatically use the internal gyroscope when available (by using DeviceOrientationEvent instead of DeviceMotionEvent). When I eventually get a device with a gyroscope, I’ll look into it.
  3. Implement the same thing in a toy helicopter. Same concept, new dimension! Use the Kinect to gather positional data, and the iPad to steer it around.
  4. Modify the vision algorithm to detect each hand separately (using a conditional floodfill), and allow each hand to control a different car. Alternatively, use it to steer a helicopter in three dimensions.

iPad Web App – Cache Manifests

Recently, the Minesweeper3D project was adapted into a web application for iPad – complete with multi-touch gestures and offline caching. It’s available here, with the source code here.

iPad web applications can be written to function remarkably like native applications. With the ability to keep a copy of the web app for offline use, web apps are made a lot more useful. Cache manifests are used to make this happen on iOS (see the official document here).

A cache manifest is simply a list of all the files required by the app to run offline. It must be served with the Content-Type header set to text/cache-manifest, and the first line must be CACHE MANIFEST. Each relative file path must occur on a separate line, and all text after a # is ignored. Files are updated when the hash of the manifest file changes. A sample manifest file is here:

CACHE MANIFEST
# Sample Manifest
index.html
arbitrary.js

For the Minesweeper3D project, the cache manifest is generated by a simple script. Feel free to modify it for your own use:

<?php
$files = array("index_iPad.html", "display_canvas_iPad.js", "date.php", ... );
header("Content-Type: text/cache-manifest");
?>
CACHE MANIFEST
# Manifest for Minesweeper3D
# Uses filemtime() to automatically change contents
<?php
foreach ($files as $fn)
	echo $fn."\n# Mod:".filemtime($fn)."\n\n";
?>

Since the hash of the manifest file is used to determine if the files need to be redownloaded, even changing the contents of the comments will cause the browser to download the files again. The above script takes advantage of that by including the time that each file was last modified as a comment – any change in the contents of a file will cause the timestamp to change, which will change the hash of the manifest. As the manifest is downloaded each time the application is opened, this simple method may prove to be too resource intensive. This can easily be addressed by caching the cache file – an amusingly self-referential but effective method.
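For anyone not running PHP, the timestamp-in-a-comment trick is easy to reproduce elsewhere. The following is a small Python sketch of the same idea, not a port of the script used for Minesweeper3D; the function name and the injectable mtime argument are conveniences invented here so that the behavior can be checked without touching real files.

```python
import os


def build_manifest(paths, mtime=os.path.getmtime):
    """Return cache manifest text listing each path with its mtime as a comment.

    Any change to a file's modification time changes the manifest text,
    which is what prompts the browser to refresh its cached copies.
    """
    lines = ["CACHE MANIFEST", "# Generated manifest"]
    for path in paths:
        lines.append(path)
        lines.append(f"# Mod:{int(mtime(path))}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Fake modification times stand in for real files in this demonstration.
    fake_times = {"index.html": 1000, "arbitrary.js": 2000}
    print(build_manifest(fake_times, mtime=fake_times.get))
```

The output still has to be served with the text/cache-manifest content type, exactly as with the PHP version.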

Version tracking

Debugging the cache system is usually difficult, but here is a simple way to keep track of the current version present in the cache. This solution comes in two parts, and is compatible with the above manifest making script. It comes as an external php script that is just two lines long:

<?php
header("Content-type: application/x-javascript");
echo "var php_date=\"".date("r")."\";";
?>

It acts as an external JavaScript file that defines the value of the variable php_date to be the date and time that the file was accessed. Since the client only updates the files when the manifest changes, the cache version of this file will not change until the manifest changes. Hence, the date stored in this file is the date at which the file was last downloaded.

function display_initialize(){
	// ...
	document.getElementById("update_date").innerHTML = php_date;
}

Adding a single line to display this date allows us to keep track of the version that is currently being used to display the document. Simple and effective.

A few things to note:

Using the manifest file overrides any other cache directive – HTTP headers, browser configuration, etc. Even pages served over HTTPS are not exempt from this behavior.

The web app can only directly include files mentioned in the manifest. For example, using a <script> tag to include arbitrary.js will only succeed if the manifest file also mentions arbitrary.js. If the file is not mentioned, then it will not be available in the app. Changing this default behavior can be done by appending this to the end of the cache manifest:

NETWORK:
*

Canvas Rendering

Interestingly enough, the canvas rendering is not hardware accelerated. To prevent visible latency in the animations, the variable ani_ActiveMovement in display_canvas_iPad.js can be used to disable the rendering of text in the main grid while an animation is running. Uncommenting line 13 of the snippet below enables that “feature”. (Ultimately, the problem of lag was solved by changing the frame rate and increasing the “snap” distance.)

function ani_moveToTargetZ(tgtZ){
	if(isNaN(tgtZ)) return;
	if(Math.abs(current_z-tgtZ) < 0.1){
		current_z = tgtZ;
		ani_ActiveMovement = false;
		display_Z_slice(current_z);
		return;
	}
	current_z = (current_z + tgtZ)/2;
	display_Z_slice(current_z);
	// Check timer
	clearTimeout(ani_timer1);
	//ani_ActiveMovement=true;
	ani_timer1 = setTimeout("ani_moveToTargetZ(" + tgtZ + ")", 40);
}

Hopefully, hardware rendering will be enabled in a future update.
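To see why the “snap” distance matters, it helps to count how many redraws the halving animation needs before it snaps to the target. This is a Python model of the loop in ani_moveToTargetZ, written only to illustrate the arithmetic; the function name is hypothetical and the 40 ms figure is simply the timeout used in the snippet above.

```python
def frames_to_target(start_z, target_z, snap=0.1):
    """Count halving steps until the remaining distance is below snap."""
    z = start_z
    frames = 0
    while abs(z - target_z) >= snap:
        z = (z + target_z) / 2   # same update as current_z = (current_z + tgtZ)/2
        frames += 1
    return frames


if __name__ == "__main__":
    for snap in (0.1, 0.5):
        n = frames_to_target(0, 4, snap)
        print(f"snap = {snap}: {n} intermediate frames, about {n * 40} ms at 40 ms each")
```

Moving across four slices takes six intermediate redraws with a snap distance of 0.1 but only four with 0.5, so a larger snap distance directly reduces the number of expensive canvas redraws.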

The story of Blinky, Inky, Pinky and Clyde.

All modern browsers limit the number of concurrent connections that they establish with HTTP servers so that connections and devices are not overburdened. There are usually two limits: a cap on the number of connections to a host, and another cap on the total number of outbound connections. At the time the HTTP/1.1 standard was written in 1997, the limit was two connections per host (see RFC 2068, section 8.1.4). For a website that makes extensive use of included content, this limit is rather restrictive. Unsurprisingly, most modern browsers deliberately set their limits high – typically 4-6 connections per host (more here).

This problem came up when I was working on my T-Shirt Design browser – the thumbnail images were loading unbearably slowly. The limited number of connections available were forcing the thumbnails to be downloaded sequentially, rather than concurrently. This post details a rather simple way to get around this problem by using multiple hosts to serve files.

Maintaining mirror hosts is, with some amount of planning, rather easy – just add additional DNS A records and configure your server to serve the exact same set of files for calls to multiple domains (on Apache, just specify the same DocumentRoot for multiple VirtualHosts). This is where the rather cryptic title of this post comes in – blinky, inky, pinky and clyde are all sub-domains of gauravmanek.com. Here is an excerpt from the DNS records of gauravmanek.com:

DNS Zone: gauravmanek.com

Record    Type    Value
------    ----    -----
           A     173.236.181.179
blinky     A     173.236.181.179
clyde      A     173.236.181.179
inky       A     173.236.181.179
pinky      A     173.236.181.179

As you can see, blinky.gauravmanek.com, inky.gauravmanek.com, pinky.gauravmanek.com, clyde.gauravmanek.com and gauravmanek.com are all on the same IP. Do note that I did not use a wildcard record for this, even though it’s technically possible. I don’t directly edit my httpd.conf settings, but the entries needed to generate the desired behavior should (might? possibly? I’m not particularly experienced with Apache, so don’t take my word as the gospel truth) look something like this:

NameVirtualHost *:80

<VirtualHost *:80>
   DocumentRoot /www/main_site
   ServerName gauravmanek.com
</VirtualHost>
<VirtualHost *:80>
   DocumentRoot /www/main_site
   ServerName blinky.gauravmanek.com
</VirtualHost>

# Repeat for inky, pinky and clyde.

Now the exact same website is being served on each of the subdomains – this means that the path to each file is the same, making our job much easier. This can be verified manually by accessing the same file via each hostname. For example:

http://www.gauravmanek.com/images/OAS.gif

http://www.blinky.gauravmanek.com/images/OAS.gif

http://www.inky.gauravmanek.com/images/OAS.gif

http://www.pinky.gauravmanek.com/images/OAS.gif

http://www.clyde.gauravmanek.com/images/OAS.gif

Now that the mirroring works, we can modify the client-side code/markup to spread the load across each host to meet the aim of maximizing parallel downloads. There is an important constraint to keep in mind – the browser must access a particular resource from the same host each time, or the benefits of having multiple hosts are lost (Each resource is cached by hostname, and so accessing it from another hostname will result in multiple instances of the same resource in cache. This is not good.) For static content, simply replacing each reference to a particular file will suffice. It’s not particularly exciting, but it does work. Alternatively, a small piece of JavaScript could change the src and href attributes at runtime, but this is likely to worsen performance (the linked article provides many of the rules that this article both builds on and breaks).

For dynamic content meant to be asynchronously loaded, though, this is easily implemented. Most, if not all, scripts that dynamically download resources after the page has loaded do so from an array or similar source. To load the content, simply use the iterator variable modulo number of hosts available to quickly distribute the requests into appropriate groups. As used in the (as of February 2011) current version of the T-Shirt Design Browser:

var ts_mirrorServers = new Array("http://www.inky.gauravmanek.com", "http://www.pinky.gauravmanek.com", ... , ".");
// Some code
function initialize() {
	// For each preview icon, pick a host based on the icon's index.
	for (var i = 0; i < tsIcoNodes.length; i++)
		tsIcoNodes[i].src = ts_mirrorServers[i%ts_mirrorServers.length] + ts_icoPrefix + ts_icoNameArr[i];
}

And that’s it. It should work properly now.
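The index-modulo trick only keeps a resource on the same host if the array order never changes. One possible variation, sketched here in Python with placeholder example.com hostnames, is to pick the host from a stable checksum of the file name instead. This is an alternative for illustration, not what the T-Shirt Design Browser does; the function names and the mirror list are made up.

```python
import zlib

# Placeholder mirror list; substitute your own hostnames.
MIRRORS = [
    "http://inky.example.com",
    "http://pinky.example.com",
    "http://clyde.example.com",
]


def host_by_index(i, mirrors=MIRRORS):
    """Round-robin assignment, as in the JavaScript above."""
    return mirrors[i % len(mirrors)]


def host_by_name(filename, mirrors=MIRRORS):
    """Assign a host from a checksum of the name.

    crc32 gives the same value on every run, unlike Python's built-in
    hash() for strings, so a file keeps its host even if the list of
    files is reordered.
    """
    return mirrors[zlib.crc32(filename.encode("utf-8")) % len(mirrors)]


if __name__ == "__main__":
    for i, name in enumerate(["a.png", "b.png", "c.png", "d.png"]):
        print(name, host_by_index(i), host_by_name(name))
```

Either way, the important property is the one described earlier: a given file should always be requested from the same hostname so that the browser cache is not duplicated.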

While there are better methods of dealing with this problem (see SpriteMe, more on this later), none are as easy to implement for dynamic content as the solution discussed on this page. (Note: I’m working on mixing sprite generation and this together. Let’s see if it works.)

This has one additional benefit, especially important for cookie-heavy sites. As the hosts are different, cookies that would be sent as part of the browser’s GET request are no longer sent, reducing both transfer and computational overhead. This is the reason that sites often use a single subdomain to serve static content (e.g.: static.bbc.co.uk).

Potential Problems
This method is, however, rather problematic at times. It incurs two main overheads that make it unsuitable for serving many tiny files.

Firstly, the DNS overhead. DNS round-trips can take more than a second to complete, and absolutely no content can be transferred until this request is completed. If the mirror system is only used for a few pages (as it is in this case), then small snippets of JavaScript can be used to asynchronously download dummy images from these hosts while on other pages of the site – thereby forcing the browser to resolve the domain names beforehand. (This is scheduled for implementation; I will post an update when it’s done.)

Secondly, establishing a TCP connection is time-consuming, and this is the overhead that makes the current method impractical. While it’s possible for the connection to be “reused” (using Connection: Keep-Alive), it’s not something that can be relied upon. This is why sprites are a popular solution to this problem.

A little more

You might find it desirable to block access to your website on each of your mirror domains – to not do so would allow people to maintain multiple sessions on your website (if you use cookies to track sessions) and could potentially confuse people. A simple mod_rewrite directive can solve this problem. Alternatively, if you use a PHP-based CMS, put this in the head of your page:

<?php
if($_SERVER['HTTP_HOST'] != "www.example.com" && $_SERVER['HTTP_HOST'] != "example.com")
	if(preg_match('/^(www\.)?(blinky|inky|pinky|clyde)\.example\.com$/', $_SERVER['HTTP_HOST'])){
		header("Location: http://www.example.com/",TRUE,301);
		die();
	}
?>

I now leave you with one final image, based on artwork from the original Pac-Man:

Perhaps I’ll put it on a t-shirt?

3D Histogram

Recently, as part of my research, I implemented a modified version of the algorithm detailed in this paper. While it was unsuitable for the intended purpose, the process did yield interesting results in the form of three-dimensional color-frequency histograms.

First, a quick introduction to the concept of a color space: colors are usually described additively (starting from black and adding light) or subtractively (starting from white and subtracting light). The former is usually used with screens and computer displays (which work by adding light), and the latter with print materials (which work by adding pigments, which subtract light). A color space of this kind by no means depicts every possible color, but it does an excellent job of specifying colors for use by digital devices. The RGB color space is the most commonly used color space in digital images, simply because displays can use it directly and printers (or other devices) can easily convert it to the appropriate format.

RGB is usually specified by a set of three numbers (each representing the intensity of red, green or blue present in the final mix, as an integer from 0-255 or a fraction from 0-1). Since these three colors are assumed orthogonal in human perception, they can be drawn in a cube, with each color varying along one dimension of the cube. This means that each point in the cube can be described using a unique set of three numbers, and that each set of three intensities describes a particular, unique, color. The image below expresses this concept visually:

Three-dimensional visualization of the RGB colorspace. Image courtesy of SharkD on Wikimedia Commons.

The purpose of a histogram is to calculate which colors are most popular in a very simple way. The entire RGB cube is divided into “bins” of equal size, where each bin corresponds to colors that belong to the same ranges of intensities of each primary color. The divisions in the image above divide the larger cube into 5³ = 125 bins. It is important for the “size”, the range of colors that each bin encompasses, to be perceptually uniform. This is not the case in the RGB model, but we can live with the assumption for now. In the final histogram, each bin contains the total number of pixels in an image that fall within its range.
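The binning step is short enough to write out. The sketch below is a plain-Python illustration of the idea, assuming 8-bit channel values (0 to 255) and equal-width bins; it is not the code used to produce the plots in this post, and the function names are invented here.

```python
from collections import Counter


def bin_index(value, bins_per_channel):
    """Map an 8-bit channel value (0-255) to a bin number."""
    return value * bins_per_channel // 256


def histogram_3d(pixels, bins_per_channel=5):
    """Count pixels per (red bin, green bin, blue bin).

    pixels is any iterable of (r, g, b) tuples with 8-bit channels.
    Bins that contain no pixels are simply absent from the result.
    """
    counts = Counter()
    for r, g, b in pixels:
        key = (
            bin_index(r, bins_per_channel),
            bin_index(g, bins_per_channel),
            bin_index(b, bins_per_channel),
        )
        counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [(0, 0, 0), (10, 10, 10), (255, 128, 0), (255, 255, 255)]
    for bin_key, n in histogram_3d(sample).most_common():
        print(bin_key, n)
```

With bins_per_channel set to 5 there are at most 125 distinct keys, matching the divisions in the cube image.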

With all this knowledge, we can now see that a three-dimensional histogram is simply a record of the color distribution within an image, and so can be used to find out the particular color or color groups that are most common in an image. One particular image of interest is Jellyfish.jpg, from the sample pictures included in modern versions of Windows.

Low resolution version of Jellyfish.jpg.

This image was processed to yield a 3D histogram as described above (but with 16³ = 4096 bins) and converted into a “stacked” contour plot. Each layer in this stacked plot corresponds to moving towards the maximum intensity of red by one bin (or 1/16 of the maximum color). Within each layer, the colored lines enclose areas within the histogram that all contain more than a certain number of pixels. As in a geographical contour map, the minimum limit for each contour line increases linearly.

Smoothed RGB Histogram of Jellyfish.jpg

It is easiest to imagine that all the slices are vertically stacked and that lines of similar colors are connected between layers. Doing so will give you a series of polygons enclosing colors that all occur at least a certain minimum number of times. For quick reference, the color of each bin can be seen here:

Color Lookup

It is easy to manually verify that the most common maximum colors are the dark blue of the background and the orange of the jellyfish.

Of course, this particular histogram has been smoothed. The original histogram, as shown below, has very thin peaks, suggesting that colors within a group tend towards a central value rather strongly. A three-dimensional Gaussian convolution matrix (with σ = 1.33 bins) was applied to get the rather nice-looking histogram above.
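A Gaussian blur in three dimensions is separable, so it can be applied as three one-dimensional passes, one along each color axis. The sketch below shows a single such pass in plain Python. It is a minimal illustration and not necessarily how the smoothing here was implemented: the kernel radius of three standard deviations and the zero padding at the edges are choices made for this example.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Return normalized 1D Gaussian weights centered on zero."""
    if radius is None:
        radius = max(1, math.ceil(3 * sigma))  # assumed cut-off at 3 sigma
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_1d(values, kernel):
    """Convolve values with kernel, treating out-of-range bins as zero."""
    radius = len(kernel) // 2
    smoothed = []
    for i in range(len(values)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(values):
                acc += weight * values[j]
        smoothed.append(acc)
    return smoothed


if __name__ == "__main__":
    kernel = gaussian_kernel(1.33)
    spike = [0] * 8 + [100] + [0] * 7   # one thin peak in a row of 16 bins
    print([round(v, 1) for v in smooth_1d(spike, kernel)])
```

Running the same pass along each of the three axes in turn spreads every thin peak into a small blob, which is what turns the raw histogram into the smoother contours.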

Raw RGB Histogram of Jellyfish.jpg

The whole process relies heavily on a few key assumptions. Perhaps the most important assumption is that the intensity of a particular color varies linearly with the value used to express it. As far as the RGB color space is concerned, this is not true. This leads to the process being non-representative in a number of important ways:

  • Smoothing is not representative of colors perceived.
  • Color grouping is likely to be too aggressive in colors where human eyes can differentiate shades the best and not sufficiently aggressive at other colors.

If you have managed to read this far, then congratulations! Here is the same data in other color spaces.

YCbCr

Jellyfish Histogram - Smoothed YCbCr


Jellyfish Histogram - Raw YCbCr

HSV

Jellyfish Histogram - Smoothed HSV


Jellyfish Histogram - Raw HSV

Do note that the Gaussian smoothing used does not “wrap around” the hue dimension in the smoothed HSV histogram.

Fun With Lasers

Last weekend, a friend and I spent some time photographing laser beams.

Laser beams are not directly visible in air, and so talcum powder was used to scatter light. The first such image (of two laser beams) was generated by taking a long exposure shot in a darkened room. Talcum powder was spread by a person standing in front of the screen, and the long exposure resulted in the image of the moving person being almost completely “averaged out”.

Initial Method of viewing a laser beam.

Admittedly, this particular photo has a number of salient flaws: the non-uniform background; the lack of contrast between the laser and the screen; the slight shake in the image from the pressing of the shutter release.

The set-up was then reworked to use a single laser device. Two mirrors were used to bounce a beam back and forth in front of the screen. The screen itself was moved towards the camera, and a ladder was placed behind the screen, allowing a person to spread talcum powder easily while not appearing in the image. The camera itself was tethered to a computer, and the shutter release was triggered over USB, eliminating camera shake.

Subsequent images showed substantial improvement:

Long exposure shot of laser beam.

The ray from the laser source is the bottom-most one, with the source placed on the right. The lowest ray is much better defined than the highest one, suggesting that either the laser source is not particularly good (the beam divergence is high) or the mirrors are imperfect (either with dust on the surface or an imperfect reflective surface). The gap between beams decreases upwards, once again due to the mirror being slightly warped.

The red “glow” around the laser beam is probably due to secondary scattering by talcum powder between the camera lens and the plane of the lasers. This effect was especially pronounced towards the end of this experimentation, when a large amount of powder was present in the air.

Compare the above image to one that has a (relatively) short exposure time:

Same setup, shorter exposure.

You can see the contribution a single particle makes towards the entire image by examining the left side of the image. The glow is also substantially less intense in this image.

Here is a photo of the set-up with the screen removed:

Behind the screen.

Notice the bottle of talcum powder?

Future expansion for this includes using a laser “slice” to track the flow of particles within a moving fluid.