ARM Guide to OpenCL Optimizing Canny Edge Detection: Implementation


This chapter describes an example implementation of Canny edge detection.

Buffer layouts and formats

This example implementation relies on convolutions to perform part of the edge detection process. A convolution centered on a pixel at the image border reads neighbors that lie outside the image, which causes unexpected behavior unless extra work is performed.

The problems which occur are:

  • Legal memory access problems
    These occur when a load attempts to read data outside the image. This happens when the convolution is centered on a pixel at the top or the bottom of the image.
  • Algorithm correctness
    Algorithm correctness problems occur when a convolution is applied at the left side or right side of the image and loads data that has wrapped around to the opposite side of a neighboring row of the source image. This is not valid data for the convolution, so the result is wrong.

Solving the legal memory access problem

To solve the legal memory access problem, consider a simple image memory layout: a linear buffer with W pixels in each row, stored left to right. Each new row is appended to the end of the previous row with no padding data. This means that the first pixel of a row is stored immediately after the last pixel of the previous row.

The following image shows a simple image layout.


Figure 3-1 Simple buffer layout
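
As a minimal sketch of the layout in Figure 3-1, the following plain C function maps pixel coordinates to a position in the linear buffer. The name pixel_index and its parameters are hypothetical, chosen for illustration only, and the code is host-side C rather than an OpenCL kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: linear index of pixel (x, y) in a tightly packed
 * buffer that stores w pixels per row, left to right, with no padding. */
size_t pixel_index(size_t x, size_t y, size_t w)
{
    assert(x < w); /* x must lie inside the row */
    return y * w + x;
}
```

Because there is no padding, pixel_index(w - 1, y, w) + 1 equals pixel_index(0, y + 1, w): stepping one element past the end of a row lands on the first pixel of the next row.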

Solving the legal memory access problem: y component

With this layout, the neighbor below a pixel in the image is stored at index(P) + W, where P is the coordinates of the pixel whose neighbor is being found. Similarly, the pixel above the current pixel is stored at index(P) - W. This works in most cases. However, a pixel in the top row has no valid neighbor above it, and a pixel in the bottom row has no valid neighbor below it. If this problem is not addressed, reading from such a location can cause a page fault.
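
The index arithmetic above can be illustrated with a short C sketch. The helper names (index_above, index_below, index_in_image) are assumptions made for this example, and signed indices are used so that an out-of-range result is visible instead of wrapping around.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers. p is the linear index of the current pixel,
 * w is the row width in pixels, h is the image height in rows. */
long index_above(long p, long w) { return p - w; }
long index_below(long p, long w) { return p + w; }

/* True if linear index i addresses a pixel inside a w x h image. */
bool index_in_image(long i, long w, long h)
{
    assert(w > 0 && h > 0);
    return i >= 0 && i < w * h;
}
```

For a 4 x 3 image, the pixel at index 1 is in the top row, so index_above(1, 4) is -3, which lies before the start of the buffer. The pixel at index 9 is in the bottom row, so index_below(9, 4) is 13, which lies past the last valid index, 11.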

The easiest solutions to this problem are:

  • Never attempt to perform this kind of calculation near the edges of an image. Instead, crop the result image or fill the result borders with dummy data.
    This solution is too simple: it avoids the problem rather than solving it, so it is not a good way to ensure legal accesses.
  • Use a condition in the kernel code to decide if the pixel is too close to an edge for the required calculation to work, and implement some special code for this case.
    This solution is expensive because it involves checks on every pixel to fix a problem that affects a small proportion of pixels.
  • Create a copy of the...
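
The option of testing every pixel with a condition can be sketched in plain C as follows. This is an illustration under stated assumptions, not the guide's implementation: the function name vertical_diff_checked is hypothetical, the calculation is a simple vertical difference rather than a full convolution, and writing 0 for the top and bottom rows is only a placeholder for the "special code".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: vertical difference src[p + w] - src[p - w],
 * guarded by a per-pixel check so that no load leaves the buffer. */
void vertical_diff_checked(const unsigned char *src, int *dst,
                           size_t w, size_t h)
{
    assert(src != NULL && dst != NULL);
    for (size_t y = 0; y < h; ++y) {
        for (size_t x = 0; x < w; ++x) {
            size_t p = y * w + x;
            if (y == 0 || y + 1 == h) {
                /* Assumption: placeholder result for rows with no
                 * valid neighbor above or below. */
                dst[p] = 0;
            } else {
                dst[p] = (int)src[p + w] - (int)src[p - w];
            }
        }
    }
}
```

The branch runs for every pixel even though only the top and bottom rows need it, which is the cost this option carries.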