The Grid’s Eyes

We apply algorithms and techniques from the field of computer vision. Combined with hand-picked datasets, they let us build extractors for the features we are interested in. With those extractors, our AI can answer questions like "what is a good color palette for this image?", "where is the best place to crop that picture?", or "can we place text on top of this image, and where?".

Since last year I have been working with Jon Nordby, Henri Bergius, and Gabriela Thumé on an image analytics pipeline for The Grid. Basically, we are teaching our AI to identify patterns and extract relevant information from raw image data.

Here I describe some of these feature extractors and how they impact the design of content you share on The Grid.

Saliency detection

When we look at a picture, we naturally determine which points get our attention first: a human face, a specific object, or a region with higher contrast. These are all examples of salient areas or regions. We recognize those areas easily, but computers can't. So we have to teach them.

The Grid uses a saliency detection algorithm that analyzes a given image to distinguish foreground from background regions. In summary, it divides the original image into small segments. Those segments are analyzed and assigned to foreground and background groups, and segments similar to a foreground group are merged into it. After many iterations, every pixel belongs to a group. If we mark each pixel with a value representing its group, we get what we call a saliency map.

It's easier to understand with an example. Given the original image, we have the following saliency map.

Higher gray levels indicate more salient areas. The Grid's AI now knows the region in white is the most relevant part of that image.
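
If you want to experiment with saliency maps yourself, here is a minimal sketch using the spectral residual detector that ships with opencv-contrib-python. This is just an off-the-shelf illustration, not the segmentation-based algorithm described above, and the file names are placeholders:

```python
# Minimal saliency map sketch using OpenCV's spectral residual detector.
# Illustration only: this is not the segmentation-based approach
# described above, and "photo.jpg" is a placeholder.
import cv2

image = cv2.imread("photo.jpg")

saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
success, saliency_map = saliency.computeSaliency(image)

# computeSaliency returns floats in [0, 1]; scale to gray levels so
# brighter pixels mark more salient regions.
gray_map = (saliency_map * 255).astype("uint8")
cv2.imwrite("saliency_map.png", gray_map)
```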

With this new information, The Grid's Design Systems can, for example, crop the original image to its salient regions so it fits any screen size, or place text on top of the image while avoiding those regions.
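
As a rough sketch of the cropping idea (again, not The Grid's actual logic), we can threshold the `gray_map` from the previous example and crop the image to the bounding box of its salient pixels:

```python
# Continues the previous sketch: `image` and `gray_map` come from it,
# and the threshold of 128 is an arbitrary choice.
import cv2

# Keep only strongly salient pixels.
_, mask = cv2.threshold(gray_map, 128, 255, cv2.THRESH_BINARY)

# Bounding box around every salient pixel.
x, y, w, h = cv2.boundingRect(cv2.findNonZero(mask))

# Crop the original image to the salient region.
cropped = image[y:y + h, x:x + w]
cv2.imwrite("cropped.png", cropped)
```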

Color palette extraction

Caliper, as we call The Grid's image analytics pipeline, reduces all of an image's colors to a small set of n colors. This set represents the most predominant colors of the image. The Grid's Colorverse can then use this palette as input to find the right contrasting color for content placed on top of the image, or to generate a similar color for the background around it.
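
To give a flavor of how palette extraction can work (Caliper's actual method may differ), here is a minimal color quantization sketch with Pillow; the palette size and file name are placeholders:

```python
# Minimal palette extraction via median-cut color quantization with
# Pillow. Illustration only; Caliper's actual method may differ.
from PIL import Image

N_COLORS = 5  # placeholder palette size

img = Image.open("photo.jpg").convert("RGB")

# Reduce the image to N_COLORS representative colors.
quantized = img.quantize(colors=N_COLORS)

# getpalette() returns a flat [r, g, b, r, g, b, ...] list; regroup it
# into (r, g, b) tuples.
flat = quantized.getpalette()[: N_COLORS * 3]
palette = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
print(palette)
```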

Face detection

We use machine learning to detect whether an image contains human faces and where they are. Combined with the information we already have about salient regions, The Grid's Design Systems can avoid cropping heads off or placing content on top of faces.
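
As an illustration (a common baseline, not necessarily the model we use in production), faces can be detected with the Haar cascade classifier bundled with OpenCV:

```python
# Minimal face detection sketch with OpenCV's bundled Haar cascade.
# A common baseline, not necessarily The Grid's production model.
import cv2

image = cv2.imread("photo.jpg")  # placeholder file name
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

# Each detection is an (x, y, width, height) bounding box.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    print(f"face at x={x}, y={y}, size={w}x{h}")
```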

Face detection is also important for detecting mood from facial expressions and for recognizing people.

Metadata extraction

Images captured by digital cameras or processed by photo editing software embed metadata such as EXIF tags in their files. Caliper extracts information like rotation angle and geolocation. This is important for pre-processing images, for example rotating them back to the expected orientation.
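
Here is a minimal sketch of how EXIF data can be read and the orientation fixed with Pillow (an illustration, not Caliper's actual code):

```python
# Minimal EXIF sketch with Pillow: dump the tags, then rotate the
# pixels to match the EXIF Orientation tag. Illustration only.
from PIL import Image, ImageOps
from PIL.ExifTags import TAGS

img = Image.open("photo.jpg")  # placeholder file name

# Map numeric EXIF tag IDs to readable names.
for tag_id, value in img.getexif().items():
    print(TAGS.get(tag_id, tag_id), value)

# Rotate/flip the pixels so they match the intended viewing direction.
upright = ImageOps.exif_transpose(img)
upright.save("photo_upright.jpg")
```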

Other high-level information, like color histograms, is also extracted and used by The Grid's Design Systems to calculate contrast and other metrics.
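
For example, a luminance histogram, the kind of signal a contrast metric can be derived from, is cheap to compute (a sketch with Pillow, not Caliper's code):

```python
# Quick luminance histogram sketch with Pillow. A wide spread between
# the darkest and brightest occupied levels suggests high contrast.
from PIL import Image

img = Image.open("photo.jpg").convert("L")  # grayscale (luminance)

histogram = img.histogram()  # 256 counts, one per gray level
darkest = min(i for i, count in enumerate(histogram) if count)
brightest = max(i for i, count in enumerate(histogram) if count)
print(f"luminance range: {darkest}..{brightest}")
```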

The infrastructure

The counterpart of Caliper is ImgFlo. While Caliper does image analysis, ImgFlo is responsible for processing every image on The Grid. In other words, ImgFlo applies filters to images and performs all kinds of transformations.

Caliper, ImgFlo, and the rest of The Grid's backend infrastructure are powered by a dataflow implementation also developed by us: NoFlo and Flowhub. Using a dataflow paradigm in Caliper makes it really simple to understand how data flows between image analysis operations. We can actually show what Caliper looks like, because it's not code but a graph.

It's a regular NoFlo graph: each node is a component implementing some operation (like Detect Faces or Extract Saliency), and components are connected by edges through which data flows. In this case we have image data flowing through the graph, with each node extracting one of the features discussed above.

The future

One important aspect of our pipeline is scaling. We are currently working on new feature extractors, such as text detection, which will make it possible to avoid placing text on top of text already present in an image. We are also improving our existing face detection solution and researching deep learning models for image classification and sentiment analysis.

Besides image analysis, we are always researching and experimenting with alternative ways to use the data Caliper extracts. For example, we can use information extracted from text, images, or videos to synthesize new content. Using text as a seed, we can generate an infinite number of unique images combining different shapes, curves, and colors.

We can also apply more complex algorithms, like Delaunay triangulation, to obtain meshes for favbanners or backgrounds.
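
For instance, here is a minimal sketch of building such a mesh with SciPy's Delaunay triangulation; the random seed stands in for a text-derived one, and the point count is arbitrary:

```python
# Minimal triangle mesh sketch with SciPy's Delaunay triangulation.
# The seed stands in for a text-derived one; the point count is arbitrary.
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(seed=42)
points = rng.random((30, 2))  # 30 random 2D points

tri = Delaunay(points)

# tri.simplices holds three point indices per triangle; each triangle
# could then be filled with a color from an extracted palette.
print(f"{len(tri.simplices)} triangles")
```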