How To Process Media Data with the Metal Framework [Tutorial]

May 17, 2018

Sergey A.

iOS Developer

Daryna P.

Copywriter

Tags:

Tutorials Tech Insights

Processing visual data like images and videos is an integral part of mobile development. We recently searched for a tool to do this in the simplest and most efficient manner. Now we want to share our experience and introduce our favorite ‒ Metal. What is the Apple Metal framework and how does it work? What benefits does it offer you? Let’s look under the hood of Metal.

What is Metal?

Metal is a framework that provides an API with which you can render data or perform data-parallel computations on the graphics processing unit (GPU). The GPU is a single-chip processor that draws graphics on the user’s device and thus reduces the load on the central processing unit (CPU).

Metal gives you hardware acceleration, so you can offload tasks that are too heavy for ordinary CPU-bound solutions, such as cryptography, machine learning, or even VR. Unlike third-party APIs, the Metal API is optimized for Apple devices running iOS and macOS.

Why do I need to use Metal?

Metal lets you write shaders that are compiled when you build your application, or at runtime if necessary. Shaders are written in a special language ‒ the Metal Shading Language, which is based on the C++14 specification (aka the ISO/IEC JTC1/SC22/WG21 N4431 language specification).

Metal also handles memory and resource management. The main types here are MTLBuffer and MTLTexture: they determine how data is represented in GPU memory. You can also use ready-made kernels from the Metal Performance Shaders framework to add effects to images, such as a blur filter or a threshold filter.

Because Metal exposes low-level GPU capabilities directly, your calculations don’t have to pass through high-level abstractions, which speeds up program execution.

Here are a few more benefits of Metal for iOS devices:

Low-overhead access to the GPU, which reduces the bottlenecks that data transfer between the CPU and GPU usually causes in other frameworks
Up to 10 times the number of draw calls compared to OpenGL
Compute performance on a level similar to technologies such as CUDA and OpenCL
Built-in memory and resource management

Metal’s main competitors are graphics APIs from other vendors that solve similar tasks, such as Vulkan, OpenGL, and DirectX. A significant disadvantage of Metal is that it’s confined to Apple platforms. You can work around this with additional libraries such as MoltenVK, which implements the cross-platform Vulkan API on top of Metal, so code written against Vulkan can run on Apple devices.

Basic concepts of image processing

Let’s look through the main classes we’ll be using in our tutorial:

MTLDevice is a protocol that defines the interface to a GPU. You use it to query the GPU features of a specific device and to allocate Metal objects for your app.
MTLLibrary is an object that contains compiled Metal shaders, built either together with the application or at runtime from a source string.
MTLCommandQueue is a protocol that defines an object storing the ordered list of command buffers to execute.
MTKTextureLoader is a helper class that loads textures into your application. It supports the following file types: PNG, JPEG, TIFF, KTX, and PVR (including advanced options).
MTKView is a class that simplifies drawing of all data in your application.
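As an illustration of the texture loader from the list above, here is a minimal sketch. The function name loadTexture(from:device:) and the chosen options are assumptions made for this example, not part of the original project.

```swift
import Foundation
import MetalKit

/// Hypothetical helper: loads an image file (PNG, JPEG, ...) into a GPU texture.
func loadTexture(from url: URL, device: MTLDevice) throws -> MTLTexture {
    let loader = MTKTextureLoader(device: device)
    let options: [MTKTextureLoader.Option: Any] = [
        // Load the pixels as plain (non-sRGB) values.
        .SRGB: false,
        // The texture only needs to be read by shaders.
        .textureUsage: NSNumber(value: MTLTextureUsage.shaderRead.rawValue)
    ]
    return try loader.newTexture(URL: url, options: options)
}
```

The call throws if the file cannot be read or decoded, so the caller decides how to report a missing image.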

In order to store the entire configuration in one place, we’ll create a context class:
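A context class of this kind might look like the sketch below. The class name MetalContext and its property names are assumptions. The initializer fails if the device has no Metal support or if the app bundle contains no compiled shader library.

```swift
import Metal
import MetalKit

/// Hypothetical context that keeps the shared Metal objects in one place.
final class MetalContext {
    let device: MTLDevice
    let commandQueue: MTLCommandQueue
    let library: MTLLibrary
    let textureLoader: MTKTextureLoader

    init?() {
        // Each of these calls returns nil when the object cannot be created.
        guard let device = MTLCreateSystemDefaultDevice(),
              let commandQueue = device.makeCommandQueue(),
              let library = device.makeDefaultLibrary() else {
            return nil
        }
        self.device = device
        self.commandQueue = commandQueue
        self.library = library
        self.textureLoader = MTKTextureLoader(device: device)
    }
}
```

Creating the command queue and library once and sharing them avoids repeating the setup for every processed image.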

Now we need to create a file with a .metal extension that contains a simple shader. Note the syntax of the code and the name of the function: this is Metal Shading Language code, not Swift:
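A pass-through kernel of this kind could look like the sketch below. So that it can be tried without a separate file, the Metal source is wrapped in a Swift string and compiled at runtime with makeLibrary(source:options:); the same text can be saved unchanged in a .metal file. The texture indices 0 and 1 and the bounds check are assumptions of this sketch.

```swift
import Metal

// Metal Shading Language source for a kernel that copies every pixel unchanged.
let passThroughSource = """
#include <metal_stdlib>
using namespace metal;

kernel void compute_shader(texture2d<float, access::read>  input  [[texture(0)]],
                           texture2d<float, access::write> output [[texture(1)]],
                           uint2 gid [[thread_position_in_grid]])
{
    // Threads outside the texture do nothing (the grid may be padded).
    if (gid.x >= output.get_width() || gid.y >= output.get_height()) {
        return;
    }
    output.write(input.read(gid), gid);
}
"""

// Compile the source at runtime and list the functions the library contains.
if let device = MTLCreateSystemDefaultDevice(),
   let library = try? device.makeLibrary(source: passThroughSource, options: nil) {
    print(library.functionNames)
}
```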

Next, we’ll load the shader and try to process the texture:
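A minimal sketch of this step follows. The helper name process and its parameters are hypothetical. It assumes the kernel reads from texture index 0 and writes to texture index 1, and that the input uses a non-sRGB pixel format such as rgba8Unorm, because writing to sRGB textures from a compute kernel is not supported on every GPU.

```swift
import Metal

/// Hypothetical helper: runs the "compute_shader" kernel over `input`
/// and returns a new texture that holds the result.
func process(_ input: MTLTexture,
             device: MTLDevice,
             library: MTLLibrary,
             queue: MTLCommandQueue) throws -> MTLTexture? {
    // The name must match the kernel function declared in the shader source.
    guard let function = library.makeFunction(name: "compute_shader") else { return nil }
    let pipeline = try device.makeComputePipelineState(function: function)

    // Output texture with the same size and format as the input.
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: input.pixelFormat,
        width: input.width,
        height: input.height,
        mipmapped: false)
    descriptor.usage = [.shaderRead, .shaderWrite]

    guard let output = device.makeTexture(descriptor: descriptor),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeComputeCommandEncoder() else { return nil }

    encoder.setComputePipelineState(pipeline)
    encoder.setTexture(input, index: 0)
    encoder.setTexture(output, index: 1)

    // Size the threadgroups from the pipeline limits and round the grid up
    // so that every pixel is covered by at least one thread.
    let groupWidth = pipeline.threadExecutionWidth
    let groupHeight = pipeline.maxTotalThreadsPerThreadgroup / groupWidth
    let threadsPerGroup = MTLSize(width: groupWidth, height: groupHeight, depth: 1)
    let groups = MTLSize(width: (input.width + groupWidth - 1) / groupWidth,
                         height: (input.height + groupHeight - 1) / groupHeight,
                         depth: 1)
    encoder.dispatchThreadgroups(groups, threadsPerThreadgroup: threadsPerGroup)
    encoder.endEncoding()

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
    return output
}
```

Because the grid is rounded up, some threads fall outside the image, so the kernel is expected to ignore positions beyond the texture bounds.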

Pay special attention to the name "compute_shader": the value of this parameter must match the name of the function you’ve implemented in the shader file. Let’s move to the settings for image rendering.

The makeBuffer(bytes:length:options:) method creates an MTLBuffer object by copying data from an existing storage allocation into a new allocation:
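The copy behavior can be sketched as follows. The helper name makeParameterBuffer and the sample values are assumptions for illustration only.

```swift
import Metal

/// Hypothetical helper: copies `values` into a new buffer that both the CPU and the GPU can access.
/// Returns nil for an empty array, because a Metal buffer cannot have zero length.
func makeParameterBuffer(_ values: [Float], device: MTLDevice) -> MTLBuffer? {
    guard !values.isEmpty else { return nil }
    let length = MemoryLayout<Float>.stride * values.count
    return device.makeBuffer(bytes: values, length: length, options: .storageModeShared)
}

if let device = MTLCreateSystemDefaultDevice() {
    var values: [Float] = [0.25, 0.5]
    if let buffer = makeParameterBuffer(values, device: device) {
        // Changing the array afterwards does not touch the buffer's own copy.
        values[0] = 9.0
        let stored = buffer.contents().bindMemory(to: Float.self, capacity: 2)
        print(stored[0], stored[1])
    }
}
```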

We need to consider four more terms:

MTLCommandBuffer ‒ An object that stores commands for execution on the GPU.
MTLRenderPipelineState ‒ An object that stores the compiled graphics functions and rendering state.
MTLBuffer ‒ A memory allocation that stores unformatted data accessible to the GPU.
MTLComputeCommandEncoder ‒ An encoder that sets the data-parallel compute state and encodes the commands that execute compute functions.

A shader processes each pixel on the screen. In fact, a shader is just a program executed in the graphics pipeline. The above shader is extremely simple: it just copies each pixel of the original texture into a new texture without changes.

Let’s slightly complicate the example and try to pass parameters to the shader code.

Here we create a simple array from one parameter.
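One way to create such an array and hand it to the shader is sketched below. The value 1.0, the helper name bindParameters, and buffer index 0 are assumptions for this example.

```swift
import Metal

// One-element parameter array; 1.0 is an illustrative value for the red channel.
let data: [Float] = [1.0]

/// Hypothetical helper: copies the parameters into a buffer and binds it
/// to buffer index 0 of the compute encoder.
func bindParameters(_ data: [Float],
                    to encoder: MTLComputeCommandEncoder,
                    device: MTLDevice) -> Bool {
    guard !data.isEmpty,
          let buffer = device.makeBuffer(bytes: data,
                                         length: MemoryLayout<Float>.stride * data.count,
                                         options: .storageModeShared) else {
        return false
    }
    encoder.setBuffer(buffer, offset: 0, index: 0)
    return true
}
```

The index passed to setBuffer has to match the buffer attribute that the kernel declares for its parameter.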

Now we need to change the shader code a bit:
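A sketch of the changed kernel is shown below, again wrapped in a Swift string so it can be compiled at runtime. It assumes the parameter array is bound at buffer index 0 and that its first element replaces the red channel; the constant name tintSource is hypothetical.

```swift
import Metal

// Metal Shading Language source: the kernel now receives a `data` parameter.
let tintSource = """
#include <metal_stdlib>
using namespace metal;

kernel void compute_shader(texture2d<float, access::read>  input  [[texture(0)]],
                           texture2d<float, access::write> output [[texture(1)]],
                           constant float *data [[buffer(0)]],
                           uint2 gid [[thread_position_in_grid]])
{
    // Threads outside the texture do nothing (the grid may be padded).
    if (gid.x >= output.get_width() || gid.y >= output.get_height()) {
        return;
    }
    float4 color = input.read(gid);
    // Replace the red channel with the value supplied by the host.
    output.write(float4(data[0], color.g, color.b, color.a), gid);
}
"""

// Compile the source at runtime to check that it is valid.
if let device = MTLCreateSystemDefaultDevice(),
   let library = try? device.makeLibrary(source: tintSource, options: nil) {
    print(library.functionNames)
}
```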

Note the new data parameter in the function header. Using this parameter, we can get the value we’ve set above and process this value inside the shader. We can pass it to the output.write function. This way the original image will get a reddish shade.

Unary image kernels

A unary image kernel works on a single input texture to produce a single output texture. There are several categories of unary image operations available in the Metal Performance Shaders (MPS) framework:

Convolutional operations (Box, Tent, GaussianBlur, Sobel, Convolution)
Threshold (ThresholdBinary, ThresholdBinaryInverse, ThresholdToZero, ThresholdToZeroInverse, ThresholdTruncate)
Resampling (LanczosScale)
Morphological operations (Erode, Dilate)
Sliding-neighborhood operations (AreaMax, AreaMin, Median, Integral, IntegralOfSquares, Threshold)

Instantiating kernels

To instantiate MPS kernels, you need a Metal device and some number of other parameters determining how the kernel should behave. For example, the MPSImageGaussianBlur kernel takes a sigma parameter that determines its blur radius:
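A minimal sketch of creating such a kernel follows. The sigma value of 45 is only an illustrative choice.

```swift
import Metal
import MetalPerformanceShaders

// MPS is not available on every GPU, so check for support first.
if let device = MTLCreateSystemDefaultDevice(), MPSSupportsMTLDevice(device) {
    // sigma is the standard deviation of the Gaussian, in pixels.
    let blur = MPSImageGaussianBlur(device: device, sigma: 45.0)
    print(blur.sigma)
}
```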

Many kernel properties, such as sigma, are read-only after initialization. Therefore, if you want to use different parameters, you have to create multiple kernel instances. Even though kernels are fairly lightweight objects, you will still notice a performance hit if you create new instances every frame, so reuse kernels whenever possible.
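One possible way to reuse kernels is a small cache keyed by the parameter value. The class BlurKernelCache below is a hypothetical sketch, not part of MPS.

```swift
import Metal
import MetalPerformanceShaders

/// Hypothetical cache that creates one Gaussian blur kernel per sigma value
/// and returns the same instance on later requests.
final class BlurKernelCache {
    private let device: MTLDevice
    private var kernels: [Float: MPSImageGaussianBlur] = [:]

    init(device: MTLDevice) {
        self.device = device
    }

    func kernel(sigma: Float) -> MPSImageGaussianBlur {
        if let cached = kernels[sigma] {
            return cached
        }
        let kernel = MPSImageGaussianBlur(device: device, sigma: sigma)
        kernels[sigma] = kernel
        return kernel
    }
}
```

A render loop can then ask the cache for a kernel every frame without allocating a new object each time.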

Encoding kernels to a command buffer

To make a kernel work correctly, you must encode it to a command buffer. Kernels operate on Metal textures, read from their source, and write to their destination. You can simply encode a kernel to a command buffer to get the desired result:
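Encoding can be sketched as follows. The function name and parameter labels are assumptions. The destination texture is assumed to be a separate texture created with shader-write usage.

```swift
import Metal
import MetalPerformanceShaders

/// Hypothetical helper: encodes a unary MPS kernel and waits for the GPU to finish.
/// Returns true when the command buffer completed without an error.
func encode(_ kernel: MPSUnaryImageKernel,
            from source: MTLTexture,
            to destination: MTLTexture,
            on queue: MTLCommandQueue) -> Bool {
    guard let commandBuffer = queue.makeCommandBuffer() else { return false }

    // The kernel reads from `source` and writes the result into `destination`.
    kernel.encode(commandBuffer: commandBuffer,
                  sourceTexture: source,
                  destinationTexture: destination)

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
    return commandBuffer.status == .completed
}
```

Waiting for completion keeps the sketch simple. An app that processes video frames would more likely use a completion handler so the CPU is not blocked.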

We recently created MediaWatermark, an open source GPU/CPU-based iOS watermark library. Now we want to update this library to support the Apple Metal API. For more details about MediaWatermark, check out our article MediaWatermark: GPU/CPU-based iOS Watermark Library.

MediaWatermark library update

First, we gave the MediaItem class a new instance method:

An object of the MediaFilter type is a parameter of the above method. We also added five new filters:

Color filter
Sepia filter
Blur filter
Threshold filter
Sobel filter

Original picture

Color filter (red parameter with value of 1)

Sepia filter

Gaussian blur (sigma = 45)

Sobel filter

Threshold filter

Here’s an example of how the new filters are used:

Encode differently

If your app depends on the latest and greatest in media data processing on iOS devices, Metal is the right choice for you. Metal frees the CPU to deal with things requiring complicated computations like physics engines, audio processing, and artificial intelligence, helping you build faster and more efficient apps.

Check out our MediaWatermark repository to see all the capabilities of Metal.
