Drawing Big Graphs with Swift and UIKit

John Scalo · Made by Windmill · Dec 21, 2020

We just released Tempi, our “reverse metronome” that detects music BPM in real time, for the Mac! In the process, I revisited some of the pitfalls involved in graphing audio sample data and thought it might make for a good tutorial. And while I’m using audio as the subject of this tutorial, these techniques apply just as well to any time-based data that might extend way past the bounds of the screen.

If you want to follow along with the code, download the source before starting.

Let’s build this app!

Goals

  • Plot audio waveform data from a file on the screen
  • Show a ruler with seconds and tenths of seconds
  • Let the user scroll from the beginning of the song to the end
  • Let the user zoom in and out of the data
  • On Mac, support resizing the window
  • Make it smooth and fast

App Architecture

UIView vs CALayer

First, a cardinal rule: avoid UIView’s draw(_:) (drawRect: in Obj-C) like the plague. Historically, draw(_:) was the place to do Quartz-based (aka Core Graphics) 2D drawing. Pretty much anything you do there is going to use the CPU and system RAM, and its mere existence adds an entire render path to the view update process. Think of draw(_:) as “painting the screen”: each time through you’re re-creating the entire scene from scratch, all using vital CPU resources.

Instead, we want all of our drawable objects to live on the GPU. Not only does this free up the CPU and system RAM for other tasks, but the GPU’s entire raison d’être is to handle exactly these sorts of tasks so it will be lightning fast. To do that, I’m going to use CALayer exclusively.

CALayer is an abstraction above what you might know as textures from OpenGL & Metal. Once a CALayer has been added to the GPU (by adding one as a sublayer to a layer that’s already on the GPU), it remains there until removed. The CALayer docs show all its properties, and importantly: modifying these properties is always handled entirely on the GPU.

There is a gotcha with CALayer to be aware of. The “CA” stands for “Core Animation” and CALayers are geared towards animating their properties. So much so in fact that not only are almost all CALayer properties animatable, almost all of them will animate when changed without explicitly telling them to. This is known as implicit animation, and CALayers have an implicit animation duration of 0.25 seconds. If you don’t change or disable this value, your views might appear to animate at 4 fps instead of 60 fps. I’ll cover how to handle that later.

UIScrollView

A typical usage of UIScrollView is to stick a content view inside of it, resize the content view to match its content, and let UIScrollView handle the rest. And that’s 100% fine for a typical iOS view. But let’s say we’ve loaded a 44.1kHz song file that’s 4 minutes long and zoomed in at 10x. That’s:

4 * 60 * 44100 / 10 = 1,058,400 pixels

Obviously trying to manage a view that’s a million pixels wide is going to be slow as molasses in January.

Humongo scroll view — don’t do this!

“But wait!” you say. “We can clip the content to only what’s visible so that we end up drawing only what’s needed!” Yes, that’s a reasonable approach, but the downside is that while scrolling or resizing you’ll need to make a clipping path on every pass, and even just calculating the clipping path can steal valuable CPU cycles. You’ll also find that if the user scrolls fast enough there can be missing content so you end up having to clip well outside of the bounds anyway.

Big scroll, small clip — we can do better!

And here I think is the key insight into making this drawing architecture insanely fast: we’re not going to put anything inside the scroll view! Instead, the UIScrollView is just a tool that we use to tell us where the user has panned to on the x axis of our graph. In fact, if it weren’t for UIScrollView’s nifty bouncing behavior, we could probably just implement a pan gesture recognizer and call it a day.

Small scroll, no clip. Simple = fast.

OK, so in terms of the view hierarchy, here’s a rough outline of what we have:

AudioGraphView: UIView
⎼⎼ UIScrollView
⎼⎼ AudioGraphView.layer
⎼⎼⎼⎼ WaveformLayer: CALayer
⎼⎼⎼⎼ RulerLayer: CALayer

AudioGraphView is the top-level publicly vended view. Inside is a “dumb” scroll view that’s pinned to its bounds, with nothing inside of it. Inside the AudioGraphView’s layer (each UIView has its own CALayer) lies a CALayer subclass for anything we want to draw. In this case, a waveform plot and a ruler.
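
Here’s a rough sketch of that wiring; the drawing layers are stubbed out, and the real initializer in the source does more:

```swift
import UIKit

// Stand-ins for the project's drawing layers, just so the sketch is complete.
class WaveformLayer: CALayer {}
class RulerLayer: CALayer {}

class AudioGraphView: UIView {
    // A "dumb" scroll view pinned to our bounds with nothing inside it.
    // It exists only to report panning (and give us the bounce behavior).
    let scrollView = UIScrollView()
    let waveformLayer = WaveformLayer()
    let rulerLayer = RulerLayer()

    override init(frame: CGRect) {
        super.init(frame: frame)

        scrollView.frame = bounds
        scrollView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        scrollView.showsHorizontalScrollIndicator = false
        addSubview(scrollView)

        // The drawing layers go straight into our own layer, not into the scroll view.
        layer.addSublayer(waveformLayer)
        layer.addSublayer(rulerLayer)
    }

    required init?(coder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }
}
```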

Helper Classes

There’s one small problem. The AudioGraphView and each of the drawing layers potentially needs to know about both the data it’s drawing and the state of the view it’s drawing into. We could encapsulate all of this in the top-level AudioGraphView and let its layers consult back to it, but that creates unseemly circular references and leads to a lot of bloat in AudioGraphView.

Instead I’m using two helper classes to centralize this information: DataProvider to access and manipulate the audio sample data, and ViewPort to manage the basic geometry, translation, and scale.

User Interaction

Let’s talk about the user interaction we need to support. There are three kinds to handle: panning, zooming, and, because this app can run on the Mac via Catalyst, window resizing.

Panning is easy. We already have a scroll view in place to track the user’s pan gesture. We make the viewPort the scroll view’s delegate, and the viewPort notifies the AudioGraphView whenever the scroll position changes.
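
In sketch form, that delegate wiring looks something like the following. The real ViewPort carries more state (zoom, content width, and so on) and talks to the AudioGraphView directly; a closure keeps this sketch self-contained.

```swift
import UIKit

// Sketch: the ViewPort is the scroll view's delegate and forwards pan changes.
class ViewPort: NSObject, UIScrollViewDelegate {
    // How far the user has panned along the x axis, in points.
    private(set) var xTranslation: CGFloat = 0

    // The real source notifies the AudioGraphView directly; a closure keeps
    // this sketch self-contained.
    var onChange: (() -> Void)?

    func scrollViewDidScroll(_ scrollView: UIScrollView) {
        xTranslation = scrollView.contentOffset.x
        onChange?()
    }
}
```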

For zooming, we install a UIPinchGestureRecognizer whose handler looks like:
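
Roughly, the handler does the following. The pauseUpdates flag, the viewPort.zoom property, and the exact recenterForScale signature are approximations of what’s in the source:

```swift
import UIKit

extension AudioGraphView {
    @objc func handlePinch(_ gestureRecognizer: UIPinchGestureRecognizer) {
        // Recentering changes the scroll view's contentOffset, which triggers
        // update() through the scroll delegate. Changing the zoom below will
        // trigger update() again, so suppress this first one.
        pauseUpdates = true
        scrollView.recenterForScale(gestureRecognizer.scale)
        pauseUpdates = false

        // Adjust the zoom by the incremental scale amount, then reset the
        // recognizer's scale so its next callback is incremental again.
        viewPort.zoom *= gestureRecognizer.scale
        gestureRecognizer.scale = 1.0
    }
}
```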

Unpacking that, recenterForScale is a UIScrollView extension that keeps the scroll view centered while resizing or zooming. (Take a look at its implementation for more info about why it’s needed and how it works.) One problem here is that recenterForScale can modify the scroll view’s offset, which triggers a call to update(), but changing the zoom also triggers a call to update(). So, to avoid halving our frame rate, we suppress the first call to update() via the pauseUpdates variable.

Then we adjust the viewPort’s zoom by the incremental scale amount sent by the gesture recognizer and reset the gesture recognizer’s scale to 1.0 so that it continues to be incremental, a typical method of handling zoom with UIPinchGestureRecognizer.

Finally, to handle window resizing, we can simply track the size of the AudioGraphView’s bounds in its layoutSubviews():
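
A sketch of that, where lastSize is an assumed stored property used to detect a real size change and recenterForScale is the project’s UIScrollView extension mentioned earlier (its exact signature is a guess here):

```swift
// In AudioGraphView. lastSize is an assumed stored property; recenterForScale
// is the project's UIScrollView extension.
override func layoutSubviews() {
    super.layoutSubviews()

    // Only do the work when the size actually changed (e.g. a window resize on Catalyst).
    guard bounds.size != lastSize else { return }
    lastSize = bounds.size

    // Keep the scroll view visually centered through the resize, then tell the
    // scroll view about its new content size and redraw.
    scrollView.recenterForScale(1.0)
    updateContentSize()
    update()
}
```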

Here again we’re using the handy UIScrollView extension that keeps the scroll view centered, in this case while resizing. Since the bounds potentially changed, we also need to tell the scroll view about its new contentSize:
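
That can be as simple as the following; viewPort.contentWidth stands in for however the real source derives the total scrollable width from the summarized sample count and the current zoom:

```swift
import UIKit

extension AudioGraphView {
    func updateContentSize() {
        // The scrollable width is the full graph width at the current zoom;
        // viewPort.contentWidth is a stand-in for however the real source
        // computes it from the summarized sample count.
        scrollView.contentSize = CGSize(width: viewPort.contentWidth,
                                        height: bounds.height)
    }
}
```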

Drawing Stuff

OK, let’s get to drawing! Our central entry point for drawing is AudioGraphView.update(). Here we’ll ensure that our data is up to date and call into each layer’s update() function so it can actually draw. As long as we call it from layoutSubviews() or when the ViewPort reports that the user has pinched or scrolled, it should keep the view up to date.

But… remember when I said that CALayer has an implicit animation duration of 0.25 seconds? That’s going to make the view stutter badly when scrolling or zooming, so we wrap those updates in CATransaction calls to disable it:
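
Here’s a sketch of update() with implicit animations disabled. The dataProvider.updateIfNeeded() call stands in for the re-summarize step, and the member names are approximations of the real source:

```swift
import UIKit

extension AudioGraphView {
    func update() {
        guard !pauseUpdates else { return }

        // Make sure the summarized data matches the current scale.
        dataProvider.updateIfNeeded()

        // Disable implicit animations so layer changes apply immediately
        // instead of animating over the default 0.25 seconds.
        CATransaction.begin()
        CATransaction.setDisableActions(true)
        waveformLayer.update()
        rulerLayer.update()
        CATransaction.commit()
    }
}
```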

WaveformLayer and RulerLayer both use a similar strategy to draw:

  • Create the maximum number of layers needed and store them on the GPU (ie add them as sublayers).
  • Never remove layers from the GPU. Instead, begin each render pass by hiding all layers and then unhiding any layers that are actually used. Hiding/unhiding layers is really cheap!
  • Consult the viewPort to figure out what the pan translation (ie scroll amount) is and draw accordingly within its bounds.

Because there can be so many audio samples to plot, I’ve chosen to plot on pixels instead of points to make it as detailed as possible. (As I’m sure you know, each Apple device screen has a scale that determines how many pixels are actually used to draw one point.)

As a quick aside, when dealing with audio sample data we need a way to downsample all of the audio samples (44,100 per second in this case) into only what we plan to actually plot on the screen. In the audio DSP world, this process is known as “summarizing” and the DataProvider class is responsible for re-summarizing the data any time the scale changes.
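
To make “summarizing” concrete, here’s a rough illustration of the idea (not the DataProvider’s actual implementation) that reduces raw samples to one magnitude per plotted pixel using Accelerate’s vDSP_maxmgv:

```swift
import Accelerate

/// Reduce raw audio samples to one value per plotted pixel by taking the
/// maximum magnitude in each bucket. An illustration of the idea, not the
/// DataProvider's actual implementation.
func summarize(samples: [Float], targetCount: Int) -> [Float] {
    guard targetCount > 0, !samples.isEmpty else { return [] }

    let samplesPerBucket = max(1, samples.count / targetCount)
    var summary = [Float]()
    summary.reserveCapacity(targetCount)

    samples.withUnsafeBufferPointer { buffer in
        guard let base = buffer.baseAddress else { return }
        var start = 0
        while start < samples.count && summary.count < targetCount {
            let length = min(samplesPerBucket, samples.count - start)
            var maxMagnitude: Float = 0
            // vDSP_maxmgv finds the maximum magnitude (absolute value) in the bucket.
            vDSP_maxmgv(base + start, 1, &maxMagnitude, vDSP_Length(length))
            summary.append(maxMagnitude)
            start += samplesPerBucket
        }
    }
    return summary
}
```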

Let’s first look at WaveformLayer’s main update() function:
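
Condensed to its essentials, it’s just the three steps explained below (this is a sketch, not the exact source):

```swift
import UIKit

extension WaveformLayer {
    func update() {
        // Make sure the sublayer pool is big enough (layers are only ever added).
        prepare()

        // Draw the horizontal midline, then the waveform itself.
        updateMidline()
        updatePlot()
    }
}
```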

prepare() checks to make sure there are enough layer objects to draw with, and creates new ones if needed. Again, to keep things fast, layers are never removed from the GPU, only added.

updateMidline() simply draws a line through the middle of the screen (as a CALayer, of course!).

updatePlot() looks like this:
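
Roughly like this, with the member names (viewPort, dataProvider, linePool) being approximations of what’s in the source:

```swift
import UIKit

extension WaveformLayer {
    func updatePlot() {
        // Start every pass with all pooled layers hidden; updateLines() will
        // unhide only the ones it actually uses.
        linePool.forEach { $0.isHidden = true }

        // Convert the scroll translation into the first sample to plot, and the
        // visible pixel width into how many samples we can plot.
        let pixelsVisible = Int(bounds.width * contentsScale)
        let firstSample = max(0, Int(viewPort.xTranslation * contentsScale))
        let lastSample = min(dataProvider.summarizedSampleCount - 1,
                             firstSample + pixelsVisible - 1)
        guard firstSample <= lastSample else { return }

        updateLines(sampleRange: firstSample...lastSample)
    }
}
```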

Here we’re just figuring out which sample to plot first (based on the x translation, ie scroll amount) and which sample to plot last (based on the number of pixels that we can plot on). There’s some bounds checking to keep things from crashing, but otherwise it’s very simple. And all of our layers are hidden at the start of each pass. The resulting array of samples is then sent to updateLines():
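
A sketch of updateLines() follows. The dataProvider API shown here (maxMagnitude, duration, sample(at:)), the pooled CAShapeLayers, and passing a range of indices rather than the samples themselves are simplifications of the real source:

```swift
import UIKit

extension WaveformLayer {
    func updateLines(sampleRange: ClosedRange<Int>) {
        // Scale amplitudes so the loudest sample reaches half the height,
        // since each sample draws both up and down from the midline.
        let maxMagnitude = dataProvider.maxMagnitude
        guard maxMagnitude > 0 else { return }
        let yScalingFactor = (bounds.height / 2) / CGFloat(maxMagnitude)

        let midY = bounds.height / 2
        var x: CGFloat = 0
        var index = sampleRange.lowerBound
        var layerIndex = 0

        // Walk pixel by pixel until we run out of samples, layers, or screen.
        while index <= sampleRange.upperBound,
              layerIndex < linePool.count,
              x < bounds.width * contentsScale {
            // Work out the timestamp this pixel represents and fetch the sample
            // there; indexing the summarized array directly can drift slightly
            // because of rounding, as noted in the text.
            let timestamp = dataProvider.duration * Double(index) / Double(dataProvider.summarizedSampleCount)
            let magnitude = CGFloat(dataProvider.sample(at: timestamp))
            let lineHeight = magnitude * yScalingFactor

            // Reuse a pooled layer: unhide it and shape it into a vertical line.
            let lineLayer = linePool[layerIndex]
            lineLayer.isHidden = false
            let path = UIBezierPath()
            path.move(to: CGPoint(x: x / contentsScale, y: midY - lineHeight))
            path.addLine(to: CGPoint(x: x / contentsScale, y: midY + lineHeight))
            lineLayer.path = path.cgPath

            x += 1
            index += 1
            layerIndex += 1
        }
    }
}
```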

To set things up, we need to know the largest magnitude in the data set so we know how far up or down we can draw. Luckily, the dataProvider stored this when creating the data. The yScalingFactor is then calculated by dividing half the bounds height by that magnitude, since each sample can draw up or down from the midline.

The drawing proceeds in a while loop until there are no more samples left to plot or we’ve gone past the screen bounds. There’s a bit of math to figure out what timestamp each sample lies at, and the dataProvider’s sample(at:) function is called to get the actual sample. (A simple sample array lookup using the index would show subtle errors, since the summarized audio data isn’t exactly as long as the view’s content width due to rounding.) Finally, a previously created layer is fetched, set to visible, and shaped into a line to show the current amplitude.

I won’t go over the RulerLayer implementation here since it’s very similar to WaveformLayer, but simpler.

Performance

So, how fast is this actually? I added a simple FPS logging facility to AudioWaveformGraph, so let’s run it on a real device and find out.
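
If you want to roll your own, something like this render-pass timer produces the same kind of number. It’s an illustration, not the project’s actual FPS facility:

```swift
import QuartzCore

/// A tiny render-pass timer: measures how long each update() takes and logs
/// the average potential frame rate over a window.
final class FPSLogger {
    private var totalRenderTime: CFTimeInterval = 0
    private var passCount = 0
    private var windowStart = CACurrentMediaTime()
    private let windowDuration: CFTimeInterval = 2.0

    /// Wrap a single render pass, e.g. `fpsLogger.measure { update() }`.
    func measure(_ renderPass: () -> Void) {
        let start = CACurrentMediaTime()
        renderPass()
        totalRenderTime += CACurrentMediaTime() - start
        passCount += 1

        if CACurrentMediaTime() - windowStart >= windowDuration {
            // Potential fps: how many passes of this cost would fit in one second.
            let averagePassTime = totalRenderTime / Double(passCount)
            print(String(format: "%.0f fps", 1.0 / averagePassTime))
            totalRenderTime = 0
            passCount = 0
            windowStart = CACurrentMediaTime()
        }
    }
}
```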

Scrolling: 290 fps — iPad Pro 9.7" (2016)
Resizing: 496 fps — MacBook Pro 13" M1 (2020, Catalyst)
Zooming: 110 fps — iPad Pro 9.7" (2016)
(All fps figures are averaged over a 2-second window.)

Everything is fast and easily beats the 60 fps benchmark for “smooth”. (And my iPad is 4 years old so I bet it’s a lot faster on a newer one!) But… the zooming performance is noticeably worse than the others. Why?

Luckily it has nothing to do with our drawing architecture. It’s because when the user zooms, the graph’s scale changes and so the audio sample data needs to be re-summarized. But even though I’m using Accelerate.framework functions wherever possible, this is still slow enough to bottleneck the rendering cycle. If I needed zooming to be buttery smooth on the oldest and slowest iOS devices, I’d re-summarize the data in a separate thread.
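
That change would look roughly like this: push the summarizing work onto a background queue and apply the result on the main thread. The summarize(forScale:) and summarizedSamples names are assumptions about the DataProvider API:

```swift
import UIKit

extension DataProvider {
    // Sketch: re-summarize off the main thread so zooming never waits on it.
    func resummarizeInBackground(forScale scale: CGFloat, completion: @escaping () -> Void) {
        DispatchQueue.global(qos: .userInteractive).async {
            let summary = self.summarize(forScale: scale)
            DispatchQueue.main.async {
                self.summarizedSamples = summary
                completion() // e.g. have the AudioGraphView call update()
            }
        }
    }
}
```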

// TODO:

For completeness, here are a few enhancements I’d want to make before putting this in a shipping product:

  • When zoomed way into the audio data, draw amplitude points instead of lines
  • Show a progress indicator when loading the audio data
  • Re-summarize audio data in a separate thread
  • Make the ruler more useful by adding textual timestamps

I hope this tutorial was helpful! If interested, you’re welcome to download and use the full source code (MIT license).

Shameless plug time! If you’re looking to have an iOS app developed, get in touch. We’re available for consulting work.

Twitter: @scalo • Cofounder of @MadeByWindmill - an iOS dev shop • Apple, Inc ex-pat.