This is the third in a series of posts on the finite Fourier transform. The Fourier matrix produces an interesting graphic and has a surprising eigenvalue distribution.

The Fourier matrix of order $n$ is the $n$ -by- $n$ complex Vandermonde matrix $F$ whose elements $f_{k,j}$ are powers of the $n$ th root of unity

$$ \omega = e^{-2 \pi i/n} $$

$$ f_{k,j} = \omega^{k j} $$

The matrix can be generated with the MATLAB statements

k = (0:n-1)'; j = (0:n-1); F = exp(-2*pi*i*k*j/n);

Or, by taking the FFT of the identity matrix

F = fft(eye(n))
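If you want to check the construction outside MATLAB, here is a minimal pure-Python sketch of the same matrix, using only the standard library (`fourier_matrix` is just an illustrative name, not part of any toolbox):

```python
import cmath

def fourier_matrix(n):
    # f_{k,j} = omega^(k*j), where omega = exp(-2*pi*i/n) is the n-th root of unity
    omega = cmath.exp(-2j * cmath.pi / n)
    return [[omega ** (k * j) for j in range(n)] for k in range(n)]

F = fourier_matrix(4)
# The first row and column are all ones, and F is symmetric: f_{k,j} = f_{j,k}.
```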

The statement

plot(F)

connects points in the complex plane whose coordinates are the real and imaginary parts of the elements of the columns of `F`, thereby generating a subgraph of the complete graph on `n` points. If `n` is prime, connecting the elements of all columns generates the complete graph on `n` points. If `n` is not prime, the sparsity of the graph of all columns is related to the computational complexity, and hence the speed, of the FFT algorithm. The graphs for `n` = 8, 9, 10, and 11 are generated and plotted by the following code.

for n = 8:11
   subplot(2,2,n-7)
   F = fft(eye(n));
   plot(F)
   axis square
   axis off
   title(['n = ' int2str(n)])
end

Because `n = 11` is prime, the corresponding graph shows all possible connections. But the other three values of $n$ are not prime. Some of the links in their graphs are missing, indicating that the FFT of a vector with that many points can be computed more quickly.

The remainder of this post examines `F12` in more detail. Here is its graph.

clf
n = 12;
F = fft(eye(n));
plot(F)
axis square
axis off
title(['n = ' int2str(n)])

The program `fftmatrix`, available here and included with the NCM App, allows you to investigate these graphs. `fftmatrix(n)` plots all the columns of the Fourier matrix of order `n`. `fftmatrix(n,j)` plots just one column.

Let's plot the individual columns of `F12`. The first column of `F12` is all ones, so its plot is just a single point.

for j = 1:12
   p = mod(j-1,4)+1;
   subplot(2,2,p);
   fftmatrix_mod(12,j-1)
   title(['j = ' int2str(j)])
   if p == 4, snapnow, end
end

To see typical behavior, look at the third subplot, the red graph labeled `j = 3`, generated by the third column

F(:,3)

ans =
   1.0000 + 0.0000i
   0.5000 - 0.8660i
  -0.5000 - 0.8660i
  -1.0000 + 0.0000i
  -0.5000 + 0.8660i
   0.5000 + 0.8660i
   1.0000 + 0.0000i
   0.5000 - 0.8660i
  -0.5000 - 0.8660i
  -1.0000 + 0.0000i
  -0.5000 + 0.8660i
   0.5000 + 0.8660i

These are the powers of $\omega^2$. Because 2 divides 12 evenly these powers hit all the even-numbered points around the circle twice and miss all the odd-numbered points. Now look at the cyan graph labeled `j = 11`. These are the powers of $\omega^{10}$, which are the complex conjugates of the powers of $\omega^2$. So the two graphs lie on top of each other.
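The counting claim is easy to verify directly. Here is a small pure-Python check (standard library only) that the powers of $\omega^2$ for `n = 12` land on exactly six points of the unit circle, each hit twice:

```python
import cmath
from collections import Counter

n = 12
omega = cmath.exp(-2j * cmath.pi / n)
col = [omega ** (2 * k) for k in range(n)]        # powers of omega^2

# Round away floating-point fuzz before comparing points on the unit circle.
counts = Counter((round(z.real, 9), round(z.imag, 9)) for z in col)
# Six distinct points (the even-numbered 12th roots of unity), each hit twice.
```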

Six of the twelve columns in the graph of `F12` connect only a subset of the nodes and ten of the columns lie on top of their complex conjugate columns. As a result, when all of the columns are combined to form the full graph, it is sparse. This sparsity, in turn, makes it possible to construct a fast finite Fourier transform algorithm for `n = 12`.

I've always been curious about the eigenvalues and vectors of the Fourier matrices. In 1979, three friends of mine at the University of New Mexico, Gus Efroymson, Art Steger, and Stan Steinberg, posed the question in the SIAM Review problem section. They didn't know that Jim McClellan and Tom Parks had actually solved their problem seven years earlier, when Jim was a grad student working under Tom at Rice. I didn't learn about the McClellan-Parks paper until fairly recently.

The Wikipedia page on the discrete Fourier transform says the facts about the eigenvalues are well known, but that the eigenvectors are the subject of ongoing research.

Let's rescale $F$ so that its columns have unit length.

F = F/sqrt(n);

This makes $F' \ F = I$

round(F'*F)

ans =
     1     0     0     0     0     0     0     0     0     0     0     0
     0     1     0     0     0     0     0     0     0     0     0     0
     0     0     1     0     0     0     0     0     0     0     0     0
     0     0     0     1     0     0     0     0     0     0     0     0
     0     0     0     0     1     0     0     0     0     0     0     0
     0     0     0     0     0     1     0     0     0     0     0     0
     0     0     0     0     0     0     1     0     0     0     0     0
     0     0     0     0     0     0     0     1     0     0     0     0
     0     0     0     0     0     0     0     0     1     0     0     0
     0     0     0     0     0     0     0     0     0     1     0     0
     0     0     0     0     0     0     0     0     0     0     1     0
     0     0     0     0     0     0     0     0     0     0     0     1

Now $F$ is *unitary*, the complex generalization of *orthogonal*. This implies that all of its eigenvalues lie on the unit circle in the complex plane.
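As a cross-check, the unitarity claim can be verified numerically without MATLAB. A pure-Python sketch (standard library only):

```python
import cmath, math

n = 12
s = 1 / math.sqrt(n)                       # rescale columns to unit length
omega = cmath.exp(-2j * cmath.pi / n)
F = [[s * omega ** (k * j) for j in range(n)] for k in range(n)]

# Form F' * F, where F' is the conjugate (Hermitian) transpose.
FhF = [[sum(F[m][k].conjugate() * F[m][j] for m in range(n))
        for j in range(n)] for k in range(n)]
# Every entry should be 1 on the diagonal and 0 off it, up to roundoff.
```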

Furthermore, it turns out that $F^4 = I$.

round(F^4)

ans =
     1     0     0     0     0     0     0     0     0     0     0     0
     0     1     0     0     0     0     0     0     0     0     0     0
     0     0     1     0     0     0     0     0     0     0     0     0
     0     0     0     1     0     0     0     0     0     0     0     0
     0     0     0     0     1     0     0     0     0     0     0     0
     0     0     0     0     0     1     0     0     0     0     0     0
     0     0     0     0     0     0     1     0     0     0     0     0
     0     0     0     0     0     0     0     1     0     0     0     0
     0     0     0     0     0     0     0     0     1     0     0     0
     0     0     0     0     0     0     0     0     0     1     0     0
     0     0     0     0     0     0     0     0     0     0     1     0
     0     0     0     0     0     0     0     0     0     0     0     1

This implies that any eigenvalue $\lambda$ must satisfy $\lambda^4 = 1$. Consequently the only possible eigenvalues are 1, -1, $i$, and $-i$. You might guess that if $n$ is divisible by 4, the eigenvalues would be equally distributed among these four values. But, surprisingly, to me at least, that doesn't happen.
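The identity $F^4 = I$ can likewise be confirmed outside MATLAB. Here is a pure-Python check of the scaled matrix for `n = 4` (standard library only):

```python
import cmath, math

n = 4
s = 1 / math.sqrt(n)
omega = cmath.exp(-2j * cmath.pi / n)
F = [[s * omega ** (k * j) for j in range(n)] for k in range(n)]

def matmul(A, B):
    m = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(m)) for j in range(m)]
            for i in range(m)]

F2 = matmul(F, F)
F4 = matmul(F2, F2)
# F4 should be the identity matrix, which forces every eigenvalue to satisfy
# lambda^4 = 1, i.e. lambda is one of 1, -1, i, -i.
```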

lambda = eig(F)

lambda =
   1.0000 + 0.0000i
  -1.0000 + 0.0000i
  -0.0000 + 1.0000i
  -1.0000 + 0.0000i
  -1.0000 - 0.0000i
  -0.0000 + 1.0000i
   0.0000 - 1.0000i
   1.0000 + 0.0000i
  -0.0000 - 1.0000i
   1.0000 - 0.0000i
   1.0000 - 0.0000i
   0.0000 - 1.0000i

It's hard to pick through this unsorted output, but there are four +1's, three -1's, three -i's, and only two +i's.

Here is a tricky piece of code that uses `angle` and the counting feature of sparse indexing to count the number of each of the four possible eigenvalues.

```
type eigfftmat
```

function c = eigfftmat(n)
% EIGFFTMAT  Count eigenvalues of the Fourier matrix.
%   c = eigfftmat(n) is a 4-vector with counts for +1, -1, -i, +i.

   % Compute the eigenvalues.
   e = eig(fft(eye(n)));

   % Sparse repeated indexing acts as a counter.
   c = full(sparse(mod(round(angle(e')/(pi/2)),4)+1,1,1))';
   c([3 2]) = c([2 3]);
end

When we run this code for a sequence of values of `n`, we see the pattern predicted by the McClellan-Parks analysis. The number of each of the four possible eigenvalues depends upon `floor(n/4)` and `mod(n,4)`.

disp('  n  +1  -1  -i  +i')
for n = 4:20
   disp([n eigfftmat(n)])
end

  n  +1  -1  -i  +i
  4   2   1   1   0
  5   2   1   1   1
  6   2   2   1   1
  7   2   2   2   1
  8   3   2   2   1
  9   3   2   2   2
 10   3   3   2   2
 11   3   3   3   2
 12   4   3   3   2
 13   4   3   3   3
 14   4   4   3   3
 15   4   4   4   3
 16   5   4   4   3
 17   5   4   4   4
 18   5   5   4   4
 19   5   5   5   4
 20   6   5   5   4

The proofs in the McClellan and Parks paper involve the eigenvectors and are quite complicated. It turns out that this MATLAB expression

floor((n+[4 2 1 -1])/4)

generates a 4-vector of the multiplicities of the +1, -1, -i, and +i eigenvalues for any given value of `n`.
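The formula is easy to sanity-check against the table above. Here is a pure-Python version (with `eig_counts` an illustrative name); note that the four counts always sum to $n$, as they must for an $n$-by-$n$ matrix:

```python
def eig_counts(n):
    # Multiplicities of +1, -1, -i, +i, mirroring floor((n+[4 2 1 -1])/4).
    return [(n + d) // 4 for d in (4, 2, 1, -1)]

# Spot checks against the table: n = 12 gives [4, 3, 3, 2],
# matching four +1's, three -1's, three -i's, and two +i's.
```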

J. H. McClellan and T. W. Parks, "Eigenvalues and eigenvectors of the discrete Fourier transformation", IEEE Trans. Audio and Electroacoustics, 20, 66-74, <http://dx.doi.org/10.1109/TAU.1972.1162342>, 1972.

G. Efroymson, A. Steger, and S. Steinberg, "A Matrix Eigenvalue Problem", SIAM Review, 21, 139, <http://dx.doi.org/10.1137/1021013>, 1979.

Wikipedia, "Discrete Fourier Transform", <http://en.wikipedia.org/wiki/Discrete_Fourier_transform>, retrieved 09/03/2014.

Get the MATLAB code

Published with MATLAB® R2014a

Let’s see how this works using a simple example!

**The Problem**

Let's say we have a setup where a DC motor is used to move a load between two positions. The position is measured using an encoder, and a PID controller is used to control the angle of the DC motor. The model looks like this:

When the model is simulated, the load tracks the desired position like this:

How could we detect if something breaks in the motor, or if the dynamics of the load change?

**Online Identification**

When starting the system we will spend a few seconds to identify its dynamics online (while the system is running). Once we have a model estimated, we will compare the encoder reading with the output of the model. If the error between the real system and the model becomes too large, it's a sign something undesired is happening... and we want to detect that!

I wrapped the control loop shown above in a subsystem and connected to it the Recursive Polynomial Model Estimator block from the System Identification Toolbox library.

Using the enable input port, we enable estimation for the first few seconds (only a few periods of the motor motion are necessary to get a good model). Once the estimation is completed, we monitor the estimation error, using very simple logic blocks to make a signal go from zero to one as soon as the error becomes larger than an acceptable level.

**The Results**

To test the logic in simulation, I faked the failure by doubling the friction in my DC motor model. As you can see, when the system dynamics change around 10 seconds, the error flag instantly signals the problem:

**What's next?**

The online estimation block also supports code generation. To see that in action, I recommend watching the video Online Fault Detection for a DC Motor by my colleague Karthik Srinivasan.

In this video, Karthik uses a model similar to the one I describe above, except that instead of a simulated DC motor, he uses driver blocks from the Arduino Simulink Support Package to detect a failure on a real DC motor.

**Now it's your turn**

As you can imagine, fault detection is only one application of online parameter estimation. For example, the identified model could be used in any sort of adaptive control.

Let us know what you think of the new online estimation features of the System Identification Toolbox by leaving a comment here.

Jiro's pick this week is Colormouse by Patrick.

In addition to examining the actual values or creating line plots, you can get a lot of insight by visualizing data with color.
MATLAB has a number of functions that let you represent your data with color, including `image`, `surf`, `mesh`, and `contourf`. Many of these functions make use of a `colormap` that represents a range of colors for the values.

contourf(peaks)
colorbar

The `colorbar` above shows the range of colors represented in the figure. You can programmatically change this color range by using the `caxis` function.

caxis([-4 4])

Patrick's `colormouse` allows you to change the colormap range interactively using the mouse. Calling `colormouse` adds a toolbar button to the current figure.

Once you're in colormouse mode, you can click and drag to shift the color range (by moving the mouse up and down) or to widen or narrow it (by moving left and right).

You can also select a specific colormap from a context menu.

**Comments**

Let us know what you think here or leave a comment for Patrick.

*Note that the actual source code resides on GitHub, so please check the GitHub page for the license information.*


I'd like to introduce this week's guest blogger Alan Weiss. Alan writes documentation for mathematical toolboxes here at MathWorks.

An economist I know (well, he's my son) asked me a question the other day. I was able to answer him quickly by using Symbolic Math Toolbox™ calculations. You might find the steps to be an interesting case of a somewhat nonstandard use of symbolic mathematics.

My son was working through a paper that described a computational algorithm. He got stuck on one step of the algorithm:

If $A$ is a positive definite (symmetric) matrix, find a **lower triangular** matrix $G$ such that

$$GAG^{-1} = D, \mbox{ where }D\mbox{ is diagonal.}$$

He called me asking if there was some linear algebra that he just never learned. I knew how to find a matrix $G$ that would diagonalize $A$ using the `eig` function. But a lower triangular version? I didn't know how to do that. I told him I would call him back after some investigation.

After spending a couple of minutes fruitlessly looking through a linear algebra book, I decided to see what Symbolic Math Toolbox had to say about the equations that any such matrix $G$ would have to satisfy. Here were my steps.

syms a b c g1 g2 g3 real   % Create symbolic variables
G = [g1,0;g2,g3]           % Lower triangular matrix
H = inv(G)                 % Its inverse
A = [a,b;b,c]              % Symmetric matrix A
D = G*A*H                  % D is supposed to be diagonal

G =
[ g1,  0]
[ g2, g3]

H =
[        1/g1,    0]
[ -g2/(g1*g3), 1/g3]

A =
[ a, b]
[ b, c]

D =
[                                 a - (b*g2)/g3,        (b*g1)/g3]
[ (a*g2 + b*g3)/g1 - (g2*(b*g2 + c*g3))/(g1*g3), (b*g2 + c*g3)/g3]

Look at the upper right entry in D. If D is diagonal, then b*g1 = 0. This means b = 0 or g1 = 0. But if b = 0 then A is already diagonal. And if g1 = 0 then the first row of G is zero, so G is not invertible.

In other words, I had just proved that there is no such thing as a lower triangular matrix that diagonalizes a symmetric matrix in this way.
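A quick numeric illustration of the same dead end, in plain Python with arbitrarily chosen values (no symbolic toolbox assumed): for any invertible lower triangular $G$ and symmetric $A$ with $b \neq 0$, the upper right entry of $GAG^{-1}$ works out to $b g_1/g_3$, which cannot vanish.

```python
# Arbitrary test values: A symmetric with b != 0, G lower triangular and invertible.
a, b, c = 2.0, 1.0, 3.0
g1, g2, g3 = 1.5, -0.7, 2.0

G = [[g1, 0.0], [g2, g3]]
A = [[a, b], [b, c]]
Ginv = [[1/g1, 0.0], [-g2/(g1*g3), 1/g3]]   # inverse of a 2x2 lower triangular matrix

def mat2(X, Y):
    # 2x2 matrix product
    return [[sum(X[i][t] * Y[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]

D = mat2(mat2(G, A), Ginv)
# D[0][1] equals b*g1/g3, so D cannot be diagonal unless b = 0.
```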

I called my son back and explained the calculation. We decided that the paper he was reading had a misstatement, and he went ahead and used `eig` to diagonalize the matrix in his application. He will get in touch with the author of the paper to explain the misstatement in the algorithm.

I was quite happy with this result. Symbolic math calculations had saved us both a lot of time searching for an algorithm that doesn't exist.

Do you use symbolic math calculations in cute or innovative ways? Tell us about it here.


*Note: I really didn't expect to be writing about the FFT two weeks in a row. But ready or not, here we go!*

My friend Joe walked into my office the other day and announced, "Steve, I've got a question about the FFT."

(Joe was one of my MathWorks interviewers 21 years ago this month. His question for me then was, "So what do you bring to the party?" I don't remember how I answered.)

Joe is one of the biggest MATLAB experimenters I know. He's always trying to use MATLAB for some new kind of analysis or to automate something dubious (such as changing the hue of the lights in his house. Hmm.)

Anyway, lately he's been into automatic analysis of music. And something was bothering him. If you have a pure sinusoid signal and take the FFT of it, why does it spread out in the frequency domain? How do you interpret the output of `fft` when the frequency is in between two adjacent frequency bins of the FFT? And is that situation fundamentally different from when the sinusoid's frequency is exactly one of the FFT bin frequencies?

Oh boy. I told Joe, "Well, you don't really have a sinusoid. A sinusoid is an infinitely long signal. You have a finite portion of a sinusoid, which I think of as a rectangularly windowed sinusoid. Go away, let me think more about how to explain all that, and then I'll come find you tomorrow."

Because of my background in digital signal processing, I tend to think about this in terms of the relationships between three kinds of Fourier transforms: the (continuous-time) Fourier transform, the discrete-time Fourier transform (DTFT), and the discrete Fourier transform (DFT).

I have written before (23-Nov-2009) about the various kinds of Fourier transforms. It is the last kind, the DFT, that is computed by the MATLAB `fft` function.

It's hard to tell exactly where to start in any discussion about Fourier transform properties because the use of terminology and the mathematical conventions vary so widely. Let me just dive in and walk through Joe's example. If you have questions about what's going on, please post a comment.

*Note: In what follows, I am making a very big simplifying assumption. Specifically, I'm assuming that the sampling frequency is high enough for us to be able to discount the possibility of aliasing in the sampling of the sinusoid. This isn't a completely valid assumption because, unlike a real sinusoid, the truncated or windowed sinusoid doesn't have a finite bandwidth.*

Here are some sample parameters that Joe gave me.

Sinusoid frequency in Hz:

f0 = 307;

Sampling frequency in Hz:

Fs = 44100;

Sampling time:

Ts = 1/Fs;

How many samples of the sinusoid do we have? From a theoretical perspective, think of this as the length of a rectangular window that multiplies our infinitely long sinusoid sequence.

M = 1500;

n = 0:(M-1);
y = cos(2*pi*f0*Ts*n);

Compute the M-point discrete Fourier transform (DFT) of `y`. This is what the function `fft` computes.

Y = fft(y);

Plot the DFT. The DFT itself doesn't have physical frequency units associated with it, but we can plot it on a Hz frequency scale by incorporating our knowledge of the sampling rate. (I'm plotting the DFT as disconnected points to emphasize its discrete-frequency nature. Also, I'll be plotting magnitudes of the Fourier transforms throughout.)

f = (0:(M-1))*Fs/M;
plot(f,abs(Y),'*')
xlim([0 Fs/2])
title('Magnitude of DFT')
xlabel('Frequency (Hz)')

Zoom in on the peak.

p = 2*Fs/(M+1);
f1 = f0 - 10*p;
f2 = f0 + 10*p;
xlim([f1 f2])

What's with the "spread" around the peak? The spread is caused by the fact that we do not have a pure sinusoid. We have only a finite portion of a pure sinusoid.

To explore this further, let's relate the DFT to what some might call the "real" Fourier transform of our windowed sinusoid. If we can discount aliasing, then the discrete-time Fourier transform (DTFT) is the same as the "real" Fourier transform up to half the sampling frequency. And, for a signal that is nonzero only in the interval 0 <= n < N, an M-point DFT exactly samples the DTFT as long as M >= N. The DTFT of an N-point windowed sinusoid is a sum of shifted copies of the Dirichlet kernel $\sin(\omega N/2) / \sin(\omega/2)$, centered on the positive and negative sinusoid frequencies and repeated at multiples of the sampling frequency.

Can we get a better idea of the true shape of the DTFT? Yes, by zero-padding the input to the DFT. This has the effect of sampling the DTFT more closely. We can use this relationship to plot a pretty good approximation of the DTFT and then superimpose the DFT (the output of `fft`).

M10 = length(y)*10;
Y10 = fft(y,M10);
f10 = (0:(M10-1))*Fs/M10;
plot(f10,abs(Y10));
hold on
plot(f,abs(Y),'*')
hold off
xlim([f1 f2])
legend({'DTFT','DFT'})
xlabel('Frequency (Hz)')

The DFT samples the Fourier transform at a spacing of Fs/M. Let's pick a frequency that lines up exactly on a bin.

f0 = 294;
w0 = 2*pi*f0/Fs;
n = 0:(M-1);
y = cos(w0*n);
Y = fft(y);
p = 2*Fs/(M+1);
f1 = f0 - 10*p;
f2 = f0 + 10*p;
plot(f,abs(Y),'*')
xlim([f1 f2])
title('DFT')
xlabel('Frequency (Hz)')

The DFT samples (output of `fft`) **appear** to be perfect above. There is just one nonzero value, at the sinusoid frequency.
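You can reproduce this on-bin behavior with a small pure-Python DFT (standard library only; the parameters here are scaled-down stand-ins for Joe's). For a real cosine exactly on bin `k0`, the only nonzero DFT values are at `k0` and its mirror image `N - k0`:

```python
import cmath, math

def dft(y):
    # Naive DFT, directly from the definition.
    N = len(y)
    return [sum(y[m] * cmath.exp(-2j * cmath.pi * k * m / N) for m in range(N))
            for k in range(N)]

N, k0 = 16, 3                                  # frequency exactly on bin k0
y = [math.cos(2 * math.pi * k0 * m / N) for m in range(N)]
Y = dft(y)
nonzero_bins = [k for k in range(N) if abs(Y[k]) > 1e-9]
# Only bins k0 and N - k0 survive, each with magnitude N/2.
```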

But the appearance of perfection is just a kind of illusion. You can see what's really happening by superimposing the DFT samples on top of the DTFT, which we compute as before using the zero-padding trick.

Y10 = fft(y,M10);
plot(f10,abs(Y10));
hold on
plot(f,abs(Y),'*')
hold off
xlim([f1 f2])
title('DTFT with DFT superimposed')
xlabel('Frequency (Hz)')

You can see that most of the DFT samples happen to line up precisely at the zeros of the DTFT.

Finally, let's look at what happens when you use a longer signal. (Joe said, "Can't I make the spread go away by using a sufficiently large number of samples of the sinusoid?" The answer, as we'll see demonstrated, is that you can make the spread get smaller, but you can't make it go away entirely as long as you have a finite number of samples.) Let's go up a factor of 10, to 15,000 samples of the sinusoid. (And let's go back to the original frequency of 307 Hz.)

f0 = 307;
M = 15000;
w0 = 2*pi*f0/Fs;
n = 0:(M-1);
y = cos(w0*n);
Y = fft(y);
M10 = length(y)*10;
Y10 = fft(y,M10);
f10 = (0:(M10-1))*Fs/M10;
f = (0:(M-1))*Fs/M;
plot(f10,abs(Y10));
hold on
plot(f,abs(Y),'*')
hold off
xlim([f1 f2])
title('DTFT with DFT superimposed')
xlabel('Frequency (Hz)')

The plot above is shown at the same scale as the one for M = 1500 further up. You can see that the peak is much narrower, but it's hard to see the details. Let's zoom in closer.

p = 2*Fs/(M+1);
f1 = f0 - 10*p;
f2 = f0 + 10*p;
xlim([f1 f2])

Now you can see that the DTFT shape and DFT samples look very much like what we got for the 1500-point windowed sinusoid. The width of the peak, as well as the spacing of the sidelobes, has been reduced by a factor of 10 (15000 / 1500). And the spacing (in Hz) of the DFT samples has been reduced by the same amount.

In summary, the Fourier transform of a finite portion of a sinusoid has a characteristic shape: a central peak (the "main lobe") with a bunch of "side lobes" in either direction from the peak. The DFT simply samples this shape. Depending on the exact frequency, the output of the DFT might look a little different, depending on where the frequency happens to line up with the DFT frequency bins.


Idin's pick for this week is the SimRF Models of Analog Devices RF Agile Transceivers by MathWorks RF Team.

This is probably one of the best documented submissions on the File Exchange. The Simulink models in this submission can be used to accurately simulate a commercial radio frequency (RF) chip (Analog Devices AD9361). The work was the result of collaboration between MathWorks and Analog Devices engineers, and the final result is a software model that matches the behavior of the physical device on the lab bench.

You can get a lot of details on the hardware and these software models here.

I will not go into any further technical details on this page, but I want to point out why I believe this was a useful and interesting project.

First, this is just plain cool! Seeing any software model predict (or match) hardware behavior at the level of detail that we find in these models is always interesting (having run simulations for 15+ years, I’m still amused when what I see in the lab matches what I predicted in simulation; it feels magical every time!)

Second, it helps understand and program the chip. An RF chip like the AD9361 (or most any chip for that matter) is a complex piece of hardware, and you can't see inside of it. This is an "agile" RF transceiver, which means it can be programmed to perform many different tasks for many different radio applications. Understanding the functioning of such a chip, and programming it correctly are rather difficult tasks in hardware, especially when the results of one's work can't be seen immediately (at least not without the use of expensive lab equipment such as spectrum analyzers). Having a software model that accurately mimics the hardware allows me to understand the behavior of the hardware such that I can program it better.

Third, debugging. If you have programmed anything but the simplest piece of hardware, you know that things rarely do what you want on the first try. Understanding apparent anomalies and hardware "misbehavior" is made infinitely easier when we can go to a simulation model and trace different signals to our heart's content! With this Simulink model I can look at intermediate signals that I could never access on the physical chip. These intermediate signals often hold the key to understanding why the system's outputs look the way they do.

Fourth (this should really be 3.5), studying the effects of an RF chip on digital baseband signals. These effects are generally something that we observe once we put everything together in the lab. Being able to simulate the digital baseband with the RF front-end will be quite useful in designing a robust communication system before we put any hardware together. Remember that as cool as physical prototypes are, they're typically quite expensive and time-consuming to build.

Finally (or fifth!), it builds trust in Simulink and SimRF. This one explains why we invested the time to do this. The fact that we could construct this model in a relatively short amount of time, and that it matches lab measurements, makes us confident that the simulation software tools are both capable and accurate.

I encourage you to take a look at the documentation and videos on this page to get more information on this package.

As always, your thoughts and comments here are greatly appreciated.


Since I am really enjoying the book, I decided that for this week's post I would use Simulink to validate some of the numbers provided by the author for one of the questions in the book: if an asteroid were very small but supermassive, could you really live on it like the Little Prince?

**What would the gravity feel like on this little planet?**

For a tiny asteroid of 1.75 m radius, like the one in *Le Petit Prince* by Antoine de Saint-Exupéry, to have a gravitational field comparable to the one on Earth, it would need to weigh about 500 million tons.

But how would this gravitational field change as you get farther from the surface? Since the radius is so small, the pull of gravity would drop off quickly, as illustrated by Randall:

*Source: http://what-if.xkcd.com/68/*

To validate that, I used a new R2014a feature in SimMechanics: The Gravitational Field block. To test the effect of the gravity, I simply linked together the Gravitational Field of 500 million tons and a Solid of 65kg with a Prismatic Joint, and slowly increased the distance between them. As you can see, the SimMechanics results match the book pretty well!

**What About the Escape Velocity?**

The next thing mentioned in the book is that if you were able to run fast enough (~5m/s) and jump, you could escape the asteroid gravity field:

*Source: http://what-if.xkcd.com/68/*

In SimMechanics, I replaced the Prismatic Joint by a Rectangular Joint:

I specified appropriate initial conditions for the joint:

I clicked play and I could see in the Mechanics Explorer that for the chosen speed, I would end up in a pretty steady almost circular orbit:

**How stable would that orbit be?**

The book mentions that the orbit would be *weird*...

*Source: http://what-if.xkcd.com/68/*

To test that, I replaced the Rectangular Joint by a Planar Joint, to allow rotation. Then I dug out the skater I had built in a previous post about the Olympics and without any modification made him move his arms while in orbit... definitely, the trajectory is interesting:

**Now it's your turn**

I guess this blog post highlights the fact that you should always trust what you read on XKCD!

Thanks to the Gravitational Field block of SimMechanics, I can now begin to plan space missions without digging out my notes from the orbital mechanics class I took during grad school!

Let us know what you think by leaving a comment here.

My last meeting of the day ended just a little while ago, and I was starting to think seriously about figuring out a blog post for this week. What to write about? That's when I happened to see that Cleve posted just yesterday about the FFT. Ah ha! I'll write about that, too (in honor of National Blog About the FFT Week).

I have three little sort-of stories to tell, the first of which was inspired directly by Cleve's post. Cleve described a very compact, elegant implementation of the divide-and-conquer idea that's central to fast Fourier transform algorithms. Here's the key code fragment of Cleve's textbook function `ffttx`:

% Recursive divide and conquer
k = (0:n/2-1)';
w = omega .^ k;
u = ffttx(x(1:2:n-1));
v = w.*ffttx(x(2:2:n));
y = [u+v; u-v];

Function `ffttx` calls itself recursively, twice. Each recursive call is on half of the original input values. Then `ffttx` does a little extra computation at the end to produce the final result.
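For readers who want to play with the idea outside MATLAB, here is a pure-Python sketch of the same recursive divide-and-conquer (it assumes the input length is a power of 2; comparing it against a naive DFT is an easy way to convince yourself it's right):

```python
import cmath

def ffttx(x):
    # Recursive radix-2 FFT, mirroring the structure of Cleve's ffttx.
    n = len(x)
    if n == 1:
        return list(x)
    omega = cmath.exp(-2j * cmath.pi / n)
    u = ffttx(x[0::2])                          # FFT of even-indexed samples
    v = [omega ** k * t for k, t in enumerate(ffttx(x[1::2]))]
    # First half: u + v; second half: u - v (since omega^(k + n/2) = -omega^k).
    return [u[k] + v[k] for k in range(n // 2)] + \
           [u[k] - v[k] for k in range(n // 2)]
```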

Like Cleve, I really love this little bit of code. But it makes me think back to all the FFT code that I saw back when I was first learning digital signal processing. (That would be the mid- to late-1980s.) Here's a typical textbook sample from the 1989 edition of *Discrete-Time Signal Processing* by Oppenheim and Schafer:

This code is typical for the time. The focus is on doing the computation in-place and using as few multiplies as possible. One thing you don't see here: explicit recursion! The function `DITFFT` does not call itself, either directly or indirectly. That was typical for the time. No one wanted to use the memory to make copies of the two half-length arrays, and no one wanted to incur the speed penalty of many recursive function calls. Another typical thing you see in this code is something charmingly called bit-reversed sorting, but that's a topic for another day (maybe).

Anyway, my point is that modern FFT implementations built for today's computer architectures often do make use of explicit recursion. The additional cost of dividing the input array into two or more smaller copies is more than offset by the gains of more cache-friendly execution as the divide-and-conquer problems get smaller and smaller.

My second little story was inspired by one line of code that Cleve posted:

y = [u+v; u-v];

In the case where `u` is `x(1)` and `v` is `x(2)`, then the computation above is just the 2-point DFT of `x`.

A signal flow diagram for it looks like this (also from *Discrete-Time Signal Processing*):

Digital signal processing experts have for decades referred to this little graph as a "butterfly diagram" because of its appearance.

Well, to make a little extra money for myself in grad school, I wrote a couple of chapters of the solutions manual for *Discrete-Time Signal Processing*. One of the chapters covered FFTs. This effort left me completely fed up with drawing butterfly diagrams and counting multiplies. One day I posted the following chart outside my cubicle:

Yes, digital signal processing people even call that mess above a "butterfly diagram." We're a weird bunch.

That butterfly diagram indirectly leads me to my third little FFT story.

About ten years ago, a tech support case was forwarded to me. A customer complained that MATLAB was very slow in computing the FFT of one specific vector. Other vectors of the same length would go really fast, but not this one.

So I loaded the customer's MAT-file containing the vector and started running FFTs on it. To my surprise, it really was slow! I scratched my head. Then I looked at the output values, and I was even more surprised to see that they were all NaNs! Strange! What was the point? And was this related to the speed issue?

I looked at the long input vector provided by the customer, and I found that just one of the input values was NaN.

Light bulb!

Take a look again at that complicated butterfly diagram above. Notice that every output value depends on every input value. That's true in general for FFT computations of any length. With the way NaNs propagate through arithmetic operations, that implies that just one NaN in the input vector means that the output of `fft` will always be completely filled with NaNs.
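That light-bulb moment is easy to reproduce. Here is a pure-Python sketch with a naive DFT (standard library only):

```python
import cmath, math

def dft(y):
    # Naive DFT: every output value is a sum over every input sample.
    N = len(y)
    return [sum(y[m] * cmath.exp(-2j * cmath.pi * k * m / N) for m in range(N))
            for k in range(N)]

x = [1.0] * 8
x[3] = float('nan')                  # a single NaN among the inputs...
Y = dft(x)
all_poisoned = all(math.isnan(z.real) for z in Y)   # ...and every output is NaN
```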

That was related to the speed issue, as it turned out. NaN calculations often are executed in a less-well optimized part of the CPU, or even in software. Lots and lots of NaN calculations bogged down this particular customer's problem.

FFTs are a never-ending source of fun and mystery!

I have written quite a few blog posts about Fourier transform theory and algorithms. You can see them all in the Fourier transforms category.

PS. If you have ideas for even more fun ways to create a vector of NaNs, be sure to add them to the comments on this post.


This is the second in a series of three posts about the Finite Fourier Transform. This post is about the fast FFT algorithm itself. A recursive divide and conquer algorithm is implemented in an elegant MATLAB function named `ffttx`.

The acronym FFT is ambiguous. The first F stands for both "fast" and "finite". A more accurate abbreviation would be FFFT, but nobody wants to use that. In MATLAB, the expression `fft(x)` computes the finite Fourier transform of any vector `x`. The computation is fastest if the integer `n = length(x)` is the product of powers of small primes.

Direct application of the definition

$$ Y_k = \sum_{j=0}^{n-1} \omega^{jk} y_j, \ k = 0, \ldots, n-1 $$

requires $n$ multiplications and $n$ additions for each of the $n$ components of $Y$ for a total of $2 n^2$ floating-point operations. This does not include the generation of the powers of $\omega$. A computer capable of doing one multiplication and addition every microsecond would require a million seconds, or about 11.5 days, to do a million-point FFT.

Several people discovered fast FFT algorithms independently and many people have since contributed to their development, but it was a 1965 paper by John Tukey of Princeton University and James Cooley of IBM Research that is generally credited as the starting point for the modern usage of the FFT.

Modern fast FFT algorithms have computational complexity $O(n \log_2 n)$ instead of $O(n^2)$. If $n$ is a power of 2, a one-dimensional FFT of length $n$ requires fewer than $3 n \log_2 n$ floating-point operations. For $n = 2^{20}$, that's a factor of almost 35,000 faster than $2 n^2$. Even if $n = 2^{10}$, the factor is about 70.

With MATLAB 8.3 and a 2.7 GHz Intel Core i7 laptop, the time required for `fft(x)` if `length(x)` is `2^23 = 8388608` is about 0.5 second. The built-in `fft` function is based on FFTW, "The Fastest Fourier Transform in the West," developed at M.I.T. by Matteo Frigo and Steven G. Johnson in 1998.

The key to the fast FFT algorithms is the fact that the square of the $2n$ th root of unity is the $n$ th root of unity. Using complex notation,

$$ \omega = \omega_n = e^{-2 \pi i/n} $$

we have

$$ \omega_{2n}^2 = \omega_n $$
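As a quick numerical sanity check (my own addition, not from the original post), this identity can be verified directly in MATLAB:

```matlab
% Check that the square of the 2n-th root of unity is the n-th root of unity.
n = 8;
w_n  = exp(-2*pi*1i/n);
w_2n = exp(-2*pi*1i/(2*n));
abs(w_2n^2 - w_n)   % differs from zero only by roundoff
```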

The derivation of the fast algorithm starts with the definition of the finite Fourier transform:

$$ Y_k = \sum_{j=0}^{n-1} \omega^{jk} y_j, \ k = 0, \ldots, n-1 $$

Divide the sum into terms with even subscripts and terms with odd subscripts.

$$ Y_k = \sum_{even \ j} \omega^{jk} y_j \ + \ \sum_{ odd \ j} \omega^{jk} y_j $$

Assume that $n$ is even and that $k \leq n/2-1$, and repurpose the summation subscripts.

$$ Y_k = \sum_{j=0}^{n/2-1} \omega^{2jk} y_{2j} \ + \ \omega^k \sum_{j=0}^{n/2-1} \omega^{2jk} y_{2j+1} $$

The two sums are components of the FFTs of length $n/2$ of the portions of $y$ with even and odd subscripts. In order to get the entire FFT of length $n$, we have to do two FFTs of length $n/2$, multiply one of these by powers of $\omega$, and concatenate the results.

The relationship between an FFT of length $n$ and two FFTs of length $n/2$ can be expressed compactly in MATLAB.

```
omega = exp(-2*pi*i/n);
k = (0:n/2-1)';
w = omega .^ k;
u = fft(y(1:2:n-1));
v = w.*fft(y(2:2:n));
```

then

```
fft(y) = [u+v; u-v];
```
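Here is a quick check of that relationship (my own sketch, not from the original post) for a random vector of even length; the difference should be on the order of roundoff error:

```matlab
% Verify that fft(y) matches [u+v; u-v] for an even-length vector.
n = 16;
y = randn(n,1);
omega = exp(-2*pi*1i/n);
k = (0:n/2-1)';
w = omega .^ k;
u = fft(y(1:2:n-1));        % FFT of the even-subscript elements
v = w.*fft(y(2:2:n));       % FFT of the odd-subscript elements, scaled
norm(fft(y) - [u+v; u-v])   % near machine precision
```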

Now, if $n$ is not only even but actually a power of 2, the process can be repeated. The FFT of length $n$ is expressed in terms of two FFTs of length $n/2$, then four FFTs of length $n/4$, then eight FFTs of length $n/8$, and so on until we reach $n$ FFTs of length one. An FFT of length one is just the number itself. If $n = 2^p$, the number of steps in the recursion is $p$. There is $O(n)$ work at each step, so the total amount of work is

$$ O(np) = O(n \log_2 n) $$

If $n$ is not a power of two, it is still possible to express the FFT of length $n$ in terms of several shorter FFTs. An FFT of length 100 is two FFTs of length 50, or four FFTs of length 25. An FFT of length 25 can be expressed in terms of five FFTs of length 5. If $n$ is not a prime number, an FFT of length $n$ can be expressed in terms of FFTs whose lengths divide $n$. Even if $n$ is prime, it is possible to embed the FFT in another whose length can be factored. We do not go into the details of these algorithms here.

The textbook function `ffttx` combines two basic ideas. If $n$ is a power of 2, it uses the $O(n \log_2 n)$ fast algorithm. If $n$ has an odd factor, it uses the fast recursion until it reaches an odd length, then sets up the discrete Fourier matrix and uses matrix-vector multiplication.

`ffttx` is one of my favorite MATLAB functions. Here is the entire code.

```
function y = ffttx(x)
%FFTTX Textbook Fast Finite Fourier Transform.
%   FFTTX(X) computes the same finite Fourier transform
%   as FFT(X).  The code uses a recursive divide and
%   conquer algorithm for even order and matrix-vector
%   multiplication for odd order.  If length(X) is m*p
%   where m is odd and p is a power of 2, the computational
%   complexity of this approach is O(m^2)*O(p*log2(p)).

x = x(:);
n = length(x);
omega = exp(-2*pi*i/n);

if rem(n,2) == 0
   % Recursive divide and conquer
   k = (0:n/2-1)';
   w = omega .^ k;
   u = ffttx(x(1:2:n-1));
   v = w.*ffttx(x(2:2:n));
   y = [u+v; u-v];
else
   % The Fourier matrix
   j = 0:n-1;
   k = j';
   F = omega .^ (k*j);
   y = F*x;
end
```

Here is a plot of the Fourier matrix of order 16. Plots like this are the subject of the next post in this series. For now, just notice that this is *not* a complete graph with 16 nodes.

```
plot(fft(eye(16)))
axis equal
axis off
```

Steve Eddins has written extensively about the FFT in his blog. Select "Categories more" and then click on "Fourier transforms". <http://blogs.mathworks.com/steve>

James W. Cooley and John W. Tukey, "An algorithm for the machine calculation of complex Fourier series", Math. Comput. 19: 297-301, 1965. <http://dx.doi.org/10.2307%2F2003354>

Matteo Frigo and Steven G. Johnson, "FFTW, Fastest Fourier Transform in the West", <http://www.fftw.org>.


*Marta is a senior application engineer at MathWorks UK, helping MATLAB users to design, develop and deploy production-quality code. Toshi is a Senior Marketing Manager at MathWorks.*

*by Marta Wilczkowiak and Toshi Takeuchi*

In the previous post, you learned how we built an RFID-based people tracking system from scratch in three months, a “technology stack” that spanned multiple hardware and software disciplines. We ended the post with how we managed to survive the moment of truth: the first integration test, live at a conference with hundreds of delegates.

Now we had raw data to analyze to improve our system.

**The Twist – Making Sense of RFID Data**

Once we examined the real-world data collected during the event more closely, we faced a new dilemma – we had originally assumed we could use signal strength to estimate distances. However, in reality there was so much fluctuation in signal strength that you could not make a confident estimate this way. This, incidentally, is why RFID-based localization is a hot research topic today.

We were determined to work on an engineering solution, but it would take more time and effort. In the meantime, was there a way to get meaningful insights from the messy data at hand?

**Web Analytics Meets Engineering**

As we discussed the conference data internally at MathWorks, this issue came to the attention of Toshi Takeuchi, a US-based analyst. He immediately saw a similarity with the familiar problems he faced in analyzing massive volumes of page views and clicks – always messy. What would happen if you applied web analytics techniques to this engineering problem? We embraced this suggestion and immediately shared our code and data with Toshi. Our collaboration was possible because we all spoke the same language – MATLAB.

**Use of Proxy Metric**

Non-physical measurements are often used in web analytics. For example, the recommendations you see on online shopping or movie streaming websites are computed using distance measured in terms of similarity or affinity. Was there anything in the dataset, besides the signal strength, that we could turn into a proxy metric? The new approach Toshi came up with was to recast connection frequency as this virtual distance. If two tracking devices are connecting to one another more frequently than others in a given time interval, then we could think of them as being “close” in the virtual space defined by connection frequency. This metric seemed far more reliable than signal strength. Our original goal was to understand the social interactions of the delegates, and the physical distance metric was just a means to that end. If a virtual metric did the same job, then it seemed a good substitute.
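A minimal sketch of this idea (with hypothetical variable names; the actual processing code was not published): tally how often each pair of devices connects within a time interval, then apply a monotone transform so that more frequent connections map to smaller virtual distances.

```matlab
% Hypothetical sketch: deviceA and deviceB are column vectors of device
% ids, one row per logged connection within some time interval.
nDevices = max([deviceA; deviceB]);
C = accumarray([deviceA deviceB], 1, [nDevices nDevices]);
C = C + C';                        % symmetric connection counts
D = 1 ./ (1 + C);                  % more connections -> "closer"
D(logical(eye(nDevices))) = 0;     % zero virtual distance to self
```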

**The Result**

With the support from Toshi, we were able to reconstruct the estimated movements of the delegates using the connection frequencies with stationary anchors such as demo stations (orange circles) and catering (green circles). Note that no personally identifiable information was collected or used for this analysis.

In the plot above, you see such an estimated movement for Delegate #120. This person appeared to have approached a nearby catering station first, then spent quite some time around BLOODHOUND Super Sonic Car in the middle, but in the afternoon stayed “closer” to software related demos than hardware-based demos as measured in this “virtual” space.

We can use this information to map the frequencies of connections between delegates and demo stations, and visualize it as a heat map. The vertical bands with lighter red stripes indicate which demos had more connections than others, and horizontal stripes show which demo stations each delegate connected to frequently. You can see that people who connected with “RT” (a real-time testing exhibit) and “HW” (a hardware testing exhibit) had very little overlap with the people who connected with “Robo” (a robotics exhibit).

**The Insight – the Three Factors of Delegate Behavior**

Ultimately, the data is only as good as the insights you get from it. When we analyzed the estimated movements of the delegates with a technique called Principal Component Analysis, we could summarize the complex data with a mere three factors.

- The “Web”: people who showed strong “connections” with the “Web” demo
- The “Robo”: people who showed strong “connections” with “Robo” demo
- The “Conventional”: people who showed strong “connections” with “HW” demo
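Here is a sketch of how such a summary might be computed (the actual analysis code was not published; the matrix name `X` is hypothetical), using `pca` from Statistics Toolbox:

```matlab
% X: rows = delegates, columns = demo stations,
% entries = connection frequencies (hypothetical matrix).
[coeff,score,~,~,explained] = pca(zscore(X));
explained(1:3)               % variance explained by the first three factors
scatter(score(:,1),score(:,2),10,'filled')   % each dot is a delegate
xlabel('Factor 1')
ylabel('Factor 2')
```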

Furthermore, “Web” people and “Robo” people tend to share similar connection patterns as compared to “Conventional” people – what is common to both? They are both what we might call “convergence” applications where software and hardware are being integrated in an unconventional way.

While this might not be a novel insight, it was still very useful for us to know that the data backed up what we had been intuitively sensing all along. It was also interesting to see which delegates showed which particular inclinations on an individual basis (each dot in the plot identifies a specific delegate).

**Back to Reality – the Next Step**

It is tempting to speculate if the virtual proximity on this plot correlates to actual shared interest among those delegates. If so, it is also interesting that two of the main factors turned out to be the web and robotics, which are examples of convergence-oriented applications that cross the traditional virtual/physical divide.

If there was an exit survey that asked the participants which demos they found interesting, we could use predictive analytics techniques to validate how well this virtual metric predicts the actual interest. Alternatively, we could use additional stationary anchors to determine which sessions delegates attended. If we placed more anchors on the floors in a more evenly spaced grid, including the areas populated by third party exhibitors, it would give more data points and more accurate measurements.

If we validated this correlation, it could open up opportunities for new ways for our delegates to find like-minded people.

**The Journey Continues**

Our journey is not over yet – we still have challenges with the system. But our project proved how such an inter-disciplinary system could be built in three months using MATLAB, Simulink, and lots of team work.

Sean's pick this week is Simscape Language: Nonlinear Rotational Spring by Steve Miller.

I spend a fair amount of time exploring the MathWorks product suite trying to learn new things. Exploring the File Exchange, I came upon this example of a nonlinear spring. Having not had much opportunity to use our Physical Modeling Tools, this example caught my eye because it uses something we civil/structural engineers are familiar with: springs!

The example includes a README file to get started, a PDF file, and a link to a video. Not going to lie, I didn't watch the video. But I didn't have to! The model documents how things work and the README gave me all of the instructions necessary to get this off the ground.

Here is the model and the scope showing a linear spring compared to the nonlinear spring:

This is the design of the nonlinear spring in the Simscape Language.

Out of curiosity, I wanted to see what would happen for a much stiffer spring (that's what structural engineers use!), so I changed the value of *spring_rate* to 50 and reran the model:

And that's the best part about simulation: I didn't have to go buy or test a bigger spring to see what would happen.

Give it a try and let us know what you think here or leave a comment for Steve.


How can you find out what's in an image file before you read in the data? The answer is the function `imfinfo`. Today I want to explore the basics of using this function.

The function `imfinfo` returns a struct. The fields of the struct contain information about the image file.

Here's an example.

```
imfinfo('peppers.png')
```

```
ans = 

                  Filename: '/Applications/MATLAB_R2014a.app/toolbox/matla...'
               FileModDate: '02-Apr-2013 15:55:52'
                  FileSize: 287677
                    Format: 'png'
             FormatVersion: []
                     Width: 512
                    Height: 384
                  BitDepth: 24
                 ColorType: 'truecolor'
           FormatSignature: [137 80 78 71 13 10 26 10]
                  Colormap: []
                 Histogram: []
             InterlaceType: 'none'
              Transparency: 'none'
    SimpleTransparencyData: []
           BackgroundColor: []
           RenderingIntent: []
            Chromaticities: []
                     Gamma: []
               XResolution: []
               YResolution: []
            ResolutionUnit: []
                   XOffset: []
                   YOffset: []
                OffsetUnit: []
           SignificantBits: []
              ImageModTime: '16 Jul 2002 16:46:41 +0000'
                     Title: []
                    Author: []
               Description: 'Zesty peppers'
                 Copyright: 'Copyright The MathWorks, Inc.'
              CreationTime: []
                  Software: []
                Disclaimer: []
                   Warning: []
                    Source: []
                   Comment: []
                 OtherText: []
```

The fields returned by `imfinfo` aren't the same for all image files. Here's the output for a TIFF file.

```
imfinfo('forest.tif')
```

```
ans = 

                     Filename: '/Applications/MATLAB_R2014a.app/toolbox/im...'
                  FileModDate: '25-Sep-2013 20:12:00'
                     FileSize: 124888
                       Format: 'tif'
                FormatVersion: []
                        Width: 447
                       Height: 301
                     BitDepth: 8
                    ColorType: 'indexed'
              FormatSignature: [73 73 42 0]
                    ByteOrder: 'little-endian'
               NewSubFileType: 0
                BitsPerSample: 8
                  Compression: 'PackBits'
    PhotometricInterpretation: 'RGB Palette'
                 StripOffsets: [1x17 double]
              SamplesPerPixel: 1
                 RowsPerStrip: 18
              StripByteCounts: [1x17 double]
                  XResolution: 72
                  YResolution: 72
               ResolutionUnit: 'Inch'
                     Colormap: [256x3 double]
          PlanarConfiguration: 'Chunky'
                    TileWidth: []
                   TileLength: []
                  TileOffsets: []
               TileByteCounts: []
                  Orientation: 1
                    FillOrder: 1
             GrayResponseUnit: 0.0100
               MaxSampleValue: 255
               MinSampleValue: 0
                 Thresholding: 1
                       Offset: 122964
             ImageDescription: 'Carmanah Ancient Forest, British Columbia,...'
```

Only the first nine fields of the returned struct are always the same:

`Filename`, `FileModDate`, `FileSize`, `Format`, `FormatVersion`, `Width`, `Height`, `BitDepth`, `ColorType`

`FileModDate` is the file modification date (as reported by the operating system). `FileSize` is the size of the file in bytes. The peppers.png file takes up 287,677 bytes on disk.

`Format` is an abbreviation indicating the type of image file (PNG, TIFF, JPEG, etc.). Although frequently the filename extension indicates the image format type, this isn't always the case. The `Format` field shows you the format type whether or not the filename extension indicates it.

`FormatVersion` has turned out to be less useful than I thought it would be back in the mid-1990s. You generally don't need to pay attention to it.

`Width` is the number of image columns, and `Height` is the number of rows.

If I were designing `imfinfo` from scratch today, I'd probably call the next field `BitsPerPixel` instead of `BitDepth`. I might also add a field called `SamplesPerPixel`. Anyway, images such as peppers.png used to be frequently called "24-bit images" because each pixel is represented by three samples (one number for red, one for green, and one for blue), and each sample is represented by 8 bits. 3*8 is 24, hence 24-bit image.

The file forest.tif, on the other hand, uses only one sample for each pixel, and each sample is represented by 8 bits, so forest.tif is sometimes called an "8-bit image," and its `BitDepth` is 8.

`ColorType` tells us how the pixel values are interpreted as colors on the screen. The most common values for `ColorType` are `'grayscale'`, `'indexed'`, and `'truecolor'`. In a grayscale image, each pixel value represents a shade of gray. In an indexed image, each pixel value is really a lookup index used to get the pixel color from a colormap (also called a palette). Truecolor (in this context) originally referred to a 24-bit image with three samples per pixel, and with the samples representing red, green, and blue. In MATLAB, though, we use the term to refer to any RGB image represented by three samples per pixel, regardless of the bit depth.
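For example (my own illustration, not from the original post), the `ColorType` field can tell you whether `imread` will hand back index values plus a colormap that you need to apply yourself:

```matlab
info = imfinfo('forest.tif');
if strcmp(info.ColorType,'indexed')
    [X,map] = imread('forest.tif');   % X holds lookup indices
    rgb = ind2rgb(X,map);             % apply the colormap to get truecolor
else
    rgb = imread('forest.tif');
end
```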

Here's an example of using the information returned by `imfinfo` to compute the compression ratio for a JPEG file. Let's use ngc6543a.jpg, the first truecolor sample image shipped with MATLAB.

```
info = imfinfo('ngc6543a.jpg');
file_bytes = info.FileSize;
image_bytes = info.Width * info.Height * info.BitDepth / 8;
```

I divided by 8 in the line above because `BitDepth` is in bits instead of bytes.

compression_ratio = image_bytes / file_bytes

compression_ratio = 42.7210

So what about all those other fields after the first nine? Well, they vary depending on the file format and the specific information contained in the file. For example, TIFF files optionally contain horizontal and vertical resolution information.

```
info = imfinfo('forest.tif');
info.XResolution
```

ans = 72

info.YResolution

ans = 72

info.ResolutionUnit

ans = Inch

Most image file formats allow you to store text descriptions in them.

```
info = imfinfo('peppers.png');
info.Description
```

ans = Zesty peppers

These days, most cameras store a lot more information in image files than the older sample images that are in MATLAB and the Image Processing Toolbox. This information is also available using `imfinfo`.

imshow('IMG_1046.JPG','InitialMagnification',20)

```
info = imfinfo('IMG_1046.JPG')
```

```
info = 

           Filename: '/Users/steve/Dropbox/MATLAB/Blogs/IMG_1046.JPG'
        FileModDate: '28-Jun-2014 22:07:32'
           FileSize: 1595137
             Format: 'jpg'
      FormatVersion: ''
              Width: 2816
             Height: 1880
           BitDepth: 24
          ColorType: 'truecolor'
    FormatSignature: ''
    NumberOfSamples: 3
       CodingMethod: 'Huffman'
      CodingProcess: 'Sequential'
            Comment: {}
   ImageDescription: ' '
               Make: 'Canon'
              Model: 'Canon PowerShot S95'
        Orientation: 1
        XResolution: 180
        YResolution: 180
     ResolutionUnit: 'Inch'
           DateTime: '2014:06:28 19:07:32'
   YCbCrPositioning: 'Co-sited'
      DigitalCamera: [1x1 struct]
      ExifThumbnail: [1x1 struct]
```

That extra information is in the `DigitalCamera` field.

info.DigitalCamera

```
ans = 

                ExposureTime: 0.0040
                     FNumber: 4.9000
             ISOSpeedRatings: 80
                 ExifVersion: [48 50 51 48]
            DateTimeOriginal: '2014:06:28 19:07:32'
           DateTimeDigitized: '2014:06:28 19:07:32'
     ComponentsConfiguration: 'YCbCr'
      CompressedBitsPerPixel: 3
           ShutterSpeedValue: 7.9688
               ApertureValue: 4.5938
           ExposureBiasValue: 0
            MaxApertureValue: 4.5938
                MeteringMode: 'Pattern'
                       Flash: 'Flash did not fire, no strobe return detect...'
                 FocalLength: 22.5000
                   MakerNote: [1x2764 double]
                 UserComment: [1x264 double]
             FlashpixVersion: [48 49 48 48]
                  ColorSpace: 'sRGB'
            CPixelXDimension: 2816
            CPixelYDimension: 1880
         InteroperabilityIFD: [1x1 struct]
       FocalPlaneXResolution: 9.6438e+03
       FocalPlaneYResolution: 9.6575e+03
    FocalPlaneResolutionUnit: 2
               SensingMethod: 'One-chip color area sensor'
                  FileSource: 'DSC'
              CustomRendered: 'Normal process'
                ExposureMode: 'Auto exposure'
                WhiteBalance: 'Auto white balance'
            DigitalZoomRatio: 1
            SceneCaptureType: 'Standard'
                 UnknownTags: [1x1 struct]
```

Do you have questions about `imfinfo`? Do you ever use it for writing batch processing scripts? I'd be interested to hear about it, so please leave a comment below.


**The Problem**

If you ask me what the difference is between Simulink Coder and Embedded Coder, I would tell you that Simulink Coder allows you to generate code from a Simulink model, and Embedded Coder allows you to configure what the generated code looks like.

To illustrate that, we will start with a simple hand-written program, and see how to configure a model so that the code generated from it integrates without modification.

In this simple `main`, every second the program reads data from a text file and stores it in a global variable `u`. We want our auto-generated code to access this variable, and use it to compute the value of another global variable `y` to be used later in the code.

**Configuring the Model**

For this example, let's use the following simple model:

By default, if we generate code for this model we get something that looks like:

It is obvious that modifications are necessary to map the values of `u` and `y` in the hand-written code to the input and output of the generated code.

To configure the look of the code, the first step is to name the input signal `u` and the output `y`. Then we need to tell Embedded Coder that `u` and `y` are external variables, already defined outside of the generated code, in the hand-written code.

For that, right-click on the signal line and select Properties:

Go to the Code Generation tab, and set the storage class to `ImportedExtern` (Note that the storage class and a lot more could also have been specified using a data object).

**Generated Code**

Generate code for the model, and you should get something which looks like this:

Where `u` and `y` are declared as `extern`:

With this modification, the generated code integrates into the hand-written application, and I can build the `main` program without errors.

**Now it's your turn**

There are many ways to customize the code generated by Embedded Coder. This example is probably the simplest one possible, but I hope this gives you a good idea of how to get started. Let us know how you got started with the coders by leaving a comment here.

Many of us carry around smartphones that can track our GPS positions, and that's an interesting source of data. How can we analyze GPS data in MATLAB?

Today's guest blogger, Toshi Takeuchi, would like to share an analysis of a public GPS dataset from a popular ride sharing service Uber.

Uber is a ride sharing service that connects passengers with private drivers through a mobile app and takes care of payment. They are in fact so popular that you hear about them in the news due to their conflicts with local traffic regulations and taxi business interests.

Uber's ride sharing GPS dataset happens to be publicly available for analysis. What can we learn from this dataset?

Let's start by downloading the dataset from the link above (a zipped TSV file), which contains the GPS logs taken from the mobile apps in Uber cars that were actively transporting passengers in San Francisco. The data have been anonymized by removing names, trip start and end points. The dates were also substituted. Weekdays and time of day are still intact.

For the purpose of this analysis, let's focus on the data captured in the city proper and visualize it with Mapping Toolbox.

Run the script to load data. Check `loadData.m` to see the details.

loadData

Overlay the GPS points on the map.

```
states = geoshape(shaperead('usastatehi', 'UseGeoCoords', true));
latlim = [min(T.Lat) max(T.Lat)];
lonlim = [min(T.Lon) max(T.Lon)];
ocean = [0.7 0.8 1];
land = [0.9 0.9 0.8];
figure
ax = usamap(latlim, lonlim);
setm(ax, 'FFaceColor', ocean)
geoshow(states,'FaceColor',land)
geoshow(T.Lat,T.Lon,'DisplayType','Point','Marker','.',...
    'MarkerSize',4,'MarkerEdgeColor',[0 0 1])
title('Uber GPS Log Data')
xlabel('San Francisco')
textm(37.802069,-122.446618,'Marina')
textm(37.808376,-122.426105,'Fishermans Wharf')
textm(37.797322,-122.482409,'Presidio')
textm(37.774546,-122.412329,'SOMA')
textm(37.770731,-122.440481,'Haight')
textm(37.818276,-122.498546,'Golden Gate Bridge')
textm(37.819632,-122.376065,'Bay Bridge')
```

Let's start with a basic question - how does the use of Uber service change over time. We can use `grpstats` to summarize data grouped by specific categorical values, such as `DayName` and `TimeOfDay`, which were added in the data loading process.

Get grouped summaries.

```
byDay = grpstats(T(:,{'Lat','Lon','DayName'}),'DayName');
byDayTime = grpstats(T(:,{'Lat','Lon','TimeOfDay','DayName'}),...
    {'DayName','TimeOfDay'});
```

Reshape the count of entries into a 24x7 matrix.

byDayTimeCount = reshape(byDayTime.GroupCount,24,7)';

Plot the data by day of week and by hours per day of week.

```
figure
subplot(2,1,1)
bar(byDay.GroupCount);
set(gca,'XTick',1:7,'XTickLabel',cellstr(byDay.DayName));
subplot(2,1,2)
plot(byDayTimeCount');
set(gca,'XTick',1:24);
xlabel('Hours by Day of Week');
legend('Mon','Tue','Wed','Thu','Fri','Sat','Sun',...
    'Orientation','Horizontal','Location','SouthOutside')
```

It looks like the usage goes up during the weekend (Friday through Sunday) and usage peaks in early hours of the day. San Francisco has a very active night life!

Is there a way to figure out where people go during the weekend? Even though the dataset doesn't contain the actual starting and ending points of individual trips, we may still get a sense of how the traffic flows by looking at the first and last points of each record.

We can extract the starting and ending location data for weekend rides. Click `getStartEndPoints.m` to see how it is done. If you would like to run this script, please download `districts.xlsx` as well.

```
% Here we load the preprocessed data |startEnd.mat| to save time and plot
% their starting points.
% getStartEndPoints  % commented out to save time
load startEnd.mat    % load the preprocessed data instead
figure
ax = usamap(latlim, lonlim);
setm(ax, 'FFaceColor', ocean)
geoshow(states,'FaceColor',land)
geoshow(startEnd.StartLat,startEnd.StartLon,'DisplayType','Point',...
    'Marker','.','MarkerSize',5,'MarkerEdgeColor',[0 0 1])
title('Uber Weekend Rides - Starting Points')
xlabel('San Francisco')
textm(37.802069,-122.446618,'Marina')
textm(37.808376,-122.426105,'Fishermans Wharf')
textm(37.797322,-122.482409,'Presidio')
textm(37.774546,-122.412329,'SOMA')
textm(37.770731,-122.440481,'Haight')
textm(37.818276,-122.498546,'Golden Gate Bridge')
textm(37.819632,-122.376065,'Bay Bridge')
```

When you plot the longitude and latitude data, you just get messy point clouds and it is hard to see what's going on. Instead, I broke the map of San Francisco into rectangular blocks to approximate its districts. Here is the new plot of starting points by district.

```
dist = categories(startEnd.StartDist);
cc = hsv(length(dist));
figure
ax = usamap(latlim, lonlim);
setm(ax, 'FFaceColor', ocean)
geoshow(states,'FaceColor',land)
for i = 1:length(dist)
    inDist = startEnd.StartDist == dist(i);
    geoshow(startEnd.StartLat(inDist),startEnd.StartLon(inDist),...
        'DisplayType','Point','Marker','.','MarkerSize',5,'MarkerEdgeColor',cc(i,:))
end
title('Uber Weekend Rides - Starting Points by District')
xlabel('San Francisco')
textm(37.802069,-122.446618,'Marina')
textm(37.808376,-122.426105,'Fishermans Wharf')
textm(37.797322,-122.482409,'Presidio')
textm(37.774546,-122.412329,'SOMA')
textm(37.770731,-122.440481,'Haight')
textm(37.818276,-122.498546,'Golden Gate Bridge')
textm(37.819632,-122.376065,'Bay Bridge')
```

This is a step in the right direction. Now that we have the starting and ending points grouped by districts, we can represent the rides as connections among different districts - this is essentially a graph with districts as nodes and rides as edges. To visualize this graph, we can use a popular social networking analysis tool Gephi, which was also used in another post, Analyzing Twitter with MATLAB.

You can export `StartDist` and `EndDist` as the edge list to Gephi in CSV format.

```
writetable(startEnd(:,{'StartDist','EndDist'}),'edgelist.csv',...
    'WriteVariableNames',false)
```

Once you export the edge list, you can plot the connections (edges) between districts (nodes) in Gephi. Now it is much easier to see where people went during the weekend! To see a bigger image, check out the PDF version.

- The size of the district nodes represents their in-degrees, the number of incoming connections, and you can think of it as measure of popularity as destinations. SOMA, Haight, Mission District, Downtown, and The Castro are the popular locations based on this measure.
- The districts are colored based on their modularity, which basically means which cluster of nodes they belong to. It looks like people hang around set of districts that are nearby - SOMA, Downtown, Mission District are all located towards the south (green). The Castro, Haight, Western Addition in the center (purple) and it is strongly connected to Richmond and Sunset District in the west. Since those are residential areas, it seems people from those areas hang out in the other districts in the same cluster.
- The locals don't seem to go to Fisherman's Wharf or Chinatown in the north (red) very much - they are probably considered not cool because of tourists?

Now you know where to go in San Francisco during the weekend if you want to experience an active night life there. We just looked at the overall weekend data, but you can explore more by time slicing the data to see how the traffic pattern changes based on the time of day or day of the week. You may be able to find traffic congestion by calculating the speed using the timestamps. Try it yourself and share what you find here!
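One way to start on the speed idea (a sketch with assumed field names; the dataset's timestamp field may be stored differently) is to divide the great-circle distance between consecutive fixes of a ride by the elapsed time:

```matlab
% Approximate speed between consecutive GPS fixes of a single ride.
% Assumes T.Lat and T.Lon in degrees and T.Time as datenum values.
d  = deg2km(distance(T.Lat(1:end-1),T.Lon(1:end-1), ...
                     T.Lat(2:end),  T.Lon(2:end)));   % kilometers
dt = diff(T.Time)*24;                                 % hours
speed = d ./ dt;                                      % km/h
```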

Get
the MATLAB code

Published with MATLAB® R2014a

Brett's Picks this week are: Image Blur Metric, by Do Quoc Bao; Blind image quality assessment through anisotropy, by Salvador Gabarda; and Noise Level Estimation from a Single Image, by Masayuki Tanaka.

Brett returns to his roots as an image processing geek this week to pose a question that he gets asked by customers from time to time. Namely, *'How can I quantify the noise in my image?'*

In R2014a, we added two functions that facilitate the measurement of image noise: `ssim`, with which you can calculate the overall or pixelwise "structural similarity index"; and `psnr`, for computing the signal-to-noise ratio (peak or simple) of an image.

Both of these calculations are made by comparing a potentially noisy image to a "pristine" one; if you have a reference image and you need to calculate deviations from it, these functions may serve you well. (Note that there are many other files on the File Exchange that facilitate different reference-image quality metrics. Do a quick search for "image quality"; you'll find several.)

But what if you don't have a reference image to which to compare your noisy image? That is, what if you need to do a "blind" image quality assessment? This can be simultaneously much more useful and much more difficult. Approaches to blind noise assessment often focus on quantification of blur or, conversely, determination of edge sharpness. I have on numerous occasions successfully used the "threshold" output of the various `edge` metrics to decide which images are of higher quality. For instance:

```
ax = zeros(3,1);
% Pristine
img = imread('rice.png');
ax(1) = subplot(1,3,1);
[~,thresh] = edge(img,'sobel');
imshow(img)
title(sprintf('Edge threshold = %0.2f',thresh))
% Some noise
h = fspecial('motion', 5, 0);
img2 = imfilter(img,h,'replicate');
ax(2) = subplot(1,3,2);
[~,thresh] = edge(img2,'sobel');
imshow(img2);
title(sprintf('Edge threshold = %0.2f',thresh))
% More noise
h = fspecial('motion', 15, 0);
img3 = imfilter(img,h,'replicate');
ax(3) = subplot(1,3,3);
[~,thresh] = edge(img3,'sobel');
imshow(img3)
title(sprintf('Edge threshold = %0.2f',thresh))
```

Today's Picks facilitate blind image quality assessment. The first is Do Quoc Bao's Image Blur Metric. While the code could use some commenting and error checking, it is well implemented; it cites the SPIE Proceedings from which it was taken; and it is exceedingly easy to use. Simply put, the blur metric is reported to range from 0 to 1, with 0 indicating "sharp," and 1 representing "blurry":

```
title(ax(1),sprintf('Blur = %0.2f',blurMetric(img)))
title(ax(2),sprintf('Blur = %0.2f',blurMetric(img2)))
title(ax(3),sprintf('Blur = %0.2f',blurMetric(img3)))
```

Next, check out Blind image quality assessment through anisotropy, by Salvador Gabarda. Salvador's blind analysis "discriminates blur and Gaussian noise...given a set of registered images, the image showing the highest index is the best." It works on grayscale or color images, and provides lots of parameters to play with.

```
title(ax(1),sprintf('1000*Quality = %0.2f',1000*blindimagequality(img,2)))
title(ax(2),sprintf('1000*Quality = %0.2f',1000*blindimagequality(img2,2)))
title(ax(3),sprintf('1000*Quality = %0.2f',1000*blindimagequality(img3,2)))
```

Finally, consider Noise Level Estimation from a Single Image, by Masayuki Tanaka. I haven't read Masayuki's original paper on the topic, and I'm not sure why the output value appears to change inversely with blur; perhaps the paper explains it. (Masayuki, care to chime in on that?) This implementation also works on grayscale or color images, though the analysis is plane-by-plane in the latter case:

```
title(ax(1),sprintf('1000*Noise Estimate = %0.2f',1000*NLEstimate(img)))
title(ax(2),sprintf('1000*Noise Estimate = %0.2f',1000*NLEstimate(img2)))
title(ax(3),sprintf('1000*Noise Estimate = %0.2f',1000*NLEstimate(img3)))
```

What use cases do you have for estimating noise in images? Are you more likely to need "blind" approaches, or do methods that compare to a reference image typically suffice? I'd love to hear from you!

As always, I welcome your thoughts and comments.

Get the MATLAB code

Published with MATLAB® R2014a

*The story is broken into two separate blog posts. This week’s installment is told by Marta Wilczkowiak. Marta is a senior application engineer at MathWorks UK, helping MATLAB users to design, develop and deploy production-quality code.*

*by Marta Wilczkowiak*

Technology can be the kiss of death to a conversation. But can we instead, through the thoughtful convergence of hardware, software, and real-time streaming data, actually encourage social interactions?

The barrier between the physical and the virtual in engineering is disappearing in a world where internet companies buy robotics companies, fly drones, and test self-driving cars and wearable devices, while industrial players move into sensor networks and the Internet of Things. Promising as these new approaches are, the inter-disciplinary engineering collaboration they require is very difficult to achieve.

This is a story about our attempt to build such a convergent system in just three months, through cross-disciplinary collaboration. This blue-sky project is called Expo Conversations.

**EXPO Conversations: The Idea**

MATLAB EXPO is the pinnacle of MathWorks UK events. It is a unique opportunity for the MATLAB community to network with peers across a range of industries and academia. Planning starts early in the year and involves engineers from all technical groups at MathWorks. During one of those meetings, we set ourselves a challenge: to create a system bringing together all the new technologies we tell our customers about, including low cost hardware, data analytics and cloud computing. At the same time, we saw an opportunity to understand how ideas and influences flow throughout EXPO.

Our idea was to use low cost hardware to detect the interactions between the EXPO participants, and provide a web interface displaying the live analysis of the social activity. To encourage social interactions we would run a contest and give a prize to the most prolific networker at the event.

*Figure 1: Simulated social interactions in the EXPO Exhibition Hall*

This conversation took place in June. EXPO was scheduled for October 8th so we had little time to develop something from nothing!

To summarize, the objective was to:

- Demonstrate end-to-end technology stack integration, starting from low cost hardware to the cloud-based visualization
- Enable cross-disciplinary team collaboration
- Move from a blue sky idea to cloud deployment in 3 months

**July: First tag**

14 weeks to go

July started with discussion about the best approach to detect the attendee interactions: Cameras? Apps on mobile phones? RF tags? We wanted participants to remain anonymous so we opted for RF tags, with attendees opting in simply by picking up a tag on their way into the exhibition hall. Conor, our 2nd year summer intern, with support from Mark (embedded systems) and Graham (communications), developed the first tags using Arduino Uno microcontrollers (ATmega328P) and RF12 433MHz transceivers. They showed that two early tag prototypes could communicate with each other and that the received signal strength was somewhat related to the distance between the tags. Characterizing this relationship remained a holy grail throughout the project! You can read more about this part of the project in Conor’s MakerZone blog post.

**August: Algorithms**

9.5 weeks to go

With promising early results, we kicked off the data analysis part of the project. We started by brainstorming the elements we would like to understand about the social interaction patterns. Some of our favorite suggestions: Can we quantify the relationship between caffeine intake and social activity? Can we detect wallflowers, conversation boosters… or conversation killers? Can we see if visitors to a demo station recommend it to others? We quickly realized that most of those questions needed a definition of a conversation. We decided that two tag holders are in a conversation if they stay near to each other for 30 seconds.

But how do we know whether two tags are close to each other? We decided to use Received Signal Strength Indication (RSSI) as a proxy for distance between two tags. On the simplicity-robustness scale, it is closest to simplicity and that’s what we needed given the project timeframe.

By detecting conversations between tags, we can learn the popularity of demo and catering stations (see Figure 2) by monitoring tags located at those places of interest. All the tag data would be transmitted to Raspberry Pi base stations, which would send it in batches to the analytics server every 10 seconds.
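
The post doesn't show the detection code, but the rule above is easy to sketch. Here is a minimal Python illustration (the RSSI threshold, the batch layout, and all names are invented for this example; the real system was implemented in MATLAB):

```python
# Sketch of the conversation rule: a pair of tags is "in conversation" once
# its pairwise signal strength has stayed above a nearness threshold for at
# least 30 seconds. The -60 dBm threshold and batch layout are invented.

BATCH_PERIOD = 10   # seconds covered by each batch a base station sends
MIN_DURATION = 30   # sustained nearness that counts as a conversation
NEAR_RSSI = -60     # dBm; stronger than this is treated as "near"

def conversations(batches):
    """batches: list of dicts mapping a tag pair (a, b) to its best RSSI
    in that batch. Returns the pairs that stayed near long enough."""
    streak = {}                     # pair -> consecutive seconds of nearness
    talking = set()
    for batch in batches:
        for pair, rssi in batch.items():
            if rssi > NEAR_RSSI:
                streak[pair] = streak.get(pair, 0) + BATCH_PERIOD
                if streak[pair] >= MIN_DURATION:
                    talking.add(pair)
            else:
                streak[pair] = 0    # nearness interrupted; start over
    return talking
```

With three consecutive near batches (30 seconds) a pair is reported as a conversation; a single strong reading is not enough.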

*Figure 2: EXPO Exhibition Hall floor plan with marked demos and catering stations (click to enlarge)*

We had 10 engineers helping on the data analysis challenge in August. However, this posed its own challenges:

- Many short-term contributors. The median time spent on the project was a week, including understanding the project, developing algorithms, integrating with the system, and testing.
- Simultaneous development. Regardless of which part of the system they would work on, people only had time in August, so we needed to find a way to work simultaneously rather than consecutively.
- Experience. In general, the time one could allocate to the project was inversely proportional to one's experience.
- Communication. The “low cost hardware” team did not understand the “analytics” team and vice versa.
- Data. We had no real data to test our algorithms on – the hardware was in development!

We needed to design a system that would allow numerous people with different backgrounds and levels of experience to:

- write interdependent algorithms simultaneously
- integrate their algorithms with the main system without disrupting others' work
- test their algorithms without the data they relied upon

We put in place a 5-point plan to address these challenges:

- Strong core. Following the philosophy that a faulty design would hurt us more than a few bugs in the algorithmic code, we asked a couple of our Consultants to spend a few days designing the architecture and coding the core classes of our system.
- Small independent modules. Representing the system by small modules with clear responsibilities and interfaces was key to being able to absorb code contributions from a large team. See Figure 3 for the main system components. Creating sensible modules was facilitated by support in MATLAB for Object Oriented Programming. Keeping them small was possible thanks to the rich existing capability for reading, manipulating, analyzing and saving data. This includes built-in support for communications with databases, UDP, and different file formats as well as processing functions such as logical indexing, set operations, group statistics and graph analysis functions. All the modules were small: the largest file implementing a data processing algorithm contained 100 lines of code. Short code was obviously faster to write, but it was also faster to test.
- Self-service model repository. We designed the system as a series of “analysis agents”: classes tasked to perform a very specific analysis operation. Some agents would be responsible for cleaning entry data, some for tracking conversations; some would search for the most active networkers, etc. We combined the Observer and Template Method patterns to create agents that had a relatively sophisticated way of interacting with the main infrastructure, while allowing team members with no Object Oriented experience to contribute code. In practice, this meant that the algorithm developer only had to write very simple procedural code, specific to his or her algorithm: how to get input data, process it and save output. Then, with one line of code, they could set their agent to “observe” and act on changes in the database. As an example, on the appearance of new raw data an agent would clean it and save it back to the database.
- Data: fake it till you make it. What might be a disputable strategy in everyday social interactions was essential to enable the algorithm developers to write and test their code. Almost all parts of the analytics system were developed using simulated data. In the first few days of the project, in parallel with designing and implementing the analytics system, we created a high-level simulator that would create synthetic data representing a number of tags following journeys around the exhibition hall. Key to developing the first simulator in just a couple of days was that MATLAB offered all the required components in a single environment: random number generation, visualization and an expressive programming language. As we needed to switch to real data late in the project, we wrote our code accessing data (“Data Access Layer”) in such a way that it was possible to use data from the simulator, test logs and database without any change in the algorithmic code.
- Code generation. The next step was to add a layer to the simulator representing the logic of the system that would relay information received from the tags to the analytics server: how the tags synchronize to acquire and share the information, how they send this to base stations, and finally how base stations form and send packets to the analytics server. This “executable specification” enabled early detection of the differences between how the “hardware” and “analytics” team understood the requirements. But even more importantly once this logic has been expressed in MATLAB, we could use MATLAB Coder to automatically generate C code that would be used on the Raspberry Pi base stations. This saved us time and increased our confidence that the data we were using to test our algorithms was representative of what our data acquisition system would produce when ready. It also allowed us to test the embedded algorithms while the tag hardware was still being developed, since they were included in the simulator. This is a big benefit for embedded systems designers.
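
As a miniature illustration of that agent design (in Python; the real system was MATLAB classes over a real database, and every name below is invented), a Template Method base class can own the observe/read/write plumbing so that a contributor writes only the `process` step:

```python
# Sketch of the "analysis agent" idea: the base class handles the plumbing
# (subscribe, fetch input, save output); a contributor writes process() only.
# All class, method, and topic names are invented for this illustration.

class AnalysisAgent:
    def __init__(self, db):
        self.db = db

    def observe(self, topic):
        # Register with the data layer; on_change fires when new data arrives.
        self.db.subscribe(topic, self.on_change)

    def on_change(self, topic):
        raw = self.db.read(topic)                 # plumbing: fetch input
        result = self.process(raw)                # algorithm-specific step
        self.db.write(topic + "_clean", result)   # plumbing: save output

    def process(self, data):                      # the only method to write
        raise NotImplementedError

class CleaningAgent(AnalysisAgent):
    def process(self, data):
        # Example "algorithm": drop physically implausible RSSI readings.
        return [r for r in data if -100 < r < 0]

class FakeDB:
    # Minimal in-memory stand-in for the real database layer.
    def __init__(self):
        self.tables, self.subs = {}, {}
    def subscribe(self, topic, cb):
        self.subs.setdefault(topic, []).append(cb)
    def write(self, topic, data):
        self.tables[topic] = data
        for cb in self.subs.get(topic, []):
            cb(topic)
    def read(self, topic):
        return self.tables[topic]

db = FakeDB()
CleaningAgent(db).observe("raw")
db.write("raw", [-55, 7, -200, -80])   # new raw data triggers the agent
```

Writing new raw data triggers the agent, which saves the cleaned readings back under a derived topic name, mirroring the one-line "observe" hookup described above.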

*Figure 3: High-level system architecture: uniform access to real and simulated data; analysis agents representing system logic and producing aggregate data about the event to be displayed in MATLAB or via web interface*

In the meantime, the hardware team was busy creating in reality what we had simulated: the networking system that would relay the tag data to a server.

By the end of August we had early versions of all the analysis agents, tested on data from the simulator. We also had a long list of known issues, including the fact that our prototype code could not keep up with the predicted data rate.

**September: “Production”**

5 weeks to go

September saw the transformation of our prototype system into an early version of what could be called a production system.

First, we harnessed our code in a series of unit tests (better late than never!) to allow faster iterations when fixing bugs and optimizing the code.

Second, we optimized the prototype implementations for speed. A couple of the biggest wins were:

- Vectorization of the calculation of the social network graph
- Implementation of some of the tag data filtering in SQL instead of MATLAB, to reduce the time overhead of data transfers

Both operations reduced the time needed to analyze a batch of data from tens of seconds to fractions of a second.

Third, we re-implemented our analysis engine to run each analysis agent on a separate thread. Out of many choices for doing so, we decided to recode the thin analysis engine skeleton in Java and deploy each analysis agent as a separate function on MATLAB Production Server. Thanks to using the loosely coupled agents in the first place, this required only changing one line in the agent’s code.

As a final step we used a machine on EC2 with an elastic IP to set up a Tomcat webserver with our webpage and a directory that would be regularly synchronized with our local directory containing all the results.

At the end of September we had a system analyzing the tags every 10 seconds and updating the webpage every minute but still working only with the simulator. Throughout September attempts to connect the two failed – due to an Endian disagreement, too slow a processing rate in the early version of the analysis engine, data overflow in the prototype data gathering stage, and more.

But we were not ready to cancel the project. After all, we would not often get the opportunity to test it in a hall with 500 people, 8 catering stations, 9 demo stations, and a BLOODHOUND SSC life-size model in the middle (yes, a big pile of metal in the middle of our RF experiment).

**October: EXPO or The First Integration Test**

1 week to go

The first week of October went by in integration attempts and bug fixing. You can see that the moods were mixed:

*Figure 4: Rainbow of emotions; note how the mood improves as we move from bug chasing to handling some real hardware*

The day before EXPO (October 7th) three of us set off to Silverstone, the conference venue. Two hours after arriving everything was connected. The moment the webpage started to show the two test tags in “conversation” with each other was worth all those sleepless nights. We stared in awe. By 7pm all 8 base stations were connected and receiving. At that point we put the system to sleep and left the site.

The morning of EXPO passed in a flash. We started with a pile of 200 tags (see Figure 5). By 9.30 am all of them had been distributed. Attendees were curious about the system and wanted to participate. The webpage started to update with plausible results (see sample in Figure 7). All team members, most of whom had so far seen only a small part of the system, were hypnotized by the updating web page. Throughout the day the demo drew a big crowd who asked about everything from working with low cost hardware, through machine learning and “big data”, to web deployment. We also highlighted parts of the system during all relevant talks: Machine Learning, Accelerating MATLAB algorithms, Deployment, Modelling, Simulation, and Real-Time Testing for Model-Based Design. Finally, in the wrap-up talk we announced the tag number that won the networking competition prize. We knew the system was working when we discovered that the winner was our ex-colleague Michael, known for being a social butterfly.

So through this project we achieved:

- end-to-end technology stack integration, starting from low cost hardware to the cloud-based visualization
- cross-disciplinary team collaboration
- blue sky idea to cloud deployment in 3 months

But as EXPO was also the first time the whole system ran together, we still had a lot of work to do to understand what it was really doing.

Learn more as the story continues: Expo Conversations – Part 2

*Figure 5 Tags waiting to be handed to attendees*

*Figure 6: Queue to pick up the tags*

*Figure 7: Screenshot of the live social networking analysis*

We all use Fourier analysis every day without even knowing it. Cell phones, disc drives, DVDs, and JPEGs all involve fast finite Fourier transforms. This post, which describes touch-tone telephone dialing, is the first of three posts about the computation and interpretation of FFTs. The posts are adapted from chapter 8 of my book, *Numerical Computing with MATLAB* .

Touch-tone telephone dialing is an example of everyday use of Fourier analysis. The basis for touch-tone dialing is the Dual Tone Multi-Frequency (DTMF) system. *Synthesis* is the generation of analog tones to represent digits in phone numbers. *Analysis* is the decoding of these tones to retrieve the digits. The program `touchtone`, which is available here, or which is included with my NCM App, demonstrates both these aspects of DTMF.

The telephone dialing pad acts as a 4-by-3 matrix. Associated with each row and column is a frequency. These basic frequencies are

fr = [697 770 852 941]; fc = [1209 1336 1477];

These frequencies were chosen so that none is an integer multiple of any other; in fact, the ratio between adjacent frequencies is roughly 21/19 ≈ 1.105. This avoids interfering overtones.
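
That claim is easy to verify. A quick check (written in Python here, purely for illustration) shows every adjacent ratio is within half a percent of 21/19:

```python
# Verify that adjacent DTMF frequencies are in a ratio of roughly 21/19,
# so that no tone is a low-order harmonic of another.
fr = [697, 770, 852, 941]        # row frequencies, Hz
fc = [1209, 1336, 1477]          # column frequencies, Hz

ratios = [b / a for f in (fr, fc) for a, b in zip(f, f[1:])]
assert all(abs(r - 21 / 19) < 0.005 for r in ratios)
```

All five adjacent ratios (three among the rows, two among the columns) pass the check.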

If `s` is a character that labels one of the buttons on the keypad, the corresponding row index `k` and column index `j` can be found with

```
switch s
   case '*', k = 4; j = 1;
   case '0', k = 4; j = 2;
   case '#', k = 4; j = 3;
   otherwise
      d = s-'0';
      j = mod(d-1,3)+1;
      k = (d-j)/3+1;
end
```

An important parameter in digital sound is the sampling rate.

Fs = 32768

A vector of points in one-fourth of a second at this sampling rate is

t = 0:1/Fs:0.25

The tone generated by the button in position `(k,j)` is obtained by superimposing the two fundamental tones with frequencies `fr(k)` and `fc(j)`.

```
y1 = sin(2*pi*fr(k)*t);
y2 = sin(2*pi*fc(j)*t);
y = (y1 + y2)/2;
```

If your computer is equipped with a sound card, the statement

sound(y,Fs)

plays the tone.

For example, this is the display produced by `touchtone` for the "1" button. The top subplot depicts the two underlying frequencies and the bottom subplot shows a portion of the signal obtained by averaging the sine waves with those frequencies.

How is it possible to determine a phone number by listening to the signal generated? This involves the FFT, the Finite Fourier Transform, or, more likely, some specialized version of the FFT adapted to this task. I will discuss the FFT in my next post.

The data file `touchtone.mat` contains a recording of a telephone being dialed. The file is available from

http://www.mathworks.com/moler/ncm/touchtone.mat

Go to the `Home` tab above the MATLAB command window, click `Open`, and insert this URL into the `File name` field. This will load a structure `y` into your MATLAB workspace. Then the statement

y

will produce

```
y = 
    sig: [1x74800 int8]
     fs: 8192
```

This shows that `y` has two fields, an integer vector `y.sig`, of length 74800, containing the signal, and a scalar `y.fs`, with the value 8192, which is the sample rate.

max(abs(y.sig))

reveals that the elements of the signal are bounded in absolute value by 127. So the statements

Fs = y.fs; y = double(y.sig)/128;

save the sample rate, rescale the vector and convert it to double precision. The statements

n = length(y); t = (0:n-1)/Fs

reproduce the sample times of the recording. The last component of `t` is `9.1307`, indicating that the recording lasts a little over 9 seconds. The next figure is a plot of the entire signal.

This recording is noisy. You can even see small spikes on the graph at the times the buttons were clicked. It is easy to see that 11 digits were dialed, but, on this scale, it is impossible to determine the specific digits.

Here is the magnitude of the FFT of the signal, which is the key to determining the individual digits.

```
p = abs(fft(y));
f = (0:n-1)*(Fs/n);
plot(f,p)
axis([500 1700 0 600])
```

The *x* -axis corresponds to frequency. The `axis` settings limit the display to the range of the DTMF frequencies. There are seven peaks, corresponding to the seven basic frequencies. This overall FFT shows that all seven frequencies are present someplace in the signal, but it does not help determine the individual digits.

The `touchtone` program also lets you break the signal into 11 equal segments and analyze each segment separately. The next figure is the display from the first segment.

For this segment, there are only two peaks, indicating that only two of the basic frequencies are present in this portion of the signal. These two frequencies come from the "1" button. You can also see that the waveform of a short portion of the first segment is similar to the waveform that our synthesizer produces for the "1" button. So we can conclude that the number being dialed in `touchtone` starts with a 1. We leave it as an exercise to continue the analysis with `touchtone` and identify the complete phone number.
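
To make the segment-by-segment idea concrete, here is a small self-contained sketch (in Python rather than MATLAB, so it needs no toolboxes; it is an illustration, not the `touchtone` code). It synthesizes the "1" tone and recovers the digit by comparing the signal's energy at the seven DTMF frequencies:

```python
# Sketch of the per-segment analysis: synthesize the "1" tone (697 + 1209 Hz),
# then find which row and column frequencies carry the most energy and map
# them back to the keypad. A direct DFT evaluated at just the seven DTMF
# frequencies stands in for the full FFT, to keep the sketch dependency-free.
import math

fr = [697, 770, 852, 941]          # DTMF row frequencies, Hz
fc = [1209, 1336, 1477]            # DTMF column frequencies, Hz
keypad = [['1', '2', '3'],
          ['4', '5', '6'],
          ['7', '8', '9'],
          ['*', '0', '#']]

Fs = 8192                          # sample rate, as in touchtone.mat
N = 2048                           # a quarter of a second of signal
y = [0.5 * (math.sin(2*math.pi*697*n/Fs) + math.sin(2*math.pi*1209*n/Fs))
     for n in range(N)]

def energy(sig, f):
    """Magnitude of the DFT of sig evaluated at frequency f (Hz)."""
    re = sum(s * math.cos(2*math.pi*f*n/Fs) for n, s in enumerate(sig))
    im = sum(s * math.sin(2*math.pi*f*n/Fs) for n, s in enumerate(sig))
    return math.hypot(re, im)

k = max(range(4), key=lambda i: energy(y, fr[i]))   # strongest row peak
j = max(range(3), key=lambda i: energy(y, fc[i]))   # strongest column peak
digit = keypad[k][j]
```

Running this recovers `digit == '1'`, mirroring the two peaks visible in the first segment's display.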

`touchtone` has a hidden "Easter egg" that, as far as I know, has never been uncovered. If anyone finds it, submit a comment and let us know.

Cleve Moler, *Numerical Computing with MATLAB*, Electronic Edition, MathWorks, <http://www.mathworks.com/moler/index_ncm.html>,

Print Edition, SIAM Revised Reprint, SIAM, 2008, 334 pp., <http://bookstore.siam.org/ot87> .


But what about Simulink? A lot of the work there is done with the mouse, so the touch-typing configuration is less relevant. Are there guidelines or tips to build, edit, and navigate Simulink models as efficiently as possible?

This week I share mine!

**Navigating**

When navigating, I typically keep my thumb on the **Alt** key, my middle finger on the **Esc** key, and the index on the **1** key.

This allows me to easily do:

- **Esc** to go up one level, with the middle finger
- **Alt+1** to zoom to 100%, using thumb on **Alt** and index on **1**
- **Space bar** to zoom to fit, with the thumb
- **Alt+Tab** to switch between windows, typically the Library Browser and the model, using thumb on **Alt** and index on **Tab**

Here is a typical example of me exploring a model. This typically ends up being a pattern of hitting the **Space bar** when I enter a new subsystem to get the big picture, then **Alt+1** to see the blocks zoomed at 100%, and **Esc** to go up one level when I am done.

**Editing**

When editing, what I need the most are the **Ctrl** and **Ctrl+Shift** shortcuts. So I slightly move my left hand to keep my small finger on the **Ctrl** key, and my ring finger on the **Shift** key.

Here is the list of shortcuts I typically use in this configuration:

Bottom line:

- **Ctrl+Z** to Undo
- **Ctrl+X** to Cut
- **Ctrl+C** to Copy
- **Ctrl+V** to Paste
- **Ctrl+B** to Build the model using Simulink Coder

Middle line:

- **Ctrl+A** to Select all blocks in the current system
- **Ctrl+S** to Save
- **Ctrl+D** to update the diagram (ALWAYS keep your model as close as possible to an updatable state; this will save you lots of debugging time!)
- **Ctrl+G** to Group the selected blocks into one subsystem
- **Ctrl+H** to open the Model Explorer (am I the only one who pronounces it "Model HHHexplorer" because of that?)

Top line:

- **Ctrl+E** to open the model configuration
- **Ctrl+R** to Rotate blocks
- **Ctrl+T** to start a simulation
- **Ctrl+Y** to redo something previously undone
- **Ctrl+U** to open a masked subsystem. With the introduction of the arrow badge, I use this shortcut less often since R2012b
- **Ctrl+I** to flip a block left-right. OK, this one requires some intense stretching to be done with only one hand... be careful not to hurt yourself.

In addition to the **Ctrl** shortcuts, this hand position allows you to access a set of **Ctrl+Shift** shortcuts, by using the little finger for **Ctrl** and the ring finger for **Shift**:

- **Ctrl+Shift+X** to Comment out and uncomment blocks
- **Ctrl+Shift+Y** to Comment through blocks
- **Ctrl+Shift+H** to remove highlighting in the model
- **Ctrl+Shift+G** to Expand a subsystem
- **Ctrl+Shift+L** to open the Simulink Library Browser. (This one cannot be done with the left hand alone, but I thought I should point out its existence. Most of the time, when I edit a model I alternate between the Library Browser and the model using **Alt+Tab**.)

You can see the full list of Simulink shortcuts here.

Since many **Ctrl+Shift** combinations do not have shortcuts assigned, I like to use those to assign custom shortcuts using an sl_customization file, as I previously highlighted in this post. One example for me is **Ctrl+Shift+Z** to display or not the name of blocks.

Here is an animation showing usage of the most common shortcuts:

**Now it's your turn**

Share your tricks to use Simulink as efficiently as possible by leaving a comment here.


I’d like to introduce you to this week’s guest blogger, Graham Dudgeon. Graham is with our Industry Marketing Team at MathWorks, and focuses on the Utilities & Energy Industry. In this blog, Graham shares a story about age being no barrier to exploring ideas and concepts in MATLAB and Simulink.

Hi Everyone, and thank you Loren for inviting me to be a guest blogger. This story begins with my dog, waiting patiently at the patio door expecting either of my two sons to let her in. She was out of luck. I am no dog psychologist, but I figure that because she could see the boys very clearly, she assumed they in turn could see her. In her mind, she didn't need to bark to raise their attention... and so on she waited while my sons continued watching Phineas and Ferb. Once I was alerted to her predicament a mere fraction of a second after entering the family room (by seeing her hopeful eyes and her tail wagging with a vigor in direct proportion to how long she had been waiting), I let her in and then reinforced to my sons that they need to be more aware of their surroundings.

This is where MATLAB and Simulink come into the story. To help pooch out, I made a slight modification to demoimaqsl_rgbhistogram_win.slx, aimed the web cam at the patio door, and when pooch came to the door we were alerted to her presence by a 'gong' that I took from 'gong.mat', using a MATLAB callback to play the sound when a color threshold was reached. It turns out red works best for a fox hound. Was pooch consequently let into the house by my boys? Unfortunately not. The boys decided that a much better use of my endeavor was to see if they could creep past the field of view without the gong going off. This kept them entertained for longer than I expected.

My eldest son, who is 9 years old, started asking questions about how the model worked, how the camera worked, what was doing what, how do you detect pooch, etc. etc. I must have answered with sufficient clarity and enthusiasm as he decided to take what I had done and modify it for his Science Expo.

My son watched the RGB plot as we passed different colors in front, and gained a feel for how to develop a detection algorithm. Our process was to let my son watch only the RGB plot, not the object in front of the camera, and tell me what color he thought was there. ‘Red!’, ’Blue!’, ’Kinda red and blue, and a bit of green!’. We would then confirm his observations by looking at the actual object and moving it again across the field of view to see the RGB response change with the movement. My son associated y-axis magnitude with detection and figured out that to detect, he need simply tell the computer to ‘gong’ when a color got to a certain magnitude. My son also figured that you could remain ‘invisible’ if the detector was tuned only for red and you wore another color, and that a black background would cause less noise on the color signals, as black caused only a small RGB signal. My youngest son also helped out, by wearing different colored shirts and walking (sometimes running) past the web cam. The list of insights my son gained go on and on. He presented his work at the Science Expo, and was met with great enthusiasm and interest by the High School Scientists who assessed his work.

So how does this story end? A few days ago my eldest son told me that when he grows up he wants to be an engineer... and that he wants to work for MathWorks. On hearing this, my 5-year-old son said he wanted to be a scientist (as he 'wants to see into stuff')... and that he too wants to work for MathWorks. So the story continues, and our future is in good hands.

And what of pooch? She’s standing outside the patio door (with hopeful eyes and tail wagging vigorously) waiting for me to finish this post.

Before I sign off, let me give you my main takeaway from this story… You can use MathWorks tools to help kids of any age explore and evolve their curiosity, by connecting them directly to the world of Science and Engineering. Give it a try!


Jiro's pick this week is Reference Creator by Valerio Biscione.

Even though I have been away from the world of academia for quite some time, I can remember and appreciate the effort that people put into writing papers. And in that process, one of the more mundane tasks is putting together the references. Not only do I need to figure out the exact article and journal names, I need to put the information in the correct format. It may not be super difficult, but it's definitely time-consuming.

Valerio's Reference Creator makes this process extremely easy. It accesses Mendeley, an online research paper database, to gather the bibliographical data based on some search criteria, such as author name and year. You can use his tool either through the command line interface or the UI. I gave it a try using a paragraph from one of my papers.

It even gave me a warning when it found multiple matches with the same author(s) and year.

Thanks Valerio for this useful tool! I'm sure a lot of researchers out there will appreciate this.

*Warning: This entry uses some undocumented/unsupported capabilities, namely the displaying of colored and bold text in the
edit boxes. Use this entry with caution, as undocumented features can change in behavior or stop working at any time.*

**Comments**

Give this a try and let us know what you think here or leave a comment for Valerio.


Since the video at the center of this post is about a rap battle between MATLAB and Simulink, I thought it would be interesting to bring that to a higher level and see how the video processing could have been implemented in Simulink instead of MATLAB.

Seriously... who wants to write code when you can simply connect blocks?

**Overview**

Since I have absolutely zero knowledge in image processing, I simply re-implemented the algorithm suggested by Matt in Simulink. The resulting model looks like:

Let's look at the details.

**Importing the video**

To begin, I used the From Multimedia File block from the DSP System Toolbox to directly read the AVI file with our rappers in front of a green background. Then I use the Resize block from the Computer Vision System Toolbox, which allows me to work on images of a relatively small size. After that, I normalize the data, as suggested by Matt.

**The Greenness factor**

I really like the simple equation that Matt figured out to calculate the "greenness" of each pixel in the image. In MATLAB, it looks like:

```
% Greenness = G*(G-R)*(G-B)
greenness = yd(:,:,2).*(yd(:,:,2)-yd(:,:,1)).*(yd(:,:,2)-yd(:,:,3));
```

**Thresholding**

The next piece of code is something that many users struggle to figure out:

`thresh = 0.3*mean(greenness(greenness>0));`

To implement logical indexing (e.g. `A(A>0)`) in Simulink, we need to combine the Find block with a Selector block. This results in a variable-size signal:

**Combining the images**

The last line we need to implement replaces the pixels identified as green in the foreground image with their values from the background image. In MATLAB, this is accomplished using this line:

`rgb2(isgreen) = rgb1(isgreen);`

In Simulink, we need to extend the logic seen in the previous step. We first select the desired pixels from the background image, and then assign them to the foreground image:

Note that the Computer Vision System Toolbox has a Compositing block implementing this functionality.

Here is what the final result looks like:

**Now it's your turn**

What do you think? Is it more convenient to implement this algorithm in MATLAB or Simulink? How do you typically decide between a *code-based* or a *block-based* implementation? Let us know by leaving a comment here.

At the party, we learned that Cleve is the winner of the 2014 IEEE John von Neumann Medal, which is awarded for outstanding achievements in computer-related science and technology.

Happy Birthday and Congratulations, Cleve! I've learned more just by working in the same hallway as you than in much of my formal education. Thanks for being so generous in sharing your knowledge and experience with so many of us for so many years.

Readers, be sure to check out Cleve's blog. Or listen to Cleve describe the three decades of mathematicians, computers, and technology (and a LINPACK license plate!) that led to the creation of MATLAB.

For several years we thought Hadamard matrices showed maximum element growth for Gaussian elimination with complete pivoting. We were wrong.

I want to continue the discussion from my previous blog post of element growth during Gaussian elimination. Suppose we are solving a system of linear equations involving a matrix $A$ of order $n$. Let $a_{i,j}^{(k)}$ denote the elements in the matrix after the $k$ th step of the elimination. Recall that the *growth factor* for the pivoting process we are using is the quantity

$$ \rho_n = { \max_{i,j,k} |a_{i,j}^{(k)}| \over \max_{i,j} |a_{i,j}| } $$

Complete pivoting is intended to control the growth factor by using both row and column interchanges at each step in the reduction to bring the largest element in the entire unreduced matrix into the pivot position. In his 1962 analysis of roundoff error in Gaussian elimination, J. H. Wilkinson proved that the growth factor for complete pivoting satisfies

$$ \rho_n \le g(n) $$

where the *growth function* is

$$ g(n) = \left( n \cdot 2 \cdot 3^{1/2} \cdot 4^{1/3} \cdots n^{1/(n-1)} \right)^{1/2} \approx 1.8 \, n^{1/2} n^{\frac{1}{4} \log{n}} $$

Wilkinson's analysis makes it clear that the growth can never actually be anywhere near this large.

We saw that the growth factor for partial pivoting is $2^{n-1}$. For complete pivoting, it's much, much less. To get a feeling for how much, express $g(n)$ as a MATLAB one-liner

g = @(n) sqrt(n*prod((2:n).^(1./(1:(n-1)))))

g = @(n)sqrt(n*prod((2:n).^(1./(1:(n-1)))))

Check out $n = 100$

g(100)

ans = 3.5703e+03

So here $g(n) \approx 35n$.
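For readers without MATLAB, that one-liner translates to a few lines of NumPy (an illustrative translation, not part of the original post):

```python
import numpy as np

def g(n):
    """Wilkinson's complete-pivoting growth bound g(n)."""
    k = np.arange(2, n + 1)
    return np.sqrt(n * np.prod(k ** (1.0 / (k - 1))))

print(g(100))   # roughly 3.57e+03, i.e. about 35*n for n = 100
```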

Jacques Hadamard was a French mathematician who lived from 1865 until 1963. He made contributions in many fields, ranging from number theory to partial differential equations. The matrices named after him have entries +1 and -1 and mutually orthogonal rows and columns. The $n$ -dimensional parallelotope spanned by the rows of a Hadamard matrix has the largest possible volume among all parallelotopes spanned by matrices with entries bounded by one. We are interested in Hadamard matrices in the MATLAB world because they are the basis for Hadamard transforms, which are closely related to Fourier transforms. They are also applicable to error-correcting codes and, in statistics, to the estimation of variance.

So MATLAB already has a `hadamard` function. Here is the 8-by-8 Hadamard matrix.

H = hadamard(8)

```
H =
     1     1     1     1     1     1     1     1
     1    -1     1    -1     1    -1     1    -1
     1     1    -1    -1     1     1    -1    -1
     1    -1    -1     1     1    -1    -1     1
     1     1     1     1    -1    -1    -1    -1
     1    -1     1    -1    -1     1    -1     1
     1     1    -1    -1    -1    -1     1     1
     1    -1    -1     1    -1     1     1    -1
```

Check that the rows are perpendicular to each other.

H'*H

```
ans =
     8     0     0     0     0     0     0     0
     0     8     0     0     0     0     0     0
     0     0     8     0     0     0     0     0
     0     0     0     8     0     0     0     0
     0     0     0     0     8     0     0     0
     0     0     0     0     0     8     0     0
     0     0     0     0     0     0     8     0
     0     0     0     0     0     0     0     8
```

The orthogonality of the rows leads to element growth during Gaussian elimination. If we call for the LU factorization of `H`, no pivoting actually takes place, but the same result would be produced by complete pivoting that settles ties in favor of the incumbent.

[L,U] = lu(H)

```
L =
     1     0     0     0     0     0     0     0
     1     1     0     0     0     0     0     0
     1     0     1     0     0     0     0     0
     1     1     1     1     0     0     0     0
     1     0     0     0     1     0     0     0
     1     1     0     0     1     1     0     0
     1     0     1     0     1     0     1     0
     1     1     1     1     1     1     1     1

U =
     1     1     1     1     1     1     1     1
     0    -2     0    -2     0    -2     0    -2
     0     0    -2    -2     0     0    -2    -2
     0     0     0     4     0     0     0     4
     0     0     0     0    -2    -2    -2    -2
     0     0     0     0     0     4     0     4
     0     0     0     0     0     0     4     4
     0     0     0     0     0     0     0    -8
```

This, then, is one matrix at $n = 8$ where

$$ \rho_n = n $$
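This is easy to verify numerically. The sketch below (NumPy, for illustration; MATLAB's `hadamard(8)` is the same Sylvester construction) runs Gaussian elimination with no pivoting and measures the growth factor:

```python
import numpy as np

def sylvester_hadamard(n):
    """Hadamard matrix of order n (n a power of 2) by Sylvester doubling."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def growth_no_pivot(A):
    """Growth factor of Gaussian elimination without pivoting."""
    A = A.astype(float).copy()
    denom = np.abs(A).max()           # largest element of the original matrix
    numer = denom
    for k in range(A.shape[0] - 1):
        # eliminate below the (k,k) pivot and track the largest element seen
        A[k+1:, k+1:] -= np.outer(A[k+1:, k] / A[k, k], A[k, k+1:])
        A[k+1:, k] = 0.0
        numer = max(numer, np.abs(A).max())
    return numer / denom

print(growth_no_pivot(sylvester_hadamard(8)))   # 8.0
```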

For almost 30 years everybody believed that Hadamard matrices were the extreme cases and that the growth factor $\rho_n$ for complete pivoting with real matrices was equal to $n$.

At various times in the 1960s and '70s I personally did numerical experiments looking for matrices with large pivot growth. The computers I had at the time limited the order of my matrices to a few dozen. I never found any cases of $\rho_n > n$.

One of the people who worked on the problem was Leonard Tornheim at Chevron Research in California. He also did a lot of numerical experiments. He found a complex 3-by-3 matrix with $\rho_3 > 3$. He also proved that for real matrices, $\rho_3 = 2\frac{1}{4}$.

In 1968, Colin Cryer made an official conjecture that $\rho_n$ was equal to $n$ for real matrices.

A 1988 paper by Jane Day and Brian Peterson provides a good survey of what was known up to that time about element growth with complete pivoting.

Then, in 1991, Nick Gould, from Oxford, surprised us all by announcing the discovery of real matrices for which the complete pivoting $\rho_n$ is greater than $n$. For an $n$ -by- $n$ matrix, with $n$ between 13 and 16, Nick set up a large, sparse, nonlinear optimization problem involving roughly $n^3/3$ variables, and tackled it with the LANCELOT package that he and his colleagues Andy Conn and Philippe Toint had developed. Most of his paper is about a 13-by-13 matrix with

$$ \rho_{13} = 13.0205 $$

This just barely exceeds the $\rho_n = n$ conjecture. But he also found some slightly larger matrices that do a better job.

$$ \rho_{14} = 14.5949 $$

$$ \rho_{15} = 16.1078 $$

$$ \rho_{16} = 18.0596 $$

The 16-by-16 is particularly dramatic because it is not a Hadamard matrix.

So, as far as I know, this is where the matter still stands today. Nobody is interested in running larger experiments. I have no idea what $\rho_{100}$ or $\rho_{1000}$ might be. Of course, we don't really need to know because we don't use complete pivoting in practice.

This reminds me of a story that I like to tell. I'm not sure it's actually true.

Peter Businger was a grad student at Stanford in Math and Computer Science at the same time I was in the early '60s. He worked with Gene Golub to write the first published Algol procedure for the QR matrix factorization using Householder reflections. Peter left Stanford to join Joe Traub's group at Bell Labs that was producing what they called the Port library of mathematical software. They were importing programs from everybody else, running the routines on Bell Labs' machines, thoroughly testing the software, deciding which codes were best, and producing a consolidated library.

Peter was in charge of matrix computation. He checked out all the imported linear equation solvers, including mine. He decided none of them was satisfactory, because of this pivoting growth question. So he wrote a program of his own that did partial pivoting and monitored the growth. If the growth was too large, the program switched to complete pivoting.

At the time, Bell Labs was trying to patent software, so Peter named his new subroutine "P^8". This stood for "Peter's Pseudo Perfect Partial Pivot Picker, Patent Pending". In my opinion, that is one of the great all-time subroutine names.

The experience inspired Peter to go to law school at night and change jobs. He switched to the Bell Labs patent department.

I've lost contact with Peter, so I can't ask him if he really did name the routine P^8. If he didn't, he should have.

Les Foster, whom I mentioned in the last post for finding examples of exponential growth with partial pivoting, promotes what he calls "Rook Pivoting". Search down the column for the largest element, as in partial pivoting, but then also search along that row for the largest element. This is intermediate between partial and complete pivoting. Les is able to prove some results about pivot growth. But the algorithm does not generalize well to a block form.
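For the curious, the rook search itself is only a few lines. This NumPy illustration (my own sketch, not Les Foster's LURP code) bounces between column and row searches until the candidate entry is the largest in magnitude in both its row and its column:

```python
import numpy as np

def rook_pivot(A):
    """Return (i, j) of a rook pivot: an entry largest in magnitude
    in both its row and its column."""
    A = np.abs(np.asarray(A, dtype=float))
    j = 0
    i = int(np.argmax(A[:, j]))          # partial-pivoting start: best in column 0
    while True:
        j_best = int(np.argmax(A[i, :])) # best entry in that row
        if A[i, j_best] <= A[i, j]:
            return i, j                  # already maximal in its row
        j = j_best
        i_best = int(np.argmax(A[:, j])) # best entry in the new column
        if A[i_best, j] <= A[i, j]:
            return i, j                  # already maximal in its column
        i = i_best

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
print(rook_pivot(A))   # (2, 2) -- the entry 10
```

The search terminates because the candidate's magnitude strictly increases on every move.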

I was present at a milestone in the history of Hadamard matrices. The orthogonality conditions readily imply that the order $n$ of a Hadamard matrix must be 1, 2, or a multiple of 4. But the question remains, does a Hadamard matrix of order $n = 4k$ exist for every $k$? This is an open question in number theory.

It is fairly easy to create Hadamard matrices of some sizes. A nifty way to generate a Hadamard matrix whose order is a power of two is the recursion

```
H = 1;
for n = 2, 4, 8, ...
    H = [H  H
         H -H];
end
```

If `A` and `B` are Hadamard matrices, then so is

kron(A,B)
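Both constructions are easy to check numerically. A NumPy sketch (illustrative; the statements above are MATLAB):

```python
import numpy as np

def is_hadamard(H):
    """+1/-1 entries and mutually orthogonal rows and columns."""
    n = H.shape[0]
    return bool(np.all(np.abs(H) == 1)
                and np.array_equal(H.T @ H, n * np.eye(n)))

# Sylvester doubling: H -> [H H; H -H]
H = np.array([[1]])
while H.shape[0] < 8:
    H = np.block([[H, H], [H, -H]])
print(is_hadamard(H))                  # order 8: True

# Kronecker products of Hadamard matrices are Hadamard
H2 = np.array([[1, 1], [1, -1]])
print(is_hadamard(np.kron(H, H2)))     # order 16: True
```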

By 1961, with these and other similar constructions, it was known how to construct Hadamard matrices for all orders $n \le 100$ that are multiples of four except $n = 92$. In 1961 I had a summer job at JPL, Caltech's Jet Propulsion Lab. My office was in a temporary trailer and my trailer mate was Len Baumert. Len proudly showed me a color graphic that he had just made and that he was proposing for the cover of *Scientific American*. It was a 92-by-92 matrix made up from 23-by-23 blocks of alternating light and dark cells representing +1 and -1. The graphic didn't make the cover of *Scientific American*, but I kept my copy for a long time.

Len had done a computation on JPL's machines that filled in the last value of $n$ less than 100. His paper with his colleagues Sol Golomb, a professor at USC, and Marshall Hall, Jr., a professor at Caltech, was published in the prestigious Bulletin of the AMS. It turns out that I had taken a course on difference sets, the mathematics involved in generating the matrix, from Hall at Caltech.

Here is a MATLAB picture of `Baumert92`. You can get the function from this link.

```
H = Baumert92;
pcolor92(H);
```

Let's check that we get the expected pivot growth.

```
[L,U] = lu(H);
unn = U(end,end)
```

unn = -92

This reference to the book by Trefethen and Bau should have been included in my previous post about partial pivoting. Lecture 22 has an explanation of the stability of partial pivoting in practice.

Lloyd N. Trefethen and David Bau, III, *Numerical Linear Algebra*, <http://bookstore.siam.org/ot50>, SIAM, 1997, 362pp.

Peter Businger, "Monitoring the numerical stability of Gaussian elimination", <http://link.springer.com/article/10.1007/BF02165006>, Numerische Mathematik 16, 4, 1971, 360-361.

Jane Day and Brian Peterson, "Growth in Gaussian elimination", <http://www.jstor.org/stable/2322755>, American Mathematical Monthly, 95, 6, 1988, 489-513.

Nick Gould, "On growth in Gaussian elimination with complete pivoting", <http://www.numerical.rl.ac.uk/people/nimg/pubs/Goul91_simax.pdf> SIAM Journal on Matrix Analysis and Applications 12, 1991, 351-364.

Leslie Foster, "The growth factor and efficiency of Gaussian elimination with rook pivoting", <http://www.math.sjsu.edu/~foster/gerpwithcor.pdf>, Journal of Computational and Applied Mathematics, 86, 1997, 177-194.

Leslie Foster, "LURP: Gaussian elimination with rook pivoting", matlabcentral/fileexchange/37721-rook-pivoting, MATLAB Central File Exchange, 2012.

Leonard Baumert, S. W. Golomb, and Marshall Hall, Jr, "Discovery of an Hadamard Matrix of Order 92", <http://dx.doi.org/10.1090%2FS0002-9904-1962-10761-7>, Bulletin of the American Mathematical Society 68 (1962), 237-238.


If you want the distinguished honor of a Will Pick of the Week, here are some easy guidelines to follow:

- Choose an interesting or fun topic.
- Package the submission so that it's easy to figure out how to use.
- Ensure that it actually works for standard user inputs.

I'll probably want to get a second opinion, but PianoTuner tells me that my keys are all falling flat. As to how it figures this out, the algorithm is about 80 lines of code embedded into the UI callback functions. It looks like there's a fast Fourier transform and a polynomial curve fit in there.

Incidentally, you can run the tool against non-piano instruments as well. I played a random song on my computer speakers and found that musicians with professional instruments do much better.

Let us know what you think here or leave a comment for John.


*I'd like to welcome guest blogger and ace MATLAB training content developer Matt Tearle for today's post. Thanks, Matt!*

What do you do if you don't have an image processing background and your boss asks you to produce a music video for a MATLAB-vs-Simulink rap battle? It's probably not a question you've given much thought to. I admit that I hadn't either. But that's the awkward situation I found myself in a little while back.

A MATLAB-vs-Simulink rap battle -- obviously I **had** to be a part of this project! There was just one significant problem: I'm not a video producer. Yes, I have a nice digital camera. I have some very basic editing software. And I certainly have some... "creative" ideas. But I don't have any software designed to do serious video editing.

In particular, as soon as the concept was explained to me, I decided that we needed to do a "green screen" video:

- We'd record our rappers busting out their rhymes in front of a green backdrop
- I'd make an entertaining background video
- Then we'd put the two together, using the background video in place of the green screen:

[This background may not seem particularly "entertaining", but if I showed the real one we used, lawyers might get involved... And nobody wants that.]

With no purpose-made software to achieve this, and only a few days before it was due (thanks, boss!), I turned to my default standby: MATLAB!

A video is just a sequence of images, so all I needed to do was to write MATLAB code to "green-screen" two images, then read in two videos and apply my algorithm frame-by-frame. A quick hunt around in the documentation, and I had the framework code ready to go:

```
% Open the input video files
v1 = VideoReader('background.mp4');
v2 = VideoReader('foreground.mp4');

% Determine the number of frames for the final video
nFrames = min(v1.NumberOfFrames,v2.NumberOfFrames);

% Open a video file to write the result to
vOut = VideoWriter('mixedvideo.mp4','MPEG-4');
vOut.FrameRate = 24;
open(vOut);

% Loop over all the frames
for k = 1:nFrames
    % Get the kth frame of both inputs
    x = read(v1,k);
    y = read(v2,k);
    % Mix them together
    z = magicHappensHere(x,y);  % TODO: Write this...
    % Write the result to file
    writeVideo(vOut,z)
end

% And we're done!
close(vOut);
```

Now all I needed to do was write the magic function that would do the greenscreening...

Actually, I soon discovered that I needed to do a bit more preprocessing first. I had created the background video on my computer, but filmed the foreground on my digital camera. Of course the dimensions didn't match... No worries! That's what Image Processing Toolbox is for. I added a line of code to define the desired dimensions of the output video. Then `imresize` took care of the rest:

```
% Loop over all the frames
for k = 1:nFrames
    % Get the kth frame of both inputs
    x = imresize(read(v1,k),outDims);
    y = imresize(read(v2,k),outDims);
    % Mix them together
    z = magicHappensHere(x,y);
    % Write the result to file
    writeVideo(vOut,z)
end
```

Wow, this image processing stuff is **easy**! What's next, then?

The first thing to try, of course, when developing an algorithm is to see if one already exists. Sadly, searching the Image Processing Toolbox documentation didn't turn up a `greenscreen` function or anything I could see that would do the job. So it looks like I have to do this from scratch.

If I can identify the green pixels, then I can just replace those pixels in the foreground image with the corresponding pixels in the background image. So all of this really just boils down to: how do I find the green pixels? I know that a true-color image in MATLAB is an m-by-n-by-3 array, where the three m-by-n "planes" represent red, green, and blue intensity, respectively:

"Green" would, ideally, be [0, 255, 0] (in `uint8`). So a simple way to find the green pixels would be to do some kind of thresholding:

isgreen = (y(:,:,1) < 175) & (y(:,:,2) > 200) & (y(:,:,3) < 175);

This didn't work too badly, but neither was it great. One of the problems is the use of magic threshold values -- three of them, no less! Playing around with three magic parameters to try to find the right combination doesn't sound like fun to me. And besides, using absolute values like this might not be the best way to judge "greenness". If you look at the example image above, you can see that there's a fair amount of variation in the green background. (This is mainly because we were using an unevenly lit conference room that happened to have a green wall -- this was clearly a very professional production!)

At this point, I had to ask myself "what does it really mean to be 'green'?". As a human, I can easily distinguish between the various shades of green on the wall, despite the difference in brightness and shading. At first, this led me to thinking about color spaces and how a color is a point in some 3-D space (RGB, CMY, HSV, etc.). Perhaps I could define "green" as "sufficiently close to [0 1 0] in RGB space". And so followed a few experiments with rounding, distance calculations, even playing with `rgb2ind`. But to no great success:

```
yd = round(double(y)/255);
isgreen = (yd(:,:,1)==0) & (yd(:,:,2)==1) & (yd(:,:,3)==0);
```

As this image shows, the problem with our makeshift green screen was that it wasn't particularly green -- more a yellow-green, with significant variations in brightness.

Just as I was starting to get worried that this wasn't going to work at all, I realized that one obvious characteristic of the green pixels is that the green value is significantly **greater** than the blue and red values. This is, of course, embedded in my simple thresholding code above, but I can do away with at least some of the magic constants by looking at **relative** values. So how about calculating a "greenness" value from the three color channels:

```
yd = double(y)/255;
% Greenness = G*(G-R)*(G-B)
greenness = yd(:,:,2).*(yd(:,:,2)-yd(:,:,1)).*(yd(:,:,2)-yd(:,:,3));
```

Now **that** looks very promising. If I threshold that, I should get what I'm looking for. But rather than use an absolute threshold value maybe I can use something based on the distribution of the greenness values. In particular, I can exploit one nice feature of the images: they have a lot of green pixels. The histogram of greenness values looks like this:

I tried a simple thresholding based on the average greenness value, ignoring the negative values:

```
thresh = 0.3*mean(greenness(greenness>0));
isgreen = greenness > thresh;
```

Success! OK, so there's one magic constant left in there, but it was pretty quick and easy to tune, and seemed to be quite robust to minor changes. I'm calling that a good result. Now that I know where the green pixels are, I just have to replace them with the corresponding pixels from the background.

```
% Start with the foreground (essentially preallocation)
z = y;

% Find the green pixels
yd = double(y)/255;
% Greenness = G*(G-R)*(G-B)
greenness = yd(:,:,2).*(yd(:,:,2)-yd(:,:,1)).*(yd(:,:,2)-yd(:,:,3));
thresh = 0.3*mean(greenness(greenness>0));
isgreen = greenness > thresh;

% Blend the images
% Loop over the 3 color planes (RGB)
for j = 1:3
    rgb1 = x(:,:,j);  % Extract the jth plane of the background
    rgb2 = y(:,:,j);  % Extract the jth plane of the foreground
    % Replace the green pixels of the foreground with the background
    rgb2(isgreen) = rgb1(isgreen);
    % Put the combined image into the output
    z(:,:,j) = rgb2;
end
```

I love using logical indexing -- it's probably my favorite construct in the MATLAB language. Here I'm using it to replace the green pixels of the foreground (`rgb2(isgreen)`) with the corresponding background pixels (`rgb1(isgreen)`).
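Condensed, the whole detection-and-replacement pipeline is just a few vectorized operations. Here is a miniature NumPy rendition on a synthetic 2-by-2 "image" (an illustration with made-up pixel values, not Matt's actual code):

```python
import numpy as np

# Foreground: two green-screen pixels (left column) and two subject pixels.
fg = np.array([[[0.1, 0.9, 0.1], [0.8, 0.2, 0.2]],
               [[0.2, 0.8, 0.1], [0.9, 0.9, 0.9]]])
bg = np.full_like(fg, 0.5)           # flat gray background

# Greenness = G*(G-R)*(G-B)
g = fg[:, :, 1]
greenness = g * (g - fg[:, :, 0]) * (g - fg[:, :, 2])

# Threshold relative to the mean positive greenness
thresh = 0.3 * greenness[greenness > 0].mean()
isgreen = greenness > thresh

# Logical indexing: replace green pixels with the background
out = fg.copy()
out[isgreen] = bg[isgreen]
print(isgreen)
```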

At this point I had a video that would have sufficed, but there was a noticeable green outline around my rappers. (It's not necessarily apparent in a single still frame.) It was only Sunday morning by this point, which meant that I still had a few hours left to tinker. Could I find a way to "shave" a pixel or two from the edge of the black silhouette in the above image of `isgreen`? To do that, I'd first need to find that edge. I don't know much about image processing, but I do know that "edge detection" is a thing that image people do, so it's time for me to search the doc... Sure enough, there's an `edge` function:

```
outline = edge(isgreen,'roberts');
```

Easy! Now if I can thicken that edge and combine it with the original `isgreen`, I'll have the slightly larger greenscreen mask that I need. Two more Image Processing Toolbox functions help me thicken the edge:

```
se = strel('disk',1);
outline = imdilate(outline,se);
```

The edge variable `outline` is a logical array, so I can easily combine it with `isgreen`, to get my final result:

isgreen = isgreen | outline;
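If you don't have Image Processing Toolbox at hand, the same mask-growing effect can be approximated without `edge`, `strel`, or `imdilate`: dilating a logical mask by one pixel is just OR-ing the mask with shifted copies of itself. A NumPy sketch of that alternative (my illustration, not what the post used):

```python
import numpy as np

def grow_mask(mask):
    """Expand a boolean mask by one pixel up, down, left, and right."""
    out = mask.copy()
    out[1:, :]  |= mask[:-1, :]   # shifted down
    out[:-1, :] |= mask[1:, :]    # shifted up
    out[:, 1:]  |= mask[:, :-1]   # shifted right
    out[:, :-1] |= mask[:, 1:]    # shifted left
    return out

isgreen = np.array([[False, False, False],
                    [False, True,  False],
                    [False, False, False]])
print(grow_mask(isgreen).astype(int))   # a plus-shaped mask
```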

Put it all together and I have a pretty simple script:

```
% Open the input video files
v1 = VideoReader('background.mp4');
v2 = VideoReader('foreground.mp4');

% Determine the number of frames for the final video
nFrames = min(v1.NumberOfFrames,v2.NumberOfFrames);

% Set the output dimensions
outDims = [400 640];

% Open a video file to write the result to
vOut = VideoWriter('mixedvideo.mp4','MPEG-4');
vOut.FrameRate = 24;
open(vOut);

% Loop over all the frames
for k = 1:nFrames
    % Get the kth frame of both inputs
    x = imresize(read(v1,k),outDims);
    y = imresize(read(v2,k),outDims);

    % Mix them together
    z = y;  % Preallocate space for the result

    % Find the green pixels in the foreground (y)
    yd = double(y)/255;
    % Greenness = G*(G-R)*(G-B)
    greenness = yd(:,:,2).*(yd(:,:,2)-yd(:,:,1)).*(yd(:,:,2)-yd(:,:,3));

    % Threshold the greenness value
    thresh = 0.3*mean(greenness(greenness>0));
    isgreen = greenness > thresh;

    % Thicken the outline to expand the greenscreen mask a little
    outline = edge(isgreen,'roberts');
    se = strel('disk',1);
    outline = imdilate(outline,se);
    isgreen = isgreen | outline;

    % Blend the images
    % Loop over the 3 color planes (RGB)
    for j = 1:3
        rgb1 = x(:,:,j);  % Extract the jth plane of the background
        rgb2 = y(:,:,j);  % Extract the jth plane of the foreground
        % Replace the green pixels of the foreground with the background
        rgb2(isgreen) = rgb1(isgreen);
        % Put the combined image into the output
        z(:,:,j) = rgb2;
    end

    % Write the result to file
    writeVideo(vOut,z)
end

% And we're done!
close(vOut);
```

This is not the best greenscreen code ever written (although it **is** the best greenscreen code **I've** ever written). But the real point is that it shouldn't even exist -- I had no prior image processing knowledge, I was working with videos of different dimensions, the greenscreen was unevenly lit (and not even particularly green), and I had a weekend to make it happen. The fact that I was able to do this is one of the main reasons I love MATLAB (and, of course, the associated toolboxes): I was able to spend my time on the heart of **my algorithm**, not all the coding details. Different-sized images? Fixed with one function call (`imresize`). Need to find edges of a region? One function call (`edge`). Want to thicken that edge? Two function calls (`strel` and `imdilate`). The final script is less than 50 lines.

My time was spent wrestling with the core of the problem: how I was going to figure out what "green" looked like in my images. Because I could spend my time there, I was able to tinker with different ideas. This, to me, is what MATLAB is all about: rapid prototyping -- ideas becoming working code.

Having made the video and feeling pretty good about what I managed to pull together in a few hours of MATLAB, I later realized that I got lucky. The way I calculated greenness (`G*(G-R)*(G-B)`) was intended to give negative results for anything that didn't have more green than red or blue. But I simply overlooked the possibility of getting a high positive value if **both** `G-R` and `G-B` were negative. According to my formula, a light magenta -- e.g. `[1 0.6 1]` -- would be very green! Luckily for me, my source videos didn't include anything like that. A more robust approach would have been something like this:

```
yd = double(y)/255;
% Greenness = G*(G-R)*(G-B)
greenness = yd(:,:,2).*max(yd(:,:,2)-yd(:,:,1),0).*max(yd(:,:,2)-yd(:,:,3),0);
```

Now any pixel with more red or more blue than green will have a greenness of 0. Everything else works the same.
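The fix is easy to see on the light-magenta example; a quick Python check (for illustration):

```python
# Light magenta: lots of red and blue, some green -- clearly not "green"
r, g, b = 1.0, 0.6, 1.0

naive  = g * (g - r) * (g - b)               # two negative factors cancel
robust = g * max(g - r, 0) * max(g - b, 0)   # clamp each difference at 0

print(naive)    # about 0.096 -- wrongly positive, i.e. "green"
print(robust)   # 0.0 -- correctly not green
```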

I think there's a nice moral in this postscript, as well. I love using MATLAB to solve a problem -- a real, immediate problem that I need to solve right now. I don't necessarily need the best solution, or the solution to the most general problem. The greenscreen algorithm I hacked together wasn't the best, but it worked for my application. If my image had light magenta in it, I would have discovered this bug and fixed it, but that didn't happen and didn't matter.

If I had approached this assignment like a professional software developer, I'd still be working on it. But as a MATLAB hacker, I was done in a weekend!


**A Look into the Future?**

We’ve been publicly developing a Simulink model of the Hyperloop for some time now. We recently looked into using MATLAB interfaces to visualize that journey in Google Earth. To create a virtual cockpit, we constructed a MATLAB figure combining Google Earth windows and several other native graphics capabilities. We think the resulting video is pretty cool.

To understand how we arrived at this vision of the future, here’s a look back at our past Hyperloop posts.

**Getting Started**

We started out with a conversational post about Elon Musk’s concept, outlining some of the interesting technical points. This post was criticized for not bringing any value to the discussion. I guess we were simulating trolling.

To get things moving, we introduced different approaches to architecting a simulation model for such a system. Guy suggested splitting the model into a plant (physical elements) and a controller (functional software) as in conventional model-based design. I proposed a more distributed architecture with several functional components, each containing its own plant and controller (image below). Guy’s proposal included subsystem variants while mine leveraged model reference.

**Initial dynamic analysis**

Last November, we published a two-dimensional analysis of the Hyperloop’s proposed route. We used the original proposal’s design constraints for curve radius to see how the track could be laid out over the California terrain.

The results highlighted the difficulties in maintaining tolerable g-levels for passengers while traveling at such high speeds along existing infrastructure. However, the analysis did not include two important factors: elevation and banking. Elevation promised to make route planning more difficult, while the effects of banking the vehicle in the tube would help reduce the g-forces and potentially improve the route.

*Anticipated hyperloop route (yellow) to avoid excessive g-forces (Image created using Google Earth)*

**Promoting Collaboration**

For Valentine’s Day, we used Simulink Projects to package a Simulink model on the MATLAB File Exchange that contained elements of each of our architectural approaches. My partitioning remained but now utilized Guy’s idea of using subsystem variants.

In April, R2014a was released with Simulink Projects supporting Git as a version control system. So, we quickly jumped on the bandwagon and placed an updated Hyperloop model on GitHub.

**Improved dynamics**

In May, we investigated the elevation profile of the route. We used optimization techniques to find the best combination of pillars and tunnels to meet cost and comfort demands. There were some great improvement suggestions in the comments of this post that we’ve yet to implement.

Last month, we enhanced the model to include the effects of banking within the tube. We blogged about a simple approach in SimMechanics to model those dynamics.

**Roll your own!**

In keeping with our collaborative mindset, the necessary models and m-scripts to generate this video are now available on the GitHub submission.

The project has even been set up to support new routes. Load your own KML file and see how the Hyperloop looks in your neighborhood.

Have you ever needed to move a figure? Perhaps because it is offscreen? Or at least the menus or close box are not visible? This happens to me from time to time when I get code from someone else who is programming specifically to the size of their display. Working on a laptop, I often have fewer available pixels. What to do?

I used the Function Browser to search for "move figure" and found `movegui`. It might be just the ticket for you.

As you can see from the documentation, `movegui` gives you a fair amount of latitude for moving your figure around. The two choices I find most useful and easiest to remember for my purposes are `onscreen` and `center`. I think you can guess what they each do :-)

Do you use the Function Browser to help you? Do you ever need to reposition "lost" figure windows, like I have? Let me know here.


Sean's pick this week is ThingSpeak Support for MATLAB by MathWorks' Internet of Things Team.

Up until recently, I'd heard the phrase "the internet of things" thrown around only a few times and didn't really know what it meant. So I spoke with our Marketing Manager for the Maker Movement, Paul Kassebaum, to get a better understanding. Paul describes it briefly as "The confluence of technologies in communication (the internet as we know it), the energy internet (or energy from sensors, etc.) and the logistics internet (the ability to use this information for logistics)."

This week's pick uses all three. It allows you to pull data directly into MATLAB from ThingSpeak, a platform and cloud service where data can be uploaded from sensors and low cost hardware. After pulling this data, you can use it for whatever research, design or logistics you want.

I saw this blog post on our Maker Zone blog that described measuring soda consumption here at MathWorks. They posted the file on the FEX as well.

This posed a challenge for me: Could I gather a few people together, drink as much Powerade as possible at 4pm Eastern Time on August 5th, and create an outlier data point?

We had a few detractors, like Loren, who did not properly appreciate the health benefits of this type of social gathering.

To passersby, it probably looked like we were part of some obscure MATLAB cult. Let's see how we did:

```
% Grab the data for July 6th through August 6th.
[drinkData, timeStamps] = thingSpeakFetch(7040,'DateRange',{'06-July-2014','06-August-2014'});
Powerade = drinkData(:,3);  % Third channel is Powerade

% Index into the party hour
idxParty = hour(timeStamps)==16 & day(timeStamps)==5 & month(timeStamps)==8;
```

```
scatter(timeStamps(idxParty),Powerade(idxParty),400, ...
    'MarkerFaceColor',[0.03 0.6 0.8],'MarkerEdgeColor',[0.03 0.6 0.8]);
hold on
scatter(timeStamps,Powerade,15, ...
    'MarkerFaceColor',[0.5 0 0],'MarkerEdgeColor',[0.5 0 0]);
datetick('x','mmmdd','keepticks','keeplimits');
legend('Powerade Party','Powerade Consumption','location','northwest')
```

So it does look like we were able to create a Powerade Consumption outlier. Go Team!

Give it a try and let us know what you think here or leave a comment for our Internet of Things team.

Get the MATLAB code

Published with MATLAB® R2014a

Our latest update has two features that should be of interest to blog readers:

- Searching with Categories: Blog authors can tag their posts with Categories: keywords or phrases that organize or describe a particular post. It’s a helpful tool for finding related content. Like a particular post? Just click the Category link at the bottom to find similar posts.
- Print a Post: Several blog readers requested this feature. Now you can easily print out posts by your favorite bloggers and read them when you’re off the grid. After spending all day staring into a monitor, there’s nothing better than being outside, reading a nicely formatted post on paper. Bliss.

We’re always looking to improve the blogs. Tell us what you think about these new features or let us know what we can do to make the blogs section better for you. While you’re thinking about all this, drop us a line in the Comments section below.

Thanks!

Of course, my answer was that there are many ways to hold a value in Simulink. I made him a few examples that I am sharing today in this post.

**The Desired Behavior**

I want to create a subsystem with two inputs, `u` and `hold`, and one output `y`. When `hold` is true, the output `y` is equal to the input `u`. When `hold` is zero, the output `y` remains constant, holding the last output value.

Here are example input and output signals:

Let's see a few ways to obtain such behavior
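Before reaching for blocks, the sample-by-sample semantics can be written out in a few lines of plain Python (a sketch with names of our choosing; the initial output value is an assumption):

```python
def hold_output(u_seq, hold_seq, y0=0):
    # When hold is true, y follows u; when hold is false (zero),
    # y keeps the last output value. y0 is the assumed initial output.
    y, out = y0, []
    for u, h in zip(u_seq, hold_seq):
        if h:
            y = u
        out.append(y)
    return out

print(hold_output([1, 2, 3, 4], [1, 0, 1, 0]))  # [1, 1, 3, 3]
```

Every method below is just a different block-diagram realization of this one conditional assignment.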

**Method 1: Switch and Delay**

The most common way to hold a value that I observe in customers' models is using a Switch and a Unit Delay or Memory block.

Nice, clean and simple!

**Method 2: Enabled Subsystem**

Personally, my favorite way to hold the value of a signal is using an Enabled Subsystem, with the Outport property `Output when disabled` set to **held**.

I like this method because it takes virtually no blocks.

**Method 3: If Action Subsystems and Delay**

This version is similar to the first method, but we replace the Switch by an arrangement of If Action Subsystems and a Merge block.

Personally, I don't use this syntax very often, mostly because it requires more blocks.

**Method 4: Stateflow**

Another easy way to obtain the behavior we are looking for is using Stateflow

**What about the generated code?**

When choosing an implementation, many factors need to be taken into account. One of my favorite guidelines was suggested by a reader of this blog: **Clearly show the intended functionality**. I will let you judge for yourself which implementation most clearly shows the intention of holding a signal.

Another concern that many users have is: What does the code look like?

In this case, all the methods generate similar code. To make the code easier to read for this post, I set the storage class of the three signals of interest (`hold`, `u`, `y`) to ExportedGlobal.

Switch and Delay:

*Note: To obtain the above code, I enabled inline parameters in the model configuration. Otherwise the threshold of the Switch is a tunable parameter.*

Enabled Subsystem:

If Action Subsystem:

*Note: To obtain the above code, I manually set the Function Packaging option of the If Action Subsystems to Inline. Otherwise, since the two subsystems are identical, the coder generates a reusable function.*

Stateflow:

**Now it's your turn**

What is your preferred method to hold the value of a signal? Are there other functionalities for which you consider multiple implementations? Let us know by leaving a comment here.

In rare cases, Gaussian elimination with partial pivoting is unstable. But the situations are so unlikely that we continue to use the algorithm as the foundation for our matrix computations.

I almost hesitate to bring this up. Gaussian elimination with partial pivoting is potentially unstable. We know of a particular test matrix, and have known about it for years, where the solution to simultaneous linear equations computed by our iconic backslash operator is less accurate than we typically expect. The solution is contaminated by unacceptably large roundoff errors associated with large elements encountered during the elimination process. The offending matrix is generated by the following function, which is a special case of the `gfpp` function in Nick Higham's collection of MATLAB test matrices.

```
type gfpp
```

```
function A = gfpp(n)
% A = gfpp(n)  Pivot growth of partial pivoting.
A = eye(n,n) - tril(ones(n,n),-1);
A(:,n) = 1;
```

Here is the 8-by-8.

```
n = 8
A = gfpp(n)
```

```
n =
     8

A =
     1     0     0     0     0     0     0     1
    -1     1     0     0     0     0     0     1
    -1    -1     1     0     0     0     0     1
    -1    -1    -1     1     0     0     0     1
    -1    -1    -1    -1     1     0     0     1
    -1    -1    -1    -1    -1     1     0     1
    -1    -1    -1    -1    -1    -1     1     1
    -1    -1    -1    -1    -1    -1    -1     1
```

Gaussian elimination with partial pivoting does not actually do any pivoting with this particular matrix. The first row is added to each of the other rows to introduce zeroes in the first column. This produces twos in the last column. As similar steps are repeated to create an upper triangular `U`, elements in the last column double with each step. The element growth in the final triangular factor `U` is `2^(n-1)`.

```
[L,U] = lu(A);
U
```

```
U =
     1     0     0     0     0     0     0     1
     0     1     0     0     0     0     0     2
     0     0     1     0     0     0     0     4
     0     0     0     1     0     0     0     8
     0     0     0     0     1     0     0    16
     0     0     0     0     0     1     0    32
     0     0     0     0     0     0     1    64
     0     0     0     0     0     0     0   128
```

So if the matrix order `n` is set to the number of bits in the floating point word (plus one), the last diagonal element of `U` will grow to be `1/eps`.

```
n = 53
A = gfpp(n);
[L,U] = lu(A);
unn = U(n,n)
```

```
n =
    53

unn =
   4.5036e+15
```

If we try to use backslash to solve a linear system involving this matrix and a random right hand side, the back substitution will begin with `U(n,n)`. The roundoff errors can be expected to be as large as `eps(U(n,n))` and swamp the solution itself.

```
b = randn(n,1);
x = A\b;
```

Ordinarily, we expect the solution computed by Gaussian elimination to have a residual on the order of roundoff error. But in this case, the residual is many orders of magnitude larger. Backslash has failed.

```
r = b - A*x;
relative_residual = norm(r)/(norm(A)*norm(x))
```

```
relative_residual =
   3.7579e-05
```

Anything we do to this matrix to provoke pivoting during the elimination will prevent the growth of the elements in `U`. One possibility is to swap the first two rows.

```
A = gfpp(n);
A([1 2],:) = A([2 1],:);
[L,U] = lu(A);
unn = U(n,n)
```

```
unn =
     2
```

That's much better. Now use backslash and look at the residual.

```
x = A\b;
r = b - A*x;
relative_residual = norm(r)/(norm(A)*norm(x))
```

```
relative_residual =
   1.7761e-17
```

So the residual is now on the order of roundoff relative to `A` and `x`. Backslash works as expected.

Another way to provoke pivoting is to perturb the matrix with a little white noise.

```
A = gfpp(n);
A = A + randn(n,n);
[L,U] = lu(A);
unn = U(n,n)
```

```
unn =
   -3.8699
```

So `U` is well behaved. Let's see about backslash.

```
x = A\b;
r = b - A*x;
relative_residual = norm(r)/(norm(A)*norm(x))
```

```
relative_residual =
   1.2870e-16
```

Again the residual is on the order of roundoff and backslash works as expected.

Suppose we are using Gaussian elimination with any pivoting process to solve a system of linear equations involving a matrix $A$. Let $a_{i,j}^{(k)}$ denote the elements in the matrix after the $k$ th step of the elimination. The *growth factor* for that pivoting process is the quantity

$$ \rho = { \max_{i,j,k} |a_{i,j}^{(k)}| \over \max_{i,j} |a_{i,j}| } $$

This growth factor is a crucial quantity in Wilkinson's roundoff error analysis. If $\rho$ is not too large, then the elimination algorithm is stable. Unfortunately, the matrix we have just seen, `gfpp`, shows that the growth factor for partial pivoting is at least $2^{n-1}$.
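The $2^{n-1}$ claim is easy to check numerically. Here is a NumPy sketch (our own translation of `gfpp` plus a plain elimination loop, which is valid here because partial pivoting performs no row swaps on this matrix):

```python
import numpy as np

def gfpp(n):
    # Higham's pivot-growth matrix: 1 on the diagonal, -1 strictly
    # below it, and a last column of ones.
    A = np.eye(n) - np.tril(np.ones((n, n)), -1)
    A[:, -1] = 1.0
    return A

def growth_factor(A):
    # rho = max over elimination steps of |a_ij^(k)| / max |a_ij|.
    A = A.copy()
    n = A.shape[0]
    biggest = base = np.abs(A).max()
    for k in range(n - 1):
        m = A[k+1:, k] / A[k, k]              # multipliers
        A[k+1:, k:] -= np.outer(m, A[k, k:])  # eliminate column k
        biggest = max(biggest, np.abs(A).max())
    return biggest / base

print(growth_factor(gfpp(8)))   # 128.0, i.e. 2^(8-1)
```

For `n = 8` this returns 128, matching the last diagonal element of `U` shown earlier.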

At the time he was presenting his analysis in the early 1970s, Wilkinson knew about this matrix. But he said it was a very special case. He reported:

It is our experience that any substantial increase in size of elements of successive $A_n$ is extremely uncommon even with partial pivoting. ... No example which has arisen naturally has in my experience given an increase as large as 16.

During the development of LINPACK in the late 1970s, a colleague at Argonne, J. T. Goodman, and I did some experiments with random matrices to check for element growth with partial pivoting. We used several different element distributions and what were then matrices of modest order, namely values of $n$ between 10 and 50. The largest growth factor we found was for a matrix of 0's and 1's of order 40. The value was $\rho$ = 23. We never saw growth as large as linear, $\rho = n$, let alone exponential, $\rho = 2^{n-1}$.

Nick Trefethen and Rob Schreiber published an important paper, "Average-case Stability of Gaussian Elimination", in 1990. They present both theoretical and probabilistic models, and the results of extensive experiments. They show that for many different types of matrices, the typical growth factor for partial pivoting does not increase exponentially with matrix order $n$. It does not even grow linearly. It actually grows like $n^{2/3}$. Their final section, "Conclusions", begins:

Is Gaussian elimination with partial pivoting stable on average? Everything we know on the subject indicates that the answer is emphatically yes, and that one needs no hypotheses beyond statistical properties to account for the success of this algorithm during nearly half a century of digital computation.

The Higham Brothers, Nick and Des Higham, published a paper in 1989 about growth factors in Gaussian elimination. Most of their paper applies to any kind of pivoting and involves growth that is $O(n)$. But they do have one theorem that applies only to partial pivoting and that characterizes all matrices that have $\rho = 2^{n-1}$. Essentially these matrices have to have the behavior we have seen in `gfpp`, namely no actual pivoting and each step adds a row to all the later rows, doubling the values in the last column in the process.

In two different papers, Stephen Wright in 1993 and Les Foster in 1994, the authors present classes of matrices encountered in actual computational practice for which Gaussian elimination with partial pivoting blows up with exponential growth in the elements in the last few columns of `U`. Wright's application is a two-point boundary value problem for an ordinary differential equation. The discrete approximation produces a band matrix. Foster's application is a Volterra integral equation model for population dynamics. The discrete approximation produces a lower trapezoidal matrix. In both cases the matrices are augmented by several final columns. Certain choices of parameters cause the elimination to proceed with no pivoting and result in exponential growth in the final columns. The behavior is similar to `gfpp` and the worst-case growth in the Highams' theorem.

Complete pivoting requires $O(n^3)$ comparisons instead of the $O(n^2)$ required by partial pivoting. More importantly, the repeated access to the entire matrix completely changes the memory access patterns. With modern computer storage hierarchy, including cache and virtual memory, this is an all-important consideration.

The growth factor $\rho$ with complete pivoting is interesting and I plan to investigate it in my next blog.

`lugui` is one of the demonstration programs included with Numerical Computing with MATLAB. I will also describe it in my next blog. For now, I will just use it to generate a graphic for this blog. Here is the result of a factorization, the lower triangular factor `L` in green and the upper triangular factor `U` in blue.

```
A = gfpp(6);
lugui(A,'partial')
```

Nicholas J. Higham and Desmond J. Higham, "Large Growth Factors in Gaussian Elimination with Pivoting", <http://www.maths.manchester.ac.uk/~higham/papers/hihi89.pdf>, SIAM J. Matrix Anal. Appl. 10, 155-164, 1989.

Lloyd N. Trefethen and Robert S. Schreiber, "Average-case Stability of Gaussian Elimination", <https://courses.engr.illinois.edu/cs591mh/trefethen/trefethen_schreiber.pdf>, SIAM J. Matrix Anal. Appl. 11, 335-360, 1990.

Stephen J. Wright, "A Collection of Problems for Which Gaussian Elimination with Partial Pivoting is Unstable", <ftp://info.mcs.anl.gov/pub/tech_reports/reports/P266.ps.Z>, SIAM J. Sci. Statist. Comput. 14, 231-238, 1993.

Leslie Foster, "Gaussian Elimination Can Fail In Practice", <http://www.math.sjsu.edu/~foster/geppfail.pdf>, SIAM J. Matrix Anal. Appl. 15, 1354-1362, 1994.


Sean's pick this week is Par Tic Toc by Sarah Wait Zaranek.

I run a fair number of computationally expensive operations and often use parallel for (parfor) loops to speed them up. This utility interested me mainly because it shows how work is distributed across the workers and provides worker utilization rates.

First, I will make sure a parallel pool is open using `gcp()` or "Get Current Pool". This opens the headless MATLAB processes that a parfor loop distributes iterations to.

```
pool = gcp
```

```
pool =

 Pool with properties:

      Connected: true
     NumWorkers: 2
        Cluster: local
  AttachedFiles: {}
    IdleTimeout: 90 minute(s) (89 minutes remaining)
    SpmdEnabled: true
```

We'll do a simple example with a loop that has a random pause. The pause is a proxy for an operation that takes a varying and unpredictable amount of time.

```
n = 50;
p = Par(n);            % Build the Par timer
parfor ii = 1:n
    Par.tic;           % start timer
    pause(randi(5))    % Pause for up to five seconds
    p(ii) = Par.toc;   % measure elapsed time
end
stop(p);               % all done
plot(p);
```

The parfor loop has done a good job distributing the calculations between the two workers. The more highly utilized worker has only one more iteration than the other.

There are a few other neat tricks in here as well that I found from looking through the code.

- If you right-click on the axes showing each iteration's time, a context menu lets you select the scale.
- Since the Par class inherits from handle, you would normally see the inherited handle methods; Sarah hides those from `methods()` by marking them as hidden.
- Sarah has also provided a full set of documentation on the capabilities of this utility, as well as examples.

Give it a try and let us know what you think here or leave a comment for Sarah.


Today's blog post comes from the stream of image processing questions over at MATLAB Answers.

User arjun wanted to know how to remove circles from an image by specifying their center and radius.

Famous MATLAB Answer man Image Analyst quickly gave a helpful reply.

I thought it might be worth a little discussion here (plus, I wanted to point out the value of MATLAB Answers.)

As is typical, the process of answering the question started out with trying to understand the question better. "What does *remove* mean to you?" asked Image Analyst. The answer came back, "I want to make the pixels in the circle be black."

OK, let's give that a try.

```
I = imread('coins.png');
imshow(I)
```

(Hmm. Guess I didn't have any quarters in my pocket when I took this picture.)

Suppose we want to set to black all the pixels inside the circle centered at (X = 175, Y = 120) with radius 35.

I like Image Analyst's recommended procedure:

- Compute a *mask image*. Generally, the term *mask* or *mask image* means a binary image whose foreground pixels correspond to a set of pixels of interest.
- Use logical indexing with the mask image to modify the desired pixels.

Start by creating a mask image.

```
xc = 175;
yc = 120;
r = 33;
x = 1:size(I,2);
y = 1:size(I,1);
[xx,yy] = meshgrid(x,y);
mask = hypot(xx - xc, yy - yc) <= r;
imshow(mask)
title('Mask image')
```

Next, use logical indexing to modify the pixels of `I` corresponding to the foreground pixels of `mask`.

```
I2 = I;
I2(mask) = 0;
imshow(I2)
title('Image with some pixels set to black')
```
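The same mask-and-logical-indexing idea carries over directly to NumPy. The sketch below uses a synthetic stand-in image (its size and gray level are our assumptions, not the coins image), with the center and radius from the MATLAB code above:

```python
import numpy as np

# Synthetic 240-by-300 gray image (assumed stand-in, not coins.png).
img = np.full((240, 300), 200, dtype=np.uint8)

# Circular mask via a coordinate grid; note NumPy indexing is 0-based
# where MATLAB's is 1-based.
yy, xx = np.mgrid[0:240, 0:300]
mask = np.hypot(xx - 175, yy - 120) <= 33

img[mask] = 0   # logical indexing, same as I2(mask) = 0
```

As in MATLAB, assigning through the boolean mask touches only the pixels inside the circle.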

And we just lost 5 cents.

Image Analyst pointed out that you can use this technique to set pixel values in the circle to any value you want. How about white:

```
I2(mask) = 255;
imshow(I2)
title('Image with some pixels set to white')
```

For fun, how about setting the pixels to the background color. How do we determine that? I might start by thresholding the original image.

```
bw = im2bw(I,graythresh(I));
imshow(bw)
title('Thresholded image')
```

Now fill in the holes.

```
bw = imfill(bw,'holes');
imshow(bw)
title('Thresholded image with holes filled')
```

Compute the mean value of the background pixels (using logical indexing again!).

```
bg_mean = mean(I(~bw))
```

```
bg_mean =
   66.9959
```

Replace the pixels in the circle with the computed average background value.

```
I2(mask) = bg_mean;
imshow(I2)
title('Circle pixels filled with average background')
```

Let me point out two related Image Processing Toolbox functions. The `roifill` function can fill a region based on a mask; it fills pixels "smoothly" from the adjacent background pixels.

```
I3 = roifill(I,mask);
imshow(I3)
title('Circle pixels replaced using roifill')
```

I also want to mention `imellipse`. You can use this function to draw circles interactively using the mouse. Then you can use the `createMask` function to get the mask image corresponding to the circle you drew.

Happy circle filling!



**Why Simulink Code Inspector?**

You are designing a high-integrity software application using Simulink (good idea!). Once you have an excellent design that has been tested and meets the design requirements, you use Embedded Coder to generate C code that you will then compile, link, and load to your embedded hardware target.

You know your Simulink design does exactly what you want it to do - nothing more, nothing less. But you are not going to be running Simulink on the embedded hardware - you will be running an executable built from the automatically generated C code.

So what about that C code - does it still represent your exact design? How can you tell? You could use Processor-in-the-Loop (PIL) and re-run your requirements-based tests on the target processor to verify that the outputs match your simulation outputs. That's a good start. However, this is not necessarily going to prove the absence of unintended code. So, what can?

**So... what is Simulink Code Inspector?**

Simulink Code Inspector automatically compares generated code with its source model. It examines the generated code and the model to determine if they are structurally equivalent. After inspection is complete, it produces detailed reports with model-to-code and code-to-model traceability analysis.

Because Simulink Code Inspector has been implemented completely independently of Embedded Coder, it can be used as a means of compliance for DO-178C/DO-331 certification objectives. These structural equivalence and traceability reports can be submitted to certification authorities as evidence of code reviews for high-integrity standards such as DO-178C. Simulink Code Inspector is supported by our DO-178C/DO-331 Qualification Kit, allowing you to obtain certification credit when using it for DO-178C/DO-331 applications.

**How does it do that?**

Ok, magic and abstract syntax trees.

**Working with Simulink Code Inspector**

To highlight what Code Inspector can do for you, let's take a very simple model to be deployed to an embedded target.

Not all Simulink/Stateflow features are supported by Code Inspector. Before running an inspection, you need to check that your model is compatible with Code Inspector. A set of Model Advisor checks, included with Code Inspector, can be run to verify compatibility.

I run these checks on the model and review the Model Advisor report.

No compatibility issues, so let's go ahead with inspection.

Open the Simulink Code Inspector window from the 'Code' menu...

And get this window:

Once code generation and inspection are complete, a report with the results will open.

**Code Inspection Report**

Here's the report from the inspection - we passed!

The report is extremely detailed - let's look at some of the information it includes.

**Model-To-Code Traceability**

Model-to-code traceability - with hyperlinks for navigating to the exact model object for convenience.

**Code-To-Model Traceability**

Depending on your goal, it might be more convenient to know which block corresponds to a specific line of code. The report also includes a code-to-model traceability section.

**I guess it's your turn now...**

We've introduced some of the basic features of Simulink Code Inspector. Obviously there's quite a bit more to this tool, but I hope that this overview will inspire you to think about your processes for verifying autogenerated code for high-integrity applications.

Does your organization's development process for high-integrity embedded software include manual code reviews or manual tracing from model to code? Are you interested in seeing more posts about high-integrity software development with Simulink? Let us know what you think by leaving a comment below.

One more thing... as mentioned in a previous post, if you need help with any part of your Model-Based Design process, you can contact our Consulting Services.


Photo by Sara Kerens

Nick is what you might call a “heroic user” of MATLAB. He was there at the first class in which Cleve introduced the original FORTRAN MATLAB as a teaching tool. Near the lobby of the main MathWorks building here in Natick, we have a framed copy of the very first purchase order for MATLAB. Who bought that first copy of MATLAB? None other than Nick Trefethen. And it’s not merely a question of precedence or longevity. Across his long career, Nick has been a contributor of the highest order to the ecosystem of mathematical code written in MATLAB. He is the author of many books, an influential educator, and most recently he has been leading an innovative open source software project, a project that grew out of work he initiated back in 2002: Chebfun.

I can personally attest that the examples associated with Chebfun are among the most beautiful and thoughtful you will find documenting any mathematical project on the web. We recently conducted a virtual interview with Nick to learn about the latest release of Chebfun. Here’s what we learned.

Q: When did you first start using MATLAB?

A: I took a course from Cleve Moler when I was a graduate student at Stanford, so I must have typed x = A\b for the first time around 1978. When MathWorks started selling MATLAB, I bought the very first copy! That was in February, 1985. It was about ten years later when MathWorks told me I had been the first.

Q: What is Chebfun?

A: Chebfun is a system written in MATLAB that looks and feels like MATLAB, except that vectors and matrices become functions and operators. Whenever MATLAB has a discrete command, Chebfun aims to have the same command overloaded to execute a continuous analogue. For example, SUM(f) gives you the integral of a chebfun f over its interval of definition, MAX(f) gives you its maximum, ROOTS computes the roots, and so on.

Q: Why is it called Chebfun?

A: Well, Chebfun stands for Chebyshev-functions. Our algorithms are based on representations of functions by piecewise Chebyshev polynomial interpolants. The details are given in my book Approximation Theory and Approximation Practice.
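To make the mechanism concrete without Chebfun itself, here is a toy NumPy sketch of the same idea: sample a function at Chebyshev points, then compute integrals and roots by operating on the coefficients (the degree and the target value 2 are arbitrary choices of ours):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Degree-10 Chebyshev interpolant of exp(x) on [-1, 1].
c = C.chebinterpolate(np.exp, 10)

# "SUM": integrate the interpolant via its antiderivative coefficients.
ci = C.chebint(c)
integral = C.chebval(1.0, ci) - C.chebval(-1.0, ci)   # ~ e - 1/e

# "ROOTS": solve exp(x) = 2 by root-finding on the shifted series.
shifted = c.copy()
shifted[0] -= 2.0
roots = C.chebroots(shifted)
real_roots = roots[(abs(roots.imag) < 1e-9)
                   & (abs(roots.real) <= 1)].real     # ~ ln 2
```

Chebfun packages exactly this kind of coefficient-level manipulation behind overloaded MATLAB commands.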

Q: This Chebyshev fellow must have been an interesting character. Can you tell us a little more about how his name came to be associated with your favorite functions?

A: Pafnuty Lvovich Chebyshev was the preeminent Russian mathematician of the 19th century, born near Moscow and a professor in St. Petersburg. He was remarkable for contributing in a big way to both pure and applied mathematics. The crucial thing for Chebfun was that he laid the foundations of how one can approximate nonperiodic functions by polynomials. What Fourier is to periodic functions, Chebyshev is to nonperiodic functions.

Q: A new version of Chebfun has recently been released. What is different in this release?

A: The big thing is that the code has been completely rewritten in a more modular structure and released on GitHub; we are a fully public open-source project. There’s also a beautiful new web site at www.chebfun.org, with a user’s guide and hundreds of examples showing how to use the code. Besides that there are many speedups and a number of new features, notably the capability of taking advantage of periodicity of functions (i.e., Fourier as opposed to Chebyshev methods). And the two-dimensional side of Chebfun, for representing functions f(x,y), keeps advancing.

Q: Why have you released Chebfun on GitHub?

A: We’d like to encourage new users and new developers. Broadening the Chebfun community seems the best way to make the software develop as strongly as possible and ensure its long-term future. (We have thousands of users already, but we don’t know as much about them as we would like.)

Q: What benefit do you see in the MATLAB File Exchange / GitHub integration?

A: So far as we are aware, Chebfun is the most advanced project written in MATLAB. It’s completely natural that it should be easily available from the File Exchange. We are very pleased at this new development.

Q: How might MATLAB users apply Chebfun?

A: We think every MATLAB user is potentially a Chebfun user, for every MATLAB user works with functions. It’s so much simpler and more natural to call SUM, MAX, ROOTS and so on than to worry about the calling sequences of QUAD, FMINBND, FZERO. In 2D, the same commands find integrals and extrema and roots of functions f(x,y). Chebfun is also the easiest way to solve ODE boundary value problems. We’ve overloaded x=A\b to u=N\f, where N is a differential operator with boundary conditions and f is a right-hand side.

The big advantage in all these computations is that we’re always dealing with functions (chebfuns), they all take the same form, and the user doesn’t have to know anything about the underlying grids. Input to SUM or MAX or ROOTS — just a chebfun. Output from \ for ODEs — just another chebfun. This is very different from the traditional style of computing in MATLAB and other languages with codes like BVP4C/BVP5C where for each code you need to learn a calling sequence that may involve specification of an initial grid and the code’s special structures like “DEVAL” for representing output functions.

Q: How can MATLAB users access Chebfun?

A: Just go to the File Exchange, or google Chebfun. Be sure to check out the People page with its photo-bios of Chebfun developers including Nick Hale, Toby Driscoll, Asgeir Birkisson, Alex Townsend, Anthony Austin, Grady Wright, and Hrothgar.

Guest blogger Peter Webb returns with another in an occasional series of postings about application deployment.

I've written a MATLAB application that reports on the presence of contaminants in surface water, using a publicly available database of water quality observations maintained by the United States Geological Survey and the Environmental Protection Agency. Many groups of people might be interested in this kind of information. MATLAB made it easy to write the application, and some tools we've added recently make sharing the application easy as well.

I can share the application with other MATLAB users as a MATLAB App. With MATLAB Compiler I can create a standalone program that will run on a Windows, Linux or Mac desktop. And Builder JA lets me turn my application into a Java class for use by other developers.

To follow the rest of this article, please download the example code from MATLAB Central.

You could share your code with other MATLAB users by emailing them your function and data files. But then you'd have to explain how to install, update and uninstall your application -- and the process would be very vulnerable to human error. Or you could create a MATLAB App. Introduced in R2012b, MATLAB Apps are self-contained programs that automate a technical computing task, typically via a graphical user interface. MATLAB displays the installed apps in the App Gallery, which you access from the **Apps** tab in the MATLAB toolstrip. Packaging your code as a MATLAB App creates an automated installer (a `.mlappinstall` file) that integrates your application into the App Gallery. The MATLAB App installer includes only your code; if the App uses any toolboxes, the recipient of the App must have licensed copies of those toolboxes in order to use the App. Click on the **Package App** button on the **Apps** tab to begin packaging your program as an app.

The packaging interface consists of three columns, each representing a step in the packaging process:

- Pick main file
- Describe your app
- Package into installation file

All Apps must have a main file -- this is the MATLAB file that starts the graphical user interface or performs the task. Click on the `Add main file` link:

From the downloaded example code, select `swq.m`. The **Files included through analysis** section fills in with the required function files. Since the automated analysis cannot discover data files, you must add them manually. Select the `Add files/folders` in the **Shared resources and helper files** section and add `Contaminants.xml`, `ContaminantLimits.xml` and `swqLogo.png`.

All Apps must also have a name. Type `SurfaceWaterQuality` into the `App Name` field at the top of the **Describe your app** area. You can add your contact info and a description too, if you like, but that's optional. If you want to add an icon and a screenshot, I've provided appropriate images: `sqwIcon.png` for the icon and `swqApp.png` for the screenshot. Click on the screenshot and icon to browse for these images.

Once you're satisfied with the description, press the **Package** button to create the installer: `SurfaceWaterQuality.mlappinstall`. On any machine with MATLAB installed, double-clicking `SurfaceWaterQuality.mlappinstall` will install the `SurfaceWaterQuality` app into the MATLAB App Gallery. Click once on the app to start it:

With these inputs, the app produces a report of contaminants recorded within five miles of the MathWorks buildings in Natick, Massachusetts.

MATLAB Apps allow you to share programs with other MATLAB users. But what if your target audience doesn't have MATLAB? MATLAB Compiler packages MATLAB programs to run against the MATLAB Compiler Runtime, which does not require MATLAB. The process of creating a standalone executable with MATLAB Compiler begins on the **Apps** tab of the MATLAB toolstrip. Click the **Application Compiler** button:

Like an App, a MATLAB Compiler-generated standalone executable requires a main file. Click on the `+` next to the `Add main file` field and select `waterQualityReport.m` from the downloaded example files:

`applicationCompiler` derives the application name from the name of the main file and immediately begins determining the files required to run the application. The **Files required for your application to run** section fills with MATLAB function dependencies. You'll need to add data file dependencies manually. Click on the `+` button in that section and add `Contaminants.xml` and `ContaminantLimits.xml`:

Like a MATLAB App, a standalone application may contain contact and description information to be displayed in the installer. You may also add a splash screen, for example `waterQuality.jpg` from the downloaded example files. `applicationCompiler` automatically includes the splash screen image in the installer, so there's no need to add it manually. Once you're done entering this optional information, change the name of the installer to `WQRInstaller_web`.

Press the **Package** button to build the installer for your standalone application.

Packaging creates a platform-specific installer: `waterQualityReport\for_redistribution\WQRInstaller_web.exe` on Windows and `waterQualityReport/for_redistribution/WQRInstaller_web` on Linux and Mac. Copy this file to a machine of the appropriate type, run the installer, and you'll be able to install and run the water quality application without installing MATLAB. Run the installer like this:

`WQRInstaller_web`

Look for the executable in the `application` folder, located in the installation folder. And then run the application like this:

`waterQualityReport 42.2973025 -71.3525913 5 1/1/2004`

On Linux and Mac platforms, you may find it easier to run the application using the generated `run_waterQualityReport.sh` shell script. This command assumes you've installed the MCR in the `/tmp/WQTest/MATLAB_Compiler_Runtime/v83` folder:

```
./run_waterQualityReport.sh \
    /tmp/WQTest/MATLAB_Compiler_Runtime/v83/ \
    42.2973025 -71.3525913 5 1/1/2004
```

See the MATLAB Compiler documentation for more information about how to run a MATLAB Compiler-generated application.

Builder JA creates a Java component that you integrate into a larger host application. Before using Builder JA, you must configure your environment to ensure Builder JA uses a supported version of the Java compiler.

When you have set up your environment, type `libraryCompiler` at the MATLAB prompt and choose **Java Package** as the application type to begin:

Instead of running a main file, a Java component exports one or more functions into a public API. Use the `+` button to add `detectContamination.m` to the **Exported Functions** list. Then, in the **Files required for your application to run** section, add the data files `Contaminants.xml`, `ContaminantLimits.xml` and `swqLogo.png`.

A Java API requires a class with at least one method. Placing the class in a namespace helps prevent class name conflicts. `libraryCompiler` uses generic defaults: no namespace, a class named `Class1` and method names corresponding to the MATLAB functions you chose to export. Change the class name to `SurfaceWaterQuality` and the namespace to `WaterQuality`.

Change the name of the generated installer (as illustrated previously, in **Deploying to the Desktop**) to `DCInstaller_web`. Press the **Package** button to create `WaterQuality.jar`.

Unlike the desktop application, the Java component in `WaterQuality.jar` won't run by itself. It needs to be invoked from a host Java application. The downloaded code contains an example main program, `WaterQuality.java`.

Much like the MATLAB App and the desktop application, this Java host program collects a search area and start date from the user and invokes the web service. In addition to creating a user interface via AWT, the main program must perform three tasks to interact with the Builder JA-generated Java component:

- Import the Builder JA-generated and utility classes.
- Create a `WaterQuality.SurfaceWaterQuality` object.
- Invoke the `detectContamination` method.

The `WaterQuality` namespace contains the generated `SurfaceWaterQuality` class and the `com.mathworks.toolbox.javabuilder` namespace contains Builder JA runtime management and error handling classes such as `MWException`. The Java program imports these namespaces:

```
import WaterQuality.*;
import com.mathworks.toolbox.javabuilder.*;
```

Builder JA creates object, rather than static, methods, so you must create an instance of the generated class to invoke the methods in your exported API. The constructor might throw an `MWException`, so your Java code must either `catch` `MWException` or declare it in a `throws` clause. The unsophisticated code in the example catches (and ignores) errors that might occur during object creation:

```
try {
    swq = new SurfaceWaterQuality();
} catch (MWException me) {}
```

Pushing the **Report** button invokes the web service via the `detectContamination` method. The first input is the number of expected outputs. Methods in the exported API return an array of Java `Object` instances. The host program converts each result to the appropriate native type, a `String` in this case, before using them. Builder JA includes a robust set of type-conversion routines to make this task easy. `detectContamination` preformats the text for tabular display with a monospaced font such as `Courier`. In the Java code below, `reportData` is a Java `TextArea` object capable of scrolling to display the list of reported contaminants.

```
Object[] lhs;
lhs = swq.detectContamination(1, latitude, longitude, radius, startDate);
if (lhs != null) {
    String result = lhs[0].toString();
    reportData.setText(result);
}
```

The download contains scripts to help you build and run the sample application. See the `Readme.txt` for platform-specific instructions. Note that the application does not do much validation of its inputs, and is somewhat fragile as a result.

Managing the interaction with the web service is the most complex part of these three water quality report applications. And in each of them, that part is exactly the same. Only the target-specific wrapper differs, adapting the core functionality to the interface norms of the target environment: MATLAB, the standalone desktop and a Java host program.

MATLAB Apps, `applicationCompiler` and `libraryCompiler` use the same visual conventions and very similar three-step workflows: select functions to deploy, add data files and descriptive data, create an installer. This shared language and process makes it simpler to deploy the same code to multiple target environments.

Do you deploy code to more than one environment? What challenges do you face? Let us know here.

Get the MATLAB code

Published with MATLAB® R2014a

Jiro's pick this week is `legappend` by Chad Greene.

Chad is no stranger to MATLAB Central. He has over 50 File Exchange entries, and two of his entries have been highlighted (unit converters and ccc) in Pick of the Week. His entries are well-written, and like this one, many of his entries have published example files.

Many of you may know that the command `legend` creates one legend per axes.

```
t = 0:.1:10;
x1 = sin(t);
x2 = cos(t);
plot(t,x1,t,x2)
legend('Sine','Cosine')
```

Let's say that you wanted to add another line to this plot.

```
x3 = sin(t).*cos(t);
hold on
plot(t,x3,'--r')
```

To add this entry to the legend, you would re-run the `legend` command with the three entries.

```
legend('Sine','Cosine','Sine*Cosine')
```

Chad's `legappend` allows you to append new entries to an existing legend. This means that you can simply call it along with the new lines you create.

```
% New figure
figure;
plot(t,x1,t,x2)
legend('Sine','Cosine')

% Add another line
hold on
plot(t,x3,'--r')

% Append a new legend item
legappend('Sine*Cosine')
```

Great! You can also delete the last legend entry with the following command.

`legappend('')`

If you want to see more examples, check out his published example.

Chad explains that `legappend` simply deletes the current legend and recreates the updated legend. This is a good example of how you can take a normal mode of operation and tweak it in order to adapt it to your needs.

Give it a try and let us know what you think here or leave a comment for Chad.


Published with MATLAB® R2014a

*Today's post is part of an ongoing (but long delayed) tutorial series on digital image processing using MATLAB. I'm covering topics in roughly the order used in the book Digital Image Processing Using MATLAB.*

In the previous post in this series, I discussed the different numeric data types that commonly come into play when doing image processing in MATLAB. Today I want to talk about the four main *image types*:

- Gray-scale images
- Binary images
- Indexed images
- RGB images

A *gray-scale image* is a matrix whose values represent shades of gray. When the matrix is of type `uint8`, the integer-valued elements are in the range [0,255]. By convention, the value 0 is displayed as black, and the value 255 is displayed as white. Values in-between are displayed as intermediate shades of gray.

When the matrix is of type `uint16`, then 0 is displayed as black and 65535 is displayed as white.

For a floating-point matrix, either of type `double` or `single`, the value 1.0 is displayed as white.
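As a quick sketch of these display conventions (the rescaling here is hand-written for illustration; the toolbox function `im2double` does the same job):

```
% The same gray ramp under the integer and floating-point conventions.
ramp8 = uint8(0:255);        % uint8: white at 255
rampd = double(ramp8)/255;   % double: white at 1.0
imshow(rampd)                % displays the same as imshow(ramp8)
```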

Originally, the Image Processing Toolbox documentation called these *intensity images*, and you might still find this term used in some places. Some of our color scientist users complained, though, that the term *intensity image* meant something slightly different in their field, so we (mostly) changed our terminology.

```
I = imread('rice.png');
whos I
```

```
  Name        Size             Bytes  Class    Attributes

  I         256x256            65536  uint8
```

```
imshow(I)
```

In image processing, the term *binary image* refers to a two-valued image whose pixels are either black or white. (Or, a bit more generally, the pixels are either background or foreground.) In MATLAB and the Image Processing Toolbox, we have adopted the convention that binary images are represented as *logical matrices*. Here's an example of constructing a matrix whose type is logical and then displaying the result as a binary (black-and-white) image.

```
[xx,yy] = meshgrid(-100:100);
bw = hypot(xx,yy) <= 50;
whos bw
```

```
  Name        Size             Bytes  Class      Attributes

  bw        201x201            40401  logical
```

```
imshow(bw)
title('Binary image')
```

Note that a common mistake is to create a `uint8` matrix and fill it with 0s and 1s, thinking that it will display as black and white.

```
bw = zeros(30,30,'uint8');
bw(10:20,10:20) = 1;
imshow(bw,'InitialMagnification','fit')
title('Help, my image is all black!')
```

When a matrix is of type `uint8`, then the value 1 is not white; it's almost completely black!
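Two simple repairs are possible; here is a sketch of both (either works, depending on whether you want a true binary image or a `uint8` gray-scale image):

```
% Repairing the all-black image above.
bw = zeros(30,30,'uint8');
bw(10:20,10:20) = 1;
imshow(logical(bw))    % reinterpret the 0/1 values as a binary image
imshow(255*bw)         % or rescale so that 255 displays as white
```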

An *indexed image* has two components: an *index matrix* of integers, commonly called `X`, and a *color map matrix*, commonly called `map`. The matrix `map` is an M-by-3 matrix of floating-point values (either `double` or `single`) in the range [0.0,1.0]. Each row of `map` specifies the red, green, and blue components of a single color. An indexed image is displayed by mapping values in the index matrix to colors in the color map. A quirk of MATLAB is that this mapping is data-type specific. If the index matrix is floating-point, then the value 1.0 corresponds to the first color in the color map. But if the index matrix is `uint8` or `uint16`, then the value 0 corresponds to the first color. (Maybe I'll explain the reasons for that on another day.)

You display an indexed image by passing both the index matrix and the color map matrix to `imshow`, like this:

```
[X,map] = imread('trees.tif');
whos X map
```

```
  Name        Size             Bytes  Class     Attributes

  X         258x350            90300  uint8
  map       256x3               6144  double
```

```
imshow(X,map)
title('Indexed image')
```
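The data-type quirk mentioned above can be checked with a tiny color map of your own making (the three-color map here is hypothetical, purely for illustration):

```
% A tiny three-color map: red, green, and blue rows.
map = [1 0 0; 0 1 0; 0 0 1];
Xd = [1 2 3];           % double index matrix: value 1 -> map(1,:), red
Xu = uint8([0 1 2]);    % uint8 index matrix:  value 0 -> map(1,:), red
imshow(Xd,map,'InitialMagnification','fit')
imshow(Xu,map,'InitialMagnification','fit')   % same red-green-blue strip
```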

An RGB image is an M-by-N-by-3 array. For a particular pixel at row `r` and column `c`, the three values `RGB(r,c,1)`, `RGB(r,c,2)`, and `RGB(r,c,3)` specify the red, green, and blue color components of that pixel. A pixel whose color components are [0,0,0] is displayed as black. For a floating-point array, a pixel whose color components are [1.0,1.0,1.0] is displayed as white. For a `uint8` array, [255,255,255] is displayed as white; for a `uint16` array, [65535,65535,65535] is white.

```
RGB = imread('peppers.png');
whos RGB
```

```
  Name        Size               Bytes  Class    Attributes

  RGB       384x512x3           589824  uint8
```

```
imshow(RGB)
```

You can think of an RGB image as a "stack" of three gray-scale images. These gray-scale images are commonly called the *component images*.

```
imshow(RGB(:,:,1))
title('Red component image')
```

```
imshow(RGB(:,:,2))
title('Green component image')
```

```
imshow(RGB(:,:,3))
title('Blue component image')
```

In my first year of blogging (2006), I wrote a series of blog posts explaining in great detail how the MATLAB image display model works for various image and data types. The series is still up-to-date and worth reading today. Or, to see it all in one place, take a look at my MATLAB Digest article, "How MATLAB Represents Pixel Colors."

For more information, see Sections 2.7 and 7.1 of *Digital Image Processing Using MATLAB*.


Published with MATLAB® R2014a

**Active State Output**

Did you know that with Stateflow, it is possible to monitor which state is currently active, and use that information in the rest of the Simulink model? This feature is named Active State Output. The screenshot below shows how to do this for one of the charts in the example Temporal Logic Using the AT Function.

The checkbox creates a new output port on your chart block, whose value (in simulation) is of an enumerated type with values that match the names of the child states.

When doing this for a chart with parallel decomposition, Active State Output needs to be enabled for each parallel state individually, and each parallel state has a separate corresponding output port. *(The sf_car demonstration uses this to output the current gear selected by the shift logic to its transmission model)*.

We can also take advantage of this feature in order to accomplish something special in the code generated from Stateflow charts.

**Stateflow States in Generated Code (Default Behavior)**

When using Embedded Coder to generate code from Stateflow charts, by default the states of the chart are created as macros using the `#define` directive. For example, here is a snippet from the generated code from the chart shown above.

```
/* Named constants for Chart: '<Root>/Temporal Logic' */
#define sf_tlat_IN_NO_ACTIVE_CHILD ((uint8_T)0U)
#define sf_tlat_IN_State ((uint8_T)1U)
/* Named constants for Chart: '<Root>/Without Temporal Logic' */
#define sf_tlat_IN_A ((uint8_T)1U)
#define sf_tlat_IN_B ((uint8_T)2U)
#define sf_tlat_IN_C ((uint8_T)3U)
```

The macros are grouped together by chart under an identifying comment line, but there is no intrinsic distinction as to which chart they belong to.

**Stateflow States as Enums**

So what if you would like to have a more organized grouping? Specifically, what if you would like to have your states be members of an enumeration?

As you might have guessed, we need to enable Active State Output. But we need to do two more things in order to get this enumeration in a nice, clean, standalone form. First, open the signal properties for the signal coming from the Active State Output port and give the signal a name. Second, set the signal’s storage class (under the Code Generation tab) to ExportedGlobal.

Now, in the `*_types.h` generated code file, there will be an enumeration definition, like so:

```
typedef enum {
listofStates_None = 0, /* Default value */
listofStates_A,
listofStates_B,
listofStates_C
} listofStates;
```

And in the generated C code for the model, macros for the chart are no longer defined; rather, a global (with the name of the signal) is declared of type `listofStates`:

```
/* Exported block signals */
listofStates currentState; /* '<Root>/Without Temporal Logic' */
```

Further down in that C code file, you will find a switch-case statement that implements the logic of the Without Temporal Logic chart, like so:

```
switch (currentState) {
case listofStates_A:
...
break;
case listofStates_B:
...
break;
default:
...
break;
}
```

**One Last Note**

When you enable Active State Output, you can also check the “Define enumerated type manually” box and click on the helpful hyperlink to automatically create a template m-file that defines the enumerated type.

At the top of this file you will find an enumeration declaration with a list of all the state names. Tempting as it may be, you cannot change these names here unless you also change them in the chart itself. However, you can use this file to change some other behaviors using the methods that follow (but that’s a blog post for another day).

**Now it's your turn**

Are you already using the active state output? Give this a try and let us know what you think by leaving a comment here.

Denormal floating point numbers and gradual underflow are an underappreciated feature of the IEEE floating point standard. Double precision denormals are so tiny that they are rarely numerically significant, but single precision denormals can be in the range where they affect some otherwise unremarkable computations. Historically, gradual underflow proved to be very controversial during the committee deliberations that developed the standard.

My previous post was mostly about normalized floating point numbers. Recall that normalized numbers can be expressed as

$$ x = \pm (1 + f) \cdot 2^e $$

The *fraction* or *mantissa* $f$ satisfies

$$ 0 \leq f < 1 $$

$f$ must be representable in binary using at most 52 bits for double precision and 23 bits for single precision.

The exponent $e$ is an integer in the interval

$$ -e_{max} < e \leq e_{max} $$

where $e_{max} = 1023$ for double precision and $e_{max} = 127$ for single precision.

The finiteness of $f$ is a limitation on *precision*. The finiteness of $e$ is a limitation on *range*. Any numbers that don't meet these limitations must be approximated by ones that do.

Double precision floating point numbers are stored in a 64-bit word, with 52 bits for $f$, 11 bits for $e$, and 1 bit for the sign of the number. The sign of $e$ is accommodated by storing $e+e_{max}$, which is between $1$ and $2^{11}-2$.

Single precision floating point numbers are stored in a 32-bit word, with 23 bits for $f$, 8 bits for $e$, and 1 bit for the sign of the number. The sign of $e$ is accommodated by storing $e+e_{max}$, which is between $1$ and $2^{8}-2$.

The two extreme values of the exponent field, all zeroes and all ones, are special cases. All zeroes signifies a denormal floating point number, the subject of today's post. All ones, together with a zero fraction, denotes infinity, or `Inf`. And all ones, together with a nonzero fraction, denotes Not-A-Number, or `NaN`.
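You can see these special exponent patterns from MATLAB with `format hex`; a short sketch (the hex strings in the comments are what double precision produces):

```
format hex
Inf         % 7ff0000000000000: all-ones exponent, zero fraction
NaN         % fff8000000000000: all-ones exponent, nonzero fraction
realmin/2   % 0008000000000000: all-zeros exponent, a denormal
format short
```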

My program `floatgui`, available here, shows the distribution of the positive numbers in a model floating point system with variable parameters. The parameter $t$ specifies the number of bits used to store $f$, so $2^t f$ is an integer. The parameters $e_{min}$ and $e_{max}$ specify the range of the exponent.

If you look carefully at the output from `floatgui` shown in the previous post you will see a gap around zero. This is especially apparent in the logarithmic plot, because the logarithmic distribution can never reach zero.

Here is the output for slightly different parameters, $t = 3$, $e_{min} = -5$, and $e_{max} = 2$. This time the gap around zero has been filled in with a spot of green. Those are the denormals.

Zoom in on the portion of these toy floating point numbers less than one-half. Now you can see the individual green denormals -- there are eight of them in this case.

Denormal floating point numbers are essentially roundoff errors in normalized numbers near the underflow limit, `realmin`, which is $2^{-e_{max}+1}$. They are equally spaced, with a spacing of `eps*realmin`. Zero is naturally included as the smallest denormal.

Suppose that `x` and `y` are two distinct floating point numbers near to, but larger than, `realmin`. It may well be that their difference, `x - y`, is smaller than `realmin`. For example, in the small `floatgui` system pictured above, `eps = 1/8` and `realmin = 1/32`. The quantities `x = 6/128` and `y = 5/128` are between `1/32` and `1/16`, so they are both above underflow. But `x - y = 1/128` underflows to produce one of the green denormals.
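The same phenomenon is easy to reproduce in full double precision; a sketch:

```
% Two distinct doubles just above realmin whose difference is denormal.
x = (1 + 2*eps)*realmin;
y = realmin;
x == y           % false: x and y are distinct
d = x - y;       % 2*eps*realmin, far below realmin
d > 0            % true: gradual underflow preserves the difference
```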

Before the IEEE standard, and even on today's systems that do not comply with the standard, underflows would simply be set to zero. So it would be possible to have the MATLAB expression

x == y

be false, while the expression

x - y == 0

be true.

On machines where underflow flushes to zero and division by zero is fatal, this code fragment can produce a division by zero and crash.

```
if x ~= y
   z = 1/(x-y);
end
```

Of course, denormals can also be produced by multiplications and divisions that produce a result between `eps*realmin` and `realmin`. In decimal these ranges are

```
format compact
format short e
[eps*realmin realmin]
[eps('single')*realmin('single') realmin('single')]
```

```
ans =
  4.9407e-324  2.2251e-308
ans =
  1.4013e-45   1.1755e-38
```

Denormal floating point numbers are stored without the implicit leading one bit,

$$ x = \pm f \cdot 2^{-e_{max}+1} $$

The *fraction* $f$ satisfies

$$ 0 \leq f < 1 $$

And $f$ is represented in binary using 52 bits for double precision and 23 bits for single precision. Note that zero naturally occurs as a denormal.

When you look at a double precision denormal with `format hex` the situation is fairly clear. The rightmost 13 hex characters are the 52 bits of the fraction. The leading bit is the sign. The other 12 bits of the first three hex characters are all zero because they represent the biased exponent, which is zero because $e_{max}$ and the exponent bias were chosen to complement each other. Here are the two largest and two smallest nonzero double precision denormals.

```
format hex
[(1-eps); (1-2*eps); 2*eps; eps; 0]*realmin
format long e
ans
```

```
ans =
   000fffffffffffff
   000ffffffffffffe
   0000000000000002
   0000000000000001
   0000000000000000
ans =
    2.225073858507201e-308
    2.225073858507200e-308
    9.881312916824931e-324
    4.940656458412465e-324
    0
```

The situation is slightly more complicated with single precision because 23 is not a multiple of four. The fraction and exponent fields of a single precision floating point number -- normal or denormal -- share the bits in the third character of the hex display; the biased exponent gets one bit and the first three bits of the 23 bit fraction get the other three. Here are the two largest and two smallest nonzero single precision denormals.

```
format hex
e = eps('single');
r = realmin('single');
[(1-e); (1-2*e); 2*e; e; 0]*r
format short e
ans
```

```
ans =
   007fffff
   007ffffe
   00000002
   00000001
   00000000
ans =
   1.1755e-38
   1.1755e-38
   2.8026e-45
   1.4013e-45
   0
```

Think of the situation this way. The normalized floating point numbers immediately to the right of `realmin`, between `realmin` and `2*realmin`, are equally spaced with a spacing of `eps*realmin`. If just as many numbers, with just the same spacing, are placed to the left of `realmin`, they fill in the gap between there and zero. These are the denormals. They require a slightly different format to represent and slightly different hardware to process.

The IEEE Floating Point Committee was formed in 1977 in Silicon Valley. The participants included representatives of semiconductor manufacturers who were developing the chips that were to become the basis for the personal computers that are so familiar today. As I said in my previous post, the committee was a remarkable case of cooperation among competitors.

Velvel Kahan was the most prominent figure in the committee meetings. He was not only a professor of math and computer science at UC Berkeley, but also a consultant to Intel, involved in the design of their math coprocessor, the 8087. Some of Velvel's students were involved as well, both current students from the campus and graduates who were by then working for some of the participating companies.

A proposed standard, written by Kahan, one of his students at Berkeley, Jerome Coonen, and a visiting professor at Berkeley, Harold Stone, known as the KCS draft, reflected the Intel design and was the basis for much of the committee's work.

The committee met frequently, usually in the evening in conference rooms of companies on the San Francisco peninsula. There were also meetings in Austin, Texas, and on the East Coast. The meetings often lasted until well after midnight.

Membership was based on regular attendance. I was personally involved only when I was visiting Stanford, so I was not an official member. But I do remember telling a colleague from Sweden who was coming to the western United States for the first time that there were three sites that he had to be sure to see: the Grand Canyon, Las Vegas, and the IEEE Floating Point Committee.

The denormals in the KCS draft were something new to most of the committee. Velvel said he had experimented with them at the University of Toronto, but that was all. Standards efforts are intended to regularize existing practice, not introduce new designs. Besides, implementing denormals would require additional hardware, and additional transistors were a valuable resource in emerging designs. Some experts claimed that including denormals would slow down all floating point arithmetic.

A mathematician from DEC, Mary Payne, led the opposition to KCS. DEC wanted a looser standard that would embrace the floating point that was already available on their VAX. The VAX format was similar to, but not the same as, the KCS proposal. And it did not include denormals.

Discussion went on for a couple of years. Letters of support for KCS from Don Knuth and Jim Wilkinson did not settle the matter. Finally, DEC engaged G. W. (Pete) Stewart, from the University of Maryland. In what must have been a surprise to DEC, Pete also said he thought that the KCS proposal was a good idea. Eventually the entire committee voted to accept a revised version.

Denormal floating point numbers are still the unwanted stepchildren in today's floating point family. I think it is fair to say that the numerical analysis community has failed to make a strong argument for their importance. It is true that they do make some floating point error analyses more elegant. But with magnitudes around $10^{-308}$ the double precision denormals are hardly ever numerically significant in actual computation. Only the single precision denormals around $10^{-38}$ are potentially important.

Outside of MATLAB itself, we encounter processors that have IEEE format for the floating point numbers, but do not conform to the 754 standard when it comes to processing. These processors usually flush underflow to zero and so we can expect different numerical results for any calculations that might ordinarily produce denormals.

We still see processors today that handle denormals with microcode or software. Execution time of MATLAB programs that encounter denormals can degrade significantly on such processors.
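A rough way to observe this on an affected processor is to time the same computation in the normal and denormal ranges (timings vary by machine; on hardware with full denormal support the two multiplies take about the same time):

```
% Compare elementwise multiplies whose results are normal vs. denormal.
A = rand(2000);
B = rand(2000);
tic, C1 = A .* B;            toc  % results in the normal range
tic, C2 = (realmin*A) .* B;  toc  % results mostly in the denormal range
```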

The Wikipedia page on denormals has the macros for setting trap handlers to flush underflows to zero in C or Java programs. I hate to think what might happen to MATLAB Mex files with such macros. Kids, don't try this at home.

William Kahan, Home Page.

Charles Severance, An Interview with the Old Man of Floating-Point, Reminiscences for *IEEE Computer*, February 20, 1998.

Wikipedia page, Denormal number.


Published with MATLAB® R2014a

Today I looked around the File Exchange for recent image processing submissions or updates. (You can do this yourself by searching using the tag image processing.) Here are a few things that caught my eye.

getTimeStamp, by Christopher MacMinn, is an example of someone using the recently introduced linking between the File Exchange and GitHub. This is a straightforward function that illustrates how to use `imfinfo` to read useful metadata from a file. In particular, it shows how to extract the information that digital cameras stick into image files these days.

The critical part of the code looks something like this (simplified a bit):

```
info = imfinfo(filename);
if isfield(info,'DigitalCamera')
    % There is digital camera information in the file.
    % Get the time the picture was taken (as a string).
    str = info.DigitalCamera.DateTimeOriginal;
end
```

If you have data that originally came from ImageJ, you might find Dylan Muir's ReadImageJROI function to be useful. Dylan originally submitted this in 2011, and he recently updated it to fix a couple of issues. The file is regularly downloaded and has received several good ratings.

PIVlab is an impressive-looking application for particle image velocimetry created by William Thielicke. This contribution gets a LOT of traffic and has very high ratings. William has updated it regularly, and he has been responsive in the comment thread.

Sometimes the author of a research paper will submit the MATLAB implementation of the paper's method to the File Exchange. That appears to be the case with An improved computer vision method for white blood cells detection (using differential evolution) by Erik. I'm guessing that Erik is Erik Cuevas, the lead author of the paper quoted in the submission's description.

I would be very interested to hear your opinions about the most useful image processing submissions on the File Exchange. Please post your comments here.


Published with MATLAB® R2014a

Greg's pick this week is module - encapsulate MATLAB package into a name space module by Daniel Dolan.

The concept of namespaces has been available in MATLAB for a number of years. While the technology is mature, one common feature of namespaces was not included in the MATLAB implementation: you cannot call one function in a namespace from another function in the same namespace without including the namespace definition somewhere. Most namespaces in other languages permit functions to call each other within the namespace without needing to identify the namespace.

Daniel’s “MODULE” function provides an interesting means to work around that constraint using a method that demonstrates the powerful concept of function handles in MATLAB.

The nature of Daniel’s solution also provides a means to create namespace specifiers, without needing to continually reference the full namespace.

Finally Daniel makes it possible to have what one might call “dynamic namespaces”.

I selected this submission because it demonstrates the powerful feature of function handles, advertises the use of namespaces in MATLAB, and uses the “DBSTACK” function in a way that surprised me.

Yes, it happens. There’s what appears to be a bug in this File Exchange submission. Change line *110* from

`root=sprintf('%s.',root{:});`

to

`root=sprintf('%s.',root{end:-1:1});`

The order in which the package namespace is being constructed is backward. Changing this line reverses the construction and resolves the issue.

If you’re not familiar with namespaces, don’t fret. It is an advanced software engineering concept. I work with a lot of software engineers all over the world, and many of them are not familiar with namespaces because they work in coding languages or environments that don’t have them (such as C).

A namespace provides a means to specifically call unique functions that share the same name. I like to think of a namespace as a context or a classifier for a function.

Let’s say you have two functions, each named “myDisp”, but they have different behavior.

`>> myDisp('Hello')`

```
My Display 1
Hello
```

Versus

`>> myDisp('Hello')`

```
My Display 2
-------------
Hello
```

How do you invoke the appropriate *myDisp* at the correct time?

Perhaps you could give the functions different names: *myDisp* and *myDisp2*.

But if you’re concerned about modularity, and these functions are used independently in other projects and code files, changing the name of the functions will be problematic.

What if instead you could include an identifier that would make the call to the *myDisp* function more specific?

```
>> myFirstNamespace.myDisp('Hello')
>> myOtherNamespace.myDisp('Hello')
```

Now both functions can be available, without introducing a name conflict, or needing to change the name of the function.

To accomplish this in MATLAB, you need to use package folders.

While namespaces provide a means to be more specific about function calls, they can be inconvenient when it comes to actually writing them out, and reading them in the code file. Especially as the namespaces get longer.

`>> myMainNamespace.myOtherSubNamespace.myOtherNamespace.myDisp('Hello')`

This seems hard to read, and a pain to write.

We can use the IMPORT function to handle this:

```
>> import myMainNamespace.mySubNamespace.myFirstNamespace.*
>> myDisp('hello')
```

```
My Display 1
Hello
```

Now the contents of `myFirstNamespace` can be referenced without using the full namespace notation. But if you import two functions that have the same name but live in different namespaces, you are back to the same problem that namespaces try to solve! You can partially import namespaces, but this gets tedious.

The MODULE function makes it possible to refer to functions in namespace by essentially providing an alias for the namespace. This means you can refer to functions deep within a namespace without needing to write out the full namespace, or have the potential conflict issues using the IMPORT function.

```
>> myLocalSpace = module('myMainNamespace.mySubNamespace.myFirstNamespace');
>> myLocalSpace.myDisp('Hello')

>> myOtherLocalSpace = module('myMainNamespace.myOtherSubNamespace.myOtherNamespace');
>> myOtherLocalSpace.myDisp('Hello')
```

Here the calls to the functions in the namespace are concise and specific to the desired function.

This also allows you to do fun things like make the namespace dynamic.

```
if (a < b)
    myLocalSpace = module('myMainNamespace.mySubNamespace.myFirstNamespace');
else
    myLocalSpace = module('myMainNamespace.myOtherSubNamespace.myOtherNamespace');
end
myLocalSpace.myDisp('Hello')
```

I cannot say at this point whether or not having dynamic namespaces is a good idea, but it is certainly an interesting concept.

In MATLAB, if a function in a namespace calls another function in the same namespace, it must refer to the full namespace of the function being called.

```
function myDisp( displayInput )
    displayHeader = myMainNamespace.myOtherSubNamespace.myOtherNamespace.otherFunction;
    disp(displayHeader);
    disp(displayInput);
end
```

This is fine, until later when you wrap the existing namespace in another parent namespace, and you must go into all of the functions and append the new parent namespace. If you have hundreds of functions, this is going to get old fast.

Daniel provides a really neat mechanism to get around this constraint. Calling MODULE with no inputs assumes the namespace you want is relative to the current function.

```
function myDisp( displayInput )
    localNamespace = module;
    displayHeader = localNamespace.otherFunction();
    disp(displayHeader);
    disp(displayInput);
end
```

Now, even if the structure of the namespace changes, you can still call the functions local to the namespace, because the local namespace is automatically updated when the *myDisp* function is called.

This is the part I think is really elegant.

In order to make this work, Daniel takes advantage of function handles. Function handles are pointers or aliases to functions. This means you can use a variable to store a reference to a function.

Daniel creates a collection of these function handles by creating function handles for each of the functions in a specific package folder. Then he stores these function handles in a MATLAB struct. Subreferencing the struct using the dot notation (struct.field) will now act like referencing a function in a namespace. How cool is that!
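The idea can be sketched roughly like this (a simplified, hypothetical version for illustration; `simpleModule` is my name, not Daniel's, and his implementation handles more cases):

```
function ns = simpleModule(packageName)
% Return a struct whose fields are function handles to every
% function in the given package (namespace).
% Simplified sketch -- not the actual MODULE implementation.
    pkg = meta.package.fromName(packageName);
    ns = struct();
    for k = 1:numel(pkg.FunctionList)
        fname = pkg.FunctionList(k).Name;
        % A handle built from the fully qualified name stays bound
        % to the function in the right package.
        ns.(fname) = str2func([packageName '.' fname]);
    end
end
```

After `ns = simpleModule('myMainNamespace.mySubNamespace.myFirstNamespace')`, the call `ns.myDisp('Hello')` dispatches to the right function through the stored handle.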

Finally, to get around the constraint MATLAB imposes of functions in a namespace referencing other functions in the same namespace, Daniel uses the DBSTACK function.

I’m pretty sure he’s using DBSTACK in a way it was never intended, but I think it’s absolutely brilliant!

Daniel uses the DBSTACK function to determine what function called MODULE, so he can then figure out what namespace the function belongs to.

While there may be some performance implications from using DBSTACK, it is an excellent demonstration of the flexibility of the MATLAB environment. You can ask a function “Who called you?” I’ve been working with MATLAB over 15 years now, and I didn’t realize you could do that!
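To see the flavor of the trick, here is a minimal sketch (again, not Daniel's actual code) of using DBSTACK to recover the namespace of whoever called you:

```
function ns = whoCalledMe()
% Sketch: find the package (namespace) of the function that called us.
    stack = dbstack('-completenames');
    callerFile = stack(2).file;   % stack(1) is whoCalledMe itself
    % Package folders show up as '+name' segments in the caller's path.
    tokens = regexp(fileparts(callerFile), '\+(\w+)', 'tokens');
    ns = strjoin([tokens{:}], '.');
end
```

Called from a function living in `+myMainNamespace/+mySubNamespace/+myFirstNamespace/`, this would return `'myMainNamespace.mySubNamespace.myFirstNamespace'`.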

If you would like to leave any comments regarding this post, please click here.

Get the MATLAB code

Published with MATLAB® R2014a

Today I set out to write a post called "Radon transform - under the hood."

Things didn't turn out the way I expected.

When I considered how to write the post I originally intended, I started thinking about how it would be nice to have a good way to visualize (and therefore help understand) the Radon transform itself. So I started working on that instead.

Of course, I didn't have a lot of time today, so I didn't get anywhere near finished with a working app. But I thought I would post anyway!

I'm inviting you, dear reader, to weigh in about what you think would make a good Radon transform visualizer.

To get you started, here's my sketch:

On the left, you can see a simple image (light square on dark background), and you can see the Radon projection along one particular angle (45 degrees).

On the right, you can see the entire Radon transform, displayed as an image. You can also see a vertical dashed line at 45 degrees, which corresponds to the projection angle shown on the left.

I imagine you could interactively drag either the projection axis on the left or the vertical line on the right to change the projection angle and watch the projection change.
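In the meantime, a static version of the sketch is easy to produce with the Image Processing Toolbox `radon` function (the image size and angles here are just my guesses at what the sketch shows):

```
% A static version of the sketch: one projection plus the full transform.
I = zeros(100);
I(30:70,30:70) = 1;                 % light square on dark background
theta = 0:179;
[R,xp] = radon(I,theta);            % full Radon transform

subplot(1,2,1)
plot(xp,R(:,theta == 45))           % the 45-degree projection
title('Projection at 45 degrees')

subplot(1,2,2)
imagesc(theta,xp,R)                 % the whole transform as an image
hold on
plot([45 45],ylim,'w--')            % dashed line at the 45-degree column
hold off
title('Radon transform')
xlabel('\theta (degrees)')
ylabel('x''')
```

An interactive version would tie the dashed line and the left-hand projection to a draggable angle.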

So, what do you think? Good idea, bad idea? Any suggestions?


Collecting and tracking health and fitness data with wearable devices is about to go mainstream as the smartphone giants like Apple, Google and Samsung jump into the fray. But if you collect data, what's the point if you don't analyze it?

Today's guest blogger, Toshi Takeuchi, would like to share an analysis of a weight lifting dataset he found in a public repository.

- Motivation, dataset, and prediction accuracy
- Data preprocessing and exploratory analysis
- Predictive Modeling with Random Forest
- Plot misclassification errors by number of trees
- Variable Importance
- Evaluate trade-off with ROC plot
- The reduced model with 12 features
- Conclusion and the next steps - integrate your code into your app

The Human Activity Recognition (HAR) Weight Lifting Exercise Dataset provides measurements to determine "how well an activity was performed". Six subjects performed one set of 10 repetitions of the Unilateral Dumbbell Biceps Curl in five different ways.

When I came across this dataset, I immediately thought of building a mobile app to advise end users whether they are performing the exercise correctly, and if not, which common mistakes they are making. I used the powerful 'Random Forest' algorithm to see if I could build a successful predictive model to enable such an app. I was able to achieve **99% prediction accuracy** with this dataset and I would like to share my results with you.

The dataset provides 39,242 samples with 159 variables labeled with 5 types of activity to detect - 1 correct method and 4 common mistakes:

- exactly according to the specification (Class A)
- throwing the elbows to the front (Class B)
- lifting the dumbbell only halfway (Class C)
- lowering the dumbbell only halfway (Class D)
- throwing the hips to the front (Class E)

Sensors were placed on the subjects' belts, armbands, gloves, and dumbbells, as described below:

**Citation** *Velloso, E.; Bulling, A.; Gellersen, H.; Ugulino, W.; Fuks, H. Qualitative Activity Recognition of Weight Lifting Exercises. Proceedings of 4th International Conference in Cooperation with SIGCHI (Augmented Human '13) . Stuttgart, Germany: ACM SIGCHI, 2013.* Read more: http://groupware.les.inf.puc-rio.br/har#ixzz34dpS6oks

Usually you cannot use raw data directly. Preprocessing is an important part of your analysis workflow that has significant downstream impact.

- Load the dataset and inspect data for missing values
- Partition the dataset for cross validation
- Clean and normalize variables
- Select predictor variables (features)

Among those steps, cross validation is a key step specific to predictive modeling. Roughly speaking, you hold out part of the available data for testing later, and build models using the remaining data. The held-out set is called the 'test set' and the set we use for modeling is called the 'training set'. This makes it harder to overfit your model, because you can test your model against data you didn't use in the modeling process, giving you a realistic idea of how the model would actually perform on unknown data.
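The holdout step can be sketched with `cvpartition` (the actual split in `preprocess.m` may differ; `X` and `Y` here stand for the predictor table and the class labels):

```
% One way to hold out a test set, stratified by class label.
% Assumes a predictor table X and a categorical label vector Y.
rng(1)                                  % reproducible shuffle
cv = cvpartition(Y,'HoldOut',0.3);      % 30% test, 70% training
Xtrain = X(training(cv),:);  Ytrain = Y(training(cv));
Xtest  = X(test(cv),:);      Ytest  = Y(test(cv));
```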

Exploratory analysis usually begins with visualizing data to get oriented with its nuances. Plots of the first four variables below show that:

- data is organized by class - like 'AAAAABBBBBCCCCC'. This can be an artifact of the way the data was collected and real life data may not be structured like this, so we want to use more realistic data to build our model. We can fix this issue by randomly reshuffling the data.
- data points cluster around a few different mean values - indicating that measurements were taken by devices calibrated in a few different ways
- those variables exhibit a distinct pattern for Class E (colored in magenta) - those variables will be useful to isolate it

Review `preprocess.m` if you are curious about the details.

preprocess

The dataset has some issues with calibration. We could further preprocess the data in order to remove calibration gaps. This time, however, I would like to use a flexible predictive algorithm called Random Forest. In MATLAB, this algorithm is implemented in the TreeBagger class available in Statistics Toolbox.

Random Forest became popular particularly after it was used by a number of winners in Kaggle competitions. It uses a large ensemble of decision trees (thus 'forest') trained on random subsets of data and uses the majority votes of those trees to predict the result. It tends to produce a highly accurate result, but the complexity of the algorithm makes it slow and difficult to interpret.

To accelerate the computation, I will enable the parallel option supported by Parallel Computing Toolbox. You can comment out unnecessary code if you don't use it.

Once the model is built, you will see the confusion matrix that compares the actual class labels to predicted class labels. If everything lines up on a diagonal line, then you achieved 100% accuracy. Off-diagonal numbers are misclassification errors.

The model has a very high prediction accuracy even though we saw earlier that our dataset had some calibration issues.

Initialize parallel option - comment out if you don't use parallel

```
poolobj = gcp('nocreate'); % don't create a new pool even if no pool exists
if isempty(poolobj)
    parpool('local',2);
end
opts = statset('UseParallel',true);
```

Starting parallel pool (parpool) using the 'local' profile ... connected to 2 workers.

Create a Random Forest model with 100 trees, parallel enabled...

```
rfmodel = TreeBagger(100,table2array(Xtrain),Ytrain,...
    'Method','classification','oobvarimp','on','options',opts);
```

Here's the non-parallel version of the same model:

```
rfmodel = TreeBagger(100,table2array(Xtrain),Ytrain,...
    'Method','classification','oobvarimp','on');
```

Predict the class labels for the test set.

[Ypred,Yscore]= predict(rfmodel,table2array(Xtest));

Compute the confusion matrix and prediction accuracy.

```
C = confusionmat(Ytest,categorical(Ypred));
disp(array2table(C,'VariableNames',rfmodel.ClassNames,'RowNames',rfmodel.ClassNames))
fprintf('Prediction accuracy on test set: %f\n\n', sum(C(logical(eye(5))))/sum(sum(C)))
```

```
            A      B      C      D      E
          ____    ___    ___    ___    ___

    A     1133      1      0      0      0
    B        4    728      1      0      0
    C        0      3    645      3      0
    D        0      0      8    651      0
    E        0      0      0      6    741

Prediction accuracy on test set: 0.993374
```

I happened to pick 100 trees for the model, but you can check the misclassification error as a function of the number of trees used in the prediction. The plot shows that 100 is overkill - we could use fewer trees, which would make it go faster.

```
figure
plot(oobError(rfmodel));
xlabel('Number of Grown Trees');
ylabel('Out-of-Bag Classification Error');
```

One major criticism of Random Forest is that it is a black box algorithm and it is not easy to understand what it is doing. However, Random Forest can provide the variable importance measure, which corresponds to the change in prediction error with and without the presence of a given variable in the model.

For our hypothetical weight lifting trainer mobile app, Random Forest would be too cumbersome and slow to implement, so you want to use a simpler prediction model with fewer predictor variables. Random Forest can help you with selecting which predictors you can drop without sacrificing the prediction accuracy too much.

Let's see how you can do this with `TreeBagger`.

Get the names of variables

vars = Xtrain.Properties.VariableNames;

Get and sort the variable importance scores. Because we turned `'oobvarimp'` to `'on'`, the model contains `OOBPermutedVarDeltaError` which acts as the variable importance measure.

```
varimp = rfmodel.OOBPermutedVarDeltaError';
[~,idxvarimp] = sort(varimp);
labels = vars(idxvarimp);
```

Plot the sorted scores.

```
figure
barh(varimp(idxvarimp),1);
ylim([1 52]);
set(gca, 'YTickLabel',labels, 'YTick',1:numel(labels))
title('Variable Importance');
xlabel('score')
```

Now let's look at the trade-off between the number of predictor variables and prediction accuracy. The receiver operating characteristic (ROC) plot provides a convenient way to visualize and compare the performance of binary classifiers. You plot the false positive rate against the true positive rate at various prediction thresholds to produce the curves. A completely random classifier follows the diagonal line; a perfectly accurate one hugs the upper left corner. This means you can use the area under the curve (AUC) to evaluate how well each model performs.
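For a single feature set, the ROC construction described above boils down to a few lines with `perfcurve` (a sketch of the idea; `myROCplot.m` wraps this in a loop over feature subsets):

```
% A single ROC curve for class 'C' via perfcurve, using the scores
% returned earlier by predict.
scoreC = Yscore(:,strcmp(rfmodel.ClassNames,'C'));   % class C scores
[fpr,tpr,~,auc] = perfcurve(Ytest,scoreC,'C');
plot(fpr,tpr)
xlabel('False positive rate')
ylabel('True positive rate')
title(sprintf('ROC for class C (AUC = %.3f)',auc))
```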

Let's plot ROC curves with different sets of predictor variables, using the "C" class as the positive class, since we can only do this one class at a time, and the previous confusion matrix shows more misclassification errors for this class than others. You can use `perfcurve` to compute ROC curves.

Check out `myROCplot.m` to see the details.

```
nFeatures = [3,5,10,15,20,25,52];
myROCplot(Xtrain,Ytrain,Xtest,Ytest,'C',nFeatures,varimp)
```

Based on the previous analysis, it looks like you can achieve a high accuracy rate even if you use as few as 10 features. Let's say we settled on 12 features. We now know you don't have to use the data from the glove for prediction, so that's one less sensor our hypothetical end users would have to buy. Given this result, I may even consider dropping the armband, and just stick with the belt and dumbbell sensors.

Show which features were included.

```
disp(table(varimp(idxvarimp(1:12)),'RowNames',vars(idxvarimp(1:12)),...
    'VariableNames',{'Importance'}));
```

```
                        Importance
                        __________

    accel_belt_y         0.69746
    pitch_dumbbell       0.77255
    accel_arm_z          0.78941
    accel_belt_x         0.81696
    magnet_arm_y         0.81946
    accel_arm_x          0.87168
    magnet_arm_x         0.87897
    accel_dumbbell_x     0.92222
    magnet_forearm_x      1.0172
    total_accel_belt      1.0461
    gyros_arm_z           1.1077
    gyros_belt_x          1.1235
```

Shut down the parallel pool.

delete(poolobj);

Despite my initial misgivings about the data, we were able to maintain high prediction accuracy with a Random Forest model using just 12 features. However, Random Forest is probably not an ideal model to implement in a mobile app, given its memory footprint and slow response time.

The next step is to find a simpler model, such as logistic regression, that can perform decently. You may need to do more preprocessing of the data to make it work.

Finally, I have never tried this before, but you could generate C code out of MATLAB to incorporate it into a mobile app. Watch this webinar, MATLAB to iPhone Made Easy, for more details.

Do you track your activities with wearable devices? Have you tried analyzing the data? Tell us about your experience here!


This summer my mother-in-law is renting a house on a lake in New Hampshire. Looking at the calendar, my wife said: "The ten-day forecast makes it look like it's going to be pretty hot up at the lake next week." This led to a more general discussion of the merits of ten-day forecasts.

It's funny how we can make decisions based on long-term predictions of weather even though we rarely go back and verify that the forecast was any good. Somehow the fact that the forecast exists at all gives it value. I'm left pondering this question: how much should we trust a ten-day prediction? As it happens, I have some data that can be useful here. For some time, I have been collecting some relevant data on Trendy: the ten day forecast for Natick, Massachusetts (hometown for MathWorks). So let's run some numbers.

Here's the trend: Ten Day Forecast Highs for Natick, MA.

Once a day this trend collects ten data points: today's high temperature and the predicted high temperature for the next nine days. In MATLAB, we'll be working with a matrix with one row for each day and ten columns.

Let's get the data into MATLAB so we can play around with it. I can retrieve (and so can you) the data from Trendy as a JSON object using the following call:

http://www.mathworks.com/matlabcentral/trendy/trends/1655/trend_data.json

In order to read this into MATLAB, I'm going to use Joe Hicklin's JSON parser.

```
url = 'http://www.mathworks.com/matlabcentral/trendy/trends/1655/trend_data.json';
json = urlread(url);
raw = JSON.parse(json);
```

```
t = zeros(length(raw),1);
d = zeros(length(raw),10);
for i = 1:length(raw)
    t(i) = raw{i}{1};
    predictions = raw{i}{2};
    for j = 1:10
        d(i,j) = str2num(predictions{j});
    end
end
firstTenRows = d(1:10,:)
```

```
firstTenRows =

    44    51    59    66    46    48    53    50    51    54
    50    58    63    49    48    52    48    46    57    54
    59    61    49    47    53    49    48    43    41    48
    62    49    48    54    49    39    39    44    47    46
    49    48    54    50    40    38    39    47    51    54
    47    55    50    39    40    48    52    53    53    53
    54    50    40    39    48    53    54    52    52    50
    49    40    38    50    55    54    56    56    53    49
    40    39    50    56    54    52    56    54    47    43
    39    50    55    55    55    59    58    40    41    46
```

Now I have a temperature prediction matrix that's structured like this.

I want to re-order this matrix so that each line shows the prediction trajectory for a single day in time. That means picking off the diagonals highlighted in the diagram above. So let's write some code that does this shift. I'm going to end up with two new matrices, d1 and d2

```
rowIndex = 1:10;
colIndex = 10:-1:1;
sz = size(d);
len = (size(d,1)-10);
d1 = zeros(len,10);
d2 = zeros(len,10);
t1 = zeros(len,1);
for i = 1:len
    ind = sub2ind(sz,rowIndex+i-1,colIndex);
    trend = d(ind);
    d1(i,:) = trend;
    d2(i,:) = trend-trend(end);
    t1(i) = t(i+9);
end
firstTenRows = d1(1:10,:)
```

```
firstTenRows =

    54    57    43    39    38    40    39    38    39    39
    54    41    44    39    48    48    50    50    50    49
    48    47    47    52    53    55    56    55    54    57
    46    51    53    54    54    54    55    55    57    58
    54    53    52    56    52    55    57    57    60    58
    53    52    56    56    59    59    61    63    63    64
    50    53    54    58    45    42    45    44    44    43
    49    47    40    39    37    42    43    43    43    43
    43    41    41    42    44    48    47    49    48    47
    46    44    48    49    46    52    51    50    49    48
```

In d1, each row is the evolving temperature prediction for a single day. So when we plot the first row of d1, we're getting the predictive arc for November 13th of last year.

```
i = 1;
plot(-9:0,d1(i,:))
title(sprintf('Predicted Temperature for %s',datestr(t1(i),1)))
xlabel('Time of Prediction (Offset in Days)')
ylabel('Predicted Temperature (Deg. F)')
```

In d2, we just subtract from each row the last value in each row. Since this last value is the final (and presumably correct) temperature, this difference gives us the predictive error across the ten days. Here's the error for the November 13th prediction.

```
i = 1;
plot(-9:0,d2(i,:))
title(sprintf('Error in Predicted Temperature for %s',datestr(t1(i),1)))
xlabel('Time of Prediction (Offset in Days)')
ylabel('Prediction Error (Deg. F)')
```

Notice that it shrinks to zero over time. That's good! Our predictions get more accurate as we approach the actual day in question. But the early predictions were off by as much as 18 degrees. Is that good or bad? You tell me.

Now let's look at all the days.

```
plot(-9:0,d2','Color',[0.5 0.5 1])
title('Error in Predicted Temperature')
xlabel('Time of Prediction (Offset in Days)')
ylabel('Prediction Error (Deg. F)')
```

It's hard to get a sense of the error distribution. So let's finish with a histogram of the absolute value of the error. Out of 240 measurements in this data set, the median error for a ten-day prediction is six degrees.

```
hist(abs(d2(:,1)),1:25)
title('Histogram of Error in the Ten-Day Forecast')
xlabel('Error (deg. F)')
ylabel('Number of Samples')
```

That seems pretty good. Most of the time that error is going to be less than seven or so degrees Fahrenheit (or four degrees Celsius). I probably don't need to pack a sweater for the weekend at the lake.
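The summary numbers quoted above come straight from the first column of d2, which holds the ten-day-out errors:

```
% Summary statistics of the ten-day-out prediction error.
tenDayError = abs(d2(:,1));    % absolute error of the ten-day prediction
median(tenDayError)            % the median error quoted above
mean(tenDayError <= 7)         % fraction of days within 7 degrees F
```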


I'm going to play a small trick on you today. Try reading in this JPEG file using `imread`:

```
url = 'http://blogs.mathworks.com/images/steve/2014/peppers.jpg';
rgb = imread(url);
imshow(rgb)
```

So what's the trick? Well, look more closely at this file using `imfinfo`:

info = imfinfo(url)

```
info = 

                  Filename: 'http://blogs.mathworks.com/images/steve/2014/...'
               FileModDate: '11-Jul-2014 14:46:33'
                  FileSize: 287677
                    Format: 'png'
             FormatVersion: []
                     Width: 512
                    Height: 384
                  BitDepth: 24
                 ColorType: 'truecolor'
           FormatSignature: [137 80 78 71 13 10 26 10]
                  Colormap: []
                 Histogram: []
             InterlaceType: 'none'
              Transparency: 'none'
    SimpleTransparencyData: []
           BackgroundColor: []
           RenderingIntent: []
            Chromaticities: []
                     Gamma: []
               XResolution: []
               YResolution: []
            ResolutionUnit: []
                   XOffset: []
                   YOffset: []
                OffsetUnit: []
           SignificantBits: []
              ImageModTime: '16 Jul 2002 16:46:41 +0000'
                     Title: []
                    Author: []
               Description: 'Zesty peppers'
                 Copyright: 'Copyright The MathWorks, Inc.'
              CreationTime: []
                  Software: []
                Disclaimer: []
                   Warning: []
                    Source: []
                   Comment: []
                 OtherText: []
```

See it yet? No?

Look at the Format field:

info.Format

```
ans =

png
```

The function `imfinfo` is claiming that this JPEG file is really a PNG file, which is a completely different image file format!

So what's going on here? Is this a JPEG file or not?

This trick question is really just an excuse to peek under the hood of `imread` to see how an interesting piece of it works. (Well, it's interesting to me, at least.)

Before opening the hood, though, let's try reading one more file. And notice that this filename has an extension that has nothing to do with any particular image format.

```
url2 = 'http://blogs.mathworks.com/images/steve/2014/peppers.fruit_study_2014_Jul_11';
rgb2 = imread(url2);
imshow(rgb2)
```

OK, so `imread` can successfully read this image file even without an extension indicating its format.

If you're curious, a lot of the `imread` code that makes all this work is available for you to look at in your installation of MATLAB. (If, on the other hand, you're not curious, then this would be a good time to go over and read Cleve's blog instead.) For example, you can view the source code for `imread` in the MATLAB Editor by typing `edit imread`. (Please don't modify the code, though!)

Here's a partial fragment of code near the top:

```
if (isempty(fmt_s))
    % The format was not specified explicitly.

    ... snip ...

    % Try to determine the file type.
    [format, fmt_s] = imftype(filename);
```

Hmm, what's that function `imftype`?

```
which imftype
```

'imftype' not found.

It doesn't appear to exist!

It does exist, but it happens to be a private function. The function `which` will find it if you tell `which` to look a little harder.

which -all imftype

/Applications/MATLAB_R2014a.app/toolbox/matlab/imagesci/private/imftype.m % Private to imagesci

Even though you can't directly call this function (that's what *private* means here), you can still look at it in the MATLAB Editor by typing `edit private/imftype`.

Here's some code from near the beginning of `imftype`.

```
idx = find(filename == '.');
if (~isempty(idx))
    extension = lower(filename(idx(end)+1:end));
else
    extension = '';
end

% Try to get useful imformation from the extension.
if (~isempty(extension))

    % Look up the extension in the file format registry.
    fmt_s = imformats(extension);

    if (~isempty(fmt_s))

        if (~isempty(fmt_s.isa))

            % Call the ISA function for this format.
            tf = feval(fmt_s.isa, filename);

            if (tf)

                % The file is of that format.  Return the ext field.
                format = fmt_s.ext{1};
                return;

            end
        end
    end
end
```

In English: If the filename has an extension on it, use the `imformats` function to get a function that can test to see whether the file really has that format.

So what's that new function `imformats` in the middle there? Well, you can call this one directly. Try it.

s = imformats

```
s = 

1x19 struct array with fields:

    ext
    isa
    info
    read
    write
    alpha
    description
```

That output is not very readable. If we were designing this today, we'd probably make `imformats` return a table. Fortunately, we've got an easy way to convert a struct array into a table!

t = struct2table(s)

```
t = 

       ext          isa         info          read        write      alpha            description            
    __________    _______    ___________    _________    _________    _____    __________________________________
    {1x1 cell}    @isbmp     @imbmpinfo     @readbmp     @writebmp    0        'Windows Bitmap'
    {1x1 cell}    @iscur     @imcurinfo     @readcur     ''           1        'Windows Cursor resources'
    {1x2 cell}    @isfits    @imfitsinfo    @readfits    ''           0        'Flexible Image Transport System'
    {1x1 cell}    @isgif     @imgifinfo     @readgif     @writegif    0        'Graphics Interchange Format'
    {1x1 cell}    @ishdf     @imhdfinfo     @readhdf     @writehdf    0        'Hierarchical Data Format'
    {1x1 cell}    @isico     @imicoinfo     @readico     ''           1        'Windows Icon resources'
    {1x2 cell}    @isjp2     @imjp2info     @readjp2     @writej2c    0        'JPEG 2000 (raw codestream)'
    {1x1 cell}    @isjp2     @imjp2info     @readjp2     @writejp2    0        'JPEG 2000 (Part 1)'
    {1x2 cell}    @isjp2     @imjp2info     @readjp2     ''           0        'JPEG 2000 (Part 2)'
    {1x2 cell}    @isjpg     @imjpginfo     @readjpg     @writejpg    0        'Joint Photographic Experts Group'
    {1x1 cell}    @ispbm     @impnminfo     @readpnm     @writepnm    0        'Portable Bitmap'
    {1x1 cell}    @ispcx     @impcxinfo     @readpcx     @writepcx    0        'Windows Paintbrush'
    {1x1 cell}    @ispgm     @impnminfo     @readpnm     @writepnm    0        'Portable Graymap'
    {1x1 cell}    @ispng     @impnginfo     @readpng     @writepng    1        'Portable Network Graphics'
    {1x1 cell}    @ispnm     @impnminfo     @readpnm     @writepnm    0        'Portable Any Map'
    {1x1 cell}    @isppm     @impnminfo     @readpnm     @writepnm    0        'Portable Pixmap'
    {1x1 cell}    @isras     @imrasinfo     @readras     @writeras    1        'Sun Raster'
    {1x2 cell}    @istif     @imtifinfo     @readtif     @writetif    0        'Tagged Image File Format'
    {1x1 cell}    @isxwd     @imxwdinfo     @readxwd     @writexwd    0        'X Window Dump'
```

Now we've gotten to some interesting stuff! This table represents the guts of how `imread`, `imfinfo`, and `imwrite` know how to deal with the many different image file formats supported.

If you pass an extension to `imformats`, it looks through file formats it knows about to see if it matches a standard one.

```
imformats('jpg')
```

```
ans = 

            ext: {'jpg'  'jpeg'}
            isa: @isjpg
           info: @imjpginfo
           read: @readjpg
          write: @writejpg
          alpha: 0
    description: 'Joint Photographic Experts Group'
```

Let's go back to the original file peppers.jpg and consider what happens in the code we've seen so far.

1. We did not specify the format explicitly (with a 2nd argument) when we called `imread`, so `imread` called `imftype` to determine the format type.

2. The function `imftype` found an extension ('.jpg') at the end of the filename, so it asked the `imformats` function about the extension, and `imformats` returned a set of function handles useful for doing things with JPEG files. One of the function handles, `@isjpg`, tests to see whether a file is a JPEG file or not.

To be completely truthful, `@isjpg` just does a quick check based only on the first few bytes of the file. Look at the code by typing `edit private/isjpg`. Here are the key lines.

```
fid = fopen(filename, 'r', 'ieee-le');
assert(fid ~= -1, message('MATLAB:imagesci:validate:fileOpen', filename));
sig = fread(fid, 2, 'uint8');
fclose(fid);
tf = isequal(sig, [255; 216]);
```

*OK, I have now taught you enough so that you can thoroughly confuse imread if you really want to. But don't you already have enough hobbies?*

3. In this case, the file wasn't actually a JPEG file, so the function handle `@isjpg` returned 0 (false).

That brings us to the rest of the excitement in `imftype`.

```
% Get all formats from the registry.
fmt_s = imformats;

% Look through each of the possible formats.
for p = 1:length(fmt_s)
    % Call each ISA function until the format is found.
    if (~isempty(fmt_s(p).isa))
        tf = feval(fmt_s(p).isa, filename);
        if (tf)
            % The file is of that format.  Return the ext field.
            format = fmt_s(p).ext{1};
            fmt_s = fmt_s(p);
            return
        end
    else
        warning(message('MATLAB:imagesci:imftype:missingIsaFunction'));
    end
end
```

In English: For every image file format we know about, run the corresponding `isa` function handle on the file. If one of the `isa` functions returns true, then return the corresponding set of information from `imformats`.

Back to the story for our misnamed image file peppers.jpg. The `@isjpg` function handle returned false for it. So `imftype` then tried the `isa` function handles for every image file format. One of them, `@ispng`, returned 1 (true). That information was passed back up to `imread`, which then read the file successfully as a PNG, which was the file's true format type.

Finally, here's what happened for the image file peppers.fruit_study_2014_Jul_11. When `imftype` passed the extension `'fruit_study_2014_Jul_11'` to `imformats`, no such image format extension was found, so `imformats` returned empty. That caused `imftype` to go into the code that simply tried every image format it knew about, which again worked when it got to PNG.

Phew! That's the story of the effort `imread` makes to read your images in correctly.

For the three of you that are still reading along, I'll send a t-shirt to the first one to post a convincingly complete explanation of this line of code from above:

extension = lower(filename(idx(end)+1:end));


**The SolidWorks Assembly**

Let's start by creating an assembly of a double pendulum in SolidWorks. It consists of 2 identical links and 2 identical pins. These components are assembled by defining mates between them. For example, to create a revolute joint between two parts in SolidWorks, we can constrain their axes and make them coincident:

I recommend looking at the Mates and Joints documentation page for more details on how SolidWorks mates are translated into SimMechanics Joints. For our example, here is how the final assembly looks:

**Exporting the SolidWorks assembly to SimMechanics**

This process requires the use of SimMechanics Link. Once SimMechanics Link is downloaded successfully, you need to install and register it with SolidWorks.

*Note: SimMechanics Link can also be used to interface with PTC Creo and Autodesk Inventor.*

After registering SimMechanics Link, a new menu is available in SolidWorks. We can now export the assembly:

This generates an XML file for the assembly, as well as STL files for the geometry. The XML file contains the assembly structure and the part parameters required to generate an equivalent SimMechanics model, such as reference frames, mass and inertia, color, and the location of the part STL files.

**Create SimMechanics model**

Once the XML file is exported from SolidWorks, we can import it in Simulink using `smimport`:

`smimport('DoublePendulum.xml')`

and we have a model:

**Do what you came here for!**

At this point, you can combine any feature available in Simulink with your mechanism. For example, we can design a controller for the base joint using the automated PID tuning capability of Simulink Control Design.

I modify the base joint to measure its position and apply a torque, and connect the PID block:

After going through the auto tuning procedure, we can see the resulting motion:

**What if you’re not using SolidWorks?**

As mentioned previously, SimMechanics Link can also be used to interface with PTC Creo and Autodesk Inventor. If you are designing mechanical systems using a different CAD tool, I recommend looking at the MATLAB Central submission CAD to MATLAB to SimMechanics by my colleague Steve Miller. Using the technique in this submission, it might be possible to write your own adapter to import assemblies from other CAD software without too much work.

**Now it's your turn**

Here is one more animation from the Stewart Platform example, just to highlight that pretty complex assemblies can be imported.

Are you already importing CAD assemblies to SimMechanics? If yes, let us know what kind of analysis or design task you do with the assembly once in Simulink by leaving a comment here.

This is the first part of a two-part series about the single- and double precision floating point numbers that MATLAB uses for almost all of its arithmetic operations. (This post is adapted from section 1.7 of my book Numerical Computing with MATLAB, published by MathWorks and SIAM.)

If you look carefully at the definitions of fundamental arithmetic operations like addition and multiplication, you soon encounter the mathematical abstraction known as real numbers. But actual computation with real numbers is not very practical because it involves notions like limits and infinities. Instead, MATLAB and most other technical computing environments use floating point arithmetic, which involves a finite set of numbers with finite precision. This leads to the phenomena of *roundoff*, *underflow*, and *overflow*. Most of the time, it is possible to use MATLAB effectively without worrying about these details, but, every once in a while, it pays to know something about the properties and limitations of floating point numbers.
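A quick experiment in MATLAB shows roundoff in action, along with the built-in constants that mark the limits where overflow and underflow occur:

```
% Roundoff: 0.1, 0.2, and 0.3 are not exactly representable in binary,
% so the "obvious" identity fails.
0.1 + 0.2 == 0.3          % logical 0 (false)

% The spacing of double precision numbers near 1.
eps                        % 2.220446049250313e-16

% The largest and smallest normalized doubles.
realmax                    % about 1.7977e+308 -- beyond this, overflow
realmin                    % about 2.2251e-308 -- below this, underflow
```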

Thirty years ago, the situation was far more complicated than it is today. Each computer had its own floating point number system. Some were binary; some were decimal. There was even a Russian computer that used trinary arithmetic. Among the binary computers, some used 2 as the base; others used 8 or 16. And everybody had a different precision. In 1985, the IEEE Standards Board and the American National Standards Institute adopted the ANSI/IEEE Standard 754-1985 for Binary Floating-Point Arithmetic. This was the culmination of almost a decade of work by a 92-person working group of mathematicians, computer scientists, and engineers from universities, computer manufacturers, and microprocessor companies.

A revision of IEEE 754-1985, known as IEEE 754-2008, was published in 2008. The revision combines the binary standards in 754 and the decimal standards in another document, 854. But as far as I can tell, the new features introduced in the revision haven't had much impact yet.

William M. Kahan was the primary architect of IEEE floating point arithmetic. I've always known him by his nickname "Velvel". I met him in the early 1960's when I was a grad student at Stanford and he was a young faculty member at the University of Toronto. He moved from Toronto to UC Berkeley in the late 1960's and spent the rest of his career at Berkeley. He is now an emeritus professor of mathematics and of electrical engineering and computer science.

Velvel is an eminent numerical analyst. Among many other accomplishments, he is the developer along with Gene Golub of the first practical algorithm for computing the singular value decomposition.

In the late 1970's microprocessor manufacturers including Intel, Motorola, and Texas Instruments were developing the chips that were to become the basis for the personal computer and eventually today's electronics. In a remarkable case of cooperation among competitors, they established a committee to develop a common standard for floating point arithmetic.

Velvel was a consultant to Intel, working with John Palmer, a Stanford grad who was the principal architect of the 8087 math coprocessor. The 8087 accompanied the 8086, which was to become the CPU of the IBM PC. Kahan and Palmer convinced Intel to allow them to release the specs for their math coprocessor to the IEEE committee. These plans became the basis for the standard.

Velvel received the ACM Turing Award in 1989 for "his fundamental contributions to numerical analysis". He was named an ACM Fellow in 1994, and was inducted into the National Academy of Engineering in 2005.

All computers designed since 1985 use IEEE floating point arithmetic. This doesn't mean that they all get exactly the same results, because there is some flexibility within the standard. But it does mean that we now have a machine-independent model of how floating point arithmetic behaves.

MATLAB has traditionally used the IEEE double precision format. In recent years, the single precision format has also been available. This saves space and can sometimes be faster. There is also an extended precision format, which is not precisely specified by the standard and which may be used for intermediate results by the floating point hardware. This is one source of the lack of uniformity among different machines.

Most nonzero floating point numbers are normalized. This means they can be expressed as

$$ x = \pm (1 + f) \cdot 2^e $$

The quantity $f$ is the *fraction* or *mantissa* and $e$ is the *exponent*. The fraction satisfies

$$ 0 \leq f < 1 $$

The fraction $f$ must be representable in binary using at most 52 bits for double precision and 23 bits for single precision. In other words, for double, $2^{52} f$ is an integer in the interval

$$ 0 \leq 2^{52} f < 2^{52} $$

For single, $2^{23} f$ is an integer in the interval

$$ 0 \leq 2^{23} f < 2^{23} $$

For double precision, the exponent $e$ is an integer in the interval

$$ -1022 \leq e \leq 1023 $$

For single, the exponent $e$ is in the interval

$$ -126 \leq e \leq 127 $$

The finiteness of $f$ is a limitation on *precision*. The finiteness of $e$ is a limitation on *range*. Any numbers that don't meet these limitations must be approximated by ones that do.

Double precision floating point numbers are stored in a 64-bit word, with 52 bits for $f$, 11 bits for $e$, and 1 bit for the sign of the number. The sign of $e$ is accommodated by storing $e+1023$, which is between $1$ and $2^{11}-2$. The 2 extreme values for the exponent field, $0$ and $2^{11}-1$, are reserved for exceptional floating point numbers that I will describe later.

Single precision floating point numbers are stored in a 32-bit word, with 23 bits for $f$, 8 bits for $e$, and 1 bit for the sign of the number. The sign of $e$ is accommodated by storing $e+127$, which is between $1$ and $2^{8}-2$. Again, the 2 extreme values for the exponent field are reserved for exceptional floating point numbers.

The entire fractional part of a floating point number is not $f$, but $1+f$. However, the leading $1$ doesn't need to be stored. In effect, the IEEE formats pack $33$ or $65$ bits of information into a $32$ or $64$-bit word.
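To see these fields in an actual number, here is a sketch that uses `typecast` to reinterpret the 64-bit pattern of a double; the variable names are my own.

```
% Unpack the three fields of a double precision number.
x = 0.1;
b = typecast(x,'uint64');          % reinterpret the raw 64 bits
s = bitshift(b,-63)                % sign bit: 0 means positive
E = bitand(bitshift(b,-52),2047)   % stored exponent field, e + 1023
f = bitand(b,2^52-1)               % the 52-bit integer 2^52 * f
```

For `x = 0.1`, the stored exponent field `E` comes out to 1019, consistent with $e = -4$.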

My program `floatgui`, available here, shows the distribution of the positive numbers in a model floating point system with variable parameters. The parameter $t$ specifies the number of bits used to store $f$. In other words, $2^t f$ is an integer. The parameters $e_{min}$ and $e_{max}$ specify the range of the exponent, so $e_{min} \leq e \leq e_{max}$. Initially, `floatgui` sets $t = 3$, $e_{min} = -4$, and $e_{max} = 2$ and produces the following distribution.

Within each binary interval $2^e \leq x \leq 2^{e+1}$, the numbers are equally spaced with an increment of $2^{e-t}$. If $e = 0$ and $t = 3$, for example, the spacing of the numbers between $1$ and $2$ is $1/8$. As $e$ increases, the spacing increases.

It is also instructive to display the floating point numbers with a logarithmic scale. Here is the output from `floatgui` with `logscale` checked and $t = 5$, $e_{min} = -4$, and $e_{max} = 3$. With this logarithmic scale, it is more apparent that the distribution in each binary interval is the same.

A very important quantity associated with floating point arithmetic is highlighted in red by `floatgui`. MATLAB calls this quantity `eps`, which is short for *machine epsilon*.

`eps` is the distance from $1$ to the next larger floating point number.

For the `floatgui` model floating point system, `eps` = $2^{-t}$.

Before the IEEE standard, different machines had different values of `eps`. Now, for IEEE double precision the approximate decimal value of `eps` is

```
format short
format compact
eps_d = eps
```

eps_d = 2.2204e-16

And for single precision,

```
eps_s = eps('single')
```

eps_s = 1.1921e-07

Either `eps/2` or `eps` can be called the roundoff level. The maximum relative error incurred when the result of an arithmetic operation is rounded to the nearest floating point number is `eps/2`. The maximum relative spacing between numbers is `eps`. In either case, you can say that the roundoff level is about 16 decimal digits for double and about 7 decimal digits for single.

Actually `eps` is a function. `eps(x)` is the distance from `abs(x)` to the next larger in magnitude floating point number of the same precision as `x`.
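A quick check at the command line shows how the spacing grows with magnitude:

```
eps(1)          % 2^-52, the spacing of doubles between 1 and 2
eps(2)          % 2^-51; the spacing doubles at each power of 2
eps(single(1))  % 2^-23, the single precision spacing
```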

A frequently occurring example is provided by the simple MATLAB statement

t = 0.1

The mathematical value $t$ stored in `t` is not exactly $0.1$ because expressing the decimal fraction $1/10$ in binary requires an infinite series. In fact,

$$ \frac{1}{10} = \frac{1}{2^4} + \frac{1}{2^5} + \frac{0}{2^6} + \frac{0}{2^7} + \frac{1}{2^8} + \frac{1}{2^9} + \frac{0}{2^{10}} + \frac{0}{2^{11}} + \frac{1}{2^{12}} + \cdots $$

After the first term, the sequence of coefficients $1, 0, 0, 1$ is repeated infinitely often. Grouping the resulting terms together four at a time expresses $1/10$ in a base 16, or *hexadecimal*, series.

$$ \frac{1}{10} = 2^{-4} \cdot \left(1 + \frac{9}{16} + \frac{9}{16^2} + \frac{9}{16^3} + \frac{9}{16^{4}} + \cdots\right) $$

Floating-point numbers on either side of $1/10$ are obtained by terminating the fractional part of this series after $52$ binary terms, or 13 hexadecimal terms, and rounding the last term up or down. Thus

$$ t_1 < 1/10 < t_2 $$

where

$$ t_1 = 2^{-4} \cdot \left(1 + \frac{9}{16} + \frac{9}{16^2} + \frac{9}{16^3} + \cdots + \frac{9}{16^{12}} + \frac{9}{16^{13}}\right) $$

$$ t_2 = 2^{-4} \cdot \left(1 + \frac{9}{16} + \frac{9}{16^2} + \frac{9}{16^3} + \cdots + \frac{9}{16^{12}} + \frac{10}{16^{13}}\right) $$

It turns out that $1/10$ is closer to $t_2$ than to $t_1$, so $t$ is equal to $t_2$. In other words,

$$ t = (1 + f) \cdot 2^e $$

where

$$ f = \frac{9}{16} + \frac{9}{16^2} + \frac{9}{16^3} + \cdots + \frac{9}{16^{12}} + \frac{10}{16^{13}} $$

$$ e = -4 $$

The MATLAB command

```
format hex
```

causes `t` to be displayed as

3fb999999999999a

The characters `a` through `f` represent the hexadecimal "digits" 10 through 15. The first three characters, `3fb`, give the hexadecimal representation of decimal $1019$, which is the value of the biased exponent $e+1023$ if $e$ is $-4$. The other 13 characters are the hexadecimal representation of the fraction $f$.
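The functions `num2hex` and `hex2num` give the same bit-level view without changing the display format:

```
t = 0.1;
h = num2hex(t)     % '3fb999999999999a'
hex2num(h) == t    % true: the hex string captures t exactly
```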

In summary, the value stored in `t` is very close to, but not exactly equal to, $0.1$. The distinction is occasionally important. For example, the quantity

0.3/0.1

is not exactly equal to $3$ because the actual numerator is a little less than $0.3$ and the actual denominator is a little greater than $0.1$.
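You can verify this at the command line:

```
q = 0.3/0.1;
q == 3   % false
q - 3    % a tiny negative residual, on the order of eps
```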

Ten steps of length `t` are not precisely the same as one step of length 1. MATLAB is careful to arrange that the last element of the vector

0:0.1:1

is exactly equal to $1$, but if you form this vector yourself by repeated additions of $0.1$, you will miss hitting the final $1$ exactly.
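A quick experiment makes the point:

```
v = 0:0.1:1;
v(end) == 1      % true: the colon operator adjusts to hit the endpoint
x = 0;
for k = 1:10
   x = x + 0.1;  % accumulate ten steps of length t
end
x == 1           % false: the roundoff in t accumulates
```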

What does the floating point approximation to the golden ratio, $\phi$, look like?

```
format hex
phi = (1 + sqrt(5))/2
```

phi = 3ff9e3779b97f4a8

The first hex digit, `3`, is `0011` in binary. The first bit is the sign of the floating point number; `0` is positive, `1` is negative. So `phi` is positive. The remaining bits of the first three hex digits contain $e+1023$. In this example, `3ff` in base 16 is $3 \cdot 16^2 + 15 \cdot 16 + 15 = 1023$ in decimal. So

$$ e = 0 $$

In fact, any floating point number between $1.0$ and $2.0$ has $e = 0$, so its `hex` output begins with `3ff`. The other 13 hex digits contain $f$. In this example,

$$ f = \frac{9}{16} + \frac{14}{16^2} + \frac{3}{16^3} + \cdots + \frac{10}{16^{12}} + \frac{8}{16^{13}} $$

With these values of $f$ and $e$,

$$ (1 + f) 2^e \approx \phi $$

Another example is provided by the following code segment.

```
format long
a = 4/3
b = a - 1
c = 3*b
e = 1 - c
```

a = 1.333333333333333
b = 0.333333333333333
c = 1.000000000000000
e = 2.220446049250313e-16

With exact computation, `e` would be zero. But with floating point, we obtain double precision `eps`. Why?

It turns out that the only roundoff occurs in the division in the first statement. The quotient cannot be exactly $4/3$, except on that Russian trinary computer. Consequently the value stored in `a` is close to, but not exactly equal to, $4/3$. The subtraction `b = a - 1` produces a quantity `b` whose last bit is 0. This means that the multiplication `3*b` can be done without any roundoff. The value stored in `c` is not exactly equal to $1$, and so the value stored in `e` is not 0. Before the IEEE standard, this code was used as a quick way to estimate the roundoff level on various computers.

The roundoff level `eps` is sometimes called "floating point zero," but that's a misnomer. There are many floating point numbers much smaller than `eps`. The smallest positive normalized floating point number has $f = 0$ and $e = -1022$. The largest floating point number has $f$ a little less than $1$ and $e = 1023$. MATLAB calls these numbers `realmin` and `realmax`. Together with `eps`, they characterize the standard system.

Double precision:

```
           Binary            Decimal
eps        2^(-52)           2.2204e-16
realmin    2^(-1022)         2.2251e-308
realmax    (2-eps)*2^1023    1.7977e+308
```

Single precision:

```
           Binary            Decimal
eps        2^(-23)           1.1921e-07
realmin    2^(-126)          1.1755e-38
realmax    (2-eps)*2^127     3.4028e+38
```

If any computation tries to produce a value larger than `realmax`, it is said to *overflow*. The result is an exceptional floating point value called *infinity* or `Inf`. It is represented by taking $e = 1024$ and $f = 0$ and satisfies relations like `1/Inf = 0` and `Inf+Inf = Inf`.

If any computation tries to produce a value that is undefined even in the real number system, the result is an exceptional value known as Not-a-Number, or `NaN`. Examples include `0/0` and `Inf-Inf`. `NaN` is represented by taking $e = 1024$ and $f$ nonzero.
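These exceptional values are easy to generate at the command line:

```
2*realmax   % Inf, from overflow
1/Inf       % 0
Inf - Inf   % NaN
0/0         % NaN
```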

If any computation tries to produce a value smaller than `realmin`, it is said to *underflow*. This involves one of the most controversial aspects of the IEEE standard. It produces exceptional *denormal* or *subnormal* floating point numbers in the interval between `realmin` and `eps*realmin`. The smallest positive double precision subnormal number is

```
format short
smallest_d = eps*realmin
smallest_s = eps('single')*realmin('single')
```

smallest_d = 4.9407e-324
smallest_s = 1.4013e-45

Any results smaller than this are set to 0.
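To see the underflow boundary in action:

```
smallest_d = eps*realmin   % smallest positive subnormal double
realmin/2                  % smaller than realmin, but nonzero: a subnormal
smallest_d/2               % underflows to 0
```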

The controversy surrounding underflow and denormals will be the subject of my next blog post.

Get
the MATLAB code

Published with MATLAB® R2014a

I teach the MATLAB 101 class to every new hire at MathWorks. Inevitably, someone will ask me how to make a global variable. I then Google up: “How to make MATLAB Doug Cry” and find the above post. I think this is a post I should bring up every now and then because these things should be avoided today just like they should have been when I first wrote it.

I will be back from vacation soon with fresh content.

Thanks,
Doug

I have been posting blogs about MATLAB with code examples for many years. Steve Eddins, my fellow blogger of Steve on Image Processing fame, developed and maintains an internal tool that automates a lot of tasks and I rely on it to publish my blog posts.

One of my guest bloggers, Toshi Takeuchi, showed me a new tool he found, and he would like to do a quick introduction.

Hi, it's Toshi again. If you use MATLAB for fun like me, you would probably like to publish your work to share with other people. I do web analysis in marketing and I can tell you there are many people like you out there, based on the increasing search volume for personal use of MATLAB.

I have been writing to our internal WordPress blog service, and most of my posts involve MATLAB code examples. I would first do my analysis in MATLAB, then add commentary to explain what I did, and then publish it to HTML. After that, I had to manually copy and paste the published text and code to WordPress – and the process is tolerable but isn’t particularly easy.

I recently started contributing, as a guest blogger, some of my personal posts for Loren who maintains her public-facing blog. The other day I had a chance to witness how she posts her blogs. While her process was much more efficient than mine, I started wondering if we could do even better.

I suspect that this may be a widely shared pain. So I dug around a bit, and I found this wonderful MATLAB module called matlab-wordpress by John Kitchin on GitHub.

Installation was very straightforward. I just downloaded the zip file, added the extracted folder to the MATLAB search path, and added the jar files to `javaclasspath`, as described in `README.md`.

Next, enable XML-RPC (`Enable the WordPress, Movable Type, MetaWeblog and Blogger XML-RPC publishing protocols`) in your admin console. This is under:

Settings -> Writing -> Remote Publishing

You can avoid entering your username and password for your WordPress account if you set up `blogCredentials.m` as described in `README.md`.

```
user = 'your blog username';
password = 'your blog password';
server = 'http://your-blog/xmlrpc.php';
```

If you include an image or generate a plot in your MATLAB code, the image will be automatically uploaded to the Media Library. Sweet!

```
figure
hist(randn(1000,1),20)
title('Normal Distribution with \mu=0, \sigma=1')
```

Let's say you want to link to a `.mat` file used in your post, and the file path should be defined relative to the location of the MATLAB file you are publishing. If it is in the same location as your MATLAB file, you can link to it as follows:

"You will need to :download:`sample.mat` for this example."

This then translates to:

"You will need to download sample.mat for this example."

The file will be automatically uploaded to the same directory that holds all your images and the URL will be updated to reflect the new file location. This alone will save tons of hassle. Unfortunately this may not work if your WordPress server is not configured to accept nonmedia files like `.mat` files.

You can add a link using MATLAB Markup syntax, but `matlab-wordpress` also gives you an ability to link to another post on your blog. Since Loren doesn't have this MATLAB module, you won't see the effects of these next sections in this post.

"In :postid:`920` we discussed how to analyze Twitter with MATLAB..."

You can add a tooltip like this.

:tooltip:`<this is the tooltip.> Hover on this`

You can also add color to text to draw attention to a note or warning.

:note:`This is a note in light blue.` :warning:`This is a warning in pink.`

You can add categories and tags to your post if you include the following lines in your code. Currently it appears you can only add one of each, for some reason.

```
% categories: Blog
% tags: MATLAB
```

Now it's time to publish your MATLAB file to WordPress. If your MATLAB file is `myFile.m`, then you enter the following command in the command window.

blogpost myFile.m

Perhaps you want to publish locally first as a dry run. If so, instead do:

blogpost myFile.m true % set the `dryrun` parameter to `true`

When you publish your post, your post ID and permalink are automatically added to the MATLAB file. If you update your file, you can update your blog post by simply typing `blogpost myFile.m` again. If you remove your post ID from your MATLAB file, that will result in a new post.

Because I happen to use WP-Syntax plugin on my blog for syntax highlighting, I have to remove the CSS tags generated by `publish` from MATLAB and make other adjustments related to that. If you can use the MATLAB generated CSS as is, then my guess is that you don't have to do any additional tweaks.

Even though I still had to do some cleaning, the overall process took a lot less time than the manual process I use. Mileage may vary depending on your settings, but I think this is going to be useful for many people. Let us know how it works for you here.

Get
the MATLAB code

Published with MATLAB® R2014a

The nineteenth Householder Symposium, Householder XIX, was held June 8-13 at Sol Cress, a conference center near Spa, Belgium. If you have been following either the web or the newsletter edition of Cleve's Corner you know that the Gatlinburg/Householder series of conferences has played an important role in both my professional life and the history of MATLAB. I attended what turned out to be the third conference in the series, in Gatlinburg, Tennessee, when I was a graduate student in 1964. I have been to all 17 of the conferences that have been held since 1964. Here is a link to my News and Notes article about the Gatlinburg/Householder conferences.

The Householder Symposium has been organized over the years by an evolving volunteer committee. The symposium is not affiliated with any professional society. The organizing committee for Householder XIX was:

- Ilse Ipsen (chair), North Carolina State University, USA
- Jim Demmel, University of California at Berkeley, USA
- Alan Edelman, Massachusetts Institute of Technology, USA
- Heike Fassbender, Technical University of Braunschweig, Germany
- Volker Mehrmann, Technical University of Berlin
- Jim Nagy, Emory University, USA
- Yousef Saad, University of Minnesota, USA
- Valeria Simoncini, University of Bologna, Italy
- Zdenek Strakos, Charles University in Prague, Czech Republic
- Andy Wathen, Oxford University, UK

The local organizers in Belgium were:

- Paul Van Dooren (chair), Universite catholique de Louvain
- Pierre-Antoine Absil, Universite catholique de Louvain
- Karl Meerbergen, Katholieke Universiteit Leuven
- Annick Sartenaer, Universite de Namur
- Marc Van Barel, Katholieke Universiteit Leuven
- Sabine Van Huffel, Katholieke Universiteit Leuven

In order to facilitate interaction, attendance is limited. The organizing committee accepts applications and evaluates resumes. This time only about half of the 300 applicants were selected. This has been a controversial aspect of the conference, but I am in favor of limiting the number of participants.

A little over 100 of the attendees made it to the group photo. A few are wearing neckties and their best dresses in the photo because the conference banquet was scheduled shortly thereafter.

A list of the attendees is available on the symposium web site.

Everybody who attended the conference had an opportunity to give a talk or present a poster during the five day conference. There were a total of 26 half-hour plenary talks. Each day there were two sets of three parallel sessions of less formal 20 minute talks. A couple of days had late afternoon or evening poster sessions.

The complete program is available here. The overall theme is numerical linear algebra. Applications include partial differential equations, image processing, control theory, graph theory, model reduction, and gyroscopes.

The Householder Prize is given for the best PhD dissertation written during the three-year period preceding the meeting. A list of the winners of the prize in previous years is available here. The committee judging the dissertations for the prize this time was:

- Volker Mehrmann, Technical University of Berlin, Germany (chair)
- Michele Benzi, Emory University, USA
- Inderjit Dhillon, UT Austin, USA
- Howard Elman, University of Maryland, USA
- Francoise Tisseur, University of Manchester, UK
- Stephen Vavasis, University of Waterloo, Canada

Volker reported that 17 theses from 9 countries were submitted. From these, a short list of six finalists was selected:

- Grey Ballard, UC Berkeley
- Saifon Chaturantabut, Rice University
- Cedric Effenberger, EPF Lausanne
- Nicolas Gillis, UC Louvain
- Agnieszka Miedlar, TU Berlin
- Yuji Nakatsukasa, UC Davis

From this short list, the two winners of the 2014 Householder Prize were announced:

- Nicolas Gillis, UC Louvain
- Yuji Nakatsukasa, UC Davis

Nicolas is now an Assistant Professor at Universite de Mons in Belgium and Yuji is an Assistant Professor at the University of Tokyo.

The two winners gave talks about their thesis work in the last days of the meeting. They shared prize money of a little over $2000 obtained by passing a hat at the banquet dinner during Householder XVIII three years ago in Lake Tahoe. A hat was passed again at this meeting to fund the next Householder Prize. The location for the next Householder meeting has not yet been decided.

Traditionally the Wednesday afternoon of the Householder meeting is devoted to an excursion of some kind. The XIX excursion was to "Notre Dame du Val-Dieu", an abbey founded in 1216 by Cistercian monks. There are no longer any monks living at the abbey -- it is now best known as a brewery. Their beer is Val-Dieu. We toured the abbey and the tiny brewery, which is operated by just five people. We then had the opportunity to taste three different beers and their wonderful cheese. Belgian abbey beers have nine percent alcohol. It was a very pleasant afternoon.

Get
the MATLAB code

Published with MATLAB® R2014a

Today I’d like to introduce a fairly frequent guest blogger Sarah Wait Zaranek who works for the MATLAB Marketing team here at The MathWorks. She and I will be writing about the new capabilities for the webcam in R2014a.

- Webcam is Available
- Installing the Support Package
- Listing Webcams and Previewing
- Taking a Single Image
- Taking Images Within a Loop
- Taking Images Within a Loop and Saving to an AVI-file
- Set up video writer
- Grab and process frames
- Taking Images Within a Loop and Creating an Animated GIF
- Here's the movie we just made
- Tidy up
- Additional Camera Support
- Do You Have a Project that Uses a Webcam?

In R2014a, you can bring live images from webcams into MATLAB.

Webcam support is available through a hardware support package. Hardware support packages have existed in previous releases for Simulink, but now they are available for MATLAB, too.

You can find the support package installer in the resources section of the Home tab of the Toolstrip. From there, choose Install from Internet and select USB Webcams.

And install your webcam support package.

Now that the webcam support is installed, let's get started using our webcam.

You can see a list of available webcams

webcamlist

ans =
    'Microsoft LifeCam Cinema'
    'Integrated Camera'

You can see that Loren has two different webcams. We use the `webcam` function to select which camera to use, passing either the camera name or its index in the `webcamlist` output.

```
mycam = webcam('Microsoft LifeCam Cinema')
```

mycam =
  webcam with properties:

                     Name: 'Microsoft LifeCam Cinema'
               Resolution: '640x480'
     AvailableResolutions: {1x12 cell}
               Brightness: 133
    BacklightCompensation: 0
         WhiteBalanceMode: 'auto'
               Saturation: 83
                     Zoom: 0
                      Pan: 0
                FocusMode: 'auto'
                Sharpness: 25
             WhiteBalance: 4500
             ExposureMode: 'auto'
                     Tilt: 0
                    Focus: 0
                 Contrast: 5
                 Exposure: -6

If you only have a single webcam available, that webcam will be used by default. You can use `preview` to check on the webcam view.

preview(mycam)

```
pause(10)
snapnow
closePreview(mycam)
```

You can experiment and set any properties that you may want to change. For example, you might want to change the resolution or the brightness.

mycam.Brightness = 200;

You can acquire a single live image from your webcam.

```
img = snapshot(mycam);
imagesc(img)
% You can see me and Loren hanging out in her office!
```

You can set up a loop to acquire many images – and can process each frame within the loop. For example, we can reduce the number of distinct colors used in the image.

Grab and process frames

```
frames = 50;
for i = 1:frames
    % Acquire frame for processing
    img = snapshot(mycam);
    % Quantize image by thresholding
    idx = img > 60 & img < 170;
    img(idx) = 255;
    img(~idx) = 0;
    % Display frame
    imagesc(img); axis image; axis off;
end
```

Additionally, you can create a video file of the processed frames by using `VideoWriter`.

```
mywriter = VideoWriter('mymovie.avi');
open(mywriter);
```

```
frames = 50;
for ii = 1:frames
    % Acquire frame for processing
    img = snapshot(mycam);
    % Quantize image by thresholding
    idx = img > 60 & img < 170;
    img(idx) = 255;
    img(~idx) = 0;
    % Display frame
    imagesc(img); axis image; axis off;
    % Write frame to video
    writeVideo(mywriter,img);
end
close(mywriter)
```

We're going to do ALMOST the same thing we've just done, but output an animated GIF file instead.

Grab and process frames

```
frames = 50;
filename = 'mymovie.gif';
for i = 1:frames
    % Acquire frame for processing
    img = snapshot(mycam);
    % Quantize image by thresholding
    idx = img > 60 & img < 170;
    img(idx) = 255;
    img(~idx) = 0;
    % Display frame
    imagesc(img); axis image; axis off;
    [img, cm] = rgb2ind(img, 256);
    if i == 1
        imwrite(img,cm,filename,'gif','Loopcount',inf);
    else
        imwrite(img,cm,filename,'gif','WriteMode','append');
    end
end
```

delete(mycam)

Support for high-end scientific and industrial cameras, and for more advanced features such as triggering and data logging, can be found in the Image Acquisition Toolbox.

Do you have a project, for school, work, or fun, where you want to use a webcam with MATLAB? Tell us about it here.

Get
the MATLAB code

Published with MATLAB® R2014a

Stiffness is a subtle concept that plays an important role in assessing the effectiveness of numerical methods for ordinary differential equations. (This article is adapted from section 7.9, "Stiffness", in *Numerical Computing with MATLAB*.)

Stiffness is a subtle, difficult, and important concept in the numerical solution of ordinary differential equations. It depends on the differential equation, the initial conditions, and the numerical method. Dictionary definitions of the word "stiff" involve terms like "not easily bent," "rigid," and "stubborn." We are concerned with a computational version of these properties.

A problem is stiff if the solution being sought varies slowly, but there are nearby solutions that vary rapidly, so the numerical method must take small steps to obtain satisfactory results.

Stiffness is an efficiency issue. If we weren't concerned with how much time a computation takes, we wouldn't be concerned about stiffness. Nonstiff methods can solve stiff problems; they just take a long time to do it.

Stiff ode solvers do more work per step, but take bigger steps. And we're not talking about modest differences here. For truly stiff problems, a stiff solver can be orders of magnitude more efficient, while still achieving a given accuracy.

A model of flame propagation provides an example. I learned about this example from Larry Shampine, one of the authors of the MATLAB ordinary differential equation suite. If you light a match, the ball of flame grows rapidly until it reaches a critical size. Then it remains at that size because the amount of oxygen being consumed by the combustion in the interior of the ball balances the amount available through the surface. The simple model is

$$ \dot{y} = y^2 - y^3 $$

$$ y(0) = \delta $$

$$ 0 \leq t \leq \frac{2}{\delta} $$

The scalar variable $y(t)$ represents the radius of the ball. The $y^2$ and $y^3$ terms come from the surface area and the volume. The critical parameter is the initial radius, $\delta$, which is "small." We seek the solution over a length of time that is inversely proportional to $\delta$. You can see a dramatization by downloading `flame.m` from <http://www.mathworks.com/moler/ncmfilelist.html>.

At this point, I suggest that you start up MATLAB and actually run our examples. It is worthwhile to see them in action. I will start with `ode45`, the workhorse of the MATLAB ode suite. If $\delta$ is not very small, the problem is not very stiff. Try $\delta$ = 0.02 and request a relative error of $10^{-4}$.

```
delta = 0.02;
F = @(t,y) y^2 - y^3;
opts = odeset('RelTol',1.e-4);
ode45(F,[0 2/delta],delta,opts);
```

With no output arguments, `ode45` automatically plots the solution as it is computed. You should get a plot of a solution that starts at $y$ = 0.02, grows at a modestly increasing rate until $t$ approaches 50, which is $1/\delta$, then grows rapidly until it reaches a value close to 1, where it remains.

Now let's see stiffness in action. Decrease $\delta$ by more than two orders of magnitude. (If you run only one example, run this one.)

```
delta = 0.0001;
ode45(F,[0 2/delta],delta,opts);
```

You should see something like this plot, although it will take several seconds to complete the plot. If you get tired of watching the agonizing progress, click the stop button in the lower left corner of the window. Turn on zoom, and use the mouse to explore the solution near where it first approaches steady state. You should see something like the detail obtained with this `axis` command.

```
axis([.995e4 1.03e4 0.9998 1.0002])
last_plot = getframe(gcf);
```

Notice that `ode45` is doing its job. It's keeping the solution within $10^{-4}$ of its nearly constant steady state value. But it certainly has to work hard to do it. If you want an even more dramatic demonstration of stiffness, decrease $\delta$ or decrease the tolerance to $10^{-5}$ or $10^{-6}$.

This problem is not stiff initially. It only becomes stiff as the solution approaches steady state. This is because the steady state solution is so "rigid." Any solution near $y(t) = 1$ increases or decreases rapidly toward that solution. (I should point out that "rapidly" here is with respect to an unusually long time scale.)

What can be done about stiff problems? You don't want to change the differential equation or the initial conditions, so you have to change the numerical method. Methods intended to solve stiff problems efficiently do more work per step, but can take much bigger steps. Stiff methods are *implicit*. At each step they use MATLAB matrix operations to solve a system of simultaneous linear equations that helps predict the evolution of the solution. For our flame example, the matrix is only 1 by 1, but even here, stiff methods do more work per step than nonstiff methods.

Let's compute the solution to our flame example again, this time with `ode23s`. The " *s* " in the name is for "stiff."

```
ode23s(F,[0 2/delta],delta,opts);
```

Here is the zoom detail.

```
axis([.995e4 1.03e4 0.9998 1.0002])
```

You can see that `ode23s` takes many fewer steps than `ode45`. This is actually an easy problem for a stiff solver. In fact, `ode23s` takes only 99 steps and uses just 412 function evaluations, while `ode45` takes 3040 steps and uses 20179 function evaluations. Stiffness even affects graphical output. The print files for the `ode45` figures are much larger than those for the `ode23s` figures.

Imagine you are returning from a hike in the mountains. You are in a narrow canyon with steep slopes on either side. An explicit algorithm would sample the local gradient to find the descent direction. But following the gradient on either side of the trail will send you bouncing back and forth across the canyon, as with `ode45`. You will eventually get home, but it will be long after dark before you arrive. An implicit algorithm would have you keep your eyes on the trail and anticipate where each step is taking you. It is well worth the extra concentration.

All numerical methods for stiff odes are *implicit*. The simplest example is the *backward Euler* method, which involves finding the unknown $v$ in

$$ v = y_n + h f(t_{n+1},v) $$

and then setting $y_{n+1}$ equal to that $v$. This is usually a nonlinear system of equations whose solution requires at least an approximation to the Jacobian, the matrix of partial derivatives

$$ J = {\partial F \over \partial y} $$

By default the partial derivatives in the Jacobian are computed by finite differences. This can be quite costly in terms of function evaluations. If a procedure for computing the Jacobian is available, it can be provided. Or, if the sparsity pattern is known, it can be specified. The blocks in a Simulink diagram, for example, are only sparsely connected to each other. Specifying a sparse Jacobian initiates sparse linear equation solving.
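As a minimal sketch of these options, here is how a Jacobian function or a sparsity pattern might be passed to a stiff solver through `odeset`. The 2-by-2 system below is hypothetical, chosen only to illustrate the mechanism.

```matlab
% Hypothetical stiff 2-by-2 linear system, just to illustrate the options.
F2 = @(t,y) [y(2); -y(1) - 1000*y(2)];
Jfun = @(t,y) [0 1; -1 -1000];          % analytic Jacobian, df/dy

% Supply the Jacobian as a function handle ...
optsJ = odeset('RelTol',1.e-4,'Jacobian',Jfun);
[t,y] = ode15s(F2,[0 1],[1; 0],optsJ);

% ... or, for large problems, supply just the sparsity pattern,
% which initiates sparse finite differences and sparse solves.
S = sparse(ones(2));
optsS = odeset('RelTol',1.e-4,'JPattern',S);
[t,y] = ode15s(F2,[0 1],[1; 0],optsS);
```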

Newton's method for computing the $v$ in the backward Euler method is an iteration. Start perhaps with $v^0 = y_n$. Then, for $k = 0,1,...$, solve the linear system of equations

$$ (I - hJ)\, u^k = y_n + h f(t_{n+1},v^k) - v^k $$

for the correction $u^k$ . Set

$$ v^{k+1} = v^k + u^k $$

When you are satisfied that the $v^k$ have converged, let $y_{n+1}$ be the limit.
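The iteration above can be sketched in a few lines of MATLAB for our scalar flame equation, where the Jacobian is just $f'(y) = 2y - 3y^2$. The values of `h`, `tn`, and `yn` below are hypothetical; a real solver would also monitor convergence failures and adjust the step size.

```matlab
% One backward Euler step for the flame equation, via Newton's method.
f = @(t,y) y^2 - y^3;
J = @(t,y) 2*y - 3*y^2;        % df/dy, a 1-by-1 Jacobian

h = 10; tn = 0; yn = 0.01;     % hypothetical step data
v = yn;                        % start the iteration at v^0 = y_n
for k = 1:10
    residual = yn + h*f(tn+h,v) - v;
    u = (1 - h*J(tn+h,v)) \ residual;   % solve (I - hJ) u = residual
    v = v + u;                          % Newton correction
    if abs(u) < 1e-12, break, end
end
ynp1 = v;                      % y_{n+1} is the converged v
```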

Stiff ODE solvers may not use Newton's method itself, but whatever method is used to find the solution, $y_{n+1}$, at the forward time step can ultimately be traced back to Newton's method.

`ode15s` employs two variants of a method that is quite different from the single step methods that I've described so far in this series on ode solvers. Linear multistep methods save solution values from several time steps and use all of them to advance to the next step.

Actually, `ode15s` can be compared to the other multistep method in the suite, `ode113`. One saves values of the solution, $y_n$, while the other saves values of the function, $F(t_n,y_n)$. One includes the unknown value at the new time step, $y_{n+1}$, in the formulation, thereby making it implicit, while the other does not. Both methods can vary the order as well as the step size. As their names indicate, `ode15s` allows the order to vary between 1 and 5, while `ode113` allows the order to vary between 1 and 13.

A property specified via `odeset` switches `ode15s` between two variants of a linear multistep method, BDF, Backward Differentiation Formulas, and NDF, Numerical Differentiation Formulas. BDFs were introduced for stiff odes in 1971 by Bill Gear. Gear's student, Linda Petzold, extended the ideas to DAEs, differential-algebraic equations, and produced DASSL, software whose successors are still in widespread use today. NDFs, which are the default for `ode15s`, include an additional term in the memory and consequently can take larger steps with the same accuracy, especially at lower order.

`ode23s` is a single step, implicit method described in the paper by Shampine and Reichelt referenced below. It uses a second order formula to advance the step and a third order formula to estimate the error. It recomputes the Jacobian with each step, thereby making it quite expensive in terms of function evaluations. If you can supply an analytic Jacobian then `ode23s` is a competitive choice.

`ode23t` and `ode23tb` are implicit methods based on the trapezoid rule and the second order BDF. The origins of the methods go back to the 1985 paper referenced below by a group at the old Bell Labs working on electronic device and circuit simulation. Mike Hosea and Larry Shampine made extensive modifications and improvements described in their 1996 paper when they implemented the methods in MATLAB.

Stiff ODE solvers are not actually using MATLAB's iconic backslash operator on a full system of linear equations, but they are using its component parts, LU decomposition and solution of the resulting triangular systems.

Let's look at the statistics generated by `ode23s` when it solves the flame problem. We'll run it again, avoiding the plot by asking for output, but then ignoring the output, and just looking at the stats.

```
opts = odeset('stats','on','reltol',1.e-4);
[~,~] = ode23s(F,[0 2/delta],delta,opts);
```

```
99 successful steps
7 failed attempts
412 function evaluations
99 partial derivatives
106 LU decompositions
318 solutions of linear systems
```

We see that at every step `ode23s` is computing a Jacobian, finding the LU decomposition of a matrix involving that Jacobian, and then using L and U to solve three linear systems.

Now how about the primary stiff solver, `ode15s`.

```
[~,~] = ode15s(F,[0 2/delta],delta,opts);
```

```
140 successful steps
39 failed attempts
347 function evaluations
2 partial derivatives
53 LU decompositions
342 solutions of linear systems
```

We see that `ode15s` takes more steps than `ode23s`, but requires only two Jacobians. It does only half as many LU decompositions, but then uses each LU decomposition for twice as many linear equation solutions.

We certainly can't draw any conclusions about the relative merits of these two solvers from this one example, especially since the Jacobian in this case is only a 1-by-1 matrix.

Cleve Moler, *Numerical Computing with MATLAB*, Electronic Edition, MathWorks, <http://www.mathworks.com/moler/index_ncm.html>; Print Edition, SIAM Revised Reprint, SIAM, 2008, 334 pp., <http://www.ec-securehost.com/SIAM/ot87.html>.

Lawrence F. Shampine and Mark W. Reichelt, "The MATLAB ODE Suite", SIAM Journal on Scientific Computing, 18 (1997), pp. 1-22, <http://www.mathworks.com/help/pdf_doc/otherdocs/ode_suite.pdf>.

Lawrence F. Shampine, Mark W. Reichelt, and Jacek A. Kierzenka, "Solving Index-1 DAEs in MATLAB and Simulink", SIAM Review, 41 (1999), pp. 538-552. <http://epubs.siam.org/doi/abs/10.1137/S003614459933425X>

M. E. Hosea and L. F. Shampine, "Analysis and Implementation of TR-BDF2", Applied Numerical Mathematics, 20 (1996), pp. 21-37, <http://www.sciencedirect.com/science/article/pii/0168927495001158>.

R. E. Bank, W. M. Coughran, Jr., W. Fichtner, E. H. Grosse, D. J. Rose, and R. K. Smith, "Transient simulation of silicon devices and circuits", IEEE Transactions on Computer-Aided Design, CAD-4 (1985), no. 4, pp. 436-451.

I want to repeat this plot because it represents the post on the lead-in page at MATLAB Central.

```
imshow(last_plot.cdata,'border','tight')
```

Get the MATLAB code

Published with MATLAB® R2014a

*by Gareth Thomas*

Today I want to talk about sharing tables in reports (a table is a relatively new data type in MATLAB suitable for column-oriented or tabular data). In the current release of MATLAB (R2014a) it can be a little tricky to represent a table variable in an HTML document. So I’m going to show you how to take the reporting capabilities that MATLAB currently offers, combine them with some HTML 5 and JavaScript, and end up with a beautiful report that functions nicely as a standalone document.

Today there are basically two different ways to create HTML reports from MATLAB.

- MATLAB’s publish command
- The MATLAB Report Generator

The publish command ships with MATLAB, while the Report Generator is a product dedicated to the creation of sophisticated reports from MATLAB. Beyond this, the key difference between the two methods really comes down to when you want to deploy your algorithm. That is, do you want the report to update itself in real time? If you want to create a standalone executable, you can really only use the Report Generator.

I'm going to focus on MATLAB's publish command. Suppose you want to pass a lot of MATLAB-generated HTML into your published document. It turns out that you can use the disp command to add that HTML code to the generated document. However, it is not easy to pass arguments, because the HTML is hard-coded at publish time. To get around this I created a function (makeHTMLTableFilter) that generates HTML code that, when published, is dynamic and responsive. It achieves this by adding JavaScript code, which allows you to do things like filter the table interactively.
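To make the disp mechanism concrete, here is a minimal sketch: it builds an HTML table by hand from a MATLAB table and disps it. The data and markup are hypothetical, and it relies on publish treating output wrapped in `<html>` tags as raw markup rather than escaped text.

```matlab
%% Embedding HTML in a published report
% Build an HTML table from a MATLAB table and disp it; in published
% output, text wrapped in <html> tags is passed through as raw markup.
T = table([1;2],{'alpha';'beta'},'VariableNames',{'Id','Name'});

html = '<table border="1"><tr><th>Id</th><th>Name</th></tr>';
for k = 1:height(T)
    html = [html sprintf('<tr><td>%d</td><td>%s</td></tr>', ...
                         T.Id(k), T.Name{k})]; %#ok<AGROW>
end
html = [html '</table>'];
disp(['<html>' html '</html>'])
```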

See the example below:

This is a very nice way of sharing your results in the form of a table while allowing others to filter the fields they wish to see.

One question that I get is, “Okay, but how would I deploy this?” As it happens, this idea of using

`disp(makeHTMLTableFilter(mytable))`

is also supported in the MATLAB Report Generator.

So download makeHTMLTableFilter and give it a try. You’ll soon be making interactive reports of your own.

Whatever your opinion of social media these days, there is no denying it is now an integral part of our digital life. So much so, that social media metrics are now considered part of altmetrics, an alternative to the established metrics such as citations to measure the impact of scientific papers.

Today's guest blogger, Toshi, will show you how to access the Twitter API and analyze tweets with MATLAB.

- Why Twitter
- Sentiment Analysis
- Tweet Content Visualization
- Who Tweeted the News?
- Does Follower Count Really Matter? Going Viral on Twitter
- Visualizing the Retweet Social Graph
- Getting Started with Twitter using Twitty
- Processing Tweets and Scoring Sentiments
- Processing Tweets for Content Visualization
- Get the Profile of Top 5 Users
- Streaming API for High Volume Real Time Tweets
- Save an Edge List for Social Graph Visualization
- Closing

Twitter is a good starting point for social media analysis because people openly share their opinions to the general public. This is very different from Facebook, where social interactions are often private. In this post, I would like to share simple examples of sentiment analysis and social graph visualization using Twitter's Search and Streaming APIs.

The first part of this post discusses analysis with Twitter, and the latter part shows the code that computes and creates plots, like those shown earlier.

One of the very common analyses you can perform on a large number of tweets is sentiment analysis. Sentiment is scored based on the words contained in a tweet. If you manage a brand or political campaign, for example, it may be important to keep track of your popularity, and sentiment analysis provides a convenient way to take the pulse of the tweeting public. Here is an example of sentiment analysis between Amazon and Hachette as of this writing, based on 100 tweets collected via the Twitter Search API.

The sentiment distributions are nearly identical between the two brands, but you can see that tweets mentioning both are clearly skewed toward the negative, since the news is about a war between Amazon and a publisher over ebook profit margins. Is there a single metric we can use to make this comparison easier? That's where *Net Sentiment Rate (NSR)* comes in.

$$ \mbox{NSR} = {\mbox{Positive Tweets} - \mbox{Negative Tweets} \over \mbox{Total Tweets}} $$

Here is the result. You could keep taking this measurement periodically for ongoing sentiment monitoring, if interested. Perhaps you may discover that NSR is correlated to their stock prices!

```
Amazon NSR  : 0.84
Hachette NSR: 0.58
Both NSR    : -0.30
```

And last, but not least, did sentiment scoring actually work? Check out the top 5 positive and negative tweets for Hachette and make your own assessment.

```
Top 5 positive tweets
___________________________________________________________________________
'@deckchairs @OccupyMyCat @aworkinglibrary but I think Hachette artists...'
'@emzleb Hachette has Rowling so they hold a lot of cards (A LOT of car...'
'Amazon Confirms Hachette Spat Is To "Get a Better Deal" http://t.co/Ka...'
'@shaunduke @DarkMatterzine Yeah, Gollancz is owned by Orion Publishing...'
'MUST READ Book publisher Hachette says working to resolve Amazon dispu...'

Top 5 negative tweets
___________________________________________________________________________
'Reading into the Amazon vs. Hachette battle - May 28 - The war between...'
'#Vtech Reading into the Amazon vs. Hachette battle - May 28 - The war ...'
'#Vbnss Reading into the Amazon vs. Hachette battle - May 28 - The war ...'
'RT @text_publishing: Amazon war with Hachette over ebook profit margin...'
'RT @text_publishing: Amazon war with Hachette over ebook profit margin...'
```

What were the main themes they tweeted about when those users mentioned both Amazon and Hachette? The word count plot shows that mostly those tweets repeated the news headlines like “Amazon admits dispute (with) Hachette”, perhaps with some commentary - showing that Twitter was being used for news amplification.

The 100 tweets collected came from 86 users. So on average each user only tweeted 1.16 times. Instead of frequency, let's find out who has a large number of followers (an indicator that they may be influential) and check their profile. It appears that 2 or 3 out of the 5 top users (based on follower count) are writers, and others are news syndication services.

```
Name               Followers   Description
________________   _________   ______________________________________________________
'Daton L Fluker'   73578       '#Horror #Novelist of Death Keeper's Biological Wast...'
'WellbeingVigor'   22224       'Writer - 10 years .here, Incurable music enthusiast #'
'E-Book Update'    10870       ''
'Michael Rosa'     10297       ''
'Net Tech News'     7487       'Latest internet and technology news headlines from ...'
```

In the previous section, we checked out the top 5 users based on their follower count. The assumption was that, if you have a large number of followers, you are considered more influential because more people may see your tweets.

Now let's test this assumption. For that I need more than 100 tweets. So I collected a new batch of data - 1000 tweets from 4 trending topics from the UK, and plotted the users based on their follower counts vs. how often their tweets got retweeted. The size (and the color) of the bubbles show how often those users tweeted.

It looks like you do need some base number of followers to make it to the national level, but the correlation between follower count and the frequency of getting retweeted looks weak. Those charts look like different stages of viral diffusion - the top two charts clearly show one user breaking away from the rest of the crowd, and in the process they may have also gained more followers. The bottom two charts show a number of users competing for attention, but no one has a clear breakout yet. If this were an animation, it might look like boiling water. Is anyone interested in analyzing whether this is indeed how a tweet goes viral?

Retweeting of one user's tweet by others creates a network of relationships that can be represented as a social graph. We can visualize such relationships with Gephi, a popular social network analysis tool.

"I Can't Sing" Social Graph

"#InABlackHousehold" Social Graph

You can see that, in the first case, two users formed large clusters of people retweeting their tweets, and everyone else was dwarfed. In the second case, we also see two dominant users, but they have not yet formed a large scale cluster.

Now that you have seen a simple analysis I did with Twitter, it is time to share how I did it in MATLAB. To get started with Twitter, you need to get your developer credentials. You also need Twitty by Vladimir Bondarenko. It is simple to use and comes with excellent documentation.

- Create a Twitter account if you do not already have one
- Create a Twitter app to obtain developer credentials
- Download and install Twitty from the File Exchange, along with the JSON Parser and optionally JSONLab
- Create a structure array to store your credentials for Twitty

Let's search for tweets that mention `'amazon'` and `'hachette'`.

```
% a sample structure array to store the credentials
creds = struct('ConsumerKey','your-consumer-key-here',...
    'ConsumerSecret','your-consumer-secret-here',...
    'AccessToken','your-token-here',...
    'AccessTokenSecret','your-token-secret-here');

% set up a Twitty object
addpath twitty_1.1.1;       % Twitty
addpath parse_json;         % Twitty's default json parser
addpath jsonlab;            % I prefer JSONlab, however.
load('creds.mat')           % load my real credentials
tw = twitty(creds);         % instantiate a Twitty object
tw.jsonParser = @loadjson;  % specify JSONlab as json parser

% search for English tweets that mention 'amazon' and 'hachette'
amazon = tw.search('amazon','count',100,'include_entities','true','lang','en');
hachette = tw.search('hachette','count',100,'include_entities','true','lang','en');
both = tw.search('amazon hachette','count',100,'include_entities','true','lang','en');
```

Twitty stores tweets in a structure array created from the API response, which is in JSON format. I prefer using a table when it comes to working with heterogeneous data containing a mix of numbers and text. I wrote some code, `processTweets`, to convert the structure arrays into tables and compute sentiment scores. You can find the Amazon-Hachette data file here.

For sentiment analysis, I used AFINN, along with a list of English stop words so that we don't count frequent common words like "a" or "the".

```
% load supporting data for text processing
scoreFile = 'AFINN/AFINN-111.txt';
stopwordsURL = 'http://www.textfixer.com/resources/common-english-words.txt';

% load previously saved data
load amazonHachette.mat

% process the structure array with a utility method |extract|
[amazonUsers,amazonTweets] = processTweets.extract(amazon);
% compute the sentiment scores with |scoreSentiment|
amazonTweets.Sentiment = processTweets.scoreSentiment(amazonTweets, ...
    scoreFile,stopwordsURL);

% repeat the process for hachette
[hachetteUsers,hachetteTweets] = processTweets.extract(hachette);
hachetteTweets.Sentiment = processTweets.scoreSentiment(hachetteTweets, ...
    scoreFile,stopwordsURL);

% repeat the process for tweets containing both
[bothUsers,bothTweets] = processTweets.extract(both);
bothTweets.Sentiment = processTweets.scoreSentiment(bothTweets, ...
    scoreFile,stopwordsURL);

% calculate and print NSRs
amazonNSR = (sum(amazonTweets.Sentiment>=0) ...
    -sum(amazonTweets.Sentiment<0))/height(amazonTweets);
hachetteNSR = (sum(hachetteTweets.Sentiment>=0) ...
    -sum(hachetteTweets.Sentiment<0))/height(hachetteTweets);
bothNSR = (sum(bothTweets.Sentiment>=0) ...
    -sum(bothTweets.Sentiment<0))/height(bothTweets);
fprintf('Amazon NSR  : %.2f\n',amazonNSR)
fprintf('Hachette NSR: %.2f\n',hachetteNSR)
fprintf('Both NSR    : %.2f\n\n',bothNSR)

% plot the sentiment histogram of two brands
binranges = min([amazonTweets.Sentiment; ...
                 hachetteTweets.Sentiment; ...
                 bothTweets.Sentiment]): ...
            max([amazonTweets.Sentiment; ...
                 hachetteTweets.Sentiment; ...
                 bothTweets.Sentiment]);
bincounts = [histc(amazonTweets.Sentiment,binranges) ...
             histc(hachetteTweets.Sentiment,binranges) ...
             histc(bothTweets.Sentiment,binranges)];
figure
bar(binranges,bincounts,'hist')
legend('Amazon','Hachette','Both','Location','Best')
title('Sentiment Distribution of 100 Tweets')
xlabel('Sentiment Score')
ylabel('# Tweets')
```

```
Amazon NSR  : 0.84
Hachette NSR: 0.58
Both NSR    : -0.30
```

`processTweets` also has a function `tokenize` that parses tweets to calculate the word count.

```
% tokenize tweets with |tokenize| method of |processTweets|
[words, dict] = processTweets.tokenize(bothTweets,stopwordsURL);
% create a dictionary of unique words
dict = unique(dict);
% create a word count matrix
[~,tdf] = processTweets.getTFIDF(words,dict);

% plot the word count
figure
plot(1:length(dict),sum(tdf),'b.')
xlabel('Word Indices')
ylabel('Word Count')
title('Words contained in the tweets')

% annotate high frequency words
annotated = find(sum(tdf) >= 10);
jitter = 6*rand(1,length(annotated))-3;
for i = 1:length(annotated)
    text(annotated(i)+3, ...
        sum(tdf(:,annotated(i)))+jitter(i),dict{annotated(i)})
end
```

Twitty also supports the 'users/show' API to retrieve user profile information. Let's get the profile of the top 5 users based on the follower count.

```
% sort the user table by follower count in descending order
[~,order] = sortrows(bothUsers,'Followers','descend');
% select top 5 users
top5users = bothUsers(order(1:5),[3,1,5]);
% add a column to store the profile
top5users.Description = repmat({''},height(top5users),1);
% retrieve user profile for each user
for i = 1:5
    userInfo = tw.usersShow('user_id', top5users.Id(i));
    if ~isempty(userInfo{1}.description)
        top5users.Description{i} = userInfo{1}.description;
    end
end
% print the result
disp(top5users(:,2:end))
```

```
Name               Followers
________________   _________
'Daton L Fluker'   73578
'WellbeingVigor'   22224
'E-Book Update'    10870
'Michael Rosa'     10297
'Net Tech News'     7487

Description
___________________________________________________________________________
'#Horror #Novelist of Death Keeper's Biological Wasteland, Finished Cri...'
'Writer - 10 years .here, Incurable music enthusiast #'
''
''
'Latest internet and technology news headlines from news sources around...'
```

If you need more than 100 tweets to work with, then your only option is the Streaming API, which provides real-time access to a sample of the Twitter firehose. That also means you can only access tweets that are currently active. You typically start with a trending topic from a specific location.

You get local trends by specifying the geography with WOEID (Where On Earth ID), available at WOEID Lookup.

```
uk_woeid = '23424975'; % UK
uk_trends = tw.trendsPlace(uk_woeid);
uk_trends = cellfun(@(x) x.name, uk_trends{1}.trends, 'UniformOutput',false)';
```

Once you have the current trends (or download them from here), you can use the Streaming API to retrieve the tweets that mention the trending topic. When you specify an output function with Twitty, the data is stored within Twitty. Twitty will process incoming tweets up to the specified sample size, handling the data in batches of the specified size.

```
tw.outFcn = @saveTweets;  % output function
tw.sampleSize = 1000;     % default 1000
tw.batchSize = 1;         % default 20
tic;
tw.filterStatuses('track',uk_trends{1}); % Streaming API call
toc
uk_trend_data = tw.data;  % save the data
```

```
% reload the previously saved search result for 4 trending topics in the UK
load('uk_data.mat')

% plot
figure
for i = 1:4
    % process tweets
    [users,tweets] = processTweets.extract(uk_data(i).statuses);

    % get who are mentioned in retweets
    retweeted = tweets.Mentions(tweets.isRT);
    retweeted = retweeted(~cellfun('isempty',retweeted));
    [screen_names,~,idx] = unique(retweeted);
    count = accumarray(idx,1);
    retweeted = table(screen_names,count,'VariableNames',{'Screen_Name','Count'});

    % get the users who were mentioned in retweets
    match = ismember(users.Screen_Name,retweeted.Screen_Name);
    retweetedUsers = sortrows(users(match,:),'Screen_Name');
    match = ismember(retweeted.Screen_Name,retweetedUsers.Screen_Name);
    retweetedUsers.Retweeted_Count = retweeted.Count(match);
    [~,order] = sortrows(retweetedUsers,'Retweeted_Count','descend');

    % plot each topic
    subplot(2,2,i)
    scatter(retweetedUsers.Followers(order), ...
        retweetedUsers.Retweeted_Count(order),retweetedUsers.Freq(order)*50, ...
        retweetedUsers.Freq(order),'fill')
    if ismember(i, [1,2])
        ylim([-20,90]); xpos = 2; ypos1 = 50; ypos2 = 40;
    elseif i == 3
        ylim([-1,7])
        xlabel('Follower Count (Log Scale)')
        xpos = 1010; ypos1 = 0; ypos2 = -1;
    else
        ylim([-5,23])
        xlabel('Follower Count (Log Scale)')
        xpos = 110; ypos1 = 20; ypos2 = 17;
    end
    % set x axis to log scale
    set(gca, 'XScale', 'log')
    if ismember(i, [1,3])
        ylabel('Retweeted Count')
    end
    title(sprintf('UK Tweets for: "%s"',uk_data(i).query.name))
end
```

Gephi imports an edge list in CSV format. I added a new method `saveEdgeList` to `processTweets` that saves the screen names of the users as `source` and the hashtags and screen names they mention in their tweets as `target` in a Gephi-ready CSV file (<https://gephi.org/users/supported-graph-formats/csv-format/>).

```
processTweets.saveEdgeList(uk_data(1).statuses,'edgeList.csv');
```

```
File "edgeList.csv" was successfully saved.
```

It is quite easy to get started with Twitter analytics in MATLAB, and hopefully you got a taste of the kinds of analyses that are possible.

We only scratched the surface. Twitter offers some of the most interesting opportunities for data analytics. How would you use Twitter analytics? Check out this search result from PLOS ONE, which lists various papers that used Twitter in their studies. Tell us about your Twitty experiences here.


Here’s a quick example to show you how it works. Let’s suppose you have some data. Say, the per capita consumption of margarine in the US (in pounds). Mmmm. Margarine.

```
yrs = 2000:2009;
margarineConsumption = [8.2 7 6.5 5.3 5.2 4 4.6 4.5 4.2 3.7];
plot(yrs,margarineConsumption)
```

As soon as you see the margarine data, you think to yourself “Wait a second! Where have I seen that trend before?” A moment of reflection, and suddenly it hits you: Maine divorce rates! That margarine plot is a dead ringer for divorce rates in Maine. Let’s do a plotyy.

```
maineDivorceRate = [5 4.7 4.6 4.4 4.3 4.1 4.2 4.2 4.2 4.1];
plotyy(yrs,margarineConsumption,yrs,maineDivorceRate)
```

Sure enough, the fit is remarkable*. Now let’s say we want to share our plot as an interactive web page. After adding the Plotly API code to your path, just type

`fig2plotly`

and you get a link to a web page like this:

Cool! Follow the link to try out the interaction.

* The remarkable margarine-divorce correlation was unearthed by Tyler Vigen on his fascinating Spurious Correlations page.

The functions `ode23` and `ode45` are the principal MATLAB and Simulink tools for solving nonstiff ordinary differential equations.

The two functions `ode23` and `ode45` are single step ODE solvers. They are also known as Runge-Kutta methods. Each step is almost independent of the previous steps. Two important pieces of information are passed from one step to the next. The step size $h$ expected to achieve a desired accuracy is passed from step to step. And, in a strategy known as FSAL, for First Same As Last, the final function value at the end of a successful step is used as the initial function value at the following step.

The BS23 algorithm is due to Larry Shampine and his student Przemyslaw Bogacki. Przemyslaw is now a Professor at Old Dominion University. The " *23* " in the function name indicates that two simultaneous single-step formulas, one of second order and one of third order, are involved.

The method has three stages, but there are four slopes $s_i$ because, after the first step, the $s_1$ for one step is the $s_4$ from the previous step. The essentials are

$$ s_1 = f(t_n,y_n) $$

$$ s_2 = f\left(t_n+{h \over 2},y_n+{h \over 2} s_1\right) $$

$$ s_3 = f\left(t_n+{3 \over 4} h, y_n+{3 \over 4} h s_2\right) $$

$$ t_{n+1} = t_n + h $$

$$ y_{n+1} = y_n + {h \over 9} (2 s_1 + 3 s_2 + 4 s_3) $$

$$ s_4 = f(t_{n+1},y_{n+1}) $$

$$ e_{n+1} = {h \over 72} (-5 s_1 + 6 s_2 + 8 s_3 - 9 s_4) $$

Here is a graphical view of the steps.

```
ode23steps
```

These graphics show the starting situation and the three stages. We start at a point $(t_n,y_n)$ with an initial slope $s_1 = f(t_n,y_n)$ and an estimate of a good step size, $h$. Our goal is to compute an approximate solution $y_{n+1}$ at $t_{n+1} = t_n+h$ that agrees with the true solution $y(t_{n+1})$ to within the specified tolerances.

The first stage uses the initial slope $s_1$ to take an Euler step halfway across the interval. The function is evaluated there to get the second slope, $s_2$. This slope is used to take an Euler step three-quarters of the way across the interval. The function is evaluated again to get the third slope, $s_3$. A weighted average of the three slopes is used to step all the way across the interval to get a tentative value for $y_{n+1}$.

$$ s = {1 \over 9} (2 s_1 + 3 s_2 + 4 s_3) $$

$$ y_{n+1} = y_n + h s $$

The function is evaluated once more to get $s_4$. The error estimate then uses all four slopes.

$$ e_{n+1} = {h \over 72} (-5 s_1 + 6 s_2 + 8 s_3 - 9 s_4) $$

If the error is within the specified tolerance, the step is successful, the tentative value of $y_{n+1}$ is accepted, and $s_4$ becomes the $s_1$ of the next step. If the error is too large, then the tentative $y_{n+1}$ is rejected and the step must be redone. In either case, the error estimate $e_{n+1}$ provides the basis for determining the step size $h$ for the next step.

Notice that $y_{n+1}$ is computed before $s_4$. The final function evaluation is used for the error estimate, but not for the step itself. But this $s_4$ is the $s_1$ in the next step.

The coefficients in the error estimate $e_{n+1}$ result from the difference between the third order formula that is used to compute $y_{n+1}$ and an independent second order formula that involves the same function values. That second order result is not actually computed because its value is not needed.

All the solvers in the suite provide an interpolant that can generate values approximating the solution to the differential equation to the desired accuracy anywhere in the interval without requiring further evaluation of the function defining the ode. In the case of `ode23` this interpolant happens to be "free". It turns out that the Hermite cubic polynomial defined by the four values

$$ y_n, \ F(t_n,y_n), \ y_{n+1}, \ F(t_{n+1}, \ y_{n+1}) $$

does the job. For other solvers in the suite, providing the accompanying interpolant is an important aspect of the algorithm derivation.
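As a sketch, that "free" interpolant can be written down directly with the standard cubic Hermite basis. The function name `hermite23` is made up for illustration; the shipping code uses an equivalent form.

```matlab
function y = hermite23(t,tn,yn,fn,tnext,ynext,fnext)
% Cubic Hermite interpolant matching y and y' at both ends of the step:
% yn, fn = F(tn,yn) at the left end; ynext, fnext at the right end.
    h = tnext - tn;
    s = (t - tn)/h;                % normalized position in [0,1]
    h00 = (1+2*s).*(1-s).^2;       % standard Hermite basis functions
    h10 = s.*(1-s).^2;
    h01 = s.^2.*(3-2*s);
    h11 = s.^2.*(s-1);
    y = h00*yn + h10*h*fn + h01*ynext + h11*h*fnext;
end
```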

Before today's version of `ode45`, there was an earlier one. In a 1969 NASA report, Erwin Fehlberg introduced a so-called six stage Runge-Kutta method that requires six function evaluations per step. These function values can be combined with one set of coefficients to produce a fifth-order accurate approximation and with another set of coefficients to produce an independent fourth-order accurate approximation. Comparing these two approximations provides an error estimate and resulting step size control.

Notice that it takes six stages to get fifth order. It is not possible to get fifth order with only five function evaluations per step. The combinatorial complexity of the Taylor series in two variables for $F(t,y)$ overpowers the information available from the function evaluations. In fact, if you continue to investigate the development of Runge-Kutta methods, you will find that, for example, with ten stages it is only possible to achieve seventh order.

In the early 1970's Shampine and his colleague H. A. (Buddy) Watts at Sandia Laboratories published a Fortran code, RKF45, based on Fehlberg's algorithm. In 1977, we made RKF45 the ODE solver in our text book *Computer Methods for Mathematical Computations*, by Forsythe, Malcolm and Moler. This book is now out of print so I can't provide a decent link to it. But the Fortran source code for RKF45 is still available from netlib.

One thing that I will always remember about RKF45 is ${12 \over 13}$. The fourth function evaluation, $s_4$, is made at the point

$$ t_n + {12 \over 13} h $$

I doubt that you will ever come across ${12 \over 13}$ anyplace else in mathematical software.

RKF45 became the basis for the first version of ODE45 in MATLAB in the early 1980s and for early versions of Simulink. The Fehlberg (4,5) pair did a terrific job for almost fifteen years until the late 1990s, when Shampine and MathWorker Mark Reichelt modernized the suite and introduced a more efficient algorithm.

The new `ode45` introduced in the late 1990s is based on an algorithm of Dormand and Prince. It uses six stages, employs the FSAL strategy, provides fourth and fifth order formulas, has local extrapolation and a companion interpolant.

The new `ode45` is so effective that it is the only function in the suite where the default value of the `refine` argument is set to 4. This means that the step size the algorithm naturally wants to choose is so large that the output is more widely spaced than most people prefer. So the interpolant is called upon to produce more finely spaced output at no additional cost measured in function evaluations.
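As an illustrative sketch (the scalar test problem $y' = -y$ here is my own choice, not from the solver documentation), you can see the effect of `refine` by comparing the natural step points with the default output:

```
% Sketch: compare ode45's natural step points (Refine = 1) with the
% default output (Refine = 4), which interpolates three extra points
% inside each step at no cost in function evaluations.
opts = odeset('Refine',1);
[traw,~] = ode45(@(t,y) -y, [0 10], 1, opts);  % natural steps only
[tfine,~] = ode45(@(t,y) -y, [0 10], 1);       % default refined output
% tfine is roughly four times as finely spaced as traw
```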

The codes for the two routines `ode23` and `ode45` are very similar. The most important place they differ is the portion that sets the key parameters. Here are the parameters in `ode23`.

dbtype 200:209 ode23

```
200     % Initialize method parameters.
201     pow = 1/3;
202     A = [1/2, 3/4, 1];
203     B = [
204     1/2         0               2/9
205     0           3/4             1/3
206     0           0               4/9
207     0           0               0
208     ];
209     E = [-5/72; 1/12; 1/9; -1/8];
```

The parameter `pow` is used in the step size calculation. We see that `ode23` is a third order method. The array `A` gives the fractions for each partial step. The array `B` provides the weights for the slopes. And the array `E` provides the coefficients in the error estimate. Here are the corresponding parameters for the Dormand-Prince algorithm used in `ode45`.

dbtype 201:213 ode45

```
201     % Initialize method parameters.
202     pow = 1/5;
203     A = [1/5, 3/10, 4/5, 8/9, 1, 1];
204     B = [
205     1/5     3/40    44/45   19372/6561    9017/3168      35/384
206     0       9/40    -56/15  -25360/2187   -355/33        0
207     0       0       32/9    64448/6561    46732/5247     500/1113
208     0       0       0       -212/729      49/176         125/192
209     0       0       0       0             -5103/18656    -2187/6784
210     0       0       0       0             0              11/84
211     0       0       0       0             0              0
212     ];
213     E = [71/57600; 0; -71/16695; 71/1920; -17253/339200; 22/525; -1/40];
```

`ode23` is a three-stage, third-order, Runge-Kutta method. `ode45` is a six-stage, fifth-order, Runge-Kutta method. `ode45` does more work per step than `ode23`, but can take much larger steps. For differential equations with smooth solutions, `ode45` is often more accurate than `ode23`. In fact, it may be so accurate that the interpolant is required to provide the desired resolution. That's a good thing.
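To make the roles of `pow`, `A`, `B`, and `E` concrete, here is a minimal sketch of a single `ode23`-style step. It is illustrative only: there is no step size control, and `s1` is recomputed each step rather than reused from the previous one, so the FSAL saving is not shown.

```
% Minimal sketch of one Bogacki-Shampine (ode23-style) step, using the
% A, B, E arrays shown above. F is the function defining the ODE.
function [ynew, errest] = bs23_step(F,t,y,h)
   A = [1/2, 3/4, 1];
   B = [1/2  0    2/9
        0    3/4  1/3
        0    0    4/9
        0    0    0  ];
   E = [-5/72; 1/12; 1/9; -1/8];
   s1 = F(t, y);
   s2 = F(t + A(1)*h, y + h*B(1,1)*s1);
   s3 = F(t + A(2)*h, y + h*(B(1,2)*s1 + B(2,2)*s2));
   ynew = y + h*(B(1,3)*s1 + B(2,3)*s2 + B(3,3)*s3);    % third order result
   s4 = F(t + h, ynew);                                 % slope at the new point
   errest = h*(E(1)*s1 + E(2)*s2 + E(3)*s3 + E(4)*s4);  % difference vs. the
end                                                     % second order result
```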

`ode45` is the anchor of the differential equation suite. The MATLAB documentation recommends `ode45` as the first choice. And Simulink blocks set `ode45` as the default solver. But I have a fondness for `ode23`. I like its simplicity. I particularly like it for graphics. The natural step size that `ode23` chooses is frequently just right for display purposes. For example, the plot of the Lorenz chaotic attractor at the end of my previous post is done with `ode23` choosing the step size.

So give `ode23` a try.

Get the MATLAB code

Published with MATLAB® R2014a

I'd like to introduce this week's guest blogger Alan Weiss. Alan writes documentation for mathematical toolboxes here at MathWorks.

Do you use `GlobalSearch` or `MultiStart` for finding multiple local solutions to your optimization problems? Both of these solvers can report the locations and objective function values of the local solutions they find, as well as the starting points that led to each local solution.

Sometimes, though, the solvers report too many local solutions. Their definition of what constitutes "distinct" solutions can differ from yours. This article shows the problem and two solutions. One involves setting `GlobalSearch` or `MultiStart` tolerances, but this can be difficult if you don't know beforehand how close the solutions can be. The other involves the `clumpthem` function; download it here.

For example, look at the six-hump camelback function described in the documentation. Get the `sixhumps` code.

sixmin = sixhumps

sixmin = @(x)(4*x(:,1).^2-2.1*x(:,1).^4+x(:,1).^6/3+x(:,1).*x(:,2)-4*x(:,2).^2+4*x(:,2).^4)

You can see in the contour plot that there are six local minima. The colored asterisks mark the minima.

Take a look at the number of local minima reported by running `MultiStart` with the `fmincon` local solver and active-set algorithm.

```
ms = MultiStart;
opts = optimoptions(@fmincon,'Algorithm','active-set');
problem = createOptimProblem('fmincon','x0',[-1,2],...
    'objective',sixmin,'lb',[-3,-3],'ub',[3,3],...
    'options',opts);
rng default % for reproducibility
[xminm,fminm,flagm,outptm,manyminsm] = run(ms,problem,50); % run fmincon 50 times
size(manyminsm)
```

MultiStart completed the runs from all start points.

All 50 local solver runs converged with a positive local solver exit flag.

ans =
     1    50

How could there be 50 separate minima reported when you know that there are only six points that are local minima? The answer is that many of these minima are close to each other. For example, the first three points look the same.

```
manyminsm(1).X
manyminsm(2).X
manyminsm(3).X
```

ans =
   -0.089843      0.71266
ans =
   -0.089844      0.71265
ans =
   -0.089834      0.71265

Of course, these reported minima aren't really the same.

norm(manyminsm(1).X - manyminsm(2).X)

ans = 7.6691e-06

The point is that `MultiStart`'s default tolerances cause it to report minima as distinct if they differ by more than 1e-6 in objective function value or position. And these minima are more than 1e-6 apart.

You can loosen the default `MultiStart` tolerances to have `MultiStart` automatically combine similar minima.

```
rng default % for reproducibility
ms.TolX = 1e-3;
ms.TolFun = 1e-4;
[xminloose,fminloose,flagloose,outptloose,manyminsloose] = run(ms,problem,50);
size(manyminsloose)
```

MultiStart completed the runs from all start points.

All 50 local solver runs converged with a positive local solver exit flag.

ans =
     1     6

Loosening the tolerances caused `MultiStart` to give just the six truly different local minima. But this process involved rerunning the solver.

Wouldn't it be better to clump the minima without running the solver again? That is the point of the `clumpthem` function. It takes `MultiStart` or `GlobalSearch` solutions and clumps them to the tolerances you want. For example

```
clumped = clumpthem(manyminsm,1e-3,1e-4); % the first tolerance is TolX, then TolFun
size(clumped)
```

ans =
     1     6

The answers you get either way are identical.

isequal(manyminsloose(1).X, clumped(1).X)

ans = 1

How does `clumpthem` work? It uses the fact that `MultiStart` and `GlobalSearch` return solutions in order, from best (lowest objective function value) to worst. It takes the first solution and compares it to the second, both in terms of objective function values and distance between the solutions. If both comparisons are below the tolerances, `clumpthem` removes the second solution, adding its starting point to the list of `X0` starting points. If the objective function value of the second solution is too high, then `clumpthem` starts a new clump. If the objective function difference is small enough, but the distance between solutions is too large, `clumpthem` proceeds to compare the first solution with the third. It continues in this way until it runs out of solutions to clump.
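As a simplified sketch of that logic (the actual `clumpthem` also carries along exit flags and the lists of `X0` starting points, which this toy version omits), the clumping can be written for plain arrays of solutions sorted best-first:

```
% Simplified sketch of the clumping logic described above, for plain
% arrays. X is n-by-d with rows sorted by increasing objective value;
% fvals is the matching n-by-1 vector of objective values.
function [Xc, fc] = clump_sketch(X, fvals, tolX, tolFun)
   keep = true(size(fvals));
   for i = 1:numel(fvals)
      if ~keep(i), continue, end
      for j = i+1:numel(fvals)
         if ~keep(j), continue, end
         if fvals(j) - fvals(i) > tolFun
            break                     % objective too high: start of new clumps
         end
         if norm(X(j,:) - X(i,:)) <= tolX
            keep(j) = false;          % close in value and position: absorb j
         end                          % otherwise keep comparing to the next one
      end
   end
   Xc = X(keep,:);
   fc = fvals(keep);
end
```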

This procedure is quite speedy, as you will see in the next section.

Check the speed of `clumpthem`.

tic; clumped = clumpthem(manyminsm,1e-3,1e-4); toc

Elapsed time is 0.006322 seconds.

`MultiStart` is much slower, even with this fairly simple function:

```
rng default % for reproducibility
tic;
[xminloose,fminloose,flagloose,outptloose,manyminsloose] = run(ms,problem,50);
toc
```

MultiStart completed the runs from all start points.

All 50 local solver runs converged with a positive local solver exit flag.

Elapsed time is 0.863509 seconds.

Do you use `MultiStart` or `GlobalSearch` to search for multiple local solutions? Have you been annoyed by finding spurious differences in solutions? Tell us about it here.

Get the MATLAB code

Published with MATLAB® R2014a

I look up at the sky just after sunset and I see an especially bright star. It's probably a planet. But which one?

This question gives me a good opportunity to play around with MATLAB. Let's do a visualization that shows where the planets are relative to the earth and the sun. In the process, we'll use JSON services, the File Exchange, MATLAB graphics, and 3-D vector mathematics cribbed from Wikipedia.

Here is the basic grade-school illustration of the solar system, the one that shows the planets rolling around the sun like peas on a plate. For simplicity, we're just showing the sun, the earth, the moon, Venus, and Mars.

But we never see anything like this with our own eyes. Instead, we see bright spots on a dark background somewhere "up there." So let's simplify our problem to determining what direction each naked-eye planet is in. This leads to an image like this.

Our goal is to make an accurate up-to-date version of this diagram. Specifically, relative to the sun, where should we look to find the moon and the naked-eye planets (Mercury, Venus, Mars, Jupiter, and Saturn)? To do this, we need to solve a few problems.

- Find the planets
- Find the unit vector pointing from earth to each planet
- Squash all these vectors onto a single plane
- Visualize the resulting disk

First of all, where are the planets? There's a free JSON service for just about everything these days. I found planetary data hosted on Davy Wybiral's site here:

http://davywybiral.blogspot.com/2011/11/planetary-states-api.html

```
url = 'http://www.astro-phys.com/api/de406/states?bodies=sun,moon,mercury,venus,earth,mars,jupiter,saturn';
json = urlread(url);
```

Now we can use Joe Hicklin's JSON parser from the File Exchange. It returns a well-behaved MATLAB structure.

data = JSON.parse(json)

data = 
       date: 2.4568e+06
    results: [1x1 struct]
       unit: 'km'

The payload is in the "results" field. Each entry has three position components and three velocity components.

data.results

ans = 
    mercury: {{1x3 cell}  {1x3 cell}}
        sun: {{1x3 cell}  {1x3 cell}}
       moon: {{1x3 cell}  {1x3 cell}}
    jupiter: {{1x3 cell}  {1x3 cell}}
       mars: {{1x3 cell}  {1x3 cell}}
      earth: {{1x3 cell}  {1x3 cell}}
      venus: {{1x3 cell}  {1x3 cell}}
     saturn: {{1x3 cell}  {1x3 cell}}

The distances are in kilometers, and I'm not even sure how this representation is oriented relative to the surrounding galaxy. But it doesn't really matter, because all I care about is the relative positions of the bodies in question.

Side note: We could also have used the Aerospace Toolbox to get the same information.

`[pos,vel] = planetEphemeris(juliandate(now),'Sun','Earth')`

```
% List of bodies we care about
ssList = {'sun','moon','mercury','venus','earth','mars','jupiter','saturn'};
ss = [];
for i = 1:length(ssList)
    ssObjectName = ssList{i};
    ss(i).name = ssObjectName;
    ssData = data.results.(ssObjectName);
    ss(i).position = [ssData{1}{:}];
    ss(i).velocity = [ssData{2}{:}];
end
```

```
% Plot in astronomical units
au = 149597871;
k = 5;
cla
for i = 1:length(ss)
    p = ss(i).position/au;
    line(p(1),p(2),p(3), ...
        'Marker','.','MarkerSize',24)
    text(p(1),p(2),p(3),[' ' ss(i).name]);
end
view(3)
grid on
axis equal
```

This is accurate, but not yet very helpful. Let's now calculate the geocentric position vectors of each planet. To do this, we'll put the earth at the center of the system. Copernicus won't mind, because A) he's dead, and B) we admit this reference frame orbits the sun.

We're also going to use another file from the File Exchange. Georg Stillfried's mArrow3 will help us make nice 3-D arrows in space.

```
clf
pEarth = ss(5).position;
for i = 1:length(ss)
    % pe = position relative to earth
    % (i.e. a vector pointing from earth to body X)
    pe = ss(i).position - pEarth;
    % pne = normalized position relative to earth
    pne = pe/norm(pe);
    ss(i).pne = pne;
    mArrow3([0 0 0],pne, ...
        'stemWidth',0.01,'FaceColor',[1 0 0]);
    text(pne(1),pne(2),pne(3),ss(i).name, ...
        'HorizontalAlignment','center');
    hold on
end
light
hold off
axis equal
axis off
axis([-1.2 1.2 -0.8 1.1 -0.2 0.8])
```

These are unit vectors pointing out from the center of the earth towards each of the other objects. It's a little hard to see, but these vectors are all lying in approximately the same plane.

If we change our viewpoint to look at the system edge-on, we can see that the objects are not quite co-planar. For simplicity, let's squash them all into the same plane. For this, we'll use the plane defined by the earth's velocity vector crossed with its position relative to the sun. This defines "north" for the solar system.

```
pEarth = ss(5).position;
pSun = ss(1).position;
vEarth = ss(5).velocity;
earthPlaneNormal = cross(vEarth,pSun - pEarth);
% Normalize this vector
earthPlaneNormalUnit = earthPlaneNormal/norm(earthPlaneNormal);
mArrow3([0 0 0],earthPlaneNormalUnit, ...
    'stemWidth',0.01,'FaceColor',[0 0 0]);
view(-45,15)
axis([-1.2 1.2 -0.8 1.1 -0.2 0.8])
```

Now we project the vectors onto the plane defined by earth's motion around the sun. I learned what I needed from Wikipedia here: Vector Projection.

Since we are working with the normal, we are technically doing a vector rejection. Using Wikipedia's notation, this is given by

$$ \mathbf{a_2} = \mathbf{a} - \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{b} \cdot \mathbf{b}} \mathbf{b} $$

```
hold on
for i = 1:length(ss)
    pne = ss(i).pne;
    pneProj = pne - dot(pne,earthPlaneNormalUnit)/ ...
        dot(earthPlaneNormalUnit,earthPlaneNormalUnit)*earthPlaneNormalUnit;
    ss(i).pneProj = pneProj;
    mArrow3([0 0 0],pneProj, ...
        'stemWidth',0.01,'FaceColor',[0 0 1]);
end
hold off
axis equal
```

We're close to the end now. Let's just calculate the angle between the sun and each element. Then we'll place the sun at the 12:00 position of our planar visualization and all the other planets will fall into place around it.

Calculate the angle between the sun and each of the bodies. Again, from the Wikipedia article, we have

$$ \cos \theta = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}| \, |\mathbf{b}|} $$

```
sun = ss(1).pneProj;
ss(1).theta = 0;
for i = 1:length(ss)
    pneProj = ss(i).pneProj;
    cosTheta = dot(sun,pneProj)/(norm(sun)*norm(pneProj));
    theta = acos(cosTheta);
    % The earth-plane normal vector sticks out of the plane. We can use it
    % to determine the orientation of theta.
    x = cross(sun,pneProj);
    theta = theta*sign(dot(earthPlaneNormalUnit,x));
    ss(i).theta = theta;
end
```

```
cla
k1 = 1.5;
k2 = 1.2;
for i = 1:length(ss)
    beta = ss(i).theta + pi/2;
    x = cos(beta);
    y = sin(beta);
    mArrow3([0 0 0],[x y 0], ...
        'stemWidth',0.01, ...
        'FaceColor',[0 0 1]);
    line([0 k1*x],[0 k1*y],'Color',0.8*[1 1 1])
    text(k1*x,k1*y,ss(i).name,'HorizontalAlignment','center')
end
t = linspace(0,2*pi,100);
line(k2*cos(t),k2*sin(t),'Color',0.8*[1 1 1])
line(0,0,1,'Marker','.','MarkerSize',40,'Color',[0 0 1])
axis equal
axis(2*[-1 1 -1 1])
```

And there you have it: an accurate map of where the planets are in the sky for today's date. In this orientation, planets "following" the sun through the sky are on the left side of the circle. So Jupiter will be high in the sky as the sun sets.

And that is a very satisfying answer to my question, by way of vector math, JSON feeds, MATLAB graphics, and the File Exchange.

Get the MATLAB code

Published with MATLAB® R2014a

That’s awesome. I was always hoping for that. I just linked my Github Repository to File Exchange :)

This is a nice feature, and thanks for the information. Github is friendly for development and bug fixing and MATLAB Central is professional for publishing MATLAB packages.

This is great news! Updating and maintaining the code was an issue with File Exchange. Thank you for making this connection!

That’s great news and a very logical step towards more open source collaboration in MATLAB. Good job!

Since we launched the feature, we’ve received an average of two GitHub links per day, for a total of 28 repositories so far. Thanks to all of you who have added a link to your GitHub project.

If you want to keep track of the progress, File Exchange guru Josh Natanson has created a trend for it over on Trendy. Watch the number grow! And if you want to see the actual list of GitHub projects, follow this link:

http://www.mathworks.com/matlabcentral/fileexchange/?term=is_github:true

We’ve added a direct connection between File Exchange and GitHub. In practice, this means that if your code already lives on GitHub, you can leave it there. The File Exchange is now smart enough to distribute the code directly from GitHub to anyone who asks for it from MATLAB Central. You get all the benefits of collaborative development in GitHub while community members get access to the latest version of your projects. No need to maintain the files in two different locations. Push your changes up to GitHub, and seconds later that’s what gets served from the File Exchange.

We’re very excited about this feature, because we know there are already a lot of excellent MATLAB-related repos on GitHub. With this feature, we join two strong communities and open a new era on MATLAB Central.
