Jiro's pick this week is MATLAB Plot Gallery by Plot Gallery.

This gallery, which hosts a collection of File Exchange entries showing various plotting examples, has been highlighted once before. I want to bring this up once again, since the examples are now updated with the new graphics system in R2014b. The example code is backward compatible, so you can still run them in older releases, but running them in the current release (R2014b) will give you the new look.

So what's changed with the new graphics system? ** A lot!!** Rather than, talking about them here, let me point you to a couple of great resources. The first is a series of blog posts
in Loren's blog:

- Part 1: Features of the New Graphics System
- Part 2: Using Graphics Objects
- Part 3: Compatibility Considerations in the New Graphics System (
)**coming soon**

These posts will give you a nice introduction to the new graphics system.

The other resource is the newly introduced blog, Mike on MATLAB Graphics. Mike is a MathWorks developer working on MATLAB's graphics system, and he will be writing about computer graphics concepts and MATLAB graphics techniques.

**Comments**

Take a look at these resources and let us know what you think here.

Get
the MATLAB code

Published with MATLAB® R2014b

Coursera is a technology platform that kickstarted the current MOOCs boom. Even though there are more MOOCs players now, it still remains one of the leading companies in this space. But how are they doing these days for delivering higher education to the masses online?

Today's guest blogger, Toshi Takeuchi, would like to share an analysis using Courera's data.

I am a big fan of MOOCs and I benefited a lot from free online courses on Coursera, such as Stanford's Machine Learning course. Like many websites these days, Coursera offers its data through REST APIs. Coursera offers a number of APIs, but Catalog APIs are available without OAuth authentication. We can find out the details of courses offered by Coursera with these APIs.

We can try to answer questions like *"how do STEM and non-STEM courses break down among universities?*"

JSON is a very common data format for REST APIs, and Coursera's APIs also returns results in JSON format. MATLAB now supports JSON out of the box in R2014b. You could always use JSON from within MATLAB by taking advantage of user contributed MATLAB programs on File Exchange, but built-in JSON support makes it easy for us to share scripts that use JSON, because we don't have to worry about dependencies.

Let's try the new feature using Coursera APIs. Calling a REST API is very simple with `webread`.

restApi='https://api.coursera.org/api/catalog.v1/courses'; params = 'sessions,universities,categories'; resp=webread(restApi,'includes',params,weboptions('Timeout',60));

`webread` returns the JSON response as a structure array. The data is further processed in a separate script `processData.m` - check out the details if interested.

We need to decide which categories represent STEM subjects. When there are multiple categories assigned to a given course, we treat it as a STEM course as long as one of them is included in STEM categories.

processData

STEM categories 'Computer Science: Theory' 'Economics & Finance' 'Medicine' 'Mathematics' 'Physical & Earth Sciences' 'Biology & Life Sciences' 'Computer Science: Systems & Security' 'Computer Science: Software Engineering' 'Engineering' 'Statistics and Data Analysis' 'Computer Science: Artificial Intelligence' 'Physics' 'Chemistry' 'Energy & Earth Sciences'

As a sanity check, let's plot the number of courses vs. number of sessions by university. A single course can be offered repeatedly in multiple sessions. Therefore you can determine the longevity or age of a given course by the count of sessions.

If it is a new course, or it was not repeated, then you only have one session per course. We can use this as the baseline, and check how universities scaled up their courses relative to this baseline.

R2014b comes with new MATLAB Graphics System, but you can still use the familiar commands for plotting.

% group by number of courses grouping = ones(height(universities),1)*2; grouping(universities.courses > 25) = 1; grouping(universities.courses <= 10) = 3; % plot figure gscatter(universities.courses,universities.sessions,grouping) h = refline(1,0); set(h,'Color','m','LineStyle',':') h = refline(2,0); set(h,'Color','m','LineStyle',':') h = refline(3,0); set(h,'Color','m','LineStyle',':') h = refline(6,0); set(h,'Color','m','LineStyle',':') xlabel('Number of Courses'); ylabel('Number of Sessions'); title('\fontsize{14} Courses by Sessions by University'); legend('Universities with 25+ courses','Universities with 10+ courses',... 'Universities with 1-10 courses','Ref line: 1 session per course',... 'Ref line: 2 sessions per course','Ref line: 3 sessions per course',... 'Ref line: 6 sessions per course','Location','NorthWest') % add university names for i = 1:height(universities) if universities.courses(i) > 10 && universities.sessions(i) > 20 text(universities.courses(i),universities.sessions(i),... universities.shortName{i},'FontSize',12) end end

You can see that Stanford, Penn (University of Pennsylvania), JHU (Johns Hopkins), and Duke are leading the pack. They are the early adopters, based on the number of sessions. It is interesting to see PKU (Peking University) leading international institutions. They offer a number of courses in Chinese. Coursera didn't start international partnership until recently, so it is quite remarkable the PKU has broadened their online content in relatively short time. More recent entrants are on the left with fewer courses and sessions.

Established players are trying to scale up by repeating the sessions. JHU seems to be particularly aggressive in terms of the number of courses they offer and how they are repeated as sessions.

Let's plot the number of courses by ratio of STEM courses by university. This will tell us which schools are making investments in online education content, and whether they focus on STEM or non-STEM subjects. The size of the marker indicates the total number of sessions they are associated with, so it also gives us how long they have been involved in Coursera. Notice the `parula` colormap used in the colorbar, the new default colormap in R2014b.

% use sesion count for setting marker sizes markerSize = universities.sessions; % we need to scale the marker size markerSize = (markerSize - min(markerSize))/(max(markerSize)-min(markerSize)); markerSize = markerSize * 1000; markerSize(markerSize == 0) = 1; % change the tick labels to reflect the original values barticks = num2cell(20:20:200); % create a scatter plot figure scatter(universities.courses,universities.stem_ratio,markerSize,markerSize,'fill') xlim([0 40]) h = colorbar('TickLabels',barticks); h.Label.String = '\fontsize{11}Number of Sessions'; title('\fontsize{14} Ratio of STEM courses by University on Coursera') xlabel('\fontsize{11}Number of Courses'); ylabel('\fontsize{11}Ratio of STEM Courses'); % add university names for i = 1:height(universities) if universities.stem_ratio(i) ~= 0 && universities.stem_ratio(i) ~= 1 && universities.courses(i) >= 5 text(universities.courses(i),universities.stem_ratio(i),universities.shortName{i},'FontSize',12) end end % add reference lines line([25 25],[0 1],'LineStyle',':') line([10 10],[0 1],'LineStyle',':') line([0 40],[0.5 0.5],'LineStyle',':')

Stanford is very heavy on STEM subjects, while others are more balanced. More recent entrants on the left have a wider variance in how STEM heavy their courses are. Perhaps rate of adaption is different among different academic disciplines?

We can plot the ratio of courses per category in order to see the relative representation of academic disciplines on Coursera. A course can belong to multiple categories, and in such cases a count is split equally across the included categories. Note that you can now rotate axis tick labels in R2014b.

% get the count of categories by university catByUniv = zeros(height(universities),height(categories)); for i = 1:length(T.categories) row = ismember(universities.id,T.universities(i)); col = ismember(categories.id,T.categories{i}); catByUniv(row,col) = catByUniv(row,col) + 1/length(T.categories{i}); end % segment the universities by number of courses catByTiers = [sum(catByUniv(grouping == 1,:));... sum(catByUniv(grouping == 2,:)); sum(catByUniv(grouping == 3,:))]; % get the ranking of categories by number of courses [~,ranking] = sort(sum(catByUniv(universities.courses > 25,:)),'descend'); % get the ratio of courses by category catByTiers = bsxfun(@rdivide,catByTiers,sum(catByTiers,2)); % plot a bar graph figure xticks = [{''};categories.name(ranking);{''}]; h = bar(catByTiers(:,ranking)'); xlim([0 26]); ax = gca; set(ax,'XTick',0:26);set(ax,'XTickLabel',xticks);set(ax,'XTickLabelRotation',270); title('\fontsize{14} Ratio of Courses Per Category') legend('Universites with 25+ courses','Universites with 10+ courses','Universites with 1-10 courses','Location','Best')

It looks like there was more STEM bias among the early adopters (universities with a lot of courses) but new entrants (universities with fewer courses) tend to have more non-STEM courses. Categories like Social Sciences, Humanities, Business and Management, Education, Teacher Professioal Development, Music, Film and Audio are on the rise.

Why do we see this non-STEM shift? There are a number of possible explanations.

- In the beginning, Coursera courses relied on autograders. They were well suited for quantitative STEM subjects, but not for non-STEM subjects.
- Autograders were custom built for respective courses and they are in fact full SaaS applications. It was difficult to scale the number of courses if you needed to build for each course a custom SaaS app that can withstand substantial peak traffic near the deadline -
*this human behavior is pretty universal* - Later, Coursera introduced a crowd sourced essay grading system that can be used across multiple courses. This freed universities from the burden of creating custom SaaS apps.
- This led to rapid expansion of course offerings and made non-STEM subjects viable. In fact, I took a number of STEM courses from JHU, and they tend to use essay grading system rather than autograders.

There are questions we cannot answer with the data at hand. For example, is the shift driven by the convenience of supply side (universities) or by the demand for non-STEM subjects by the public?

There are no strict prerequisites for Coursera courses, but the bar is still high for STEM courses. Therefore it is quite possible that the potential market size is larger for non-STEM subjects.

You also saw how easy it is to use REST API with JSON response within R2014b, and got a quick look at some of the new features of updated MATLAB Graphics System. Download the new release, try those new features yourself and share what you find here!

Get
the MATLAB code

Published with MATLAB® R2014b

**Differential Equation Editor**

Try typing `dee` in MATLAB. This will launch an example model that looks like:

If you open one of the demo and double-click on the block, you will see a nice little user interface:

In this interface, you can type any equation you want, using the format of the Fcn block.

When I first saw this model and user interface, I thought *"This thing looks really old, has it always been there, unnoticed?"*. So I looked into our source code history and found out this example ships with Simulink and has been untouched since the middle of the 90's!

**How does that work?**

I strongly encourage you to look under the mask and observe how this block is implemented. It is a pretty good example of how to create models programmatically. Once you click the rebuild button in the user interface, functions like `delete_block`, `add_block` and `add_line` draw the differential equation in a systematic manner.

For any set of equation, the resulting model will look like this, where one Fcn block is used for each state equation and for each output equation:

To help you visualize it better, I formatted the above model to make it easier to read:

**Now it's your turn**

Give this cute little example a try and let us know what you think by leaving a comment here.

]]>Today, David Garrison, our guest blogger, will continue his series on the new graphics system in R2014b.

- Part 1: Features of the New Graphics System
**Part 2: Using Graphics Objects**- Part 3: Compatibility Considerations in the New Graphics System

Here is Part 2 of the series.

In Part 1 of this series, I provided an introduction to the new MATLAB graphics system in R2014b. I described a number of new features and alluded to one of the big changes in R2014b -- *graphics functions return MATLAB objects, not numeric handles*.

When I use MATLAB, I usually don't think about the internal graphics system. Usually, I'm just trying to visualize some data. I might call one of the MATLAB charting functions, like `plot` or `bar` to understand something important about my data. I also create user interfaces using functions like `uicontrol` or `uipanel`. These are functions that are part of the MATLAB graphics system.

Let's suppose that I am creating a plot and I want the line in the plot to be red. I can do that in one of two ways. I can call the `plot` function with some extra arguments like this.

x = 0:0.1:10; y = sin(x); plot(x,y,'Color','red')

I can also use the output argument of the `plot` command to modify my plot after I've created it. Here I created the plot first, then used the `set` command to change the color of the line.

p = plot(x,y) ; set(p, 'Color', 'red')

Prior to R2014b, a call to a graphics function would return a number like this

p = plot(x,y)

p = 174.346

The value of the variable `p` was a special kind of number called a *handle*. The handle was a reference to a graphics object. The object itself was not available to the user but you could use the handle to retrieve or change properties of the object using the `set` and `get` commands. In the example above, we used the handle of the line object to set its color. In fact, when this capability first became available in MATLAB, this approach was so new and useful it was given a special name -- *Handle Graphics*.

All graphics handles were just numeric values. Those values were integers for figures and fractional values for everything else. You could use a handle anywhere you could use a number. I've seen MATLAB code where people stored their handles with other data in a double array or used handles in functions where a number is expected as the argument (e.g. math functions).

As you probably know, graphics objects are stored as a tree structure with parents and children. There is a special object at the top of the tree called the *graphics root*. Prior to R2014b, the handle to the graphics root was always `0`. For example to get the list of all open figures, you could type:

```
myFigures = get(0, 'Children');
```

In R2014b, graphics functions do not return numeric handles any more. They now return MATLAB objects. Now when you call a graphics function with an output argument, you see something like this:

p = plot(x,y)

p = Line with properties: Color: [0 0.4470 0.7410] LineStyle: '-' LineWidth: 0.5000 Marker: 'none' MarkerSize: 6 MarkerFaceColor: 'none' XData: [1x101 double] YData: [1x101 double] ZData: [1x0 double] Use GET to show all properties

The `plot` function now returns a `Line` object. That line object has properties. When MATLAB displays the line object in the command window, it shows the object's most common properties. The `all properties` text is a hyperlink that you can click to see all the properties of the object.

If you use the `whos` function to see more information about `p`, you see something like this:

```
whos p
```

Name Size Bytes Class Attributes p 1x1 112 matlab.graphics.chart.primitive.Line

The *Class* column in the `whos` output indicates what kind of object if is. What you see here is the full class name of the object. Don't worry too much about the details of the class name. Most of you will never need to use it.

In R2014b, use the `groot` function to refer to the graphics root.

groot

ans = Graphics Root with properties: CurrentFigure: [1x1 Figure] ScreenPixelsPerInch: 96 ScreenSize: [1 1 1920 1200] MonitorPositions: [1 1 1920 1200] Units: 'pixels' Use GET to show all properties

One of the benefits of having graphic objects is that you can now get and set their values using dot notation. Dot notation has the form `object.Property`. It is the same syntax that you use when you want to refer to the field of a structure. For example, in R2014b, you can get the color of the line above like this:

p.Color

ans = 0 0.4470 0.7410

Similarly, you can set the line width of the line using dot notation.

p.LineWidth = 2;

Dot notation is also useful if you want to set properties of two objects to be the same. Prior to R2014b, you might write code that looks like this:

set(p2, 'Color', get(p1, 'Color'))

With dot notation, you can do the same thing with

p2.Color = p1.Color;

With dot notation you can also use tab completion when you are trying to access a graphics object property.

You can still use the `set` and `get` functions but I find dot notation a much more convenient way to refer to object properties. One note of caution, however. When you use dot notation you must use the correct capitalization of the property name. For example,

p.LineWidth

will return the value of the line width but

p.linewidth

will cause an error.

As I mentioned above, you can still use the `set` and `get` functions and there are cases where `set` and `get` are useful.

You can use the `get` command if you want to know all the properties of a graphics object. This is the same list of properties you see if you click the `all properties` link in the example above.

get(p)

AlignVertexCenters: 'off' Annotation: [1x1 matlab.graphics.eventdata.Annotation] BeingDeleted: 'off' BusyAction: 'queue' ButtonDownFcn: '' Children: [] Clipping: 'on' Color: [0 0.4470 0.7410] CreateFcn: '' DeleteFcn: '' DisplayName: '' HandleVisibility: 'on' HitTest: 'on' Interruptible: 'on' LineStyle: '-' LineWidth: 2 Marker: 'none' MarkerEdgeColor: 'auto' MarkerFaceColor: 'none' MarkerSize: 6 Parent: [1x1 Axes] PickableParts: 'visible' Selected: 'off' SelectionHighlight: 'on' Tag: '' Type: 'line' UIContextMenu: [] UserData: [] Visible: 'on' XData: [1x101 double] XDataMode: 'manual' XDataSource: '' YData: [1x101 double] YDataSource: '' ZData: [1x0 double] ZDataSource: ''

The `set` function is useful if you want to know what options are available for a given property. In this case you can call `set` with the property name but no value for the property.

```
set(p,'LineStyle')
```

'-' '--' ':' '-.' 'none'

Another use of `set` occurs when you need to reference a property from an array of graphics objects. Suppose you have an array of line objects created by the `plot` command:

x = 0:0.1:10; y1 = sin(x); y2 = cos(x); lineArray = plot(x,y1,x,y2)

lineArray = 2x1 Line array: Line Line

You can use the `set` function to change the colors of all the lines in the array in a single command.

set(lineArray, 'Color', 'blue')

What do you think about the change to graphics objects in R2014b? Have you tried using dot notation for getting and setting graphics object properties? We'd love to hear your thoughts here.

There are some other changes in R2014b that will impact some existing MATLAB code. These changes mostly impact advanced graphics users and people who build more complicated user interfaces. Next time, I will describe the important changes and what steps you should take to make your code compatible with R2014b. I'll also give you some guidance on how to write code that works in multiple releases.

Get
the MATLAB code

Published with MATLAB® R2014b

I believe it was almost four years ago that we started kicking around the idea of changing the default colormap in MATLAB. Now, with the major update of the MATLAB graphics system in R2014b, the colormap change has finally happened. Today I'd like to introduce you to parula, the new default MATLAB colormap:

showColormap('parula','bar')

(Note: I will make `showColormap` and other functions used below available on MATLAB Central File Exchange soon.)

I'm going spend the next several weeks writing about this change. In particular, I plan to discuss why we made this change and how we settled on parula as the new default.

To get started, I want to show you some visualizations using jet, the previous colormap, and ask you some questions about them.

**Question 1:** In the chart below, as you move from left to right along the line shown in yellow, how does the data change? Does it trend higher? Or lower? Or does it trend higher in some places and lower in others?

fluidJet hold on plot([100 160],[100 135],'y','LineWidth',3) hold off colormap(gca,'jet')

**Question 2:** In the filled contour plot below, which regions are high and which regions are low?

filledContour15 colormap(gca,rgb2gray(jet))

The next questions relate to the three plots below (A, B, and C) showing different horizontal oscillations.

**Question 3:** Which horizontal oscillation (A, B, or C) has the highest amplitude?

**Question 4:** Which horizontal oscillation (A, B, or C) is closest to a pure sinusoid?

**Question 5:** In comparing plots A and C, which one starts high and goes low, and which one starts low and goes high?

subplot(2,2,1) horizontalOscillation1 title('A') subplot(2,2,2) horizontalOscillation2 title('B') subplot(2,2,3) horizontalOscillation3 title('C') colormap(jet)

Next time I'll answer the questions above as a way to launch into consideration of the strengths and weaknesses of jet, the previous default colormap. Then, over the next few weeks, I'll explore issues in using color for data visualization, colormap construction principles, use of the L*a*b* color space, and quantitative and qualitative comparisons of parula and jet. Toward the end, I'll even discuss how the unusual name for the new colormap came about. I've created a new blog category (colormap) to gather the posts together in a series.

If you want a preview of some of the issues I'll be discussing, take a look at the technical paper "Rainbow Color Map Critiques: An Overview and Annotated Bibliography." It was published on mathworks.com just last week.

Get
the MATLAB code

Published with MATLAB® R2014b

MathWorks is the only company in the world with logo that satisfies a partial differential equation. Why is the region for this equation shaped like a capital letter L?

The wave equation describes how a disturbance travels through matter. If the units are chosen so that the wave propagation speed is equal to one, the amplitude of a wave satisfies

$$ {{\partial^2 u} \over {\partial t^2}} = \triangle u $$

The $\triangle$ denotes Laplace's operator

$$ \triangle = {\partial^2 \over {\partial x^2}} + {\partial^2 \over {\partial y^2}} $$

Geometry plays a crucial role here. Initial values of the amplitude and velocity of the wave are prescribed on a certain region. Values of the amplitude or its normal derivative are also prescribed on the boundary of the region. If the wave vanishes outside the region, these boundary values are zero.

Separating out periodic time behavior leads to solutions of the form

$$ u(x,y,t) = \cos{(\sqrt{\lambda}\,t)} v(x,y) $$

The functions $v(x,y)$ also depend upon $\lambda$ and the region. They satisfy the differential equation

$$ \triangle v + \lambda v = 0 $$

and are zero on the boundary of the region. The quantities $\lambda$ that lead to nonzero solutions are the *eigenvalues*, and the corresponding functions $v(x,y)$ are the *eigenfunctions* or *modes*. They are determined by the physical properties of the medium and the geometry of the region. The square roots of the eigenvalues are resonant frequencies. A periodic external driving force at one of these frequencies generates an unboundedly strong response in the medium.

Any solution of the wave equation can be expressed as a linear combination of these eigenfunctions. The coefficients in the linear combination are obtained from the initial conditions.

In one dimension, the eigenvalues and eigenfunctions are easily determined. The simplest example is a violin string, held fixed at the ends of an interval. The eigenfunctions are trig functions.

$$ v_k(x) = \sin{(k x)} $$

If the length of the interval is $\pi$, the eigenvalues are determined by the boundary condition, $v_k(k \pi) = 0$. Hence, $k$ must be an integer and

$$ \lambda_k = k^2 $$

If the initial condition is expanded in a Fourier sine series,

$$ u(x,0) = \sum_k a_k \sin{(k x)} $$

(And the initial velocity is zero), then the solution to the wave equation is

$$ u(x,t) = \sum_k a_k \cos{(\sqrt{\lambda_k}\,t)} v_k(x) $$

Here are graphs of the first nine eigenfunctions in one dimension. The corresponding eigenvalues are the squares of integers.

```
eigenvals = (1:9).^2
plot_modes('1d')
```

eigenvals = 1 4 9 16 25 36 49 64 81

The simplest region in two dimensions is a square. The eigenfunctions are again trig functions.

$$ v_{k,j}(x,y) = \sin{(k x)}\,\sin{(j y)} $$

If the sides have length $\pi$, the boundary conditions imply that $k$ and $j$ must be integers. Here are the first nine eigenvalues and eigenfunctions.

```
[k,j] = meshgrid(1:3); e = k.^2+j.^2; eigenvals = e(:)'
plot_modes('square')
```

eigenvals = 2 5 10 5 8 13 10 13 18

If the region is a circular disc, we switch to polar coordinates, $r$ and $\theta$. Trig functions are replaced by Bessel functions. The eigenfunctions become

$$ v_{k,j}(r,\theta) = B_j(\mu_k r) \,\sin{(j \theta)} $$

where $B_j$ is the $j$ -th order Bessel function and $\mu_k = \sqrt{\lambda_k}$. To find the eigenvalues we need to have the eigenfunctions vanish on the boundary of the disc. If the radius is one, we require

$$ B_j(\mu_k) = 0 $$

In other words, we need to compute zeros of Bessel functions. Here are the first nine eigenvalues and eigenfunctions of the circular disc. The violin string has become a tambourine.

```
eigenvals = [bjzeros(0,3) bjzeros(1,3) bjzeros(2,3)].^2
plot_modes('disc')
```

eigenvals = Columns 1 through 7 5.7832 30.4713 74.8870 14.6820 49.2185 103.4995 26.3746 Columns 8 through 9 70.8500 135.0207

Replace the full circular disc by a three-quarter circular sector. The angle at the origin is $270^\circ$ or $\frac{3}{2}\pi$ radians. We can make our eigenfunctions adapt to this angle. Take

$$ v_{k,j}(r,\theta) = B_{\alpha_j}(\mu_k r) \,\sin{(\alpha_j \theta)} $$

with

$$ \alpha_j = \frac{2}{3} j $$

and fractional order Bessel functions. The eigenfunctions satisfy their differential equation and also satisfy the boundary conditions on both sides of the angle.

$$ v_{k,j}(r,\theta) = 0 $$

at $\theta = 0$ and at $\theta = \frac{3}{2}\pi$.

By finding the zeros of the Bessel functions we can also have the eigenfunctions satisfy the boundary conditions on the outer circular portion of the boundary. Here are the first nine eigenvalues and eigenfunctions of the three-quarter circular sector.

```
eigenvals = [bjzeros(2/3,3) bjzeros(4/3,3) bjzeros(6/3,3)].^2
plot_modes('sector')
```

eigenvals = Columns 1 through 7 11.3947 42.6442 93.6362 18.2785 56.1131 113.6860 26.3746 Columns 8 through 9 70.8500 135.0207

These eigenfunctions have another important property. Most of them are singular; the derivatives of the fractional order Bessel functions are unbounded at the origin. You can see the black concentration of grid lines in the plots. If you tried to make a tambourine with this sector shape, it would rip at the sharp corner. This singular behavior is needed to model the solution to the wave equation on this region.

For all the regions we have discussed so far it is possible to express the eigenvalues as zeros of analytic functions. For the interval and the square, the eigenvalues are integers, which are the zeros of $\sin{\pi x}$. For the circular disc and sector, the eigenvalues are zeros of Bessel functions. Once we have the eigenvalues, it is easy to compute the eigenfunctions using sines and Bessel functions.

So, an L-shaped region formed from three unit squares is interesting for at least two reasons. It is the simplest geometry for which solutions to the wave equation cannot be expressed analytically; numerical computation is necessary. Furthermore, the 270 degree nonconvex corner causes a singularity in the solution. Mathematically, the gradient of the first eigenfunction is unbounded near the corner. Physically, a membrane stretched over such a region would rip at the corner. This singularity limits the accuracy of finite difference methods with uniform grids.

I used the L-shaped region as the primary example in my doctoral thesis fifty years ago. MathWorks has adopted a modified surface plot of the first eigenfunction as the company logo. I am going to devote a series of blog posts to the L.

Here are the first nine eigenvalues and eigenfunctions, computed by a function from Numerical Computing with MATLAB, which I will discuss in a future posting. Compare these eigenfunctions with the ones from the circular sector, which has the same reentrant corner and resulting singularity.

for k = 1:9 [~,eigenvals(k)] = membranetx(k); end eigenvals plot_modes('L')

eigenvals = Columns 1 through 7 9.6397 15.1973 19.7392 29.5215 31.9126 41.4745 44.9485 Columns 8 through 9 49.3480 49.3480

Simple model problems involving waves on an L-shaped region include an L-shaped membrane, or L-shaped tambourine, and a beach towel blowing in the wind, constrained by a picnic basket on one fourth of the towel.

A more practical example involves ridged microwave waveguides. One such device is a waveguide-to-coax adapter. The active region is the channel with the H-shaped cross section visible at the end of the adapter. The ridges increase the bandwidth of the guide at the expense of higher attenuation and lower power-handling capability. Symmetry of the H about the dotted lines shown in the contour plot of the electric field implies that only one quarter of the domain needs to be considered and that the resulting geometry is our L-shaped region. The boundary conditions are different than our membrane problem, but the differential equation and the solution techniques are the same.

The photo is courtesy of Advanced Technical Materials, Inc. See their website, <http://atmmicrowave.com/>, for lots of devices like this.

waveguide

Get
the MATLAB code

Published with MATLAB® R2014b

Greg's pick this week is Quad-Sim by David.

There have been a bunch of models for multiple-rotor flying machines floating around the File Exchange, but this entry is far and above the most comprehensive entry I have seen.

It has neat, sophisticated Simulink models of the dynamic system behavior; graphical user interfaces for data analysis, data formatting, and visualization; and finally, it includes the notion of model development and verification.

To me this entry represents the power MATLAB and Simulink can provide to the design of complex dynamic systems such as flying machines.

This video summarizes it very nicely.

Okay, maybe not everything. At some point you need to have substance in order to be useful. And documentation helps create the context in which to operate the tools. The Quad-Sim entry has all of these. The content of this entry is well organized in a way which helps detail what exactly is available within the content of the Quad-Sim tools.

In addition, the Simulink simulation models are well organized in terms of functional hierarchy. In other words, elements contained within the Simulink model are placed in well-named subsystems and libraries according to their functionality. There are clear partitions between inputs, controllers, plant dynamics, sensors, etc.

It sounds simple, but it makes all the difference. The organization of the content and the models alone made it very easy for me to understand, and start using the models.

I am definitely a proponent of Simulink as a platform for the development of dynamic systems and their control. Because I believe in Simulink as a design tool, it also means I spend a great deal of time and effort thinking about how to create and share models and supporting materials.

I have a self-imposed bias that makes it difficult for me to demonstrate or promote models that are messy, or otherwise difficult to understand. As such, I struggle to find File Exchange entries that involve Simulink, especially because Simulink is designed as a graphical tool, and therefore the form of the simulation model is at the forefront.

If David and his colleagues somehow believe that the world of development and design will always be well thought-out and documented, they will soon be in for a shock.

Generally deadlines of overlapping projects loom, and we must produce functionality to meet specific needs now, and worry about documentation later (if at all).

As all of us know, being a user of a tool we want clear, well though-out documentation that not only has details of how to use a particular tool or feature, but examples of its use. Finally understanding the underlying nature of a particular feature helps us better use features appropriately. But as many of us are some sort of developer or designer, we also are aware of the effort required to produce such rigorous documentation.

The Quad-Sim not only has well written documentation on specifically how to use the included tools (e.g. Modeling GUI):

It also has detailed explanations of various experimental processes so they can be reproduced, and the tools can be understood relative to a specific design context (e.g. Motor Modeling Hardware).

The fact that David and his colleagues bothered to attempt and verify their system models really excited me. I know it sounds like maybe I should reconsider my priorities in life, but I couldn’t help showing some of my colleagues in my hall the Quad-Sim entry. Nor could I hide my excitement in seeing users spend the time to make sure their models are correct.

A model is an invaluable tool for design and analysis of complex systems. But what good is the use of that model for design if it is “wrong”?

Certainly “wrong” is a relative term. A model is always wrong, otherwise it’s not a model. I am specifically referring to the notion that a model must capture the important attributes of a system sufficiently with enough accuracy to be useful for the purposes of the design and analysis.

Maybe I should say: what good is using a model that is “too wrong”?

It wasn’t clear to me that David and his colleagues used automated C-code generation from Simulink to implement the controller on the Arduino processor. Control systems like this are ideal candidates for our C-code generation technology from Simulink.

In my opinion, the Quad-Sim entry covers the more important aspects of the design process. It’s more important to get it right, without damaging people or equipment, than to implement quickly.

Seeing as that part of the process has been taken care of, it seems natural to me to automate the process of generating the control algorithm from the Simulink model to be implemented on the Arduino processor.

This enables two key design process attributes:

- Prevent the introduction of errors when translating designs from Simulink to C-code
- Permit design iterations to be tested in simulation, and then quickly implemented

Both of these help save demand on software development resources, especially as designs become more complex, and design cycle iterations become shorter. We do offer an Arduino Hardware Support Package that enables you to include connections of the model to hardware peripherals such as analog-to-digital converters (ADC), and go all the way from the Simulink model to the executable running on the hardware with a single button click.

This is fine if you can represent the entire system software in Simulink. But in most cases, engineers want to integrate the control algorithm with some existing software framework. For this, you can leverage Embedded Coder to generate the C-code and then integrate the resulting code into your software project.

I have lots of other comments and notions about this entry, but I’ll save that for other discussions. Let us know what you think.

**Comments**

If you would like to leave any comments regarding this post, please click here.

Get
the MATLAB code

Published with MATLAB® R2014b

We are very excited to announce that MATLAB r2014b introduces a new graphics system. Graphics developer, Mike Garrity, is launching a graphics specific blog. Here is the landing page for this release of MATLAB and for the new graphics system in general.]]>

I wanted to show you a glimpse of some of the new math functionality available in R2014b.

Recently on the MATLAB newsgroup, Christoph asked this question:

*I have a vector A shown below, which has 6 elements. the elements are already sorted in descending order. now i want to create vector C by deleting elements from A, starting with element a1, until the sum of the vector equals or is smaller the value B*

A= 26 23 20 19 15 14

B=70

*So, the output should be*

C= 20 19 15 14

*Any idea how to do this?*

*I would do something like this. Get the numbers in increasing order and perform a cumulative sum.*

Aflip = flipud(A); Asums = cumsum(Aflip);

*Find the first sum that is GREATER than b. The next element is where you want to start.* startIndex = find(Asums > b, 1, 'first');

*But things are reversed now. So we need to find the starting index into the original array.* Aind = length(A) - startIndex + 1;

*Now delete the leading values you don’t want.*

Afiltered = A((Aind+1):end);

In R2014b, the function `cumsum` has a new option, `direction`. (So do some other functions in the "|cum|" category.) You can specify to do the calculation in the default or `'forward'` direction, or in `'reverse'`. Here's what the new solution looks like.

A = [ 26 23 20 19 15 14]; B = 70; tmpA = cumsum(A,'reverse') ind = find(tmpA > B, 1,'last') C = A; C(1:ind) = []

tmpA = 117 91 68 48 29 14 ind = 2 C = 20 19 15 14

Have you had a chance to explore some of the math functionality in R2014b? Which features did you find helpful? Let me know here.

Get
the MATLAB code

Published with MATLAB® R2014b

I'm terrible at rote memorization.

I thought about that recently when I happened to see a plot (I think it was in a documentation example) that compared $sin(\theta)$, $cos(\theta)$, and $sin(\theta) cos(\theta)$.

theta = linspace(0,6*pi,500); plot(theta,sin(theta)) hold on plot(theta,cos(theta)) plot(theta,sin(theta) .* cos(theta)) hold off legend({'sin(\theta)','cos(\theta)','sin(\theta) cos(\theta)'})

Hmm, I thought. I guess $sin(\theta) cos(\theta)$ is just a sinusoid with twice the frequency and half the amplitude.

How is that related to memorization, you might ask? Well, I always had a very hard time remembering trig identities in high school and college. But, as an electrical engineering student, I had the following formula engraved on my brain cells:

$$e^{j\theta} = cos(\theta) + j sin(\theta)$$

(Note that I am following the proud electrical engineering tradition of using *j* instead of *i* as the imaginary unit. Because *i* obviously stands for current.)

This is called Euler's formula, and it's used all over the place in electrical engineering. I did not know at the time that Richard Feynman called it "the most remarkable formula in mathematics."

Here are a couple of closely related formulas:

$$cos(\theta) = \frac{1}{2} (e^{j\theta} + e^{-j\theta})$$

$$sin(\theta) = \frac{1}{2j} (e^{j\theta} - e^{-j\theta})$$

At some point during my undergraduate days, it dawned on me that I could derive many of the common trig identities directly from these formulas. Here's how it goes for $sin(\theta) cos(\theta)$:

$$ sin(\theta) cos(\theta) = \frac{1}{2j} (e^{j\theta} - e^{-j\theta}) \frac{1}{2} (e^{j\theta} + e^{-j\theta}) $$

$$ sin(\theta) cos(\theta) = \frac{1}{4j} (e^{j2\theta} - e^{-j2\theta} + 1 - 1) $$

$$ sin(\theta) cos(\theta) = \frac{1}{2} \frac{1}{2j} (e^{j2\theta} - e^{-j2\theta}) $$

$$ sin(\theta) cos(\theta) = \frac{1}{2} sin(2\theta) $$

Presto! Like magic.

I used to think that I must be kind of weird to prefer this manipulation of Euler's formula to simply memorizing the trig identities. But while writing this blog today, I noticed that the Wikipedia article on Euler's formula describes this concept exactly. "Complex exponentials can simplify trigonometry, because they are easier to manipulate than their sinusoidal components."

OK!

PS. In my plot creation code above, I exploited a subtle improvement in MATLAB R2014b graphics that customers have been requesting for many years. Can you spot it?

Get
the MATLAB code

Published with MATLAB® R2014b

**Fast Restart**

If you are dealing with large models, you are definitely aware that every time you click play to simulate a model, a good amount of time can be spent in initialization. In R2014b, if you know you will simulate your model multiple times without making structural changes, you can click the Fast Restart button:

At the end of the simulation, the model will remain in an Initialized state, and the toolbar will indicate this, telling you: "I am ready for fast restart".

At this point you can change the value of any tunable parameter and click play. The simulation will start instantly.

**Quick Insert**

Those who prefer typing over using the mouse will love the Quick Insert feature in R2014b. Type whatever you need in the canvas and the blocks magically appear! Once the block is there, a pop up offers to set the value of the most common parameter for this block:

**Model Templates**

You can now create models from templates:

This will launch the Simulink Template Gallery:

Of course, you can add your own templates to the gallery using the **File** menu of any model:

**Now it's your turn**

As you can see, R2014b contains many new features designed especially to speed up your workflow.

What is your favorite feature in R2014b? Let us know by leaving a comment here

]]>- Connect MATLAB Mobile to your computer with the MATLAB Connector. For more information on how to do this, refer to the getting started instructions here. This feature is only supported on MATLAB R2014a and later, so make sure you are on a compatible version.
- Install the MATLAB Support Package for Android Sensors. Choose
*Add-ons*from the MATLAB Toolstrip, and then choose*Get Hardware Support Packages*. This will open the support package installer. Choose*Android Sensors*from the list and follow the instructions. - To establish communication between the sensors on your device and MATLAB, create a mobiledev object, as follows:

m = mobiledev;

Once you have completed the 3 steps from the above section, go to MATLAB Mobile and turn on the accelerometer. You should see something akin to this:

You can also enable the sensor directly from MATLAB, by executing the following command:

m.AccelerationSensorEnabled = 1;

Did you notice the enabled

m.Logging = 1;You can verify this in MATLAB, note the Current Sensor Values in the result:

m

m = mobiledev with properties: Connected: 1 Logging: 1 InitialTimestamp: '02-Oct-2014 21:53:26.707' AccelerationSensorEnabled: 1 (20 Logged values) AngularVelocitySensorEnabled: 0 MagneticSensorEnabled: 0 OrientationSensorEnabled: 0 PositionSensorEnabled: 0 Current Sensor Values: Acceleration: [0.2631 5.9226 8.1850] (m/s^2)

Walk around your campus/home/floor with your device. Once you are satisfied, stop sending this data to MATLAB. You can either tap on the

m.Logging = 0;To retrieve the data, use the accellog variable:

[a, t] = accellog(m);

Once you have retrieved the logged acceleration data, you can plot it in MATLAB:

plot(t, a); legend('X', 'Y', 'Z'); xlabel('Relative time (s)'); ylabel('Acceleration (m/s^2)');

Calculate the magnitude to convert your X, Y and Z vectors to scalar values. Then, plot it.

x = a(:,1); y = a(:,2); z = a(:,3); % Calculate and plot magnitude. mag = sqrt(sum(x.^2 + y.^2 + z.^2, 2)); plot(t, mag); xlabel('Time (s)'); ylabel('Acceleration (m/s^2)');

To remove constant effects such as gravity, you can subtract the mean from this data.

% Accounting for gravity. magNoG = mag - mean(mag); % Plot magnitude. plot(t, magNoG); xlabel('Time (s)'); ylabel('Acceleration (m/s^2)');

The plotted data is now centered on zero, and shows peaks which correspond to a step taken while walking.

To determine the number of steps taken, you can use to FINDPEAKS function from Signal Processing Toolbox. In this example, we are treating only peaks with a minimum height above one standard deviation as a step. This threshold should be tuned experimentally to match a person’s level of movement while walking, hardness of floor surfaces etc.

% Use FINDPEAKS to determine the local maxima. minPeakHeight = std(magNoG); [pks, locs] = findpeaks(magNoG, 'MINPEAKHEIGHT', minPeakHeight);The number of steps taken is the number of peaks:

numSteps = numel(pks)

numSteps = 15Finally, you can also identify these locations on your plot of acceleration magnitude data:

hold on; % Place a red marker on the locations that correspond to peaks. plot(t(locs), pks, 'r', 'Marker', 'v', 'LineStyle', 'none'); title('Counting Steps'); xlabel('Time (s)'); ylabel('Acceleration Magnitude, Gravity Removed (m/s^2)'); hold off;

Once you are done, make sure you turn off the acceleration sensor and clear the mobiledev object.

```
m.AccelerationSensorEnabled = 0;
clear m;
```

- Acquiring Data from Sensors - MATLAB Mobile
- Android Sensor Support from MATLAB
- Documentation for mobiledev Object

Let us know by leaving a comment below.

Published with MATLAB® R2014b

]]>Mike is a computer graphics guru, well informed on the state of the art and knee-deep in the operational knowledge of How It All Works. He helped develop many of the features he’ll be describing, and he is a thoroughly entertaining writer. Beyond all this goodness, MATLAB graphics have taken a tremendous leap forward with the latest release, and Mike’s blog will give you an ideal perch to learn all about it. I recommend heading over there and seeing what he has to say.

]]>In the meantime, I'd like to invite you to head over to see Mike Garrity's new blog about MATLAB graphics. The MATLAB graphics system received a substantial update in R2014b, and Mike has a lot of great stuff to show you.

Welcome to the MATLAB Central blogs, Mike!

]]>Sean's pick this week is makeinstall and Make Install Technology by Norbert Marwan.

My pick this week is a tool that makes it easy to package your own toolboxes into a single installer file to distribute. For 12.5 years this file has served a great purpose.

Starting with MATLAB R2012b, we introduced Apps. Apps packaging allow you to take your user interface and package it into a single file. A dependency analysis is done and you can include any auxiliary files that might be necessary such as data. The generated app has version control and can be installed into another user's Apps tab. Apps, however, only have one entry point and are thus not good for toolbox packaging.

Now, beginning with R2014b, you can package your own toolbox from the MATLAB toolstrip and it will build a single install file like Norbert's makeinstall.

Let's look at the two approaches: here is a directory of a few of my utilities I'd like to package up.

currentdir = cd('myutils'); % navigate makeinstall % Build install cd(currentdir) % restore

Three files are generate in my directory:

I copied these to a new directory and ran `install`. It took me through some steps, including how to add it to path at startup, a few tips on what it does and how to use it.
At the end it recommends running the following to see the contents:

`helpwin myutils`

Which reveals the H1 lines for my functions!

I'm impressed. In addition to this, there's also the Make Install Technology white paper that explains how this all works in great detail.

Now let's use the R2014b functionality.

I'll select the folder and it will grab all of the files and their dependencies. Then I can put in any other meta information, documentation example etc. that I might have:

This builds a single install file which I can give to anyone:

This can be installed into the custom toolboxes.

It's automatically added to the path so those utilities can be used anywhere. I do hazard a word of caution that this may
make it easier to have shadowed files. For example: I now have three versions of those functions, which one is actually going
to be called? If I make an update to one, how do I ensure it's the right one? `which -all` will be your friend here.

which -all FEX

C\Documents\MATLAB\FEX.m C\Documents\MATLAB\myutils\FEX.m % Shadowed C\Documents\MATLAB\Toolboxes\myutils\FEX.m % Shadowed

So thank you Norbert for being ahead of the curve and providing us with the make install technology as well as for maintaining it over the last decade.

Give both approaches a try and let us know what you think here or leave a comment for Norbert.

Get
the MATLAB code

Published with MATLAB® R2014b

Today I’d like to introduce a guest blogger, David Garrison, who is a MATLAB Product Manager here at MathWorks. This is the first in a series of blogs over the next few weeks describing the new graphics system in R2014b and how some of the changes will affect you.

**Part 1: Features of the New Graphics System**- Part 2: Using Graphics Objects
- Part 3: Compatibility Considerations in the New Graphics System

Here is Part 1 of the series.

- Big Changes in R2014b
- The New MATLAB Graphics System
- The New Look of MATLAB Graphics
- Rotatable Tick Labels
- Automated Updating of
`datetime`Tick Labels - Animated Plots
- Multilingual Text and Symbols in Plots
- User Interfaces with Tab Panels
- Improved Histograms
- Have you tried the new graphics system in R2014b?
- Next up -- Part 2: Using Graphics Objects

There are a number of big changes in R2014b. Some of the new features include:

- New MATLAB graphics system
- Date and time data types with time zone and display options
- Git and Subversion source control integration and access to projects on GitHub from File Exchange
- MATLAB toolbox packaging as single, installable files for easy sharing and downloading of user-developed tools
- MATLAB MapReduce™ data analysis that scales Hadoop for big data
- Arduino and Android hardware support for interacting with motors and actuators, and for accessing sensor data

You can learn more about all these features in the MATLAB R2014b Release Notes. This post is all about the new graphics system.

R2014b includes a new MATLAB graphics system. The new graphics system includes many new features which we will describe in this blog post. In Part 2 of this series, we will describe how graphics handles have changed in R2014b and how to use graphics objects. Finally, there are some changes in the new system which may require changes in some existing graphics related code. We'll discuss compatibility considerations in Part 3 of this series.

When you create a plot in R2014b, you'll see that MATLAB graphics look different than previous versions.

The first thing that you will notice is that lines are plotted using a different set of colors. New line colors were selected to make it easier to distinguish lines from one another and to help people with certain types of color blindness. There is also a new default colormap in R2014b called *parula*. The colors in the parula colormap are ordered from dark to light and are perceptually uniform. Smooth changes in the data appear as smooth changes in color, while sharp changes in the data appear as sharp changes in color. Grid lines are now gray to make data stand out visually. Axis labels and titles are larger and more prominent. Lines and text are now anti-aliased (smoothed) to remove jagged edges.

If you're like me, you sometimes need to use long labels for the ticks in your plot. In previous versions of MATLAB, tick labels were always displayed horizontally as shown on the left in the example below. In the new graphics system, ticks can be rotated so you can create a plot like that shown on the right. Tick labels on both the x and y axes can be rotated.

For example, to rotate the X tick labels, use the `XTickLabelRotation` property of the `Axes` object:

ax = gca; ax.XTickLabelRotation = -40;

R2014b includes a new date and time data type called a `datetime` array. When you create a plot with a `datetime` array, the tick labels will automatically update as you pan or zoom in the plot. Tick labels change from days to hours to minutes to seconds as shown below.

MATLAB R2014b graphics introduces a new function called `animatedline`. The code below shows how to create an `animatedline` and add points to it.

x = 1:100; y = rand(1,100); myLine = animatedline; myLine.Color = blue; xlim([1 100]) ylim([0 1]) for i = 1:100 addpoints(myLine,x(i),y(i)) pause(0.1) end

The result is a plot that changes over time. The picture below shows how the plot looks at iterations 20, 50, and 80 as points are added.

R2014b graphics also supports the use of Unicode characters to show multilingual text in axis labels and titles and in user interface controls. Here is an example. See the `native2unicode` function for more information about Unicode strings.

Creating a user interface with tab panels has been a common request among MATLAB UI builders. In the past, people had done it using undocumented functions. In R2014b, those functions have been updated and are now fully documented and supported.

You can easily create a user interface like the one shown above using the new `uitabgroup` and `uitab` functions.

myFig = figure('Toolbar', 'none', ... % Create the figure 'Menubar', 'none', 'Name', 'Using Tab Panels'); tgroup = uitabgroup('Parent', myFig); % Create the tabgroup tab1 = uitab('Parent', tgroup, 'Title', 'Loan Data'); % Create the tabs tab2 = uitab('Parent', tgroup, 'Title', 'Amortization Table'); tab3 = uitab('Parent', tgroup, 'Title', 'Principal/Interest Plot');

The new `histogram` function plots histograms with data-dependent bin picking. It provides capabilities that the traditional `hist` function does not including options for bin control, normalization, and visualization.

The code below shows how to create two overlapping histograms and modify their properties after they are created.

data1 = randn(5000,1); data2 = randn(5000,1)+ 2;

h1 = histogram(data1); hold on h2 = histogram(data2); hold off legend show

h1.BinMethod = 'sturges'; h2.Normalization = 'countdensity';

Have you installed MATLAB R2014b? Have you tried the new graphics system? We'd love to hear your thoughts here.

Well, that's all for now. Be sure to check out Mike Garrity's new blog, Mike on MATLAB Graphics, for some cool ideas about what you can do with the new graphics system.

In my next post I'll talk about how graphics handles have changed in R2014b and how to use graphics objects.

Get
the MATLAB code

Published with MATLAB® R2014b

A fascinating question came to the Image Processing Toolbox development team recently from Brett Shoelson, a MathWorks application engineer, prolific File Exchange contributor, and former MATLAB Central blogger. Brett was trying to solve a problem for a customer. Part of the problem came down to this question:

Suppose you have a label matrix that represents a number of different regions. Are any of the regions disconnected?

Let me elaborate. A label matrix looks something like this:

```
L = [ ...
1 1 1 2 5 6
1 1 2 2 5 5
3 4 4 4 4 4
3 3 7 7 7 8
3 3 7 7 8 8
9 9 9 9 9 4]
```

L = 1 1 1 2 5 6 1 1 2 2 5 5 3 4 4 4 4 4 3 3 7 7 7 8 3 3 7 7 8 8 9 9 9 9 9 4

For a positive integer k, the set of elements of `L` equal to k is the kth region. So, for example, the 5th region is:

L == 5

ans = 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Or, displayed as an image:

imshow(L == 5,'InitialMagnification','fit')

So my sample label matrix here represents 9 different regions. One of them, region 4 is disconnected, meaning it consists of more than one connected set (or connected component) or pixels.

L == 4

ans = 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

imshow(L == 4,'InitialMagnification','fit') title('Region 4')

How can we programmatically determine whether any of the regions is disconnected? A brute force method is to compute, one at a time, the binary image associated with each region and then call `bwconncomp` to determine the number of connected components associated with it.

for k = 1:9 mask_k = (L == k); cc = bwconncomp(mask_k); if cc.NumObjects > 1 fprintf('Region %d is disconnected.\n',k) end end

Region 4 is disconnected.

If a brute force method gives you the desired answer and does it fast enough, then sometimes you just declare success and move on to solve the next problem of the day. Was this method fast enough? "No," said Brett. "My customer's images are about 25,000-by-25,000, and this method is way too slow."

"Hmm," I replied, trying to sound thoughtful while really just playing for time.

"OK, ask your customer this question: Is it known whether any of the regions in question contain any holes?"

Soon, the answer came back: "No, the labeled regions do not contain holes."

That was the answer I had hoped for. I started thinking about how to compute something called the Euler number for every labeled region, more or less simultaneously.

No, not that Euler number. I was thinking of the Euler number for a binary image, which is the number of objects in the image (or the number of connected components) minus the number of holes. (I don't know why this property of a binary image is called Euler number. If you know, please let us know in the comments.) The Image Processing Toolbox function `bweuler` computes this number for a binary image.

For a topological property of an image, the Euler number is unusual because it can be computed by one sweep of a purely local, 2-by-2 neighborhood operation. I first saw the method in the book *Robot Vision*, Berthold K. P. Horn, MIT Press, 1989. In this method, you count the number of occurrences of this 2-by-2 neighborhood pattern (called an *upstream convexity*):

0 0 0 1

and then subtract the number of occurrences of this 2-by-2 neighborhood pattern (called an *upstream concavity*):

0 1 1 1

(Note: this variation of the method works for 4-connected objects, not for 8-connected objects.)

But Brett has a label matrix representing many different regions, not a single binary image mask representing one region. How can we apply this technique?

To answer the question, let's look at the 2-by-2 neighborhood to the northwest of element (3,2):

L32 = L(2:3,1:2)

L32 = 1 1 3 4

I'll take the lower-right pixel as my reference point for the neighborhood. I can look at just the binary mask formed by comparing `L(3,2)` to its three neighbors to the north and west.

L(3,2) == L32

ans = 0 0 0 1

So, for region 4, we have an upstream convexity at this location. We could repeat this calculation in one loop over all the pixel locations, looking for (and counting) upstream convexities and upstream concavities associated with the label at the reference point for each 2-by-2 neighborhood.

I want to show you how to do this just with matrix operations and indexing. No loop is required. The technique is fairly general and can be applied to other local neighborhood operations.

Consider these two submatrices:

L1 = L(1:end-1,1:end-1)

L1 = 1 1 1 2 5 1 1 2 2 5 3 4 4 4 4 3 3 7 7 7 3 3 7 7 8

and

L2 = L(2:end,2:end)

L2 = 1 2 2 5 5 4 4 4 4 4 3 7 7 7 8 3 7 7 8 8 9 9 9 9 4

Note that the elements in `L1` are the northwest neighbors of the corresponding elements in `L2`. This basic idea gives us a general approach for doing certain kinds of neigbhorhood operations without explicit per-pixel loops.

Here's a short function I wrote that uses this technique to compute 4-connected Euler numbers for a label matrix:

```
type eulerNumbers4
```

function e = eulerNumbers4(L) % e = eulerNumbers4(L) % % For a label matrix L containing nonnegative integers, returns a vector e % such that e(k) is the 4-connected Euler number of the binary image (L == % k), for k = 1:max(L(:)). Lp = padarray(L,[1 1],0,'pre'); num_regions = max(Lp(:)); I_NW = Lp(1:end-1,1:end-1); I_N = Lp(1:end-1,2:end); I_W = Lp(2:end,1:end-1); is_upstream_convexity = (L > 0) & (L ~= I_N); is_upstream_convexity = is_upstream_convexity & (L ~= I_NW); is_upstream_convexity = is_upstream_convexity & (L ~= I_W); is_upstream_concavity = (L > 0) & (L ~= I_NW); is_upstream_concavity = is_upstream_concavity & (L == I_N); is_upstream_concavity = is_upstream_concavity & (L == I_W); upstream_convexity_labels = L(is_upstream_convexity); upstream_concavity_labels = L(is_upstream_concavity); total_upstream_convexities = accumarray(upstream_convexity_labels, ... 1,[num_regions 1]); total_upstream_concavities = accumarray(upstream_concavity_labels, ... 1,[num_regions 1]); e = total_upstream_convexities - total_upstream_concavities;

Under the assumption that the regions do not contain holes, the Euler number for each region is exactly the number of 4-connected components belong to that region.

eulerNumbers4(L)

ans = 1 1 1 2 1 1 1 1 1

So, we can finally answer the question: Are any of the labeled regions disconnected?

any(eulerNumbers4(L) > 1)

ans = 1

Yes.

True confessions, though. After I sent this version to Brett, and it didn't work for his problem, we realized that he needed the answer for 8-connected regions, not 4-connected.

It turns out that there is a local neighborhood-based Euler number algorithm for 8-connected regions, but it is significantly more complex to implement in this form. I did write it up, but I thought it would be too much uninteresting code to show here. If you are really interested, the method is described in Pratt, *Digital Image Processing*, 2nd edition, John Wiley & Sons, 1991, pp. 632-634. Pratt, in turn, credits the paper S. B. Gray, "Local Properties of Binary Images in Two Dimensions," *IEEE Transactions on Computers*, C-20, 5, May 1971, pp. 551-561.

The function `bweuler` is based on this algorithm, plus the binary image neighborhood lookup table approach that I have described here previously.

Have you used the Euler number measurement in your own work? Let us know in the comments.

Get
the MATLAB code

Published with MATLAB® R2014a

This is the third in a series of posts on the finite Fourier transform. The Fourier matrix produces an interesting graphic and has a surprising eigenvalue distribution.

The Fourier matrix of order $n$ is the $n$ -by- $n$ complex Vandermonde matrix $F$ whose elements $f_{k,j}$ are powers of the $n$ th root of unity

$$ \omega = e^{-2 \pi i/n} $$

$$ f_{k,j} = \omega^{k j} $$

The matrix can be generated with the MATLAB statements

k = (0:n-1)'; j = (0:n-1); F = exp(-2*pi*i*k*j/n);

Or, by taking the FFT of the identity matrix

F = fft(eye(n))

The statement

plot(F)

connects points in the complex plane whose coordinates are the real and imaginary parts of the elements of the columns of `F`, thereby generating a subgraph of the graph on `n` points. If `n` is prime, connecting the elements of all columns generates the complete graph on `n` points. If `n` is not prime, the sparsity of the graph of all columns is related to the computational complexity, and hence the speed, of the fast FFT algorithm. The graphs for `n` = 8, 9, 10, and 11 are generating and plotted by the following code.

for n = 8:11 subplot(2,2,n-7) F = fft(eye(n)); plot(F) axis square axis off title(['n = ' int2str(n)]) end

Because `n = 11` is prime, the corresponding graph shows all possible connections. But the other three values of $n$ are not prime. Some of the links in their graphs are missing, indicating that the FFT of a vector with that many points can be computed more quickly.

The remainder of this post examines `F12` in more detail. Here is its graph.

clf n = 12; F = fft(eye(n)); plot(F) axis square axis off title(['n = ' int2str(n)])

The program `fftmatrix`, available here, or included with the NCM App, allows you to investigate these graphs. `fftmatrix(n)` plots all the columns of the Fourier matrix of order `n`. `fftmatrix(n,j)` plots just one column.

Let's plot the individual columns of `F12`. The first column of `F12` is all ones, so its plot is just a single point.

for j = 1:12 p = mod(j-1,4)+1; subplot(2,2,p); fftmatrix_mod(12,j-1) title(['j = ' int2str(j)]) if p == 4, snapnow, end end

To see typical behavior, look at the third subplot, the red graph labeled `j = 3`, generated by the third column

F(:,3)

ans = 1.0000 + 0.0000i 0.5000 - 0.8660i -0.5000 - 0.8660i -1.0000 + 0.0000i -0.5000 + 0.8660i 0.5000 + 0.8660i 1.0000 + 0.0000i 0.5000 - 0.8660i -0.5000 - 0.8660i -1.0000 + 0.0000i -0.5000 + 0.8660i 0.5000 + 0.8660i

These are the powers of $\omega^2$. Because 2 divides 12 evenly these powers hit all the even-numbered points around the circle twice and miss all the odd-numbered points. Now look at the cyan graph labeled `j = 11`. These are the powers of $\omega^{10}$, which are the complex conjugates of the powers of $\omega^2$. So the two graphs lie on top of each other.

Six of the twelve columns in the graph of `F12` connect only a subset of the nodes and ten of the columns lie on top of their complex conjugate columns. As a result, when all of the columns are combined to form the full graph, it is sparse. This sparsity, in turn, makes it possible to construct a fast finite Fourier transform algorithm for `n = 12`.

I've always been curious about the eigenvalues and vectors of the Fourier matrices. In 1979, three friends of mine at the University of New Mexico, Gus Efroymson, Art Steger, and Stan Steinberg, posed the question in the SIAM Review problem section. They didn't know that Jim McClellan and Tom Parks had actually solved their problem seven years earlier, when Jim was a grad student working under Tom at Rice. I didn't learn about the McClellan-Parks paper until fairly recently.

The Wikipedia page on the discrete Fourier transform says the facts about the eigenvalues are well known, but that the eigenvectors are the subject of ongoing research.

Let's rescale $F$ so that its columns have unit length.

F = F/sqrt(n);

This makes $F' \ F = I$

round(F'*F)

ans = 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1

Now $F$ is *unitary*, the complex generalization of *orthogonal*. This implies that all of its eigenvalues lie on the unit circle in the complex plane.

Furthermore, it turns out that $F^4 = I$.

round(F^4)

ans = 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1

This implies that any eigenvalue $\lambda$ must satisfy $\lambda^4$ = 1. Consequently the only possible eigenvalues are 1, -1, i, and -i. You might guess that guess that if $n$ is divisible by 4, the eigenvalues would be equally distributed among these four values. But, surprisingly, to me at least, that doesn't happen.

lambda = eig(F)

lambda = 1.0000 + 0.0000i -1.0000 + 0.0000i -0.0000 + 1.0000i -1.0000 + 0.0000i -1.0000 - 0.0000i -0.0000 + 1.0000i 0.0000 - 1.0000i 1.0000 + 0.0000i -0.0000 - 1.0000i 1.0000 - 0.0000i 1.0000 - 0.0000i 0.0000 - 1.0000i

It's hard to pick through this unsorted output, but there are four +1's, three -1's, three -i's, and only two +i's.

Here is a tricky piece of code that uses `angle` and the counting feature of sparse indexing to count the number of each of the four possible eigenvalues.

```
type eigfftmat
```

function c = eigfftmat(n) % EIGFFTMAT Count eigenvalues of the Fourier matrix. % c = eigfftmat(n) is a 4-vector with counts for +1, -1, -i, +i. % Compute the eigenvalues. e = eig(fft(eye(n))); % Sparse repeated indexing acts as a counter. c = full(sparse(mod(round(angle(e')/(pi/2)),4)+1,1,1))'; c([3 2]) = c([2 3]); end

When we run this code for a sequence of values of `n`, we see the pattern predicted by the McClellan-Parks analysis. The number of each of the four possible eigenvalues depends upon `floor(n/4)` and `mod(n,4)`.

disp(' n +1 -1 -i +i') for n = 4:20 disp([n eigfftmat(n)]) end

n +1 -1 -i +i 4 2 1 1 5 2 1 1 1 6 2 2 1 1 7 2 2 2 1 8 3 2 2 1 9 3 2 2 2 10 3 3 2 2 11 3 3 3 2 12 4 3 3 2 13 4 3 3 3 14 4 4 3 3 15 4 4 4 3 16 5 4 4 3 17 5 4 4 4 18 5 5 4 4 19 5 5 5 4 20 6 5 5 4

The proofs in the McClellan and Parks paper involve the eigenvectors and are quite complicated. It turns out that this MATLAB expression

floor((n+[4 2 1 -1])/4)

generates a 4-vector of the multiplicities of the +1, -1, -i, and +i eigenvalues for any given value of `n`.

J. H. McClellan and T. W. Parks, "Eigenvalues and eigenvectors of the discrete Fourier transformation", IEEE Trans. Audio Electroacoust 20, 66-74, <http://dx.doi.org/10.1109/TAU.1972.1162342>, 1972.

G. Efroymson, A. Steger, and S.Steinberg, "A Matrix Eigenvalue Problem", SIAM Review, 21, 139-139, <http://dx.doi.org/10.1137/1021013>, 1979.

Wikipedia, "Discrete Fourier Transform", <http://en.wikipedia.org/wiki/Discrete_Fourier_transform>, retrieved 09/03/2014.

Get
the MATLAB code

Published with MATLAB® R2014a

Let’s see how this works using a simple example!

**The Problem**

Let's say we have a setup where a DC motor is used to move a load between two positions. The position is measured using an encoder, and a PID is used to control the angle of the DC motor. The model looks like:

And when it simulates, the load tracks the desired position in that manner:

How could we detect if something breaks in the motor, or if the dynamics of the load change?

**Online Identification**

When starting the system we will spend a few seconds to identify its dynamics online (while the system is running). Once we have a model estimated, we will compare the encoder reading with the output of the model. If the error between the real system and the model becomes too large, it's a sign something undesired is happening... and we want to detect that!

I wrapped the control loop shown above in a subsystem and connected to it the Recursive Polynomial Model Estimator block from the System Identification Toolbox library.

Using the enable input port, we enable estimation for the first few seconds (only a few periods of the motor motion are necessary to get a good model). Once the estimation is completed, we monitor the estimation error, using very simple logic blocks to make a signal pass from zero to one as soon as the error level becomes larger than an acceptable level.

**The Results**

To test the logic in simulation, I faked the failure by doubling the friction in my DC motor model. As you can see, when the system dynamics change around 10 seconds, the error flag instantly signals the problem:

**What's next?**

The online estimation block also supports code generation. To see that in action, I recommend watching the video Online Fault Detection for a DC Motor by my colleague Karthik Srinivasan.

In this video, Karthik uses a model similar to the one I describe above, except that instead of a simulated DC Motor, he used driver blocks from the Arduino Simulink Suppport Package, to detect a failure using a real DC motor.

**Now it's your turn**

As you can imagine, fault detection is only one application of online parameters estimation. For example the identified model could be used in any sort of adaptive control.

Let us know what you think of the new online estimation features of the System Identification Toolbox by leaving a comment here.

]]>Jiro's pick this week is Colormouse by Patrick.

In addition to examining the actual values or creating line plots, you can get a lot of insight by visualizing data with color.
MATLAB has a number of functions that let you represent your data with color, including `image`, `surf`, `mesh`, and `contourf`. Many of these functions make use of a `colormap` that represents a range of colors for the values.

contourf(peaks) colorbar

`colorbar` above shows the range of colors represented in the figure. You can programmatically change this color range by using the
`caxis` function.

caxis([-4 4])

Patrick's `colormouse` allows you to change the colormap range interactively using the mouse. Calling `colormouse` adds a toolbar button to the current figure.

Once you're in colormouse mode, you can click-n-drag to move (by moving the mouse up and down) or widen/narrow (left and right) the color range.

You can also select a specific colormap from a context menu.

**Comments**

Let us know what you think here or leave a comment for Patrick.

*Note that the actual sourcecode resides on GitHub, so please check the GitHub page for the license information.*

Get
the MATLAB code

Published with MATLAB® R2014a

I'd like to introduce this week's guest blogger Alan Weiss. Alan writes documentation for mathematical toolboxes here at MathWorks.

An economist I know (well, he's my son) asked me a question the other day. I was able to answer him quickly by using Symbolic Math Toolbox™ calculations. You might find the steps to be an interesting case of a somewhat nonstandard use of symbolic mathematics

My son was working through a paper that described a computational algorithm. He got stuck on one step of the algorithm:

If $A$ is a positive definite (symmetric) matrix, find a **lower triangular** matrix $G$ such that

$$GAG^{-1} = D, \mbox{ where }D\mbox{ is diagonal.}$$

He called me asking if there was some linear algebra that he just never learned. I knew how to find a matrix $G$ that would diagonalize $A$ using the `eig` function. But a lower triangular version? I didn't know how to do that. I told him I would call him back after some investigation.

After spending a couple of minutes fruitlessly looking through a linear algebra book, I decided to see what Symbolic Math Toolbox had to say about the equations that any such matrix $G$ would have to satisfy. Here were my steps.

syms a b c g1 g2 g3 real % Create symbolic variables G = [g1,0;g2,g3] % Lower triangular matrix H = inv(G) % Its inverse A = [a,b;b,c] % Symmetric matrix A D = G*A*H % D is supposed to be diagonal

G = [ g1, 0] [ g2, g3] H = [ 1/g1, 0] [ -g2/(g1*g3), 1/g3] A = [ a, b] [ b, c] D = [ a - (b*g2)/g3, (b*g1)/g3] [ (a*g2 + b*g3)/g1 - (g2*(b*g2 + c*g3))/(g1*g3), (b*g2 + c*g3)/g3]

Look at the upper right entry in D. If D is diagonal, then b*g1 = 0. This means b = 0 or g1 = 0. But if b = 0 then A is already diagonal. And if g1 = 0 then the first row of G is zero, so G is not invertible.

In other words, I had just proved that there is no such thing as a lower triangular matrix that diagonalizes a symmetric matrix in this way.

I called my son back and explained the calculation. We decided that the paper he was reading had a misstatement, and he went ahead and used `eig` to diagonalize the matrix in his application. He will get in touch with the author of the paper to explain the misstatement in the algorithm.

I was quite happy with this result. Symbolic math calculations had saved us both a lot of time searching for an algorithm that doesn't exist.

Do you use symbolic math calculations in cute or innovative ways? Tell us about it here.

Get
the MATLAB code

Published with MATLAB® R2014a

*Note: I really didn't expect to be writing about the FFT two weeks in a row. But ready or not, here we go!*

My friend Joe walked into my office the other day and announced, "Steve, I've got a question about the FFT."

(Joe was one of my MathWorks interviewers 21 years ago this month. His question for me then was, "So what do you bring to the party?" I don't remember how I answered.)

Joe is one of the biggest MATLAB experimenters I know. He's always trying to use MATLAB for some new kind of analysis or to automate something dubious (such as changing the hue of the lights in his house. Hmm.)

Anyway, lately he's been into automatic analysis of music. And something was bothering him. If you have a pure sinusoid signal and take the FFT of it, why does it spread out in the frequency domain? How do you interpret the output of `fft` when the frequency is in between two adjacent frequency bins of the FFT? And is that situation fundamentally different somehow that when the sinusoid's frequency is exactly one of the FFT bin frequencies?

Oh boy. I told Joe, "Well, you don't really have a sinusoid. A sinusoid is an infinitely long signal. You have a finite portion of a sinusoid, which I think of as a rectangularly windowed sinusoid. Go away, let me think more about how to explain all that, and then I'll come find you tomorrow."

Because of my background in digital signal processing, I tend to think about this in terms of the relationships between three kinds of Fourier transforms:

I have written before (23-Nov-2009) about the various kinds of Fourier transforms. It is this last kind, the DFT, that is computed by the MATLAB `fft` function.

It's hard to tell exactly where to start in any discussion about Fourier transform properties because the use of terminology and the mathematical convensions vary so widely. Let me just dive in and walk through Joe's example. If you have questions about what's going on, please post a comment.

*Note: In what follows, I am making a very big simplifying assumption. Specifically, I'm assuming that the sampling frequency is high enough for us to be able to discount the possibility of aliasing in the sampling of the sinusoid. This isn't a completely valid assumption because, unlike a real sinusoid, the truncated or windowed sinusoid doesn't have a finite bandwidth.*

Here some sample parameters that Joe gave me.

Sinusoid frequency in Hz:

f0 = 307;

Sampling frequency in Hz:

Fs = 44100;

Sampling time:

Ts = 1/Fs;

How many samples of the sinusoid do we have? From a theoretical perspective, think of this as the length of a rectangular window that multiplies our infinitely long sinusoid sequence.

M = 1500;

n = 0:(M-1); y = cos(2*pi*f0*Ts*n);

Compute M-point discrete Fourier transform (DFT) of `f`. This is what the function `fft` computes.

Y = fft(y);

Plot the DFT. The DFT itself doesn't have physical frequency units associated with it, but we can plot it on a Hz frequency scale by incorporating our knowledge of the sampling rate. (I'm plotting the DFT as disconnected points to emphasize its discrete-frequency nature. Also, I'll be plotting magnitudes of the Fourier transforms throughout.)

f = (0:(M-1))*Fs/M; plot(f,abs(Y),'*') xlim([0 Fs/2]) title('Magnitude of DFT') xlabel('Frequency (Hz)')

Zoom in on the peak.

p = 2*Fs/(M+1); f1 = f0 - 10*p; f2 = f0 + 10*p; xlim([f1 f2])

What's with the "spread" around the peak? The spread is caused by the fact that we do not have a pure sinusoid. We have only a finite portion of a pure sinusoid.

To explore this further, let's relate the DFT to what some might call the "real" Fourier transform of our windowed sinusoid. If we can discount aliasing, then the discrete-time Fourier transform (DTFT) is the same as the "real" Fourier transform up to half the sampling frequency. And, for a signal that is nonzero only in the interval 0 <= n < N, an M-point DFT exactly samples the DTFT as long as M >= N. The DTFT of an N-point windowed sinuoid has the form a bunch of copies of $sin(\omega (N+1)/2) / sin(\omega/2)$, where the copies are spaced apart by multiples of the sampling frequency.

Can we get a better idea of the true shape of the DTFT? Yes, by zero-padding the input to the DFT. This has the effect of sampling the DTFT more closely. We can use this relationship to plot a pretty good approximation of the DTFT and then superimpose the DFT (the output of `fft`).

M10 = length(y)*10; Y10 = fft(y,M10); f10 = (0:(M10-1))*Fs/M10; plot(f10,abs(Y10)); hold on plot(f,abs(Y),'*') hold off xlim([f1 f2]) legend({'DTFT','DFT'}) xlabel('Frequency (Hz)')

The DFT samples the Fourier transform at a spacing of Fs/M. Let's pick a frequency that lines up exactly on a bin.

f0 = 294; w0 = 2*pi*f0/Fs; n = 0:(M-1); y = cos(w0*n); Y = fft(y); p = 2*Fs/(M+1); f1 = f0 - 10*p; f2 = f0 + 10*p; plot(f,abs(Y),'*') xlim([f1 f2]) title('DFT') xlabel('Frequency (Hz)')

The DFT samples (output of `fft`) **appears** to be perfect above. It has just one nonzero value at the sinusoid frequency.

But the appearance of perfection is just a kind of illusion. You can see what's really happening by superimposing the DFT samples on top of the DTFT, which we compute as before using the zero-padding trick.

Y10 = fft(y,M10); plot(f10,abs(Y10)); hold on plot(f,abs(Y),'*') hold off xlim([f1 f2]) title('DTFT with DFT superimposed') xlabel('Frequency (Hz)')

You can see that most of the DFT samples happen to line up precisely at the zeros of the DTFT.

Finally, let's look at what happens when you use a longer signal. (Joe said, "Can't I make the spread go away by using a sufficiently large number of samples of the sinusoid?" The answer, as we'll see demonstrated, is that you can make the spread get smaller, but you can't make it go away entirely as long as you have a finite number of samples.) Let's go up a factor of 10, to 15,000 samples of the sinusoid. (And let's go back to the original frequency of 307 Hz.)

f0 = 307; M = 15000; w0 = 2*pi*f0/Fs; n = 0:(M-1); y = cos(w0*n); Y = fft(y); M10 = length(y)*10; Y10 = fft(y,M10); f10 = (0:(M10-1))*Fs/M10; f = (0:(M-1))*Fs/M; plot(f10,abs(Y10)); hold on plot(f,abs(Y),'*') hold off xlim([f1 f2]) title('DTFT with DFT superimposed') xlabel('Frequency (Hz)')

The plot above is shown at the same scale as the one for M = 1500 further up. You can see that the peak is much narrower, but it's hard to see the details. Let's zoom in closer.

p = 2*Fs/(M+1); f1 = f0 - 10*p; f2 = f0 + 10*p; xlim([f1 f2])

Now you can see that the DTFT shape and DFT samples look very much like what we got for the 1500-point windowed sinusoid. The width of the peak, as well as the spacing of the sidelobes, has been reduced by a factor of 10 (15000 / 1500). And the spacing (in Hz) of the DFT samples has been reduced by the same amount.

In summary, the Fourier transform of a finite portion of a sinusoid has a characteristic shape: a central peak (the "main lobe") with a bunch of "side lobes" in either direction from the peak. The DFT simply samples this shape. Depending on the exact frequency, the output of the DFT might look a little different, depending on where the frequency happens to line up with the DFT frequency bins.

Get
the MATLAB code

Published with MATLAB® R2014a

Idin's pick for this week is the SimRF Models of Analog Devices RF Agile Transceivers by MathWorks RF Team.

This is probably one of the best documented submissions on the File Exchange. The Simulink models in this submission can be used to accurately simulate a commercial radio frequency (RF) chip (Analog Devices AD9361). The work was the result of collaboration between MathWorks and Analog Devices engineers, and the final result is a software model that matches the behavior of the physical device on the lab bench.

You can get a lot of details on the hardware and these software models here.

I will not go into any further technical details on this page, but I want to point out why I believe this was a useful and interesting project.

First, this is just plain cool! Seeing any software model predict (or match) hardware behavior at the level of detail that we find in these models is always interesting (having run simulations for 15+ years, I’m still amused when what I see in the lab matches what I predicted in simulation; it feels magical every time!)

Second, it helps understand and program the chip. An RF chip like the AD9361 (or most any chip for that matter) is a complex piece of hardware, and you can't see inside of it. This is an "agile" RF transceiver, which means it can be programmed to perform many different tasks for many different radio applications. Understanding the functioning of such a chip, and programming it correctly are rather difficult tasks in hardware, especially when the results of one's work can't be seen immediately (at least not without the use of expensive lab equipment such as spectrum analyzers). Having a software model that accurately mimics the hardware allows me to understand the behavior of the hardware such that I can program it better.

Third, debugging. If you have programmed anything but the simplest piece of hardware, you know that things rarely do what you want on the first try. Understanding apparent anomalies and hardware "misbehavior" is made infinitely easier when we can go to a simulation model and trace different signals to our heart"s content! With this Simulink model I can look at intermediate signals that I could never access on the physical chip. These intermediate signals often hold the key to understanding why the system’s outputs look the way they do.

Fourth (this should really be 3.5), studying the effects of an RF chip on digital baseband signals. These effects are generally something that we observe once we put everything together in the lab. Being able to simulate the digital baseband with the RF front-end will be quite useful in designing a robust communication system before we put any hardware together. Remember that as cool as physical prototypes are, they're typically quite expensive and time-consuming to build.

Finally (or fifth!), it builds trust in Simulink and SimRF. This one explains why we invested the time to do this. The fact that we could construct this model in a relatively short amount of time, and that it matches lab measurements, makes us confident that the simulation software tools are both capable and accurate.

I encourage you to take a look at the documentation and videos on this page to get more information on this package.

As always, your thoughts and comments here are greatly appreciated.

Get
the MATLAB code

Published with MATLAB® R2014b

Since I am really enjoying the book, I decided that for this week's post I would use Simulink to validate some of the numbers provided by the author to one of the question in the book: If an asteroid was very small but supermassive, could you really live on it like the Little Prince?

**What would the gravity feel like on this little planet?**

For a tiny asteroid of 1.75m radius like the one in Le Petit Prince, by Antoine de Saint-Exupéry to have a gravitational field comparable to the one on Earth, it would need to weight about 500 million tons.

But how would this gravitational field change as you get farther form the surface? Since the radius is so small, the effect of the gravity would quickly go down, as illustrated by Randall:

*Source: http://what-if.xkcd.com/68/*

To validate that, I used a new R2014a feature in SimMechanics: The Gravitational Field block. To test the effect of the gravity, I simply linked together the Gravitational Field of 500 million tons and a Solid of 65kg with a Prismatic Joint, and slowly increased the distance between them. As you can see, the SimMechanics results match the book pretty well!

**What About the Escape Velocity?**

The next thing mentioned in the book is that if you were able to run fast enough (~5m/s) and jump, you could escape the asteroid gravity field:

*Source: http://what-if.xkcd.com/68/*

In SimMechanics, I replaced the Prismatic Joint by a Rectangular Joint:

I specified appropriate initial conditions for the joint:

I clicked play and I could see in the Mechanics Explorer that for the chosen speed, I would end up in a pretty steady almost circular orbit:

**How stable would that orbit be?**

The book mentions that the orbit would be *weird*...

*Source: http://what-if.xkcd.com/68/*

To test that, I replaced the Rectangular Joint by a Planar Joint, to allow rotation. Then I dug out the skater I had built in a previous post about the Olympics and without any modification made him move his arms while in orbit... definitely, the trajectory is interesting:

**Now it's your turn**

I guess this blog post highlights the fact that you should always trust what you read on XKCD!

Thanks to the Gravitational Field block of SimMechanics, I can now begin to plan space missions without digging out my notes from the orbital mechanics class I took during grad school!

Let us know what you think by leaving a comment here.

]]>My last meeting of the day ended just a little while ago, and I was starting to think seriously about figuring out a blog post for this week. What to write about? That's when I happened to see that Cleve posted just yesterday about the FFT. Ah ha! I'll write about that, too (in honor of National Blog About the FFT Week).

I have three little sort-of stories to tell, the first of which was inspired directly by Cleve's post. Cleve described a very compact, elegant implementation of the divide-and-conquer idea that's central to fast Fourier transform algorithms. Here's the key code fragment of Cleve's textbook function `ffttx`:

% Recursive divide and conquer k = (0:n/2-1)'; w = omega .^ k; u = ffttx(x(1:2:n-1)); v = w.*ffttx(x(2:2:n)); y = [u+v; u-v];

Function `ffttx` calls itself recursively, twice. Each recursive call is on half of the original input values. Then `ffttx` does a litte extra computation at the end to produce the final result.

Like Cleve, I really love this little bit of code. But it makes me think back to all the FFT code that I saw back when I was first learning digital signal processing. (That would be the mid- to late-1980s.) Here's a typical textbook sample from the 1989 edition of *Discrete-Time Signal Processing* by Oppenheim and Schafer):

This code is typical for the time. The focus is on doing the computation in-place and using as few multiplies as possible. One thing you don't see here: explicit recursion! The function `DITFFT` does not call itself, either directly or indirectly. That was typical for the time. No one wanted to use the memory to make copies of the two half-length arrays, and no one wanted to incur the speed penalty of many recursive function calls. Another typical thing you see in this code is something charmingly called bit-reversed sorting, but that's a topic for another day (maybe).

Anyway, my point is that modern FFT implementations built for today's computer architectures often do make use of explicit recursion. The additional cost of dividing the input array into two or more smaller copies is more than offset by the gains of more cache-friendly execution as the divide-and-conquer problems get smaller and smaller.

My second little story was inspired by one line of code that Cleve posted:

y = [u+v; u-v];

In the case where `u` is `x(1)` and `v` is `x(2)`, then the computation above is just the 2-point DFT of `x`.

A signal flow diagram for it looks like this (also from *Discrete-Time Signal Processing*):

Digital signal processing experts have for decades referred to this little graph as a "butterfly diagram" because of its appearance.

Well, to make a little extra money for myself in grad school, I wrote a couple of chapters of the solutions manual for *Discrete-Time Signal Processing*. One of the chapters covered FFTs. This effort left me completely fed up with drawing butterfly diagrams and counting multiplies. One day I posted the following chart outside my cubicle:

Yes, digital signal processing people even call that mess above a "butterfly diagram." We're a weird bunch.

That butterfly diagram indirectly leads me to my third little FFT story.

About ten years ago, a tech support case was forwarded to me. A customer complained that MATLAB was very slow in computing the FFT of one specific vector. Other vectors of the same length would go really fast, but not this one.

So I loaded the customer's MAT-file containing the vector and started running FFTs on it. To my surprise, it really was slow! I scratched my head. Then I looked at the output values, and I was even more surprised to see that they were all NaNs! Strange! What was the point? And was this related to the speed issue?

I looked at the long input vector provided by the customer, and I found that just one of the input values was NaN.

Light bulb!

Take a look again at that complicated butterfly diagram above. Notice that every output value depends on every input value. That's true in general for FFT computations of any length. With the way NaNs propagate through arithmetic operations, that implies that just one NaN in the input vector means that the output of `fft` will always be completely filled with NaNs.

That was related to the speed issue, as it turned out. NaN calculations often are executed in a less-well optimized part of the CPU, or even in software. Lots and lots of NaN calculations bogged down this particular customer's problem.

FFTs are a never-ending source of fun and mystery!

I have written quite a few blog posts about Fourier transform theory and algorithms. You can see them all in the Fourier transforms category.

PS. If you have ideas for even more fun ways to create a vector of NaNs, be sure to add them to the comments on this post.

Get
the MATLAB code

Published with MATLAB® R2014a

This is the second in a series of three posts about the Finite Fourier Transform. This post is about the fast FFT algorithm itself. A recursive divide and conquer algorithm is implemented in an elegant MATLAB function named `ffttx`.

The acronym FFT is ambiguous. The first F stands for both "fast" and "finite". A more accurate abbreviation would be FFFT, but nobody wants to use that. In MATLAB, the expression `fft(x)` computes the finite Fourier transform of any vector `x`. The computation is fastest if the integer `n = length(x)` is the product of powers of small primes.

Direct application of the definition

$$ Y_k = \sum_{j=0}^{n-1} \omega^{jk} y_j, \ k = 0, \ldots, n-1 $$

requires $n$ multiplications and $n$ additions for each of the $n$ components of $Y$ for a total of $2 n^2$ floating-point operations. This does not include the generation of the powers of $\omega$. A computer capable of doing one multiplication and addition every microsecond would require a million seconds, or about 11.5 days, to do a million-point FFT.

Several people discovered fast FFT algorithms independently and many people have since contributed to their development, but it was a 1965 paper by John Tukey of Princeton University and John Cooley of IBM Research that is generally credited as the starting point for the modern usage of the FFT.

Modern fast FFT algorithms have computational complexity $O(n \mbox{ log}_2 n)$ instead of $O(n^2)$. If $n$ is a power of 2, a one-dimensional FFT of length $n$ requires fewer than $3 n \mbox{ log}_2 n$ floating-point operations. For $n = 2^{20}$, that's a factor of almost 35,000 faster than $2 n^2$. Even if $n = 2^{10}$, the factor is about 70.

With MATLAB 8.3 and a 2.7 GMHz Intel Core i7 laptop, the time required for `fft(x)` if `length(x)` is `2^23 = 8388608` is about 0.5 second. The built-in `fft` function is based on FFTW, "The Fastest Fourier Transform in the West," developed at M.I.T. by Matteo Frigo and Steven G. Johnson in 1998.

The key to the fast FFT algorithms is the fact that the square of the $2n$ th root of unity is the $n$ th root of unity. Using complex notation,

$$ \omega = \omega_n = e^{-2 \pi i/n} $$

we have

$$ \omega_{2n}^2 = \omega_n $$

The derivation of the fast algorithm starts with the definition of the finite Fourier transform:

$$ Y_k = \sum_{j=0}^{n-1} \omega^{jk} y_j, \ k = 0, \ldots, n-1 $$

Divide the sum into terms with even subscripts and terms with odd subscripts.

$$ Y_k = \sum_{even \ j} \omega^{jk} y_j \ + \ \sum_{ odd \ j} \omega^{jk} y_j $$

Assume that $n$ is even, that $k \leq n/2-1$ and repurpose the summation subscripts.

$$ Y_k = \sum_{j=0}^{n/2-1} \omega^{2jk} y_{2j} \ + \ \omega^k \sum_{j=0}^{n/2-1} \omega^{2jk} y_{2j+1} $$

The two sums are components of the FFTs of length $n/2$ of the portions of $y$ with even and odd subscripts. In order to get the entire FFT of length $n$, we have to do two FFTs of length $n/2$, multiply one of these by powers of $\omega$, and concatenate the results.

The relationship between an FFT of length $n$ and two FFTs of length $n/2$ can be expressed compactly in MATLAB.

omega = exp(-2*pi*i/n); k = (0:n/2-1)'; w = omega .^ k; u = fft(y(1:2:n-1)); v = w.*fft(y(2:2:n));

then

fft(y) = [u+v; u-v];

Now, if $n$ is not only even but actually a power of 2, the process can be repeated. The FFT of length $n$ is expressed in terms of two FFTs of length $n/2$, then four FFTs of length $n/4$, then eight FFTs of length $n/8$, and so on until we reach $n$ FFTs of length one. An FFT of length one is just the number itself. If $n = 2^p$, the number of steps in the recursion is $p$. There is $O(n)$ work at each step, so the total amount of work is

$$ O(np) = O(n \mbox{ log}_2 n) $$

If $n$ is not a power of two, it is still possible to express the FFT of length $n$ in terms of several shorter FFTs. An FFT of length 100 is two FFTs of length 50, or four FFTs of length 25. An FFT of length 25 can be expressed in terms of five FFTs of length 5. If $n$ is not a prime number, an FFT of length $n$ can be expressed in terms of FFTs whose lengths divide $n$. Even if $n$ is prime, it is possible to embed the FFT in another whose length can be factored. We do not go into the details of these algorithms here.

The textbook function `ffttx` combines two basic ideas. If $n$ is a power of 2, it uses the $O(n \mbox{ log}_2 n)$ fast algorithm. If $n$ has an odd factor, it uses the fast recursion until it reaches an odd length, then sets up the discrete Fourier matrix and uses matrix-vector multiplication.

`ffttx` is one of my favorite MATLAB functions. Here is the entire code.

function y = ffttx(x) %FFTTX Textbook Fast Finite Fourier Transform. % FFTTX(X) computes the same finite Fourier transform % as FFT(X). The code uses a recursive divide and % conquer algorithm for even order and matrix-vector % multiplication for odd order. If length(X) is m*p % where m is odd and p is a power of 2, the computational % complexity of this approach is O(m^2)*O(p*log2(p)).

x = x(:); n = length(x); omega = exp(-2*pi*i/n);

if rem(n,2) == 0 % Recursive divide and conquer k = (0:n/2-1)'; w = omega .^ k; u = ffttx(x(1:2:n-1)); v = w.*ffttx(x(2:2:n)); y = [u+v; u-v]; else % The Fourier matrix j = 0:n-1; k = j'; F = omega .^ (k*j); y = F*x; end

Here is a plot of the Fourier matrix of order 16. Plots like this are the subject of the next post in this series. For now, just notice that this is *not* a complete graph with 16 nodes.

plot(fft(eye(16))) axis equal axis off

Steve Eddins has written extensively about the FFT in his blog. Select "Categories more" and then click on "Fourier transforms". <http://blogs.mathworks.com/steve>

James W. Cooley and John W. Tukey, "An algorithm for the machine calculation of complex Fourier series", Math. Comput. 19: 297-301, 1965. <http://dx.doi.org/10.2307%2F2003354>

Matteo Frigo and Steven G. Johnson, "FFTW, Fastest Fourier Transform in the West", <http://www.fftw.org>.

Get
the MATLAB code

Published with MATLAB® R2014a

*Marta is a senior application engineer at MathWorks UK, helping MATLAB users to design, develop and deploy production-quality code. Toshi is a Senior Marketing Manager at MathWorks.*

*by Marta Wilczkowiak and Toshi Takeuchi*

In the previous post, you learned how we built an RFID-based people tracking system from scratch in three months, a “technology stack” that spanned multiple hardware and software disciplines. We ended the post with how we managed to survive the moment of the truth: the first integration test, live at a conference with hundreds of delegates.

Now we had raw data to analyze to improve our system.

**The Twist – Making Sense of RFID Data**

Once we examined the real word data collected during the event more closely, we faced a new dilemma – we had originally assumed we could use signal strength to estimate distances. However, in reality there was so much fluctuation in signal strength that you could not make a confident estimate this way. This, incidentally, is why RFID-based localization is a hot research topic today.

We were determined to work on an engineering solution, but it would take more time and effort. In the meantime, was there a way to get meaningful insights from the messy data at hand?

**Web Analytics Meets Engineering **

As we discussed the conference data internally at MathWorks, this issue came to the attention of Toshi Takeuchi, a US-based analyst. He immediately saw a similarity with the familiar problems he faced in analyzing massive volume of page views and clicks – always messy. What would happen if you applied web analytics techniques to this engineering problem? We embraced this suggestion and immediately shared our code and data with Toshi. Our collaboration was possible because we all spoke the same language – MATLAB.

**Use of Proxy Metric**

Non-physical measurements are often used in web analytics. For example, the recommendations you see on online shopping or movie streaming websites are computed using distance measured in terms of similarity or affinity. Was there anything in the dataset, besides the signal strength, that we could turn into a proxy metric? The new approach Toshi came up with was to recast connection frequency as this virtual distance. If two tracking devices are connecting to one another more frequently than others in a given time interval, then we could think of them being “close” in the virtual space defined by connection frequency. This metric seemed far more reliable than signal strength. Our original goal was to understand the social interactions of the delegates, and physical distance metric was just a means to that end. If a virtual metric did the same job, then it seemed a good substitute.

**The Result**

With the support from Toshi, we were able to reconstruct the estimated movements of the delegates using the connection frequencies with stationary anchors such as demo stations (orange circles) and catering (green circles). Note that no personally identifiable information was collected or used for this analysis.

In the plot above, you see such an estimated movement for Delegate #120. This person appeared to have approached a nearby catering station first, then spent quite some time around BLOODHOUND Super Sonic Car in the middle, but in the afternoon stayed “closer” to software related demos than hardware-based demos as measured in this “virtual” space.

We can use this information to map the frequencies of connections between delegates and demo stations, and visualize it as a heat map. The vertical bands with lighter red stripes indicates which demos had more connections than others, and horizontal stripes show to which demo stations each delegate connected frequently. You can see that people who connected with “RT” (a real-time testing exhibit) and “HW” (a hardware testing exhibit) had very little overlap with the people who connected with “Robo” (a robotics exhibit).

**The Insight – the Three Factors of Delegate Behavior **

Ultimately, the data is only as good as the insights you get from it. When we computed the estimated movements of the delegates with a technique called Principal Component Analysis, we could summarize the complex data by a mere three factors.

- The “Web”: people who showed strong “connections” with the “Web” demo
- The “Robo”: people who showed strong “connections” with “Robo” demo
- The “Conventional”: people who showed strong “connections” with “HW” demo

Furthermore, “Web” people and “Robo” people tend to share similar connection patterns as compared to “Conventional” people – what is common to both? They are both what we might call “convergence” applications where software and hardware are being integrated in an unconventional way.

While this might not be a novel insight, it was still very useful for us to know that the data backed up what we had been intuitively sensing all along. It was also interesting to see which delegates showed which particular inclinations on an individual basis (each dot in the plot identify a specific delegate).

**Back to Reality – the Next Step**

It is tempting to speculate if the virtual proximity on this plot correlates to actual shared interest among those delegates. If so, it is also interesting that two of the main factors turned out to be the web and robotics, which are examples of convergence-oriented applications that cross the traditional virtual/physical divide.

If there was an exit survey that asked the participants which demos they found interesting, we could use predictive analytics techniques to validate how well this virtual metric predicts the actual interest. Alternatively, we could use additional stationary anchors to determine which sessions delegates attended. If we placed more anchors on the floors in a more evenly spaced grid, including the areas populated by third party exhibitors, it would give more data points and more accurate measurements.

If we validated this correlation, it could open up opportunities for new ways for our delegates to find like-minded people.

**The Journey Continues**

Our journey is not over yet – we still have challenges with the system. But our project proved how such an inter-disciplinary system could be built in three months using MATLAB, Simulink, and lots of team work.

]]>Sean's pick this week is Simscape Language: Nonlinear Rotational Spring by Steve Miller.

I spend a fair amount of time exploring the MathWorks product suite trying to learn new things. Exploring the File Exchange, I came upon this example of a nonlinear spring. Having not had much opportunity to use our Physical Modeling Tools this example caught my eye because it uses something us civil/structural engineers are familiar with: springs!

The example includes a README file to get started, a PDF file, and a link to a video. Not going to lie, I didn't watch the video. But I didn't have to! The model documents how things work and the README gave me all of the instructions necessary to get this off the ground.

Here is the model and the scope showing a linear spring compared to the nonlinear spring:

This is the design of the nonlinear spring in the Simscape Language.

Out of curiosity, I wanted to see what would happen for a much stiffer spring (that's what structural engineers use!), so
I changed the value of *spring_rate* to 50 and reran the model:

And that's the best part about simulation: I didn't have to go buy or test a bigger spring to see what would happen.

Give it a try and let us know what you think here or leave a comment for Steve.

Get
the MATLAB code

Published with MATLAB® R2014a

How can you find out what's in an image file before you read in the data? The answer is the function `imfinfo`. Today I want to explore the basics of using this function.

The function `imfinfo` returns a struct. The fields of the struct contain information about the image file.

Here's an example.

```
imfinfo('peppers.png')
```

ans = Filename: '/Applications/MATLAB_R2014a.app/toolbox/matla...' FileModDate: '02-Apr-2013 15:55:52' FileSize: 287677 Format: 'png' FormatVersion: [] Width: 512 Height: 384 BitDepth: 24 ColorType: 'truecolor' FormatSignature: [137 80 78 71 13 10 26 10] Colormap: [] Histogram: [] InterlaceType: 'none' Transparency: 'none' SimpleTransparencyData: [] BackgroundColor: [] RenderingIntent: [] Chromaticities: [] Gamma: [] XResolution: [] YResolution: [] ResolutionUnit: [] XOffset: [] YOffset: [] OffsetUnit: [] SignificantBits: [] ImageModTime: '16 Jul 2002 16:46:41 +0000' Title: [] Author: [] Description: 'Zesty peppers' Copyright: 'Copyright The MathWorks, Inc.' CreationTime: [] Software: [] Disclaimer: [] Warning: [] Source: [] Comment: [] OtherText: []

The fields returned by `imfinfo` aren't the same for all image files. Here's the output for a TIFF file.

```
imfinfo('forest.tif')
```

ans = Filename: '/Applications/MATLAB_R2014a.app/toolbox/im...' FileModDate: '25-Sep-2013 20:12:00' FileSize: 124888 Format: 'tif' FormatVersion: [] Width: 447 Height: 301 BitDepth: 8 ColorType: 'indexed' FormatSignature: [73 73 42 0] ByteOrder: 'little-endian' NewSubFileType: 0 BitsPerSample: 8 Compression: 'PackBits' PhotometricInterpretation: 'RGB Palette' StripOffsets: [1x17 double] SamplesPerPixel: 1 RowsPerStrip: 18 StripByteCounts: [1x17 double] XResolution: 72 YResolution: 72 ResolutionUnit: 'Inch' Colormap: [256x3 double] PlanarConfiguration: 'Chunky' TileWidth: [] TileLength: [] TileOffsets: [] TileByteCounts: [] Orientation: 1 FillOrder: 1 GrayResponseUnit: 0.0100 MaxSampleValue: 255 MinSampleValue: 0 Thresholding: 1 Offset: 122964 ImageDescription: 'Carmanah Ancient Forest, British Columbia,...'

Only the first nine fields of the returned struct are always the same:

`Filename``FileModDate``FileSize``Format``FormatVersion``Width``Height``BitDepth``ColorType`

`FileModDate` is the file modification date (as reported by the operating system). `FileSize` is the size of the file in bytes. The peppers.png file takes up 287,677 bytes on disk.

`Format` is an abbreviation indicating the type of image file (PNG, TIFF, JPEG, etc.). Although frequently the filename extension indicates the image format type, this isn't always the case. The `Format` field shows you the format type whether or not the filename extension indicates it.

`FormatVersion` has turned out to be less useful than I thought would be back in the mid-1990s. You generally don't need to pay attention to it.

`Width` is the number of image columns, and `Height` is the number of rows.

If I were designing `imfinfo` from scratch today, I'd probably call the next field `BitsPerPixel` instead of `BitDepth`. I might also add a field called `SamplesPerPixel`. Anyway, images such as peppers.png used to be frequently called "24-bit images" because each pixel is represented by three samples (one number for red, one for green, and one for blue), and each sample is represented by 8 bits. 3*8 is 24, hence 24-bit image.

The file forest.tif, on the other hand, uses only one sample for each pixel, and each sample is represented by 8 bits, so forest.tif is sometimes called an "8-bit image," and its `BitDepth` is 8.

`ColorType` tells us how the pixel values are interpreted as colors on the screen. The most common values for `ColorType` are `'grayscale'`, `'indexed'`, and `'truecolor'`. In a grayscale image, each pixel value represents a shade of gray. In an indexed image, each pixel value is really a lookup index used to get the pixel color from a colormap (also called a palette). Truecolor (in this context) originally referred to a 24-bit image with three samples per pixels, and with the samples representing red, green, and blue. In MATLAB, though, we use the term to refer to any RGB image represented by three samples per pixel, regardless of the bit depth.

Here's an example of using the information returned by `imfinfo` to compute the compression ratio for a JPEG file. Let's use ngc6543a.jpg, the first truecolor sample image shipped with MATLAB.

```
info = imfinfo('ngc6543a.jpg');
file_bytes = info.FileSize;
image_bytes = info.Width * info.Height * info.BitDepth / 8;
```

I divided by 8 in the line above because `BitDepth` is in bits instead of bytes.

compression_ratio = image_bytes / file_bytes

compression_ratio = 42.7210

So what about all those other fields after the first nine? Well, they vary depending on the file format and the specific information contained in the file. For example, TIFF files optionally contain horizontal and vertical resolution information.

```
info = imfinfo('forest.tif');
info.XResolution
```

ans = 72

info.YResolution

ans = 72

info.ResolutionUnit

ans = Inch

Most image file formats allow you to store text descriptions in them.

```
info = imfinfo('peppers.png');
info.Description
```

ans = Zesty peppers

These days, most cameras store a lot more information in image files than the older sample images that are in MATLAB and the Image Processing Toolbox. This information is also available using `imfinfo`.

imshow('IMG_1046.JPG','InitialMagnification',20)

```
info = imfinfo('IMG_1046.JPG')
```

info = Filename: '/Users/steve/Dropbox/MATLAB/Blogs/IMG_1046.JPG' FileModDate: '28-Jun-2014 22:07:32' FileSize: 1595137 Format: 'jpg' FormatVersion: '' Width: 2816 Height: 1880 BitDepth: 24 ColorType: 'truecolor' FormatSignature: '' NumberOfSamples: 3 CodingMethod: 'Huffman' CodingProcess: 'Sequential' Comment: {} ImageDescription: ' ' Make: 'Canon' Model: 'Canon PowerShot S95' Orientation: 1 XResolution: 180 YResolution: 180 ResolutionUnit: 'Inch' DateTime: '2014:06:28 19:07:32' YCbCrPositioning: 'Co-sited' DigitalCamera: [1x1 struct] ExifThumbnail: [1x1 struct]

That extra information is in the `DigitalCamera` field.

info.DigitalCamera

ans = ExposureTime: 0.0040 FNumber: 4.9000 ISOSpeedRatings: 80 ExifVersion: [48 50 51 48] DateTimeOriginal: '2014:06:28 19:07:32' DateTimeDigitized: '2014:06:28 19:07:32' ComponentsConfiguration: 'YCbCr' CompressedBitsPerPixel: 3 ShutterSpeedValue: 7.9688 ApertureValue: 4.5938 ExposureBiasValue: 0 MaxApertureValue: 4.5938 MeteringMode: 'Pattern' Flash: 'Flash did not fire, no strobe return detect...' FocalLength: 22.5000 MakerNote: [1x2764 double] UserComment: [1x264 double] FlashpixVersion: [48 49 48 48] ColorSpace: 'sRGB' CPixelXDimension: 2816 CPixelYDimension: 1880 InteroperabilityIFD: [1x1 struct] FocalPlaneXResolution: 9.6438e+03 FocalPlaneYResolution: 9.6575e+03 FocalPlaneResolutionUnit: 2 SensingMethod: 'One-chip color area sensor' FileSource: 'DSC' CustomRendered: 'Normal process' ExposureMode: 'Auto exposure' WhiteBalance: 'Auto white balance' DigitalZoomRatio: 1 SceneCaptureType: 'Standard' UnknownTags: [1x1 struct]

Do you have questions about `imfinfo`? Do you ever use it for writing batch processing scripts? I'd be interested to hear about it, so please leave a comment below.

Get
the MATLAB code

Published with MATLAB® R2014a

**The Problem**

If you ask me what is the difference between Simulink Coder and Embedded Coder, I would tell you that Simulink Coder allows you to generate code from a Simulink model, and Embedded Coder allows you to configure how the code looks like.

To illustrate that, we will start with a simple hand-written program, and see how to configure a model so that the code generated from it integrates without modification.

In this simple `main`, at every second, the program reads data from a text file and stores it in a global variable `u`. We want our auto-generated code to access this variable, and use it to compute value of another global variable `y` to be used later in the code.

**Configuring the Model**

For this example, let's use the following simple model:.

By default, if we generate code for this model we get something that looks like:

It is obvious that modifications are necessary to map the values of `u` and `y` in the hand-written code to the input and output of the generated code.

To configure the look of the code, the first step is to name the input signal `u` and the output `y`. Then we need to tell Embedded Coder that `u` and `y` are external variables, already defined outside of the generated code, in the hand-written code.

For that, right-click on the signal line and select Properties:

Go to the Code Generation tab, and set the storage class to `ImportedExtern` (Note that the storage class and a lot more could also have been specified using a data object).

**Generated Code**

Generate code for the model, and you should get something which looks like this:

Where `u` and `y` are declared as `extern`:

With this modification, the generated code integrates in the hand-written application, I can build the `main` program without errors.

**Now it's your turn**

There are many of ways to customize the code generated by Embedded Coder. This example is probably the simplest one possible, but I hope this gives you a good idea of how to get started. Let us know how you got started with the coders by leaving a comment here.

]]>Many of us carry around smartphones that can track our GPS positions and that's an interesting source of data. How can we analyze GPS data in MATLAB?

Today's guest blogger, Toshi Takeuchi, would like to share an analysis of a public GPS dataset from a popular ride sharing service Uber.

Uber is a ride sharing service that connects passengers with private drivers through a mobile app and takes care of payment. They are in fact so popular that you hear about them in the news due to their conflicts with local traffic regulations and taxi business interests.

Uber’s ride sharing GPS data was available publicly on infochimps.com, so I used it for this analysis (Unfortunately it is not available anymore). What can we learn from this dataset?

Let's start by downloading the dataset from the link above (a zipped TSV file), which contains the GPS logs taken from the mobile apps in Uber cars that were actively transporting passengers in San Francisco. The data have been anonymized by removing names, trip start and end points. The dates were also substituted. Weekdays and time of day are still intact.

For the purpose of this analysis, let's focus on the data captured in the city proper and visualize it with Mapping Toolbox.

Run the script to load data. Check `loadData.m` to see the details.

loadData

Overlay the GPS points on the map.

states = geoshape(shaperead('usastatehi', 'UseGeoCoords', true)); latlim = [min(T.Lat) max(T.Lat)]; lonlim = [min(T.Lon) max(T.Lon)]; ocean = [0.7 0.8 1]; land = [0.9 0.9 0.8]; figure ax = usamap(latlim, lonlim); setm(ax, 'FFaceColor', ocean) geoshow(states,'FaceColor',land) geoshow(T.Lat,T.Lon,'DisplayType','Point','Marker','.',... 'MarkerSize',4,'MarkerEdgeColor',[0 0 1]) title('Uber GPS Log Data') xlabel('San Francisco') textm(37.802069,-122.446618,'Marina') textm(37.808376,-122.426105,'Fishermans Wharf') textm(37.797322,-122.482409,'Presidio') textm(37.774546,-122.412329,'SOMA') textm(37.770731,-122.440481,'Haight') textm(37.818276,-122.498546,'Golden Gate Bridge') textm(37.819632,-122.376065,'Bay Bridge')

Let's start with a basic question - how does the use of Uber service change over time. We can use `grpstats` to summarize data grouped by specific categorical values, such as `DayName` and `TimeOfDay`, which were added in the data loading process.

Get grouped summaries.

byDay = grpstats(T(:,{'Lat','Lon','DayName'}),'DayName'); byDayTime = grpstats(T(:,{'Lat','Lon','TimeOfDay','DayName'}),... {'DayName','TimeOfDay'});

Reshape the count of entries into a 24x7 matrix.

byDayTimeCount = reshape(byDayTime.GroupCount,24,7)';

Plot the data by day of week and by hours per day of week.

figure subplot(2,1,1) bar(byDay.GroupCount); set(gca,'XTick',1:7,'XTickLabel',cellstr(byDay.DayName)); subplot(2,1,2) plot(byDayTimeCount'); set(gca,'XTick',1:24); xlabel('Hours by Day of Week'); legend('Mon','Tue','Wed','Thu','Fri','Sat','Sun',... 'Orientation','Horizontal','Location','SouthOutside')

It looks like the usage goes up during the weekend (Friday through Sunday) and usage peaks in early hours of the day. San Francisco has a very active night life!

Is there a way to figure out where people go during the weekend? Even though the dataset doesn't contain the actual starting and ending points of individual trips, we may still get a sense of how the traffic flows by looking at the first and last points of each record.

We can extract the starting and ending location data for weekend rides. Click `getStartEndPoints.m` to see how it is done. If you would like to run this script, please download `districts.xlsx` as well.

% Here we load the preprocessed data |startEnd.mat| to save time and plot % their starting points. % getStartEndPoints % commented out to save time load startEnd.mat % load the preprocessed data instead figure ax = usamap(latlim, lonlim); setm(ax, 'FFaceColor', ocean) geoshow(states,'FaceColor',land) geoshow(startEnd.StartLat,startEnd.StartLon,'DisplayType','Point',... 'Marker','.','MarkerSize',5,'MarkerEdgeColor',[0 0 1]) title('Uber Weekend Rides - Starting Points') xlabel('San Francisco') textm(37.802069,-122.446618,'Marina') textm(37.808376,-122.426105,'Fishermans Wharf') textm(37.797322,-122.482409,'Presidio') textm(37.774546,-122.412329,'SOMA') textm(37.770731,-122.440481,'Haight') textm(37.818276,-122.498546,'Golden Gate Bridge') textm(37.819632,-122.376065,'Bay Bridge')

When you plot the longitude and latitude data, you just get messy point clouds and it is hard to see what's going on. Instead, I broke the map of San Francisco into rectangular blocks to approximate its districts. Here is the new plot of starting points by district.

dist = categories(startEnd.StartDist); cc = hsv(length(dist)); figure ax = usamap(latlim, lonlim); setm(ax, 'FFaceColor', ocean) geoshow(states,'FaceColor',land) for i = 1:length(dist) inDist = startEnd.StartDist == dist(i); geoshow(startEnd.StartLat(inDist),startEnd.StartLon(inDist),... 'DisplayType','Point','Marker','.','MarkerSize',5,'MarkerEdgeColor',cc(i,:)) end title('Uber Weekend Rides - Starting Points by District') xlabel('San Francisco') textm(37.802069,-122.446618,'Marina') textm(37.808376,-122.426105,'Fishermans Wharf') textm(37.797322,-122.482409,'Presidio') textm(37.774546,-122.412329,'SOMA') textm(37.770731,-122.440481,'Haight') textm(37.818276,-122.498546,'Golden Gate Bridge') textm(37.819632,-122.376065,'Bay Bridge')

This is a step in the right direction. Now that we have the starting and ending points grouped by districts, we can represent the rides as connections among different districts - this is essentially a graph with districts as nodes and rides as edges. To visualize this graph, we can use a popular social networking analysis tool Gephi, which was also used in another post, Analyzing Twitter with MATLAB.

You can export `StartDist` and `EndDist` as the edge list to Gephi in CSV format.

writetable(startEnd(:,{'StartDist','EndDist'}),'edgelist.csv',... 'WriteVariableNames',false)

Once you export the edge list, you can plot the connections (edges) between districts (nodes) in Gephi. Now it is much easier to see where people went during the weekend! To see a bigger image, check out the PDF version.

- The size of the district nodes represents their in-degrees, the number of incoming connections, and you can think of it as measure of popularity as destinations. SOMA, Haight, Mission District, Downtown, and The Castro are the popular locations based on this measure.
- The districts are colored based on their modularity, which basically means which cluster of nodes they belong to. It looks like people hang around set of districts that are nearby - SOMA, Downtown, Mission District are all located towards the south (green). The Castro, Haight, Western Addition in the center (purple) and it is strongly connected to Richmond and Sunset District in the west. Since those are residential areas, it seems people from those areas hang out in the other districts in the same cluster.
- The locals don't seem to go to Fisherman's Wharf or Chinatown in the north (red) very much - they are probably considered not cool because of tourists?

Now you know where to go in San Francisco during the weekend if you want to experience an active night life there. We just looked at the overall weekend data, but you can explore more by time slicing the data to see how the traffic pattern changes based on the time of day or day of the week. You may be able to find traffic congestions by calculating the speed using the timestamps. Try it yourself and share what you find here!.

Get
the MATLAB code

Published with MATLAB® R2014a

Brett's Picks this week are: Image Blur Metric, by Do Quoc Bao; Blind image quality assessment through anisotropy, by Salvador Gabarda; and Noise Level Estimation from a Single Image, by Masayuki Tanaka.

Brett returns to his roots as an image processing geek this week to pose a question that he gets asked by customers from time to time. Namely, *'How can I quantify the noise in my image?'*

In R2014a, we added two functions that facilitate the measurement of image noise: `ssim`, with which you can calculate the overall or pixelwise "structural similarity index"; and `psnr`, for computing the signal-to-noise ratio (peak or simple) of an image.

Both of these calculations are made by comparing a potentially noisy image to a "pristine" one; if you have a reference image and you need to calculate deviations from it, these functions may serve you well. (Note that there are many other files on the the File Exchange that facilitate different reference-image quality metrics. Do a quick search for "image quality"; you'll find several.)

But what if you don't have a reference image to which to compare your noisy image? That is, what if you need to do a "blind" image quality assessment? This can be simultaneously much more useful and much more difficult. Approaches to blind noise assessment often focus on quantification of blur or, conversely, determination of edge sharpness. I have on numerous occasions successfully used the "threshold" output of the various `edge` metrics to decide which images are of higher quality. For instance:

ax = zeros(3,1); % Pristine img = imread('rice.png'); ax(1) = subplot(1,3,1); [~,thresh] = edge(img,'sobel'); imshow(img) title(sprintf('Edge threshold = %0.2f',thresh)) % Some noise h = fspecial('motion', 5, 0); img2 = imfilter(img,h,'replicate'); ax(2) = subplot(1,3,2); [~,thresh] = edge(img2,'sobel'); imshow(img2); title(sprintf('Edge threshold = %0.2f',thresh)) % More noise h = fspecial('motion', 15, 0); img3 = imfilter(img,h,'replicate'); ax(3) = subplot(1,3,3); [~,thresh] = edge(img3,'sobel'); imshow(img3) title(sprintf('Edge threshold = %0.2f',thresh))

Todays Picks facilitate blind image quality assessment. The first is Do Quoc Bao's Image Blur Metric. While the code could use some commenting and error checking, it is well implemented; it cites the SPIE Procedings from which it was taken; and it is exceedingly easy to use; simply, the blur metric is reported to range from 0 to 1, with 0 indicating "sharp," and 1 representing "blurry":

title(ax(1),sprintf('Blur = %0.2f',blurMetric(img))) title(ax(2),sprintf('Blur = %0.2f',blurMetric(img2))) title(ax(3),sprintf('Blur = %0.2f',blurMetric(img3)))

Next, check out Blind image quality assessment through anisotropy, by Salvador Gabarda. Salvador's blind analysis "discriminates blur and Gaussian noise...given a set of registered images, the image showing the highest index is the best. It works on grayscale or color images, and provides lots of parameters to play with.

title(ax(1),sprintf('1000*Quality = %0.2f',1000*blindimagequality(img,2))) title(ax(2),sprintf('1000*Quality = %0.2f',1000*blindimagequality(img2,2))) title(ax(3),sprintf('1000*Quality = %0.2f',1000*blindimagequality(img3,2)))

Finally, consider Noise Level Estimation from a Single Image, by Masayuki Tanaka. I haven't read Masayuki's original paper on the topic; perhaps it explains why the output value appears to change inversely with blur. (Masayuki, care to chime in on that?) This implementation also works on grayscale or color images, though analysis is plane-by-plane for the latter case. Also, I'm not sure why the output value appears to change inversely with blur:

title(ax(1),sprintf('1000*Noise Estimate = %0.2f',1000*NLEstimate(img))) title(ax(2),sprintf('1000*Noise Estimate = %0.2f',1000*NLEstimate(img2))) title(ax(3),sprintf('1000*Noise Estimate = %0.2f',1000*NLEstimate(img3)))

What use cases do you have for estimating noise in images? Are you more likely to need "blind" approaches, or do methods that compare to a reference image typically suffice? I'd love to hear from you!

As always, I welcome your thoughts and comments.

Get
the MATLAB code

Published with MATLAB® R2014a

*The story is broken into two separate blog posts. This week’s installment is told by Marta Wilczkowiak. Marta is a senior application engineer at MathWorks UK, helping MATLAB users to design, develop and deploy production-quality code.*

*by Marta Wilczkowiak*

Technology can be the kiss of death to a conversation. But can we instead, through the thoughtful convergence of hardware, software, and real-time streaming data, actually encourage social interactions?

The barrier that existed between physical and virtual in engineering is disappearing in the world where internet companies buy robotics companies, fly drones, test self-driving cars and wearable devices, while the industrial players are moving into sensor networks and Internet of Things. While those new approaches are promising, such inter-disciplinary engineering collaboration is very difficult to achieve.

This is a story about our attempt to build such a convergent system in just three months, through cross-disciplinary collaboration. This blue-sky project is called Expo Conversations.

**EXPO Conversations: The Idea**

MATLAB EXPO is the pinnacle of MathWorks UK events. It is a unique opportunity for the MATLAB community to network with peers across a range of industries and academia. Planning starts early in the year and involves engineers from all technical groups at MathWorks. During one of those meetings, we set ourselves a challenge: to create a system bringing together all the new technologies we tell our customers about, including low cost hardware, data analytics and cloud computing. At the same time, we saw an opportunity to understand how ideas and influences flow throughout EXPO.

Our idea was to use low cost hardware to detect the interactions between the EXPO participants, and provide a web interface displaying the live analysis of the social activity. To encourage social interactions we would run a contest and give a prize to the most prolific networker at the event.

*Figure 1: Simulated social interactions in the EXPO Exhibition Hall*

This conversation took place in June. EXPO was scheduled for October 8th so we had little time to develop something from nothing!

To summarize, the objective was to:

- Demonstrate end-to-end technology stack integration, starting from low cost hardware to the cloud-based visualization
- Enable cross-disciplinary team collaboration
- Move from a blue sky idea to cloud deployment in 3 months

**July: First tag**

14 weeks to go

July started with discussion about the best approach to detect the attendee interactions: Cameras? Apps on mobile phones? RF tags? We wanted participants to remain anonymous so we opted for RF tags, with attendees opting-in simply by picking up a tag on their way into the exhibition hall. Conor, our 2nd year summer intern, with support from Mark (embedded systems) and Graham (communications) developed the first tags using Arduino Uno microcontrollers (ATmega328P) and RF12 433MHz transceivers. They showed that two early tag prototypes can communicate with each other and the received signal strength is somewhat related to the distance between tags. Characterizing this relationship remained a holy grail throughout the project! You can read more about this part of the project in Conor’s MakerZone blog post.

**August: Algorithms**

9.5 weeks to go

With promising early results, we kicked off the data analysis part of the project. We started by brainstorming the elements we would like to understand about the social interaction patterns. Some of our favorite suggestions: Can we quantify the relationship between caffeine intake and social activity? Can we detect wallflowers, conversation boosters… or conversation killers? Can we see if visitors to a demo station recommend it to others? We quickly realized that most of those questions needed a definition of a conversation. We decided that two tag holders are in a conversation if they stay near to each other for 30 seconds.

But how do we know whether two tags are close to each other? We decided to use Received Signal Strength Indication (RSSI) as a proxy for distance between two tags. On the simplicity-robustness scale, it is closest to simplicity and that’s what we needed given the project timeframe.

By detecting conversations between tags, we can learn the popularity of demo and catering stations (see Figure 2) by monitoring tags located at those places of interest. All the tag data would be transmitted to Raspberry Pi base stations, which would send it in batches to the analytics server every 10 seconds.

*Figure 2: EXPO Exhibition Hall floor plan with marked demos and catering stations (click to enlarge)*

We had 10 engineers helping on the data analysis challenge in August. However this posed its own challenges:

- Many short term contributors. Median time to spend on the project was a week including understanding the project, developing algorithms, integration with the system and testing.
- Simultaneous development. Regardless of which part of the system they would work on, people only had time in August, so we needed to find a way to work simultaneously rather than consecutively.
- Experience. In general, one’s time allocated to the project was inversely proportional to ones’ experience.
- Communication. The “low cost hardware” team did not understand the “analytics” team and vice versa.
- Data. We had no real data to test our algorithms on – the hardware was in development!

We needed to design a system that would allow numerous people with different backgrounds and levels of experience to:

- write algorithms simultaneously that would depend on each other
- integrate their algorithms with main system without disrupting other’s work
- test their algorithms without the data they relied upon

We put in place a 5-point plan to address these challenges:

- Strong core. Following the philosophy that a faulty design would hurt us more than a few bugs in the algorithmic code, we asked a couple of our Consultants to spend a few days designing the architecture and coding the core classes of our system.
- Small independent modules. Representing the system by small modules with clear responsibilities and interfaces was key to being able to absorb code contributions from a large team. See Figure 3 for the main system components. Creating sensible modules was facilitated by support in MATLAB for Object Oriented Programming. Keeping them small was possible thanks to the rich existing capability for reading, manipulating, analyzing and saving data. This includes built-in support for communications with databases, UDP, and different file formats as well as processing functions such as logical indexing, set operations, group statistics and graph analysis functions. All the modules were small: the largest file implementing a data processing algorithm contained 100 lines of code. Short code was obviously faster to write, but it was also faster to test.
- Self-service model repository. We designed the system as a series of “analysis agents”: classes tasked to perform a very specific analysis operation. Some agents would be responsible for cleaning entry data, some for tracking conversations; some would search for the most active networkers, etc. We combined Observer and Template Method patterns to create agents that had a relatively sophisticated way of interacting with the main infrastructure, while allowing team members with no Object Oriented experience to contribute code. In practice, this meant that the algorithm developer only had to write very simple procedural, code specific to her or his algorithm: how to get input data, process it and save output. Then with one line of code, she/he could set his agent to “observe” and act on changes on the database. As an example, on the appearance of new raw data an agent would clean it and save it back to the database.
- Data: fake it till you make it. What might be a disputable strategy in everyday social interactions was essential to enable the algorithm developers to write and test their code. Almost all parts of the analytics system were developed using simulated data. In the first few days of the project, in parallel with designing and implementing the analytics system, we created a high-level simulator that would create synthetic data representing a number of tags following journeys around the exhibition hall. Key to developing the first simulator in just a couple of days was that MATLAB offered all the required components in a single environment: random number generation, visualization and an expressive programming language. As we needed to switch to real data late in the project, we wrote our code accessing data (“Data Access Layer”) in such a way that it was possible to use data from the simulator, test logs and database without any change in the algorithmic code.
- Code generation. The next step was to add a layer to the simulator representing the logic of the system that would relay information received from the tags to the analytics server: how the tags synchronize to acquire and share the information, how they send this to base stations, and finally how base stations form and send packets to the analytics server. This “executable specification” enabled early detection of the differences between how the “hardware” and “analytics” team understood the requirements. But even more importantly once this logic has been expressed in MATLAB, we could use MATLAB Coder to automatically generate C code that would be used on the Raspberry Pi base stations. This saved us time and increased our confidence that the data we were using to test our algorithms was representative of what our data acquisition system would produce when ready. It also allowed us to test the embedded algorithms while the tag hardware was still being developed, since they were included in the simulator. This is a big benefit for embedded systems designers.

*Figure 3: High-level system architecture: uniform access to real and simulated data; analysis agents representing system logic and producing aggregate data about the event to be displayed in MATLAB or via web interface*

In the meantime, the hardware team was busy creating in reality what we had simulated: the networking system that would relay the tag data to a server.

By the end of August we had early versions of all the analysis agents, tested on data from the simulator. We also had a long list of known issues, including the fact that our prototype code could not keep up to the predicted data rate.

**September: “Production”**

5 weeks to go

September saw the transformation of our prototype system into an early version of what could be called a production system.

First, we harnessed our code in a series of unit tests (better late than never!) to allow faster iterations when fixing bugs and optimizing the code.

Second, we optimized the prototype implementations for speed. A couple of the biggest wins were:

Vectorization of code of calculation of the social network graph Implementation of some of the tag data filtering in SQL instead of MATLAB to reduce time overhead of data transfersBoth operations reduced the time needed to analyze a batch of data from 10s of seconds to fractions of seconds.

Third, we re-implemented our analysis engine to run each analysis agent on a separate thread. Out of many choices for doing so, we decided to recode the thin analysis engine skeleton in Java and deploy each analysis agent as a separate function on MATLAB Production Server. Thanks to using the loosely coupled agents in the first place, this required only changing one line in the agent’s code.

As a final step we used a machine on EC2 with an elastic IP to set up a Tomcat webserver with our webpage and a directory that would be regularly synchronized with our local directory containing all the results.

At the end of September we had a system analyzing the tags every 10 seconds and updating the webpage every minute but still working only with the simulator. Throughout September attempts to connect the two failed – due to an Endian disagreement, too slow a processing rate in the early version of the analysis engine, data overflow in the prototype data gathering stage, and more.

But we were not ready to cancel the project. In the end, we would not often have an opportunity to test it in a hall with 500 people, 8 catering stations, 9 demo stations and a BLOODHOUND SSC life-size model in the middle (yes, a big pile of metal in the middle of our RF experiment).

**October: EXPO or The First Integration Test**

1 week to go

First week of October has gone between integration attempts and bug fixing. You can see that the moods were mixed:

*Figure 4: Rainbow of emotions; Note as moods are improving when we move from bug chasing to handling some real hardware*

The day before EXPO (October 7th) three of us set off to Silverstone, the conference venue. Two hours after arriving everything was connected. The moment the webpage started to show the two test tags in “conversation” with each other was worth all those sleepless nights. We stared in awe. By 7pm all 8 base stations were connected and receiving. At that point we put the system to sleep and left the site.

The morning of EXPO passed in a flash. We started with a pile of 200 tags (see Figure 5). By 9.30 am all of them has been distributed. Attendees were curious about the system and wanted to participate. The webpage started to update with plausible results (see sample in Figure 7). All team members, most of whom so far have seen only a small part of the system, were hypnotized by the updating web page. Throughout the day the demo drew a big crowd who asked about everything from working with low cost hardware, through machine learning and “big data” to web deployment. We also highlighted parts of the system during all relevant talks: Machine Learning, Accelerating MATLAB algorithms, Deployment, Modelling, Simulation, and Real-Time Testing for Model-Based Design. Finally, in the wrap up talk we announced the tag number that won the networking competition prize. We knew the system was working when we discovered that the winner was our ex-colleague Michael, known for being a social butterfly.

So through this project we achieved:

- end-to-end technology stack integration, starting from low cost hardware to the cloud-based visualization
- cross-disciplinary team collaboration
- blue sky idea to cloud deployment in 3 months

But as EXPO was also the first time the system ran together, we still had a lot of work to understand what it was really doing.

Learn more as the story continues: Expo Conversations – Part 2

*Figure 5 Tags waiting to be handed to attendees*

*Figure 6: Queue to pick up the tags*

*Figure 7: Screenshot of the live social networking analysis*

We all use Fourier analysis every day without even knowing it. Cell phones, disc drives, DVDs, and JPEGs all involve fast finite Fourier transforms. This post, which describes touch-tone telephone dialing, is the first of three posts about the computation and interpretation of FFTs. The posts are adapted from chapter 8 of my book, *Numerical Computing with MATLAB* .

Touch-tone telephone dialing is an example of everyday use of Fourier analysis. The basis for touch-tone dialing is the Dual Tone Multi-Frequency (DTMF) system. *Synthesis* is the generation of analog tones to represent digits in phone numbers. *Analysis* is the decoding of these tones to retrieve the digits. The program `touchtone`, which is available here, or which is included with my NCM App, demonstrates both these aspects of DTMF.

The telephone dialing pad acts as a 4-by-3 matrix. Associated with each row and column is a frequency. These basic frequencies are

fr = [697 770 852 941]; fc = [1209 1336 1477];

These frequencies were chosen so that none is an integer multiple of any other. In fact the ratios are 21/19. This avoids interfering overtones.

If `s` is a character that labels one of the buttons on the keypad, the corresponding row index `k` and column index `j` can be found with

switch s case '*', k = 4; j = 1; case '0', k = 4; j = 2; case '#', k = 4; j = 3; otherwise, d = s-'0'; j = mod(d-1,3)+1; k = (d-j)/3+1; end

An important parameter in digital sound is the sampling rate.

Fs = 32768

A vector of points in one-fourth of a second at this sampling rate is

t = 0:1/Fs:0.25

The tone generated by the button in position `(k,j)` is obtained by superimposing the two fundamental tones with frequencies `fr(k)` and `fc(j)`.

y1 = sin(2*pi*fr(k)*t); y2 = sin(2*pi*fc(j)*t); y = (y1 + y2)/2;

If your computer is equipped with a sound card, the statement

sound(y,Fs)

plays the tone.

For example, this is the display produced by `touchtone` for the "1" button. The top subplot depicts the two underlying frequencies and the bottom subplot shows a portion of the signal obtained by averaging the sine waves with those frequencies.

How is it possible to determine a phone number by listening to the signal generated? This involves the FFT, the Finite Fourier Transform, or, more likely, some specialized version of the FFT adapted to this task. I will discuss the FFT in my next post.

The data file `touchtone.mat` contains a recording of a telephone being dialed. The file is available from

http://www.mathworks.com/moler/ncm/touchtone.mat

Go to the `Home` tab above the MATLAB command window, click on the `Open` tab and insert this URL into the `File name` slot. This will load a structure `y` into your MATLAB workspace. Then the statement

y

will produce

y = sig: [1x74800 int8] fs: 8192

This shows that `y` has two fields, an integer vector `y.sig`, of length 74800, containing the signal, and a scalar `y.fs`, with the value 8192, which is the sample rate.

max(abs(y.sig))

reveals that the elements of the signal are bounded in absolute value by 127. So the statements

Fs = y.fs; y = double(y.sig)/128;

save the sample rate, rescale the vector and convert it to double precision. The statements

n = length(y); t = (0:n-1)/Fs

reproduce the sample times of the recording. The last component of `t` is `9.1307`, indicating that the recording lasts a little over 9 seconds. The next figure is a plot of the entire signal.

This recording is noisy. You can even see small spikes on the graph at the times the buttons were clicked. It is easy to see that 11 digits were dialed, but, on this scale, it is impossible to determine the specific digits.

Here is the magnitude of the FFT of the signal, which is the key to determining the individual digits.

p = abs(fft(y)); f = (0:n-1)*(Fs/n); plot(f,p); axis([500 1700 0 600])

The *x* -axis corresponds to frequency. The `axis` settings limit the display to the range of the DTMF frequencies. There are seven peaks, corresponding to the seven basic frequencies. This overall FFT shows that all seven frequencies are present someplace in the signal, but it does not help determine the individual digits.

The `touchtone` program also lets you break the signal into 11 equal segments and analyze each segment separately. The next figure is the display from the first segment.

For this segment, there are only two peaks, indicating that only two of the basic frequencies are present in this portion of the signal. These two frequencies come from the "1" button. You can also see that the waveform of a short portion of the first segment is similar to the waveform that our synthesizer produces for the "1" button. So we can conclude that the number being dialed in `touchtone` starts with a 1. We leave it as an exercise to continue the analysis with `touchtone` and identify the complete phone number.

`touchtone` has a hidden "Easter egg" that, as far as I know, has never been uncovered. If anyone finds it, submit a comment and let us know.

Cleve Moler, *Numerical Computing with MATLAB*, Electronic Edition, MathWorks, <http://www.mathworks.com/moler/index_ncm.html>,

Print Edition, SIAM Revised Reprint, SIAM, 2008, 334 pp., <http://bookstore.siam.org/ot87> .

Get
the MATLAB code

Published with MATLAB® R2014a

But what about Simulink? A lot of work is done using the mouse, so the touch typing configuration is less relevant. Are there guidelines or tips to build, edit, and navigate simulink models as efficiently as possible?

This week I share mine!

**Navigating**

When navigating, I typically keep my thumb on the **Alt** key, my middle finger on the **Esc** key, and the index on the **1** key.

This allows me to easily do:

**Esc**to go up one level with the middle finger**Atl+1**to zoom 100%, using thumb on**Alt**and index on**1****Space bar**to zoom to fit with the thumb**Alt+Tab**to switch between windows, typically the Library Browser and the model, using thumb on**Alt**and index on**Tab**.

Here is a typical example of me exploring a model. This typically ends up being a pattern of hitting the **Space bar** when I enter a new subsystem to get the big picture, then **Atl+1** to see the blocks zoomed at 100% and **Esc** to go up one level when I am done.

**Editing**

When editing, what I need the most are the **Ctrl** and **Ctrl+Shift** shortcuts. So I slightly move my left hand to keep my small finger on the **Ctrl** key, and my ring finger on the **Shift** key.

Here is the list of shortcuts I typically use in this configuration:

Bottom line:

**Ctrl+Z**to Undo**Ctrl+X**to Cut**Ctrl+C**to Copy**Ctrl+V**to Paste**Ctrl+B**to Build the model using Simulink Coder

Middle Line:

**Ctrl+A**to Select all blocks in the current system**Ctrl+S**to Save**Ctrl+D**to update the diagram (ALWAYS keep your model as close as possible to an updatable state, this will save you lots of debugging time!)**Ctrl+G**to Group selected blocks in one subsystem**Ctrl+H**to open the Model Explorer (am I the only one to pronounce it Model HHHexplorer because of that?)

Top line:

**Ctrl+E**to open the model configuration**Ctrl+R**to Rotate blocks**Ctrl+T**to start a simulation**Ctrl+Y**to redo something previously undone**Ctrl+U**to open a masked subsystem. With the introduction of the arrow badge, I use this shortcut less often since R2012b**Ctrl+I**to flip a block left-right. Ok, this one requires some intense stretching to be done with only one hand... be careful to not hurt yourself.

In addition go the **Ctrl** shortcuts, this hand position allows you to access a set of **Ctrl+Shift** shortcuts, by using the little finger for **Ctrl**, and the ring finger for **Shift**:

**Ctrl+Shift+X**to Comment out and uncomment blocks**Ctrl+Shift+Y**to Comment through blocks**Ctrl+Shift+H**to remove highlighting in the model**Ctrl+Shift+G**to Expand a subsystem**Ctrl+Shift+L**to open the Simulink Library Browser. (This one cannot be done only with the left hand, but I thought I should point out his existence. Most of the time, when I edit model I alternate between the Library Browser and the model using**Atl+Tab**)

You can see the full list of Simulink shortcuts here.

Since many **Ctrl+Shift** combinations do not have shortcuts assigned, I like to use those to assign custom shortcuts using an sl_customization file, as I previously highlighted in this post. One example for me is **Ctrl+Shift+Z** to display or not the name of blocks.

Here is an animation showing usage of the most common shortcuts:

**Now it's your turn**

Share your tricks to use Simulink as efficiently as possible by leaving a comment here.

]]>

I’d like to introduce you to this week’s guest blogger, Graham Dudgeon. Graham is with our Industry Marketing Team at MathWorks, and focuses on the Utilities & Energy Industry. In this blog, Graham shares a story about age being no barrier to exploring ideas and concepts in MATLAB and Simulink.

Hi Everyone, and thank you Loren for inviting me to be a guest blogger. This story begins with my dog, waiting patiently at the patio door expecting either of my two sons to let her in. She was out of luck. I am no dog psychologist, but I figure that because she could see the boys very clearly, that they in turn could see her. In her mind, she didn’t need to bark to raise their attention…. and so on she waited while my sons continued watching Phineas and Ferb. Once I was alerted to her predicament a mere fraction of a second after entering the family room (by seeing her hopeful eyes and her tail wagging with a vigor in direct proportion to how long she had been waiting), I let her in and then reinforced to my sons that they need to be more aware of their surroundings. This is where MATLAB and Simulink come into the story… To help pooch out, I made a slight modification to demoimaqsl_rgbhistogram_win.slx , aimed the web cam at the patio door and when pooch came to the door, we were alerted to here presence by a ‘gong’ that I took from ‘gong.mat’, and had a MATLAB callback to execute the sound when a color threshold was reached – it turns out red works best for a fox hound. Was pooch consequently let into the house by my boys? Unfortunately not. The boys decided that a much better use of my endeavor was to see if they could creep past the field of view without the gong going off. This kept them entertained for longer than I expected.

My eldest son, who is 9 years old, started asking questions about how the model worked, how the camera worked, what was doing what, how do you detect pooch, etc. etc. I must have answered with sufficient clarity and enthusiasm as he decided to take what I had done and modify it for his Science Expo.

My son watched the RGB plot as we passed different colors in front, and gained a feel for how to develop a detection algorithm. Our process was to let my son watch only the RGB plot, not the object in front of the camera, and tell me what color he thought was there. ‘Red!’, ’Blue!’, ’Kinda red and blue, and a bit of green!’. We would then confirm his observations by looking at the actual object and moving it again across the field of view to see the RGB response change with the movement. My son associated y-axis magnitude with detection and figured out that to detect, he need simply tell the computer to ‘gong’ when a color got to a certain magnitude. My son also figured that you could remain ‘invisible’ if the detector was tuned only for red and you wore another color, and that a black background would cause less noise on the color signals, as black caused only a small RGB signal. My youngest son also helped out, by wearing different colored shirts and walking (sometimes running) past the web cam. The list of insights my son gained go on and on. He presented his work at the Science Expo, and was met with great enthusiasm and interest by the High School Scientists who assessed his work.

So how does this story end? A few days ago my eldest son told me that when he grows up he wants to be an engineer… and that he wants to work for MathWorks. On hearing this, my 5 year old son said he wanted to be a scientist (as he ‘wants to see into stuff’)… and that he too wants to work for MathWorks. So the story continues, and our future is in good hands

And what of pooch? She’s standing outside the patio door (with hopeful eyes and tail wagging vigorously) waiting for me to finish this post.

Before I sign off, let me give you my main takeaway from this story… You can use MathWorks tools to help kids of any age explore and evolve their curiosity, by connecting them directly to the world of Science and Engineering. Give it a try!

Get
the MATLAB code

Published with MATLAB® R2014a

Jiro's pick this week is Reference Creator by Valerio Biscione.

Even though I have been away from the world of academia for quite some time, I can remember and appreciate the effort that people put into writing papers. And in that process, one of the more mundane tasks is putting together the references. Not only do I need to figure out the exact article and journal names, I need to put the information in the correct format. It may not be super difficult, but it's definitely time-consuming.

Valerio's Reference Creator makes this process extremely easy. It accesses Mendeley, an online research paper database, to gather the bibliographical data based on some search criteria, such as author name and year. You can use his tool either through the command line interface or the UI. I gave it a try using a paragraph from one of my papers.

It even gave me a warning when it found multiple matches with the same author(s) and year.

Thanks Valerio for this useful tool! I'm sure a lot of researchers out there will appreciate this.

*Warning: This entry uses some undocumented/unsupported capabilities, namely the displaying of colored and bold text in the
edit boxes. Use this entry with caution, as undocumented features can change in behavior or stop working at any time.*

**Comments**

Give this a try and let us know what you think here or leave a comment for Valerio.

Get
the MATLAB code

Published with MATLAB® R2014a

Since the video at the center of this post is about a rap battle between MATLAB and Simulink, I thought it would be interesting to bring that to a higher level and see how the video processing could have been implemented in Simulink instead of MATLAB.

Seriously... who wants to write code when you can simply connect blocks?

**Overview**

Since I have absolutely zero knowledge in image processing, I simply re-implemented the algorithm suggested by Matt in Simulink. The resulting model looks like:

Let's look at the details.

**Importing the video**

To begin, I used the From Multimedia File from the DSP System Toolbox to directly read the AVI-file with our rappers in front of a green background. Then I use the resize block from the Computer Vision System Toolbox. This allows me to work on images of a relatively small size. After, I normalize the data, as suggested by Matt.

**The Greenness factor**

I really like the simple equation the Matt figured out to calculate the "greenness" of each pixel in the image. In MATLAB, it looks like:

```
% Greenness = G*(G-R)*(G-B)
greenness = yd(:,:,2).*(yd(:,:,2)-yd(:,:,1)).*(yd(:,:,2)-yd(:,:,3));
```

**Thresholding**

The next piece of code is something that many users struggle to figure out:

`thresh = 0.3*mean(greenness(greenness>0));`

To implement logical indexing (e.g. `A(A>0)`) in Simulink, we need to combine the Find block with a Selector block. This results in a variable-size signal

**Combining the images**

The last line we need to implement is to replace the pixels identified as green in the foreground image but their values in the background image. In MATLAB, this is accomplished using this line:

`rgb2(isgreen) = rgb1(isgreen);`

In Simulink, we need to extend the logic seen at the previous step. We first select the desired pixels from the background image, and then assign them to the foreground image:

Note that the Computer Vision System Toolbox has a Compositing block implementing this functionality.

Here is what the final result looks like:

**Now it's your turn**

What do you think? Is it more convenient to implement this algorithm in MATLAB or Simulink? How do you typically decide between a *code-based* or a *block-based* implementation? Let us know by leaving a comment here.

At the party, we learned that Cleve is the winner of the 2014 IEEE John von Neumann Medal, which is awarded for outstanding achievements in computer-related science and technology.

Happy Birthday and Congratulations, Cleve! I've learned more just by working in the same hallway as you than in much of my formal education. Thanks for being so generous in sharing your knowledge and experience with so many of us for so many years.

Readers, be sure to check out Cleve's blog. Or listen to Cleve describe the three decades of mathematicians, computers, and technology (and a LINPACK license plate!) that led to the creation of MATLAB.

]]>For several years we thought Hadamard matrices showed maximum element growth for Gaussian elimination with complete pivoting. We were wrong.

I want to continue the discussion from my previous blog post of element growth during Gaussian elimination. Suppose we are solving a system of linear equations involving a matrix $A$ of order $n$. Let $a_{i,j}^{(k)}$ denote the elements in the matrix after the $k$ th step of the elimination. Recall that the *growth factor* for the pivoting process we are using is the quantity

$$ \rho_n = { \max_{i,j,k} |a_{i,j}^{(k)}| \over \max_{i,j} |a_{i,j}| } $$

Complete pivoting is intended to control the growth factor by using both row and column interchanges at each step in the reduction to bring the largest element in the entire unreduced matrix into the pivot position. In his 1962 analysis of roundoff error in Gaussian elimination, J. H. Wilkinson proved that the growth factor for complete pivoting satisfies

$$ \rho_n \le g(n) $$

where the *growth function* is

$$ g(n) = (n \ 2 \ 3^{1/2} 4^{1/3} \cdots n^{1/(n-1)})^{1/2} \approx 1.8 \ n^{1/2} n^{1/4 \log{n}} $$

Wilkinson's analysis makes it clear that the growth can never actually be anywhere near this large.

We saw that the growth factor for partial pivoting is $2^{n-1}$. For complete pivoting, it's much, much less. To get a feeling for now much, express $g(n)$ as a MATLAB one-liner

g = @(n) sqrt(n*prod((2:n).^(1./(1:(n-1)))))

g = @(n)sqrt(n*prod((2:n).^(1./(1:(n-1)))))

Check out $n = 100$

g(100)

ans = 3.5703e+03

So here $g(n) \approx 35n$.

Jacques Hadamard was a French mathematician who lived from 1865 until 1963. He made contributions in many fields, ranging from number theory to partial differential equations. The matrices named after him have entries +1 and -1 and mutually orthogonal rows and columns. The $n$ -dimensional parallelotope spanned by the rows of a Hadamard matrix has the largest possible volume among all parallelotopes spanned by matrices with entries bounded by one. We are interested in Hadamard matrices in the MATLAB world because they are the basis for Hadamard transforms, which are closely related to Fourier transforms. They are also applicable to error correcting codes and in statistics to estimation of variance.

So MATLAB already has a `hadamard` function. Here is the 8-by-8 Hadamard matrix.

H = hadamard(8)

H = 1 1 1 1 1 1 1 1 1 -1 1 -1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 -1 1 1 -1 -1 1 1 1 1 1 -1 -1 -1 -1 1 -1 1 -1 -1 1 -1 1 1 1 -1 -1 -1 -1 1 1 1 -1 -1 1 -1 1 1 -1

Check that the rows are perpendicular to each other.

H'*H

ans = 8 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 8

The orthogonality of the rows leads to element growth during Gaussian elimination. If we call for the LU factorization of `H`, no pivoting actually takes places, but the same result would be produced by complete pivoting that settles ties in favor of the incumbent.

[L,U] = lu(H)

L = 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 1 1 0 0 0 0 1 0 0 0 1 0 0 0 1 1 0 0 1 1 0 0 1 0 1 0 1 0 1 0 1 1 1 1 1 1 1 1 U = 1 1 1 1 1 1 1 1 0 -2 0 -2 0 -2 0 -2 0 0 -2 -2 0 0 -2 -2 0 0 0 4 0 0 0 4 0 0 0 0 -2 -2 -2 -2 0 0 0 0 0 4 0 4 0 0 0 0 0 0 4 4 0 0 0 0 0 0 0 -8

This, then, is one matrix at $n = 8$ where

$$ \rho_n = n $$

For almost 30 years everybody believed that Hadamard matrices were the extreme cases and that the growth factor $\rho_n$ for complete pivoting with real matrices was equal to $n$.

At various times in the 1960s and '70s I personally did numerical experiments looking for matrices with large pivot growth. The computers I had at the time limited the order of my matrices to a few dozen. I never found any cases of $\rho_n > n$.

One of the people who worked on the problem was Leonard Tornheim at Chevron Research in California. He also did a lot of numerical experiments. He found a complex 3-by-3 matrix with $\rho_3 > 3$. He also proved that for real matrices, $\rho_3 = 2 \ 1/4$.

In 1968, Colin Cryer made an official conjecture that $\rho_n$ was equal to $n$ for real matrices.

A 1988 paper by Jane Day and Brian Peterson provides a good survey of what was known up to that time about element growth with complete pivoting.

Then, in 1991, Nick Gould, from Oxford, surprised us all by announcing the discovery of real matrices for which the complete pivoting $\rho_n$ is greater than $n$. For an $n$ -by- $n$ matrix, with $n$ between 13 and 16, Nick set up a large, sparse, nonlinear optimization problem involving roughly $n^3/3$ variables, and tackled it with the LANCELOT package that he and his colleagues Andy Conn and Philippe Toint had developed. Most of his paper is about a 13-by-13 matrix with

$$ \rho_{13} = 13.0205 $$

This just barely exceeds the $\rho_n = n$ conjecture. But he also found some slightly larger matrice that do a better job.

$$ \rho_{14} = 14.5949 $$

$$ \rho_{15} = 16.1078 $$

$$ \rho_{16} = 18.0596 $$

The 16-by-16 is particularly dramatic because it is not a Hadamard matrix.

So, as far as I know, this is where the matter still stands today. Nobody is interested in running larger experiments. I have no idea what $\rho_{100}$ or $\rho_{1000}$ might be. Of course, we don't really need to know because we don't use complete pivoting in practice.

This reminds me of a story that I like to tell. I'm not sure it's actually true.

Peter Businger was a grad student at Stanford in Math and Computer Science at the same time I was in the early '60s. He worked with Gene Golub to write the first published Algol procedure for the QR matrix factorization using Householder reflections. Peter left Stanford to join Joe Traub's group at Bell Labs that was producing what they called the Port library of mathematical software. They were importing programs from everybody else, running the routines on Bell Labs' machines, thoroughly testing the software, deciding which codes were best, and producing a consolidated library.

Peter was in charge of matrix computation. He checked out all the imported linear equation solvers, including mine. He decided none of them was satisfactory, because of this pivoting growth question. So he wrote a program of his own that did partial pivoting, and monitor the growth. If the growth was too large, the program switched to complete pivoting.

At the time, Bell Labs was trying to patent software, so Peter named his new subroutine "P^8". This stood for "Peter's Pseudo Perfect Perfect Partial Pivot Picker, Patent Pending". In my opinion, that is one of the great all time subroutine names.

The experience inspired Peter to go to law school at night and change jobs. He switched to the Bell Labs patent department.

I've lost contact with Peter, so I can't ask him if he really did name the routine P^8. If he didn't, he should have.

Les Foster, who I mentioned in the last post for finding examples of exponential growth with partial pivoting, promotes what he calls "Rook Pivoting". Search down the column for the largest element, as in partial pivoting, but then also search along that row for the last element. This is intermediate between partial and complete pivoting. Les is able to prove some results about pivot growth. But the algorithm does not generalize well to a block form.

I was present at a milestone in the history of Hadamard matrices. The orthogonality conditions readily imply that the order $n$ of a Hadamard matrix must be 1, 2, or a multiple of 4. But the question remains, does a Hadamard matrix of order $n = 4k$ exist for every $k$? This is an open question in number theory.

It is fairly easy to create Hadamard matrices of some sizes. A nifty way to generate a Hadamard matrix of order a power of two is the recursion.

H = 1; for n = 2, 4, 8, ... H = [H H H -H]; end

If `A` and `B` are Hadamard matrices, then so is

kron(A,B)

By 1961, with these and other similar constructions, it turned out that it was known how to construct Hadamard matrices for all orders $n \le 100$ that are multiples of four except $n = 92$. In 1961 I had a summer job at JPL, Caltech's Jet Propulsion Lab. My office was in a temporary trailer and my trailer mate was Len Baumert. Len proudly showed me a color graphic that he had just made and that he was proposing for the cover of *Scientific American*. It was a 92-by-92 matrix made up from 26-by-26 blocks of alternating light and dark cells representing +1 and -1. The graphic didn't make the cover of *Scientific American*, but I kept my copy for a long time.

Len had done a computation on JPL's machines that filled in the last value of $n$ less than 100. His paper with his colleagues Sol Golomb, a professor at USC, and Marshall Hall, Jr., a professor at Caltech, was published in the prestigious Bulletin of the AMS. It turns out that I had taken a course on difference sets, the mathematics involved in generating the matrix, from Hall at Caltech.

Here is a MATLAB picture of `Baumert92`. You can get the function from this link.

H = Baumert92; pcolor92(H);

Let's check that we get the expected pivot growth.

[L,U] = lu(H); unn = U(end,end)

unn = -92

This reference to the book by Trefethen and Bau should have been included in my previous post about partial pivoting. Lecture 22 has an explanation the stability of partial pivoting in practice.

Lloyd N. Trefethen and David Bau, III, *Numerical Linear Algebra*, <http://bookstore.siam.org/ot50>, SIAM, 1997, 362pp.

Peter Businger, "Monitoring the numerical stability of Gaussian elimination", <http://link.springer.com/article/10.1007/BF02165006>, Numerische Mathematik 16, 4, 1971, 360-361.

Jane Day and Brian Peterson, "Growth in Gaussian elimination", <http://www.jstor.org/stable/2322755>, American Mathematical Monthly, 95, 6, 1988, 489-513.

Nick Gould, "On growth in Gaussian elimination with complete pivoting", <http://www.numerical.rl.ac.uk/people/nimg/pubs/Goul91_simax.pdf> SIAM Journal on Matrix Analysis and Applications 12, 1991, 351-364.

Leslie Foster, "The growth factor and efficiency of Gaussian elimination with rook pivoting", <http://www.math.sjsu.edu/~foster/gerpwithcor.pdf>, Journal of Computational and Applied Mathematics, 86, 1997, 177-194.

Leslie Foster, "LURP: Gaussian elimination with rook pivoting", matlabcentral/fileexchange/37721-rook-pivoting, MATLAB Central File Exchange, 2012.

Leonard Baumert, S. W. Golomb, and Marshall Hall, Jr, "Discovery of an Hadamard Matrix of Order 92", <http://dx.doi.org/10.1090%2FS0002-9904-1962-10761-7>, Bulletin of the American Mathematical Society 68 (1962), 237-238.

Get
the MATLAB code

Published with MATLAB® R2014a

If you want the distinguished honor of a Will Pick of the Week, here are some easy guidelines to follow:

- Choose an interesting or fun topic.
- Package the submission so that it's easy to figure out how to use.
- Ensure that it actually works for standard user inputs.

I'll probably want to get a second opinion, but PianoTuner tells me that my keys are all falling flat. As to how it figures this out, the algorithm is about 80 lines of code embedded into the UI callback functions. It looks like there's a fast Fourier transform and a polynomial curve fit in there.

Incidentally, you can run the tool against non-piano instruments as well. I played a random song on my computer speakers and found that musicians with professional instruments do much better.

Let us know what you think here or leave a comment for John.

]]>

Here is my answer.

]]>

*I'd like to welcome guest blogger and ace MATLAB training content developer Matt Tearle for today's post. Thanks, Matt!*

What do you do if you don't have an image processing background and your boss asks you to produce a music video for a MATLAB-vs-Simulink rap battle? It's probably not a question you've given much thought to. I admit that I hadn't either. But that's the awkward situation I found myself in a little while back.

A MATLAB-vs-Simulink rap battle -- obviously I **had** to be a part of this project! There was just one significant problem: I'm not a video producer. Yes, I have a nice digital camera. I have some very basic editing software. And I certainly have some... "creative" ideas. But I don't have any software designed to do serious video editing.

In particular, as soon as the concept was explained to me, I decided that we needed to do a "green screen" video:

- We'd record our rappers busting out their rhymes in front of a green backdrop
- I'd make an entertaining background video
- Then we'd put the two together, using the background video in place of the green screen:

[This background may not seem particularly "entertaining", but if I showed the real one we used, lawyers might get involved... And nobody wants that.]

With no purpose-made software to achieve this, and only a few days before it was due (thanks, boss!), I turned to my default standby: MATLAB!

A video is just a sequence of images, so all I needed to do was to write MATLAB code to "green-screen" two images, then read in two videos and apply my algorithm frame-by-frame. A quick hunt around in the documentation, and I had the framework code ready to go:

% Open the input video files v1 = VideoReader('background.mp4'); v2 = VideoReader('foreground.mp4'); % Determine the number of frames for the final video nFrames = min(v1.NumberOfFrames,v2.NumberOfFrames); % Open a video file to write the result to vOut = VideoWriter('mixedvideo.mp4','MPEG-4'); vOut.FrameRate = 24; open(vOut); % Loop over all the frames for k = 1:nFrames % Get the kth frame of both inputs x = read(v1,k); y = read(v2,k); % Mix them together z = magicHappensHere(x,y); % TODO: Write this... % Write the result to file writeVideo(vOut,z) end % And we're done! close(vOut);

Now all I needed to do was write the magic function that would do the greenscreening...

Actually, I soon discovered that I needed to do a bit more preprocessing first. I had created the background video on my computer, but filmed the foreground on my digital camera. Of course the dimensions didn't match... No worries! That's what Image Processing Toolbox is for. I added a line of code to define the desired dimensions of the output video. Then `imresize` took care of the rest:

% Loop over all the frames for k = 1:nFrames % Get the kth frame of both inputs x = imresize(read(v1,k),outDims); y = imresize(read(v2,k),outDims); % Mix them together z = magicHappensHere(x,y); % Write the result to file writeVideo(vOut,z) end

Wow, this image processing stuff is **easy**! What's next, then?

The first thing to try, of course, when developing an algorithm is to see if one already exists. Sadly, searching the Image Processing Toolbox documentation didn't turn up a `greenscreen` function or anything I could see that would do the job. So it looks like I have to do this from scratch.

If I can identify the green pixels, then I can just replace those pixels in the foreground image with the corresponding pixels in the background image. So all of this really just boils down to: how do I find the green pixels? I know that a true-color image in MATLAB is an m-by-n-by-3 array, where each of the three m-by-n "planes" represent red, green, and blue intensity, respectively:

"Green" would, ideally, be [0, 255, 0] (in `uint8`). So a simple way to find the green pixels would be to do some kind of thresholding:

isgreen = (y(:,:,1) < 175) & (y(:,:,2) > 200) & (y(:,:,3) < 175);

This didn't work too badly, but neither was it great. One of the problems is the use of magic threshold values -- three of them, no less! Playing around with three magic parameters to try to find the right combination doesn't sound like fun to me. And besides, using absolute values like this might not be the best way to judge "greenness". If you look at the example image above, you can see that there's a fair amount of variation in the green background. (This is mainly because we were using an unevenly lit conference room that happened to have a green wall -- this was clearly a very professional production!)

At this point, I had to ask myself "what does it really mean to be 'green'?". As a human, I can easily distinguish between the various shades of green on the wall, despite the difference in brightness and shading. At first, this lead me into thinking about color spaces and how a color is a point in some 3-D space (RGB, CYM, HSV, etc.). Perhaps I could define "green" as "sufficiently close to [0 1 0] in RGB space". And so followed a few experiments with rounding, distance calculations, even playing with `rgb2ind`. But to no great success:

yd = round(double(y)/255); isgreen = (yd(:,:,1)==0) & (yd(:,:,2)==1) & (yd(:,:,3)==0);

As this image shows, the problem with our makeshift green screen was that it wasn't particularly green -- more a yellow-green, with significant variations in brightness.

Just as I was started to get worried that this wasn't going to work at all, I realized that one obvious characteristic of the green pixels is that the green value is significantly **greater** than the blue and red values. This is, of course, embedded in my simple thresholding code above, but I can do away with at least some of the magic constants by looking at **relative** values. So how about calculating a "greenness" value from the three color channels:

```
yd = double(y)/255;
% Greenness = G*(G-R)*(G-B)
greenness = yd(:,:,2).*(yd(:,:,2)-yd(:,:,1)).*(yd(:,:,2)-yd(:,:,3));
```

Now **that** looks very promising. If I threshold that, I should get what I'm looking for. But rather than use an absolute threshold value maybe I can use something based on the distribution of the greenness values. In particular, I can exploit one nice feature of the images: they have a lot of green pixels. The histogram of greenness values looks like this:

I tried a simple thresholding based on the average greenness value, ignoring the negative values:

thresh = 0.3*mean(greenness(greenness>0)); isgreen = greenness > thresh;

Success! OK, so there's one magic constant left in there, but it was pretty quick and easy to tune, and seemed to be quite robust to minor changes. I'm calling that a good result. Now that I know where the green pixels are, I just have to replace them with the corresponding pixels from the background.

% Start with the foreground (essentially preallocation) z = y; % Find the green pixels yd = double(y)/255; % Greenness = G*(G-R)*(G-B) greenness = yd(:,:,2).*(yd(:,:,2)-yd(:,:,1)).*(yd(:,:,2)-yd(:,:,3)); thresh = 0.3*mean(greenness(greenness>0)); isgreen = greenness > thresh; % Blend the images % Loop over the 3 color planes (RGB) for j = 1:3 rgb1 = x(:,:,j); % Extract the jth plane of the background rgb2 = y(:,:,j); % Extract the jth plane of the foreground % Replace the green pixels of the foreground with the background rgb2(isgreen) = rgb1(isgreen); % Put the combined image into the output z(:,:,j) = rgb2; end

I love using logical indexing -- it's probably my favorite construct in the MATLAB language. Here I'm using it to replace the green pixels of the foreground (`rgb2(isgreen)`) with the corresponding background pixels (`rgb1(isgreen)`).

At this point I had a video that would have sufficed, but there was a noticeable green outline around my rappers. (It's not necessarily apparent in a single still frame.) It was only Sunday morning by this point, which meant that I still had a few hours left to tinker. Could I find a way to "shave" a pixel or two from the edge of the black silhouette in the above image of `isgreen`? To do that, I'd first need to find that edge. I don't know much about image processing, but I do know that "edge detection" is a thing that image people do, so it's time for me to search the doc... Sure enough, there's an `edge` function:

```
outline = edge(isgreen,'roberts');
```

Easy! Now if I can thicken that edge and combine it with the original `isgreen`, I'll have the slightly larger greenscreen mask that I need. Two more Image Processing Toolbox functions help me thicken the edge:

```
se = strel('disk',1);
outline = imdilate(outline,se);
```

The edge variable `outline` is a logical array, so I can easily combine it with `isgreen`, to get my final result:

isgreen = isgreen | outline;

Put it all together and I have a pretty simple script:

% Open the input video files v1 = VideoReader('background.mp4'); v2 = VideoReader('foreground.mp4'); % Determine the number of frames for the final video nFrames = min(v1.NumberOfFrames,v2.NumberOfFrames); % Set the output dimensions outDims = [400 640]; % Open a video file to write the result to vOut = VideoWriter('mixedvideo.mp4','MPEG-4'); vOut.FrameRate = 24; open(vOut); % Loop over all the frames for k = 1:nFrames % Get the kth frame of both inputs x = imresize(read(v1,k),outDims); y = imresize(read(v2,k),outDims); % Mix them together z = y; % Preallocate space for the result % Find the green pixels in the foreground (y) yd = double(y)/255; % Greenness = G*(G-R)*(G-B) greenness = yd(:,:,2).*(yd(:,:,2)-yd(:,:,1)).*(yd(:,:,2)-yd(:,:,3)); % Threshold the greenness value thresh = 0.3*mean(greenness(greenness>0)); isgreen = greenness > thresh; % Thicken the outline to expand the greenscreen mask a little outline = edge(isgreen,'roberts'); se = strel('disk',1); outline = imdilate(outline,se); isgreen = isgreen | outline; % Blend the images % Loop over the 3 color planes (RGB) for j = 1:3 rgb1 = x(:,:,j); % Extract the jth plane of the background rgb2 = y(:,:,j); % Extract the jth plane of the foreground % Replace the green pixels of the foreground with the background rgb2(isgreen) = rgb1(isgreen); % Put the combined image into the output z(:,:,j) = rgb2; end % Write the result to file writeVideo(vOut,z) end % And we're done! close(vOut);

This is not the best greenscreen code ever written (although it **is** the best greenscreen code **I've** ever written). But the real point is that it shouldn't even exist -- I had no prior image processing knowledge, I was working with videos of different dimensions, the greenscreen was unevenly lit (and not even particularly green), and I had a weekend to make it happen. The fact that I was able to do this is one of the main reasons I love MATLAB (and, of course, the associated toolboxes): I was able to spend my time on the heart of **my algorithm**, not all the coding details. Different-sized images? Fixed with one function call (`imresize`). Need to find edges of a region? One function call (`edge`). Want to thicken that edge? Two function calls (`strel` and `imdilate`). The final script is less than 50 lines.

My time was spent wrestling with the core of the problem: how I was going to figure out what "green" looked like in my images. Because I could spend my time there, I was able to tinker with different ideas. This, to me, is what MATLAB is all about: rapid prototyping -- ideas becoming working code.

Having made the video and feeling pretty good about what I managed to pull together in a few hours of MATLAB, I later realized that I got lucky. The way I calculated greenness (`G*(G-R)*(G-B)`) was intended to give negative results for anything that didn't have more green than red or blue. But I simply overlooked the possibility of getting a high positive value if **both** `G-R` and `G-B` were negative. According to my formula, a light magenta -- e.g. `[1 0.6 1]` -- would be very green! Luckily for me, my source videos didn't include anything like that. A more robust approach would have been something like this:

```
yd = double(y)/255;
% Greenness = G*(G-R)*(G-B)
greenness = yd(:,:,2).*max(yd(:,:,2)-yd(:,:,1),0).*max(yd(:,:,2)-yd(:,:,3),0);
```

Now any pixel with more red or more blue than green will have a greenness of 0. Everything else works the same.

I think there's a nice moral in this postscript, as well. I love using MATLAB to solve a problem -- a real, immediate problem that I need to solve right now. I don't necessarily need the best solution, or the solution to the most general problem. The greenscreen algorithm I hacked together wasn't the best, but it worked for my application. If my image had light magenta in it, I would have discovered this bug and fixed it, but that didn't happen and didn't matter.

If I had approached this assignment like a professional software developer, I'd still be working on it. But as a MATLAB hacker, I was done in a weekend!

Get
the MATLAB code

Published with MATLAB® R2014a

**A Look into the Future?**

We’ve been publically developing a Simulink model of the Hyperloop for some time now. We recently looked into using MATLAB interfaces to visualize that journey in GoogleEarth. To create a virtual cockpit, we constructed a MATLAB figure combining GoogleEarth windows and several other native graphics capabilities. We think the resulting video is pretty cool.

To understand how we arrived at this vision of the future, here’s a look back at our past Hyperloop posts.

**Getting Started**

We started out with a conversational post about Elon Musk’s concept and outlining some of the interesting technical points. This post was criticized for not brining any value to the discussion. I guess we were simulating trolling.

To get things moving, we introduced different approaches to architecting a simulation model for such a system. Guy suggested splitting the model into a plant (physical elements) and a controller (functional software) as in conventional model-based design. I proposed a more distributed architecture with several functional components, each containing its own plant and controller (image below). Guy’s proposal included subsystem variants while mine leveraged model reference.

**Initial dynamic analysis**

Last November, we published a two-dimensional analysis of the Hyperloop’s proposed route. We used the original proposal’s design constraints for curve radius to see how the track could be laid out over the California terrain.

The results highlighted the difficulties in maintaining tolerable g-levels for passengers while traveling at such high speeds along existing infrastructure. However, the analysis did not include several important factors; elevation and banking. Elevation promised to make route planning more difficult, while the effects of banking the vehicle in the tube would help reduce the g-forces and potentially improve the route.

*Anticipated hyperloop route (yellow) to avoid excessive g-forces (Image created using Google Earth)*

**Promoting Collaboration**

For Valentine’s Day, we used Simulink Projects to package a Simulink model on the MATLAB File Exchange that contained elements of each of our architectural approaches. My partitioning remained but now utilized Guy’s idea of using subsystem variants.

In April, R2014a was released with Simulink Projects supporting Git as a version control system. So, we quickly jumped on the bandwagon and placed an updated Hyperloop model on GitHub.

**Improved dynamics**

In May, we investigated the elevation profile of the route. We used optimization techniques to find the best combination of pillars and tunnels to meet cost and comfort demands. There were some great improvement suggestions in the comments of this post that we’ve yet to implement.

Last month, we enhanced the model to include the effects of banking within the tube. We blogged about a simple approach in SimMechanics to model those dynamics.

**Roll your own!**

In keeping with our collaborative mindset, the necessary models and m-scripts to generate this video are now available on the GitHub submission.

The project has even been set up to support new routes. Load your own KML file and see how the Hyperloop looks in your neighborhood.

]]>Have you ever needed to move a figure? Perhaps because it is offscreen>? Or at least the menus or close box are not visible? This happens to me from time to time when I get code from someone else who is programming specifically to the size of their display. Working on a laptop, I often have fewer available pixels. What to do?

I used the Function Browser to search for "move figure" and found `movegui`. It might be just the ticket for you.

As you can see from the documentation, `movegui` gives you a fair amount of latitude for moving your figure around. The two choices I find most useful and easiest to remember for my purposes: `onscreen` and `center`. I think you can guess what they each do :-)

Do you use the Function Browser to help you? Do you ever need to reposition "lost" figure windows, like I have? Let me know here.

Get
the MATLAB code

Published with MATLAB® R2014a

Sean's pick this week is ThingSpeak Support for MATLAB by MathWorks' Internet of Things Team.

Up until recently, I'd heard the phrase the "internet of things" thrown around only a few times and didn't really know what it meant. So I spoke with our Marketing Manager for the Maker Movement, Paul Kassebaum, to get a better understanding. Paul describes it briefly as "The confluence of technologies in communication (the internet as we know it), the energy internet (or energy from sensors, etc.) and the logistics internet (the ability to use this information for logistics)."

This week's pick uses all three. It allows you to pull data directly into MATLAB from ThingSpeak, a platform and cloud service where data can be uploaded from sensors and low cost hardware. After pulling this data, you can use it for whatever research, design or logistics you want.

I saw this blog post on our Maker Zone blog that described measuring soda consumption here at MathWorks. They posted the file on the FEX as well.

This made a challenge for me: Could I gather a few people together, drink as much Powerade as possible at 4pm Eastern Time on August 5th, and create an outlier data point?

We had a few detractors, like Loren, who did not properly appreciate the health benefits of this type of social gathering.

To passerbyers, it probably looked like we were part of some obscure MATLAB cult. Let's see how we did:

% Grab the data for July 6th through August 6th. [drinkData, timeStamps] = thingSpeakFetch(7040,'DateRange',{'06-July-2014','06-August-2014'}); Powerade = drinkData(:,3); % Third channel is Powerade % Index into the party hour idxParty = hour(timeStamps)==16 & day(timeStamps) == 5 & month(timeStamps)==8;

scatter(timeStamps(idxParty),Powerade(idxParty),400,'MarkerFaceColor',[0.03 0.6 0.8],'MarkerEdgeColor',[0.03 0.6 0.8]); hold on scatter(timeStamps,Powerade,15,'MarkerFaceColor',[0.5 0 0],'MarkerEdgeColor',[0.5 0 0]); datetick('x','mmmdd','keepticks','keeplimits'); legend('Powerade Party','Powerade Consumption','location','northwest')

So it does look like we were able to create a Powerade Consumption outlier. Go Team!

Give it a try and let us know what you think here or leave a comment for our Internet of Things team.

Get
the MATLAB code

Published with MATLAB® R2014a

Our latest update has two features that should be of interest to blog readers:

- Searching with Categories: Blog authors can tag their posts with Categories; keywords, or phrases that organize or describe a particular post. It’s a helpful tool for finding related content. Like a particular post? Just click the Category link at the bottom to find similar posts.
- Print a Post: Several blog readers requested this feature. Now you can easily print out posts by your favorite bloggers and read them when you’re off the grid. After spending all day staring into a monitor, there’s nothing better than being outside, reading a nicely formatted post on paper.Bliss.

We’re always looking to improve the blogs. Tell us what you think about these new features or let us know what we can do to make the blogs section better for you. While you’re thinking about all this, drop us a line in the Comments section below.

Thanks!

]]>Of course, my answer was that there are many ways to hold a value in Simulink. I made him a few examples that I am sharing today in this post.

**The Desired Behavior**

I want to create a subsystem with two inputs, `u` and `hold`, and one output `y`. When `hold` is true, the output `y` is equal to the input `u`. When `hold` is zero, the output `y` remains constant, holding the last output value.

Here are example input and output signals:

Let's see a few ways to obtain such behavior

**Method 1: Switch and Delay**

The most common way to hold a value that I observe in customers models is using a Switch and a Unit Delay, or Memory block

Nice, clean and simple!

**Method 2: Enabled Subsystem**

Personally, my favorite way to hold a the value of a signal is using an Enabled Subsystem, with the Outport property `Output when disabled` set to **held**.

I like this method because it takes virtually no blocks.

**Method 3: If Action Subsystems and Delay**

This version is similar to the first method, but we replace the Switch by an arrangement of If Action Subsystems and a Merge block.

Personally, I don't use this syntax very often, mostly because it requires more blocks.

**Method 4: Stateflow**

Another easy way to obtain the behavior we are looking for is using Stateflow

**What about the generated code?**

When choosing an implementation, many factors need to be taken into account. One of my favorite guidelines for was suggested by a reader of this blog: **Clearly show the intended functionality**. I will let you judge by yourself which implementation most clearly shows the intention of holding a signal.

Another concern that many users have is: What does the code look like?

In this case, all the methods generate similar code. to make the code easier to read for this post, I set the storage class of the three signals or interest (`hold`, `u`, `y`) to ExportedGlobal.

Switch and Delay:

****Note: To obtain the above code, I enabled inline parameters from the model configuration. Otherwise the threshold of the switch is a tunable parameter.*

Enabled Subsystem:

If Action Subsystem:

****Note: To obtain the above code, I manually set the Function Packaging option of the If Action Subsystem to Inline. Otherwise, since the two subsystems are identical, the coder generates a reusable function.*

Stafelow:

**Now it's your turn**

What is your preferred method to hold the value of a signal? Are there other functionalities for which you consider multiple implementations? Let us know by leaving a comment here.

]]>In rare cases, Gaussian elimination with partial pivoting is unstable. But the situations are so unlikely that we continue to use the algorithm as the foundation for our matrix computations.

I almost hesitate to bring this up. Gaussian elimination with partial pivoting is potentially unstable. We know of a particular test matrix, and have known about it for years, where the solution to simultaneous linear equations computed by our iconic backslash operator is less accurate than we typically expect. The solution is contaminated by unacceptably large roundoff errors associated with large elements encountered during the elimination process. The offending matrix is generated by the following function, which is a special case of the `gfpp` function in Nick Higham's collection of MATLAB test matrices.

```
type gfpp
```

function A = gfpp(n) % A = gfpp(n) Pivot growth of partial pivoting. A = eye(n,n) - tril(ones(n,n),-1); A(:,n) = 1;

Here is the 8-by-8.

n = 8 A = gfpp(n)

n = 8 A = 1 0 0 0 0 0 0 1 -1 1 0 0 0 0 0 1 -1 -1 1 0 0 0 0 1 -1 -1 -1 1 0 0 0 1 -1 -1 -1 -1 1 0 0 1 -1 -1 -1 -1 -1 1 0 1 -1 -1 -1 -1 -1 -1 1 1 -1 -1 -1 -1 -1 -1 -1 1

Gaussian elimination with partial pivoting does not actually do any pivoting with this particular matrix. The first row is added to each of the other rows to introduce zeroes in the first column. This produces twos in the last column. As similar steps are repeated to create an upper triangular `U`, elements in the last column double with each step. The element growth in the final triangular factor `U` is `2^(n-1)`.

[L,U] = lu(A); U

U = 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 2 0 0 1 0 0 0 0 4 0 0 0 1 0 0 0 8 0 0 0 0 1 0 0 16 0 0 0 0 0 1 0 32 0 0 0 0 0 0 1 64 0 0 0 0 0 0 0 128

So if the matrix order `n` is set to the number of bits in the floating point word (plus one), the last diagonal element of `U` will grow to be `1/eps`.

n = 53 A = gfpp(n); [L,U] = lu(A); unn = U(n,n)

n = 53 unn = 4.5036e+15

If we try to use backslash to solve a linear system involving this matrix and a random right hand side, the back substitution will begin with `U(n,n)`. The roundoff errors can be expected to be as large as `eps(U(n,n))` and swamp the solution itself.

b = randn(n,1); x = A\b;

Ordinarily, we expect the solution computed by Gaussian elimination to have a residual on the order of roundoff error. But in this case, the residual is many orders of magnitude larger. Backslash has failed.

r = b - A*x; relative_residual = norm(r)/(norm(A)*norm(x))

relative_residual = 3.7579e-05

Anything we do to this matrix to provoke pivoting during the elimination will prevent the growth of the elements in `U`. One possibility is to swap the first two rows.

A = gfpp(n); A([1 2],:) = A([2 1],:); [L,U] = lu(A); unn = U(n,n)

unn = 2

That's much better. Now use backslash and look at the residual.

x = A\b; r = b - A*x; relative_residual = norm(r)/(norm(A)*norm(x))

relative_residual = 1.7761e-17

So the residual is now on the order of roundoff relative to `A` and `x`. Backslash works as expected.

Another way to provoke pivoting is to perturb the matrix with a little white noise.

A = gfpp(n); A = A + randn(n,n); [L,U] = lu(A); unn = U(n,n)

unn = -3.8699

So `U` is well behaved. Let's see about backslash.

x = A\b; r = b - A*x; relative_residual = norm(r)/(norm(A)*norm(x))

relative_residual = 1.2870e-16

Again the residual is on the order of roundoff and backslash works as expected.

Suppose we are using Gaussian elimination with any pivoting process to solve a system of linear equations involving a matrix $A$. Let $a_{i,j}^{(k)}$ denote the elements in the matrix after the $k$ th step of the elimination. The *growth factor* for that pivoting process is the quantity

$$ \rho = { \max_{i,j,k} |a_{i,j}^{(k)}| \over \max_{i,j} |a_{i,j}| } $$

This growth factor is a crucial quantity in Wilkinson's roundoff error analysis. If $\rho$ is not too large, then the elimination algorithm is stable. Unfortunately, the matrix we have just seen, `gfpp`, shows that the growth factor for partial pivoting is at least $2^{n-1}$.

At the time he was presenting his analysis in early 1970s, Wilkinson knew about this matrix. But he said it was a very special case. He reported:

It is our experience that any substantial increase in size of elements of successive An is extremely uncommon even with partial pivoting. ... No example which has arisen naturally has in my experience given an increase as large as 16.

During the development of LINPACK in the late 1970s, a colleague at Argonne, J. T. Goodman, and I did some experiments with random matrices to check for element growth with partial pivoting. We used several different element distributions and what were then matrices of modest order, namely values of $n$ between 10 and 50. The largest growth factor we found was for a matrix of 0's and 1's of order 40. The value was $\rho$ = 23. We never saw growth as large as linear, $\rho = n$, let alone exponential, $\rho = 2^{n-1}$.

Nick Trefethen and Rob Schreiber published an important paper, "Average-case Stability of Gaussian Elimination", in 1990. They present both theoretical and probabilistic models, and the results of extensive experiments. They show that for many different types of matrices, the typical growth factor for partial pivoting does not increase exponentially with matrix order $n$. It does not even grow linearly. It actually grows like $n^{2/3}$. Their final section, "Conclusions", begins:

Is Gaussian elimination with partial pivoting stable on average? Everything we know on the subject indicates that the answer is emphatically yes, and that one needs no hypotheses beyond statistical properties to account for the success of this algorithm during nearly half a century of digital computation.

The Higham Brothers, Nick and Des Higham, published a paper in 1989 about growth factors in Gaussian elimination. Most of their paper applies to any kind of pivoting and involves growth that is $O(n)$. But they do have one theorem that applies only to partial pivoting and that characterizes all matrices that have $\rho = 2^{n-1}$. Essentially these matrices have to have the behavior we have seen in `gfpp`, namely no actual pivoting and each step adds a row to all the later rows, doubling the values in the last column in the process.

In two different papers, Stephen Wright in 1993 and Les Foster in 1994, the authors present classes of matrices encountered in actual computational practice for which Gaussian elimination with partial pivoting blows up with exponential growth in the elements in the last few columns of `U`. Wright's application is a two-point boundary value problem for an ordinary differential equation. The discrete approximation produces a band matrix. Foster's application is a Volterra integral equation model for population dynamics. The discrete approximation produces a lower trapezoidal matrix. In both cases the matrices are augmented by several final columns. Certain choices of parameters causes the elimination to proceed with no pivoting and results in exponential in growth in the final columns. The behavior is similar to `gfpp` and the worst case growth in the Highams theorem.

Complete pivoting requires $O(n^3)$ comparisons instead of the $O(n^2)$ required by partial pivoting. More importantly, the repeated access to the entire matrix complete changes the memory access patterns. With modern computer storage hierarchy, including cache and virtual memory, this is an all-important consideration.

The growth factor $\rho$ with complete pivoting is interesting and I plan to investigate it in my next blog.

`lugui` is one of the demonstration programs included with Numerical Computing with MATLAB. I will also describe it in my next blog. For now, I will just use it to generate a graphic for this blog. Here is the result of a factorization, the lower triangular factor `L` in green and the upper triangular factor `U` in blue.

```
A = gfpp(6);
lugui(A,'partial')
```

Nicholas J. Higham and Desmond J. Higham, "Large Growth Factors in Gaussian Elimination with Pivoting", <http://www.maths.manchester.ac.uk/~higham/papers/hihi89.pdf>, SIAM J. Matrix Anal. Appl. 10, 155-164, 1989.

Lloyd N. Trefethen and Robert S. Schreiber, "Average-case Stability of Gaussian Elimination", https://courses.engr.illinois.edu/cs591mh/trefethen/trefethen_schreiber.pdf, SIAM J. Matrix Anal. Appl. 11, 335-360, 1990.

Stephen J. Wright, A Collection of Problems for Which Gaussian "Elimination with Partial Pivoting is Unstable", <ftp://info.mcs.anl.gov/pub/tech_reports/reports/P266.ps.Z>, SIAM J. Sci. Statist. Comput. 14, 231-238, 1993.

Leslie Foster, "Gaussian Elimination Can Fail In Practice", <http://www.math.sjsu.edu/~foster/geppfail.pdf>, SIAM J. Matrix Anal. Appl. 15, 1354-1362, 1994.

Get
the MATLAB code

Published with MATLAB® R2014a

Today's blog post comes from the stream of image processing questions over at MATLAB Answers.

User arjun wanted to know how to remove circles from an image by specifying their center and radius.

Famous MATLAB Answer man Image Analyst quickly gave a helpful reply.

I thought it might be worth a little discussion here (plus, I wanted to point out the value of MATLAB Answers.)

Typically, the process of answering the question started out with trying to understand the question better. "What does *remove* mean to you?" asked Image Analyst. The answer came back, "I want to make the pixels in the circle be black."

OK, let's give that a try.

```
I = imread('coins.png');
imshow(I)
```

(Hmm. Guess I didn't have any quarters in my pocket when I took this picture.)

Suppose we want to set to black all the pixels centered at (X = 175, Y = 120) and with radius 35.

I like Image Analyst's recommended procedure:

- Compute a
*mask image*. Generally, the term*mask*or*mask image*means a binary image whose foreground pixels correspond to a set of pixels of interest. - Use logical indexing with the mask image to modify the desired pixels.

Start with creating a mask image.

```
xc = 175;
yc = 120;
r = 33;
x = 1:size(I,2);
y = 1:size(I,1);
[xx,yy] = meshgrid(x,y);
mask = hypot(xx - xc, yy - yc) <= r;
imshow(mask)
title('Mask image')
```

Next, use logical indexing to modify the pixels of `I` corresponding to the foreground pixels of `mask`.

```
I2 = I;
I2(mask) = 0;
imshow(I2)
title('Image with some pixels set to black')
```

And we just lost 5 cents.

Image Analyst pointed out that you can use this technique to set pixel values in the circle to any value you want. How about white:

```
I2(mask) = 255;
imshow(I2)
title('Image with some pixels set to white')
```

For fun, how about setting the pixels to the background color. How do we determine that? I might start by thresholding the original image.

```
bw = im2bw(I,graythresh(I));
imshow(bw)
title('Thresholded image')
```

Now fill in the holes.

bw = imfill(bw,'holes'); imshow(bw) title('Thresholded image with holes filled')

Compute the mean value of the background pixels (using logical indexing again!).

bg_mean = mean(I(~bw))

bg_mean = 66.9959

Replace the pixels in the circle with the computed average background value.

```
I2(mask) = bg_mean;
imshow(I2)
title('Circle pixels filled with average background')
```

Let me point out two related Image Processing Toolbox functions. The `roifill` function can fill a region based on a mask; it fills pixels "smoothly" from the adjacent background pixels.

```
I3 = roifill(I,mask);
imshow(I3)
title('Circle pixels replaced using roifill')
```

I also want to mention `imellipse`. You can use this function to draw circles interactively using the mouse. Then you can use the `createMask` function to get the mask image corresponding to the circle you drew.

Happy circle filling!

Get
the MATLAB code

Published with MATLAB® R2014a

Here is the result:

**Why Simulink Code Inspector?**

You are designing a high-integrity software application using Simulink (good idea!). Once you have an excellent design that has been tested and meets the design requirements, you use Embedded Coder to generate C code that you will then compile, link, and load to your embedded hardware target.

You know your Simulink design does exactly what you want it to do - nothing more, nothing less. But you are not going to be running Simulink on the embedded hardware - you will be running an executable built from the automatically generated C code.

So what about that C code - does it still represent your exact design? How can you tell? You could use Processor-in-the-Loop (PIL) and re-run your requirements-based tests on the target processor to verify that the outputs match your simulation outputs. That's a good start. However, this is not necessarily going to prove the absence of unintended code. So, what can?

**So... what is Simulink Code Inspector?**

Simulink Code Inspector automatically compares generated code with its source model. It examines the generated code and the model to determine if they are structurally equivalent. After inspection is complete, it produces detailed reports with model-to-code and code-to-model traceability analysis.

Because Simulink Code Inspector has been implemented completely independently of Embedded Coder, it can be used as a means of compliance for DO-178C/DO-331 certification objectives. These structural equivalence and traceability reports can be submit to certification authorities as evidence of code reviews for high-integrity standards such as DO-178C. Simulink Code Inspector is supported by our DO-178C/DO-331 Qualification Kit, allowing you to obtain certification credit when using it for DO-178C/DO-331 applications.

**How does it do that?**

Ok, magic and abstract syntax trees.

**Working with Simulink Code Inspector**

To highlight what Code Inspector can do for you, let's take a very simple model to be deployed to an embedded target.

Not all Simulink/Stateflow features are supported by Code Inspector. Before running an inspection, you need to check that your model is compatible with Code Inspector. A set of Model Advisor checks, included with Code Inspector, can be run to verify compatibility.

I run these checks on the model and review the Model Advisor report.

No compatibility issues, so let's go ahead with inspection.

Open the Simulink Code Inspector window from the 'Code' menu...

And get this window:

Once code generation and inspection are complete, a report with the results will open.

**Code Inspection Report**

Here's the report from the inspection - we passed!

The report is extremely detailed - let's look at some of the information it includes.

**Model-To-Code Traceability**

Model-to-code traceability - with hyperlinks for navigating to the exact model object for convenience.

**Code-To-Model Traceability**

Depending on your goal, it might be more convenient to know which block corresponds to a specific line of code. The report also includes a code-to-model traceability section.

**I guess it's your turn now...**

We've introduced some of the basic features of Simulink Code Inspector. Obviously there's quite a bit more to this tool, but I hope that this overview will inspire you to think about your processes for verifying autogenerated code for high-integrity applications.

Does your organization's development process for high-integrity embedded software include manual code reviews or manual tracing from model to code? Are you interested in seeing more posts about high-integrity software development with Simulink? Let us know what you think by leaving a comment below.

One more thing... as mentioned in a previous post, if you need help with any part of your Model-Based Design process, you can contact our Consulting Servies.

]]>

Photo by Sara Kerens

Nick is what you might call a “heroic user” of MATLAB. He was there at the first class in which Cleve introduced the original FORTRAN MATLAB as a teaching tool. Near the lobby of the main MathWorks building here in Natick, we have a framed copy of the very first purchase order for MATLAB. Who bought that first copy of MATLAB? None other than Nick Trefethen. And it’s not merely a question of precedence or longevity. Across his long career, Nick has been a contributor of the highest order to the ecosystem of mathematical code written in MATLAB. He is the author of many books, an influential educator, and most recently he has been leading an innovative open source software project, a project that grew out of work he initiated back in 2002: Chebfun.

I can personally attest that the examples associated with Chebfun are among the most beautiful and thoughtful you will find documenting any mathematical project on the web. We recently conducted a virtual interview with Nick to learn about the latest release of Chebfun. Here’s what we learned.

Q: When did you first start using MATLAB?

A: I took a course from Cleve Moler when I was a graduate student at Stanford, so I must have typed x = A\b for the first time around 1978. When MathWorks started selling MATLAB, I bought the very first copy! That was in February, 1985. It was about ten years later when MathWorks told me I had been the first.

Q: What is Chebfun?

A: Chebfun is a system written in MATLAB that looks and feels like MATLAB, except that vectors and matrices become functions and operators. Whenever MATLAB has a discrete command, Chebfun aims to have the same command overloaded to execute a continuous analogue. For example, SUM(f) gives you the integral of a chebfun f over its interval of definition, MAX(f) gives you its maximum, ROOTS computes the roots, and so on.

Q: Why is it called Chebfun?

A: Well, Chebfun stands for Chebyshev-functions. Our algorithms are based on representations of functions by piecewise Chebyshev polynomial interpolants. The details are given in my book Approximation Theory and Approximation Practice.

Q: This Chebyshev fellow must have been an interesting character. Can you tell us a little more about how his name came to be associated with your favorite functions?

A: Pafnuty Lvovich Chebyshev was the preeminent Russian mathematician of the 19th century, born near Moscow and a professor in St. Petersburg. He was remarkable for contributing in a big way to both pure and applied mathematics. The crucial thing for Chebfun was that he laid the foundations of how one can approximate nonperiodic functions by polynomials. What Fourier is to periodic functions, Chebyshev is to nonperiodic functions.

Q: A new version of Chebfun has recently been released. What is different in this release?

A: The big thing is that the code has been completely rewritten in a more modular structure and released on GitHub; we are a fully public open-source project. There’s also a beautiful new web site at www.chebfun.org, with a user’s guide and hundreds of examples showing how to use the code. Besides that there are many speedups and a number of new features, notably the capability of taking advantage of periodicity of functions (i.e., Fourier as opposed to Chebyshev methods). And the two-dimensional side of Chebfun, for representing functions f(x,y), keeps advancing.

Q: Why have you released Chebfun on GitHub?

A: We’d like to encourage new users and new developers. Broadening the Chebfun community seems the best way to make the software develop as strongly as possible and ensure its long-term future. (We have thousands of users already, but we don’t know as much about them as we would like.)

Q: What benefit do you see in the MATLAB File Exchange / GitHub integration?

A: So far as we are aware, Chebfun is the most advanced project written in MATLAB. It’s completely natural that it should be easily available from the File Exchange. We are very pleased at this new development.

Q: How might MATLAB users apply Chebfun?

A: We think every MATLAB user is potentially a Chebfun user, for every MATLAB user works with functions. It’s so much simpler and more natural to call SUM, MAX, ROOTS and so on than to worry about the calling sequences of QUAD, FMINBND, FZERO. In 2D, the same commands find integrals and extrema and roots of functions f(x,y). Chebfun is also the easiest way to solve ODE boundary value problems. We’ve overloaded x=A\b to u=N\f, where N is a differential operator with boundary conditions and f is a right-hand side.

The big advantage in all these computations is that we’re always dealing with functions (chebfuns), they all take the same form, and the user doesn’t have to know anything about the underlying grids. Input to SUM or MAX or ROOTS — just a chebfun. Output from \ for ODEs — just another chebfun. This is very different from the traditional style of computing in MATLAB and other languages with codes like BVP4C/BVP5C where for each code you need to learn a calling sequence that may involve specification of an initial grid and the code’s special structures like “DEVAL” for representing output functions.

Q: How can MATLAB users access Chebfun?

A: Just go to the File Exchange, or google Chebfun. Be sure to check out the People page with its photo-bios of Chebfun developers including Nick Hale, Toby Driscoll, Asgeir Birkisson, Alex Townsend, Anthony Austin, Grady Wright, and Hrothgar.

]]>Guest blogger Peter Webb returns with another in an occasional series of postings about application deployment.

I've written a MATLAB application that reports on the presence of contaminants in surface water, using a publically available database of water quality observations maintained by the United States Geological Survey and the Environmental Protection Agency. Many groups of people might be interested in this kind of information. MATLAB made it easy to write the application, and some tools we've added recently make sharing the application easy as well.

I can share the application with other MATLAB users as a MATLAB App. With MATLAB Compiler I can create a standalone program that will run on a Windows, Linux or Mac desktop. And Builder JA lets me turn my application into a Java class for use by other developers.

To follow the rest of this article, please download the example code from MATLAB Central.

You could share your code with other MATLAB users by emailing them your function and data files. But then you'd have to explain how to install, update and uninstall your application -- and the process would be very vulnerable to human error. Or you could create a MATLAB App. Introduced in R2012b, MATLAB Apps are self-contained programs that automate a technical computing task, typically via a graphical user interface. MATLAB displays the installed apps in the App Gallery, which you access from the **Apps** tab in the MATLAB toolstrip. Packaging your code as a MATLAB App creates an automated installer (a `.mlappinstall` file) that integrates your application into the App Gallery. The MATLAB App installer includes only your code; if the App uses any toolboxes, the recepient of the App must have licened copies of these toolboxes in order to use the App. Click on the **Package App** button on the **Apps** tab to begin packaging your program as an app.

The packaging interface consists of three columns, each representing a step in the packaging process:

- Pick main file
- Describe your app
- Package into installation file

All Apps must have a main file -- this is the MATLAB file that starts the graphical user interface or performs the task. Click on the `Add main file` link:

From the downloaded example code, select `swq.m`. The **Files included through analysis** section fills in with the required function files. Since the automated analysis cannot discover data files, you must add them manually. Select the `Add files/folders` in the **Shared resources and helper files** section and add `Contaminants.xml`, `ContaminantLimits.xml` and `swqLogo.png`.

All Apps must also have a name. Type `SurfaceWaterQuality` into the `App Name` field at the top of the **Describe your app** area. You can add your contact info and a description too, if you like, but that's optional. If you want to add an icon and a screenshot, I've provided appropriate images: `sqwIcon.png` for the icon and `swqApp.png` for the screenshot. Click on the screenshot and icon to browse for these images.

Once you're satisfied with the description, press the **Package** button to create the installer: `SurfaceWaterQuality.mlappinstaller`. On any machine with MATLAB installed, double-clicking `SurfaceWaterQuality.mlappinstaller` will install the `SurfaceWaterQuality` app into the MATLAB App Gallery. Click once on the app to start it:

With these inputs, the app produces a report of contaminants recorded within five miles of the MathWorks buildings in Natick, Massachusetts.

MATLAB Apps allow you to share programs with other MATLAB users. But what if your target audience doesn't have MATLAB? MATLAB Compiler packages MATLAB programs to run against the MATLAB Compiler Runtime, which does not require MATLAB. The process of creating a standalone executable with MATLAB Compiler begins on the **Apps** tab of the MATLAB toolstrip. Click the **Application Compiler** button:

Like an App, a MATLAB Compiler-generated standalone executable requires a main file. Click on the `+` next to the `Add main file` field and select `waterQualityReport.m` from the downloaded example files:

`applicationCompiler` derives the application name from the name of the main file and immediately begins determining the files required to run the application. The **Files required for your application to run** section fills with MATLAB function dependencies. You'll need to add data file dependencies manually. Click on the `+` button in that section and add `Contaminants.xml` and `ContaminantLimits.xml`:

Like a MATLAB App, a standalone application may contain contact and description information to be displayed in the installer. You may also add a splash screen, for example `waterQuality.jpg` from the downloaded example files. `applicationCompiler` automatically includes the splash screen image in the installer, so there's no need to add it manually. Once you're done entering this optional information, change the name of the installer to `WQRInstaller_web`.

Press the **Package** button to build the installer for your standalone application.

Packaging creates a platform-specific installer: `waterQualityReport\for_redistribution\WQRInstaller_web.exe` on Windows and `waterQualityReport/for_redistribution/WQRInstaller_web` on Linux and Mac. Copy this file to a machine of the appropriate type, run the installer, and you'll be able to install and run the water quality application without installing MATLAB. Run the installer like this:

`WQRInstaller_web`

Look for the executable in the `application` folder, located in the installation folder. And then run the application like this:

`waterQualityReport 42.2973025 -71.3525913 5 1/1/2004`

On Linux and Mac platforms, you may find it easier to run the application using the generated `run_waterQualityReport.sh` shell script. This command assumes you've installed the MCR in the `/tmp/WQTest/MATLAB_Compiler_Runtime/v83` folder:

./run_waterQualityReport.sh \ /tmp/WQTest/MATLAB_Compiler_Runtime/v83/ \ 42.2973025 -71.3525913 5 1/1/2004

See the MATLAB Compiler documentation for more information about how to run a MATLAB Compiler-generated application.

Builder JA creates a Java component that you integrate into a larger host application. Before using Builder JA, you must configure your environment to ensure Builder JA uses a supported version of the Java compiler.

When you have set up your environment, type `libraryCompiler` at the MATLAB prompt and choose **Java Package** as the application type to begin:

Instead of running a main file, a Java component exports one or more functions into a public API. Use the `+` button to add `detectContamination.m` to the **Exported Functions** list. Then, in **Files required for your application to run** section, add the data files `Contaminants.xml`, `ContaminantLimits.xml` and `swqLogo.png`.

A Java API requires a class with at least one method. Placing the class in a namespace helps prevent class name conflicts. `libraryCompiler` uses generic defaults: no namespace, a class named `Class1` and method names corresponding to the MATLAB functions you chose to export. Change the class name to `SurfaceWaterQuality` and the namespace to `WaterQuality`.

Change the name of the generated installer (as illustrated previously, in **Deploying to the Desktop**) to `DCInstaller_web`. Press the **Package** button to create `WaterQuality.jar`.

Unlike the desktop application, the Java component in `WaterQuality.jar` won't run by itself. It needs to be invoked from a host Java application. The downloaded code contains an example main program, `WaterQuality.java`.

Much like the MATLAB App and the desktop application, this Java host program collects a search area and start date from the user and invokes the web service. In addition to creating a user interface via AWT, the main program must perform three tasks to interact with the Builder JA-generated Java component:

- Import Builder JA generated and utility classes.
- Create a
`WaterQuality.SurfaceWaterQuality`object. - Invoke the
`detectContaminants`method.

The `WaterQuality` namespace contains the generated `SurfaceWaterQuality` class and the `com.mathworks.toolbox.javabuilder` namespace contains Builder JA runtime management and error handling classes such as `MWException`. The Java program imports these namespaces:

import WaterQuality.*; import com.mathworks.toolbox.javabuilder.*;

Builder JA creates object, rather than static, methods, so you must create an instance of the generated class to invoke the methods in your exported API. The constructor might throw an `MWException`, so your Java code must either `catch` `MWException` or declare it in a `throws` clause. The unsophisticated code in the example catches (and ignores) errors that might occur during object creation:

try { swq = new SurfaceWaterQuality(); } catch (MWException me) {}

Pushing the **Report** button invokes the web service via the `detectContamination` method. The first input is the number of expected outputs. Methods in the exported API return an array of Java `Object` instances. The host program converts each result to the appropriate native type, a `String` in this case, before using them. Builder JA includes a robust set of type-conversion routines to make this task easy. `detectContamination` preformats the text for tabular display with a monospaced font such as `Courier`. In the Java code below, `reportData` is a Java `TextArea` object capable of scrolling to display the list of reported contaminants.

Object[] lhs; lhs = swq.detectContamination(1, latitude, longitude, radius, startDate); if (lhs != null) { String result = lhs[0].toString(); reportData.setText(result); }

The download contains scripts to help you build and run the sample application. See the `Readme.txt` for platform-specific instructions. Note that the application does not do much validation of its inputs, and is somewhat fragile as a result.

Managing the interaction with the web service is the most complex part of these three water quality report applications. And in each of them, that part is exactly the same. Only the target-specific wrapper differs, adapting the core functionality to the interface norms of the target environment: MATLAB, the standalone desktop and a Java host program.

MATLAB Apps, `applicationCompiler` and `libraryCompiler` use the same visual conventions and very similar three-step workflows: select functions to deploy, add data files and descriptive data, create an installer. This shared language and process makes it simpler to deploy the same code to multiple target environments.

Do you deploy code to more than one environment? What challenges do you face? Let us know here.

Get
the MATLAB code

Published with MATLAB® R2014a

Denormal floating point numbers and gradual underflow are an underappreciated feature of the IEEE floating point standard. Double precision denormals are so tiny that they are rarely numerically significant, but single precision denormals can be in the range where they affect some otherwise unremarkable computations. Historically, gradual underflow proved to be very controversial during the committee deliberations that developed the standard.

My previous post was mostly about normalized floating point numbers. Recall that normalized numbers can be expressed as

$$ x = \pm (1 + f) \cdot 2^e $$

The *fraction* or *mantissa* $f$ satisfies

$$ 0 \leq f < 1 $$

$f$ must be representable in binary using at most 52 bits for double precision and 23 bits for single precision.

The exponent $e$ is an integer in the interval

$$ -e_{max} < e \leq e_{max} $$

where $e_{max} = 1023$ for double precision and $e_{max} = 127$ for single precision.

The finiteness of $f$ is a limitation on *precision*. The finiteness of $e$ is a limitation on *range*. Any numbers that don't meet these limitations must be approximated by ones that do.

Double precision floating point numbers are stored in a 64-bit word, with 52 bits for $f$, 11 bits for $e$, and 1 bit for the sign of the number. The sign of $e$ is accommodated by storing $e+e_{max}$, which is between $1$ and $2^{11}-2$.

Single precision floating point numbers are stored in a 32-bit word, with 23 bits for $f$, 8 bits for $e$, and 1 bit for the sign of the number. The sign of $e$ is accommodated by storing $e+e_{max}$, which is between $1$ and $2^{8}-2$.

The two extreme values of the exponent field, all zeroes and all ones, are special cases. All zeroes signifies a denormal floating point number, the subject of today's post. All ones, together with a zero fraction, denotes infinity, or `Inf`. And all ones, together with a a nonzero fraction, denotes Not-A-Number, or `NaN`.

My program `floatgui`, available here, shows the distribution of the positive numbers in a model floating point system with variable parameters. The parameter $t$ specifies the number of bits used to store $f$, so $2^t f$ is an integer. The parameters $e_{min}$ and $e_{max}$ specify the range of the exponent.

If you look carefully at the output from `floatgui` shown in the previous post you will see a gap around zero. This is especially apparent in the logarithmic plot, because the logarithmic distribution can never reach zero.

Here is output for slightly different parameters, $t = 3$, $e_{min} = -5$, and $e_{max} = 2$. Howrever, the gap around zero has been filled in with a spot of green. Those are the denormals.

Zoom in on the portion of these toy floating point numbers less than one-half. Now you can see the individual green denormals -- there are eight of them in this case.

Denormal floating point numbers are essentially roundoff errors in normalized numbers near the underflow limit, `realmin`, which is $2^{-e_{max}+1}$. They are equally spaced, with a spacing of `eps*realmin`. Zero is naturally included as the smallest denormal.

Suppose that `x` and `y` are two distinct floating point numbers near to, but larger than, `realmin`. It may well be that their difference, `x - y`, is smaller than `realmin`. For example, in the small `floatgui` system pictured above, `eps = 1/8` and `realmin = 1/32`. The quantities `x = 6/128` and `y = 5/128` are between `1/32` and `1/16`, so they are both above underflow. But `x - y = 1/128` underflows to produce one of the green denormals.

Before the IEEE standard, and even on today's systems that do not comply with the standard, underflows would simply be set to zero. So it would be possible to have the MATLAB expression

x == y

be false, while the expression

x - y == 0

be true.

On machines where underflow flushes to zero and division by zero is fatal, this code fragment can produce a division by zero and crash.

if x ~= y z = 1/(x-y); end

Of course, denormals can also be produced by multiplications and divisions that produce a result between `eps*realmin` and `realmin`. In decimal these ranges are

format compact format short e [eps*realmin realmin] [eps('single')*realmin('single') realmin('single')]

ans = 4.9407e-324 2.2251e-308 ans = 1.4013e-45 1.1755e-38

Denormal floating point numbers are stored without the implicit leading one bit,

$$ x = \pm f \cdot 2^{-emax+1}$$

The *fraction* $f$ satisfies

$$ 0 \leq f < 1 $$

And $f$ is represented in binary using 52 bits for double precision and 23 bits for single recision. Note that zero naturally occurs as a denormal.

When you look at a double precision denormal with `format hex` the situation is fairly clear. The rightmost 13 hex characters are the 52 bits of the fraction. The leading bit is the sign. The other 12 bits of the first three hex characters are all zero because they represent the biased exponent, which is zero because $emax$ and the exponent bias were chosen to complement each other. Here are the two largest and two smallest nonzero double precision denormals.

format hex [(1-eps); (1-2*eps); 2*eps; eps; 0]*realmin format long e ans

ans = 000fffffffffffff 000ffffffffffffe 0000000000000002 0000000000000001 0000000000000000 ans = 2.225073858507201e-308 2.225073858507200e-308 9.881312916824931e-324 4.940656458412465e-324 0

The situation is slightly more complicated with single precision because 23 is not a multiple of four. The fraction and exponent fields of a single precision floating point number -- normal or denormal -- share the bits in the third character of the hex display; the biased exponent gets one bit and the first three bits of the 23 bit fraction get the other three. Here are the two largest and two smallest nonzero single precision denormals.

format hex e = eps('single'); r = realmin('single'); [(1-e); (1-2*e); 2*e; e; 0]*r format short e ans

ans = 007fffff 007ffffe 00000002 00000001 00000000 ans = 1.1755e-38 1.1755e-38 2.8026e-45 1.4013e-45 0

Think of the situation this way. The normalized floating point numbers immediately to the right of `realmin`, between `realmin` and `2*realmin`, are equally spaced with a spacing of `eps*realmin`. If just as many numbers, with just the same spacing, are placed to the left of `realmin`, they fill in the gap between there and zero. These are the denormals. They require a slightly different format to represent and slightly different hardware to process.

The IEEE Floating Point Committee was formed in 1977 in Silicon Valley. The participants included representatives of semiconductor manufacturers who were developing the chips that were to become the basis for the personal computers that are so familiar today. As I said in my previous post, the committee was a remarkable case of cooperation among competitors.

Velvel Kahan was the most prominent figure in the committee meetings. He was not only a professor of math and computer science from UC Berkeley, he was also a consultant to Intel and involved in the design of their math coprocessor, the 8087. Some of Velvel's students, not only from the campus, but also who had graduated and were now working for some of the participating companies, were involved.

A proposed standard, written by Kahan, one of his students at Berkeley, Jerome Coonen, and a visiting professor at Berkeley, Harold Stone, known as the KCS draft, reflected the Intel design and was the basis for much of the committee's work.

The committee met frequently, usually in the evening in conference rooms of companies on the San Francisco peninsula. There were also meetings in Austin, Texas, and on the East Coast. The meetings often lasted until well after midnight.

Membership was based on regular attendance. I was personally involved only when I was visiting Stanford, so I was not an official member. But I do remember telling a colleague from Sweden who was coming to the western United States for the first time that there were three sites that he had to be sure to see: the Grand Canyon, Las Vegas, and the IEEE Floating Point Committee.

The denormals in the KCS draft were something new to most of the committee. Velvel said he had experimented with them at the University of Toronto, but that was all. Standards efforts are intended to regularize existing practice, not introduce new designs. Besides, implementing denormals would reguire additional hardware, and additional transistors were a valuable resource in emerging designs. Some experts claimed that including denormals would slow down all floating point arithmetic.

A mathematician from DEC, Mary Payne, led the opposition to KCS. DEC wanted a looser standard that would embrace the floating point that was already available on their VAX. The VAX format was similar to, but not the same as, the KCS proposal. And it did not include denormals.

Discussion went on for a couple of years. Letters of support for KCS from Don Knuth and Jim Wilkinson did not settle the matter. Finally, DEC engaged G. W. (Pete) Stewart, from the University of Maryland. In what must have been a surprise to DEC, Pete also said he thought that the KCS proposal was a good idea. Eventually the entire committee voted to accept a revised version.

Denormal floating point numbers are still the unwanted step children in today's floating point family. I think it is fair to say that the numerical analysis community has failed to make a strong argument for their importance. It is true that they do make some floating point error analyses more elegant. But with magnitudes around $10^{-308}$ the double precision denormals are hardly ever numerically significant in actual computation. Only the single precision denormals around $10^{-38}$ are potentially important.

Outside of MATLAB itself, we encounter processors that have IEEE format for the floating point numbers, but do not conform to the 754 standard when it comes to processing. These processors usually flush underflow to zero and so we can expect different numerical results for any calculations that might ordinarily produce denormals.

We still see processors today that handle denormals with microcode or software. Execution time of MATLAB programs that encounter denormals can degrade significantly on such processors.

The Wikipedia page on denormals has the macros for setting trap handlers to flush underflows to zero in C or Java programs. I hate to think what might happen to MATLAB Mex files with such macros. Kids, don't try this at home.

William Kahan, Home Page.

Charles Severance, An Interview with the Old Man of Floating-Point, Reminiscences for *IEEE Computer*, February 20, 1998.

Wikipedia page, Denormal number.

Get
the MATLAB code

Published with MATLAB® R2014a

Collecting and tracking health and fitness data with wearable devices is about to go mainstream as the smartphone giants like Apple, Google and Samsung jump into the fray. But if you collect data, what's the point if you don't analyze it?

Today's guest blogger, Toshi Takeuchi, would like to share an analysis of a weight lifting dataset he found in a public repository.

- Motivation, dataset, and prediction accuracy
- Data preprocessing and exploratory analysis
- Predictive Modeling with Random Forest
- Plot misclassification errors by number of trees
- Variable Importance
- Evaluate trade-off with ROC plot
- The reduced model with 12 features
- Conclusion and the next steps - integrate your code into your app

The Human Activity Recognition (HAR) Weight Lifting Exercise Dataset provides measurements to determine "how well an activity was performed". 6 subjects performed 1 set of 10 Unilateral Dumbbell Biceps Curl in 5 different ways.

When I came across this dataset, I immediately thought of building a mobile app to advise end users whether they are performing the exercise correctly, and if not, which common mistakes they are making. I used the powerful 'Random Forest' algorithm to see if I could build a successful predictive model to enable such an app. I was able to achieve **99% prediction accuracy** with this dataset and I would like to share my results with you.

The dataset provides 39,242 samples with 159 variables labeled with 5 types of activity to detect - 1 correct method and 4 common mistakes:

- exactly according to the specification (Class A)
- throwing the elbows to the front (Class B)
- lifting the dumbbell only halfway (Class C)
- lowering the dumbbell only halfway (Class D)
- throwing the hips to the front (Class E)

Sensors were placed on the subjects' belts, armbands, glove and dumbbells, as described below:

**Citation** *Velloso, E.; Bulling, A.; Gellersen, H.; Ugulino, W.; Fuks, H. Qualitative Activity Recognition of Weight Lifting Exercises. Proceedings of 4th International Conference in Cooperation with SIGCHI (Augmented Human '13) . Stuttgart, Germany: ACM SIGCHI, 2013.* Read more: http://groupware.les.inf.puc-rio.br/har#ixzz34dpS6oks

Usually you cannot use raw data directly. Preprocessing is an important part of your analysis workflow that has significant downstream impact.

- Load the dataset and inspect data for missing values
- Partition the dataset for cross validation
- Clean and normalize variables
- Select predictor variables (features)

Among those steps, cross validation is a key step specific to predictive modeling. Roughly speaking, you hold out part of available data for testing later, and build models using the remaining dataset. The held out set is called the 'test set' and the set we use for modeling is called the 'training set'. This makes it more difficult to overfit your model, because you can test your model against the data you didn't use in the modeling process, giving you a realistic idea how the model would actually perform with unknown data.

Exploratory analysis usually begins with visualizing data to get oriented with its nuances. Plots of the first four variables below show that:

- data is organized by class - like 'AAAAABBBBBCCCCC'. This can be an artifact of the way the data was collected and real life data may not be structured like this, so we want to use more realistic data to build our model. We can fix this issue by randomly reshuffling the data.
- data points cluster around a few different mean values - indicating that measurements were taken by devices calibrated in a few different ways
- those variables exhibit a distinct pattern for Class E (colored in magenta) - those variables will be useful to isolate it

Review `preprocess.m` if you are curious about the details.

preprocess

The dataset has some issues with calibration. We could further preprocess the data in order to remove calibration gaps. This time, however, I would like to use a flexible predictive algorithm called Random Forest. In MATLAB, this algorithm is implemented in the TreeBagger class available in Statistics Toolbox.

Random Forest became popular particularly after it was used by a number of winners in Kaggle competitions. It uses a large ensemble of decision trees (thus 'forest') trained on random subsets of data and uses the majority votes of those trees to predict the result. It tends to produce a highly accurate result, but the complexity of the algorithm makes it slow and difficult to interpret.

To accelerate the computation, I will enable the parallel option supported by Parallel Computing Toolbox. You can comment out unnecessary code if you don't use it.

Once the model is built, you will see the confusion matrix that compares the actual class labels to predicted class labels. If everything lines up on a diagonal line, then you achieved 100% accuracy. Off-diagonal numbers are misclassification errors.

The model has a very high prediction accuracy even though we saw earlier that our dataset had some calibration issues.

Initialize parallel option - comment out if you don't use parallel

poolobj = gcp('nocreate'); % don't create a new pool even if no pool exits if isempty(poolobj) parpool('local',2); end opts = statset('UseParallel',true);

Starting parallel pool (parpool) using the 'local' profile ... connected to 2 workers.

Create a Random Forest model with 100 trees, parallel enabled...

rfmodel = TreeBagger(100,table2array(Xtrain),Ytrain,... 'Method','classification','oobvarimp','on','options',opts);

Here's the non-parallel version of the same model rfmodel = TreeBagger(100,table2array(Xtrain),Ytrain,... 'Method','classification','oobvarimp','on');

Predict the class labels for the test set.

[Ypred,Yscore]= predict(rfmodel,table2array(Xtest));

Compute the confusion matrix and prediction accuracy.

C = confusionmat(Ytest,categorical(Ypred)); disp(array2table(C,'VariableNames',rfmodel.ClassNames,'RowNames',rfmodel.ClassNames)) fprintf('Prediction accuracy on test set: %f\n\n', sum(C(logical(eye(5))))/sum(sum(C)))

A B C D E ____ ___ ___ ___ ___ A 1133 1 0 0 0 B 4 728 1 0 0 C 0 3 645 3 0 D 0 0 8 651 0 E 0 0 0 6 741 Prediction accuracy on test set: 0.993374

I happened to pick 100 trees in the model, but you can check the misclassification errors relative to the number of trees used in the prediction. The plot shows that 100 is an overkill - we could use fewer trees and that will make it go faster.

figure plot(oobError(rfmodel)); xlabel('Number of Grown Trees'); ylabel('Out-of-Bag Classification Error');

One major criticism of Random Forest is that it is a black box algorithm and it not easy to understand what it is doing. However, Random Forest can provide the variable importance measure, which corresponds to the change in prediction error with and without the presence of a given variable in the model.

For our hypothetical weight lifting trainer mobile app, Random Forest would be too cumbersome and slow to implement, so you want to use a simpler prediction model with fewer predictor variables. Random Forest can help you with selecting which predictors you can drop without sacrificing the prediction accuracy too much.

Let's see how you can do this with `TreeBagger`.

Get the names of variables

vars = Xtrain.Properties.VariableNames;

Get and sort the variable importance scores. Because we turned `'oobvarimp'` to `'on'`, the model contains `OOBPermutedVarDeltaError` which acts as the variable importance measure.

varimp = rfmodel.OOBPermutedVarDeltaError'; [~,idxvarimp]= sort(varimp); labels = vars(idxvarimp);

Plot the sorted scores.

figure barh(varimp(idxvarimp),1); ylim([1 52]); set(gca, 'YTickLabel',labels, 'YTick',1:numel(labels)) title('Variable Importance'); xlabel('score')

Now let's do the trade-off between the number of predictor variables and prediction accuracy. The Receiver operating characteristic (ROC) plot provides a convenient way to visualize and compare performance of binary classifiers. You plot the false positive rate against the true positive rate at various prediction thresholds to produce the curves. If you get a completely random result, the curve should follow a diagonal line. If you get a 100% accuracy, then the curve should hug the upper left corner. This means you can use the area under the curve (AUC) to evaluate how well each model performs.

Let's plot ROC curves with different sets of predictor variables, using the "C" class as the positive class, since we can only do this one class at a time, and the previous confusion matrix shows more misclassification errors for this class than others. You can use `perfcurve` to compute ROC curves.

Check out `myROCplot.m` to see the details.

```
nFeatures = [3,5,10,15,20,25,52];
myROCplot(Xtrain,Ytrain,Xtest,Ytest,'C',nFeatures,varimp)
```

Based on the previous analysis, it looks like you can achieve a high accuracy rate even if you use as few as 10 features. Let's say we settled for 12 features. We now know you don't have to use the data from the glove for prediction, so that's one less sensor our hypothetical end users would have to buy. Given this result, I may even consider dropping the arm band, and just stick with the belt and dumbbell sensors.

Show which features were included.

disp(table(varimp(idxvarimp(1:12)),'RowNames',vars(idxvarimp(1:12)),... 'VariableNames',{'Importance'}));

Importance __________ accel_belt_y 0.69746 pitch_dumbbell 0.77255 accel_arm_z 0.78941 accel_belt_x 0.81696 magnet_arm_y 0.81946 accel_arm_x 0.87168 magnet_arm_x 0.87897 accel_dumbbell_x 0.92222 magnet_forearm_x 1.0172 total_accel_belt 1.0461 gyros_arm_z 1.1077 gyros_belt_x 1.1235

Shut down the parallel pool.

delete(poolobj);

Despite my initial misgivings about the data, we were able to maintain high prediction accuracy with a Random Forest model with just 12 features. However, Random Forest is probably not an ideal model to implement on a mobile app given its memory foot print and slow response time.

The next step is to find a simpler model, such as logistics regression, that can perform decently. You may need to do more preprocessing of the data to make it work.

Finally, I have never tried this before, but you could generate C code out of MATLAB to incorporate it into a mobile app. Watch this webinar, MATLAB to iPhone Made Easy, for more details.

Do you track your activities with wearable devices? Have you tried analyzing the data? Tell us about your experience here!

Get
the MATLAB code

Published with MATLAB® R2014a

This summer my mother-in-law is renting a house on a lake in New Hampshire. Looking at the calendar, my wife said: "The ten-day forecast makes it look like it's going to be pretty hot up at the lake next week." This led to a more general discussion of the merits of ten-day forecasts.

It's funny how we can make decisions based on long-term predictions of weather even though we rarely go back and verify that the forecast was any good. Somehow the fact that the forecast exists at all gives it value. I'm left pondering this question: how much should we trust a ten-day prediction? As it happens, I have some data that can be useful here. For some time, I have been collecting some relevant data on Trendy: the ten day forecast for Natick, Massachusetts (hometown for MathWorks). So let's run some numbers.

Here's the trend: Ten Day Forecast Highs for Natick, MA.

Once a day this trend collects ten data points: today's high temperature and the predicted high temperature for the next nine days. In MATLAB, we'll be working with a matrix with one row for each day and ten columns.

Let's get the data into MATLAB so we can play around with it. I can retrieve (and so can you) the data from Trendy as a JSON object using the following call:

http://www.mathworks.com/matlabcentral/trendy/trends/1655/trend_data.json

In order to read this into MATLAB, I'm going to use Joe Hicklin's JSON parser.

```
url = 'http://www.mathworks.com/matlabcentral/trendy/trends/1655/trend_data.json';
json = urlread(url);
raw = JSON.parse(json);
```

t = zeros(length(raw),1); d = zeros(length(raw),10); for i = 1:length(raw) t(i) = raw{i}{1}; predictions = raw{i}{2}; for j = 1:10 d(i,j) = str2num(predictions{j}); end end firstTenRows = d(1:10,:)

firstTenRows = 44 51 59 66 46 48 53 50 51 54 50 58 63 49 48 52 48 46 57 54 59 61 49 47 53 49 48 43 41 48 62 49 48 54 49 39 39 44 47 46 49 48 54 50 40 38 39 47 51 54 47 55 50 39 40 48 52 53 53 53 54 50 40 39 48 53 54 52 52 50 49 40 38 50 55 54 56 56 53 49 40 39 50 56 54 52 56 54 47 43 39 50 55 55 55 59 58 40 41 46

Now I have a temperature prediction matrix that's structured like this.

I want to re-order this matrix so that each line shows the prediction trajectory for a single day in time. That means picking off the diagonals highlighted in the diagram above. So let's write some code that does this shift. I'm going to end up with two new matrices, d1 and d2

rowIndex = 1:10; colIndex = 10:-1:1; sz = size(d); len = (size(d,1)-10); d1 = zeros(len,10); d2 = zeros(len,10); t1 = zeros(len,1); for i = 1:len ind = sub2ind(sz,rowIndex+i-1,colIndex); trend = d(ind); d1(i,:) = trend; d2(i,:) = trend-trend(end); t1(i) = t(i+9); end firstTenRows = d1(1:10,:)

firstTenRows = 54 57 43 39 38 40 39 38 39 39 54 41 44 39 48 48 50 50 50 49 48 47 47 52 53 55 56 55 54 57 46 51 53 54 54 54 55 55 57 58 54 53 52 56 52 55 57 57 60 58 53 52 56 56 59 59 61 63 63 64 50 53 54 58 45 42 45 44 44 43 49 47 40 39 37 42 43 43 43 43 43 41 41 42 44 48 47 49 48 47 46 44 48 49 46 52 51 50 49 48

In d1, each row is the evolving temperature prediction for each day. So when we plot the first row of d1, we're getting the predictive arc for November 13th of last year.

i = 1; plot(-9:0,d1(i,:)) title(sprintf('Predicted Temperature for %s',datestr(t1(i),1))) xlabel('Time of Prediction (Offset in Days)') ylabel('Predicted Temperature (Deg. F)')

In d2, we just subtract from each row the last value in each row. Since this last value is the final (and presumably correct) temperature, this difference gives us the predictive error across the ten days. Here's the error for the November 13th prediction.

i = 1; plot(-9:0,d2(i,:)) title(sprintf('Error in Predicted Temperature for %s',datestr(t1(i),1))) xlabel('Time of Prediction (Offset in Days)') ylabel('Prediction Error (Deg. F)')

Notice that it shrinks to zero over time. That's good! Our predictions get more accurate as we approach the actual day in question. But the early predictions were off by as much as 18 degrees. Is that good or bad? You tell me.

Now let's look at all the days.

plot(-9:0,d2','Color',[0.5 0.5 1]) title('Error in Predicted Temperature') xlabel('Time of Prediction (Offset in Days)') ylabel('Prediction Error (Deg. F)')

It's hard to get a sense of the error distribution. So let's finish with a histogram of the absolute value of the error. Out of 240 measurements in this data set, the median error for a ten-day prediction is six degrees.

hist(abs(d2(:,1)),1:25) title('Histogram of Error in the Ten-Day Forecast') xlabel('Error (deg. F)') ylabel('Number of Samples')

That seems pretty good. Most of the time that error is going to be less than seven or so degrees Fahrenheit (or four degrees Celsius). I probably don't need to pack a sweater for the weekend at the lake.

Get
the MATLAB code

Published with MATLAB® R2014a

This is the first part of a two-part series about the single- and double precision floating point numbers that MATLAB uses for almost all of its arithmetic operations. (This post is adapted from section 1.7 of my book Numerical Computing with MATLAB, published by MathWorks and SIAM.)

If you look carefully at the definitions of fundamental arithmetic operations like addition and multiplication, you soon encounter the mathematical abstraction known as real numbers. But actual computation with real numbers is not very practical because it involves notions like limits and infinities. Instead, MATLAB and most other technical computing environments use floating point arithmetic, which involves a finite set of numbers with finite precision. This leads to the phenomena of *roundoff*, *underflow*, and *overflow*. Most of the time, it is possible to use MATLAB effectively without worrying about these details, but, every once in a while, it pays to know something about the properties and limitations of floating point numbers.

Thirty years ago, the situation was far more complicated than it is today. Each computer had its own floating point number system. Some were binary; some were decimal. There was even a Russian computer that used trinary arithmetic. Among the binary computers, some used 2 as the base; others used 8 or 16. And everybody had a different precision. In 1985, the IEEE Standards Board and the American National Standards Institute adopted the ANSI/IEEE Standard 754-1985 for Binary Floating-Point Arithmetic. This was the culmination of almost a decade of work by a 92-person working group of mathematicians, computer scientists, and engineers from universities, computer manufacturers, and microprocessor companies.

A revision of IEEE_754-1985, known as IEEE 754-2008, was published in 2008. The revision combines the binary standards in 754 and the decimal standards in another document, 854. But as far as I can tell, the new features introduced in the revision haven't had much impact yet.

William M. Kahan was the primary architect of IEEE floating point arithmetic. I've always known him by his nickname "Velvel". I met him in the early 1960's when I was a grad student at Stanford and he was a young faculty member at the University of Toronto. He moved from Toronto to UC Berkeley in the late 1960's and spent the rest of his career at Berkeley. He is now an emeritus professor of mathematics and of electrical engineering and computer science.

Velvel is an eminent numerical analyst. Among many other accomplishments, he is the developer along with Gene Golub of the first practical algorithm for computing the singular value decomposition.

In the late 1970's microprocessor manufacturers including Intel, Motorola, and Texas Instruments were developing the chips that were to become the basis for the personal computer and eventually today's electronics. In a remarkable case of cooperation among competitors, they established a committee to develop a common standard for floating point arithmetic.

Velvel was a consult to Intel, working with John Palmer, a Stanford grad who was the principal architect of the 8087 math coprocessor. The 8087 accompanied the 8086, which was to become the CPU of the IBM PC. Kahan and Palmer convinced Intel to allow them to release the specs for their math coprocessor to the IEEE committee. These plans became the basis for the standard.

Velvel received the ACM Turing Award in 1989 for "his fundamental contributions to numerical analysis". He was named an ACM Fellow in 1994, and was inducted into the National Academy of Engineering in 2005.

All computers designed since 1985 use IEEE floating point arithmetic. This doesn't mean that they all get exactly the same results, because there is some flexibility within the standard. But it does mean that we now have a machine-independent model of how floating point arithmetic behaves.

MATLAB has traditionally used the IEEE double precision format. In recent years, the single precision format has also been available. This saves space and can sometimes be faster. There is also an extended precision format, which is not precisely specified by the standard and which may be used for intermediate results by the floating point hardware. This is one source of the lack of uniformity among different machines.

Most nonzero floating point numbers are normalized. This means they can be expressed as

$$ x = \pm (1 + f) \cdot 2^e $$

The quantity $f$ is the *fraction* or *mantissa* and $e$ is the *exponent*. The fraction satisfies

$$ 0 \leq f < 1 $$

The fraction $f$ must be representable in binary using at most 52 bits for double precision and 23 bits for single recision. In other words, for double, $2^{52} f$ is an integer in the interval

$$ 0 \leq 2^{52} f < 2^{52} $$

For single, $2^{23} f$ is an integer in the interval

$$ 0 \leq 2^{23} f < 2^{23} $$

For double precision, the exponent $e$ is an integer in the interval

$$ -1022 \leq e \leq 1023 $$

For single, the exponent $e$ is in the interval

$$ -126 \leq e \leq 127 $$

The finiteness of $f$ is a limitation on *precision*. The finiteness of $e$ is a limitation on *range*. Any numbers that don't meet these limitations must be approximated by ones that do.

Double precision floating point numbers are stored in a 64-bit word, with 52 bits for $f$, 11 bits for $e$, and 1 bit for the sign of the number. The sign of $e$ is accommodated by storing $e+1023$, which is between $1$ and $2^{11}-2$. The 2 extreme values for the exponent field, $0$ and $2^{11}-1$, are reserved for exceptional floating point numbers that I will describe later.

Single precision floating point numbers are stored in a 32-bit word, with 23 bits for $f$, 8 bits for $e$, and 1 bit for the sign of the number. The sign of $e$ is accommodated by storing $e+127$, which is between $1$ and $2^{8}-2$. Again, the 2 extreme values for the exponent field are reserved for exceptional floating point numbers.

The entire fractional part of a floating point number is not $f$, but $1+f$. However, the leading $1$ doesn't need to be stored. In effect, the IEEE formats pack $33$ or $65$ bits of information into a $32$ or $64$-bit word.

My program `floatgui`, available here, shows the distribution of the positive numbers in a model floating point system with variable parameters. The parameter $t$ specifies the number of bits used to store $f$. In other words, $2^t f$ is an integer. The parameters $e_{min}$ and $e_{max}$ specify the range of the exponent, so $e_{min} \leq e \leq e_{max}$. Initially, `floatgui` sets $t = 3$, $e_{min} = -4$, and $e_{max} = 2$ and produces the following distribution.

Within each binary interval $2^e \leq x \leq 2^{e+1}$, the numbers are equally spaced with an increment of $2^{e-t}$. If $e = 0$ and $t = 3$, for example, the spacing of the numbers between $1$ and $2$ is $1/8$. As $e$ increases, the spacing increases.

It is also instructive to display the floating point numbers with a logarithmic scale. Here is the output from `floatgui` with `logscale` checked and $t = 5$, $e_{min} = -4$, and $e_{max} = 3$. With this logarithmic scale, it is more apparent that the distribution in each binary interval is the same.

A very important quantity associated with floating point arithmetic is highlighted in red by `floatgui`. MATLAB calls this quantity `eps`, which is short for *machine epsilon*.

`eps`is the distance from $1$ to the next larger floating point number.

For the `floatgui` model floating point system, `eps` = $2^{-t}$.

Before the IEEE standard, different machines had different values of `eps`. Now, for IEEE double precision the approximate decimal value of `eps` is

format short format compact eps_d = eps

eps_d = 2.2204e-16

And for single precision,

```
eps_s = eps('single')
```

eps_s = 1.1921e-07

Either `eps/2` or `eps` can be called the roundoff level. The maximum relative error incurred when the result of an arithmetic operation is rounded to the nearest floating point number is `eps/2`. The maximum relative spacing between numbers is `eps`. In either case, you can say that the roundoff level is about 16 decimal digits for double and about 7 decimal digits for single.

Actually `eps` is a function. `eps(x)` is the distance from `abs(x)` to the next larger in magnitude floating point number of the same precision as `x`.

A frequently occurring example is provided by the simple MATLAB statement

t = 0.1

The mathematical value $t$ stored in `t` is not exactly $0.1$ because expressing the decimal fraction $1/10$ in binary requires an infinite series. In fact,

$$ \frac{1}{10} = \frac{1}{2^4} + \frac{1}{2^5} + \frac{0}{2^6} + \frac{0}{2^7} + \frac{1}{2^8} + \frac{1}{2^9} + \frac{0}{2^{10}} + \frac{0}{2^{11}} + \frac{1}{2^{12}} + \cdots $$

After the first term, the sequence of coefficients $1, 0, 0, 1$ is repeated infinitely often. Grouping the resulting terms together four at a time expresses $1/10$ n a base 16, or *hexadecimal*, series.

$$ \frac{1}{10} = 2^{-4} \cdot \left(1 + \frac{9}{16} + \frac{9}{16^2} + \frac{9}{16^3} + \frac{9}{16^{4}} + \cdots\right) $$

Floating-point numbers on either side of $1/10$ are obtained by terminating the fractional part of this series after $52$ binary terms, or 13 hexadecimal terms, and rounding the last term up or down. Thus

$$ t_1 < 1/10 < t_2 $$

where

$$ t_1 = 2^{-4} \cdot \left(1 + \frac{9}{16} + \frac{9}{16^2} + \frac{9}{16^3} + \cdots + \frac{9}{16^{12}} + \frac{9}{16^{13}}\right) $$

$$ t_2 = 2^{-4} \cdot \left(1 + \frac{9}{16} + \frac{9}{16^2} + \frac{9}{16^3} + \cdots + \frac{9}{16^{12}} + \frac{10}{16^{13}}\right) $$

It turns out that $1/10$ is closer to $t_2$ than to $t_1$, so $t$ is equal to $t_2$. In other words,

$$ t = (1 + f) \cdot 2^e $$

where

$$ f = \frac{9}{16} + \frac{9}{16^2} + \frac{9}{16^3} + \cdots + \frac{9}{16^{12}} + \frac{10}{16^{13}} $$

$$ e = -4 $$

The MATLAB command

```
format hex
```

causes `t` to be displayed as

3fb999999999999a

The characters `a` through `f` represent the hexadecimal "digits" 10 through 15. The first three characters, `3fb`, give the hexadecimal representation of decimal $1019$, which is the value of the biased exponent $e+1023$ if $e$ is $-4$. The other 13 characters are the hexadecimal representation of the fraction $f$.

In summary, the value stored in `t` is very close to, but not exactly equal to, $0.1$. The distinction is occasionally important. For example, the quantity

0.3/0.1

is not exactly equal to $3$ because the actual numerator is a little less than $0.3$ and the actual denominator is a little greater than $0.1$.

Ten steps of length `t` are not precisely the same as one step of length 1. MATLAB is careful to arrange that the last element of the vector

0:0.1:1

is exactly equal to $1$, but if you form this vector yourself by repeated additions of $0.1$, you will miss hitting the final $1$ exactly.

What does the floating point approximation to the golden ratio, $\phi$, look like?

```
format hex
phi = (1 + sqrt(5))/2
```

phi = 3ff9e3779b97f4a8

The first hex digit, `3`, is `0011` in binary. The first bit is the sign of the floating point number; `0` is positive, `1` is negative. So `phi` is positive. The remaining bits of the first three hex digits contain $e+1023$. In this example, `3ff` in base 16 is $3 \cdot 16^2 + 15 \cdot 16 + 15 = 1023$ in decimal. So

$$ e = 0 $$

In fact, any floating point number between $1.0$ and $2.0$ has $e = 0$, so its `hex` output begins with `3ff`. The other 13 hex digits contain $f$. In this example,

$$ f = \frac{9}{16} + \frac{14}{16^2} + \frac{3}{16^3} + \cdots + \frac{10}{16^{12}} + \frac{8}{16^{13}} $$

With these values of $f$ and $e$,

$$ (1 + f) 2^e \approx \phi $$

Another example is provided by the following code segment.

```
format long
a = 4/3
b = a - 1
c = 3*b
e = 1 - c
```

a = 1.333333333333333 b = 0.333333333333333 c = 1.000000000000000 e = 2.220446049250313e-16

With exact computation, `e` would be zero. But with floating point, we obtain double precision `eps`. Why?

It turns out that the only roundoff occurs in the division in the first statement. The quotient cannot be exactly $4/3$, except on that Russian trinary computer. Consequently the value stored in `a` is close to, but not exactly equal to, $4/3$. The subtraction `b = a - 1` produces a quantity `b` whose last bit is 0. This means that the multiplication `3*b` can be done without any roundoff. The value stored in `c` is not exactly equal to $1$, and so the value stored in `e` is not 0. Before the IEEE standard, this code was used as a quick way to estimate the roundoff level on various computers.

The roundoff level `eps` is sometimes called ``floating point zero,'' but that's a misnomer. There are many floating point numbers much smaller than `eps`. The smallest positive normalized floating point number has $f = 0$ and $e = -1022$. The largest floating point number has $f$ a little less than $1$ and $e = 1023$. MATLAB calls these numbers `realmin` and `realmax`. Together with `eps`, they characterize the standard system.

Double Precision Binary Decimal eps 2^(-52) 2.2204e-16 realmin 2^(-1022) 2.2251e-308 realmax (2-eps)*2^1023 1.7977e+308

Single Precision Binary Decimal eps 2^(-23) 1.1921e-07 realmin 2^(-126) 1.1755e-38 realmax (2-eps)*2^127 3.4028e+38

If any computation tries to produce a value larger than `realmax`, it is said to *overflow*. The result is an exceptional floating point value called *infinity* or `Inf`. It is represented by taking $e = 1024$ and $f = 0$ and satisfies relations like `1/Inf = 0` and `Inf+Inf = Inf`.

If any computation tries to produce a value that is undefined even in the real number system, the result is an exceptional value known as Not-a-Number, or `NaN`. Examples include `0/0` and `Inf-Inf`. `NaN` is represented by taking $e = 1024$ and $f$ nonzero.

If any computation tries to produce a value smaller than `realmin`, it is said to *underflow*. This involves one of most controversial aspects of the IEEE standard. It produces exceptional *denormal* or *subnormal* floating point numbers in the interval between `realmin` and `eps*realmin`. The smallest positive double precision subnormal number is

format short smallest_d = eps*realmin smallest_s = eps('single')*realmin('single')

smallest_d = 4.9407e-324 smallest_s = 1.4013e-45

Any results smaller than this are set to 0.

The controversy surrounding underflow and denormals will be the subject of my next blog post.

Get
the MATLAB code

Published with MATLAB® R2014a

I teach the MATLAB 101 class to every new hire at MathWorks. Inevitably, someone will ask me how to make a global variable. I then Google up: “How to make MATLAB Doug Cry” and find the above post. I think this is a post I should bring up every now and then because these things should be avoided today just like they should have been when I first wrote it.

I will be back from vacation soon with fresh content.

Thanks,
Doug]]>

The nineteenth Householder Symposium, Householder XIX, was held June 8-13 at Sol Cress, a conference center near Spa, Belgium. If you have been following either the web or the newletter edition of Cleve's Corner you know that the Gatlinburg/Householder series of conferences have played an important role in both my professional life and the history of MATLAB. I attended what turned out to be the third conference in the series, in Gatlinburg, Tennesse, when I was a graduate student in 1964. I have been to all 17 of the conferences that have been held since 1964. Here is a link to my News and Notes article about the Gatlinburg/Householder conferences.

The Householder Symposium has been organized over the years by an evolving volunteer committee. The symposium is not affiliated with any professional society. The organizing committee for Householder XIX was:

- Ilse Ipsen (chair), North Carolina State University, USA
- Jim Demmel, University of California at Berkeley, USA
- Alan Edelman, Massachusetts Institute of Technology, USA
- Heike Fassbender, Technical University of Braunschweig, Germany
- Volker Mehrmann, Technical University of Berlin
- Jim Nagy, Emory University, USA
- Yousef Saad, University of Minnesota, USA
- Valeria Simoncini, University of Bologna, Italy
- Zdenek Strakos, Charles University in Prague, Czech Republic
- Andy Wathen, Oxford University, UK

The local organizers in Belgium were:

- Paul Van Dooren (chair), Universite catholique de Louvain
- Pierre-Antoine Absil, Universite catholique de Louvain
- Karl Meerbergen, Katholieke Universiteit Leuven
- Annick Sartenaer, Universite de Namur
- Marc Van Barel, Katholieke Universiteit Leuven
- Sabine Van Huffel, Katholieke Universiteit Leuven

In order to facilitate interaction, attendance is limited. The organizing committee accepts applications and evaluates resumes. This time only about half of the 300 applicants were selected. This has been a controversial aspect of the conference, but I am in favor of limiting the number of participants.

A little over 100 of the attendees made it to the group photo. A few are wearing neckties and their best dresses in the photo because the conference banquet was scheduled shortly thereafter.

A list of the attendees is available on the symposium web site.

Everybody who attended the conference had an opportunity to give a talk or present a poster during the five day conference. There were a total of 26 half-hour plenary talks. Each day there were two sets of three parallel sessions of less formal 20 minute talks. A couple of days had late afternoon or evening poster sessions.

The complete program is available here. The overall theme is numerical linear algebra. Applications include partial differetial equations, image processing, control theory, graph theory, model reduction, and gyroscopes.

The Householder Prize is given for the best PhD dissertation written during the three-year period preceeding the meeting. A list of the winners of the prize in previous years is available here The committee judging the dissertations for the prize this time was:

- Volker Mehrmann, Technical University of Berlin, Germany (chair)
- Michele Benzi, Emory University, USA
- Inderjit Dhillon, UT Austin, USA
- Howard Elman, University of Maryland, USA
- Francoise Tisseur, University of Manchester, UK
- Stephen Vavasis, University of Waterloo, Canada

Volker reported that 17 theses from 9 countries were submitted. From these, a short list of six finalists were selected:

- Grey Ballard, UC Berkeley
- Saifon Chaturantabut, Rice University
- Cedric Effenberger, EPF Lausanne
- Nicolas Gillis, UC Louvain
- Agnieszka Miedlar, TU Berlin
- Yuji Nakatsukasa, UC Davis

From this short list, the two winners of the 2014 Householder Prize were announced:

- Nicolas Gillis, UC Louvain
- Yuji Nakatsukasa, UC Davis

Nicolas is now an Assistant Professor at Universite de Mons in Belgium and Yuji is an Assistant Professor at the University of Tokyo.

The two winners gave talks about their thesis work in the last days of the meeting. They shared prize money of a little over $2000 obtained by passing a hat at the banquet dinner during Householder XVIII three years ago in Lake Tahoe. A hat was passed again at this meeting to fund the next Householder Prize. The location for the next Householder meeting has not yet been decided.

Traditionally the Wednesday afternoon of the Householder meeting is devoted to an excursion of some kind. The XIX excursion was to "Notre Dame du Val-Dideu", an abbey founded in 1216 by Cistercian monks. There are no longer any monks living at the abbey -- it is now best known as a brewery. Their beer is Val-Dieu. We toured the abbey and the tiny brewery, which is operated by just five people. We then had the opportunity to taste three different beers and their wonderful cheese. Belgian abbey beers have nine percent alcohol. It was a very pleasant afternoon.

Get
the MATLAB code

Published with MATLAB® R2014a

Stiffness is a subtle concept that plays an important role in assessing the effectiveness of numerical methods for ordinary differential equations. (This article is adapted from section 7.9, "Stiffness", in *Numerical Computing with MATLAB*.)

Stiffness is a subtle, difficult, and important concept in the numerical solution of ordinary differential equations. It depends on the differential equation, the initial conditions, and the numerical method. Dictionary definitions of the word "stiff" involve terms like "not easily bent," "rigid," and "stubborn." We are concerned with a computational version of these properties.

A problem is stiff if the solution being sought varies slowly, but there are nearby solutions that vary rapidly, so the numerical method must take small steps to obtain satisfactory results.

Stiffness is an efficiency issue. If we weren't concerned with how much time a computation takes, we wouldn't be concerned about stiffness. Nonstiff methods can solve stiff problems; they just take a long time to do it.

Stiff ode solvers do more work per step, but take bigger steps. And we're not talking about modest differences here. For truly stiff problems, a stiff solver can be orders of magnitude more efficient, while still achieving a given accuracy.

A model of flame propagation provides an example. I learned about this example from Larry Shampine, one of the authors of the MATLAB ordinary differential equation suite. If you light a match, the ball of flame grows rapidly until it reaches a critical size. Then it remains at that size because the amount of oxygen being consumed by the combustion in the interior of the ball balances the amount available through the surface. The simple model is

$$ \dot{y} = y^2 - y^3 $$

$$ y(0) = \delta $$

$$ 0 \leq t \leq {2 \over \delta} $$.

The scalar variable $y(t)$ represents the radius of the ball. The $y^2$ and $y^3$ terms come from the surface area and the volume. The critical parameter is the initial radius, $\delta$, which is "small." We seek the solution over a length of time that is inversely proportional to $\delta$. You can see a dramatization by downloading `flame.m` from <http://www.mathworks.com/moler/ncmfilelist.html>.

At this point, I suggest that you start up MATLAB and actually run our examples. It is worthwhile to see them in action. I will start with `ode45`, the workhorse of the MATLAB ode suite. If $\delta$ is not very small, the problem is not very stiff. Try $\delta$ = 0.02 and request a relative error of $10^{-4}$.

```
delta = 0.02;
F = @(t,y) y^2 - y^3;
opts = odeset('RelTol',1.e-4);
ode45(F,[0 2/delta],delta,opts);
```

With no output arguments, `ode45` automatically plots the solution as it is computed. You should get a plot of a solution that starts at $y$ = 0.01, grows at a modestly increasing rate until $t$ approaches 50, which is $1/\delta$, then grows rapidly until it reaches a value close to 1, where it remains.

Now let's see stiffness in action. Decrease $\delta$ by more than two orders of magnitude. (If you run only one example, run this one.)

delta = 0.0001; ode45(F,[0 2/delta],delta,opts);

You should see something like this plot, although it will take several seconds to complete the plot. If you get tired of watching the agonizing progress, click the stop button in the lower left corner of the window. Turn on zoom, and use the mouse to explore the solution near where it first approaches steady state. You should see something like the detail obtained with this `axis` command.

axis([.995e4 1.03e4 0.9998 1.0002]) last_plot = getframe(gcf);

Notice that `ode45` is doing its job. It's keeping the solution within $10^{-4}$ of its nearly constant steady state value. But it certainly has to work hard to do it. If you want an even more dramatic demonstration of stiffness, decrease $\delta$ or decrease the tolerance to $10^{-5}$ or $10^{-6}$.

This problem is not stiff initially. It only becomes stiff as the solution approaches steady state. This is because the steady state solution is so "rigid." Any solution near $y(t) = 1$ increases or decreases rapidly toward that solution. (I should point out that "rapidly" here is with respect to an unusually long time scale.)

What can be done about stiff problems? You don't want to change the differential equation or the initial conditions, so you have to change the numerical method. Methods intended to solve stiff problems efficiently do more work per step, but can take much bigger steps. Stiff methods are *implicit*. At each step they use MATLAB matrix operations to solve a system of simultaneous linear equations that helps predict the evolution of the solution. For our flame example, the matrix is only 1 by 1, but even here, stiff methods do more work per step than nonstiff methods.

Let's compute the solution to our flame example again, this time with `ode23s`. The " *s* " in the name is for "stiff."

ode23s(F,[0 2/delta],delta,opts);

Here is the zoom detail.

axis([.995e4 1.03e4 0.9998 1.0002])

You can see that *ode23s* takes many fewer steps than *ode45*. This is actually an easy problem for a stiff solver. In fact, *ode23s* takes only 99 steps and uses just 412 function evaluations, while *ode45* takes 3040 steps and uses 20179 function evaluations. Stiffness even affects graphical output. The print files for the *ode45* figures are much larger than those for the *ode23s* figures.

Imagine you are returning from a hike in the mountains. You are in a narrow canyon with steep slopes on either side. An explicit algorithm would sample the local gradient to find the descent direction. But following the gradient on either side of the trail will send you bouncing back and forth across the canyon, as with #ode45#. You will eventually get home, but it will be long after dark before you arrive. An implicit algorithm would have you keep your eyes on the trail and anticipate where each step is taking you. It is well worth the extra concentration.

All numerical methods for stiff odes are *implicit*. The simplest example is the *backward Euler* method, which involves finding the unknown $v$ in

$$ v = y_n + h f(t_{n+1},v) $$

and then setting $y_{n+1}$ equal to that $v$. This is usually a nonlinear system of equations whose solution requires at least an approximation to the Jacobian, the matrix of partial derivatives

$$ J = {\partial F \over \partial y} $$

By default the partial derivatives in the Jacobian are computed by finite differences. This can be quite costly in terms of function evaluations. If a procedure for computing the Jacobian is available, it can be provided. Or, if the sparsity pattern is known, it can be specified. The blocks in a Simulink diagram, for example, are only sparsely connected to each other. Specifying a sparse Jacobian initiates sparse linear equation solving.

Newton's method for computing the $v$ in the backward Euler method is an iteration. Start perhaps with $v^0 = y_n$. Then, for $k = 0,1,...$, solve the linear system of equations

$$ (I - hJ) u^k = v^k - y_n - h f(t_{n+1},v^k) $$

for the correction $u^k$ . Set

$$ v^{k+1} = v^k + u^k $$

When you are satisfied that the $v^k$ have converged, let $y_{n+1}$ be the limit.

Stiff ODE solvers may not use Newton's method itself, but whatever method is used to find the solution, $y_{n+1}$, at the forward time step can ultimately be traced back to Newton's method.

`ode15s` employs two variants of a method that is quite different from the single step methods that I've described so far in this series on ode solvers. Linear multistep methods save solution values from several time steps and use all of them to advance to the next step.

Actually, `ode15s` can be compared to the other multistep method in the suite, `ode113`. One saves values of the solution, $y_n$, while the other saves values of the function, $F(t_n,y_n)$. One includes the unknown value at the new time step $y_{n+1}$ in the formulation, thereby making it implicit, while the other does not. Both methods can vary the order as well as the step size. As their names indicate, `ode15s` allows the order to vary between 1 and 5, while `ode113s` allows the order to vary between 1 and 13.

A property specified via `odeset` switches `ode15s` between two variants of a linear multistep method, BDF, Backward Differentiation Formulas, and NDF, Numerical Differentiation Formulas. BDFs were introduced for stiff odes in 1971 by Bill Gear. Gear's student, Linda Petzold, extended the ideas to DAEs, differential-algebraic equations, and produced DASSL, software whose successors are still in widespread use today. NDFs, which are the default for `ode15s`, include an additional term in the memory and consequently can take larger steps with the same accuracy, especially at lower order.

`ode23s` is a single step, implicit method described in the paper by Shampine and Reichelt referenced below. It uses a second order formula to advance the step and a third order formula to estimate the error. It recomputes the Jacobian with each step, thereby making it quite expensive in terms of function evaluations. If you can supply an analytic Jacobian then `ode23s` is a competitive choice.

`ode23t` and `ode23tb` are implicit methods based on the trapezoid rule and the second order BDF. The origins of the methods go back to the 1985 paper referenced below by a group at the old Bell Labs working on electronic device and circuit simulation. Mike Hosea and Larry Shampine made extensive modifications and improvements described in their 1996 paper when they implemented the methods in MATLAB.

Stiff ODE solvers are not actually using MATLAB's iconic backslash operator on a full system of linear equations, but they are using its component parts, LU decomposition and solution of the resulting triangular systems.

Let's look at the statistics generated by `ode23` when it solves the flame problem. We'll run it again, avoiding the plot by asking for output, but then ignoring the output, and just looking at the stats.

opts = odeset('stats','on','reltol',1.e-4); [~,~] = ode23s(F,[0 2/delta],delta,opts);

99 successful steps 7 failed attempts 412 function evaluations 99 partial derivatives 106 LU decompositions 318 solutions of linear systems

We see that at every step `ode23s` is computing a Jacobian, finding the LU decomposition of a matrix involving that Jacobian, and then using L and U to solve three linear systems.

Now how about the primary stiff solver, `ode15s`.

[~,~] = ode15s(F,[0 2/delta],delta,opts);

140 successful steps 39 failed attempts 347 function evaluations 2 partial derivatives 53 LU decompositions 342 solutions of linear systems

We see that `ode15s` takes more steps than `ode23s`, but requires only two Jacobians. It does only half as many LU decompositions, but then uses each LU decomposition for twice as many linear equation solutions.

We certainly can't draw any conclusions about the relative merits of these two solvers from this one example, especially since the Jacobian in this case is only a 1-by-1 matrix.

Cleve Moler, *Numerical Computing with MATLAB*, Electronic Edition, MathWorks, <http://www.mathworks.com/moler/index_ncm.html>,

Print Edition, SIAM Revised Reprint, SIAM, 2008, 334 pp., <http://www.ec-securehost.com/SIAM/ot87.html>.

Lawrence F. Shampine and Mark W. Reichelt, "The MATLAB ODE Suite", SIAM Journal on Scientific Computing, 18 (1997), pp.1-22, <http://www.mathworks.com/help/pdf_doc/otherdocs/ode_suite.pdf>

Lawrence F. Shampine, Mark W. Reichelt, and Jacek A. Kierzenka, "Solving Index-1 DAEs in MATLAB and Simulink", SIAM Review, 41 (1999), pp. 538-552. <http://epubs.siam.org/doi/abs/10.1137/S003614459933425X>

M. E. Hosea and L. F. Shampine, "Analysis and Implementawtion of TR-BDF2", Applied Numerical Mathematicss, 20 (1996), pp. 21-37, <http://www.sciencedirect.com/science/article/pii/0168927495001158>

R. E. Bank, W. M. Coughran, Jr., W. Fichtner., E. H. Grosse, D. J. Rose, and R. K. Smith, "Transient simulation of silicon devices and circuits", IEEE Transactions on Computer-Aided Design CAD-4 (1985), 4, pp. 436-451.

I want to repeat this plot because it represents the post on the lead-in page at MATLAB Central.

imshow(last_plot.cdata,'border','tight')

Get
the MATLAB code

Published with MATLAB® R2014a

*by Gareth Thomas*

Today I want to talk about sharing tables in reports (a table is a relatively new data type in MATLAB suitable for column-oriented or tabular data). In the current release of MATLAB (R2014a) it can be a little tricky to represent a table variable in an HTML document. So I’m going to show you how to take the reporting capabilities that MATLAB currently offers, combine them with some HTML 5 and JavaScript, and end up with a beautiful report that functions nicely as a standalone document.

Today there are basically two different ways to create HTML reports from MATLAB.

- MATLAB’s publish command
- The MATLAB Report Generator

The publish command ships with MATLAB, while the Report Generator is a product dedicated to the creation of sophisticated reports from MATLAB. Beyond this, the key difference between the two methods really comes down to when you want to deploy your algorithm. That is, do you want the report to update itself in real time? If you want to create a standalone executable, you can really only use the Report Generator.

I’m going to focus on MATLAB’s publish command. Suppose you want to pass a lot of MATLAB-generated HTML into your published document. It turns out that you can use the disp command to add that HTML code into the generated document. However it is not really easy to pass arguments, as this HTML is hard coded at publish time. To get around this I created a function (makeHTMLTableFilter) that creates HTML code that, when published, is also dynamic and responsive. It achieves this by adding Javascript code. This allows you to do things like filter the table interactively.

See the example below:

This is a very nice way of sharing your results in a form of a table while allowing others to filter the fields that they wish.

One question that I get is, “Okay, but how would I deploy this?” As it happens, this idea of using

`disp(makeHTMLTableFilter(mytable))`

is also supported in the MATLAB Report Generator.

So download makeHTMLTableFilter and give it a try. You’ll soon be making interactive reports of your own.

]]>Here’s a quick example to show you how it works. Let’s suppose you have some data. Say, the per capita consumption of margarine in the US (in pounds). Mmmm. Margarine.

`yrs = 2000:2009;`

margarineConsumption = [8.2 7 6.5 5.3 5.2 4 4.6 4.5 4.2 3.7];

plot(yrs,margarineConsumption)

As soon as you see the margarine data, you think to yourself “Wait a second! Where have I seen that trend before?” A moment of reflection, and suddenly it hits you: Maine divorce rates! That margarine plot is a dead ringer for divorce rates in Maine. Let’s do a plotyy.

`maineDivorceRate = [5 4.7 4.6 4.4 4.3 4.1 4.2 4.2 4.2 4.1];`

plotyy(yrs,margarineConsumption,yrs,maineDivorceRate)

Sure enough, the fit is remarkable*. Now let’s say we want to share our plot as an interactive web page. After adding the Plotly API code to your path, just type

`fig2plotly`

and you get a link to a web page like this:

Cool! Follow the link to try out the interaction.

* The remarkable margarine-divorce correlation was unearthed by Tyler Vigen on his fascinating Spurious Correlations page.

]]>I look up at the sky just after sunset and I see an especially bright star. It's probably a planet. But which one?

This question gives me a good opportunity to play around with MATLAB. Let's do a visualization that shows where the planets are relative to the earth and the sun. In the process, we'll use JSON services, the File Exchange, MATLAB graphics, and 3-D vector mathematics cribbed from Wikipedia.

Here is the basic grade-school illustration of the solar system, the one that shows the planets rolling around the sun like peas on a plate. For simplicity, we're just showing the sun, the earth, the moon, Venus, and Mars.

But we never see anything like this with our own eyes. Instead, we see bright spots on a dark background somewhere "up there." So let's simplify our problem to determining what direction each naked-eye planet is in. This leads to an image like this.

Our goal is to make an accurate up-to-date version of this diagram. Specifically, relative to the sun, where should we look to find the moon and the naked-eye planets (Mercury, Venus, Mars, Jupiter, and Saturn)? To do this, we need to solve a few problems.

- Find the planets
- Find the unit vector pointing from earth to each planet
- Squash all these vectors onto a single plane
- Visualize the resulting disk

First of all, where are the planets? There's a free JSON service for just about everything these days. I found planetary data hosted on Davy Wybiral's site here:

http://davywybiral.blogspot.com/2011/11/planetary-states-api.html

```
url = 'http://www.astro-phys.com/api/de406/states?bodies=sun,moon,mercury,venus,earth,mars,jupiter,saturn';
json = urlread(url);
```

Now we can use Joe Hicklin's JSON parser from the File Exchange. It returns a well-behaved MATLAB structure.

data = JSON.parse(json)

data = date: 2.4568e+06 results: [1x1 struct] unit: 'km'

The payload is in the "results" field. Each entry has three position components and three velocity components.

data.results

ans = mercury: {{1x3 cell} {1x3 cell}} sun: {{1x3 cell} {1x3 cell}} moon: {{1x3 cell} {1x3 cell}} jupiter: {{1x3 cell} {1x3 cell}} mars: {{1x3 cell} {1x3 cell}} earth: {{1x3 cell} {1x3 cell}} venus: {{1x3 cell} {1x3 cell}} saturn: {{1x3 cell} {1x3 cell}}

The distances are in kilometers, and I'm not even sure how this representation is oriented relative to the surrounding galaxy. But it doesn't really matter, because all I care about is the relative positions of the bodies in question.

Side note: We could also have used the Aerospace Toolbox to get the same information.

`[pos,vel] = planetEphemeris(juliandate(now),'Sun','Earth')`

% List of bodies we care about ssList = {'sun','moon','mercury','venus','earth','mars','jupiter','saturn'}; ss = []; for i = 1:length(ssList) ssObjectName = ssList{i}; ss(i).name = ssObjectName; ssData = data.results.(ssObjectName); ss(i).position = [ssData{1}{:}]; ss(i).velocity = [ssData{2}{:}]; end

% Plot in astronomical units au = 149597871; k = 5; cla for i = 1:length(ss) p = ss(i).position/au; line(p(1),p(2),p(3), ... 'Marker','.','MarkerSize',24) text(p(1),p(2),p(3),[' ' ss(i).name]); end view(3) grid on axis equal

This is accurate, but not yet very helpful. Let's now calculate the geocentric position vectors of each planet. To do this, we'll put the earth at the center of the system. Copernicus won't mind, because A) he's dead, and B) we admit this reference frame orbits the sun.

We're also going to use another file from the File Exchange. Georg Stillfried's mArrow3 will help us make nice 3-D arrows in space.

clf pEarth = ss(5).position; for i = 1:length(ss) % pe = position relative to earth % (i.e. a vector pointing from earth to body X) pe = ss(i).position - pEarth; % pne = normalized position relative to earth pne = pe/norm(pe); ss(i).pne = pne; mArrow3([0 0 0],pne, ... 'stemWidth',0.01,'FaceColor',[1 0 0]); text(pne(1),pne(2),pne(3),ss(i).name, ... 'HorizontalAlignment','center'); hold on end light hold off axis equal axis off axis([-1.2 1.2 -0.8 1.1 -0.2 0.8])

These are unit vectors pointing out from the center of the earth towards each of the other objects. It's a little hard to see, but these vectors are all lying in approximately the same plane.

If we change our view point to look at the system edge-on, we can see the objects are not quite co-planar. For simplicity, let's squash them all into the same plane. For this, we'll use the plane defined by the earth's velocity vector crossed with its position relative to the sun. This defines "north" for the solar system.

pEarth = ss(5).position; pSun = ss(1).position; vEarth = ss(5).velocity; earthPlaneNormal = cross(vEarth,pSun - pEarth); % Normalize this vector earthPlaneNormalUnit = earthPlaneNormal/norm(earthPlaneNormal); mArrow3([0 0 0],earthPlaneNormalUnit, ... 'stemWidth',0.01,'FaceColor',[0 0 0]); view(-45,15) axis([-1.2 1.2 -0.8 1.1 -0.2 0.8])

Now we project the vectors onto the plane defined by earth's motion around the sun. I learned what I needed from Wikipedia here: Vector Projection.

Since we are working with the normal, we are technically doing a vector rejection. Using Wikipedia's notation, this is given by

$$ \mathbf{a_2} = \mathbf{a} - \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{b} \cdot \mathbf{b}} \mathbf{b} $$

hold on for i = 1:length(ss) pne = ss(i).pne; pneProj = pne - dot(pne,earthPlaneNormalUnit)/dot(earthPlaneNormalUnit,earthPlaneNormalUnit)*earthPlaneNormalUnit; ss(i).pneProj = pneProj; mArrow3([0 0 0],pneProj, ... 'stemWidth',0.01,'FaceColor',[0 0 1]); end hold off axis equal

We're close to the end now. Let's just calculate the angle between the sun and each element. Then we'll place the sun at the 12:00 position of our planar visualization and all the other planets will fall into place around it.

Calculate the angle between the sun and each of the bodies. Again, from the Wikipedia article, we have

$$ cos \theta = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}||\mathbf{b}|} $$

sun = ss(1).pneProj; ss(1).theta = 0; for i = 1:length(ss) pneProj = ss(i).pneProj; cosTheta = dot(sun,pneProj)/(norm(sun)*norm(pneProj)); theta = acos(cosTheta); % The earth-plane normal vector sticks out of the plane. We can use it % to determine the orientation of theta x = cross(sun,pneProj); theta = theta*sign(dot(earthPlaneNormalUnit,x)); ss(i).theta = theta; end

cla k1 = 1.5; k2 = 1.2; for i = 1:length(ss) beta = ss(i).theta + pi/2; x = cos(beta); y = sin(beta); mArrow3([0 0 0],[x y 0], ... 'stemWidth',0.01, ... 'FaceColor',[0 0 1]); line([0 k1*x],[0 k1*y],'Color',0.8*[1 1 1]) text(k1*x,k1*y,ss(i).name,'HorizontalAlignment','center') end t = linspace(0,2*pi,100); line(k2*cos(t),k2*sin(t),'Color',0.8*[1 1 1]) line(0,0,1,'Marker','.','MarkerSize',40,'Color',[0 0 1]) axis equal axis(2*[-1 1 -1 1])

And there you have it: an accurate map of where the planets are in the sky for today's date. In this orientation, planets "following" the sun through the sky are on the left side of the circle. So Jupiter will be high in the sky as the sun sets.

And that is a very satisfying answer to my question, by way of vector math, JSON feeds, MATLAB graphics, and the File Exchange.

Get
the MATLAB code

Published with MATLAB® R2014a