Thailand Excellence Community

| Description | Code |
|---|---|
| The readtable function creates a table in MATLAB from a data file. Additional inputs can help import irregularly formatted files. | hurrs = readtable("hurricaneData1990s.txt","NumHeaderLines",5,"CommentStyle","##"); |
| You can use dot notation to access variables in a table. The scatter function creates a scatter plot of two vectors. | scatter(hurrs.Windspeed,hurrs.Pressure) |
| The categorical function creates a categorical array from data. | hurrs.Country = categorical(hurrs.Country); |
| By default, the readtable function may import certain variables in the table as datetime. In this example, hurrs.Timestamp is a datetime. | t = hurrs.Timestamp; |
| The hour function returns the hour numbers of the input datetime values. | h = hour(t); histogram(h) |
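
The file hurricaneData1990s.txt is not included here, so the following is only a minimal sketch of the same steps on made-up data. The table name `hurrsDemo` and all values are assumptions for illustration.

```matlab
% Hypothetical stand-in for the imported hurricane table (values are made up).
Timestamp = datetime(1995,8,[1;2;3],[6;14;14],0,0);   % 06:00, 14:00, 14:00
Windspeed = [80; 105; 95];
Pressure  = [980; 955; 965];
Country   = ["US"; "MX"; "US"];
hurrsDemo = table(Timestamp,Windspeed,Pressure,Country);

% Dot notation returns a table variable as an ordinary array.
w = hurrsDemo.Windspeed;

% Convert the text variable to categorical and list its unique categories.
hurrsDemo.Country = categorical(hurrsDemo.Country);
countries = categories(hurrsDemo.Country);

% Extract the hour of day from each datetime value.
h = hour(hurrsDemo.Timestamp);
```

The vector `h` could then be passed to histogram, as in the last row of the table.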

| Description | Code | Output |
|---|---|---|
| When you import data into MATLAB, missing numerical values are replaced with NaN, which stands for Not a Number. | data = readtable("myfile") | data = 4×2 table with Var1 = 7, 1, 9, 10 and Var2 = 0.81, NaN, 0.13, 0.91 |
| When you calculate statistics on arrays that contain NaNs, the result is another NaN. | v = mean(data.Var2) | v = NaN |
| To ignore NaNs in the calculation, use the "omitnan" flag. | v = mean(data.Var2,"omitnan") | v = 0.6167 |
| You can delete rows containing missing data with rmmissing. | cleaned = rmmissing(data) | cleaned = 3×2 table with Var1 = 7, 9, 10 and Var2 = 0.81, 0.13, 0.91 |
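
Because "myfile" is not available, here is a small sketch that builds the same 4×2 table by hand and repeats the three operations. Constructing the table inline is an assumption made only so the example runs on its own.

```matlab
% Rebuild the example table directly instead of reading "myfile".
data = table([7;1;9;10],[0.81;NaN;0.13;0.91], ...
    'VariableNames',{'Var1','Var2'});

% A plain mean propagates the NaN.
vAll = mean(data.Var2);

% "omitnan" skips missing values: (0.81 + 0.13 + 0.91)/3.
vValid = mean(data.Var2,"omitnan");

% Remove every row that contains a missing value.
cleaned = rmmissing(data);
```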

| Description | Code | Output |
|---|---|---|
| Categorical arrays use less memory and work with many plotting functions. | x = categorical(["medium" "large" "large" "red" "small" "red"]); | |
| Use the categories function to get a list of unique categories. | c = categories(x) | c = 4×1 cell array: {'large'}, {'medium'}, {'red'}, {'small'} |
| Merge different categories with the mergecats function. | x = mergecats(x,["small" "medium" "large"],"size") | x = 1×6 categorical array: size size size red size red |
| Rename categories with the renamecats function. | x = renamecats(x,"red","color") | x = 1×6 categorical array: size size size color size color |
Ranges in continuous data can represent categories. Categorize continuous data into discrete bins with the discretize function.
>> y = discretize(X,edges,"Categorical",cats)
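
As a hedged illustration of this syntax, the sketch below bins a few made-up values into three named ranges. The data, the edge values, and the category names are all assumptions.

```matlab
% Made-up continuous data and bin edges (three bins from four edges).
X     = [3 12 27 45 8];
edges = [0 10 30 Inf];            % Inf leaves the last bin open-ended
cats  = {'low','medium','high'};  % one name per bin

% Without the option, the result is the numeric bin index of each element.
binIdx = discretize(X,edges);

% With the option, the result is a categorical array using the bin names.
y = discretize(X,edges,"Categorical",cats);
```

Each bin includes its left edge, so the value 12 falls in the second bin, which spans 10 up to (but not including) 30.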

| Input or output | Description |
|---|---|
| y | If the "Categorical" option is set, y is a categorical array. Otherwise, y contains numeric bin values. |
| X | Array of continuous data. X is usually numeric or datetime. |
| edges | Consecutive elements in edges form discrete bins. There is one fewer bin than the number of edges specified. You can use inf in edges to create a bin with no edge. |
| "Categorical",cats | Optional input for the name of each bin category. |

As a reminder, a plot created with default properties, such as plot(x,y), has no markers and a blue line.
A datastore is a reference to a file or set of files. The datastore function tells MATLAB where to find the files.
| Code | Description |
|---|---|
| ds = datastore(filename) | Reference a single file |
| ds = datastore(directory) | Reference a folder of files |
| data = read(ds) | Read data incrementally |
| data = readall(ds) | Read all data referenced in datastore |
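
The sketch below shows how these calls might fit together. It writes two small CSV files to a temporary folder so that it runs on its own. The file names and contents are made up, and `hasdata` and `reset` are datastore functions not listed in the table above.

```matlab
% Create a temporary folder holding two small, made-up CSV files.
folder = tempname;
mkdir(folder);
writetable(table((1:3)',[10;20;30],'VariableNames',{'A','B'}), ...
    fullfile(folder,'part1.csv'));
writetable(table((4:5)',[40;50],'VariableNames',{'A','B'}), ...
    fullfile(folder,'part2.csv'));

% Reference the whole folder; no data is read yet.
ds = datastore(folder);

% Read incrementally and accumulate a running total of B.
total = 0;
while hasdata(ds)
    chunk = read(ds);
    total = total + sum(chunk.B);
end

% Start over and read everything in one call.
reset(ds);
allData = readall(ds);

% Remove the temporary files.
rmdir(folder,'s');
```

Incremental reading is the point of a datastore: the loop only ever holds one chunk in memory, whereas readall requires the full data set to fit.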

If your data isn’t formatted the way datastore expects, you can set the datastore properties. Examples of common properties are shown below. You can find all the properties in the documentation.
>> ds = datastore(filename,"Delimiter","-","TextscanFormats","%D%C%f","SelectedVariableNames",var)

| Input or output | Description |
|---|---|
| ds | Reference to a collection of data. |
| filename | File location. |
| "Delimiter","-" | Delimiter is one or more characters that separate data values in the file. |
| "TextscanFormats","%D%C%f" | Import variables using the output class in the format specification string. |
| "SelectedVariableNames",var | Import only the variables listed in var. |
Once you read in multiple tables, you may want to join them together. You can join two tables in many ways. The various join functions are listed in the table below.

| Function | Notes |
|---|---|
| join | Key1 in Tright must have unique values and contain every key in Tleft. |
| innerjoin | |
| outerjoin | Two key variables are created. |
| outerjoin with "MergeKeys" on | |
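
To make the differences concrete, here is a minimal sketch using two made-up tables that share the key variable Key1. The table contents and the variable names Lval and Rval are assumptions.

```matlab
% Two small tables sharing the key variable Key1 (made-up values).
Tleft  = table([1;2;3],["a";"b";"c"],'VariableNames',{'Key1','Lval'});
Tright = table([2;3;4],[20;30;40],'VariableNames',{'Key1','Rval'});

% join needs every left key to exist in Tright, so key 1 is dropped first.
Tleft23 = Tleft(2:3,:);
Tjoin = join(Tleft23,Tright);

% innerjoin keeps only the keys present in both tables (2 and 3).
Tinner = innerjoin(Tleft,Tright);

% outerjoin keeps all keys (1 to 4) and creates one key variable per table.
Touter = outerjoin(Tleft,Tright);

% With "MergeKeys" on, the two key variables are combined into one.
Tmerged = outerjoin(Tleft,Tright,"MergeKeys",true);
```

Calling join(Tleft,Tright) directly would fail here because key 1 has no match in Tright.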

| Description | Code | Output |
|---|---|---|
| The table petdata has two categorical variables, Species and Color. Using these two variables, there are five potential groups: orange cat, orange fish, black cat, black fish, and white cat. | petdata = readtable("petdata.txt","Format","%C%C%f") | 5×3 table with Species = cat, fish, cat, cat, fish; Color = orange, orange, black, white, black; Weight = 12, 0.68, 14, 8, 0.54 |
| The findgroups function returns a group number for each element in an array. The second output is the name associated with each group number. Here, the value 1 means cat. | [grpS,speciesVals] = findgroups(petdata.Species) | grpS = 1 2 1 1 2; speciesVals = 2×1 categorical array: cat, fish |
| The splitapply function performs a calculation on each group. You can interpret this code as “What is the average weight of each species?” | splitapply(@mean,petdata.Weight,grpS) | ans = 11.3333 0.6100 |
| findgroups and splitapply are commonly used together. This code answers “What is the minimum weight of each color?” Notice that grpC has values 1, 2, and 3 because there are three different colors in the data. colorVals contains the meaning of each value. | [grpC,colorVals] = findgroups(petdata.Color), splitapply(@min,petdata.Weight,grpC) | grpC = 2 2 1 3 1; colorVals = 3×1 categorical array: black, orange, white; ans = 0.5400 0.6800 8.0000 |
| accumarray calculates a value for every species-color combination, including the five groups present in the data. The first input is an array containing both sets of group numbers. The first vector (grpS) corresponds to the output rows, and the second vector (grpC) corresponds to the output columns. Notice that the element in the second row, third column (white fish) is 0 because there’s no data in that group. | maxWeight = accumarray([grpS grpC],petdata.Weight,[],@max) | maxWeight = [14.0000 12.0000 8.0000; 0.5400 0.6800 0] |
| The output of accumarray can be difficult to interpret on its own, but the format is convenient for visualizations or further processing. For example, the output can be passed directly to the bar function. | bar(maxWeight); xticklabels(speciesVals); ylabel("Weight"); legend(colorVals) | |
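
Since petdata.txt is not included, the following sketch rebuilds the same five rows inline and repeats the grouping steps. Building the table by hand is an assumption made only so the example is self-contained.

```matlab
% Recreate the petdata table from the values shown above.
Species = categorical(["cat";"fish";"cat";"cat";"fish"]);
Color   = categorical(["orange";"orange";"black";"white";"black"]);
Weight  = [12; 0.68; 14; 8; 0.54];
petdata = table(Species,Color,Weight);

% Group numbers and the category each number stands for.
[grpS,speciesVals] = findgroups(petdata.Species);
[grpC,colorVals]   = findgroups(petdata.Color);

% Average weight of each species.
meanBySpecies = splitapply(@mean,petdata.Weight,grpS);

% Maximum weight per species (rows) and color (columns).
% Combinations with no data, such as white fish, are filled with 0.
maxWeight = accumarray([grpS grpC],petdata.Weight,[],@max);
```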
All graphics objects are part of a hierarchy. Most graphics consist of a figure window containing one or more axes, each of which contains any number of plot objects.
You can use the graphics object hierarchy to modify specific graphics objects after a plot is created.

If you stored a handle to the figure, you could follow the Children property down the hierarchy, from the figure to its axes, to reach and modify the bar plot.
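
A minimal sketch of that traversal, assuming a figure that contains a single axes with a single bar plot. The data and the chosen color are made up, and the figure is kept invisible so the code can also run without a display.

```matlab
% Create a figure with one bar plot and keep only the figure handle.
fig = figure("Visible","off");
bar([1 2 3]);

% Walk down the hierarchy: figure -> axes -> bar object.
ax     = fig.Children(1);
barObj = ax.Children(1);

% Modify the bar plot through the handle found via Children.
barObj.FaceColor = [1 0 0];   % red
newColor = barObj.FaceColor;

close(fig);
```

With several axes or plot objects, Children returns an array, so indexing by position becomes fragile; storing handles when the objects are created is usually more robust.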

| Description | Code | Output |
|---|---|---|
| Images or 3-D plots generally begin with x, y, and z data. In many cases, the x and y data are not evenly spaced on a grid. | data = readtable("my3Ddata"); plot3(data.x,data.y,data.z,'.') | Table with variables x, y, and z. First rows: (2.2506, -0.30105, 0.012974), (-1.3443, -0.79976, -0.11638), (0.53421, -0.92891, 0.16945), (-0.070088, -0.67461, -0.044245), … |
| To interpolate the data onto a grid, start by defining the grid points. Here, yvec is denser than xvec. | xvec = -2:.2:2; yvec = -2:.05:2; | |
| The meshgrid function converts your vectors into the grid expected by surf and pcolor. | [xgrid,ygrid] = meshgrid(xvec,yvec); | |
| Then use the griddata function to interpolate your data onto the grid. Naming your variables consistently in the previous steps makes the griddata syntax easier to follow. | zgrid = griddata(data.x,data.y,data.z,xgrid,ygrid); | |
| Once your x, y, and z data is gridded, you can visualize it in a variety of ways. surf creates a surface plot. Notice the difference between the x and y axes. This is because xvec and yvec have a different number of grid points. | surf(xgrid,ygrid,zgrid); | |
| You can also visualize your 3-D data as a pseudocolor image. | im = pcolor(xgrid,ygrid,zgrid); im.EdgeAlpha = 0; | |
| This scaled image contains the same data, but the first two inputs are the vectors of grid points instead of the output from meshgrid. The imagesc plot is vertically flipped relative to the pcolor plot because imagesc reverses the direction of the y-axis. | imagesc(xvec,yvec,zgrid); | |
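
The file my3Ddata is not available, so this sketch applies the same gridding steps to synthetic scattered points. The sample function and the number of points are assumptions.

```matlab
% Synthetic scattered data standing in for my3Ddata (made-up surface).
rng(0);                          % reproducible random points
x = 4*rand(200,1) - 2;
y = 4*rand(200,1) - 2;
z = x.*exp(-x.^2 - y.^2);

% Define the grid points; yvec is denser than xvec.
xvec = -2:.2:2;                  % 21 points
yvec = -2:.05:2;                 % 81 points
[xgrid,ygrid] = meshgrid(xvec,yvec);

% Interpolate the scattered values onto the grid.
% Grid points outside the convex hull of the data become NaN.
zgrid = griddata(x,y,z,xgrid,ygrid);

gridSize = size(zgrid);          % rows follow yvec, columns follow xvec
```

The resulting `xgrid`, `ygrid`, and `zgrid` can be passed to surf or pcolor exactly as in the table above.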
To import data from files where the formatting changes and must be inferred from the data itself, you can use functions that allow you to interact directly with files.

| Description | Code | Output |
|---|---|---|
| Open the file and store the file identifier. You’ll use fid with the other low-level import functions. | fid = fopen("myfile"); | |
| You can import files line by line using fgetl. A file position indicator keeps track of where you are in the file, so calling fgetl twice returns the first two lines. | firstLine = fgetl(fid), secondLine = fgetl(fid) | firstLine = '09/12/2005 Level1 12.34 45 1.23e10 inf'; secondLine = '10/12/2005 Level2 23.54 60 9e19 -inf 0.001' |
| To return to the beginning of the file, you can rewind the file position indicator. | frewind(fid) | |
| If you know the format of the data, you can pass a format specification string to textscan. | formatSpec = "%{MM/dd/uuuu}D %s %f32 %d8 %u %f"; myData = textscan(fid,formatSpec) | myData = 1×6 cell array: {3×1 datetime} {3×1 cell} {3×1 single} {3×1 int8} {3×1 uint32} {3×1 double} |
| When you’re finished importing, make sure you close the file connection. | fclose(fid); | |
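
Here is a self-contained sketch of the same open, read, rewind, scan, and close sequence. It first writes a small temporary file. The file contents and the shortened three-column format are assumptions, not the original "myfile".

```matlab
% Write a small, made-up text file to read back.
fname = [tempname '.txt'];
fid = fopen(fname,"w");
fprintf(fid,"09/12/2005 Level1 12.34\n10/12/2005 Level2 23.54\n");
fclose(fid);

% Open for reading and fetch the first line.
fid = fopen(fname);
firstLine = fgetl(fid);

% Rewind so textscan starts from the top of the file.
frewind(fid);
C = textscan(fid,"%{MM/dd/uuuu}D %s %f");

% Always close the file, then remove the temporary copy.
fclose(fid);
delete(fname);
```

textscan returns one cell per conversion specifier, so `C` is a 1×3 cell array holding a datetime column, a cell array of text, and a double column.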