
Presentation and options

This application generates a raster in which each cell value is obtained by interpolating between the values of a numerical field in the database of a points file, or between the values of the third coordinate in the case of 3D files (where Z can be topographic elevation, but also another variable, such as temperature, concentration of a pollutant, etc.). The generated raster corresponds to a continuous surface; therefore, it would be meaningless to use this method to obtain a raster of categorical values (for that purpose, Thiessen polygons can be generated with the application of that name, in the "Tools | Terrain interpolation and analysis" menu). The points file must be a structured file (PNT format). Some points in the dataset can be excluded according to various global criteria (through selections by attribute, explained below) and local criteria. It is also possible to interpolate between points with more than one record in the database (e.g., water analyses made on different dates at the same place); in this case the application works according to the options explained later in "Advanced options".

As with most calculation procedures, interpolation between points can give widely different results depending on the options chosen in the application. The chosen method, the threshold distance, the weighting applied, the use of exclusion masks, etc., can improve results enormously. Knowledge of the data to be processed and trial runs with varying parameters are a great help in deciding how best to set these parameters. Furthermore, if possible, it is useful to hold back a set of randomly distributed and independent points in order to carry out a quantitative evaluation (test) of the results.
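The hold-out evaluation suggested above can be sketched in Python as follows. This is a minimal illustration, not part of InterPNT: the names `split_holdout` and `rmse`, the 20 % test fraction and the use of the root mean square error as the quality measure are all assumptions. In practice, `predict` would read the value of the raster interpolated from the training points at each held-out location.

```python
import math
import random


def split_holdout(points, test_fraction=0.2, seed=0):
    """Randomly split (x, y, value) tuples into training and test sets."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


def rmse(predict, test_points):
    """Root mean square error of predict(x, y) against the held-out values."""
    squared = [(predict(x, y) - value) ** 2 for x, y, value in test_points]
    return math.sqrt(sum(squared) / len(squared))
```

Comparing this error across trial runs with different methods or parameters gives a quantitative basis for choosing among them.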

In quantile calculations, such as the median, the modifier /MEDIANA_EMPAT= indicates the type of tiebreaker to be used when the position of the quantile falls between two values of the series. For more information, see the general syntax document.

In the current version there are five possible interpolation methods: inverse distance, spline functions, trend surfaces, kriging and neighborhood statistics.

For the inverse distance method, the interpolation of each point (cell) is carried out by assigning weights to the values of the sample points in inverse proportion to the distance that separates each sample point from the center of the pixel. This distance can be calculated as a Euclidean distance, or using the simplified (Manhattan/Taxi Cab) distance. For more information, please consult https://en.wikipedia.org/wiki/Inverse_distance_weighting.

Calculation of the interpolated value using the inverse distance method (β is the power of the inverse of the distance)
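For reference, the usual textbook form of inverse distance weighting, which matches the description above, can be written as follows. The symbols are assumptions chosen for illustration: z_i is the value of sample point i, d_i is its distance to the center of the cell, n is the number of points considered and β is the power of the inverse of the distance.

```latex
\hat{z} = \frac{\displaystyle\sum_{i=1}^{n} \frac{z_i}{d_i^{\beta}}}
               {\displaystyle\sum_{i=1}^{n} \frac{1}{d_i^{\beta}}}
```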

For the spline method, a regular, continuous and differentiable function (or a set of functions, if the area is divided into different regions) is determined that best fits the sample points without losing its property of continuity. For more information on the analytical expression of this function consult:

Equations for the spline functions (T(x,y) is the flattening and φ is the tension)

The trend surfaces method fits a polynomial function of X and Y, of degree 1, 2 or 3, which yields a smooth, continuous and differentiable surface. The fit is done by least squares, which determines the coefficient of each term of the polynomial.

Expressions for the linear (degree 1), quadratic (degree 2) and cubic (degree 3) polynomial functions
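In their general form, complete polynomials in X and Y of degree 1, 2 and 3 can be written as follows. The coefficient names a_0 ... a_9 and their ordering are assumptions made for illustration and may differ from the notation of the original figure.

```latex
\begin{aligned}
z_1(x,y) &= a_0 + a_1 x + a_2 y \\
z_2(x,y) &= z_1(x,y) + a_3 x^2 + a_4 x y + a_5 y^2 \\
z_3(x,y) &= z_2(x,y) + a_6 x^3 + a_7 x^2 y + a_8 x y^2 + a_9 y^3
\end{aligned}
```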

With the neighborhood statistics method, the designated statistic (mean, median, maximum, minimum, range, standard deviation, mean absolute deviation around the median, number of occurrences or percentile) is assigned to each cell. It is calculated from all the points that lie within a circle defined by a radius, or within a square defined by its side, centered on each cell to be interpolated. This differs from the other InterPNT methods, and in particular from the inverse distance method, which weights the points by a user-defined power of the inverse distance even when the calculation is limited to a certain radius (maximum distance). In the neighborhood statistics mode, once all the points within the indicated scope (circular or square) have been selected, no weighting is applied and the requested statistic is calculated directly. This statistic can be a measure of the centrality of the values (mean or median), a measure of their dispersion (maximum, minimum, range, standard deviation, mean absolute deviation around the median) or simply a count of the available values (number of occurrences).

This is a typical method for processing lidar data, and particularly for the production of Digital Surface Models (DSM) or Digital Height Models (DHM). In these two applications, the highest elevation obtained by the lidar in the raster cell area to be obtained is usually taken. This area can be the area of the cell if the point density is high enough in relation to the size of the cell, or somewhat larger (for example a radius 1.5 times the side of the cell); please consult the specific options below for more details.

If the point density is very high, a percentile (for example, the 90th) can be requested instead of the maximum, which reduces the probability of taking a sensor artifact, such as a bird flying at that moment, for the real surface. Other statistics, such as dispersion statistics, can provide interesting data on the vertical structure of vegetation masses. Finally, a Digital Elevation Model (DEM) can be obtained by choosing the minimum as the statistic, combined with the selection of points corresponding to unique returns (information available in one of the fields of the attribute table) and high return intensities (available in another field), and with very large search radii (several times the side of the output cell).

If a DEM is already available, or has been calculated from lidar data as just described, a DHM can be obtained (in which the heights of vegetation or human constructions are measured from the ground and not above sea level). In this case the most practical approach is, before running InterPNT, to run the CombiCap application in the form DEM+PNT-->PNT, with the statistical field transfer mode set to bilinear interpolation, so that the second PNT receives the ground elevation at each point. An empty numeric field is then created, typically with 2 or 3 decimal places, and filled using CalcImg with the point height (calculated by subtracting the elevation obtained from CombiCap from the altitude obtained from the lidar). This height is the value submitted to InterPNT to obtain the DHM.
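The arithmetic behind the DHM preparation can be sketched in Python as follows. This is not how CombiCap or CalcImg are implemented; it only mimics the two steps (bilinear sampling of the DEM at each lidar point, then subtraction). The grid layout (`dem[row][col]` holding the elevation at the cell center located at `x0 + col * cell`, `y0 + row * cell`) and the function names are assumptions.

```python
def bilinear(dem, x0, y0, cell, x, y):
    """Ground elevation at (x, y), bilinearly interpolated from a DEM grid.

    Assumed layout: dem[row][col] is the elevation at the cell center
    (x0 + col * cell, y0 + row * cell). Points outside the grid are
    extrapolated from the nearest 2 x 2 block of centers.
    """
    fx = (x - x0) / cell
    fy = (y - y0) / cell
    col = min(max(int(fx), 0), len(dem[0]) - 2)
    row = min(max(int(fy), 0), len(dem) - 2)
    tx = fx - col
    ty = fy - row
    lower = dem[row][col] * (1 - tx) + dem[row][col + 1] * tx
    upper = dem[row + 1][col] * (1 - tx) + dem[row + 1][col + 1] * tx
    return lower * (1 - ty) + upper * ty


def point_heights(dem, x0, y0, cell, lidar_points):
    """Replace the altitude of each (x, y, z) point by its height above ground."""
    return [(x, y, z - bilinear(dem, x0, y0, cell, x, y))
            for x, y, z in lidar_points]
```

The resulting heights play the role of the numeric field that is then interpolated to obtain the DHM.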

General procedure

First of all, it is necessary to choose whether to use the value of the third coordinate (in the case of 3D files) or to select the field to interpolate from the database associated with the points file. If a database field is chosen, it must be a numeric field. Although the values of this field can be integers, the interpolator treats them as double-precision real values in all internal calculations and generates an output raster of real values.

The selection of records is the first filter, which determines which records in the points database will be considered in the interpolation. The default option is 'All records', but the quality and speed of the interpolation can often be improved by excluding records that should not influence the result. The selection is made by constructing a logical sentence based on any combination of the fields in the database associated with the points file. Note that the fields involved in the selection do not have to be the field used in the interpolation. For example, in a file containing atmospheric emissions one field could indicate the district where each industry is located and, if we wish to interpolate values of a certain pollutant, we could choose to interpolate only the values for a particular district. We recommend constructing the logical sentence with the tool that opens from the 'Apply selection' option, which builds it from a selection of fields, operators, values, links and orders of priority using buttons and the corresponding pull-down menus. The elements of this sentence are detailed in the 'Syntax' section of the command line. This works in a similar way to queries on MiraMon attributes.

The output file will be an uncompressed real value raster (except when exclusion masks are defined, in which case it will be compressed and real). This raster will have a predefined background, or NoData, value in those cells for which no value could be calculated because the criteria of the 'Advanced Options' could not be satisfied at that location, either because the cell was inside an exclusion mask or because it did not have the minimum number of points within the maximum allowable distance.

It is necessary to define the extent and cell size for the output raster by either giving the corner co-ordinates or by adjusting the extent to the area covered by the file with the points. It should be noted that the speed of the process is very sensitive to these values. A small cell size in a large interpolation area without any defined mask will give very precise values (that vary smoothly as a function of distance), but the number of calculations will be very large and the execution time will be long.

Inverse distance options

It is first necessary to define how distances are calculated: as a Euclidean distance or as a 'Manhattan' distance (see the previous section for both distances). The value assigned to each cell is a weighted average of the values of the points under consideration. The weight given to each point is proportional to the inverse of a power of the distance between the point and the center of the pixel. The default value of this exponent is 2, but the user can change it as appropriate. The larger the value, the more weight is given to the points nearest to the pixel and the less weight to more distant points.
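The weighting just described can be sketched in Python as follows. This is an illustrative sketch, not the InterPNT implementation: the function name `idw`, the handling of a sample point that coincides with the cell center (its value is returned directly) and the return of `None` when there are no points are assumptions.

```python
def idw(points, cx, cy, exponent=2.0, manhattan=False):
    """Inverse distance weighted value at the cell center (cx, cy).

    points is a list of (x, y, value) tuples. Distances are Euclidean by
    default, or Manhattan (|dx| + |dy|) when manhattan is True.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in points:
        dx, dy = abs(x - cx), abs(y - cy)
        dist = dx + dy if manhattan else (dx * dx + dy * dy) ** 0.5
        if dist == 0.0:
            # Assumption: a sample exactly at the center fixes the result.
            return value
        weight = 1.0 / dist ** exponent
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        return None  # no points available (assumed NoData marker)
    return weighted_sum / weight_total
```

With points at distances 1 and 2 and values 10 and 40, an exponent of 2 gives weights 1 and 0.25 and a result of 16, while an exponent of 1 gives 20: raising the exponent pulls the result toward the nearest point.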

The maximum distance is a commonly used parameter. It establishes the region of influence of each point as a circle centered on the point and with a radius equal to the maximum distance. Thus, for each cell we have a distance beyond which the original points play no part in the interpolation process. It is advisable to always use this parameter since, apart from intuitively making sense in the interpolation process, it also allows the execution time to be reduced considerably.

Another interesting criterion is to define a maximum number of points to use in the interpolation. Evidently these will always be the 'n' points nearest to the center of the pixel. The repeated ranking process needed to apply this criterion slows down the calculation considerably, but the use of this criterion assures that only the 'n' nearest points are taken into account. To make sure that the calculation is done with a sufficient sample it is also possible to define a minimum number of points for the interpolation at each cell. If this minimum is not reached the cell's value will be NoData.
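The three local criteria (maximum distance, maximum number of points and minimum number of points) can be sketched in Python as a filter applied before the weighting. This is an assumption-laden illustration: the function name, the use of Euclidean distance, the inclusive comparison with the maximum distance and the use of `None` as the NoData marker are not taken from InterPNT.

```python
import math

NODATA = None  # assumed marker for cells that cannot be calculated


def select_neighbors(points, cx, cy, max_dist=None, max_points=None,
                     min_points=1):
    """Return the (distance, value) pairs usable at the cell center, or NODATA.

    points is a list of (x, y, value) tuples.
    """
    candidates = []
    for x, y, value in points:
        dist = math.hypot(x - cx, y - cy)
        if max_dist is None or dist <= max_dist:
            candidates.append((dist, value))
    # Ranking by distance is what makes the 'n nearest points' criterion costly.
    candidates.sort(key=lambda pair: pair[0])
    if max_points is not None:
        candidates = candidates[:max_points]
    if len(candidates) < min_points:
        return NODATA
    return candidates
```

The sort is repeated for every cell, which illustrates why limiting the number of points slows the calculation, whereas the maximum distance only discards points.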

Spline options

Interpolation by splines can be very sensitive to the configuration parameters, depending on the sample points. The tension parameter modulates the smoothing of the function; its default value is 40, and excessively high or low tensions can lead to unexpected effects. The flattening parameter adjusts for the noise of the sample; its default value is 0, which implies that the function passes exactly through the sample points (exact interpolation). The analysis block size parameter limits the fit by dividing the area into analysis blocks and restricting the solution of the spline functions to each block. This parameter speeds up the process considerably. It is recommended when there are many points available, but it can produce unwanted discontinuities if the block size is too small (analysis window too local).

The spline interpolation method works optimally for samples without excessive disturbances. However, when points are close to each other and the variable to be interpolated varies abruptly between them, this method may produce values well outside the expected range. In this case, it is recommended to increase the tension value and to choose the complementary option that forces the output results to lie between a minimum and a maximum. To do so, it is necessary to define the modifiers /MIN_VAL and /MAX_VAL.

If the analysis block size is not defined, a single solution is found for the whole region covered by the raster (the window is the same as the raster extent). If the size is small the area is divided into many independent regions.

Options for trend surfaces

The main parameter of this option is the degree that determines the order of the polynomial to adjust. In the current version it is possible to generate first-degree or linear, second-degree or quadratic and third-degree or cubic surfaces.

As with splines, it is possible to generate different solutions, not strictly continuous, by dividing the global scope into more local sub-regions. These sub-regions are defined by the analysis block side parameter.
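A least-squares fit of a trend surface of a given degree can be sketched in pure Python as follows. This is only an illustration of the technique, not the InterPNT implementation: the function names, the ordering of the polynomial terms and the use of the normal equations solved by Gaussian elimination are assumptions, and the sketch fails with a division by zero if the points do not determine the surface (for example, collinear points).

```python
def poly_terms(x, y, degree):
    """Monomials x**i * y**j with i + j <= degree: 1, x, y, x*x, x*y, y*y, ..."""
    return [x ** (total - j) * y ** j
            for total in range(degree + 1)
            for j in range(total + 1)]


def solve(a, b):
    """Solve the linear system a * u = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * u[c] for c in range(r + 1, n))
        u[r] = (m[r][n] - known) / m[r][r]
    return u


def fit_trend(points, degree):
    """Least-squares coefficients of the polynomial surface through (x, y, z)."""
    rows = [poly_terms(x, y, degree) for x, y, _ in points]
    values = [z for _, _, z in points]
    k = len(rows[0])
    # Normal equations: (A^T A) coeffs = A^T z
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atz = [sum(r[i] * z for r, z in zip(rows, values)) for i in range(k)]
    return solve(ata, atz)


def eval_trend(coeffs, x, y, degree):
    """Value of the fitted surface at (x, y)."""
    return sum(c * t for c, t in zip(coeffs, poly_terms(x, y, degree)))
```

Fitting one surface per analysis block, instead of one for the whole extent, would correspond to the local sub-regions mentioned above.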

Kriging options

The kriging interpolation method is based on the principles of geostatistics. In order to apply it, it is necessary to have previously generated an adjusted variogram file (with the extension "vam"). This file can be generated with the Vargram application, in which help instructions can be found about how to model the spatial pattern of the variable to be analyzed that gives rise to the variogram that will be applied for interpolation with this method. In the advanced options of InterPNT, it is also possible to limit the number of points closest to each cell that participate in its resulting value, as well as obtain an auxiliary raster of the quality of the estimate, where the standard deviations for each cell will be stored. More information about kriging can be found at https://en.wikipedia.org/wiki/Kriging.

Options for neighborhood statistics

Two parameters must be defined: the statistical function to be used, and the maximum distance at which a point is considered to be part of the set of points that provide the values for the calculation of the statistics to be transferred to the cell of the raster output.

The statistical functions available are the mean, the median, the maximum, the minimum, the range, the standard deviation, the mean absolute deviation around the median and the number of occurrences.

For the maximum distance there are two possible options: maximum distance as the maximum radius of a circle, and maximum distance as half the side of a square. The latter option is typically used when the lidar points from which the statistic is calculated must lie strictly within the cell; note that, since the value given is half the side of the square, a raster with a 2-meter cell side requires a maximum distance of 1 meter.

When the calculated statistic comes from a single point, as in the case of the minimum and the maximum, additional layers can be generated (with the same name as the output layer and an appropriate suffix). On the one hand, if the lidar data provides the day, month and year of capture of each point in separate fields, 3 additional rasters can be generated containing the day, month and year of the point that provides the value for each pixel. On the other hand, for educational and research purposes, additional layers can be generated that contain only the donor points, or that connect the donor points with lines to the center of the cell that receives the value.
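The two ways of delimiting the neighborhood, and the unweighted calculation of the statistic, can be sketched in Python as follows. The function names, the dictionary of statistics (only a subset of those listed above), the inclusive comparisons at the boundary and the use of `None` for empty neighborhoods are assumptions made for illustration.

```python
import math
import statistics


def neighborhood_values(points, cx, cy, max_dist, square=False):
    """Values of the (x, y, value) points inside the circle or square scope."""
    if square:
        # max_dist is half the side of a square centered on (cx, cy)
        return [v for x, y, v in points
                if abs(x - cx) <= max_dist and abs(y - cy) <= max_dist]
    # max_dist is the radius of a circle centered on (cx, cy)
    return [v for x, y, v in points
            if math.hypot(x - cx, y - cy) <= max_dist]


STATISTICS = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "minimum": min,
    "maximum": max,
    "range": lambda values: max(values) - min(values),
}


def neighborhood_statistic(points, cx, cy, max_dist, stat="maximum",
                           square=False):
    """Unweighted statistic of the points in the neighborhood of a cell center."""
    values = neighborhood_values(points, cx, cy, max_dist, square)
    if stat == "count":
        return len(values)
    if not values:
        return None  # assumed NoData marker
    return STATISTICS[stat](values)
```

For a cell of 2 m side centered on (1, 1), calling the function with `max_dist=1` and `square=True` restricts the calculation to the points that fall inside the cell itself.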

Advanced options

The advanced options allow finer adjustment of the parameters of some of the global interpolation procedures.

For the case where there is more than one record associated with a chosen point (e.g., where there are various measurements at different times at the same water source), we can choose how to interpolate the value from this point: The options are a) ignore this point as a multiple record (as though it did not form part of the selection); b) select the value of the first record that satisfies the selection conditions; c) take the average of the records; d) take the sum of the values.
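The four options for points with multiple records can be sketched in Python as follows. The numeric codes follow the Duplicates parameter described in the 'Syntax' section (0 ignore the point, 1 take the first record, 2 take the average, 3 take the sum); the function name and the use of `None` for an ignored point are assumptions.

```python
def resolve_duplicates(values, mode):
    """Reduce the selected record values of one point to a single value.

    mode: 0 ignore the point, 1 first record, 2 average, 3 sum.
    Returns None when the point is to be ignored or has no selected records.
    """
    if not values:
        return None
    if len(values) == 1:
        return values[0]  # not a multiple record: nothing to resolve
    if mode == 0:
        return None
    if mode == 1:
        return values[0]
    if mode == 2:
        return sum(values) / len(values)
    if mode == 3:
        return sum(values)
    raise ValueError("mode must be 0, 1, 2 or 3")
```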

Another possibility is to select points so that those that fall outside the area of the raster are ignored. This option is not recommended if the values near the borders of the raster are important.

Finally, we can also decide whether the results of the interpolation are needed for all regions inside the area of the raster. Using a file that acts as an exclusion mask, and so determines which areas are useful and which are not, considerably accelerates the calculation and avoids producing meaningless results in the excluded zones; for example, if we are interpolating elevations from spot heights, we can exclude all sea areas. This mask may be a raster, in which case the regions to be excluded are those whose cells have NoData values. The mask may also be a structured or unstructured polygon vector file. In this case the area to be excluded is, by default, the universal polygon; otherwise the mask includes those polygons with attributes equal to or different from some explicitly defined value. For structured vectors it is also necessary to specify which field from the attribute table defines the mask.


Dialog boxes of the application

InterPNT dialog boxes


Graphic examples


Example of a DEM generated by applying the inverse distance method


Example of a DEM generated by applying the spline method


Example of a DEM generated by applying the kriging method


Example of a DEM generated by applying the trend surfaces method

The 4 previous images have been generated by applying the indicated interpolation methods from the same set of coordinates. Source: Lluís Pesquer.


Example of a result with lidar points applying the neighborhood statistics method (in this case the selected statistic was the mean) to generate a raster with a cell side of 2 m (the grid of the sides of the generated cells is shown).
The maximum search distance in this case was a circle with a radius of 2 m. For the cell marked with a red annotation, the program selected 5 points, whose values are averaged.


Example of the result of obtaining a DSM with a 2 m cell side from lidar data, taking the maximum as the statistical criterion of the neighborhood statistics.
The tooltip shows, in addition to the maximum orthometric altitude (m above sea level) of the surface in the cell, the year, month and day of the lidar capture of the point that provides the altitude.
The resulting DSM is displayed with a transparency of 40% on an orthophoto from the Cartographic and Geological Institute of Catalonia.
The following figure can be queried to obtain the height of buildings, trees, etc., above ground level.


Example of the result of obtaining a DHM with a 2 m cell side from lidar data, taking the maximum as the statistical criterion of the neighborhood statistics.
The data considered were heights calculated, prior to the execution of InterPNT, in a field of the points' attributes.
This calculation was carried out by subtracting (CalcImg), in each lidar point record, the elevation obtained from the DEM (CombiCap in bilinear interpolation mode) from the orthometric altitude of the surface obtained by the sensor.
The tooltip shows the maximum height of the trees above ground level in the queried cell.
The resulting DHM is displayed with a transparency of 40% on an orthophoto from the Cartographic and Geological Institute of Catalonia.

Source of the 3 previous images: Pons, X.


Syntax

Syntax:

  • InterPNT PNT_file Field IMG_file Xmin Xmax Ymin Ymax Side Duplicates [/COND#_CAMP] [/COND#_OP] [/COND#_VALOR] [/COND#_NEXE] [/COND#_PRIOR] [/EXPONENT] [/EIXAMPLE] [/MAX_DIST] [/MAX_PUNTS] [/MIN_PUNTS] [/TENSIO] [/APLAN] [/MIN_VAL] [/MAX_VAL] [/GRAU] [/IGNORAR_PNTS_EXT] [/ERROR_VC] [/MASCARA] [/OPER_MASC] [/TAULA_MASC] [/CAMP_MASC] [/ATR_MASC] [/REPE] [/ESTAD_VEINATGE] [/MAX_DIST_XY] [/COTA3D] [/CAMP_DATA] [/MEDIANA_EMPAT]

Parameters:

  • PNT_file (PNT file - Input parameter): Structured points file containing the values to be interpolated. In this case, it is possible to indicate, optionally, a set of parameters that define the selection (for instance: /COND1_CAMP=). To know more about the values of these parameters, please read the general syntax document.
  • Field (Field to be interpolated - Input parameter): Field name containing the data to be interpolated in "string to access BD4 field" format. In case of choosing the option "Use the value of the third coordinate", please type "x".
  • IMG_file (IMG file - Output parameter): The resulting raster with the interpolated values in each cell.
  • Xmin (X minimum - Input parameter): Defines the X minimum of the extent of the resulting raster. The values 'x' and 'p' for all four parameters ('x x x x' or 'p p p p') indicate that the extent is the same as that of the points file; 'x' additionally ensures that this extent fits the defined pixel size.
  • Xmax (X maximum - Input parameter): Defines the X maximum of the extent of the resulting raster. The values 'x' and 'p' for all four parameters ('x x x x' or 'p p p p') indicate that the extent is the same as that of the points file; 'x' additionally ensures that this extent fits the defined pixel size.
  • Ymin (Y minimum - Input parameter): Defines the Y minimum of the extent of the resulting raster. The values 'x' and 'p' for all four parameters ('x x x x' or 'p p p p') indicate that the extent is the same as that of the points file; 'x' additionally ensures that this extent fits the defined pixel size.
  • Ymax (Y maximum - Input parameter): Defines the Y maximum of the extent of the resulting raster. The values 'x' and 'p' for all four parameters ('x x x x' or 'p p p p') indicate that the extent is the same as that of the points file; 'x' additionally ensures that this extent fits the defined pixel size.
  • Side (Side - Input parameter): Size of the pixel in the output raster file.
  • Duplicates (Duplicates - Input parameter): Calculation option when there are multiple records: 0 ignore the point, 1 take the first as valid, 2 take the average and 3 take the sum.

Modifiers:

  • /COND#_CAMP= (Field of condition #) Field name of selection condition #. Up to 100 simple queries, and therefore up to 100 field names (COND#_CAMP), can be defined, starting at index 1. To know more about the values of this modifier, please read the general syntax document. (Input parameter)
  • /COND#_OP= (Operation of condition #) Operation of selection condition #. Up to 100 simple queries, and therefore up to 100 operations (COND#_OP), can be defined, starting at index 1. (Input parameter)
  • /COND#_VALOR= (Value of condition #) Value of selection condition #. Up to 100 simple queries, and therefore up to 100 values (COND#_VALOR), can be defined, starting at index 1. (Input parameter)
  • /COND#_NEXE= (Nexus linking condition #) Nexus used to link successive selections, in this case condition # with condition #+1. Up to 100 simple queries, and therefore up to 99 nexuses (COND#_NEXE), can be defined, starting at index 1. (Input parameter)
  • /COND#_PRIOR= (Priority of the nexus linking condition #) Priority of the nexus used to link successive selections, in this case condition # with condition #+1. Up to 100 simple queries, and therefore up to 99 priorities (COND#_PRIOR), can be defined, starting at index 1. This is an optional parameter; if it is not indicated, the priority is simply the order of the conditions. (Input parameter)
  • /EXPONENT= (Exponent) Value of the exponent (power) that dictates the dependence of the cell value on the inverse distance to each point. By default, the value is 2, that is an inverse square law by which the weight of each point is inversely proportional to the square of the distance from the center of the pixel. (Input parameter)
  • /EIXAMPLE (EIXAMPLE) Distances are determined 'Manhattan' style (Taxi-cab / city block). This is simpler and faster than calculating the Euclidean distance, the default in the absence of this parameter. (Input parameter)
  • /MAX_DIST= (Maximum distance) For the inverse distance method, the maximum distance from the center of the pixel of points to be considered when performing the interpolation. If not set, all points are considered. For the spline or trend surfaces methods, determines the size of the regions into which the interpolation area may be divided (Input parameter)
  • /MAX_PUNTS= (Maximum points) The maximum number of points (i.e., the n nearest points) used in the calculations. If not set, all points are considered. (Input parameter)
  • /MIN_PUNTS= (Minimum points) The minimum number of points needed to ensure that the interpolated value is valid. (Input parameter)
  • /TENSIO= (Tension) Modulates the smoothness of the 'spline' function. (Input parameter)
  • /APLAN= (Flattening) Flattening, noise level. (Input parameter)
  • /MIN_VAL= (Inferior saturation value) Lower saturation value to avoid extreme values out of range. (Input parameter)
  • /MAX_VAL= (Superior saturation value) Higher saturation value to avoid extreme values out of range. (Input parameter)
  • /GRAU= (Degree) Degree (1, 2 or 3) of the polynomial function. (Input parameter)
  • /IGNORAR_PNTS_EXT (Ignore outside points) 1 Indicates that points outside the extents of the output raster should not be considered. (Input parameter)
  • /ERROR_VC (Cross validation error) Quality parameter (RMS) of cross validation method resulting from the interpolation model. (Input parameter)
  • /MASCARA= (Mask) The name of the file that limits the regions of the output raster over which it is necessary to perform the interpolation. If no mask is set, the interpolation is performed for the whole area defined by the raster. (Input parameter)
  • /OPER_MASC= (Operator) Equal-to (EQ) or not-equal-to (NO_EQ) operator; determines which polygons should be included in the interpolation when the mask is not the universal polygon. (Input parameter)
  • /TAULA_MASC= (Mask table) When the mask is a structured vector file, indicates the table of the selected database. For more information on the values of these parameters refer to the general syntax document. (Input parameter)
  • /CAMP_MASC= (Mask field) When the mask is a structured vector file, indicates the field of the selected database. For more information on the values of these parameters refer to the general syntax document. (Input parameter)
  • /ATR_MASC= (Attribute) Attribute that determines which polygons should be included in the interpolation when the mask is not the universal polygon. (Input parameter)
  • /REPE= (Multiple record) When the mask is a structured file with multiple records, indicates which record is used when there is more than one record for a graphic identifier (multiple record). For more information on the values of this parameter refer to the general syntax document. (Input parameter)
  • /ESTAD_VEINATGE= (Neighborhood statistics) Value indicating the statistical function chosen to make neighborhood statistics of the source file. In this case, it is also necessary to indicate one of the parameters /MAX_DIST or /MAX_DIST_XY. The different statistical functions available are:
    • InterPNT_MITJANA: Mean.
    • InterPNT_MEDIANA: Median.
    • InterPNT_MINIM: Minimum.
    • InterPNT_MAXIM: Maximum.
    • InterPNT_RANG: Range.
    • InterPNT_DESVIACIO_EST: Standard deviation.
    • InterPNT_DESVIACIO_MEDIANA: Mean absolute deviation around the median.
    • InterPNT_N_OCURRENCIES: Number of occurrences.
    (Input parameter)
  • /MAX_DIST_XY= (Maximum distance as half of the side of a square.) Maximum distance as half of the side of a square to which a point is considered to be part of the set of points that provide the values for the calculation of the statistic that will be transferred to the output raster cell. (Input parameter)
  • /COTA3D (Use 3D dimension of the input points) When the point file is 3D, the 3D dimension of the input points is used instead of using a field in the source database. (Input parameter)
  • /CAMP_DATA= (Field of the main table of points containing the date of the point) Specifies the field in the main table of points that contains the date of the point. This option applies to the minimum, maximum and percentile modes of the neighborhood statistics, and is useful for knowing, for example, the year or the month of the height of the vegetation obtained from a lidar file. (Input parameter)
  • /MEDIANA_EMPAT= (Decision for quantiles) If the calculation of a quantile (such as the median, a quartile or a percentile) has been requested, indicates the tiebreaking criterion to be used for its calculation. To learn more about the values of this parameter, please consult the general syntax document. (Input parameter)
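
As an illustration of how the parameters and modifiers above combine into a command line, the following Python sketch assembles the argument list in the order given in the syntax. The helper `build_interpnt_args` and the file names used in the usage note are hypothetical; the parameter order and the modifier names are taken from this section, and whether a particular combination is accepted by InterPNT should be checked against the general syntax document.

```python
def build_interpnt_args(pnt_file, field, img_file, extent, side, duplicates,
                        **modifiers):
    """Assemble an InterPNT argument list (hypothetical helper).

    extent is (Xmin, Xmax, Ymin, Ymax); each element may also be 'x' or 'p'.
    Modifiers are passed as keyword arguments: a value of True produces a
    bare switch (/NAME), False or None omits it, and any other value
    produces /NAME=value.
    """
    args = ["InterPNT", pnt_file, field, img_file]
    args += [str(value) for value in extent]
    args += [str(side), str(duplicates)]
    for name, value in modifiers.items():
        if value is None or value is False:
            continue
        if value is True:
            args.append("/" + name)
        else:
            args.append("/" + name + "=" + str(value))
    return args
```

For example, `build_interpnt_args("lidar.pnt", "x", "dsm.img", ("x", "x", "x", "x"), 2, 1, ESTAD_VEINATGE="InterPNT_MAXIM", MAX_DIST_XY=1)` yields the arguments for a neighborhood maximum over the third coordinate with a 2 m cell side; the list can be joined with spaces or passed to `subprocess.run`.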