Commit 61ad464

Edits: languages.pymd & reference.pymd
1 parent 07ee233 commit 61ad464

2 files changed: +17 -15 lines changed

pyrasterframes/src/main/python/docs/languages.pymd

Lines changed: 3 additions & 1 deletion
@@ -1,6 +1,6 @@
11
# API Languages
22

3-
One of the great powers of RasterFrames, afforded by Spark SQL, is the ability to express computation in multiple programming languages. This tutorial is centered around Python because that's the most common language used in data science and GIS analytics. However, Scala (the implementation language of RasterFrames) and SQL are also fully supported. Examples in Python can be mechanically translated into the other two languages without much difficulty once the naming conventions are understood. In the sections below we will show the same example program in each language. We will compute the average NDVI per month for a single _tile_ in Tanzania.
3+
One of the great powers of RasterFrames, afforded by Spark SQL, is the ability to express computation in multiple programming languages. This documentation focuses on Python because it is the most commonly used language in data science and GIS analytics. However, Scala (the implementation language of RasterFrames) and SQL are also fully supported. Examples in Python can be mechanically translated into the other two languages without much difficulty once the naming conventions are understood. In the sections below we will show the same example program in each language. We will compute the average NDVI per month for a single _tile_ in Tanzania.
44

55
```python, imports, echo=False
66
from pyspark.sql.functions import *
@@ -170,4 +170,6 @@ val result = red_nir_tiles_monthly_2017
170170
.agg(rf_agg_stats(rf_normalized_difference($"nir", $"red")) as "ndvi_stats")
171171
.orderBy("month")
172172
.select("month", "ndvi_stats.*")
173+
174+
result.show()
173175
```
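The computation described above (a normalized difference of the NIR and red bands, averaged per month) can be sketched in plain Python without Spark. This is only an illustration of the arithmetic, not RasterFrames code: the helper names `normalized_difference` and `monthly_mean_ndvi`, the nested-list grid representation, and the choice to skip cells where `nir + red` is zero are all assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean


def normalized_difference(a, b):
    """Cell-wise (a - b) / (a + b) for two equally sized 2-D grids.

    Cells where a + b == 0 become None (an assumption of this sketch).
    """
    return [
        [(x - y) / (x + y) if (x + y) != 0 else None for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def monthly_mean_ndvi(scenes):
    """scenes: iterable of (month, red_grid, nir_grid) tuples.

    Returns {month: mean NDVI over all valid cells observed in that month}.
    """
    cells_by_month = defaultdict(list)
    for month, red, nir in scenes:
        for row in normalized_difference(nir, red):
            cells_by_month[month].extend(v for v in row if v is not None)
    return {month: mean(values) for month, values in sorted(cells_by_month.items())}


# Hypothetical toy data: three one-row scenes across two months.
scenes = [
    (1, [[1.0, 1.0]], [[3.0, 3.0]]),
    (1, [[1.0, 2.0]], [[1.0, 2.0]]),
    (2, [[1.0, 0.0]], [[4.0, 0.0]]),
]
result = monthly_mean_ndvi(scenes)
```

The Scala snippet expresses the same idea with `rf_normalized_difference($"nir", $"red")` followed by an aggregate grouped by month; the sketch only makes the per-cell formula explicit.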

pyrasterframes/src/main/python/docs/reference.pymd

Lines changed: 14 additions & 14 deletions
@@ -1,6 +1,6 @@
11
# Function Reference
22

3-
RasterFrames provides SQL and Python bindings to many UDFs using the `Tile` column type. In Spark SQL, the functions are already registered in the SQL engine; they are usually prefixed with `rf_`. In Python, they are available in the `pyrasterframes.rasterfunctions` module.
3+
RasterFrames provides a rich set of columnar functions for processing geospatial raster data. In Spark SQL, the functions are already registered in the SQL engine; they are usually prefixed with `rf_`. In Python, they are available in the `pyrasterframes.rasterfunctions` module.
44

55
The convention in this document will be to define the function signature as below, with its return type, the function name, and named arguments with their types.
66

@@ -30,7 +30,7 @@ become available for use with DataFrames. You can view all of the available func
3030

3131
## Vector Operations
3232

33-
Various LocationTech GeoMesa user-defined functions (UDFs) to deal with `geomtery` type columns are provided in the SQL engine and within the `pyrasterframes.rasterfunctions` Python module. These are documented in the [LocationTech GeoMesa Spark SQL documentation](https://www.geomesa.org/documentation/user/spark/sparksql_functions.html#). These functions are all prefixed with `st_`.
33+
Various LocationTech GeoMesa user-defined functions (UDFs) dealing with `geometry` type columns are provided in the SQL engine and within the `pyrasterframes.rasterfunctions` Python module. These are documented in the [LocationTech GeoMesa Spark SQL documentation](https://www.geomesa.org/documentation/user/spark/sparksql_functions.html#). These functions are all prefixed with `st_`.
3434

3535
RasterFrames provides some additional functions for vector geometry operations.
3636

@@ -39,7 +39,7 @@ RasterFrames provides some additional functions for vector geometry operations.
3939
Geometry st_reproject(Geometry geom, String origin_crs, String destination_crs)
4040

4141

42-
Reproject the vector `geom` from `origin_crs` to `destination_crs`. Both `_crs` arguments are either [proj4](https://proj4.org/usage/quickstart.html) strings, [EPSG codes](https://www.epsg-registry.org/) codes or [OGC WKT](https://www.opengeospatial.org/standards/wkt-crs) for coordinate reference systems.
42+
Reproject the vector `geom` from `origin_crs` to `destination_crs`. Both `_crs` arguments are either [proj4](https://proj4.org/usage/quickstart.html) strings, [EPSG codes](https://www.epsg-registry.org/) or [OGC WKT](https://www.opengeospatial.org/standards/wkt-crs) for coordinate reference systems.
4343

4444

4545
### st_extent
@@ -76,7 +76,7 @@ Get the cell type of the `tile`. The cell type can be changed with @ref:[rf_conv
7676

7777
Tile rf_tile(ProjectedRasterTile proj_raster)
7878

79-
Get the fully realized (non-lazy) `tile` from the `ProjectedRasterTile` or `RasterSource` type tile column.
79+
Get the fully realized (non-lazy) `tile` from a `ProjectedRasterTile` struct column.
8080

8181
### rf_extent
8282

@@ -96,7 +96,7 @@ Fetch CRS structure representing the coordinate reference system of a `Projected
9696

9797
Struct rf_mk_crs(String crsText)
9898

99-
Construct a CRS structure from one of its string representations. Three froms are supported:
99+
Construct a CRS structure from one of its string representations. Three forms are supported:
100100

101101
* [EPSG code](https://www.epsg-registry.org/): `EPSG:<integer>`
102102
* [Proj4 string](https://proj.org/): `+proj <proj4 parameters>`
@@ -256,7 +256,7 @@ The binary local map algebra functions have similar variations in the Python API
256256

257257
The SQL API does not require the `rf_local_op_double` or `rf_local_op_int` forms (just `rf_local_op`).
258258

259-
Local map algebra operations for more than two tiles are implemented to work across rows in the data frame. As such, they are @ref:[aggregate functions](reference.md#tile-local-aggregate-statistics).
259+
Local map algebra operations for more than two `tile`s are implemented to work across rows in the DataFrame. As such, they are @ref:[aggregate functions](reference.md#tile-local-aggregate-statistics).
260260

261261
### rf_local_add
262262

@@ -546,43 +546,43 @@ Aggregates over all of the rows in DataFrame of `tile` and returns a count of ea
546546

547547
Local statistics compute the element-wise statistics across a DataFrame or group of `tile`s, resulting in a `tile` that has the same dimension.
548548

549-
When these functions encounter a NoData in a cell location, it will be ignored.
549+
When these functions encounter NoData in a cell location, it will be ignored.
550550

551551
### rf_agg_local_max
552552

553553
Tile rf_agg_local_max(Tile tile)
554554

555-
Compute the cell-local maximum operation over Tiles in a column.
555+
Compute the cell-local maximum operation over `tile`s in a column.
556556

557557
### rf_agg_local_min
558558

559559
Tile rf_agg_local_min(Tile tile)
560560

561-
Compute the cell-local minimum operation over Tiles in a column.
561+
Compute the cell-local minimum operation over `tile`s in a column.
562562

563563
### rf_agg_local_mean
564564

565565
Tile rf_agg_local_mean(Tile tile)
566566

567-
Compute the cell-local mean operation over Tiles in a column.
567+
Compute the cell-local mean operation over `tile`s in a column.
568568

569569
### rf_agg_local_data_cells
570570

571571
Tile rf_agg_local_data_cells(Tile tile)
572572

573-
Compute the cell-local count of data cells over Tiles in a column. Returned `tile` has a cell type of `int32`.
573+
Compute the cell-local count of data cells over `tile`s in a column. Returned `tile` has a cell type of `int32`.
574574

575575
### rf_agg_local_no_data_cells
576576

577577
Tile rf_agg_local_no_data_cells(Tile tile)
578578

579-
Compute the cell-local count of NoData cells over Tiles in a column. Returned `tile` has a cell type of `int32`.
579+
Compute the cell-local count of NoData cells over `tile`s in a column. Returned `tile` has a cell type of `int32`.
580580

581581
### rf_agg_local_stats
582582

583583
Struct[Tile, Tile, Tile, Tile, Tile] rf_agg_local_stats(Tile tile)
584584

585-
Compute cell-local aggregate count, minimum, maximum, mean, and variance for a column of Tiles. Returns a struct of five `tile`s.
585+
Compute cell-local aggregate count, minimum, maximum, mean, and variance for a column of `tile`s. Returns a struct of five `tile`s.
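The cell-local aggregates described in this section can be illustrated with a small plain-Python sketch. It is not the RasterFrames implementation: `agg_local_stats` is a hypothetical helper, tiles are modeled as nested lists with `None` standing in for NoData, population variance is assumed, and cells with no data at all are reported as `None`.

```python
def agg_local_stats(tiles):
    """Element-wise statistics across equally sized 2-D grids, ignoring None (NoData)."""
    rows, cols = len(tiles[0]), len(tiles[0][0])

    def per_cell(fn):
        # Gather the non-NoData values at each cell position and reduce them with fn.
        return [
            [fn([t[r][c] for t in tiles if t[r][c] is not None]) for c in range(cols)]
            for r in range(rows)
        ]

    def mean(values):
        return sum(values) / len(values) if values else None

    def variance(values):
        if not values:
            return None
        m = mean(values)
        return sum((v - m) ** 2 for v in values) / len(values)

    return {
        "count": per_cell(len),
        "min": per_cell(lambda vs: min(vs) if vs else None),
        "max": per_cell(lambda vs: max(vs) if vs else None),
        "mean": per_cell(mean),
        "variance": per_cell(variance),
    }


# Three hypothetical 2x2 tiles; None marks NoData.
tiles = [
    [[1.0, None], [4.0, None]],
    [[3.0, 2.0], [None, None]],
    [[5.0, 6.0], [8.0, None]],
]
stats = agg_local_stats(tiles)
# stats["count"] == [[3, 2], [2, 0]]
# stats["mean"]  == [[3.0, 4.0], [6.0, None]]
```

Each result grid has the same dimensions as the inputs, which mirrors the statement that local statistics produce a `tile` of the same dimension.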
586586

587587

588588
## Converting Tiles
@@ -599,7 +599,7 @@ Create a row for each cell in `tile` columns. Many `tile` columns can be passed
599599

600600
Int, Int, Numeric* rf_explode_tiles_sample(Double sample_frac, Long seed, Tile* tile)
601601

602-
Python only. As with @ref:[`rf_explode_tiles`](reference.md#rf-explode-tiles), but taking a randomly sampled subset of cells. Equivalent to the below, but this implementation is optimized for speed. Parameter `sample_frac` should be between 0.0 and 1.0.
602+
Python only. As with @ref:[`rf_explode_tiles`](reference.md#rf-explode-tiles), but returns a randomly sampled subset of the cells instead of all of them. Parameter `sample_frac` should be between 0.0 and 1.0.
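The relationship between exploding a tile and sampling the exploded cells can be sketched in plain Python. This only illustrates the semantics and is not how RasterFrames or Spark implements it: the helper names, the `(column_index, row_index, value)` record layout, and the per-cell Bernoulli sampling are assumptions of this sketch.

```python
import random


def explode_tile(tile):
    """Return one (column_index, row_index, value) record per cell of a 2-D grid."""
    return [
        (column_index, row_index, value)
        for row_index, row in enumerate(tile)
        for column_index, value in enumerate(row)
    ]


def explode_tile_sample(sample_frac, seed, tile):
    """Keep each exploded cell independently with probability sample_frac."""
    if not 0.0 <= sample_frac <= 1.0:
        raise ValueError("sample_frac should be between 0.0 and 1.0")
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return [cell for cell in explode_tile(tile) if rng.random() < sample_frac]


# Hypothetical 2x3 tile.
tile = [[10, 11, 12], [20, 21, 22]]
all_cells = explode_tile(tile)
sampled = explode_tile_sample(0.5, 42, tile)
```

With the same `seed` the sketch returns the same subset on every call, and the size of the subset varies around `sample_frac` times the number of cells rather than matching it exactly.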
603603

604604
### rf_tile_to_array_int
605605
