GeoTrellis

GeoTrellis is an open source geographic data processing library designed to work with large geospatial raster data sets. It is written in Scala and has an open-source Apache 2.0 license.


Description

GeoTrellis's core competency is raster data processing: it enables distributed processing of large geospatial raster data sets using the techniques of map algebra. In addition to raster data operations, GeoTrellis includes some support for operations on vector and point cloud data. GeoTrellis uses Apache Spark for distributed processing. Distributed processing relies on indexing large datasets along a multi-dimensional space-filling curve (SFC). An SFC translates a multi-dimensional index into a one-dimensional index while largely preserving geospatial locality, so that large datasets can be read and written efficiently in parallel across multiple computers. Python bindings for GeoTrellis have been developed as a sub-project called GeoPySpark, which lets Python developers access and use the GeoTrellis library.
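To make the indexing idea above concrete, here is a minimal sketch of one common space-filling curve, the Z-order (Morton) curve, which builds a one-dimensional index by interleaving the bits of a tile's column and row. It is an illustration under stated assumptions and not GeoTrellis's own implementation: the function names `z_index`, `z_decode`, and `partition_keys` are hypothetical, and the choice of a Z-order curve is an assumption made for the example.

```python
from typing import List, Tuple


def z_index(col: int, row: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of col and row into one Z-order index.

    Hypothetical helper for illustration; not part of any library API.
    """
    index = 0
    for i in range(bits):
        index |= ((col >> i) & 1) << (2 * i)      # column bits go to even positions
        index |= ((row >> i) & 1) << (2 * i + 1)  # row bits go to odd positions
    return index


def z_decode(index: int, bits: int = 16) -> Tuple[int, int]:
    """Invert z_index: recover (col, row) from a Z-order index."""
    col = row = 0
    for i in range(bits):
        col |= ((index >> (2 * i)) & 1) << i
        row |= ((index >> (2 * i + 1)) & 1) << i
    return col, row


def partition_keys(keys: List[Tuple[int, int]],
                   num_partitions: int) -> List[List[Tuple[int, int]]]:
    """Sort tile keys along the curve and cut them into contiguous index ranges."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be positive")
    ordered = sorted(keys, key=lambda key: z_index(*key))
    size = max(1, -(-len(ordered) // num_partitions))  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


if __name__ == "__main__":
    # A 4 x 4 grid of (col, row) tile keys split into four index ranges.
    grid = [(col, row) for row in range(4) for col in range(4)]
    for part in partition_keys(grid, 4):
        print(part)
```

In this sketch, each of the four contiguous index ranges for the 4 x 4 grid covers one 2 x 2 quadrant, so tiles that are close on the map tend to land in the same range. That property is what lets a range of one-dimensional indices be handed to a single worker or stored together, although a Z-order curve still has occasional jumps between neighboring tiles.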


Project History

GeoTrellis started as a research project at Azavea, a geospatial software company based in Philadelphia. A precursor software component, DecisionTree, was developed beginning in 2006 with support from a Small Business Innovation Research grant from the U.S. Department of Agriculture. In 2009, with financial support from the William Penn Foundation and Stroud Water Research Center, Azavea embarked on early development of GeoTrellis. GeoTrellis was released as an open source project in 2011 with the goal of supporting fast processing of geospatial raster data at scale.

GeoTrellis initially supported distributed computation through Akka, a Scala framework for building concurrent and distributed applications. The need to support additional use cases and features, such as caching and sharding datasets across a storage cluster, led to a search for a new distribution framework. GeoTrellis moved to Apache Spark as its distribution engine in 2014 in order to use the management, scheduling, and other features of the Spark framework. One key use case that drove this phase of development was the need to efficiently process large spatiotemporal datasets like those used in many earth science applications, such as climate change research. The move to Apache Spark enabled efficient support for large climate change forecast datasets published by the Intergovernmental Panel on Climate Change (IPCC).

GeoTrellis was submitted to the Eclipse Foundation's LocationTech working group in 2013 and graduated from incubation with a 1.0 release in December 2016. GeoTrellis has been used in a number of geospatial domains, including satellite and aerial image processing, forest growth simulation, agricultural yield prediction, planning, digital humanities, government infrastructure investment, and machine learning to support crime risk forecasting. It is integrated into other open source software projects, including Raster Foundry, Raster Frames, and GeoPySpark.




External links


GeoTrellis homepage
GeoTrellis Source
Azavea homepage
LocationTech homepage
Raster Foundry homepage
GeoPySpark Source
Raster Frames project