Let's explore working with raster data in Sedona and Wherobots Cloud. This notebook covers:
You can find more information on working with rasters in the documentation here.
from sedona.spark import *
from pyspark.sql.functions import expr
from IPython.display import display, HTML
import rasterio
from matplotlib import pyplot
config = SedonaContext.builder(). \
config("spark.hadoop.fs.s3a.bucket.wherobots-examples.aws.credentials.provider","org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"). \
getOrCreate()
sedona = SedonaContext.create(config)
Warning: Ignoring non-Spark config property: fs.s3a.aws.credentials.provider 24/02/29 15:10:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Setting default log level to "WARN". To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel). 24/02/29 15:10:43 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties 24/02/29 15:10:44 WARN S3ABlockOutputStream: Application invoked the Syncable API against stream writing to qjnq6fcbf1/spark-logs/spark-3ca049c8cf2c4c0fa188d596eefdc4d0.inprogress. This is unsupported 24/02/29 15:11:26 WARN SedonaContext: Python files are not set. Sedona will not pre-load Python UDFs.
Raster data is gridded data where each pixel has one or more values associated with it (bands) and references a geographic location. Common raster data include aerial imagery, elevation models, precipitation maps, and population datasets.
When working with raster data in Sedona we store rasters and their associated metadata as rows in a table and query them using Spatial SQL.
One of the tools available to us when analyzing raster data is map algebra: performing calculations using the pixel values.
Sedona supports a number of raster data formats, including Arc/Info ASCII grid, NetCDF, and GeoTiff. See raster loaders documentation page for more.
Rasters can be loaded as "out-of-database" (out-db) rasters from a remote file path, "in-database" (in-db). Out-db allows for managing large raster datasets stored on cloud storage, efficient handling of remote data, while in-db is useful when the raster data needs to managed within the database (data integrity / access efficiency, etc)
RS_FromPath
- Create an out-db raster from remote file path¶Let's load a single high-resolution aerial image from the NEON dataset.
file_url = "s3://wherobots-examples/data/examples/NEON_ortho.tif"
neon_df = sedona.sql(f"SELECT RS_FromPath('{file_url}') AS raster")
neon_df.createOrReplaceTempView("neon")
neon_df.printSchema()
root |-- raster: raster (nullable = true)
neon_df.show(truncate=False)
+---------------------------------------+ |raster | +---------------------------------------+ |LazyLoadOutDbGridCoverage2D[not loaded]| +---------------------------------------+
sedona.sql("SELECT RS_MetaData(raster) from neon").show(truncate=False)
[Stage 4:> (0 + 1) / 1]
+------------------------------------------------------------------------+ |rs_metadata(raster) | +------------------------------------------------------------------------+ |[470000.0, 7228000.0, 1000.0, 1000.0, 1.0, -1.0, 0.0, 0.0, 32606.0, 3.0]| +------------------------------------------------------------------------+
sedona.sql("SELECT RS_NumBands(raster) FROM neon").show()
+-------------------+ |rs_numbands(raster)| +-------------------+ | 3| +-------------------+
htmlDf = sedona.sql("SELECT RS_AsImage(raster, 500) FROM neon")
SedonaUtils.display_image(htmlDf)
rs_asimage(raster, 500) | |
---|---|
0 |
sedona.sql("SELECT RS_Values(raster, Array(ST_SetSRID(ST_Point(470000, 7227880), 32606)), 2) FROM neon").show(truncate=False)
+-------------------------------------------------------------------------+ |rs_values(raster, array(st_setsrid(st_point(470000, 7227880), 32606)), 2)| +-------------------------------------------------------------------------+ |[104.0] | +-------------------------------------------------------------------------+
Performing raster math is one of the most common and powerful raster workflows (used for example to compute spectral indexes). Sedona supports a number of band-specific raster math operations such as:
RS_Add
, RS_Divide
, RS_Mean
, RS_NormalizedDifference
, etcRS_MapAlgebra
RS_MapAlgebra
¶Used to apply a map algebra script on a raster (or multiple rasters). For example to calculate NDVI or AWEI.
Let's calculate the Normalized Difference Greenness Index (Escadafal & Huete 1991) for our raster
\begin{equation*} {NDGI} = \frac{G-R}{G+R} \end{equation*}ndgi_df = sedona.sql("""
SELECT
RS_MapAlgebra(raster, 'D', 'out = ((rast[1] - rast[0]) / (rast[1] + rast[0]));')
AS raster
FROM neon
""")
ndgi_df.createOrReplaceTempView("ndgi")
ndgi_df.printSchema()
root |-- raster: raster (nullable = true)
sedona.sql("SELECT RS_NumBands(raster) FROM ndgi").show()
+-------------------+ |rs_numbands(raster)| +-------------------+ | 1| +-------------------+
sedona.sql("SELECT RS_SummaryStats(raster) FROM ndgi").show(truncate=False)
+----------------------------------------------------------------------------------------------------------------------+ |rs_summarystats(raster) | +----------------------------------------------------------------------------------------------------------------------+ |[1000000.0, -10863.97160595989, -0.010863971605958938, 0.045695423898689726, -0.1493212669683258, 0.18421052631578946]| +----------------------------------------------------------------------------------------------------------------------+
sedona.sql("SELECT RS_Values(raster, Array(ST_SetSRID(ST_Point(470000, 7227880), 32606)), 1) FROM ndgi").show(truncate=False)
+-------------------------------------------------------------------------+ |rs_values(raster, array(st_setsrid(st_point(470000, 7227880), 32606)), 1)| +-------------------------------------------------------------------------+ |[-0.09956709956709957] | +-------------------------------------------------------------------------+
sedona.sql("SELECT RS_Values(raster, Array(ST_SetSRID(ST_Point(470700, 7228000), 32606)), 1) FROM ndgi").show(truncate=False)
+-------------------------------------------------------------------------+ |rs_values(raster, array(st_setsrid(st_point(470700, 7228000), 32606)), 1)| +-------------------------------------------------------------------------+ |[0.03636363636363636] | +-------------------------------------------------------------------------+
binary_raster_df = sedona.sql("SELECT RS_AsGeoTiff(raster) AS raster FROM ndgi")
binary_raster_df.write.format("raster").mode("overwrite").save("ndgi.tif")
Previously we saw loading a single raster into a single row. We typically work with many rasters across many rows. Let's load a raster dataset from the WorldClim historical climate dataset of precipitation from 1970-2000. Each raster file represents the average precipitation across all years for each month of the year.
PREC_URL = "s3://wherobots-examples/data/examples/world_clim/wc2.1_10m_prec" #/wc2.1_10m_prec_01.tif
rawDf = sedona.read.format("binaryFile").load(PREC_URL + "/*.tif")
rawDf.createOrReplaceTempView("rawdf")
rawDf.show()
+--------------------+-------------------+------+--------------------+ | path| modificationTime|length| content| +--------------------+-------------------+------+--------------------+ |s3://wherobots-ex...|2024-02-28 19:59:56|669568|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 19:59:55|642783|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 19:59:57|632536|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 19:59:56|628584|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 19:59:58|624869|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 19:59:58|619008|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 20:00:00|615937|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 19:59:59|615576|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 19:59:54|594226|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 19:59:53|581807|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 20:00:00|576683|[49 49 2A 00 08 0...| |s3://wherobots-ex...|2024-02-28 20:00:01|561700|[49 49 2A 00 08 0...| +--------------------+-------------------+------+--------------------+
rasterDf = sedona.sql("""
SELECT
RS_FromGeoTiff(content) AS raster,
Int(regexp_extract(path, '(.*)([0-9]{2}).tif', 2)) AS month
FROM rawdf
""")
rasterDf.createOrReplaceTempView("prec")
rasterDf.printSchema()
root |-- raster: raster (nullable = true) |-- month: integer (nullable = true)
sedona.sql("SELECT RS_MetaData(raster) FROM prec").show(truncate=False)
+------------------------------------------------------------------------------------------------+ |rs_metadata(raster) | +------------------------------------------------------------------------------------------------+ |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| |[-180.0, 90.0, 2160.0, 1080.0, 0.16666666666666666, -0.16666666666666666, 0.0, 0.0, 4326.0, 1.0]| +------------------------------------------------------------------------------------------------+
sedona.sql("SELECT month, RS_SummaryStats(raster) FROM prec ORDER BY month ASC").show(truncate=False)
[Stage 19:> (0 + 6) / 6]
+-----+---------------------------------------------------------------------------+ |month|rs_summarystats(raster) | +-----+---------------------------------------------------------------------------+ |1 |[808053.0, 3.2816839E7, 40.612235831036806, 71.58202261528564, 0.0, 908.0] | |2 |[808053.0, 3.110572E7, 38.49465319724863, 65.22884915327343, 0.0, 793.0] | |3 |[808053.0, 3.4943215E7, 43.24371668684569, 67.88342263061789, 0.0, 720.0] | |4 |[808053.0, 3.4283833E7, 42.42770338082592, 61.129070691979166, 0.0, 1004.0]| |5 |[808053.0, 3.6639583E7, 45.34304432999431, 63.51998004686114, 0.0, 2068.0] | |6 |[808053.0, 3.8714648E7, 47.9110256382872, 72.94226238575892, 0.0, 2210.0] | |7 |[808053.0, 4.6393435E7, 57.413851566716545, 82.74978674918007, 0.0, 2381.0]| |8 |[808053.0, 4.8515628E7, 60.040155782001115, 76.31015675449926, 0.0, 1674.0]| |9 |[808053.0, 3.9648388E7, 49.06656865321915, 66.55061748894705, 0.0, 1955.0] | |10 |[808053.0, 3.5971104E7, 44.51577309906402, 61.76314020099509, 0.0, 2328.0] | |11 |[808053.0, 3.2569273E7, 40.305862362897145, 61.227354819810515, 0.0, 718.0]| |12 |[808053.0, 3.2873464E7, 40.682311680044606, 67.51048011158271, 0.0, 806.0] | +-----+---------------------------------------------------------------------------+
sedona.sql("""
SELECT
month,
RS_Values(raster, Array(ST_POINT(-113.9940, 46.8721))) AS avg_prec
FROM prec ORDER BY month ASC
""").show(truncate=False)
+-----+--------+ |month|avg_prec| +-----+--------+ |1 |[43.0] | |2 |[27.0] | |3 |[31.0] | |4 |[31.0] | |5 |[50.0] | |6 |[49.0] | |7 |[27.0] | |8 |[31.0] | |9 |[31.0] | |10 |[29.0] | |11 |[35.0] | |12 |[40.0] | +-----+--------+
Zonal statistics involves joining vector geometries to the raster and calculating statistical or aggregating values based on the pixel values within each vector geometry.
countries = ShapefileReader.readToGeometryRDD(sedona, 's3://wherobots-examples/data/examples/natural_earth/ne_10m_admin_0_countries')
countries_df = Adapter.toDf(countries, sedona)
countries_df.createOrReplaceTempView("countries")
countries_df.printSchema()
root |-- geometry: geometry (nullable = true) |-- featurecla: string (nullable = true) |-- scalerank: string (nullable = true) |-- LABELRANK: string (nullable = true) |-- SOVEREIGNT: string (nullable = true) |-- SOV_A3: string (nullable = true) |-- ADM0_DIF: string (nullable = true) |-- LEVEL: string (nullable = true) |-- TYPE: string (nullable = true) |-- TLC: string (nullable = true) |-- ADMIN: string (nullable = true) |-- ADM0_A3: string (nullable = true) |-- GEOU_DIF: string (nullable = true) |-- GEOUNIT: string (nullable = true) |-- GU_A3: string (nullable = true) |-- SU_DIF: string (nullable = true) |-- SUBUNIT: string (nullable = true) |-- SU_A3: string (nullable = true) |-- BRK_DIFF: string (nullable = true) |-- NAME: string (nullable = true) |-- NAME_LONG: string (nullable = true) |-- BRK_A3: string (nullable = true) |-- BRK_NAME: string (nullable = true) |-- BRK_GROUP: string (nullable = true) |-- ABBREV: string (nullable = true) |-- POSTAL: string (nullable = true) |-- FORMAL_EN: string (nullable = true) |-- FORMAL_FR: string (nullable = true) |-- NAME_CIAWF: string (nullable = true) |-- NOTE_ADM0: string (nullable = true) |-- NOTE_BRK: string (nullable = true) |-- NAME_SORT: string (nullable = true) |-- NAME_ALT: string (nullable = true) |-- MAPCOLOR7: string (nullable = true) |-- MAPCOLOR8: string (nullable = true) |-- MAPCOLOR9: string (nullable = true) |-- MAPCOLOR13: string (nullable = true) |-- POP_EST: string (nullable = true) |-- POP_RANK: string (nullable = true) |-- POP_YEAR: string (nullable = true) |-- GDP_MD: string (nullable = true) |-- GDP_YEAR: string (nullable = true) |-- ECONOMY: string (nullable = true) |-- INCOME_GRP: string (nullable = true) |-- FIPS_10: string (nullable = true) |-- ISO_A2: string (nullable = true) |-- ISO_A2_EH: string (nullable = true) |-- ISO_A3: string (nullable = true) |-- ISO_A3_EH: string (nullable = true) |-- ISO_N3: string (nullable = true) |-- ISO_N3_EH: string (nullable = true) |-- UN_A3: string (nullable = true) |-- WB_A2: string (nullable = true) |-- WB_A3: string (nullable = true) |-- WOE_ID: string (nullable = true) |-- WOE_ID_EH: string (nullable = true) |-- WOE_NOTE: string (nullable = true) |-- ADM0_ISO: string (nullable = true) |-- ADM0_DIFF: string (nullable = true) |-- ADM0_TLC: string (nullable = true) |-- ADM0_A3_US: string (nullable = true) |-- ADM0_A3_FR: string (nullable = true) |-- ADM0_A3_RU: string (nullable = true) |-- ADM0_A3_ES: string (nullable = true) |-- ADM0_A3_CN: string (nullable = true) |-- ADM0_A3_TW: string (nullable = true) |-- ADM0_A3_IN: string (nullable = true) |-- ADM0_A3_NP: string (nullable = true) |-- ADM0_A3_PK: string (nullable = true) |-- ADM0_A3_DE: string (nullable = true) |-- ADM0_A3_GB: string (nullable = true) |-- ADM0_A3_BR: string (nullable = true) |-- ADM0_A3_IL: string (nullable = true) |-- ADM0_A3_PS: string (nullable = true) |-- ADM0_A3_SA: string (nullable = true) |-- ADM0_A3_EG: string (nullable = true) |-- ADM0_A3_MA: string (nullable = true) |-- ADM0_A3_PT: string (nullable = true) |-- ADM0_A3_AR: string (nullable = true) |-- ADM0_A3_JP: string (nullable = true) |-- ADM0_A3_KO: string (nullable = true) |-- ADM0_A3_VN: string (nullable = true) |-- ADM0_A3_TR: string (nullable = true) |-- ADM0_A3_ID: string (nullable = true) |-- ADM0_A3_PL: string (nullable = true) |-- ADM0_A3_GR: string (nullable = true) |-- ADM0_A3_IT: string (nullable = true) |-- ADM0_A3_NL: string (nullable = true) |-- ADM0_A3_SE: string (nullable = true) |-- ADM0_A3_BD: string (nullable = true) |-- ADM0_A3_UA: string (nullable = true) |-- ADM0_A3_UN: string (nullable = true) |-- ADM0_A3_WB: string (nullable = true) |-- CONTINENT: string (nullable = true) |-- REGION_UN: string (nullable = true) |-- SUBREGION: string (nullable = true) |-- REGION_WB: string (nullable = true) |-- NAME_LEN: string (nullable = true) |-- LONG_LEN: string (nullable = true) |-- ABBREV_LEN: string (nullable = true) |-- TINY: string (nullable = true) |-- HOMEPART: string (nullable = true) |-- MIN_ZOOM: string (nullable = true) |-- MIN_LABEL: string (nullable = true) |-- MAX_LABEL: string (nullable = true) |-- LABEL_X: string (nullable = true) |-- LABEL_Y: string (nullable = true) |-- NE_ID: string (nullable = true) |-- WIKIDATAID: string (nullable = true) |-- NAME_AR: string (nullable = true) |-- NAME_BN: string (nullable = true) |-- NAME_DE: string (nullable = true) |-- NAME_EN: string (nullable = true) |-- NAME_ES: string (nullable = true) |-- NAME_FA: string (nullable = true) |-- NAME_FR: string (nullable = true) |-- NAME_EL: string (nullable = true) |-- NAME_HE: string (nullable = true) |-- NAME_HI: string (nullable = true) |-- NAME_HU: string (nullable = true) |-- NAME_ID: string (nullable = true) |-- NAME_IT: string (nullable = true) |-- NAME_JA: string (nullable = true) |-- NAME_KO: string (nullable = true) |-- NAME_NL: string (nullable = true) |-- NAME_PL: string (nullable = true) |-- NAME_PT: string (nullable = true) |-- NAME_RU: string (nullable = true) |-- NAME_SV: string (nullable = true) |-- NAME_TR: string (nullable = true) |-- NAME_UK: string (nullable = true) |-- NAME_UR: string (nullable = true) |-- NAME_VI: string (nullable = true) |-- NAME_ZH: string (nullable = true) |-- NAME_ZHT: string (nullable = true) |-- FCLASS_ISO: string (nullable = true) |-- TLC_DIFF: string (nullable = true) |-- FCLASS_TLC: string (nullable = true) |-- FCLASS_US: string (nullable = true) |-- FCLASS_FR: string (nullable = true) |-- FCLASS_RU: string (nullable = true) |-- FCLASS_ES: string (nullable = true) |-- FCLASS_CN: string (nullable = true) |-- FCLASS_TW: string (nullable = true) |-- FCLASS_IN: string (nullable = true) |-- FCLASS_NP: string (nullable = true) |-- FCLASS_PK: string (nullable = true) |-- FCLASS_DE: string (nullable = true) |-- FCLASS_GB: string (nullable = true) |-- FCLASS_BR: string (nullable = true) |-- FCLASS_IL: string (nullable = true) |-- FCLASS_PS: string (nullable = true) |-- FCLASS_SA: string (nullable = true) |-- FCLASS_EG: string (nullable = true) |-- FCLASS_MA: string (nullable = true) |-- FCLASS_PT: string (nullable = true) |-- FCLASS_AR: string (nullable = true) |-- FCLASS_JP: string (nullable = true) |-- FCLASS_KO: string (nullable = true) |-- FCLASS_VN: string (nullable = true) |-- FCLASS_TR: string (nullable = true) |-- FCLASS_ID: string (nullable = true) |-- FCLASS_PL: string (nullable = true) |-- FCLASS_GR: string (nullable = true) |-- FCLASS_IT: string (nullable = true) |-- FCLASS_NL: string (nullable = true) |-- FCLASS_SE: string (nullable = true) |-- FCLASS_BD: string (nullable = true) |-- FCLASS_UA: string (nullable = true)
24/02/29 15:11:44 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
world_prec_df = sedona.sql("""
SELECT
sum(RS_ZonalStats(prec.raster, countries.geometry, 1, 'avg', true)) AS yearly_avg_prec,
any_value(countries.geometry),
countries.name
FROM prec, countries
GROUP BY name
ORDER BY yearly_avg_prec DESC
""")
world_prec_df.dropna().show()
+------------------+--------------------+--------------------+ | yearly_avg_prec| any_value(geometry)| name| +------------------+--------------------+--------------------+ | 4937.5|MULTIPOLYGON (((1...| Micronesia| | 3526.0|MULTIPOLYGON (((1...| Palau| | 3378.75|MULTIPOLYGON (((-...| Samoa| | 3345.6875|MULTIPOLYGON (((1...| Brunei| | 3234.474358974359|MULTIPOLYGON (((1...| Solomon Is.| | 3111.0|MULTIPOLYGON (((-...| Saint Helena| | 3008.0|MULTIPOLYGON (((-...|Wallis and Futuna...| | 2988.550872093023|MULTIPOLYGON (((1...| Papua New Guinea| |2934.6285714285714|MULTIPOLYGON (((1...| Vanuatu| |2881.2202505219207|MULTIPOLYGON (((1...| Malaysia| | 2831.0|MULTIPOLYGON (((-...| Costa Rica| | 2807.0|MULTIPOLYGON (((-...| Faeroe Is.| |2733.1463857609974|MULTIPOLYGON (((1...| Indonesia| |2717.5348837209303|MULTIPOLYGON (((-...| Sierra Leone| | 2675.306306306306|MULTIPOLYGON (((-...| Panama| | 2656.390070921986|POLYGON ((-11.476...| Liberia| | 2631.974466806844|MULTIPOLYGON (((-...| Colombia| |2560.3461538461534|MULTIPOLYGON (((-...| Fiji| |2555.6666666666665|MULTIPOLYGON (((6...|São Tomé and Pr...| |2525.6388261851025|MULTIPOLYGON (((1...| Philippines| +------------------+--------------------+--------------------+ only showing top 20 rows
SedonaKepler.create_map(world_prec_df.dropna())
User Guide: https://docs.kepler.gl/docs/keplergl-jupyter
KeplerGl(data={'unnamed': {'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,…
If we have large rasters we typically want to break them up into multiple tiles to improve the efficiency of raster operations. Let's take a look at an example using a dataset from NOAA of night time visible lights.
src = rasterio.open("data/F101993.v4b.global.stable_lights.avg_vis.tif")
array = src.read(1)
pyplot.imshow(array, cmap='pink')
<matplotlib.image.AxesImage at 0x7f6e38c58fa0>
f10_1993_uri = "s3://wherobots-examples/data/examples/DMSP_OLS/F101993.v4b.global.stable_lights.avg_vis.tif"
f10_1993_df = sedona.sql(f"SELECT RS_FromPath('{f10_1993_uri}') AS raster")
f10_1993_df.createOrReplaceTempView("f10_1993")
f10_1993_df.show(truncate=False)
+---------------------------------------+ |raster | +---------------------------------------+ |LazyLoadOutDbGridCoverage2D[not loaded]| +---------------------------------------+
tile_df = sedona.sql("SELECT RS_TileExplode(raster, 256, 256) AS (x, y, tile) FROM f10_1993")
tile_df.show(5)
+---+---+--------------------+ | x| y| tile| +---+---+--------------------+ | 0| 0|OutDbGridCoverage...| | 1| 0|OutDbGridCoverage...| | 2| 0|OutDbGridCoverage...| | 3| 0|OutDbGridCoverage...| | 4| 0|OutDbGridCoverage...| +---+---+--------------------+ only showing top 5 rows
sedona.sql("DROP TABLE IF EXISTS wherobots.test_db.f10_1993")
tile_df.writeTo("wherobots.test_db.f10_1993").create()
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
sedona.table("wherobots.test_db.f10_1993").count()
11154
sedona.table("wherobots.test_db.f10_1993").show()
tiledMap = SedonaKepler.create_map()
SedonaKepler.add_df(tiledMap, sedona.table("wherobots.test_db.f10_1993").withColumn("tile", expr("RS_Envelope(tile)")), name="tiles")
tiledMap
+---+---+--------------------+ | x| y| tile| +---+---+--------------------+ | 0| 0|OutDbGridCoverage...| | 1| 0|OutDbGridCoverage...| | 2| 0|OutDbGridCoverage...| | 3| 0|OutDbGridCoverage...| | 4| 0|OutDbGridCoverage...| | 5| 0|OutDbGridCoverage...| | 6| 0|OutDbGridCoverage...| | 7| 0|OutDbGridCoverage...| | 8| 0|OutDbGridCoverage...| | 9| 0|OutDbGridCoverage...| | 10| 0|OutDbGridCoverage...| | 11| 0|OutDbGridCoverage...| | 12| 0|OutDbGridCoverage...| | 13| 0|OutDbGridCoverage...| | 14| 0|OutDbGridCoverage...| | 15| 0|OutDbGridCoverage...| | 16| 0|OutDbGridCoverage...| | 17| 0|OutDbGridCoverage...| | 18| 0|OutDbGridCoverage...| | 19| 0|OutDbGridCoverage...| +---+---+--------------------+ only showing top 20 rows User Guide: https://docs.kepler.gl/docs/keplergl-jupyter
KeplerGl(data={'tiles': {'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 2…
sedona.sql("SELECT RS_NumBands(raster) FROM f10_1993").show()
+-------------------+ |rs_numbands(raster)| +-------------------+ | 1| +-------------------+
display(HTML(sedona.table("wherobots.test_db.f10_1993").selectExpr("RS_AsImage(tile, 200)", "RS_Envelope(tile)").limit(5).toPandas().to_html(escape=False)))
rs_asimage(tile, 200) | rs_envelope(tile) | |
---|---|---|
0 | POLYGON ((-180.00416666665 72.87083334185, -180.00416666665 75.00416666665, -177.87083334185 75.00416666665, -177.87083334185 72.87083334185, -180.00416666665 72.87083334185)) | |
1 | POLYGON ((-177.87083334185002 72.87083334185, -177.87083334185002 75.00416666665, -175.73750001705 75.00416666665, -175.73750001705 72.87083334185, -177.87083334185002 72.87083334185)) | |
2 | POLYGON ((-175.73750001705 72.87083334185, -175.73750001705 75.00416666665, -173.60416669224998 75.00416666665, -173.60416669224998 72.87083334185, -175.73750001705 72.87083334185)) | |
3 | POLYGON ((-173.60416669225 72.87083334185, -173.60416669225 75.00416666665, -171.47083336744998 75.00416666665, -171.47083336744998 72.87083334185, -173.60416669225 72.87083334185)) | |
4 | POLYGON ((-171.47083336745 72.87083334185, -171.47083336745 75.00416666665, -169.33750004265 75.00416666665, -169.33750004265 72.87083334185, -171.47083336745 72.87083334185)) |
sedona.sql("""
WITH matched_tile AS (
SELECT * FROM wherobots.test_db.f10_1993
WHERE RS_Intersects(tile, ST_POINT(-113.9940, 46.8721))
)
SELECT RS_Values(tile, Array(ST_POINT(-113.9940, 46.8721)))
FROM matched_tile
""").show(truncate=False)
[Stage 44:> (0 + 1) / 1]
+----------------------------------------------------+ |rs_values(tile, array(st_point(-113.9940, 46.8721)))| +----------------------------------------------------+ |[63.0] | +----------------------------------------------------+
counties_shapefile = "s3://wherobots-examples/data/examples/natural_earth/ne_10m_admin_2_counties"
spatialRDD = ShapefileReader.readToGeometryRDD(sedona, counties_shapefile)
counties_df = Adapter.toDf(spatialRDD, sedona)
counties_df.createOrReplaceTempView("counties")
counties_df.printSchema()
[Stage 45:> (0 + 1) / 1]
root |-- geometry: geometry (nullable = true) |-- FEATURECLA: string (nullable = true) |-- SCALERANK: string (nullable = true) |-- ADM2_CODE: string (nullable = true) |-- ISO_3166_2: string (nullable = true) |-- ISO_A2: string (nullable = true) |-- ADM0_SR: string (nullable = true) |-- NAME: string (nullable = true) |-- NAME_ALT: string (nullable = true) |-- NAME_LOCAL: string (nullable = true) |-- TYPE: string (nullable = true) |-- TYPE_EN: string (nullable = true) |-- CODE_LOCAL: string (nullable = true) |-- REGION: string (nullable = true) |-- REGION_COD: string (nullable = true) |-- ABBREV: string (nullable = true) |-- AREA_SQKM: string (nullable = true) |-- SAMEASCITY: string (nullable = true) |-- LABELRANK: string (nullable = true) |-- NAME_LEN: string (nullable = true) |-- MAPCOLOR9: string (nullable = true) |-- MAPCOLOR13: string (nullable = true) |-- FIPS: string (nullable = true) |-- SOV_A3: string (nullable = true) |-- ADM0_A3: string (nullable = true) |-- ADM0_LABEL: string (nullable = true) |-- ADMIN: string (nullable = true) |-- GEONUNIT: string (nullable = true) |-- GU_A3: string (nullable = true) |-- MIN_LABEL: string (nullable = true) |-- MAX_LABEL: string (nullable = true) |-- MIN_ZOOM: string (nullable = true) |-- WIKIDATAID: string (nullable = true) |-- NE_ID: string (nullable = true) |-- latitude: string (nullable = true) |-- longitude: string (nullable = true) |-- NAME_AR: string (nullable = true) |-- NAME_BN: string (nullable = true) |-- NAME_DE: string (nullable = true) |-- NAME_EL: string (nullable = true) |-- NAME_EN: string (nullable = true) |-- NAME_ES: string (nullable = true) |-- NAME_FA: string (nullable = true) |-- NAME_FR: string (nullable = true) |-- NAME_HE: string (nullable = true) |-- NAME_HI: string (nullable = true) |-- NAME_HU: string (nullable = true) |-- NAME_ID: string (nullable = true) |-- NAME_IT: string (nullable = true) |-- NAME_JA: string (nullable = true) |-- NAME_KO: string (nullable = true) |-- NAME_NL: string (nullable = true) |-- NAME_PL: string (nullable = true) |-- NAME_PT: string (nullable = true) |-- NAME_RU: string (nullable = true) |-- NAME_SV: string (nullable = true) |-- NAME_TR: string (nullable = true) |-- NAME_UK: string (nullable = true) |-- NAME_UR: string (nullable = true) |-- NAME_VI: string (nullable = true) |-- NAME_ZH: string (nullable = true) |-- NAME_ZHT: string (nullable = true)
county_light_tiled_df = sedona.sql("""
WITH matched_tile AS (
SELECT tile, geometry, FIPS
FROM wherobots.test_db.f10_1993, counties
WHERE RS_Intersects(tile, counties.geometry)
)
SELECT
sum(RS_ZonalStats(matched_tile.tile, matched_tile.geometry, 'mean')) AS mean_light,
any_value(matched_tile.geometry) AS geometry,
FIPS
FROM matched_tile
GROUP BY FIPS
""")
county_light_tiled_df.createOrReplaceTempView("county_light_1993")
county_light_tiled_df.show(100)
[Stage 51:> (0 + 1) / 1]
+--------------------+--------------------+-------+ | mean_light| geometry| FIPS| +--------------------+--------------------+-------+ | 6.563388510224063|MULTIPOLYGON (((-...|US01003| | 5.303058346553825|POLYGON ((-85.422...|US01019| | 7.112594570538514|POLYGON ((-86.413...|US01021| | 2.5223492723492806|POLYGON ((-88.091...|US01025| | 9.564617731305812|POLYGON ((-85.789...|US01031| | 13.433770014555993|POLYGON ((-88.130...|US01033| | 8.240051020408188|POLYGON ((-86.370...|US01037| | 2.301078582434537|POLYGON ((-86.191...|US01039| | 0.9495387954422182|POLYGON ((-86.499...|US01041| | 8.3112128146453|POLYGON ((-85.770...|US01045| | 2.008212672420753|POLYGON ((-86.916...|US01047| | 4.339487179487201|POLYGON ((-86.699...|US01053| | NaN|POLYGON ((-85.863...|US01071| | 21.28748280605236|POLYGON ((-87.026...|US01073| | 2.8376935397757586|POLYGON ((-88.241...|US01075| | 9.75868121970662|POLYGON ((-87.529...|US01079| | 12.044760935910418|POLYGON ((-85.083...|US01081| | 18.568147013782642|POLYGON ((-86.783...|US01089| | 3.4789029535864975|POLYGON ((-88.195...|US01093| | 36.043608495696546|POLYGON ((-86.303...|US01095| | 26.75675321863559|MULTIPOLYGON (((-...|US01097| | 2.6444855206407305|POLYGON ((-86.907...|US01099| | 37.593265075573335|POLYGON ((-86.511...|US01101| | 33.53826026240135|POLYGON ((-86.581...|US01103| | 1.1684303350969931|POLYGON ((-87.421...|US01105| | 9.971524463290796|POLYGON ((-88.301...|US01107| | 13.011129071861175|POLYGON ((-84.998...|US01113| | 4.809910641754596|POLYGON ((-85.593...|US01123| | 8.41151685393288|POLYGON ((-87.065...|US01125| | 1.2630472854640653|POLYGON ((-86.857...|US01131| | 0.10422005744217205|MULTIPOLYGON (((-...|US02013| | 0.7744945567651736|POLYGON ((-157.26...|US02060| | 0.03614775725595106|MULTIPOLYGON (((-...|US02070| | 4.619017951013784|POLYGON ((-147.00...|US02090| | 0.03209213910320792|MULTIPOLYGON (((-...|US02100| |0.029853669307934268|MULTIPOLYGON (((-...|US02105| | 0.04098077587198586|MULTIPOLYGON (((-...|US02150| | 0.21807125968592578|MULTIPOLYGON (((-...|US02158| | 6.29803023696733|MULTIPOLYGON (((-...|US02185| | 0.1582469415182259|POLYGON ((-164.40...|US02188| | 0.15516483516485216|MULTIPOLYGON (((-...|US02195| | 0.10085646780862186|MULTIPOLYGON (((-...|US02220| |0.037185354691075395|POLYGON ((-135.48...|US02230| | 0.11782344122055301|MULTIPOLYGON (((-...|US02275| | 0.24472324900896858|POLYGON ((-141.00...|US02290| | 2.2717580297073106|POLYGON ((-112.54...|US04005| | 15.816996835964616|POLYGON ((-111.49...|US04013| | 1.7759504750453305|POLYGON ((-110.75...|US04017| | 7.688794802992883|POLYGON ((-111.03...|US04021| | 1.5279533719292975|POLYGON ((-91.377...|US05001| | 1.920353982300796|POLYGON ((-91.460...|US05003| | 10.465916837082402|POLYGON ((-94.563...|US05007| | 1.010121457489879|POLYGON ((-91.983...|US05011| | 2.270431427569262|POLYGON ((-93.408...|US05019| | 2.906217616580315|POLYGON ((-91.837...|US05023| | 0.529593412249097|POLYGON ((-91.975...|US05025| | 3.8983805668016687|POLYGON ((-93.238...|US05027| | 11.952818173441436|POLYGON ((-90.379...|US05031| | 7.514086802781834|POLYGON ((-90.304...|US05035| | 1.285983421250936|POLYGON ((-91.457...|US05043| | 12.884167588875034|POLYGON ((-92.252...|US05045| | 3.8769922353902655|POLYGON ((-91.793...|US05063| | 2.9252685660906153|POLYGON ((-91.032...|US05067| | 1.53237742830715|POLYGON ((-93.709...|US05071| | 0.780098955470026|POLYGON ((-93.520...|US05073| | 2.7720979765707727|POLYGON ((-94.407...|US05081| | 7.235225505443205|POLYGON ((-91.678...|US05085| | 9.081934846989215|POLYGON ((-94.041...|US05091| | 6.037962037962009|POLYGON ((-89.723...|US05093| | 1.860642728913338|POLYGON ((-91.059...|US05095| | 1.4942065491183798|POLYGON ((-93.373...|US05099| | 0.7677502138580021|POLYGON ((-93.104...|US05103| | 5.694871794871785|POLYGON ((-91.036...|US05111| | 1.8861226371599893|POLYGON ((-91.377...|US05117| | 1.5684210526316162|POLYGON ((-91.407...|US05121| | NaN|POLYGON ((-90.502...|US05123| | 0.8484433800889477|POLYGON ((-94.455...|US05127| | 14.166481069042172|POLYGON ((-94.450...|US05131| | 2.064942528735636|POLYGON ((-94.475...|US05133| | 1.864905660377377|POLYGON ((-92.240...|US05141| | 4.83930156851134|POLYGON ((-92.111...|US05145| | 1.1251287332646798|POLYGON ((-91.039...|US05147| | 1.696002578981324|POLYGON ((-92.927...|US05149| | 16.7709260687363|POLYGON ((-119.88...|US06003| | 6.375985412736477|POLYGON ((-120.07...|US06005| | 2.9443088203897143|POLYGON ((-120.07...|US06009| | 1.4734786853238142|POLYGON ((-121.88...|US06011| | 94.27982026143789|POLYGON ((-122.31...|US06013| | 1.0074585635359046|POLYGON ((-124.06...|US06015| | 14.33509453996127|POLYGON ((-118.77...|US06019| | 2.1304226242583373|POLYGON ((-122.00...|US06021| | 2.01199374477674|POLYGON ((-124.02...|US06023| | 1.155351130875798|POLYGON ((-115.89...|US06027| | 83.75094670225657|MULTIPOLYGON (((-...|US06037| | 6.0181974791927555|POLYGON ((-119.02...|US06039| | 0.9955249854977457|POLYGON ((-123.53...|US06045| | 14.513000624831733|POLYGON ((-122.62...|US06055| | 68.95424107142858|POLYGON ((-120.00...|US06061| | 1.265862072671394|POLYGON ((-121.07...|US06063| | 31.23753117206894|POLYGON ((-121.57...|US06067| +--------------------+--------------------+-------+ only showing top 100 rows
SedonaKepler.create_map(county_light_tiled_df)
User Guide: https://docs.kepler.gl/docs/keplergl-jupyter
/opt/conda/lib/python3.10/site-packages/jupyter_client/session.py:719: UserWarning: Message serialization failed with: Out of range float values are not JSON compliant Supporting this message is deprecated in jupyter-client 7, please make sure your message is JSON-compliant content = self.pack(content)
KeplerGl(data={'unnamed': {'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,…