Tutorial — Working with Raster Data

Goal: After this tutorial, you can load, manipulate, reproject, and visualize any satellite imagery in Python using rasterio, numpy, and matplotlib.

This is the mechanical foundation. Every analysis technique in this course — change detection, spectral analysis, object detection, terrain analysis — starts with loading raster data and doing math on arrays. Master this and everything else follows.

Step 1: What Is a Raster?

A raster is a grid of pixels, where each pixel holds one or more numeric values (bands), plus metadata that ties the grid to a location on Earth.

Three components:

Pixel values — numpy arrays. One 2D array per band. Values represent reflectance, temperature, elevation, backscatter, etc.
Georeference (affine transform) — maps pixel coordinates (row, col) to geographic coordinates (easting, northing or lon, lat). A 3x3 matrix that encodes origin, pixel size, and rotation.
Coordinate Reference System (CRS) — defines what the geographic coordinates mean. EPSG:4326 = lat/lon on WGS84 ellipsoid. UTM zones = meters in a projected grid.

Pixel (row=100, col=200)
    ↓ affine transform
Geographic (X=24.7523, Y=59.4370) in CRS EPSG:4326
    ↓ CRS interpretation
Physical location: Tallinn, Estonia

Step 2: Load GeoTIFF with Rasterio

import rasterio
import numpy as np
import matplotlib.pyplot as plt
 
# Open a single-band GeoTIFF (e.g., Sentinel-2 Red band)
with rasterio.open("T35VLF_B04_10m.tif") as src:
    # Read band 1 (bands are 1-indexed in rasterio)
    red = src.read(1)  # numpy array, shape (H, W)
 
    # Metadata
    print(f"Shape: {red.shape}")
    print(f"Data type: {red.dtype}")
    print(f"CRS: {src.crs}")
    print(f"Transform:\n{src.transform}")
    print(f"Bounds: {src.bounds}")
    print(f"Resolution: {src.res}")  # (x_res, y_res) in CRS units
    print(f"Number of bands: {src.count}")
    print(f"NoData value: {src.nodata}")
 
# The transform (affine matrix):
# | a  b  c |     a = pixel width (in CRS units, e.g., 10m)
# | d  e  f |     e = pixel height (negative, because row 0 = top)
# | 0  0  1 |     c, f = origin (upper-left corner coordinates)
 
# Convert pixel to geographic coordinates
transform = src.transform
row, col = 100, 200
x, y = rasterio.transform.xy(transform, row, col)
print(f"Pixel ({row}, {col}) -> Geographic ({x:.4f}, {y:.4f})")
 
# Convert geographic to pixel coordinates
row_back, col_back = rasterio.transform.rowcol(transform, x, y)
print(f"Geographic ({x:.4f}, {y:.4f}) -> Pixel ({row_back}, {col_back})")

Load Multiple Bands

import rasterio
import numpy as np
 
def load_sentinel2_bands(band_files, scale_factor=0.0001):
    """
    Load Sentinel-2 bands from individual files.
    L2A products store reflectance as uint16 with scale factor 0.0001.
    Returns: dict of {band_name: numpy_array}, plus metadata.
    """
    bands = {}
    meta = None
 
    for name, filepath in band_files.items():
        with rasterio.open(filepath) as src:
            data = src.read(1).astype(np.float32) * scale_factor
            bands[name] = data
            if meta is None:
                meta = {
                    "transform": src.transform,
                    "crs": src.crs,
                    "shape": data.shape,
                    "profile": src.profile.copy(),
                }
    return bands, meta
 
# Example usage with Sentinel-2 L2A product
band_files = {
    "blue":  "R10m/T35VLF_B02_10m.jp2",
    "green": "R10m/T35VLF_B03_10m.jp2",
    "red":   "R10m/T35VLF_B04_10m.jp2",
    "nir":   "R10m/T35VLF_B08_10m.jp2",
}
 
# bands, meta = load_sentinel2_bands(band_files)

Step 3: Coordinate Reference Systems (CRS)

CRS is the single most common source of errors in geospatial analysis. If your data is in the wrong CRS, your coordinates are wrong, your distances are wrong, and your overlays don’t align.

The Key Systems

CRS	EPSG Code	Units	Use
WGS84 Geographic	4326	Degrees (lat/lon)	GPS, web maps, general reference
UTM Zone 35N	32635	Meters	Estonia, Finland (accurate distance/area)
Web Mercator	3857	Meters (distorted)	Google Maps, web tiles — NEVER use for analysis

Why CRS matters for analysis:

1 degree of latitude = ~111 km everywhere. 1 degree of longitude = ~111 km at equator, ~56 km at 60N (Estonia).

If you compute distance or area in EPSG:4326, you’ll get wrong results because degrees are not uniform. Always reproject to a metric CRS (UTM) for measurements.

Check and Convert CRS

import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling
 
def reproject_raster(src_path, dst_path, dst_crs="EPSG:32635"):
    """
    Reproject a raster to a new CRS.
    UTM Zone 35N (EPSG:32635) is correct for Estonia.
    """
    with rasterio.open(src_path) as src:
        # Calculate new transform and dimensions
        transform, width, height = calculate_default_transform(
            src.crs, dst_crs, src.width, src.height, *src.bounds
        )
 
        # Update profile for output
        profile = src.profile.copy()
        profile.update(
            crs=dst_crs,
            transform=transform,
            width=width,
            height=height,
        )
 
        with rasterio.open(dst_path, "w", **profile) as dst:
            for band_idx in range(1, src.count + 1):
                reproject(
                    source=rasterio.band(src, band_idx),
                    destination=rasterio.band(dst, band_idx),
                    src_transform=src.transform,
                    src_crs=src.crs,
                    dst_transform=transform,
                    dst_crs=dst_crs,
                    resampling=Resampling.bilinear,
                )
 
    print(f"Reprojected {src_path} -> {dst_path} (CRS: {dst_crs})")
 
# reproject_raster("input_4326.tif", "output_utm35n.tif")

Find the Right UTM Zone

def utm_zone_from_lon(lon):
    """Get UTM zone number from longitude."""
    zone = int((lon + 180) / 6) + 1
    return zone
 
def utm_epsg(lon, lat):
    """Get EPSG code for the UTM zone containing (lon, lat)."""
    zone = utm_zone_from_lon(lon)
    if lat >= 0:
        return f"EPSG:326{zone:02d}"  # North
    else:
        return f"EPSG:327{zone:02d}"  # South
 
# Estonia
print(utm_epsg(24.75, 59.44))  # EPSG:32635 (UTM Zone 35N)

Step 4: Display Imagery with Matplotlib

True Color (RGB) Composite

import numpy as np
import matplotlib.pyplot as plt
import rasterio
 
def display_rgb(red, green, blue, title="True Color", percentile_stretch=(2, 98)):
    """
    Display a true-color composite with histogram stretching.
    Input bands should be float arrays.
    """
    # Stack into RGB array
    rgb = np.stack([red, green, blue], axis=-1)
 
    # Remove invalid pixels
    rgb = np.where(rgb < 0, 0, rgb)
 
    # Percentile stretch for visualization
    for i in range(3):
        band = rgb[:, :, i]
        valid = band[band > 0]
        if len(valid) == 0:
            continue
        lo = np.percentile(valid, percentile_stretch[0])
        hi = np.percentile(valid, percentile_stretch[1])
        rgb[:, :, i] = np.clip((band - lo) / (hi - lo + 1e-10), 0, 1)
 
    fig, ax = plt.subplots(figsize=(10, 10))
    ax.imshow(rgb)
    ax.set_title(title)
    ax.axis("off")
    plt.tight_layout()
    return fig
 
# Usage:
# display_rgb(bands["red"], bands["green"], bands["blue"])
# plt.savefig("true_color.png", dpi=150)
# plt.show()

False Color Composites

Different band combinations reveal different features:

def display_composite(band_r, band_g, band_b, title="Composite",
                      percentile_stretch=(2, 98)):
    """Generic band composite display."""
    return display_rgb(band_r, band_g, band_b, title, percentile_stretch)
 
# Common composites:
# True color:       Red=B04, Green=B03, Blue=B02
# Color infrared:   Red=B08(NIR), Green=B04(Red), Blue=B03(Green)
#   -> Vegetation appears RED, water appears BLACK, urban appears CYAN
# Agriculture:      Red=B11(SWIR), Green=B08(NIR), Blue=B02(Blue)
#   -> Distinguishes crop types and health
# Urban:            Red=B12(SWIR2), Green=B11(SWIR1), Blue=B04(Red)
#   -> Urban areas appear in distinct colors
 
# Example:
# display_composite(bands["nir"], bands["red"], bands["green"],
#                   title="Color Infrared (NIR-Red-Green)")

Add Scale Bar and North Arrow

import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib.offsetbox import AnchoredText
import numpy as np
 
def add_scalebar(ax, transform, length_m=1000, location="lower right"):
    """
    Add a scale bar to a rasterio-based plot.
    transform: rasterio affine transform
    length_m: scale bar length in meters
    """
    # Pixel size in meters (assumes metric CRS like UTM)
    pixel_size = abs(transform.a)
    length_px = length_m / pixel_size
 
    # Position in axes coordinates
    if location == "lower right":
        x_start = 0.65
        y_pos = 0.05
    elif location == "lower left":
        x_start = 0.05
        y_pos = 0.05
 
    ax_transform = ax.transAxes
    ax.plot(
        [x_start, x_start + 0.25],
        [y_pos, y_pos],
        color="white", linewidth=3,
        transform=ax_transform
    )
    ax.text(
        x_start + 0.125, y_pos + 0.02,
        f"{length_m/1000:.0f} km" if length_m >= 1000 else f"{length_m} m",
        color="white", fontsize=10, ha="center",
        transform=ax_transform,
        fontweight="bold",
        bbox=dict(boxstyle="round", facecolor="black", alpha=0.5),
    )
 
def add_north_arrow(ax, x=0.95, y=0.95, size=0.05):
    """Add a north arrow to the plot."""
    ax.annotate(
        "N", xy=(x, y), xytext=(x, y - size),
        arrowprops=dict(facecolor="white", edgecolor="white", width=2, headwidth=8),
        fontsize=12, color="white", ha="center", fontweight="bold",
        xycoords="axes fraction", textcoords="axes fraction",
    )

Histogram of Band Values

def plot_histogram(bands_dict, title="Band Histograms"):
    """Plot histograms of band values."""
    fig, ax = plt.subplots(figsize=(10, 5))
    colors = {"blue": "blue", "green": "green", "red": "red",
              "nir": "darkred", "swir1": "purple", "swir2": "brown"}
 
    for name, data in bands_dict.items():
        valid = data[data > 0].flatten()
        if len(valid) > 100000:
            valid = np.random.choice(valid, 100000)  # subsample for speed
        ax.hist(valid, bins=200, alpha=0.5, label=name,
                color=colors.get(name, None))
 
    ax.set_xlabel("Reflectance")
    ax.set_ylabel("Count")
    ax.set_title(title)
    ax.legend()
    plt.tight_layout()
    return fig

Step 5: Band Math — NDVI and Beyond

Band math is element-wise arithmetic on numpy arrays. The most common: vegetation indices.

import numpy as np
 
def compute_ndvi(nir, red):
    """
    Normalized Difference Vegetation Index.
    NDVI = (NIR - Red) / (NIR + Red)
    Range: -1 to +1. Vegetation > 0.3, water < 0, bare soil 0.1-0.2.
    """
    nir = nir.astype(np.float32)
    red = red.astype(np.float32)
    denominator = nir + red
    # Avoid division by zero
    ndvi = np.where(denominator > 0, (nir - red) / denominator, 0)
    return ndvi
 
def compute_ndwi(green, nir):
    """
    Normalized Difference Water Index (McFeeters).
    NDWI = (Green - NIR) / (Green + NIR)
    Positive values indicate water.
    """
    green = green.astype(np.float32)
    nir = nir.astype(np.float32)
    denominator = green + nir
    ndwi = np.where(denominator > 0, (green - nir) / denominator, 0)
    return ndwi
 
def compute_ndbi(swir, nir):
    """
    Normalized Difference Built-up Index.
    NDBI = (SWIR - NIR) / (SWIR + NIR)
    Positive values indicate built-up/urban areas.
    """
    swir = swir.astype(np.float32)
    nir = nir.astype(np.float32)
    denominator = swir + nir
    ndbi = np.where(denominator > 0, (swir - nir) / denominator, 0)
    return ndbi
 
# Usage:
# ndvi = compute_ndvi(bands["nir"], bands["red"])
# plt.imshow(ndvi, cmap="RdYlGn", vmin=-0.2, vmax=0.8)

Complete Band Math Example (Synthetic Data)

import numpy as np
import matplotlib.pyplot as plt
 
# Simulate a Sentinel-2 scene with realistic spectral values
np.random.seed(42)
size = 500
 
# Create land cover map
land_cover = np.zeros((size, size), dtype=int)
land_cover[:200, :] = 1     # forest (top)
land_cover[200:300, :250] = 2  # water (middle-left)
land_cover[200:300, 250:] = 3  # urban (middle-right)
land_cover[300:, :] = 4     # agriculture (bottom)
 
# Generate realistic reflectance per land cover type
# Values in surface reflectance units (0-1)
spectral_profiles = {
    # (blue, green, red, nir, swir1)
    0: (0.08, 0.10, 0.14, 0.22, 0.28),  # bare soil
    1: (0.02, 0.05, 0.03, 0.45, 0.15),  # forest
    2: (0.05, 0.04, 0.03, 0.01, 0.00),  # water
    3: (0.08, 0.09, 0.12, 0.18, 0.25),  # urban
    4: (0.03, 0.07, 0.04, 0.50, 0.20),  # agriculture (healthy crop)
}
 
bands = {}
band_names = ["blue", "green", "red", "nir", "swir1"]
for i, name in enumerate(band_names):
    band = np.zeros((size, size), dtype=np.float32)
    for lc, profile in spectral_profiles.items():
        mask = land_cover == lc
        band[mask] = profile[i] + np.random.normal(0, 0.01, np.sum(mask))
    bands[name] = np.clip(band, 0, 1)
 
# Compute indices
ndvi = compute_ndvi(bands["nir"], bands["red"])
ndwi = compute_ndwi(bands["green"], bands["nir"])
ndbi = compute_ndbi(bands["swir1"], bands["nir"])
 
# Display
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
 
# True color
rgb = np.stack([bands["red"], bands["green"], bands["blue"]], axis=-1)
rgb = np.clip(rgb * 5, 0, 1)  # stretch for display
axes[0, 0].imshow(rgb)
axes[0, 0].set_title("True Color")
 
# Color infrared
cir = np.stack([bands["nir"], bands["red"], bands["green"]], axis=-1)
cir = np.clip(cir * 3, 0, 1)
axes[0, 1].imshow(cir)
axes[0, 1].set_title("Color Infrared (NIR-R-G)")
 
# NDVI
im = axes[0, 2].imshow(ndvi, cmap="RdYlGn", vmin=-0.5, vmax=0.8)
axes[0, 2].set_title("NDVI")
plt.colorbar(im, ax=axes[0, 2], fraction=0.046)
 
# NDWI
im = axes[1, 0].imshow(ndwi, cmap="RdBu", vmin=-0.5, vmax=0.5)
axes[1, 0].set_title("NDWI (water positive)")
plt.colorbar(im, ax=axes[1, 0], fraction=0.046)
 
# NDBI
im = axes[1, 1].imshow(ndbi, cmap="RdYlBu_r", vmin=-0.5, vmax=0.5)
axes[1, 1].set_title("NDBI (urban positive)")
plt.colorbar(im, ax=axes[1, 1], fraction=0.046)
 
# Simple classification
classification = np.zeros_like(ndvi, dtype=int)
classification[ndwi > 0.1] = 1          # water
classification[ndvi > 0.3] = 2          # vegetation
classification[(ndbi > 0) & (ndvi < 0.3)] = 3  # urban
# 0 = unclassified/bare soil
 
colors = ["tan", "blue", "green", "gray"]
from matplotlib.colors import ListedColormap
cmap = ListedColormap(colors)
axes[1, 2].imshow(classification, cmap=cmap, vmin=0, vmax=3)
axes[1, 2].set_title("Classification")
import matplotlib.patches as mpatches
legend = [mpatches.Patch(color=c, label=l)
          for c, l in zip(colors, ["Bare soil", "Water", "Vegetation", "Urban"])]
axes[1, 2].legend(handles=legend, loc="lower right")
 
for ax in axes.flat:
    ax.axis("off")
plt.tight_layout()
plt.savefig("band_math_demo.png", dpi=150)
plt.show()

Step 6: Clip to Area of Interest

Real analysis is never on a full 100x100 km tile. You clip to your area of interest (AOI).

import rasterio
from rasterio.mask import mask
import geopandas as gpd
from shapely.geometry import box, mapping
import json
 
def clip_raster_to_bbox(raster_path, output_path, bbox, dst_crs=None):
    """
    Clip a raster to a bounding box.
    bbox: (west, south, east, north) in the raster's CRS, or specify dst_crs.
    """
    west, south, east, north = bbox
    clip_geom = box(west, south, east, north)
 
    with rasterio.open(raster_path) as src:
        # If bbox is in different CRS, reproject geometry
        if dst_crs and str(src.crs) != dst_crs:
            from pyproj import Transformer
            transformer = Transformer.from_crs(dst_crs, str(src.crs), always_xy=True)
            west_t, south_t = transformer.transform(west, south)
            east_t, north_t = transformer.transform(east, north)
            clip_geom = box(west_t, south_t, east_t, north_t)
 
        # Clip
        out_image, out_transform = mask(src, [mapping(clip_geom)], crop=True)
        out_profile = src.profile.copy()
        out_profile.update(
            height=out_image.shape[1],
            width=out_image.shape[2],
            transform=out_transform,
        )
 
        with rasterio.open(output_path, "w", **out_profile) as dst:
            dst.write(out_image)
 
    print(f"Clipped to {output_path}: {out_image.shape}")
 
# Clip to Tallinn old town area (approximate bbox in UTM 35N)
# clip_raster_to_bbox(
#     "sentinel2_full_tile.tif",
#     "tallinn_oldtown.tif",
#     bbox=(540000, 6586000, 545000, 6590000),  # UTM 35N meters
# )

Clip with Shapefile/GeoJSON

import rasterio
from rasterio.mask import mask
import geopandas as gpd
 
def clip_raster_to_shapefile(raster_path, output_path, shapefile_path):
    """Clip a raster using a vector boundary (shapefile or GeoJSON)."""
    # Load vector
    gdf = gpd.read_file(shapefile_path)
 
    with rasterio.open(raster_path) as src:
        # Reproject vector to raster CRS if needed
        if gdf.crs != src.crs:
            gdf = gdf.to_crs(src.crs)
 
        # Get geometries as GeoJSON-like dicts
        geoms = [mapping(geom) for geom in gdf.geometry]
 
        out_image, out_transform = mask(src, geoms, crop=True, nodata=0)
        out_profile = src.profile.copy()
        out_profile.update(
            height=out_image.shape[1],
            width=out_image.shape[2],
            transform=out_transform,
            nodata=0,
        )
 
        with rasterio.open(output_path, "w", **out_profile) as dst:
            dst.write(out_image)

Step 7: Save Results as GeoTIFF

Always preserve georeference when saving derived products.

import rasterio
import numpy as np
 
def save_geotiff(data, output_path, reference_raster_path=None,
                 transform=None, crs=None, nodata=None):
    """
    Save a numpy array as a GeoTIFF, preserving georeference.
    Either provide reference_raster_path OR (transform + crs).
    data: 2D array (single band) or 3D array (bands, height, width)
    """
    if data.ndim == 2:
        data = data[np.newaxis, :, :]  # add band dimension
 
    bands, height, width = data.shape
 
    if reference_raster_path:
        with rasterio.open(reference_raster_path) as ref:
            transform = ref.transform
            crs = ref.crs
 
    profile = {
        "driver": "GTiff",
        "dtype": data.dtype,
        "width": width,
        "height": height,
        "count": bands,
        "crs": crs,
        "transform": transform,
        "compress": "deflate",
    }
    if nodata is not None:
        profile["nodata"] = nodata
 
    with rasterio.open(output_path, "w", **profile) as dst:
        dst.write(data)
 
    print(f"Saved: {output_path} ({bands} bands, {height}x{width}, {data.dtype})")
 
# Save NDVI result
# save_geotiff(ndvi, "ndvi_output.tif",
#              reference_raster_path="sentinel2_B04.tif")

Step 8: Working with Sentinel-2 — Complete Pipeline

import rasterio
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
from rasterio.enums import Resampling
 
def load_sentinel2_l2a(product_dir):
    """
    Load Sentinel-2 L2A product from unzipped SAFE directory.
    Returns dict of bands at 10m resolution (20m bands upsampled).
    """
    product_dir = Path(product_dir)
 
    # Find the GRANULE directory
    granule_dirs = list((product_dir / "GRANULE").iterdir())
    if not granule_dirs:
        raise FileNotFoundError(f"No granule found in {product_dir}")
    granule = granule_dirs[0]
 
    img_dir = granule / "IMG_DATA"
    r10m = img_dir / "R10m"
    r20m = img_dir / "R20m"
 
    bands = {}
    ref_shape = None
    ref_transform = None
    ref_crs = None
 
    # 10m bands
    for band_name, file_suffix in [("B02", "B02_10m"), ("B03", "B03_10m"),
                                     ("B04", "B04_10m"), ("B08", "B08_10m")]:
        matches = list(r10m.glob(f"*{file_suffix}*"))
        if matches:
            with rasterio.open(matches[0]) as src:
                data = src.read(1).astype(np.float32) * 0.0001
                bands[band_name] = data
                if ref_shape is None:
                    ref_shape = data.shape
                    ref_transform = src.transform
                    ref_crs = src.crs
 
    # 20m bands — resample to 10m
    for band_name, file_suffix in [("B05", "B05_20m"), ("B06", "B06_20m"),
                                     ("B07", "B07_20m"), ("B8A", "B8A_20m"),
                                     ("B11", "B11_20m"), ("B12", "B12_20m"),
                                     ("SCL", "SCL_20m")]:
        matches = list(r20m.glob(f"*{file_suffix}*"))
        if matches:
            with rasterio.open(matches[0]) as src:
                # Resample from 20m to 10m
                data = src.read(
                    1,
                    out_shape=ref_shape,
                    resampling=Resampling.bilinear if band_name != "SCL"
                               else Resampling.nearest,
                ).astype(np.float32)
                if band_name != "SCL":
                    data *= 0.0001
                bands[band_name] = data
 
    meta = {"transform": ref_transform, "crs": ref_crs, "shape": ref_shape}
    return bands, meta
 
 
def cloud_mask_scl(scl):
    """
    Create cloud mask from Sentinel-2 Scene Classification Layer.
    SCL classes:
      0: No Data, 1: Saturated/Defective, 2: Dark area,
      3: Cloud shadow, 4: Vegetation, 5: Bare soils,
      6: Water, 7: Unclassified, 8: Cloud (medium prob),
      9: Cloud (high prob), 10: Thin cirrus, 11: Snow/ice
    Returns: boolean mask (True = VALID pixel, False = cloudy/invalid)
    """
    valid_classes = {4, 5, 6, 7, 11}  # vegetation, soil, water, unclassified, snow
    valid = np.isin(scl.astype(int), list(valid_classes))
    return valid
 
 
# Full pipeline example:
# bands, meta = load_sentinel2_l2a("S2A_MSIL2A_20240315T100021_N0510_R122_T35VLF.SAFE")
#
# # Cloud mask
# if "SCL" in bands:
#     valid = cloud_mask_scl(bands["SCL"])
#     for name in bands:
#         if name != "SCL":
#             bands[name] = np.where(valid, bands[name], np.nan)
#
# # Compute NDVI
# ndvi = (bands["B08"] - bands["B04"]) / (bands["B08"] + bands["B04"] + 1e-10)
#
# # Display
# fig, axes = plt.subplots(1, 3, figsize=(18, 6))
# rgb = np.stack([bands["B04"], bands["B03"], bands["B02"]], axis=-1)
# rgb = np.clip(rgb * 3.5, 0, 1)
# axes[0].imshow(rgb)
# axes[0].set_title("True Color")
# axes[1].imshow(ndvi, cmap="RdYlGn", vmin=-0.2, vmax=0.8)
# axes[1].set_title("NDVI")
# axes[2].imshow(valid, cmap="RdYlGn")
# axes[2].set_title("Cloud Mask (green=valid)")
# plt.show()

Try This Next

Exercise 1: Create NDWI Water Index Map

Load Sentinel-2 B03 (Green) and B08 (NIR) for a coastal area
Compute NDWI = (Green - NIR) / (Green + NIR)
Threshold at 0.1 to create binary water mask
Compute total water area in km^2 (multiply pixel count by pixel area)
Save result as GeoTIFF

Exercise 2: Stack Bands from Multiple Files

Load all 10m Sentinel-2 bands (B02, B03, B04, B08) from separate JP2 files
Stack into a single multi-band GeoTIFF
Verify the output has 4 bands and correct georeference

Exercise 3: Overlay Vector on Raster

Load a Sentinel-2 true color composite
Download Estonia administrative boundaries (e.g., from gadm.org)
Overlay county boundaries on the satellite image
Use geopandas + matplotlib for the overlay

Self-Test Questions

What does the affine transform encode? What happens if it’s wrong?
Why should you NEVER use EPSG:3857 (Web Mercator) for area calculations?
What is the Sentinel-2 SCL band and why is it important?
How do you resample a 20m band to 10m? When would you use bilinear vs nearest-neighbor?
What scale factor converts Sentinel-2 L2A uint16 values to surface reflectance?

GEOINT Notes

Explorer

Tutorial - Working with Raster Data