Clear Sky Detection Demo With McClear#

This notebook demonstrates the McClear-based workflow in the bsrn package, including data reading, clear-sky modeling, quality control tests, Reno clear-sky detection, and CSD-point visualization.

1. Import libraries#

[1]:
import os
import random
import pandas as pd
import bsrn

# Config: input file (same pattern as cloud_enhancement_events_detection.ipynb)
INPUT_FILE = "/Volumes/Macintosh Research/Data/bsrn-qc/data/QIQ/qiq0924.dat.gz"

# Set display options
pd.options.display.max_columns = None
%matplotlib inline

2. Load QIQ September data#

[2]:
# Extract station code from filename (e.g. qiq0924.dat.gz -> QIQ)
stn = os.path.basename(INPUT_FILE).split(".")[0][:3].upper()

if os.path.exists(INPUT_FILE):
    df = bsrn.io.reader.read_station_to_archive(INPUT_FILE, logical_records="lr0100")
    print(f"Data for {stn} loaded successfully (September 2024).")
    display(df.head())
else:
    raise FileNotFoundError(f"File not found: {INPUT_FILE}. Current working directory: {os.getcwd()}")

Data for QIQ loaded successfully (September 2024).
ghi bni dhi lwd temp rh pressure
time
2024-09-01 00:00:00+00:00 142.0 0.0 143.0 371.0 22.1 48.0 988.0
2024-09-01 00:01:00+00:00 141.0 0.0 141.0 371.0 22.2 49.0 988.0
2024-09-01 00:02:00+00:00 139.0 0.0 140.0 371.0 22.2 46.0 988.0
2024-09-01 00:03:00+00:00 139.0 0.0 139.0 371.0 22.2 45.0 988.0
2024-09-01 00:04:00+00:00 137.0 0.0 138.0 371.0 22.1 48.0 988.0

3. Clear-sky modeling with McClear#

[4]:
# TODO: replace with your real SoDa / McClear email
mcclear_email = "your_email@example.com"

df_cs = bsrn.modeling.clear_sky.add_clearsky_columns(
    df.copy(),
    station_code=stn,
    model="mcclear",
    mcclear_email=mcclear_email,
)

display(df_cs[["ghi", "ghi_clear", "bni", "bni_clear", "dhi", "dhi_clear"]].head())

ghi ghi_clear bni bni_clear dhi dhi_clear
time
2024-09-01 00:00:00+00:00 142.0 NaN 0.0 NaN 143.0 NaN
2024-09-01 00:01:00+00:00 141.0 446.412 0.0 687.102 141.0 114.546
2024-09-01 00:02:00+00:00 139.0 449.172 0.0 688.938 140.0 114.780
2024-09-01 00:03:00+00:00 139.0 451.926 0.0 690.756 139.0 115.008
2024-09-01 00:04:00+00:00 137.0 454.668 0.0 692.556 138.0 115.236

4. Run QC wrapper#

run_qc runs a suite of BSRN quality control tests and adds flag columns (0 = pass, 1 = fail). By default it runs all six test groups:

Test

Level

Description

Flag columns

ppl

1

Physically Possible Limits — absolute bounds for GHI, BNI, DHI, LWD

flagPPLGHI, flagPPLBNI, flagPPLDHI, flagPPLLWD

erl

2

Extremely Rare Limits — climatological limits for GHI, BNI, DHI, LWD

flagERLGHI, flagERLBNI, flagERLDHI, flagERLLWD

closure

3

Closure — \(G_h\) vs \(B_n \cos Z + D_h\) at low/high zenith

flag3lowSZA, flag3highSZA

diff_ratio

4

Diffuse ratio — \(k\)\(k_t\) and diffuse-fraction checks

flagKKt, flagKlowSZA, flagKhighSZA

k_index

5

K-indices — \(k_b\)/\(k_t\) and clearness-index limits

flagKbKt, flagKb, flagKt

tracker

6

Tracker-off — detects tracking errors using clear-sky/extra references

flagTracker

You can restrict tests via tests=('ppl', 'closure') etc.

[5]:
df_qc = bsrn.qc.run_qc(df_cs.copy(), station_code=stn)

flag_cols = [c for c in df_qc.columns if c.startswith("flag")]
print("QC flag counts (non-zero indicates failure):")
display(df_qc[flag_cols].sum())

QC flag counts (non-zero indicates failure):
flagPPLGHI          0
flagPPLBNI          0
flagPPLDHI          0
flagPPLLWD          0
flagERLGHI      16408
flagERLBNI          3
flagERLDHI      15498
flagERLLWD          0
flag3lowSZA         0
flag3highSZA        0
flagKKt             0
flagKlowSZA         0
flagKhighSZA        0
flagKbKt            0
flagKb              0
flagKt              0
flagTracker       100
dtype: int64

5. Reno clear-sky detection#

[6]:
ghi = df_qc["ghi"].values
ghi_clear = df_qc["ghi_clear"].values
times = df_qc.index

reno_out = bsrn.utils.reno_csd(ghi, ghi_clear, times=times, return_diagnostics=False)

df_csd = df_qc.copy()
df_csd["is_clearsky_reno"] = reno_out["is_clearsky"]
df_csd["cloud_flag_reno"] = reno_out["cloud_flag"]

display(df_csd[["ghi", "ghi_clear", "is_clearsky_reno", "cloud_flag_reno"]].head())

ghi ghi_clear is_clearsky_reno cloud_flag_reno
time
2024-09-01 00:00:00+00:00 142.0 NaN <NA> NaN
2024-09-01 00:01:00+00:00 141.0 446.412 False 1.0
2024-09-01 00:02:00+00:00 139.0 449.172 False 1.0
2024-09-01 00:03:00+00:00 139.0 451.926 False 1.0
2024-09-01 00:04:00+00:00 137.0 454.668 False 1.0

6. Choose a random Reno clear-sky day#

[7]:
cs = df_csd["is_clearsky_reno"]
cs_days = cs[cs == True].index.normalize().unique()

if len(cs_days) == 0:
    raise RuntimeError("No clear-sky points found by Reno in this month.")

random_day = random.choice(list(cs_days))
print(f"Random clear-sky day (Reno): {random_day}")

day_slice = df_csd.loc[str(random_day.date())]
display(day_slice[["ghi", "ghi_clear", "is_clearsky_reno"]].head())

Random clear-sky day (Reno): 2024-09-25 00:00:00+00:00
ghi ghi_clear is_clearsky_reno
time
2024-09-25 00:00:00+00:00 313.0 334.836 False
2024-09-25 00:01:00+00:00 315.0 337.548 False
2024-09-25 00:02:00+00:00 323.0 340.314 False
2024-09-25 00:03:00+00:00 326.0 343.074 False
2024-09-25 00:04:00+00:00 327.0 345.834 False

7. Plot September 2 with Reno clear-sky points#

[8]:
import matplotlib.pyplot as plt

# Choose a specific day in September
sep2_str = "2024-09-02"
if sep2_str not in df_csd.index.strftime("%Y-%m-%d").unique():
    raise ValueError(f"No data found for {sep2_str} in df_csd.")

sep2 = df_csd.loc[sep2_str]

plt.figure(figsize=(10, 4))
plt.plot(sep2.index, sep2["ghi"], label="Measured GHI", color="#E69F00", linewidth=0.8)
plt.plot(sep2.index, sep2["ghi_clear"], label="McClear GHI (clear-sky)", color="#56B4E9", linestyle="--", linewidth=0.8)

# Highlight Reno clear-sky points
if "is_clearsky_reno" in sep2.columns:
    cs_mask = sep2["is_clearsky_reno"] == True
    if cs_mask.any():
        plt.scatter(
            sep2.index[cs_mask],
            sep2["ghi"][cs_mask],
            s=8,
            color="#009E73",
            label="Reno clear-sky",
            zorder=5,
        )

plt.title(f"{stn} GHI and McClear GHI on {sep2_str}")
plt.ylabel("Irradiance [W/m²]")
plt.xlabel("Time (UTC)")
plt.legend(frameon=False)
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()
../_images/tutorials_5.clear_sky_detection_15_0.png