Example: Compare Ground Measurements with CAMS CRS Satellite Data#
示例:对比地面观测与 CAMS CRS 卫星数据#
This notebook loads one station-month with :class:~bsrn.dataset.BSRNDataset, runs solpos → clear_sky (REST2 / MERRA-2 HF) → qc_test → qc_mask, adds CAMS CRS columns from Hugging Face, time-averages with :meth:~bsrn.dataset.BSRNDataset.average, and plots a calendar comparison.
本手册用 :class:~bsrn.dataset.BSRNDataset 加载单月数据,执行 solpos → clear_sky(REST2 / MERRA-2 HF)→ qc_test → qc_mask,从 Hugging Face 加入 CRS 列,再以 :meth:~bsrn.dataset.BSRNDataset.average 做时间平均并绘制日历对比图。
[1]:
import os
import bsrn
from bsrn.io.crs import add_crs_columns
from bsrn.visualization import plot_calendar
1. Parameters / 参数设置#
[2]:
INPUT_FILE = "/Volumes/Macintosh Research/Data/bsrn-qc/data/QIQ/qiq0824.dat.gz"
print(f"--- Processing {INPUT_FILE} ---")
if not os.path.exists(INPUT_FILE):
raise FileNotFoundError(f"File not found: {INPUT_FILE}")
--- Processing /Volumes/Macintosh Research/Data/bsrn-qc/data/QIQ/qiq0824.dat.gz ---
2. Load and pipeline / 加载与管线#
BSRNDataset.from_file validates LR0100; then solpos, clear_sky(model='rest2'), qc_test, and qc_mask mutate the cached frame in place.
BSRNDataset.from_file 校验 LR0100;随后 solpos、clear_sky(model='rest2')、qc_test、qc_mask 原地更新缓存帧。
[3]:
ds = bsrn.BSRNDataset.from_file(INPUT_FILE)
print(f"Station {ds.station_code}: loaded {len(ds.data())} rows.")
print("solpos() …")
ds.solpos()
print("clear_sky(model='rest2') …")
ds.clear_sky(model="rest2")
print("qc_test() …")
ds.qc_test()
print("qc_mask() …")
ds.qc_mask()
Station QIQ: loaded 44640 rows.
solpos() …
clear_sky(model='rest2') …
Fetching MERRA-2 from Hugging Face: qiq/qiq0824_merra2.parquet
qc_test() …
qc_mask() …
[3]:
| ghi | bni | dhi | lwd | temp | rh | pressure | zenith | apparent_zenith | azimuth | bni_extra | ghi_extra | ghi_clear | bni_clear | dhi_clear | lwd_clear | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2024-08-01 00:00:00+00:00 | 105.0 | 0.0 | 106.0 | 421.0 | 25.2 | 70.0 | 981.0 | 54.881 | 54.857 | 102.414 | 1320.650 | 759.739 | 507.439814 | 664.498840 | 125.169227 | 410.715619 |
| 2024-08-01 00:01:00+00:00 | 108.0 | 0.0 | 109.0 | 422.0 | 25.2 | 71.0 | 981.0 | 54.717 | 54.694 | 102.625 | 1320.650 | 762.828 | 509.994881 | 665.753119 | 125.445597 | 411.548727 |
| 2024-08-01 00:02:00+00:00 | 112.0 | 0.0 | 113.0 | 421.0 | 25.2 | 70.0 | 981.0 | 54.553 | 54.530 | 102.836 | 1320.650 | 765.910 | 512.545982 | 666.998615 | 125.720382 | 410.715619 |
| 2024-08-01 00:03:00+00:00 | 113.0 | 0.0 | 114.0 | 421.0 | 25.2 | 70.0 | 981.0 | 54.390 | 54.367 | 103.049 | 1320.650 | 768.968 | 515.077560 | 668.227890 | 125.991932 | 410.715619 |
| 2024-08-01 00:04:00+00:00 | 116.0 | 0.0 | 117.0 | 421.0 | 25.2 | 71.0 | 981.0 | 54.226 | 54.204 | 103.261 | 1320.650 | 772.038 | 517.620641 | 669.456101 | 126.263590 | 411.548727 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-08-31 23:55:00+00:00 | 123.0 | 0.0 | 123.0 | 371.0 | 21.9 | 45.0 | 988.0 | 61.982 | 61.951 | 110.304 | 1335.838 | 627.508 | 430.077633 | 741.828764 | 81.604368 | 359.041392 |
| 2024-08-31 23:56:00+00:00 | 129.0 | 0.0 | 130.0 | 370.0 | 21.9 | 45.0 | 988.0 | 61.825 | 61.794 | 110.521 | 1335.838 | 630.737 | 432.754922 | 743.268714 | 81.808573 | 359.041392 |
| 2024-08-31 23:57:00+00:00 | 135.0 | 0.0 | 136.0 | 370.0 | 21.9 | 46.0 | 988.0 | 61.668 | 61.637 | 110.738 | 1335.838 | 633.962 | 435.429929 | 744.697256 | 82.011590 | 360.170497 |
| 2024-08-31 23:58:00+00:00 | 138.0 | 0.0 | 139.0 | 370.0 | 21.9 | 47.0 | 988.0 | 61.511 | 61.481 | 110.956 | 1335.838 | 637.181 | 438.102614 | 746.114514 | 82.213429 | 361.278756 |
| 2024-08-31 23:59:00+00:00 | 141.0 | 0.0 | 142.0 | 371.0 | 21.9 | 46.0 | 988.0 | 61.354 | 61.324 | 111.174 | 1335.838 | 640.396 | 440.772936 | 747.520613 | 82.414101 | 360.170497 |
44640 rows × 16 columns
3. CRS columns and time averaging / CRS 列与时间平均#
Add 1-minute CRS from Hugging Face before averaging. ds.average delegates to pretty_average and replaces the cache.
在平均之前加入 1 分钟 CRS。ds.average 委托 pretty_average 并替换缓存。
[4]:
print("add_crs_columns (Hugging Face) …")
add_crs_columns(ds.data(), station_code=ds.station_code)
print("average(freq='30min', alignment='center') …")
df_avg = ds.average(
"30min",
alignment="center",
match_ceiling_labels=True,
)
df_avg[["ghi", "ghi_clear", "ghi_crs"]].head()
add_crs_columns (Hugging Face) …
Fetching CRS from Hugging Face: qiq/qiq0824_crs.parquet
average(freq='30min', alignment='center') …
[4]:
| ghi | ghi_clear | ghi_crs | |
|---|---|---|---|
| 2024-08-01 00:30:00+00:00 | 181.666667 | 582.063900 | 200.3826 |
| 2024-08-01 01:00:00+00:00 | 456.650000 | 648.930252 | 420.4916 |
| 2024-08-01 01:30:00+00:00 | 356.782609 | 707.238474 | 414.9592 |
| 2024-08-01 02:00:00+00:00 | 291.666667 | 755.752768 | 392.6008 |
| 2024-08-01 02:30:00+00:00 | 295.133333 | 793.024248 | 375.5578 |
4. Visualization / 可视化#
[5]:
p = plot_calendar(
df=df_avg,
output_file=None,
station_code=ds.station_code,
meas_col="ghi",
clear_col="ghi_clear",
other_cols=["ghi_crs"],
labels=[
"Measured GHI",
"REST2 Clear-sky GHI",
"CAMS CRS Satellite GHI",
],
)
p
/Volumes/Macintosh Research/Data/bsrn-qc/src/bsrn/visualization/calendar.py:98: UserWarning: Automatically sliced input DataFrame to target month: 2024-08
[5]: