Writing a station-to-archive .dat file

WRMC-style station-to-archive files are assembled from logical records: metadata records (LR0001, …) and minute-data blocks (LR0100, LR4000). In Python, bsrn.archive mirrors rBSRN: build one object per record, call ``get_bsrn_format()`` on each, and join the resulting strings.
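
To make the build, format, and join pattern concrete without depending on the library, here is a minimal sketch. `FakeRecord` and `join_blocks` are hypothetical stand-ins, not part of bsrn.archive; only the method name get_bsrn_format() and the join idiom are taken from the cells on this page.

```python
from dataclasses import dataclass


@dataclass
class FakeRecord:
    """Hypothetical stand-in for a logical record (not a bsrn.archive class)."""
    header: str
    body: str

    def get_bsrn_format(self) -> str:
        # Each record renders itself as a header line followed by its body.
        return f"{self.header}\n{self.body}\n"


def join_blocks(blocks) -> str:
    """Join rendered records: one newline between blocks, one at the end."""
    return "\n".join(b.rstrip("\n") for b in blocks) + "\n"


records = [
    FakeRecord("*C0001", " 82  7 2020  1"),
    FakeRecord("*U0003", "Example commentary (tutorial placeholder)."),
]
text = join_blocks(r.get_bsrn_format() for r in records)
print(text, end="")
```

Stripping trailing newlines before joining means a block that ends with zero, one, or several newlines still produces exactly one line break between records.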

The first code cell builds the metadata records. The second cell follows the rBSRN pattern for LR0100: set yearMonth, then fill selected columns with rep(1, n)-style placeholders. The full-month string is large, so we print only a short preview and write no temporary file.
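
The preview step can be sketched as a small helper that keeps the first and last few lines of a long string. The name `preview` and its parameters are assumptions for illustration; the second cell on this page does the same thing inline with list slicing.

```python
def preview(text: str, head: int = 16, tail: int = 4) -> str:
    """Return the first `head` and last `tail` lines of `text`, with "..." between."""
    lines = text.splitlines()
    if len(lines) <= head + tail:
        return "\n".join(lines)  # short input: nothing to elide
    # Slice from an explicit start index so that tail=0 yields no lines.
    return "\n".join(lines[:head] + ["..."] + lines[len(lines) - tail:])


sample = "\n".join(f"line {i}" for i in range(100))
print(preview(sample, head=2, tail=2))
```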

1. Metadata and instrument LRs

The pattern is the same as in rBSRN: create each record with LR0001$new(...), LR0002$new(...), …, then call get_bsrn_format() on each and concatenate the results.

[1]:
import bsrn.archive as archive

# LR0001 … LR0008 (same idea as rBSRN LR0001$new(...), LR0002$new(...), …)
lr0001 = archive.LR0001(stationNumber=82, month=7, year=2020, version=1)
lr0002 = archive.LR0002(
    scientistName="John Doe",
    scientistTel="+1 800 123 4567",
    scientistFax="+1 800 123 4567",
    scientistMail="john.doe@example.com",
    scientistAddress="123 Main Street, Anytown, USA 12345",
    deputyName="Jane Doe",
    deputyTel="+1 800 123 4567",
    deputyFax="+1 800 123 4567",
    deputyMail="jane.doe@example.com",
    deputyAddress="123 Main Street, Anytown, USA 12345",
)
lr0003 = archive.LR0003(message="Example commentary (tutorial placeholder).")

lr0004 = archive.LR0004(
    surfaceType=11,
    topographyType=1,
    address="123 Main Street, Anytown, USA 12345",
    telephone="+1 800 123 4567",
    latitude=69.099 + 90.0,  # offset by 90 degrees; printed as 159.099 in the LR0004 output
    longitude=235.484,
    altitude=116,
    azimuth="0,15,30",
    elevation="0,0,0",
)
lr0007 = archive.LR0007()

lr0008 = archive.LR0008(
    manufacturer="Kipp & Zonen",
    model="CMP 22",
    serialNumber="140114",
    identification=82001,
    radiationQuantityMeasured=2,
    pyrgeometerDome=2,
    location="PMOD-WRC",
    person="N. Mingard",
    startOfCalibPeriod1="05/16/17",
    endOfCalibPeriod1="06/30/17",
    numOfComp1=1,
    meanCalibCoeff1=8.8000,
    stdErrorCalibCoeff1=0.0600,
    remarksOnCalib1="uV/W.m2",
)

blocks = (
    lr0001.get_bsrn_format(),
    lr0002.get_bsrn_format(),
    lr0003.get_bsrn_format(),
    lr0004.get_bsrn_format(),
    lr0007.get_bsrn_format(synop=lr0004.synop),
    lr0008.get_bsrn_format(printLr=True),
    lr0008.get_bsrn_format(printLr=True, LR0009Format=True),
)
text = "\n".join(b.rstrip("\n") for b in blocks) + "\n"
print(f"{len(text):,} characters (metadata only)")
print(text)
2,093 characters (metadata only)
*C0001
 82  7 2020  1
         2         3         4         5        21        22        23        -1
*U0002
 -1 -1 -1
John Doe                               +1 800 123 4567      +1 800 123 4567
XXX             john.doe@example.com
123 Main Street, Anytown, USA 12345
 -1 -1 -1
Jane Doe                               +1 800 123 4567      +1 800 123 4567
XXX             jane.doe@example.com
123 Main Street, Anytown, USA 12345
*U0003
Example commentary (tutorial placeholder).
*U0004
 -1 -1 -1
 11  1
123 Main Street, Anytown, USA 12345
+1 800 123 4567      XXX
XXX             XXX
 159.099 235.484  116 XXXXX
 -1 -1 -1
   0  0  15  0  30  0  -1 -1  -1 -1  -1 -1  -1 -1  -1 -1  -1 -1  -1 -1  -1 -1
*U0007
 -1 -1 -1
XXX
XXX
XXX
XXX
XXX
N N N N N N
*U0008
 -1 -1 -1 N
Kipp & Zonen                   CMP 22          140114             XXX      82001
XXX
 -1  2  -1.000  -1.000  -1.000  -1.000  -1.000  -1.000 -1 -1
PMOD-WRC                       N. Mingard
05/16/17 06/30/17  1       8.8000       0.0600
XXX      XXX      -1      -1.0000      -1.0000
XXX      XXX      -1      -1.0000      -1.0000
uV/W.m2
XXX
*U0009
 -1 -1 -1         2 82001 -1
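
The cell above passes `latitude=69.099 + 90.0`, and the LR0004 output shows 159.099, so latitude is stored as a non-negative value counted from the South Pole. The sketch below wraps that offset in a helper. The longitude helper assumes the analogous convention (0 to 360, counted eastward from 180° W), which is our reading of the station-to-archive format and is not demonstrated on this page. Both function names are hypothetical.

```python
def to_archive_latitude(lat_deg: float) -> float:
    """Geographic latitude (-90..90, north positive) -> 0..180 (0 = South Pole)."""
    if not -90.0 <= lat_deg <= 90.0:
        raise ValueError(f"latitude out of range: {lat_deg}")
    return round(lat_deg + 90.0, 3)


def to_archive_longitude(lon_deg: float) -> float:
    """Geographic longitude (-180..180, east positive) -> 0..360 (0 = 180 W).

    Assumption: this offset is not shown in the tutorial cell.
    """
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError(f"longitude out of range: {lon_deg}")
    return round(lon_deg + 180.0, 3)


print(to_archive_latitude(69.099))
print(to_archive_longitude(55.484))
```

Under that assumption, the value `longitude=235.484` used in the cell would correspond to 55.484° E.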

2. LR0100 / LR4000 (minute data) and full file string

Run §1 first. In rBSRN, LR0100 is built by creating the object, setting yearMonth, and assigning columns with rep(1, 44640); the Python equivalent here is pd.Series([1] * n) for ghi_avg and dhi_avg. The formatter requires every other minute column to be present, so we fill them with missing values (NaN) and keep real values only in the two demo series.
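
The length 44640 is 31 days times 1440 minutes, so it only holds for 31-day months. A minimal sketch for deriving n from the "YYYY-MM" string with the standard library follows; `minutes_in_month` and `day_and_minute` are hypothetical helper names, and the (day, minute-of-day) mapping is inferred from the leading fields of the minute lines in the printed output.

```python
import calendar


def minutes_in_month(year_month: str) -> int:
    """Number of one-minute records in a month given as "YYYY-MM"."""
    year, month = (int(part) for part in year_month.split("-"))
    days = calendar.monthrange(year, month)[1]  # (weekday of day 1, days in month)
    return days * 1440


def day_and_minute(index: int) -> tuple:
    """Map a 0-based minute index to (day of month, minute of day)."""
    day, minute = divmod(index, 1440)
    return day + 1, minute


n = minutes_in_month("2020-07")
print(n)                      # 44640
print(day_and_minute(n - 1))  # (31, 1439)
```

Computing n this way avoids hard-coding the month length, which matters for February and for 30-day months.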

The outputs shown on this page were pre-rendered for Sphinx; run the cells locally to refresh them.

[2]:
# Run §1 first. In production you might write the final string to disk, e.g.:
#   filename = f"{stn}{MM}{YY2}.dat"
#   out_path = os.path.join(input_dir, filename)
#   with open(out_path, "w", encoding="ascii", newline="\n") as fh:
#       fh.write(text)
# or concatenate all LR strings and use a small helper like cat_to_file(out_path, *archive_blocks).
# Here we only print a preview of the full-month example to the screen (no tempfile).

import pandas as pd

import bsrn.archive as archive
from bsrn.archive.specs import LR_SPECS

# LR0100 — same spirit as rBSRN:
#   lr0100 <- LR0100$new(); lr0100$yearMonth <- "2020-07"
#   lr0100$global2_avg <- rep(1, 44640); lr0100$diffuse_avg <- rep(1, 44640)
# Python: set year_month and n, then fill columns (global2_avg -> ghi_avg, diffuse_avg -> dhi_avg).
# Other LR0100 minute columns must be filled for formatting: we use NaN except the two demo columns.
year_month = "2020-07"
n = 44640  # 31 * 1440 for July; same as rep(1, 44640) in R

def missing_col():
    return pd.Series([float('nan')] * n)

lr0100_kw = {"yearMonth": year_month}
for name in LR_SPECS["LR0100"]:
    if name != "yearMonth":
        lr0100_kw[name] = missing_col()
lr0100_kw["ghi_avg"] = pd.Series([1] * n)   # placeholder; replace with your GHI series
lr0100_kw["dhi_avg"] = pd.Series([1] * n)   # placeholder; replace with your diffuse series
lr0100 = archive.LR0100(**lr0100_kw)

# LR4000: in R, e.g. lr4000$bodyT_down <- rep(1, 44640); here every column is missing unless you add your own
lr4000_kw = {"yearMonth": year_month}
for name in LR_SPECS["LR4000"]:
    if name != "yearMonth":
        lr4000_kw[name] = missing_col()
lr4000 = archive.LR4000(**lr4000_kw)

lr4000const = archive.LR4000CONST(
    serialNumber_Manufacturer="050783",
    serialNumber_WRMC="61008",
    yyyymmdd=20211026,
    manufact="KZ",
    model="CH1",
    C=9.62,
    k0="ND",
    k1=0.02,
    k2=0.9974,
    k3="ND",
    f="ND",
)
const_line = lr4000const.get_bsrn_format(method=2)

blocks = (
    lr0001.get_bsrn_format(),
    lr0002.get_bsrn_format(),
    lr0003.get_bsrn_format(const_line),
    lr0004.get_bsrn_format(),
    lr0007.get_bsrn_format(synop=lr0004.synop),
    lr0008.get_bsrn_format(printLr=True),
    lr0008.get_bsrn_format(printLr=True, LR0009Format=True),
    lr0100.get_bsrn_format(changed=True),
    lr4000.get_bsrn_format(changed=True),
)
text = "\n".join(b.rstrip("\n") for b in blocks) + "\n"
print(f"{len(text):,} characters (full month; preview only below)")
lines = text.splitlines()
print("--- first lines ---")
print("\n".join(lines[:16]))
print("...")
print("--- last lines ---")
print("\n".join(lines[-4:]))
9,376,601 characters (full month; preview only below)
--- first lines ---
*C0001
 82  7 2020  1
         2         3         4         5        21        22        23        -1
*U0002
 -1 -1 -1
John Doe                               +1 800 123 4567      +1 800 123 4567
XXX             john.doe@example.com
123 Main Street, Anytown, USA 12345
 -1 -1 -1
Jane Doe                               +1 800 123 4567      +1 800 123 4567
XXX             jane.doe@example.com
123 Main Street, Anytown, USA 12345
*U0003
Example commentary (tutorial placeholder).
@LR4000CONST,  50783, 61008, CAL_20211026_KZ_CH1_50783_61008, 9.62, ND, 0.02,&
0.9974, ND, ND
...
--- last lines ---
 31 1436 -99.99 -99.99 -99.99 -99.99 -999.9  -99.99 -99.99 -99.99 -99.99 -999.9
 31 1437 -99.99 -99.99 -99.99 -99.99 -999.9  -99.99 -99.99 -99.99 -99.99 -999.9
 31 1438 -99.99 -99.99 -99.99 -99.99 -999.9  -99.99 -99.99 -99.99 -99.99 -999.9
 31 1439 -99.99 -99.99 -99.99 -99.99 -999.9  -99.99 -99.99 -99.99 -99.99 -999.9
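
The comments at the top of the cell mention writing the final string to disk. A minimal sketch of that step follows, using a temporary directory so nothing is left behind. `write_archive` is a hypothetical helper, and the station abbreviation "abc" is a placeholder; the filename pattern and the `encoding`/`newline` arguments come from the comments in the cell.

```python
import tempfile
from pathlib import Path


def write_archive(text: str, out_dir: Path, stn: str, month: int, year: int) -> Path:
    """Write `text` as ASCII with LF line endings to <stn><MM><YY>.dat."""
    filename = f"{stn}{month:02d}{year % 100:02d}.dat"
    out_path = out_dir / filename
    # newline="\n" disables translation to the platform's line separator;
    # encoding="ascii" raises UnicodeEncodeError on any non-ASCII character.
    with open(out_path, "w", encoding="ascii", newline="\n") as fh:
        fh.write(text)
    return out_path


demo_text = "*C0001\n 82  7 2020  1\n"  # stand-in for the full-month string
with tempfile.TemporaryDirectory() as tmp:
    path = write_archive(demo_text, Path(tmp), stn="abc", month=7, year=2020)
    print(path.name)  # abc0720.dat
    # Round-trip check: the bytes on disk match the string exactly.
    assert path.read_bytes() == demo_text.encode("ascii")
```

Using a `with` block closes the file even if the write fails, which the one-line `open(...).write(text)` form does not guarantee.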