
Timeseries

pyconvexity.timeseries

High-level timeseries API for PyConvexity.

This module provides the main interface for working with timeseries data, matching the efficient patterns used in the Rust implementation.

Key Features:

- Ultra-fast binary serialization (matches the Rust implementation exactly)
- Array-based data structures for maximum performance
- Unified API for getting and setting timeseries data
- Backward compatibility with the legacy point-based format
- Efficient sampling and filtering operations

```python
get_timeseries(
    db_path: str,
    component_id: int,
    attribute_name: str,
    scenario_id: Optional[int] = None,
    start_index: Optional[int] = None,
    end_index: Optional[int] = None,
    max_points: Optional[int] = None,
) -> Timeseries
```

Get timeseries data with efficient array-based format.

This is the main function for retrieving timeseries data. It returns a Timeseries object with values as a flat array for maximum performance.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `db_path` | `str` | Path to the database file | required |
| `component_id` | `int` | Component ID | required |
| `attribute_name` | `str` | Name of the attribute (e.g., `'p'`, `'p_set'`, `'marginal_cost'`) | required |
| `scenario_id` | `Optional[int]` | Scenario ID (uses master scenario if `None`) | `None` |
| `start_index` | `Optional[int]` | Start index for range queries (optional) | `None` |
| `end_index` | `Optional[int]` | End index for range queries (optional) | `None` |
| `max_points` | `Optional[int]` | Maximum number of points for sampling (optional) | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Timeseries` | Timeseries object with efficient array-based data |

Example

```python
>>> ts = get_timeseries("model.db", component_id=123, attribute_name="p")
>>> print(f"Length: {ts.length}, Values: {ts.values[:5]}")
Length: 8760, Values: [100.5, 95.2, 87.3, 92.1, 88.7]

>>> # Get a subset of the data
>>> ts_subset = get_timeseries("model.db", 123, "p", start_index=100, end_index=200)
>>> print(f"Subset length: {ts_subset.length}")
Subset length: 100

>>> # Sample large datasets
>>> ts_sampled = get_timeseries("model.db", 123, "p", max_points=1000)
>>> print(f"Sampled from {ts.length} to {ts_sampled.length} points")
```
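The effect of `max_points` can be pictured as uniform downsampling across the full index range. The sketch below is illustrative only and uses plain Python; the library's actual sampling strategy is not documented here and may differ:

```python
def sample_uniform(values, max_points):
    """Downsample a sequence to at most max_points values at a uniform stride.

    Illustrative only: the library's real max_points behavior may differ.
    """
    n = len(values)
    if max_points is None or n <= max_points:
        return list(values)
    if max_points == 1:
        return [values[0]]
    # Pick max_points indices spread evenly across the full range,
    # always keeping the first and last values.
    step = (n - 1) / (max_points - 1)
    return [values[round(i * step)] for i in range(max_points)]

hourly = list(range(8760))            # stand-in for a year of hourly data
sampled = sample_uniform(hourly, 1000)
print(len(sampled))                    # 1000
print(sampled[0], sampled[-1])         # 0 8759
```

This preserves the endpoints and the overall shape of the series, which is usually what plotting-oriented sampling wants.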

```python
get_timeseries_metadata(
    db_path: str,
    component_id: int,
    attribute_name: str,
    scenario_id: Optional[int] = None,
) -> TimeseriesMetadata
```

Get timeseries metadata without loading the full data.

This is useful for checking the size and properties of a timeseries before deciding whether to load the full data.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `db_path` | `str` | Path to the database file | required |
| `component_id` | `int` | Component ID | required |
| `attribute_name` | `str` | Name of the attribute | required |
| `scenario_id` | `Optional[int]` | Scenario ID (uses master scenario if `None`) | `None` |

Returns:

| Type | Description |
| --- | --- |
| `TimeseriesMetadata` | TimeseriesMetadata with length and type information |

Example

```python
>>> meta = get_timeseries_metadata("model.db", 123, "p")
>>> print(f"Length: {meta.length}, Type: {meta.data_type}, Unit: {meta.unit}")
Length: 8760, Type: float, Unit: MW
```

```python
set_timeseries(
    db_path: str,
    component_id: int,
    attribute_name: str,
    values: Union[List[float], np.ndarray, Timeseries],
    scenario_id: Optional[int] = None,
) -> None
```

Set timeseries data using efficient array-based format.

This is the main function for storing timeseries data. It accepts various input formats and stores them efficiently in the database.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `db_path` | `str` | Path to the database file | required |
| `component_id` | `int` | Component ID | required |
| `attribute_name` | `str` | Name of the attribute | required |
| `values` | `Union[List[float], ndarray, Timeseries]` | Timeseries values as list, numpy array, or Timeseries object | required |
| `scenario_id` | `Optional[int]` | Scenario ID (uses master scenario if `None`) | `None` |
Example

```python
>>> # Set from a list
>>> values = [100.5, 95.2, 87.3, 92.1, 88.7]
>>> set_timeseries("model.db", 123, "p_set", values)

>>> # Set from numpy array
>>> import numpy as np
>>> values = np.random.normal(100, 10, 8760)  # Hourly data for a year
>>> set_timeseries("model.db", 123, "p_max_pu", values)

>>> # Set from existing Timeseries object
>>> ts = get_timeseries("model.db", 456, "p")
>>> set_timeseries("model.db", 123, "p_set", ts)
```
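All three accepted input forms reduce to the same flat sequence of floats. The helper below is a caller-side sketch of that normalization, not the library's internal code; it assumes only that a `Timeseries` object exposes a `values` attribute, as the examples on this page show:

```python
import numpy as np

def coerce_values(values):
    """Illustrative helper: flatten list / ndarray / Timeseries-like input
    into a list of floats, mirroring the input types set_timeseries accepts.
    """
    if hasattr(values, "values"):          # Timeseries-like object
        values = values.values
    # Accept lists and numpy arrays alike; ravel guards against 2-D input.
    arr = np.asarray(values, dtype=np.float64).ravel()
    return arr.tolist()

print(coerce_values([100.5, 95.2]))               # [100.5, 95.2]
print(coerce_values(np.array([1.0, 2.0, 3.0])))   # [1.0, 2.0, 3.0]
```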

```python
get_multiple_timeseries(
    db_path: str,
    requests: List[dict],
    max_points: Optional[int] = None,
) -> List[Timeseries]
```

Get multiple timeseries efficiently in a single database connection.

This is more efficient than calling get_timeseries multiple times when you need to load many timeseries from the same database.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `db_path` | `str` | Path to the database file | required |
| `requests` | `List[dict]` | List of dicts with keys: `component_id`, `attribute_name`, `scenario_id` (optional) | required |
| `max_points` | `Optional[int]` | Maximum number of points for sampling (applied to all) | `None` |

Returns:

| Type | Description |
| --- | --- |
| `List[Timeseries]` | List of Timeseries objects in the same order as requests |

Example

```python
>>> requests = [
...     {"component_id": 123, "attribute_name": "p"},
...     {"component_id": 124, "attribute_name": "p"},
...     {"component_id": 125, "attribute_name": "p", "scenario_id": 2},
... ]
>>> timeseries_list = get_multiple_timeseries("model.db", requests)
>>> print(f"Loaded {len(timeseries_list)} timeseries")
```
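For many components, the `requests` list is easiest to build programmatically. A small self-contained sketch (the component IDs here are made up for illustration):

```python
component_ids = [123, 124, 125]

# One request dict per component, all for the same attribute.
requests = [
    {"component_id": cid, "attribute_name": "p"}
    for cid in component_ids
]

# scenario_id is optional and can be set per request.
requests[-1]["scenario_id"] = 2

print(len(requests))   # 3
print(requests[0])     # {'component_id': 123, 'attribute_name': 'p'}
```

Because the results come back in request order, zipping `component_ids` with the returned list recovers the mapping from component to timeseries.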

```python
timeseries_to_numpy(timeseries: Timeseries) -> np.ndarray
```

Convert Timeseries to numpy array for scientific computing.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `timeseries` | `Timeseries` | Timeseries object | required |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | numpy array with `float32` dtype for memory efficiency |

Example

```python
>>> ts = get_timeseries("model.db", 123, "p")
>>> arr = timeseries_to_numpy(ts)
>>> print(f"Mean: {arr.mean():.2f}, Std: {arr.std():.2f}")
```
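The `float32` dtype halves memory relative to numpy's default `float64`. A quick self-contained illustration with plain numpy (no database needed):

```python
import numpy as np

n = 8760  # one year of hourly values

# Same data, two dtypes: 8 bytes vs 4 bytes per element.
f64 = np.zeros(n, dtype=np.float64)
f32 = np.zeros(n, dtype=np.float32)

print(f64.nbytes)  # 70080
print(f32.nbytes)  # 35040
```

For large models with thousands of hourly series, that factor of two adds up; the trade-off is roughly 7 significant decimal digits of precision instead of 15-16.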

```python
numpy_to_timeseries(
    array: np.ndarray,
    data_type: str = 'float',
    unit: Optional[str] = None,
    is_input: bool = True,
) -> Timeseries
```

Convert numpy array to Timeseries object.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `array` | `ndarray` | numpy array of values | required |
| `data_type` | `str` | Data type string (default: `"float"`) | `'float'` |
| `unit` | `Optional[str]` | Unit string (optional) | `None` |
| `is_input` | `bool` | Whether this is input data (default: `True`) | `True` |

Returns:

| Type | Description |
| --- | --- |
| `Timeseries` | Timeseries object |

Example

```python
>>> import numpy as np
>>> arr = np.random.normal(100, 10, 8760)
>>> ts = numpy_to_timeseries(arr, unit="MW")
>>> print(f"Created timeseries with {ts.length} points")
```

```python
validate_timeseries_alignment(
    db_path: str,
    values: Union[List[float], np.ndarray, Timeseries],
) -> dict
```

Validate that timeseries data aligns with network time periods.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `db_path` | `str` | Path to the database file | required |
| `values` | `Union[List[float], ndarray, Timeseries]` | Timeseries values to validate | required |

Returns:

| Type | Description |
| --- | --- |
| `dict` | Dictionary with validation results |

Example

```python
>>> values = [100.0] * 8760  # Hourly data for a year
>>> result = validate_timeseries_alignment("model.db", values)
>>> if result["is_valid"]:
...     print("Timeseries is properly aligned")
... else:
...     print(f"Alignment issues: {result['issues']}")
```
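The `is_valid`/`issues` keys in the example suggest that the core check is the series length against the network's number of time periods. A minimal local sketch under that assumption (`check_alignment` and `n_periods` are made up for illustration; the library's validation may check more than length):

```python
def check_alignment(values, n_periods):
    """Illustrative validator: flag a length mismatch between the values and
    the network's number of time periods. Mirrors the is_valid/issues keys
    shown in the example above; the real check may do more.
    """
    issues = []
    if len(values) != n_periods:
        issues.append(f"expected {n_periods} values, got {len(values)}")
    return {"is_valid": not issues, "issues": issues}

print(check_alignment([100.0] * 8760, 8760))             # {'is_valid': True, 'issues': []}
print(check_alignment([100.0] * 24, 8760)["is_valid"])   # False
```

Running this kind of check before `set_timeseries` avoids storing data that silently misaligns with the model's snapshots.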