# Timeseries

## `pyconvexity.timeseries`
High-level timeseries API for PyConvexity.

This module provides the main interface for working with timeseries data, matching the efficient patterns used in the Rust implementation.

Key features:

- Ultra-fast binary serialization (matches the Rust format exactly)
- Array-based data structures for maximum performance
- Unified API for getting and setting timeseries data
- Backward compatibility with the legacy point-based format
- Efficient sampling and filtering operations
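The sampling mentioned above can be pictured as evenly spaced index selection over the full range. The `downsample` helper below is a hypothetical illustration of that idea only, not the library's actual algorithm:

```python
import numpy as np

def downsample(values: np.ndarray, max_points: int) -> np.ndarray:
    """Pick at most max_points evenly spaced samples from values."""
    if len(values) <= max_points:
        return values
    # Evenly spaced indices from the first to the last element
    idx = np.linspace(0, len(values) - 1, max_points).astype(int)
    return values[idx]

hourly = np.arange(8760, dtype=np.float32)  # one year of hourly values
sampled = downsample(hourly, 1000)
print(len(sampled))  # 1000
```

Stride-based selection like this preserves the first and last points and the overall shape of the series, which is usually what matters for plotting large datasets.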
### `get_timeseries`

```python
get_timeseries(
    db_path: str,
    component_id: int,
    attribute_name: str,
    scenario_id: Optional[int] = None,
    start_index: Optional[int] = None,
    end_index: Optional[int] = None,
    max_points: Optional[int] = None,
) -> Timeseries
```
Get timeseries data in an efficient array-based format.
This is the main function for retrieving timeseries data. It returns a Timeseries object with values as a flat array for maximum performance.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `db_path` | `str` | Path to the database file | *required* |
| `component_id` | `int` | Component ID | *required* |
| `attribute_name` | `str` | Name of the attribute (e.g., `'p'`, `'p_set'`, `'marginal_cost'`) | *required* |
| `scenario_id` | `Optional[int]` | Scenario ID (uses the master scenario if `None`) | `None` |
| `start_index` | `Optional[int]` | Start index for range queries | `None` |
| `end_index` | `Optional[int]` | End index for range queries | `None` |
| `max_points` | `Optional[int]` | Maximum number of points for sampling | `None` |

Returns:

| Type | Description |
|---|---|
| `Timeseries` | Timeseries object with efficient array-based data |
Example:

```python
ts = get_timeseries("model.db", component_id=123, attribute_name="p")
print(f"Length: {ts.length}, Values: {ts.values[:5]}")
# Length: 8760, Values: [100.5, 95.2, 87.3, 92.1, 88.7]

# Get a subset of the data
ts_subset = get_timeseries("model.db", 123, "p", start_index=100, end_index=200)
print(f"Subset length: {ts_subset.length}")
# Subset length: 100

# Sample large datasets
ts_sampled = get_timeseries("model.db", 123, "p", max_points=1000)
print(f"Sampled from {ts.length} to {ts_sampled.length} points")
```
### `get_timeseries_metadata`

```python
get_timeseries_metadata(
    db_path: str,
    component_id: int,
    attribute_name: str,
    scenario_id: Optional[int] = None,
) -> TimeseriesMetadata
```
Get timeseries metadata without loading the full data.
This is useful for checking the size and properties of a timeseries before deciding whether to load the full data.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `db_path` | `str` | Path to the database file | *required* |
| `component_id` | `int` | Component ID | *required* |
| `attribute_name` | `str` | Name of the attribute | *required* |
| `scenario_id` | `Optional[int]` | Scenario ID (uses the master scenario if `None`) | `None` |

Returns:

| Type | Description |
|---|---|
| `TimeseriesMetadata` | `TimeseriesMetadata` with length and type information |
Example:

```python
meta = get_timeseries_metadata("model.db", 123, "p")
print(f"Length: {meta.length}, Type: {meta.data_type}, Unit: {meta.unit}")
# Length: 8760, Type: float, Unit: MW
```
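One way to act on the metadata is to decide between a full and a sampled load before fetching anything. `choose_max_points` below is a hypothetical helper, and the 10,000-point cap is an arbitrary assumption for illustration:

```python
def choose_max_points(length: int, cap: int = 10_000):
    """Return a max_points value for get_timeseries, or None to load in full."""
    return None if length <= cap else cap

# Usage sketch (assumes a "model.db" with component 123):
# meta = get_timeseries_metadata("model.db", 123, "p")
# ts = get_timeseries("model.db", 123, "p",
#                     max_points=choose_max_points(meta.length))

print(choose_max_points(8760))     # None: small enough to load in full
print(choose_max_points(500_000))  # 10000: sample down before loading
```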
### `set_timeseries`

```python
set_timeseries(
    db_path: str,
    component_id: int,
    attribute_name: str,
    values: Union[List[float], np.ndarray, Timeseries],
    scenario_id: Optional[int] = None,
) -> None
```
Set timeseries data using an efficient array-based format.
This is the main function for storing timeseries data. It accepts various input formats and stores them efficiently in the database.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `db_path` | `str` | Path to the database file | *required* |
| `component_id` | `int` | Component ID | *required* |
| `attribute_name` | `str` | Name of the attribute | *required* |
| `values` | `Union[List[float], ndarray, Timeseries]` | Timeseries values as a list, numpy array, or `Timeseries` object | *required* |
| `scenario_id` | `Optional[int]` | Scenario ID (uses the master scenario if `None`) | `None` |
Example:

```python
# Set from a list
values = [100.5, 95.2, 87.3, 92.1, 88.7]
set_timeseries("model.db", 123, "p_set", values)

# Set from a numpy array
import numpy as np
values = np.random.normal(100, 10, 8760)  # Hourly data for a year
set_timeseries("model.db", 123, "p_max_pu", values)

# Set from an existing Timeseries object
ts = get_timeseries("model.db", 456, "p")
set_timeseries("model.db", 123, "p_set", ts)
```
### `get_multiple_timeseries`

```python
get_multiple_timeseries(
    db_path: str,
    requests: List[dict],
    max_points: Optional[int] = None,
) -> List[Timeseries]
```
Get multiple timeseries efficiently in a single database connection.
This is more efficient than calling get_timeseries multiple times when you need to load many timeseries from the same database.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `db_path` | `str` | Path to the database file | *required* |
| `requests` | `List[dict]` | List of dicts with keys `component_id`, `attribute_name`, and optionally `scenario_id` | *required* |
| `max_points` | `Optional[int]` | Maximum number of points for sampling (applied to all requests) | `None` |

Returns:

| Type | Description |
|---|---|
| `List[Timeseries]` | List of `Timeseries` objects in the same order as `requests` |
Example:

```python
requests = [
    {"component_id": 123, "attribute_name": "p"},
    {"component_id": 124, "attribute_name": "p"},
    {"component_id": 125, "attribute_name": "p", "scenario_id": 2},
]
timeseries_list = get_multiple_timeseries("model.db", requests)
print(f"Loaded {len(timeseries_list)} timeseries")
```
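A common next step after a batch load is to stack the results into one matrix for analysis; with real data each element would first go through `timeseries_to_numpy`. The sketch below substitutes plain arrays for the loaded values so it runs standalone:

```python
import numpy as np

# Stand-ins for timeseries_to_numpy(ts) applied to each batch result;
# all series must share the same length to stack into one matrix.
series = [
    np.array([100.5, 95.2, 87.3], dtype=np.float32),
    np.array([101.0, 94.8, 88.0], dtype=np.float32),
    np.array([99.7, 96.1, 86.9], dtype=np.float32),
]

matrix = np.stack(series)   # shape: (n_series, n_points)
print(matrix.shape)         # (3, 3)
print(matrix.mean(axis=0))  # per-timestep mean across components
```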
### `timeseries_to_numpy`

```python
timeseries_to_numpy(timeseries: Timeseries) -> np.ndarray
```
Convert a Timeseries to a numpy array for scientific computing.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `timeseries` | `Timeseries` | Timeseries object | *required* |

Returns:

| Type | Description |
|---|---|
| `ndarray` | numpy array with `float32` dtype for memory efficiency |
Example:

```python
ts = get_timeseries("model.db", 123, "p")
arr = timeseries_to_numpy(ts)
print(f"Mean: {arr.mean():.2f}, Std: {arr.std():.2f}")
```
### `numpy_to_timeseries`

```python
numpy_to_timeseries(
    array: np.ndarray,
    data_type: str = 'float',
    unit: Optional[str] = None,
    is_input: bool = True,
) -> Timeseries
```
Convert a numpy array to a Timeseries object.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `array` | `ndarray` | numpy array of values | *required* |
| `data_type` | `str` | Data type string | `'float'` |
| `unit` | `Optional[str]` | Unit string | `None` |
| `is_input` | `bool` | Whether this is input data | `True` |

Returns:

| Type | Description |
|---|---|
| `Timeseries` | Timeseries object |
Example:

```python
import numpy as np
arr = np.random.normal(100, 10, 8760)
ts = numpy_to_timeseries(arr, unit="MW")
print(f"Created timeseries with {ts.length} points")
```
### `validate_timeseries_alignment`

```python
validate_timeseries_alignment(
    db_path: str,
    values: Union[List[float], np.ndarray, Timeseries],
) -> dict
```
Validate that timeseries data aligns with network time periods.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `db_path` | `str` | Path to the database file | *required* |
| `values` | `Union[List[float], ndarray, Timeseries]` | Timeseries values to validate | *required* |

Returns:

| Type | Description |
|---|---|
| `dict` | Dictionary with validation results |
Example:

```python
values = [100.0] * 8760  # Hourly data for a year
result = validate_timeseries_alignment("model.db", values)
if result["is_valid"]:
    print("Timeseries is properly aligned")
else:
    print(f"Alignment issues: {result['issues']}")
```
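The validation can be pictured as a length comparison against the network's time periods. `check_alignment` below is a hypothetical re-implementation whose `is_valid` and `issues` keys mirror the result dictionary shown above; the exact checks the library performs may differ:

```python
def check_alignment(values, n_snapshots: int) -> dict:
    """Hypothetical length-based alignment check against n_snapshots periods."""
    issues = []
    if len(values) != n_snapshots:
        issues.append(
            f"expected {n_snapshots} values (one per time period), got {len(values)}"
        )
    return {"is_valid": not issues, "issues": issues}

print(check_alignment([100.0] * 8760, 8760))
# {'is_valid': True, 'issues': []}
print(check_alignment([100.0] * 24, 8760)["is_valid"])
# False
```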