RidePy Tutorial 4: Parallelization
This tutorial covers the SimulationSet class from the extras package. Its purpose is to orchestrate multiple simulation runs with different parameters. Parameter scans can be configured, and the simulations can be run in parallel, stored to disk, and analyzed automatically.
Configuration
To get started, we first need to create a directory in which to store the data:
from pathlib import Path
tmp_path = Path("./simulations_tmp").resolve()
tmp_path.mkdir(exist_ok=True)
Now we can proceed with setting up the SimulationSet.
We configure base_params, which apply to all simulations, to simulate 100 requests (n_reqs=100) and to use the BruteForceTotalTravelTimeMinimizingDispatcher.
product_params takes iterables of parameter values, all combinations of which are simulated (Cartesian product). In this case we vary the number of vehicles to be either 10 or 100, and the seat capacity of the vehicles to be either 2 or 8. We will thus have the following four combinations:
10 vehicles at capacity 2
100 vehicles at capacity 2
10 vehicles at capacity 8
100 vehicles at capacity 8
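The four combinations listed above are what a Cartesian product of the two value lists yields. The following is a minimal standard-library sketch of that idea. It is not taken from the internals of SimulationSet, and the order in which RidePy enumerates the combinations may differ.

```python
from itertools import product

# Values taken from the product_params used in this tutorial
product_params = {
    "n_vehicles": [10, 100],
    "seat_capacity": [2, 8],
}

# Build one dict per combination of parameter values
keys = list(product_params)
combinations = [
    dict(zip(keys, values)) for values in product(*product_params.values())
]

for combination in combinations:
    print(combination)
```

Adding a third parameter with two values would double the number of combinations, so the size of a scan grows multiplicatively with each varied parameter.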
After executing the upcoming cell, those four combinations will be configured and can be executed automatically and at once.
from ridepy.extras.simulation_set import SimulationSet
from ridepy.util.dispatchers_cython import (
    BruteForceTotalTravelTimeMinimizingDispatcher as CyBruteForceTotalTravelTimeMinimizingDispatcher,
)
simulation_set = SimulationSet(
    base_params={
        "general": {"n_reqs": 100},
        "dispatcher": {
            "dispatcher_cls": CyBruteForceTotalTravelTimeMinimizingDispatcher,
        },
    },
    product_params={
        "general": {
            "n_vehicles": [10, 100],
            "seat_capacity": [2, 8],
        },
    },
    data_dir=tmp_path,
)
Taking the length of the simulation set confirms the four combinations configured:
len(simulation_set)
4
Running the simulations
To execute the simulations, we call the SimulationSet.run method:
%time simulation_set.run()
CPU times: user 12.9 ms, sys: 39.4 ms, total: 52.4 ms
Wall time: 1.76 s
This concludes the simulations.
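To illustrate the general idea of fanning independent parameter sets out to a pool of workers, here is a minimal standard-library sketch. It is an assumption-laden illustration and not how SimulationSet.run is implemented: run_one is a hypothetical stand-in for a single simulation run, and a thread pool is used only to keep the example short and portable.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_one(params):
    """Hypothetical stand-in for one simulation run; returns a small summary."""
    n_vehicles, seat_capacity = params
    return {
        "n_vehicles": n_vehicles,
        "seat_capacity": seat_capacity,
        "total_seats": n_vehicles * seat_capacity,
    }


# The same four combinations as configured in this tutorial
parameter_sets = list(product([10, 100], [2, 8]))

# Each parameter set is independent of the others, so the runs can be
# handed to a pool of workers; map() returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_one, parameter_sets))

for result in results:
    print(result)
```

Because the runs share no state, the same pattern carries over to a process pool for CPU-bound work.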
Running analytics
To additionally run the analytics code on the resulting events from all four simulation runs, we call the SimulationSet.run_analytics method:
simulation_set.run_analytics(only_stops_and_requests=True)
Inspecting the results
The simulation runs have created several files in the output directory. For each of the four parameter sets, running the simulations and analytics creates four files:
<simulation_id>_params.json, which contains the parameter set/configuration of the simulation run in JSON format
<simulation_id>_.jsonl, which contains the events created by the simulation in JSON Lines format
<simulation_id>_stops.pq, which contains the stops dataframe created by the analytics module in Parquet format
<simulation_id>_requests.pq, which contains the requests dataframe created by the analytics module in Parquet format
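To check which files actually exist for each run, the output directory can be listed and grouped by simulation id. The helper below is a hypothetical sketch using only pathlib. The name files_by_simulation is not part of RidePy, and the sketch assumes only that every file name starts with its simulation id.

```python
from pathlib import Path


def files_by_simulation(data_dir, simulation_ids):
    """Map each simulation id to the sorted names of the files starting with it."""
    data_dir = Path(data_dir)
    return {
        sim_id: sorted(path.name for path in data_dir.glob(f"{sim_id}*"))
        for sim_id in simulation_ids
    }
```

In this tutorial it could be called as files_by_simulation(tmp_path, simulation_set.simulation_ids), which should yield four file names per id once both the simulations and the analytics have run.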
The simulation ids can be retrieved from the simulation set object:
simulation_set.simulation_ids
['2bd4ee2599434968f3b60bf644a9735f644728012e3e8418dbb16f60',
'dc8de5e59e303fe67bb48f169b7de5abb42fef40b64d92061a8d3a2c',
'936e17dbb6559dde941cb5a21b5fd22d44872adf2827bef2ab8c392c',
'30465cf282c11d5345f061547abe77a32c9511b8228a44f286d0f9dd']
We conclude this tutorial with a brief look at each of the four files.
Parameter configuration file
Using the read_params_json function, we can easily retrieve the configuration for the first simulation:
from ridepy.extras.io import read_params_json
params = read_params_json(simulation_set.param_paths[0])
params
{'dispatcher': {'dispatcher_cls': ridepy.util.dispatchers_cython.dispatchers.BruteForceTotalTravelTimeMinimizingDispatcher},
'general': {'fleet_state_cls': ridepy.fleet_state.SlowSimpleFleetState,
'initial_location': (0, 0),
'initial_locations': None,
'n_reqs': 100,
'n_vehicles': 10,
'seat_capacity': 2,
'space': Euclidean2D(velocity=1.0),
't_cutoff': None,
'transportation_request_cls': ridepy.data_structures_cython.data_structures.TransportationRequest,
'vehicle_state_cls': ridepy.vehicle_state_cython.vehicle_state.VehicleState},
'request_generator': {'rate': 1,
'request_generator_cls': ridepy.util.request_generators.RandomRequestGenerator}}
Events file
Similarly, using read_events_json, we can load the events output by the same simulation:
from ridepy.extras.io import read_events_json, read_params_json
events = read_events_json(simulation_set.event_paths[0])
events[200:203]
[{'event_type': 'RequestSubmissionEvent',
'request_id': 48,
'timestamp': 42.09283479680292,
'origin': [0.2498064478821005, 0.9232655992760128],
'destination': [0.44313074505345695, 0.8613491047618306],
'pickup_timewindow_min': 42.09283479680292,
'pickup_timewindow_max': inf,
'delivery_timewindow_min': 42.09283479680292,
'delivery_timewindow_max': inf},
{'event_type': 'RequestAcceptanceEvent',
'timestamp': 42.09283479680292,
'request_id': 48,
'origin': [0.2498064478821005, 0.9232655992760128],
'destination': [0.44313074505345695, 0.8613491047618306],
'pickup_timewindow_min': 42.09283479680292,
'pickup_timewindow_max': inf,
'delivery_timewindow_min': 42.09283479680292,
'delivery_timewindow_max': inf},
{'event_type': 'RequestSubmissionEvent',
'request_id': 49,
'timestamp': 42.29722339631083,
'origin': [0.5503253124498481, 0.05058832952488124],
'destination': [0.9992824684127266, 0.8360275850799519],
'pickup_timewindow_min': 42.29722339631083,
'pickup_timewindow_max': inf,
'delivery_timewindow_min': 42.29722339631083,
'delivery_timewindow_max': inf}]
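Since the events are plain dictionaries carrying an event_type key, they are easy to summarize with the standard library. The sketch below counts events per type. It uses a hand-written sample shaped like the three events shown above, with most fields omitted, rather than the real events list.

```python
from collections import Counter

# Hand-written sample mirroring the three events printed above
sample_events = [
    {"event_type": "RequestSubmissionEvent", "request_id": 48, "timestamp": 42.09283479680292},
    {"event_type": "RequestAcceptanceEvent", "request_id": 48, "timestamp": 42.09283479680292},
    {"event_type": "RequestSubmissionEvent", "request_id": 49, "timestamp": 42.29722339631083},
]

# Count how often each event type occurs
event_counts = Counter(event["event_type"] for event in sample_events)
print(event_counts)
```

Applying the same expression to the full events list gives a quick overview of what happened in a run.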
Stops file
First, make_file_path can be used to assemble the Parquet filename, which is then readily read by pandas:
from ridepy.extras.simulation_set import make_file_path
import pandas as pd
stops_fpath = make_file_path(simulation_set.simulation_ids[0], tmp_path, "_stops.pq")
stops = pd.read_parquet(stops_fpath)
The result looks as expected:
stops.iloc[5]
timestamp 8.638084
delta_occupancy 1.0
request_id 7
state_duration 0.132314
occupancy 1.0
location [0.8474943663474591, 0.6037260313668911]
dist_to_next 0.132314
time_to_next 0.132314
Name: (0.0, 5), dtype: object
Requests file
Similarly, for the requests:
request_fpath = make_file_path(simulation_set.simulation_ids[0], tmp_path, "_requests.pq")
requests = pd.read_parquet(request_fpath)
requests.iloc[5]
source quantity
accepted delivery_timewindow_max inf
delivery_timewindow_min 6.048299
destination [0.34025051651799104, 0.15547949981178102]
origin [0.8058192518328071, 0.6981393949882261]
pickup_timewindow_max inf
pickup_timewindow_min 6.048299
timestamp 6.048299
inferred relative_travel_time 2.372446
travel_time 1.696314
waiting_time 0.947631
serviced timestamp_dropoff 8.692244
timestamp_pickup 6.99593
vehicle_id 2.0
submitted delivery_timewindow_max inf
delivery_timewindow_min 6.048299
destination [0.34025051651799104, 0.15547949981178102]
direct_travel_distance 0.715006
direct_travel_time 0.715006
origin [0.8058192518328071, 0.6981393949882261]
pickup_timewindow_max inf
pickup_timewindow_min 6.048299
timestamp 6.048299
Name: 5, dtype: object
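The inferred quantities in this row are consistent with simple relations between the timestamps. The sketch below re-derives them from the printed values. It is an observation about this one row and not a statement of how the analytics module computes the columns.

```python
import math

# Values copied from the requests row printed above
timestamp_submitted = 6.048299
timestamp_pickup = 6.99593
timestamp_dropoff = 8.692244
direct_travel_time = 0.715006

# Time between submitting the request and being picked up
waiting_time = timestamp_pickup - timestamp_submitted
# Time spent on board the vehicle
travel_time = timestamp_dropoff - timestamp_pickup
# On-board time relative to the direct travel time
relative_travel_time = travel_time / direct_travel_time

# Compare against the printed values, allowing for display rounding
print(math.isclose(waiting_time, 0.947631, abs_tol=1e-5))
print(math.isclose(travel_time, 1.696314, abs_tol=1e-5))
print(math.isclose(relative_travel_time, 2.372446, abs_tol=1e-5))
```

For this request the on-board time is roughly 2.4 times the direct travel time, which reflects detours taken to serve other requests.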