Commit e9c839f

Merge remote-tracking branch 'rael/wecc' into hydro_scenario

2 parents 64cb54b + cbb6276

File tree

12 files changed: +820 −132 lines

docs/Performance.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
# Performance

Memory use and solve time are two important factors that we try to keep to a minimum in our models. There are several
things one can do to improve performance.

## Solving methods

By far the biggest factor that impacts performance is the solve method used by Gurobi. The fastest method is barrier
solve without crossover (use `--recommended-fast`); however, this method often returns a suboptimal solution. The next
fastest is barrier solve followed by crossover and simplex (use `--recommended`), which almost always works. In some
cases barrier solve encounters numerical issues (see [`Numerical Issues.md`](./Numerical%20Issues.md)), in which case
the slower simplex method must be used (`--solver-options-string method=1`).

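For reference, the three approaches described above correspond to the following invocations (flags as documented in this repo):

```shell
# Fastest: barrier solve without crossover (may return a suboptimal solution)
switch solve --recommended-fast

# Recommended: barrier solve, then crossover and simplex
switch solve --recommended

# Fallback when barrier hits numerical issues: simplex only
switch solve --solver-options-string method=1
```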
## Solver interface

Solver interfaces are how Pyomo communicates with Gurobi (or any other solver).

There are two solver interfaces that you should know about: `gurobi` and `gurobi_direct`.

- When using `gurobi`, Pyomo writes the entire model to a temporary text file and then starts a *separate Gurobi
  process* that reads the file, solves the model, and writes the results to another temporary text file. Once Gurobi
  finishes writing the results, Pyomo reads the results file, loads the results back into the Python program, and then
  runs post_solve (e.g. generating csv files, creating graphs, etc.). Note that these temporary text files are stored
  in `/tmp`, but if you use `--recommended-debug`, Pyomo and Gurobi will instead use a `temp` folder in your model
  directory.

- `gurobi_direct` uses Gurobi's Python library to create and solve the model directly in Python without the use of
  intermediate text files.

In theory `gurobi_direct` should be faster and more efficient; in practice, however, we find that's not the case. As
such we recommend using `gurobi`, and all our defaults do so. If someone has the time, they could profile
`gurobi_direct` to improve its performance, at which point we could make `gurobi_direct` the default (and enable
`--save-warm-start` by default, see below).

The `gurobi` interface has the added advantage of separating Gurobi and Pyomo into separate processes. This means that
while Gurobi is solving and Pyomo is idle, the operating system can automatically move Pyomo's memory
to [virtual memory](https://serverfault.com/questions/48486/what-is-swap-memory), which frees up more memory for
Gurobi.

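The file-based round trip used by the `gurobi` interface can be sketched as follows. This is purely illustrative (not Switch's or Pyomo's actual code): the "solver" step is faked by writing a canned result, and the file names and formats are stand-ins.

```python
import os
import tempfile

# Stand-in for an LP-format model that Pyomo would write out.
model_text = "min: x + y;"

tmp_dir = tempfile.mkdtemp()  # like /tmp, or ./temp with --recommended-debug
model_path = os.path.join(tmp_dir, "model.lp")
results_path = os.path.join(tmp_dir, "results.sol")

# Step 1: write the model to a temporary text file.
with open(model_path, "w") as f:
    f.write(model_text)

# Step 2: a separate solver process would read the model file and write
# its results to another file (faked here with a canned solution).
with open(model_path) as f:
    _ = f.read()
with open(results_path, "w") as f:
    f.write("x 0\ny 0\n")

# Step 3: read the results file and load the values back into Python.
with open(results_path) as f:
    solution = dict(line.split() for line in f)

print(solution)  # {'x': '0', 'y': '0'}
```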
## Warm starting

Warm starting is the act of using the solution from a previous, similar model to start the solver closer to your
expected solution. Theoretically this can help performance, but in practice there are several limitations. In this
section, *previous solution* refers to the results of an already solved model that you are using to warm start the
solver, and *current solution* refers to the solution you are trying to find while using the warm start feature.

- To warm start a model, use `switch solve --warm-start <path_to_previous_solution>`.

- Warm starting only works if the previous solution does not break any constraints of the current model. This usually
  requires that a) the model has the exact same set of variables and b) the previous solution was "harder" (e.g. it
  had more constraints to satisfy).

- Warm starting always uses the slower simplex method. Unless you expect the previous and current solutions to be very
  similar, it may be faster to solve without warm starting using the barrier method.

- If your previous solution didn't use crossover (e.g. you used `--recommended-fast`), warm starting will be even
  slower, since the solver first needs to run crossover before warm starting.

- Our implementation of warm starting only works if your previous solution has an `outputs/warm_start.pickle` file.
  This file is only generated when you use `--save-warm-start`.

- `--save-warm-start` and `--warm-start` both use an extension of the `gurobi_direct` solver interface, which is
  generally slower than the `gurobi` interface (see section above).

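Conceptually, the warm-start round trip looks like the sketch below. This is a hypothetical illustration: the solution is modeled as a plain dict of variable values, and the actual contents and layout of Switch's `warm_start.pickle` may differ.

```python
import os
import pickle
import tempfile

# Hypothetical previous solution: variable names mapped to solved values.
previous_solution = {"BuildGen[g1,2030]": 120.0, "BuildGen[g2,2030]": 45.5}

path = os.path.join(tempfile.gettempdir(), "warm_start_demo.pickle")

# What --save-warm-start conceptually does after solving.
with open(path, "wb") as f:
    pickle.dump(previous_solution, f)

# What --warm-start conceptually does before solving: load the previous
# values so they can be handed to the solver as variable start values.
with open(path, "rb") as f:
    start_values = pickle.load(f)

print(len(start_values), "start values loaded")
```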
## Tools for improving performance

- [Memory profiler](https://pypi.org/project/memory-profiler/) generates plots of memory use over time.
  Run `mprof run --interval 60 --multiprocess switch solve ...` and, once solving is done,
  run `mprof plot -o profile.png` to make the plot.

- [Fil Profiler](https://pypi.org/project/filprofiler/) is an amazing tool for seeing which parts of the code are
  using memory at the moment of peak memory usage.

- `switch_model.utilities.StepTimer` measures how long certain code blocks take to run. See examples throughout the
  code.
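The timing pattern can be sketched with a minimal analog of `StepTimer` (illustrative only; the real class's API may differ):

```python
import time

class StepTimer:
    """Minimal analog of switch_model.utilities.StepTimer."""

    def __init__(self):
        self.start = time.monotonic()

    def step_time(self):
        """Return seconds elapsed since construction or the last call."""
        now = time.monotonic()
        elapsed = now - self.start
        self.start = now
        return elapsed

timer = StepTimer()
total = sum(i * i for i in range(100_000))  # some work to time
elapsed = timer.step_time()
print(f"Computed {total} in {elapsed:.4f} seconds")
```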

switch_model/balancing/load_zones.py

Lines changed: 42 additions & 1 deletion
@@ -288,7 +288,7 @@ def get_component_per_year(m, z, p, component):
     title="Energy balance duals per period",
     note="Note: Outliers and zero-valued duals are ignored.",
 )
-def graph(tools):
+def graph_energy_balance(tools):
     load_balance = tools.get_dataframe("load_balance.csv")
     load_balance = tools.transform.timestamp(load_balance)
     load_balance["energy_balance_duals"] = (
@@ -309,3 +309,44 @@ def graph(tools):
         ylabel="Energy balance duals (cents/kWh)",
         showfliers=False,
     )
+
+
+@graph("daily_demand", title="Total daily demand", supports_multi_scenario=True)
+def demand(tools):
+    df = tools.get_dataframe("loads.csv", from_inputs=True, drop_scenario_info=False)
+    df = df.groupby(["TIMEPOINT", "scenario_name"], as_index=False).sum()
+    df = tools.transform.timestamp(df, key_col="TIMEPOINT", use_timepoint=True)
+    df = df.groupby(
+        ["season", "hour", "scenario_name", "time_row"], as_index=False
+    ).mean()
+    df["zone_demand_mw"] /= 1e3
+    pn = tools.pn
+
+    plot = (
+        pn.ggplot(df)
+        + pn.geom_line(pn.aes(x="hour", y="zone_demand_mw", color="scenario_name"))
+        + pn.facet_grid("time_row ~ season")
+        + pn.labs(x="Hour (PST)", y="Demand (GW)", color="Scenario")
+    )
+    tools.save_figure(plot.draw())
+
+
+@graph("demand", title="Total demand", supports_multi_scenario=True)
+def yearly_demand(tools):
+    df = tools.get_dataframe("loads.csv", from_inputs=True, drop_scenario_info=False)
+    df = df.groupby(["TIMEPOINT", "scenario_name"], as_index=False).sum()
+    df = tools.transform.timestamp(df, key_col="TIMEPOINT", use_timepoint=True)
+    df["zone_demand_mw"] *= df["tp_duration"] / 1e3
+    df["day"] = df["datetime"].dt.day_of_year
+    df = df.groupby(["day", "scenario_name", "time_row"], as_index=False)[
+        "zone_demand_mw"
+    ].sum()
+    pn = tools.pn
+
+    plot = (
+        pn.ggplot(df)
+        + pn.geom_line(pn.aes(x="day", y="zone_demand_mw", color="scenario_name"))
+        + pn.facet_grid("time_row ~ .")
+        + pn.labs(x="Day of Year", y="Demand (GW)", color="Scenario")
+    )
+    tools.save_figure(plot.draw())
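The core pattern in the graphing functions above (group, average, convert units, then plot a long-format frame) can be sketched with plain pandas on toy data (hypothetical numbers; `pandas` assumed available):

```python
import pandas as pd

# Toy load data: two observations per (season, hour) cell, in MW.
df = pd.DataFrame(
    {
        "season": ["Winter", "Winter", "Summer", "Summer"],
        "hour": [0, 0, 0, 0],
        "zone_demand_mw": [4000.0, 6000.0, 3000.0, 5000.0],
    }
)

# Average demand per (season, hour) cell, then convert MW -> GW,
# mirroring the groupby/mean and unit conversion in demand() above.
profile = df.groupby(["season", "hour"], as_index=False).mean()
profile["zone_demand_mw"] /= 1e3

print(profile)
```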

switch_model/generators/core/build.py

Lines changed: 3 additions & 1 deletion
@@ -762,14 +762,16 @@ def graph_capacity(tools):
         color=tools.get_colors(len(capacity_df.index)),
     )
 
+    tools.bar_label()
+
 
 @graph(
     "buildout_gen_per_period",
     title="Built Capacity per Period",
     supports_multi_scenario=True,
 )
 def graph_buildout(tools):
-    build_gen = tools.get_dataframe("BuildGen.csv")
+    build_gen = tools.get_dataframe("BuildGen.csv", dtype={"GEN_BLD_YRS_1": str})
     build_gen = build_gen.rename(
         {
             "GEN_BLD_YRS_1": "GENERATION_PROJECT",

switch_model/generators/core/dispatch.py

Lines changed: 159 additions & 1 deletion
@@ -695,7 +695,6 @@ def graph_hourly_curtailment(tools):
 @graph(
     "total_dispatch",
     title="Total dispatched electricity",
-    is_long=True,
 )
 def graph_total_dispatch(tools):
     # ---------------------------------- #
@@ -741,6 +740,165 @@ def graph_total_dispatch(tools):
         ylabel="Total dispatched electricity (TWh)",
     )
 
+    tools.bar_label()
+
+
+@graph(
+    "energy_balance",
+    title="Energy Balance For Every Month",
+    supports_multi_scenario=True,
+    is_long=True,
+)
+def energy_balance(tools):
+    # Get dispatch dataframe
+    cols = [
+        "timestamp",
+        "gen_tech",
+        "gen_energy_source",
+        "DispatchGen_MW",
+        "scenario_name",
+        "scenario_index",
+        "Curtailment_MW",
+    ]
+    df = tools.get_dataframe("dispatch.csv", drop_scenario_info=False)[cols]
+    df = tools.transform.gen_type(df)
+
+    # Rename and add needed columns
+    df["Dispatch Limit"] = df["DispatchGen_MW"] + df["Curtailment_MW"]
+    df = df.drop("Curtailment_MW", axis=1)
+    df = df.rename({"DispatchGen_MW": "Dispatch"}, axis=1)
+    # Sum dispatch across all the projects of the same type and timepoint
+    key_columns = ["timestamp", "gen_type", "scenario_name", "scenario_index"]
+    df = df.groupby(key_columns, as_index=False).sum()
+    df = df.melt(
+        id_vars=key_columns, value_vars=["Dispatch", "Dispatch Limit"], var_name="Type"
+    )
+    df = df.rename({"gen_type": "Source"}, axis=1)
+
+    discharge = (
+        df[(df["Source"] == "Storage") & (df["Type"] == "Dispatch")]
+        .drop(["Source", "Type"], axis=1)
+        .rename({"value": "discharge"}, axis=1)
+    )
+
+    # Get load dataframe
+    load = tools.get_dataframe("load_balance.csv", drop_scenario_info=False)
+    load = load.drop("normalized_energy_balance_duals_dollar_per_mwh", axis=1)
+
+    # Sum load across all the load zones
+    key_columns = ["timestamp", "scenario_name", "scenario_index"]
+    load = load.groupby(key_columns, as_index=False).sum()
+
+    # Subtract storage dispatch from generation and add it to the storage charge to get net flow
+    load = load.merge(discharge, how="left", on=key_columns, validate="one_to_one")
+    load["ZoneTotalCentralDispatch"] -= load["discharge"]
+    load["StorageNetCharge"] += load["discharge"]
+    load = load.drop("discharge", axis=1)
+
+    # Rename and convert from wide to long format
+    load = load.rename(
+        {
+            "ZoneTotalCentralDispatch": "Total Generation (excl. storage discharge)",
+            "TXPowerNet": "Transmission Losses",
+            "StorageNetCharge": "Storage Net Flow",
+            "zone_demand_mw": "Demand",
+        },
+        axis=1,
+    ).sort_index(axis=1)
+    load = load.melt(id_vars=key_columns, var_name="Source")
+    load["Type"] = "Dispatch"
+
+    # Merge dispatch contributions with load contributions
+    df = pd.concat([load, df])
+
+    # Add the timestamp information and make period a string to ensure it doesn't mess up the graphing
+    df = tools.transform.timestamp(df).astype({"period": str})
+
+    # Convert to TWh (incl. multiply by timepoint duration)
+    df["value"] *= df["tp_duration"] / 1e6
+
+    FREQUENCY = "1W"
+
+    def groupby_time(df):
+        return df.groupby(
+            [
+                "scenario_name",
+                "period",
+                "Source",
+                "Type",
+                tools.pd.Grouper(key="datetime", freq=FREQUENCY, origin="start"),
+            ]
+        )["value"]
+
+    df = groupby_time(df).sum().reset_index()
+
+    # Get the state of charge data
+    soc = tools.get_dataframe(
+        "StateOfCharge.csv", dtype={"STORAGE_GEN_TPS_1": str}, drop_scenario_info=False
+    )
+    soc = soc.rename(
+        {"STORAGE_GEN_TPS_2": "timepoint", "StateOfCharge": "value"}, axis=1
+    )
+    # Sum over all the projects that are in the same scenario with the same timepoint
+    soc = soc.groupby(["timepoint", "scenario_name"], as_index=False).sum()
+    soc["Source"] = "State Of Charge"
+    soc["value"] /= 1e6  # Convert to TWh
+
+    # Group by time
+    soc = tools.transform.timestamp(
+        soc, use_timepoint=True, key_col="timepoint"
+    ).astype({"period": str})
+    soc["Type"] = "Dispatch"
+    soc = groupby_time(soc).mean().reset_index()
+
+    # Add state of charge to the dataframe
+    df = pd.concat([df, soc])
+    # Add a column for day since that's what we really care about
+    df["day"] = df["datetime"].dt.dayofyear
+
+    # Get the colors for the lines
+    colors = tools.get_colors()
+    colors.update(
+        {
+            "Transmission Losses": "brown",
+            "Storage Net Flow": "cadetblue",
+            "Demand": "black",
+            "Total Generation (excl. storage discharge)": "black",
+            "State Of Charge": "green",
+        }
+    )
+
+    # Plot
+    num_periods = df["period"].nunique()
+    pn = tools.pn
+    plot = (
+        pn.ggplot(df)
+        + pn.geom_line(pn.aes(x="day", y="value", color="Source", linetype="Type"))
+        + pn.facet_grid("period ~ scenario_name")
+        + pn.labs(y="Contribution to Energy Balance (TWh)")
+        + pn.scales.scale_color_manual(
+            values=colors, aesthetics="color", na_value=colors["Other"]
+        )
+        + pn.scales.scale_x_continuous(
+            name="Month",
+            labels=["J", "F", "M", "A", "M", "J", "J", "A", "S", "O", "N", "D"],
+            breaks=(15, 46, 76, 106, 137, 167, 198, 228, 259, 289, 319, 350),
+            limits=(0, 366),
+        )
+        + pn.scales.scale_linetype_manual(
+            values={"Dispatch Limit": "dotted", "Dispatch": "solid"}
+        )
+        + pn.theme(
+            figure_size=(
+                pn.options.figure_size[0] * tools.num_scenarios,
+                pn.options.figure_size[1] * num_periods,
+            )
+        )
+    )
+
+    tools.save_figure(plot.draw())
 
 
 @graph(
     "curtailment_per_period",
