Commit e9c839f

Merge remote-tracking branch 'rael/wecc' into hydro_scenario

2 parents 64cb54b + cbb6276

File tree

12 files changed: +820 −132 lines

docs/Performance.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
# Performance

Memory use and solve time are two important factors that we try to keep to a minimum in our models. There are several
things one can do to improve performance.

## Solving methods

By far the biggest factor that impacts performance is the solve method used by Gurobi. The fastest method is barrier
solve without crossover (use `--recommended-fast`); however, this method often returns a suboptimal solution. The next
fastest is barrier solve followed by crossover and simplex (use `--recommended`), which almost always works. In some
cases barrier solve encounters numerical issues (see [`Numerical Issues.md`](./Numerical%20Issues.md)), in which case
the slower simplex method must be used (`--solver-options-string method=1`).

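For reference, the three approaches described above correspond to the following invocations (flags as documented in this repo):

```shell
# Fastest: barrier solve without crossover (may return a suboptimal solution)
switch solve --recommended-fast

# Recommended: barrier solve, then crossover and simplex
switch solve --recommended

# Fallback when barrier hits numerical issues: simplex only
switch solve --solver-options-string method=1
```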
## Solver interface

Solver interfaces are how Pyomo communicates with Gurobi (or any other solver).

There are two solver interfaces that you should know about: `gurobi` and `gurobi_direct`.

- When using `gurobi`, Pyomo writes the entire model to a temporary text file and then starts a *separate Gurobi
  process* that reads the file, solves the model, and writes the results to another temporary text file. Once Gurobi
  finishes writing the results, Pyomo reads the results file, loads the results back into the Python program, and then
  runs post_solve (e.g. generating csv files, creating graphs, etc.). Note that these temporary text files are stored
  in `/tmp`, but if you use `--recommended-debug`, Pyomo and Gurobi will instead use a `temp` folder in your model
  directory.

- `gurobi_direct` uses Gurobi's Python library to create and solve the model directly in Python without the use of
  intermediate text files.

In theory `gurobi_direct` should be faster and more efficient; in practice, however, we find that's not the case. As
such we recommend using `gurobi`, and all our defaults do so. If someone has the time, they could profile
`gurobi_direct` to improve its performance, at which point we could make `gurobi_direct` the default (and enable
`--save-warm-start` by default, see below).

The `gurobi` interface has the added advantage of separating Gurobi and Pyomo into separate processes. This means that
while Gurobi is solving and Pyomo is idle, the operating system can automatically move Pyomo's memory
to [virtual memory](https://serverfault.com/questions/48486/what-is-swap-memory), which frees up more memory for
Gurobi.

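The file-based round trip used by the `gurobi` interface can be sketched as follows. This is purely illustrative (not Switch's or Pyomo's actual code): the "solver" step is faked by writing a canned result, and the file names and formats are stand-ins.

```python
import os
import tempfile

# Stand-in for an LP-format model that Pyomo would write out.
model_text = "min: x + y;"

tmp_dir = tempfile.mkdtemp()  # like /tmp, or ./temp with --recommended-debug
model_path = os.path.join(tmp_dir, "model.lp")
results_path = os.path.join(tmp_dir, "results.sol")

# Step 1: write the model to a temporary text file.
with open(model_path, "w") as f:
    f.write(model_text)

# Step 2: a separate solver process would read the model file and write
# its results to another file (faked here with a canned solution).
with open(model_path) as f:
    _ = f.read()
with open(results_path, "w") as f:
    f.write("x 0\ny 0\n")

# Step 3: read the results file and load the values back into Python.
with open(results_path) as f:
    solution = dict(line.split() for line in f)

print(solution)  # {'x': '0', 'y': '0'}
```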
## Warm starting

Warm starting is the act of using the solution from a previous, similar model to start the solver closer to your
expected solution. Theoretically this can help performance, but in practice there are several limitations. In this
section, *previous solution* refers to the results of an already solved model that you are using to warm start the
solver, and *current solution* refers to the solution you are trying to find while using the warm start feature.

- To warm start a model, use `switch solve --warm-start <path_to_previous_solution>`.

- Warm starting only works if the previous solution does not break any constraints of the current model. This usually
  requires that a) the model has the exact same set of variables and b) the previous solution was "harder" (e.g. it
  had more constraints to satisfy).

- Warm starting always uses the slower simplex method. Unless you expect the previous and current solutions to be very
  similar, it may be faster to solve without warm starting using the barrier method.

- If your previous solution didn't use crossover (e.g. you used `--recommended-fast`), warm starting will be even
  slower, since the solver first needs to run crossover before warm starting.

- Our implementation of warm starting only works if your previous solution has an `outputs/warm_start.pickle` file.
  This file is only generated when you use `--save-warm-start`.

- `--save-warm-start` and `--warm-start` both use an extension of the `gurobi_direct` solver interface, which is
  generally slower than the `gurobi` interface (see section above).

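Conceptually, the warm-start round trip looks like the sketch below. This is a hypothetical illustration: the solution is modeled as a plain dict of variable values, and the actual contents and layout of Switch's `warm_start.pickle` may differ.

```python
import os
import pickle
import tempfile

# Hypothetical previous solution: variable names mapped to solved values.
previous_solution = {"BuildGen[g1,2030]": 120.0, "BuildGen[g2,2030]": 45.5}

path = os.path.join(tempfile.gettempdir(), "warm_start_demo.pickle")

# What --save-warm-start conceptually does after solving.
with open(path, "wb") as f:
    pickle.dump(previous_solution, f)

# What --warm-start conceptually does before solving: load the previous
# values so they can be handed to the solver as variable start values.
with open(path, "rb") as f:
    start_values = pickle.load(f)

print(len(start_values), "start values loaded")
```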
## Tools for improving performance

- [Memory profiler](https://pypi.org/project/memory-profiler/) generates plots of memory use over time.
  Run `mprof run --interval 60 --multiprocess switch solve ...` and, once solving is done,
  run `mprof plot -o profile.png` to make the plot.

- [Fil Profiler](https://pypi.org/project/filprofiler/) is an amazing tool for seeing which parts of the code are
  using memory at the moment of peak memory usage.

- `switch_model.utilities.StepTimer` measures how long certain code blocks take to run. See examples throughout the
  code.
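The timing pattern can be sketched with a minimal analog of `StepTimer` (illustrative only; the real class's API may differ):

```python
import time

class StepTimer:
    """Minimal analog of switch_model.utilities.StepTimer."""

    def __init__(self):
        self.start = time.monotonic()

    def step_time(self):
        """Return seconds elapsed since construction or the last call."""
        now = time.monotonic()
        elapsed = now - self.start
        self.start = now
        return elapsed

timer = StepTimer()
total = sum(i * i for i in range(100_000))  # some work to time
elapsed = timer.step_time()
print(f"Computed {total} in {elapsed:.4f} seconds")
```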

switch_model/balancing/load_zones.py

Lines changed: 42 additions & 1 deletion
@@ -288,7 +288,7 @@ def get_component_per_year(m, z, p, component):
     title="Energy balance duals per period",
     note="Note: Outliers and zero-valued duals are ignored.",
 )
-def graph(tools):
+def graph_energy_balance(tools):
     load_balance = tools.get_dataframe("load_balance.csv")
     load_balance = tools.transform.timestamp(load_balance)
     load_balance["energy_balance_duals"] = (
@@ -309,3 +309,44 @@ def graph(tools):
         ylabel="Energy balance duals (cents/kWh)",
         showfliers=False,
     )
+
+
+@graph("daily_demand", title="Total daily demand", supports_multi_scenario=True)
+def demand(tools):
+    df = tools.get_dataframe("loads.csv", from_inputs=True, drop_scenario_info=False)
+    df = df.groupby(["TIMEPOINT", "scenario_name"], as_index=False).sum()
+    df = tools.transform.timestamp(df, key_col="TIMEPOINT", use_timepoint=True)
+    df = df.groupby(
+        ["season", "hour", "scenario_name", "time_row"], as_index=False
+    ).mean()
+    df["zone_demand_mw"] /= 1e3
+    pn = tools.pn
+
+    plot = (
+        pn.ggplot(df)
+        + pn.geom_line(pn.aes(x="hour", y="zone_demand_mw", color="scenario_name"))
+        + pn.facet_grid("time_row ~ season")
+        + pn.labs(x="Hour (PST)", y="Demand (GW)", color="Scenario")
+    )
+    tools.save_figure(plot.draw())
+
+
+@graph("demand", title="Total demand", supports_multi_scenario=True)
+def yearly_demand(tools):
+    df = tools.get_dataframe("loads.csv", from_inputs=True, drop_scenario_info=False)
+    df = df.groupby(["TIMEPOINT", "scenario_name"], as_index=False).sum()
+    df = tools.transform.timestamp(df, key_col="TIMEPOINT", use_timepoint=True)
+    df["zone_demand_mw"] *= df["tp_duration"] / 1e3
+    df["day"] = df["datetime"].dt.day_of_year
+    df = df.groupby(["day", "scenario_name", "time_row"], as_index=False)[
+        "zone_demand_mw"
+    ].sum()
+    pn = tools.pn
+
+    plot = (
+        pn.ggplot(df)
+        + pn.geom_line(pn.aes(x="day", y="zone_demand_mw", color="scenario_name"))
+        + pn.facet_grid("time_row ~ .")
+        + pn.labs(x="Day of Year", y="Demand (GW)", color="Scenario")
+    )
+    tools.save_figure(plot.draw())
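The core pattern in the graphing functions above (group, average, convert units, then plot a long-format frame) can be sketched with plain pandas on toy data (hypothetical numbers; `pandas` assumed available):

```python
import pandas as pd

# Toy load data: two observations per (season, hour) cell, in MW.
df = pd.DataFrame(
    {
        "season": ["Winter", "Winter", "Summer", "Summer"],
        "hour": [0, 0, 0, 0],
        "zone_demand_mw": [4000.0, 6000.0, 3000.0, 5000.0],
    }
)

# Average demand per (season, hour) cell, then convert MW -> GW,
# mirroring the groupby/mean and unit conversion in demand() above.
profile = df.groupby(["season", "hour"], as_index=False).mean()
profile["zone_demand_mw"] /= 1e3

print(profile)
```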

switch_model/generators/core/build.py

Lines changed: 3 additions & 1 deletion
@@ -762,14 +762,16 @@ def graph_capacity(tools):
         color=tools.get_colors(len(capacity_df.index)),
     )
 
+    tools.bar_label()
+
 
 @graph(
     "buildout_gen_per_period",
     title="Built Capacity per Period",
     supports_multi_scenario=True,
 )
 def graph_buildout(tools):
-    build_gen = tools.get_dataframe("BuildGen.csv")
+    build_gen = tools.get_dataframe("BuildGen.csv", dtype={"GEN_BLD_YRS_1": str})
     build_gen = build_gen.rename(
         {
             "GEN_BLD_YRS_1": "GENERATION_PROJECT",

switch_model/generators/core/dispatch.py

Lines changed: 159 additions & 1 deletion
@@ -695,7 +695,6 @@ def graph_hourly_curtailment(tools):
 @graph(
     "total_dispatch",
     title="Total dispatched electricity",
-    is_long=True,
 )
 def graph_total_dispatch(tools):
     # ---------------------------------- #
@@ -741,6 +740,165 @@ def graph_total_dispatch(tools):
         ylabel="Total dispatched electricity (TWh)",
     )
 
+    tools.bar_label()
+
+
+@graph(
+    "energy_balance",
+    title="Energy Balance For Every Month",
+    supports_multi_scenario=True,
+    is_long=True,
+)
+def energy_balance(tools):
+    # Get dispatch dataframe
+    cols = [
+        "timestamp",
+        "gen_tech",
+        "gen_energy_source",
+        "DispatchGen_MW",
+        "scenario_name",
+        "scenario_index",
+        "Curtailment_MW",
+    ]
+    df = tools.get_dataframe("dispatch.csv", drop_scenario_info=False)[cols]
+    df = tools.transform.gen_type(df)
+
+    # Rename and add needed columns
+    df["Dispatch Limit"] = df["DispatchGen_MW"] + df["Curtailment_MW"]
+    df = df.drop("Curtailment_MW", axis=1)
+    df = df.rename({"DispatchGen_MW": "Dispatch"}, axis=1)
+    # Sum dispatch across all the projects of the same type and timepoint
+    key_columns = ["timestamp", "gen_type", "scenario_name", "scenario_index"]
+    df = df.groupby(key_columns, as_index=False).sum()
+    df = df.melt(
+        id_vars=key_columns, value_vars=["Dispatch", "Dispatch Limit"], var_name="Type"
+    )
+    df = df.rename({"gen_type": "Source"}, axis=1)
+
+    discharge = (
+        df[(df["Source"] == "Storage") & (df["Type"] == "Dispatch")]
+        .drop(["Source", "Type"], axis=1)
+        .rename({"value": "discharge"}, axis=1)
+    )
+
+    # Get load dataframe
+    load = tools.get_dataframe("load_balance.csv", drop_scenario_info=False)
+    load = load.drop("normalized_energy_balance_duals_dollar_per_mwh", axis=1)
+
+    # Sum load across all the load zones
+    key_columns = ["timestamp", "scenario_name", "scenario_index"]
+    load = load.groupby(key_columns, as_index=False).sum()
+
+    # Subtract storage dispatch from generation and add it to the storage charge to get net flow
+    load = load.merge(discharge, how="left", on=key_columns, validate="one_to_one")
+    load["ZoneTotalCentralDispatch"] -= load["discharge"]
+    load["StorageNetCharge"] += load["discharge"]
+    load = load.drop("discharge", axis=1)
+
+    # Rename and convert from wide to long format
+    load = load.rename(
+        {
+            "ZoneTotalCentralDispatch": "Total Generation (excl. storage discharge)",
+            "TXPowerNet": "Transmission Losses",
+            "StorageNetCharge": "Storage Net Flow",
+            "zone_demand_mw": "Demand",
+        },
+        axis=1,
+    ).sort_index(axis=1)
+    load = load.melt(id_vars=key_columns, var_name="Source")
+    load["Type"] = "Dispatch"
+
+    # Merge dispatch contributions with load contributions
+    df = pd.concat([load, df])
+
+    # Add the timestamp information and make period a string to ensure it doesn't mess up the graphing
+    df = tools.transform.timestamp(df).astype({"period": str})
+
+    # Convert to TWh (incl. multiply by timepoint duration)
+    df["value"] *= df["tp_duration"] / 1e6
+
+    FREQUENCY = "1W"
+
+    def groupby_time(df):
+        return df.groupby(
+            [
+                "scenario_name",
+                "period",
+                "Source",
+                "Type",
+                tools.pd.Grouper(key="datetime", freq=FREQUENCY, origin="start"),
+            ]
+        )["value"]
+
+    df = groupby_time(df).sum().reset_index()
+
+    # Get the state of charge data
+    soc = tools.get_dataframe(
+        "StateOfCharge.csv", dtype={"STORAGE_GEN_TPS_1": str}, drop_scenario_info=False
+    )
+    soc = soc.rename(
+        {"STORAGE_GEN_TPS_2": "timepoint", "StateOfCharge": "value"}, axis=1
+    )
+    # Sum over all the projects that are in the same scenario with the same timepoint
+    soc = soc.groupby(["timepoint", "scenario_name"], as_index=False).sum()
+    soc["Source"] = "State Of Charge"
+    soc["value"] /= 1e6  # Convert to TWh
+
+    # Group by time
+    soc = tools.transform.timestamp(
+        soc, use_timepoint=True, key_col="timepoint"
+    ).astype({"period": str})
+    soc["Type"] = "Dispatch"
+    soc = groupby_time(soc).mean().reset_index()
+
+    # Add state of charge to the dataframe
+    df = pd.concat([df, soc])
+    # Add a column for day since that's what we really care about
+    df["day"] = df["datetime"].dt.dayofyear
+
+    # Get the colors for the lines
+    colors = tools.get_colors()
+    colors.update(
+        {
+            "Transmission Losses": "brown",
+            "Storage Net Flow": "cadetblue",
+            "Demand": "black",
+            "Total Generation (excl. storage discharge)": "black",
+            "State Of Charge": "green",
+        }
+    )
+
+    # Plot
+    num_periods = df["period"].nunique()
+    pn = tools.pn
+    plot = (
+        pn.ggplot(df)
+        + pn.geom_line(pn.aes(x="day", y="value", color="Source", linetype="Type"))
+        + pn.facet_grid("period ~ scenario_name")
+        + pn.labs(y="Contribution to Energy Balance (TWh)")
+        + pn.scales.scale_color_manual(
+            values=colors, aesthetics="color", na_value=colors["Other"]
+        )
+        + pn.scales.scale_x_continuous(
+            name="Month",
+            labels=["J", "F", "M", "A", "M", "J", "J", "A", "S", "O", "N", "D"],
+            breaks=(15, 46, 76, 106, 137, 167, 198, 228, 259, 289, 319, 350),
+            limits=(0, 366),
+        )
+        + pn.scales.scale_linetype_manual(
+            values={"Dispatch Limit": "dotted", "Dispatch": "solid"}
+        )
+        + pn.theme(
+            figure_size=(
+                pn.options.figure_size[0] * tools.num_scenarios,
+                pn.options.figure_size[1] * num_periods,
+            )
+        )
+    )
+
+    tools.save_figure(plot.draw())
 
 
 @graph(
     "curtailment_per_period",
