switch-model
diff --git a/docs/Graphs.md b/docs/Graphs.md
Lines changed: 132 additions & 62 deletions
diff --git a/examples/3zone_toy/outputs/total_cost.txt b/examples/3zone_toy/outputs/total_cost.txt
Lines changed: 1 addition & 1 deletion
diff --git a/setup.py b/setup.py
Lines changed: 4 additions & 1 deletion
diff --git a/switch_model/__main__.py b/switch_model/__main__.py
Lines changed: 2 additions & 2 deletions
diff --git a/switch_model/balancing/load_zones.py b/switch_model/balancing/load_zones.py
Lines changed: 12 additions & 7 deletions
diff --git a/switch_model/financials.py b/switch_model/financials.py
Lines changed: 17 additions & 5 deletions
@@ -18,92 +18,128 @@ and `ca_policies` have already been solved).
 
 ## Adding new graphs
 
-Graphs can be defined in any module by adding the following function to the file.
-
+New graphs can be added with the `@graph(...)` annotation.
 ```python
-def graph(tools):
+from switch_model.tools.graph import graph
+
+@graph(
+  name="my_custom_graph",
+  title="An example plot",
+  note="Some optional note to add below the graph",
+  # Other options are possible; see the code documentation
+)
+def my_graphing_function(tools):
   # Your graphing code
   ...
 ```
 
-In `graph()` you can use the `tools` object to create graphs. Here are some important methods.
-
-- `tools.get_dataframe(csv=filename)` will return a pandas dataframe for the file called `filename`. You can also
-  specify `folder=tools.folders.INPUTS` to load a csv from the inputs directory.
-
-- `tools.get_new_axes(out, title, note)` will return a matplotlib axes. This should be the axes used while
-  graphing. `out` is the name of the `.png` file that will be created with this graph. `title` and `note` are optional
-  and will be the title and footnote for the graph.
-
-- `tools.pd`, `tools.sns`, `tools.np`, `tools.mplt` are references to the pandas, seaborn, numpy and matplotlib
-  libraries. This is useful if your graphing code needs to access these libraries since it doesn't require adding an
+In `my_graphing_function()` you can use the `tools` object to create graphs. Here are some important methods.
+
+- `tools.get_dataframe(filename)` will return a pandas dataframe for the file called `filename`. You can also
+  specify `from_inputs=True` to load a csv from the inputs directory.
+  
+- `tools.get_axes()` or `tools.get_figure()` will return a matplotlib axes or figure
+  that should be used while graphing. When possible, always use `get_axes` instead of `get_figure` since
+  this allows plots from different scenarios to share the same figure.
+  
+- `tools.save_figure(fig)` adds a figure to our outputs. Some libraries (e.g. plotnine)
+  always generate their own figures, in which case this function is needed.
+  When possible, use `tools.get_axes()` instead.
+
+- `tools.pd`, `tools.sns`, `tools.np`, `tools.mplt`, `tools.pn` are references to the pandas, seaborn, numpy, matplotlib
+  and plotnine libraries. This is useful if your graphing code needs these libraries, since you don't have to add an
   import to your file.
 
-- `tools.add_gen_type_column(df)` adds a column called `gen_type` to a dataframe with columns
-  `gen_tech` and `gen_energy_source`. `gen_type` is a user-friendly name for the technology (e.g. Nuclear instead of
-  Uranium). The mapping of `gen_energy_source` and `gen_tech` to `gen_type` is defined in
-  a `inputs/graph_tech_types.csv`. If this file isn't present, a default mapping will be used. You can also use other
-  mappings found in `graph_tech_types.csv` by specifying `map_name=` when calling `add_gen_type_column()`.
-
+- `tools.transform` is a reference to a `TransformTools` object that provides
+  useful helper methods for modifying a dataframe for graphing. Full documentation
+  can be found in the `TransformTools` class, but some examples include:
+  
+  - `tools.transform.build_year(df)`, which converts build years that aren't
+    a period to the string `Pre-existing`.
+    
+  - `tools.transform.gen_type(df)`, which adds a column called `gen_type` to the dataframe.
+    `gen_type` is a user-friendly name for the technology (e.g. Nuclear instead of
+    Uranium) and is determined using the mappings in `inputs/graph_tech_types.csv`.
+    
+  - `tools.transform.timestamp(df)`, which adds columns such as the hour, the timestamp in datetime format
+    in the correct timezone, etc.
+    
+  - `tools.transform.load_zone(df)`, which adds a column called `region` to the dataframe that
+    normally corresponds to the load zone state.
+    
 - `tools.get_colors()` returns a mapping of `gen_type` to its color. This is useful for graphing and can normally be
-  passed straight to `color=` in standard plotting libraries. You can also specify a different color mapping using a
-  similar process to above (`map_name=`)
+  passed straight to `color=` in standard plotting libraries. The color mapping is based on `inputs/graph_tech_colors.csv`.
 
 ## Adding a comparison graph
 
-By default, `tools.get_dataframe` will return the data for only one scenario (the one you are graphing).
-
-Sometimes, you may wish to create a graph that compares multiple scenarios. To do this create a function
-called `compare`.
+Sometimes you may want to create graphs that compare data from multiple scenarios.
+To do this, add `supports_multi_scenario=True` inside the `@graph()` decorator.
 
 ```python
-def compare(tools):
-  # Your graphing code
-  ...
+from switch_model.tools.graph import graph
+
+@graph(
+  name="my_custom_comparison_graph",
+  title="My Comparison plot",
+  supports_multi_scenario=True,
+  # Instead of supports_multi_scenario, you can use
+  # requires_multi_scenario if you want the graphing function
+  # to *only* be run when we have multiple scenarios.
+  # requires_multi_scenario=True,
+)
+def my_graphing_comparison_function(tools):
+    # Read data from all the scenarios
+    df = tools.get_dataframe("some_file.csv")
+    # Plot data
+    ...
 ```
 
-If you call `tools.get_dataframe(...)` from within `compare`, then
-`tools.get_dataframe` will return a dataframe containing the data from *all*
-the scenarios. The dataframe will contain a column called `scenario` to indicate which rows correspond to which
-scenarios. You can then use this column to create a graph comparing the different scenarios (still
-using `tools.get_new_axes`).
+Now every time you call `tools.get_dataframe(filename)`, data for *all* the scenarios
+is returned. The returned dataframe contains a column called `scenario_name`
+that indicates which rows correspond to which scenario.
+You can then use this column to create a graph comparing the different scenarios (still
+using `tools.get_axes`).
 
-At this point, when you run `switch compare`, your `compare(tools)` function will be called and your comparison graph
+At this point, when you run `switch compare`, your `my_graphing_comparison_function` function will be called and your comparison graph
 will be generated.
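
To make the role of the `scenario_name` column concrete, here is a minimal standard-library sketch of combining per-scenario tables and then grouping by period and scenario. It only mimics the idea; the real `tools.get_dataframe` returns a pandas dataframe, and the scenario names and cost values below are made-up illustrative data. `combine_scenarios` and `total_by_period_and_scenario` are hypothetical helpers, not part of `switch_model`.

```python
from collections import defaultdict


def combine_scenarios(tables):
    """Merge {scenario name: rows} into one list, tagging each row with its scenario."""
    combined = []
    for scenario_name, rows in tables.items():
        for row in rows:
            combined.append({**row, "scenario_name": scenario_name})
    return combined


def total_by_period_and_scenario(rows, value_column):
    """Sum `value_column` for every (PERIOD, scenario_name) pair."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["PERIOD"], row["scenario_name"])] += row[value_column]
    return dict(totals)


# Made-up data for two hypothetical scenarios
tables = {
    "baseline": [
        {"PERIOD": 2030, "AnnualCost_Real": 1.5},
        {"PERIOD": 2040, "AnnualCost_Real": 2.0},
    ],
    "high_storage": [
        {"PERIOD": 2030, "AnnualCost_Real": 1.0},
        {"PERIOD": 2030, "AnnualCost_Real": 0.25},
    ],
}

rows = combine_scenarios(tables)
totals = total_by_period_and_scenario(rows, "AnnualCost_Real")
```

Grouping on both keys keeps the scenarios separate, which is the same reason a comparison graph would index on `PERIOD` and `scenario_name` together rather than on `PERIOD` alone.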
 
 ## Example
 
 In this example we create a graph that shows the power capacity during each period broken down by technology.
 
 ```python
+from switch_model.tools.graph import graph
+
+@graph(
+  "capacity_per_period",
+  title="Capacity per period"
+)
 def graph(tools):
-  # Get a dataframe of gen_cap.csv
-  gen_cap = tools.get_dataframe(csv="gen_cap")
-
-  # Add a 'gen_type' column to your dataframe
-  gen_cap = tools.add_gen_type_column(gen_cap)
-
-  # Aggregate the generation capacity by gen_type and PERIOD
-  capacity_df = gen_cap.pivot_table(
-    index='PERIOD',
-    columns='gen_type',
-    values='GenCapacity',
-    aggfunc=tools.np.sum,
-    fill_value=0  # Missing values become 0
-  )
-
-  # Get a new pair of axis to plot onto
-  ax = tools.get_new_axes(out="capacity_per_period")
-
-  # Plot
-  capacity_df.plot(
-    kind='bar',
-    ax=ax, # Notice we pass in the axis
-    stacked=True,
-    ylabel="Capacity Online (MW)",
-    xlabel="Period",
-    color=tools.get_colors(len(capacity_df.index))
-  )
+    # Get a dataframe of gen_cap.csv
+    df = tools.get_dataframe("gen_cap.csv")
+
+    # Add a 'gen_type' column to your dataframe
+    df = tools.transform.gen_type(df)
+
+    # Aggregate the generation capacity by gen_type and PERIOD
+    df = df.pivot_table(
+        index='PERIOD',
+        columns='gen_type',
+        values='GenCapacity',
+        aggfunc=tools.np.sum,
+        fill_value=0  # Missing values become 0
+    )
+
+    # Plot
+    df.plot(
+        kind='bar',
+        ax=tools.get_axes(),
+        stacked=True,
+        ylabel="Capacity Online (MW)",
+        xlabel="Period",
+        color=tools.get_colors(len(df.index))
+    )
 ```
 
 Running `switch graph` would run the `graph()` function above and create 
@@ -112,3 +148,37 @@ Running `switch graph` would run the `graph()` function above and create
 Running `switch compare` would create `capacity_per_period.png` containing
 your plot side-by-side with the same plot but for the scenario you're comparing to.
 
+### Testing your graphs
+
+To test your graphs, you can run `switch graph` or `switch compare`. However,
+this takes quite some time. If you want to test just one graphing function,
+you can run `switch graph -f FIGURE` or `switch compare -f FIGURE`. This will run only
+that graphing function. Here `FIGURE` should be the name of the graph (the first
+argument in `@graph()`, so `capacity_per_period` in the example above).
+
+### Creating graphs outside of SWITCH
+
+Sometimes you may want to create graphs but don't want to permanently add
+them to the switch code. To do this, create the following Python file anywhere
+on your computer.
+
+```python
+from switch_model.tools.graph import graph
+# Imported under a different name so it doesn't shadow the @graph decorator
+from switch_model.tools.graph.cli_graph import main as graph_main
+
+@graph(
+  ...
+)
+def my_first_graph(tools):
+    ...
+
+@graph(
+  ...
+)
+def my_second_graph(tools):
+    ...
+
+if __name__ == "__main__":
+    graph_main(["--ignore-modules-txt"])
+```
+
@@ -1 +1 @@
-128823582.05
+128823582.05
@@ -76,7 +76,10 @@ def read(*rnames):
         "gurobipy",  # used to provided python bindings for Gurobi for faster solving
         "pyyaml",  # used to read configurations for switch
         "matplotlib",
-        "seaborn"
+        "seaborn",
+        "plotnine",
+        "scipy",
+        "pillow",  # Image processing to make plots stick together
     ],
     extras_require={
         # packages used for advanced demand response, progressive hedging
 
@@ -62,9 +62,9 @@ def main():
         elif cmd == "new":
             from switch_model.tools.new import main
         elif cmd == "graph":
-            from switch_model.tools.graphing.graph import main
+            from switch_model.tools.graph.cli_graph import main
         elif cmd == "compare":
-            from switch_model.tools.graphing.compare import main
+            from switch_model.tools.graph.cli_compare import main
         main()
     else:
         print(
 
@@ -7,6 +7,7 @@
 import os
 from pyomo.environ import *
 from switch_model.reporting import write_table
+from switch_model.tools.graph import graph
 
 dependencies = "switch_model.timescales"
 optional_dependencies = "switch_model.transmission.local_td"
@@ -282,9 +283,14 @@ def get_component_per_year(m, z, p, component):
     )
 
 
+@graph(
+    "energy_balance_duals",
+    title="Energy balance duals per period",
+    note="Note: Outliers and zero-valued duals are ignored.",
+)
 def graph(tools):
-    load_balance = tools.get_dataframe(csv="load_balance")
-    load_balance = tools.add_timestamp_info(load_balance)
+    load_balance = tools.get_dataframe("load_balance.csv")
+    load_balance = tools.transform.timestamp(load_balance)
     load_balance["energy_balance_duals"] = (
         tools.pd.to_numeric(
             load_balance["normalized_energy_balance_duals_dollar_per_mwh"],
@@ -294,13 +300,12 @@ def graph(tools):
     )
     load_balance = load_balance[["energy_balance_duals", "time_row"]]
     load_balance = load_balance.pivot(columns="time_row", values="energy_balance_duals")
+    # Don't include the zero-valued duals
+    load_balance = load_balance.replace(0, tools.np.nan)
     if load_balance.count().sum() != 0:
-        ax = tools.get_new_axes(
-            "energy_balance_duals", title="Energy balance duals per period"
-        )
         load_balance.plot.box(
-            ax=ax,
+            ax=tools.get_axes(),
             xlabel="Period",
             ylabel="Energy balance duals (cents/kWh)",
-            logy=True,
+            showfliers=False,
         )
@@ -11,6 +11,7 @@
 import os
 import pandas as pd
 from switch_model.reporting import write_table
+from switch_model.tools.graph import graph
 
 dependencies = "switch_model.timescales"
 
@@ -337,7 +338,6 @@ def load_inputs(mod, switch_data, inputs_dir):
 def post_solve(instance, outdir):
     m = instance
     # Overall electricity costs
-    # TODO use write_table
     normalized_dat = [
         {
             "PERIOD": p,
@@ -401,16 +401,28 @@ def post_solve(instance, outdir):
     write_table(instance, output_file=os.path.join(outdir, "costs_itemized.csv"), df=df)
 
 
+@graph("costs", title="Itemized costs per period", supports_multi_scenario=True)
 def graph(tools):
-    costs_itemized = tools.get_dataframe(csv="costs_itemized")
+    costs_itemized = tools.get_dataframe("costs_itemized.csv")
     # Remove elements with zero cost
     costs_itemized = costs_itemized[costs_itemized["AnnualCost_Real"] != 0]
+    groupby = "PERIOD" if tools.num_scenarios == 1 else ["PERIOD", "scenario_name"]
     costs_itemized = costs_itemized.pivot(
-        columns="Component", index="PERIOD", values="AnnualCost_Real"
+        columns="Component", index=groupby, values="AnnualCost_Real"
+    )
+    costs_itemized *= 1e-9  # Converting to billions
+    costs_itemized = costs_itemized.rename(
+        {
+            "GenVariableOMCostsInTP": "Variable O & M Generation Costs",
+            "FuelCostsPerPeriod": "Fuel Costs",
+            "StorageEnergyFixedCost": "Storage Energy Capacity Costs",
+            "TotalGenFixedCosts": "Generation Fixed Costs",
+            "TxFixedCosts": "Transmission Costs",
+        },
+        axis=1,
     )
-    costs_itemized *= 1e-9
     costs_itemized = costs_itemized.sort_values(axis=1, by=costs_itemized.index[-1])
-    ax = tools.get_new_axes(out="costs", title="Itemized costs per period")
+    ax = tools.get_axes()
     costs_itemized.plot(
         ax=ax,
         kind="bar",