Commit 69d0a5a

Minor fixes to Pandas.md
1 parent 98b671c commit 69d0a5a

1 file changed: +18 -18 lines changed

docs/Pandas.md

Lines changed: 18 additions & 18 deletions
@@ -5,20 +5,21 @@
 In SWITCH, Pandas is mainly used to create graphs and also output files after solving.
 
 This document gives a brief overview of key concepts and commands
-to get one started with Pandas. There are a lot better resources available
-online teaching that Pandas including entire online courses.
+to get started with Pandas. There are a lot better resources available
+online teaching Pandas, including entire online courses.
 
-Finally, the Pandas [documentation](https://pandas.pydata.org/docs/)
+Most importantly, the Pandas [documentation](https://pandas.pydata.org/docs/)
 and [API reference](https://pandas.pydata.org/docs/reference/index.html#api) should be your go-to
 when trying to learn something new about Pandas.
 
 ## Key Concepts
 
 ### DataFrame
 
-Dataframes are Pandas way of storing tabular data.
-They have rows, columns and labelled axes (e.g. row or column names).
-Dataframes are the primary pandas data structure. When manipulating data,
+Dataframes is the main Pandas data structure and is responsible for
+storing tabular data.
+Dataframes have rows, columns and labelled axes (e.g. row or column names).
+When manipulating data,
 the common practice is to store your main dataframe in a variable called `df`.
 
 ### Series
@@ -42,8 +43,8 @@ dataframe has 4 columns (A, B, C, D) and a custom index (the date).
 ```
 
 The same dataframe can be expressed without the custom index as follows.
-Here the date is a column just like the others and the index is the default
-numbered index.
+Here the date is a column just like the others and the index is the
+default index (just the row number).
 
 ```
 date A B C D
@@ -83,10 +84,10 @@ reads a csv file from filepath and returns a dataframe that gets stored
 created.
 
 - `df.to_csv(filepath, index=False)`.
-This command will write a dataframe to `filepath`. `index=False` makes
-sure you don't write the index to the file. This should
-be used if your index is just `0, 1, 2, ...` in which case you probably
-don't want to write your index to the file.
+This command will write a dataframe to `filepath`. `index=False` means
+that the index is not written to the file. This should
+be used if you're not using custom indexes since you probably don't
+want the default index (just the row numbers) to be outputted to your csv.
 
 - `df["column_name"]`: Returns a *Series* containing the values for that column.
 
@@ -97,15 +98,15 @@ where the condition in the square brackets is met. In this case we filter out
 all the rows where the value under `column_name` is not `"some_value"`.
 
 - `df.merge(other_df, on=["key_1", "key_2"])`: Merges `df` with `other_df`
-where the columns over which we are merging are columns `key_1` and `key_2`.
+where the columns over which we are merging are `key_1` and `key_2`.
 
 - `df.info()`: Prints the columns in the dataframe and some info about each column.
 
-- `df.head()`: Prints the start of the dataframe.
+- `df.head()`: Prints the first few rows in the dataframe.
 
-- `df.drop_duplicates()`: Drops duplicates from the dataframe
+- `df.drop_duplicates()`: Drops duplicate rows from the dataframe
 
-- `Series.unique()`: Returns a series where duplicates are dropped.
+- `Series.unique()`: Returns a series where duplicate values are dropped.
 
 ## Example
 
@@ -116,7 +117,6 @@ of our generation plants from the SWITCH input files.
 import pandas as pd
 
 # READ
-
 gen_projects = pd.read_csv("generation_projects_info.csv", index_col=False)
 costs = pd.read_csv("gen_build_costs.csv", index_col=False)
 predetermined = pd.read_csv("gen_build_predetermined.csv", index_col=False)
@@ -141,5 +141,5 @@ gen_projects = gen_projects.merge(
 gen_projects.to_csv("projects.csv", index=False)
 ```
 
-If you run the following code snippet it will create a `projects.csv` file
+If you run the following code snippet in the `inputs folder` it will create a `projects.csv` file
 containing the project data, cost data and prebuild data all in one file.
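
Note on the commands touched by this diff: the following is a minimal sketch of how `drop_duplicates`, boolean filtering, `merge`, `Series.unique()` and `to_csv(..., index=False)` behave. The data and column names (`project`, `tech`, `cost`) are hypothetical and are not taken from the SWITCH input files. The CSV is written to an in-memory buffer rather than a real path so the sketch has no side effects. One detail worth checking against the wording in `docs/Pandas.md`: in current pandas releases `Series.unique()` returns a NumPy array (or an extension array), not a Series.

```python
import io

import pandas as pd

# Hypothetical data; column names are illustrative, not SWITCH inputs.
df = pd.DataFrame(
    {
        "project": ["p1", "p2", "p2", "p3"],
        "tech": ["solar", "wind", "wind", "solar"],
    }
)
costs = pd.DataFrame(
    {
        "project": ["p1", "p2", "p3"],
        "cost": [100.0, 80.0, 120.0],
    }
)

# Drop fully duplicated rows (the second "p2, wind" row is removed).
deduped = df.drop_duplicates()

# Keep only the rows where the condition in the square brackets is met.
solar = deduped[deduped["tech"] == "solar"]

# Merge the two dataframes on the shared key column.
merged = deduped.merge(costs, on=["project"])

# Distinct values of one column; this is an array, not a Series.
techs = df["tech"].unique()

# Write without the default row-number index. A StringIO buffer stands in
# for a file path here.
buffer = io.StringIO()
merged.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

print(merged)
print(csv_text)
```

With `index=False` the first line of the output is the column header alone, with no leading unnamed index column.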
