Merge pull request #177 from josenino95/main
Milestone "Working on introduction issues" updates
Bsolodzi authored Sep 2, 2024
2 parents 00a5582 + 814d887 commit 99ef8a1
Showing 6 changed files with 54 additions and 83 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -2,6 +2,7 @@
episodes/*html
site/*
!site/README.md
*.Rproj

# History files
.Rhistory
35 changes: 7 additions & 28 deletions episodes/00-intro.md
@@ -6,31 +6,17 @@ exercises: 3

::::::::::::::::::::::::::::::::::::::: objectives

- Understand how to organize data so computers can make the best use of the data
- Define the scope of this lesson
- Describe some drawbacks and advantages of using spreadsheet programs

::::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::::::::::::: questions

- What are basic principles for using spreadsheets for good data organization?
- What are spreadsheets useful for in a research project?

::::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::::::::::::::: prereq

## Things You'll Need To Complete This Tutorial

#### Spreadsheet Software

To work through this tutorial you will need access to a spreadsheet program.
Many computers come with a pre-installed spreadsheet program like Excel. macOS users who use Apple's Numbers application should note that it does not contain some of the features (particularly data validation) that we will be using. Please use LibreOffice or Microsoft Excel instead.

If you do not have a spreadsheet program, install one using the instructions
in the link below.

- [Instructions to install a spreadsheet program.](../learners/setup.md)

::::::::::::::::::::::::::::::::::::::::::::::::::

Good data organization is the foundation of your research
project. Most researchers have data or do data entry in
@@ -89,8 +75,8 @@ Nevertheless it is important to be aware of the limitations these data may prese
- How to do *plotting* in a spreadsheet
- How to *write code* in spreadsheet programs

If you're looking to do this, a good reference is
[Head First Excel](https://www.amazon.com/Head-First-Excel-learners-spreadsheets/dp/0596807694/ref=sr_1_1?ie=UTF8&qid=1491594584&sr=8-1&keywords=head+first+excel), published by O'Reilly.
If you're looking to do this, a couple of good references are the
[Excel Cookbook](https://search.worldcat.org/title/1419271899), published by O'Reilly, and the [Microsoft Excel 365 bible](https://search.worldcat.org/en/title/1263023438).


::::::::::::::::::::::::::::::::::::::::::::::::::
@@ -119,19 +105,12 @@ In this lesson, we will assume that you are most likely using Excel as
your primary spreadsheet program - there are other programs with similar functionality but Excel seems
to be the most commonly used.

In this lesson we're going to talk about:

1. [Formatting data tables in spreadsheets](01-format-data.md)
2. [Formatting problems](02-common-mistakes.md)
3. [Dates as data](03-dates-as-data.md)
4. [Quality control](04-quality-assurance.md)
5. [Exporting data](05-exporting-data.md)



:::::::::::::::::::::::::::::::::::::::: keypoints

- Organizing your data tables according to tidy data principles will make them easier for you and others to use for analysis.
- Good data organization is the foundation of any research project.
- Spreadsheets are good for data entry, but when doing data cleaning or analysis, it's not easy to show or replicate what you did.

::::::::::::::::::::::::::::::::::::::::::::::::::

18 changes: 11 additions & 7 deletions episodes/01-format-data.md
@@ -14,7 +14,7 @@ exercises: 15

:::::::::::::::::::::::::::::::::::::::: questions

- What are some common challenges with formatting data in spreadsheets and how can we avoid them?
- How do we format data in spreadsheets for effective data use?

::::::::::::::::::::::::::::::::::::::::::::::::::

@@ -72,9 +72,9 @@ what you did when Reviewer #3 asks for a different analysis, you should

Put these principles in to practice today during the exercises.

### Structuring data in spreadsheets
### Tidy data in spreadsheets

The cardinal rules of using spreadsheet programs for data:
The tidy data principles when structuring data in spreadsheets are:

1. Put all your variables in columns - the thing you're measuring,
like 'weight' or 'temperature'.
@@ -87,6 +87,8 @@ The cardinal rules of using spreadsheet programs for data:
ensures that anyone can use the data, and is required by
most data repositories.

You can understand more easily these principles with the illustrations in the [Tidy Data Series by Lowndes & Horst](https://allisonhorst.com/other-r-fun).

For instance, we're going to be working with data from a study of
agricultural practices among farmers in two countries in eastern
sub-Saharan Africa (Mozambique and Tanzania). Researchers conducted
@@ -198,15 +200,17 @@ with this data and how you would fix it.

## Handy References

Two excellent references on spreadsheet organization are:
Three excellent references on spreadsheet organization are:

- Hadley Wickham, *Tidy Data*, Vol. 59, Issue 10, Sep 2014, Journal of
Statistical Software. [http://www.jstatsoft.org/v59/i10](https://www.jstatsoft.org/v59/i10)

- Julia Lowndes \& Allison Horst, *Tidy Data Series by Lowndes & Horst*. [https://allisonhorst.com/other-r-fun](https://allisonhorst.com/other-r-fun)

- Karl W. Broman \& Kara H. Woo, *Data Organization in Spreadsheets*, Vol. 72,
Issue 1, 2018, The American Statistician.
[https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375989](https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375989)

- Hadley Wickham, *Tidy Data*, Vol. 59, Issue 10, Sep 2014, Journal of
Statistical Software. [http://www.jstatsoft.org/v59/i10](https://www.jstatsoft.org/v59/i10)


::::::::::::::::::::::::::::::::::::::::::::::::::

2 changes: 1 addition & 1 deletion episodes/02-common-mistakes.md
@@ -12,7 +12,7 @@ exercises: 0

:::::::::::::::::::::::::::::::::::::::: questions

- What are some common challenges with formatting data in spreadsheets and how can we avoid them?
- What common mistakes are made when formatting spreadsheets?

::::::::::::::::::::::::::::::::::::::::::::::::::

11 changes: 0 additions & 11 deletions episodes/03-dates-as-data.md
@@ -37,17 +37,6 @@ One of the other reasons dates can be tricky is that most spreadsheet programs h

The first thing you need to know is that Excel stores dates as numbers - see the last column in the above figure. This serial number represents the number of days from December 31, 1899. In the example, July 2, 2014 is stored as the serial number 41822.
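As a rough illustration of the serial-number storage described above, the following Python sketch converts a 1900-system serial number back to a calendar date. It is not part of the lesson or of this commit. The helper name `excel_serial_to_date` is hypothetical, and the epoch of December 30, 1899 is an assumption that applies only to serial numbers above 60: Excel's 1900 date system counts a nonexistent February 29, 1900, so later dates sit one day further from December 31, 1899 than a plain day count would suggest.

```python
from datetime import date, timedelta

# Assumed effective epoch for Excel's 1900 date system. It is one day earlier
# than December 31, 1899 because Excel counts a nonexistent February 29, 1900.
EXCEL_1900_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(serial: int) -> date:
    """Convert a 1900-system serial number (above 60) to a calendar date."""
    if serial <= 60:
        # Serials up to 60 fall on or before the phantom leap day.
        raise ValueError("serials of 60 or less need special handling")
    return EXCEL_1900_EPOCH + timedelta(days=serial)


# The serial number quoted in the lesson text.
print(excel_serial_to_date(41822))  # 2014-07-02
```

Going the other way, subtracting the same epoch from a date gives the serial number that Excel would show when the cell is formatted as a number.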

::::::::::::::::::::::::::::::::::::::::: callout

## Excel's date systems

Excel also entertains a second date system, the 1904 date system, as the default in Excel for Macintosh. This system will assign a
different serial number than the [1900 date system](https://support.microsoft.com/en-us/help/214330/differences-between-the-1900-and-the-1904-date-system-in-excel). Because of this,
[dates must be checked for accuracy when exporting data from Excel](https://uc3.cdlib.org/2014/04/09/abandon-all-hope-ye-who-enter-dates-in-excel/) (look for dates that are ~4 years off).


::::::::::::::::::::::::::::::::::::::::::::::::::
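To make the "~4 years off" symptom in the removed callout concrete, here is a minimal Python sketch that reads one serial number under both date systems. It is illustrative only: the function name `read_serial` and the two epoch constants are assumptions chosen for this example, and the 1900 epoch used is the effective one for serial numbers above 60.

```python
from datetime import date, timedelta

# Assumed epochs: the effective 1900-system epoch (valid for serials above 60)
# and the 1904-system epoch, where serial 0 is January 1, 1904.
EPOCH_1900 = date(1899, 12, 30)
EPOCH_1904 = date(1904, 1, 1)


def read_serial(serial: int, epoch: date) -> date:
    """Interpret a serial number as a date relative to the given epoch."""
    return epoch + timedelta(days=serial)


# Offset between the two systems, in days.
offset_days = (EPOCH_1904 - EPOCH_1900).days

# The same stored number, read under each system.
intended = read_serial(41822, EPOCH_1900)
misread = read_serial(41822, EPOCH_1904)

print(offset_days)  # 1462
print(intended)     # 2014-07-02
print(misread)      # 2018-07-03
```

A shift of 1462 days (four years and one day) between an exported date and the expected one is therefore a strong hint that the file was read under the other date system.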

Using functions we can add days, months or years to a given date.
Say you had a research plan where you needed to conduct interviews with a
set of informants every ninety days for a year.
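The ninety-day interview plan mentioned above can also be sketched outside a spreadsheet. The Python example below is a hedged illustration, not the lesson's method: the function name `interview_dates`, its parameters, and the start date of July 2, 2014 (borrowed from the serial-number example) are all assumptions.

```python
from datetime import date, timedelta


def interview_dates(first: date, interval_days: int = 90, span_days: int = 365) -> list[date]:
    """List interview dates every `interval_days`, staying within `span_days` of the first."""
    dates = []
    offset = 0
    while offset <= span_days:
        dates.append(first + timedelta(days=offset))
        offset += interval_days
    return dates


# Hypothetical start date for the first interview.
schedule = interview_dates(date(2014, 7, 2))
for d in schedule:
    print(d.isoformat())
```

Because date arithmetic works on whole days, the same result in a spreadsheet comes from adding 90 to the cell holding the previous interview date.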
70 changes: 34 additions & 36 deletions learners/setup.md
@@ -31,55 +31,53 @@ page](https://www.datacarpentry.org/socialsci-workshop/data).

## Software

To interact with spreadsheets, we can use [LibreOffice](https://www.libreoffice.org/),
Microsoft Excel, [Gnumeric](https://www.gnumeric.org/),
[Onlyoffice](https://www.onlyoffice.com/), [WPS office](https://www.wps.com/)
or other programs. Commands may differ a bit between programs, but
To work through this tutorial you will need access to a spreadsheet program. For this you have many options: [Microsoft Excel](https://www.microsoft.com/en-us/microsoft-365/excel), [LibreOffice](https://www.libreoffice.org/), [Apple Numbers](https://support.apple.com/numbers), [Gnumeric](http://www.gnumeric.org/), [Onlyoffice](https://www.onlyoffice.com/), [WPS office](https://www.wps.com/), among others. Commands may differ a bit between programs, but
the general ideas for thinking about spreadsheets are the same.

Check warning on line 34 in learners/setup.md (GitHub Actions / Build Full Site): [needs HTTPS]: [Gnumeric](http://www.gnumeric.org/)

For this lesson, if you don't have a spreadsheet program already, you can use
LibreOffice. It's a free, open source spreadsheet program.
For this lesson, we encourage you to use LibreOffice or Microsoft Excel, as the tasks we will
be doing have been tested in these programs. If you don't have Microsoft Excel, you can use
LibreOffice. It's a free, open source spreadsheet program. Here are the instructions to install it:


macOS users who use Apple's Numbers application should note that it does not
contain some of the features (particularly data validation) that we will
be using. Please use LibreOffice or Microsoft Excel instead.

#### Windows

- Download the Installer
- Install LibreOffice by going to [the installation
page](https://www.libreoffice.org/download/libreoffice-fresh/). The version
for Windows should automatically be selected. Click Download Version X.X.X
(whichever is the most recent version).
- Install LibreOffice
- Once the installer is downloaded, double click on it and LibreOffice should
- **Download the Installer**
Install LibreOffice by going to the [installation
page](https://www.libreoffice.org/download/download-libreoffice/). The
version for Windows should automatically be selected. Click
**Download**. You will go to a page that asks about a
donation, but you don't need to make one. Your download should begin
automatically.
- **Install LibreOffice**
Once the installer is downloaded, double click on it and it should
install.

#### macOS
#### Mac OS X

- Download the Installer
- Install LibreOffice by going to [the installation
page](https://www.libreoffice.org/download/libreoffice-fresh/). The version
for Mac should automatically be selected. Click Download Version X.X.X
(whichever is the most recent version).
- Install LibreOffice
- Once the installer is downloaded, double click on it and LibreOffice should
install.
- **Download the Installer**
Install LibreOffice by going to the [installation
page](https://www.libreoffice.org/download/download-libreoffice/). The
version for macOS should automatically be selected. Click
**Download**. You will go to a page that asks about a
donation, but you don't need to make one. Your download should begin
automatically.
- **Install LibreOffice**
The file *LibreOffice\_X.X.X\_MacOS\_x86-64* (whichever version of LibreOffice you have selected) should have been
downloaded. Double click on this file, and LibreOffice will be
installed.

#### Linux

- Download the Installer
- Install LibreOffice by going to [the installation
page](https://www.libreoffice.org/download/libreoffice-fresh/). The version
for Linux should automatically be selected. Click Download Version X.X.X
(whichever is the most recent version).
- Install LibreOffice
- Once the installer is downloaded, double click on it and LibreOffice should
- **Download the Installer**
Install LibreOffice by going to the [installation
page](https://www.libreoffice.org/download/download-libreoffice/). The
version for Linux should automatically be selected. Click **Download**. You will go to a page that asks about a donation,
but you don't need to make one. Your download should begin
automatically.
- **Install LibreOffice**
Once the installer is downloaded, double click on it and it should
install.
- package manager option:
- pacman (Arch): `pacman -S libreoffice`
- yum (Fedora, CentOS): `yum install libreoffice`
- apt (Debian, Ubuntu): `apt install libreoffice`


::::::::::::::::::::::::::::::::::::::::::::::::::