11 Software and other tools
I am compiling a list of useful resources, focusing on open access and open science, in this collaborative Airtable. (I hope I remember to keep it updated!)
You can sort and filter by any category, especially ‘resource’; I did an ad-hoc rating of ‘relevance to UG and MSc students’ in the final column.
To interact more with this, you can get full ‘commenter’ access through this link.
11.1 Computers and network resources
Accessing your university network, etc
11.2 Document preparation systems
There are a variety of tools: ‘what you see is what you get’ (WYSIWYG) tools such as word processors, ‘what you see is what you mean’ tools (such as HTML, LaTeX, etc.), and everything in between. I personally prefer to use a mix of Markdown (R Markdown) and LaTeX, but these systems do take some time to learn well.
Maths notation
Maths equations can be specified directly and cleanly in word processors as well as in other systems. Try to use one of these systems rather than messy ‘work-arounds’ involving cutting and pasting.
Equation editor:
Microsoft has an interactive click-based ‘Equation editor’.
LaTeX equation syntax: There is a specific code-based syntax for creating equations and maths.
This is tied to the document preparation system LaTeX. However, the same syntax can be used across a variety of tools (including MS Word and Google Docs, with the right plugins)!
This ends up being easier, faster, and better than click-based tools.
There are really only a handful of rules (braces to group arguments, as in fractions; backslashes before symbols; super/subscripts; etc.) and the rest (mainly the symbol names) can be easily looked up.
Overleaf offers a nice guide to this (and Overleaf is also a very good web-based LaTeX ecosystem, with many other tutorials for LaTeX in general).
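As a rough illustration of the handful of rules mentioned above, here is a small sketch in LaTeX equation syntax. The two formulas (a sample mean and a simple regression equation) are just examples I have chosen for illustration; they are not taken from the Overleaf guide.

```latex
% Sample mean: \frac takes two braced arguments;
% _ and ^ set subscripts and superscripts
\[
  \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
\]

% A simple regression equation: Greek letters are a backslash plus the name
\[
  y_i = \alpha + \beta x_i + \varepsilon_i
\]
```

The same expressions (everything between the display-math delimiters) should generally also work in tools that accept LaTeX-style equation input.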
Word processors
(e.g., MS Word and the equation editor)
LaTeX and LaTeX interfaces
LaTeX
Overleaf
LyX, Scientific WorkPlace (SWP), etc.
Markdown-based tools, ‘dynamic documents’, and notebooks
You can now produce ‘dynamic documents’ that integrate the writing of your paper with the code that produces (or at least refers to) the statistics.
knitr, R Markdown, and bookdown (what this book is created with)
- See especially ‘R Markdown: The Definitive Guide’
Markdown: Basic syntax for creating content with raw text. Simpler than LaTeX; used in the above.
Pandoc: A tool that converts between many different document and markup formats; integral to the above.
Other options include
Jupyter notebooks (the name derives from Julia, Python, and R, although many other languages are supported)
Stata 16 offers some tool that integrates with MS Word, I believe.
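To make the idea of a ‘dynamic document’ more concrete, here is a minimal sketch in Python. It is not how knitr or R Markdown work internally; it only illustrates the principle that the text refers to computed statistics instead of hard-coded numbers. The data, the variable names, and the ‘render’ function are all hypothetical.

```python
from statistics import mean
from string import Template

# Hypothetical data, invented purely for illustration
wages = [12.5, 15.0, 9.75, 20.0]

# The text refers to the statistics by name rather than hard-coding the numbers
template = Template("The sample contains $n observations with a mean wage of $avg.")


def render(data):
    """Fill the template with statistics computed from the data."""
    return template.substitute(n=len(data), avg=f"{mean(data):.2f}")


report = render(wages)
print(report)
```

If the data change, re-running the code regenerates the sentence, so the reported numbers cannot drift out of sync with the analysis.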
11.3 Text editors
Please also see discussion of ‘flat (raw, plain) text files’ above.
A text editor is just what it sounds like. Unlike a word processor, a text editor shows you the ‘raw text’, without formatting (but it can do ‘syntax highlighting’ in different colors to make code more readable).
You typically write code within a text editor.
I also use it for taking notes and writing ‘markup’ and markdown.
My favorite text editor is Vim. Old-school nerds had a rivalry between Vim and Emacs. Both of these have great features, enabling efficient typing, navigation, fancy copy-paste, macros, regular expressions, etc. Vim also has its own language to help you ‘clean’ data, code, and files. The most popular text editor of today might be Atom.
Programming package interfaces like Stata and R (RStudio) typically have their own text editors built in, enabling you to write, save, and execute code directly from within these. You can configure these in many ways, changing ‘key bindings’, syntax highlighting, ‘code folding’, etc. (You can also configure other text editors to automatically send the code to programs like Stata and R.)
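Regular expressions are worth a quick illustration. The sketch below uses Python's built-in ‘re’ module rather than Vim's own search-and-replace syntax, but the idea is the same: describe a pattern once and apply it to every line. The messy labels and the ‘clean’ function are hypothetical examples.

```python
import re

# Hypothetical messy entries, invented for illustration
raw = ["  United   Kingdom ", "united kingdom", "UNITED-KINGDOM"]


def clean(name):
    """Normalize a label to a single consistent spelling."""
    # Replace any run of hyphens and/or whitespace with one space
    name = re.sub(r"[-\s]+", " ", name)
    # Trim the ends and use consistent capitalization
    return name.strip().title()


cleaned = [clean(s) for s in raw]
print(cleaned)
```

All three variants end up as the same label, which is the kind of tidying that is tedious by hand but takes one pattern in an editor or script.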
The ‘terminal’/command window/shell in Windows, Mac, and Unix
(I’ll come back to this)
11.4 Citation management tools
You do not need to enter all your references and citations manually: there are many tools for this.
Storing and organizing your references
- Zotero, Mendeley, EndNote, JabRef, etc.
For JabRef, see the following link: http://jabref.sourceforge.net/. Note that to use JabRef with Word you should use BibTeX4Word; see the following link: http://www.ee.ic.ac.uk/hp/staff/dmb/perl/index.html
Including citations and ‘bibliographies’ in your paper
BibTeX (for LaTeX and Markdown)
Plugins for word processors
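Here is a minimal sketch of how BibTeX fits into a LaTeX document. The file name ‘references.bib’ and the citation key ‘placeholder2020’ are hypothetical, and the entry shown in the comments is a made-up placeholder, not a real reference.

```latex
% Assumes a file references.bib (hypothetical name) containing an entry such as:
%   @article{placeholder2020,
%     author  = {Author, A.},
%     title   = {A Placeholder Title},
%     journal = {Placeholder Journal},
%     year    = {2020}
%   }
\documentclass{article}
\begin{document}

As shown by \cite{placeholder2020}, ...  % inserts the in-text citation

\bibliographystyle{plain}  % one of the standard BibTeX styles
\bibliography{references}  % builds the reference list from references.bib

\end{document}
```

Reference managers such as Zotero and JabRef can typically export or maintain the ‘.bib’ file for you, so you rarely need to type entries by hand.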
11.5 Spreadsheets: just say no!
I do not advise doing data cleaning or statistical analysis in spreadsheet tools like Excel. You need a more powerful, systematic tool that keeps a record of the steps you have taken. You need to ‘write code’ in some way.
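To illustrate what ‘keeping a record of the steps’ can look like, here is a minimal sketch in Python using only the standard library. The data, column names, and cleaning rules are all hypothetical; the point is only that each step is written down as code and can be re-run or audited later.

```python
import csv
import io

# Hypothetical raw data, invented for illustration;
# in practice this would be a file on disk
RAW_CSV = """id,age,income
1,34,52000
2,,48000
3,29,-1
4,41,61000
"""


def load_rows(text):
    """Read CSV text into a list of dicts (one per row)."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_missing_age(rows):
    """Step 1: drop rows with no recorded age."""
    return [r for r in rows if r["age"] != ""]


def drop_invalid_income(rows):
    """Step 2: drop rows where income is coded as a negative number."""
    return [r for r in rows if int(r["income"]) >= 0]


rows = load_rows(RAW_CSV)
log = [f"loaded {len(rows)} rows"]
for step in (drop_missing_age, drop_invalid_income):
    rows = step(rows)
    log.append(f"{step.__name__}: {len(rows)} rows remain")

print("\n".join(log))
```

In a spreadsheet, deleting those two rows by hand would leave no trace; here the script itself documents what was dropped and why.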
11.6 Statistical and coding software
Stata
Stata is used by the majority of empirical economists, but this may be changing. It is fairly easy to learn as well as powerful and adaptable. It is not free or open-source (although people contribute a lot of code and scripts): you need a license, which students can get through (most) universities (including Exeter).
It is basically a language for working with data and doing statistics (especially econometrics). It’s not a ‘real programming language’.
Recommended online resources/guides (including more than just coding tips):
StataCorp resources listed here
R
The language that statisticians use, along with more and more people in social science. It’s also a ‘real programming language’, although most people who use it are working with data/statistics. Very big in the new booming field of ‘data science’.
Completely open-source, collaborative and free.
The cutting-edge statistics and research-methods tools usually come out in R first (e.g., new machine-learning packages, the ‘DeclareDesign’ package for experiments, etc.).
Some killer features/tools include:
R Markdown, knitr, bookdown: Dynamic documents that can be made into PDFs, web-books, web-pages, web-based slides, etc.
ggplot2: the best tool for graphical analysis/presentation of data
‘tidyr’ and the ‘tidyverse’
RStudio
Recommended online resources/guides (including more than just coding tips):
However, R is a bit harder to learn than Stata.
Other stats packages and coding tools
Python: Perhaps the most popular coding language today, and supposed to be easy to learn. It’s a ‘general programming language’, but many statistical tools have been integrated. Very big in the new booming field of ‘data science’, maybe even more important than R.
SAS: Old, but known to be very good with large data sets, and thus has regained some popularity.
SPSS (not recommended)
11.7 (Other) Maths software
Matlab (also Maple, Mathematica)
11.8 Software for creating explanatory figures (not data-driven)
Figshare…
11.9 Resources for further study and research
11.10 Project management: Backing up, saving/storing your workflow
Backups
Git/Github and version control/project management
See ‘Happy Git with R’ for a great introduction.
Understanding project management, version control, and Git is a highly valuable skill.